
Development

Using open-source tools for documenting research

January 18, 2006

This article was contributed by Carl Bolduc

Introduction

Getting published is a major concern for students conducting graduate studies in science. I'm a PhD student in molecular biology and I started using Linux at the beginning of my graduate studies. Public scientific research looks a lot like open-source software development: you work hard and give your methods and results to everyone through publications in scientific journals. Ironically, the majority of people working in science use only proprietary software. I myself work in a Microsoft Windows environment.

A typical scientific article requires several tools to reach its final published state. First, most researchers use Microsoft Word and Excel for text and tables. They also use EndNote to manage and create the bibliographies you will find in every scientific article. Finally, scientists use a graphics suite, such as Adobe's Photoshop, for figures and PDF creation. Together, these programs cost more than one thousand dollars, a sum that is practically out of reach for the average student. In some laboratories, when the head researcher is kind enough, you will find a computer where most of these tools are installed and shared by all members of the team. But what if you could create your own open-source research writing box for free? In fact, you can. You can accomplish the entire array of tasks associated with scientific writing with any good Linux distribution.

The easiest step

One of the most popular open-source applications to have boosted the Microsoft-to-Linux transition is certainly OpenOffice.org. For anybody working in science reporting, it is an easy first step that lets you leave proprietary software behind while remaining compatible with Microsoft Office formats. In addition, several journals ask that submissions be in .doc or PDF format. OpenOffice.org saves you a lot of trouble with its useful PDF export tool.

Although OpenOffice.org can complete a fair portion of the job, it doesn't yet contain a bibliographic manager such as EndNote. Such a facility is necessary for academic writers, and OpenOffice.org is supposed to fill the gap with some bibliographic extensions in its next version. For now, there is a commercial web-based tool called WriteNote which offers a 30-day free trial and enables you to produce a bibliography with RTF files created by OpenOffice.org.

LaTeX

While OpenOffice.org may be a first step toward writing scientific articles under Linux, the true power resides in LaTeX. As the LaTeX project website puts it: "LaTeX is a high-quality typesetting system, with features designed for the production of technical and scientific documentation" and "LaTeX is the de facto standard for the communication and publication of scientific documents."

LaTeX features include the insertion of tables and figures as well as the capacity to create complex mathematical equations. Additionally, there are tremendous advantages in learning to write with LaTeX. In fact, BibTeX could get you out of proprietary software tomorrow. You can gather your bibliographic references in a simple text file using the BibTeX syntax and easily insert citations inside your LaTeX documents, automatically generating a bibliography at the end of your articles.
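As a rough illustration of the workflow just described, the Python sketch below writes out one BibTeX entry as plain text, along with a minimal LaTeX document that cites it. The helper name format_bibtex_entry, the citation key and every field value are invented placeholders, not part of any BibTeX tool or a real reference.

```python
def format_bibtex_entry(entry_type, key, fields):
    """Render one BibTeX entry as text (hypothetical helper for illustration)."""
    lines = ["@%s{%s," % (entry_type, key)]
    for name, value in fields.items():
        lines.append("  %s = {%s}," % (name, value))
    lines.append("}")
    return "\n".join(lines)


# Placeholder reference; every value here is invented for the example.
entry = format_bibtex_entry(
    "article",
    "doe2005example",
    {
        "author": "Doe, Jane and Roe, Richard",
        "title": "An Example Article Title",
        "journal": "Journal of Examples",
        "year": "2005",
        "volume": "1",
        "pages": "1--10",
    },
)

# Minimal LaTeX document citing the entry; assumes the entry above is
# saved in a file named references.bib next to the document.
latex_source = "\n".join([
    r"\documentclass{article}",
    r"\begin{document}",
    r"Earlier work reported a similar result~\cite{doe2005example}.",
    r"\bibliographystyle{plain}",
    r"\bibliography{references}",
    r"\end{document}",
])

print(entry)
print(latex_source)
```

With the entry saved as references.bib and the document as, say, paper.tex, the usual sequence is to run latex, then bibtex, then latex twice more so that the citation and the generated bibliography are resolved. Swapping "plain" for a journal's own style name is what changes the formatting of the bibliography.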

While the LaTeX format requires a minimum of learning, you can rely on the useful TeXmed web-based tool to query NCBI PubMed and generate BibTeX entries for you. You must specify, in your LaTeX document, a bibliography style to format it according to the journal's recommendations. In fact, many journals now offer their bibliography style on their website. If you can't find the format that you need on the web, you can use custom-bib to create the style you need.

LyX and friends

LaTeX basics can be learned quite easily, but you might need to read a lot from the web or buy some books (like I did) to use its full potential. But do you really need to go through all this trouble? That's where LyX comes into play.

LyX is a GUI front end to LaTeX. Though it has its own file format, it can import from and export to LaTeX. LyX looks like a word processor while taking care of all the formatting, just like LaTeX. LyX is fully featured and lets you insert figures, tables, mathematical equations and more. Though managing a BibTeX text file is very easy, you can rely on graphical tools here too. Software like gBib and JabRef will help you deal with your numerous references and even let you insert them in LyX, just like EndNote does with Word.

Gnuplot

Continuing on your path to build an open-source research writing box, you need a powerful tool to generate plots from your precious experimental results. That is where Gnuplot enters the scene, with its almost limitless possibilities. Gnuplot is a command-line plotting utility with easy-to-learn commands that enable you to create high-quality 2D and 3D plots suitable for scientific publications. It can output LaTeX and EPS code which can be inserted in your LaTeX documents. You can check out this demo that shows the wide variety of Gnuplot's capabilities.
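To give a feel for what those commands look like, here is a small Python sketch that assembles a Gnuplot script producing an EPS figure from a two-column data file. The function name gnuplot_eps_script, the file names growth.dat and growth.eps, and the axis labels are assumptions made up for the example; only the Gnuplot commands themselves are standard.

```python
def gnuplot_eps_script(data_file, eps_file, xlabel, ylabel, title):
    """Return a gnuplot script that plots columns 1 and 2 of data_file to EPS."""
    return "\n".join([
        # Select the PostScript driver in EPS mode.
        "set terminal postscript eps enhanced",
        'set output "%s"' % eps_file,
        'set xlabel "%s"' % xlabel,
        'set ylabel "%s"' % ylabel,
        # Column 1 is x, column 2 is y; draw both lines and points.
        'plot "%s" using 1:2 with linespoints title "%s"' % (data_file, title),
    ])


# Hypothetical file names and labels, chosen only for illustration.
script = gnuplot_eps_script(
    "growth.dat", "growth.eps", "Time (h)", "OD600", "Culture A"
)
print(script)
```

Saving the printed text as growth.gp and running "gnuplot growth.gp" should write growth.eps, which can then be pulled into a LaTeX document with \includegraphics from the graphicx package.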

Inkscape

One thing that was really missing in Linux in the past was a good vector graphics editor. I had to install Adobe Illustrator under Wine to be able to draw high quality figures showing various metabolic pathways. Now, with Inkscape, I have everything that I need to create high quality vector graphics which can be exported to EPS and inserted in my LaTeX documents. Inkscape can draw shapes, paths, text and can also export to PNG.

The Gimp

To complete your open-source research writing box, you need a powerful image manipulation program to process your photos and to generate figures from them. That's where The Gimp comes into play. With The Gimp, you can process gel photos, crop the area that you like, obtain negatives of your originals and add labels where you want, all with a few mouse clicks.

Linux drawbacks

While this path can be rewarding, it requires a significant effort. The first thing you need to do is to install a Linux distribution. This might seem frightening to the newcomer, but there are powerful Linux distributions such as Mandriva, Fedora and Ubuntu which are very easy to install and have packaged most of the tools mentioned in this article.

You also need to learn how to use new software. A few of the applications mentioned above only have a command-line interface, but most operations can be performed using GUI-based tools. There is plenty of documentation online, and you can always join an IRC channel to get live help. In a short time, you will become very functional, and you will reach new levels of productivity.

The worst drawback of using Linux in a Microsoft-based environment may be compatibility issues with your coworkers. Since my boss insists on working with .doc files, I have to convert my papers to RTF using latex2rtf before I send him anything, even though PDF is the most portable format out there. But this doesn't stop me from benefiting from LaTeX's functionality.

Finally, you must rely on the Internet for support. Most of the system administrators in the research field don't know much about Linux (at least not in Quebec, where I'm working) and won't be able to support you if you have problems.

Linux superiority

Besides the fact that Linux and all of this software are free, there are many advantages to building an open-source research writing box. Linux provides a robust environment that is virtually virus-free. Interoperability among applications is quite good: all of the applications mentioned in this article can share data through the LaTeX and EPS file formats.

With a little experience, you will start working faster and more efficiently. Serious page formatting issues found in Windows-based WYSIWYG software will be gone. Finally, you will be able to easily share your work by creating high-quality PDF files.

An example screenshot of my desktop publishing environment can be seen here.

Comments (40 posted)

System Applications

Database Software

MySQL 5.1.5-alpha has been released

Version 5.1.5-alpha of the MySQL database has been released. "This is a new alpha development release, adding new features and fixing recently discovered bugs."

Full Story (comments: none)

New Event Feature in MySQL 5.1.6

Trudy Pelzer explains events under MySQL version 5.1.6. "In this article, I'll give a preliminary description of a new MySQL feature for scheduling and executing tasks. In version 5.1.6, MySQL has added support for events. That is, you can now say: "I want the MySQL server to execute this SQL statement every day at 9:30am, until the end of the year" -- or anything similar that involves any number of SQL statements, and a schedule. Note that events are new and still in alpha, so there is still a good chance that we'll have to make adjustments as people experiment with them. This article describes the state of affairs only for the 5.1.6 release of MySQL."

Comments (5 posted)

Embedded Systems

BusyBox 1.1.0 is out

Version 1.1.0 of BusyBox, a compressed collection of command line tools for embedded systems, has been released. "The new stable release is BusyBox 1.1.0. It has a number of improvements, including several new applets. (It also has a few rough spots, but we're trying out a "release early, release often" strategy to see how that works. Expect 1.1.1 sometime in March.)"

Comments (none posted)

KLone 1.0.1 released

Version 1.0.1 of KLone, a small embeddable web server, has been released. LinuxDevices.com is also running a review of KLone. (Thanks to Steven Dorigotti.)

Comments (none posted)

Web Site Development

Campsite 2.4 Released

Version 2.4 of Campsite, a multi-lingual content management system for news websites, is available. "Version 2.4 is a major feature release."

Full Story (comments: none)

Desktop Applications

Audio Applications

SilentJack version 0.1 announced

The initial version of SilentJack is available. "SilentJack is a silence/dead air detector for the Jack Audio Connection Kit. It monitors the peak levels on a single JACK input port, and checks to see if they are below a specified threshold. SilentJack then runs a command after silence has been detected for a given number of seconds. It then waits for the command to finish, and waits for a grace period before detecting silence again."
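The detection loop described in that announcement can be sketched roughly as follows. This Python simulation is not SilentJack's code and does not talk to JACK: it assumes one peak reading per second, treats the command as finishing instantly, and uses a hypothetical function name and made-up sample values.

```python
def detect_silence(peaks, threshold, silence_seconds, grace_seconds):
    """Return the seconds at which the command would be run.

    peaks holds one peak level per second (an assumption of this sketch).
    """
    triggers = []
    quiet = 0   # consecutive seconds below the threshold
    grace = 0   # seconds left in the grace period
    for second, peak in enumerate(peaks):
        if grace > 0:
            # Still in the grace period: ignore the input entirely.
            grace -= 1
            quiet = 0
            continue
        if peak < threshold:
            quiet += 1
        else:
            quiet = 0
        if quiet >= silence_seconds:
            # Silence lasted long enough: "run the command", then back off.
            triggers.append(second)
            quiet = 0
            grace = grace_seconds
    return triggers


# Invented peak levels: a long quiet stretch, a burst of sound, then quiet again.
levels = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.4, 0.0, 0.0, 0.0]
print(detect_silence(levels, threshold=0.1, silence_seconds=3, grace_seconds=2))
```

With these sample values the command fires at seconds 3 and 11: the first trigger is followed by a two-second grace period, and the burst of sound at second 8 resets the silence counter before it can reach three.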

Full Story (comments: none)

Sweep 0.9.0 Released

Version 0.9.0 of Sweep, a graphical audio file editor, is out. This is the first release of a new unstable series; it includes a switch to GTK2, improved mp3 capabilities, translation work, and more.

Full Story (comments: none)

Data Visualization

Gmsh 1.62 announced

Version 1.62 of Gmsh, a 3D finite element grid generator, has been announced. "This release adds a new option to draw color gradients in the background, an enhanced perspective projection mode, a new "lasso" selection mode, a new snapping grid when adding points in the GUI, a new extrusion syntax and nicer normal smoothing. This release also contains various small bug fixes and enhancements."

Comments (none posted)

Desktop Environments

GNOME Software Announcements

The following new GNOME software has been announced this week: You can find more new GNOME software releases at gnomefiles.org.

Comments (none posted)

Formation of the KDE Technical Working Group in Progress (KDE.News)

KDE.News covers the formation of the KDE Technical Working Group. "The first Technical Working Group for KDE is now being formed, with elections due over the next few weeks. The Group will help the hundreds of KDE contributors come to technical decisions and smooth processes such as major releases. It will also provide technical guidance to KDE contributors."

Comments (none posted)

KDE Software Announcements

The following new KDE software has been announced this week: You can find more new KDE software releases at kde-apps.org.

Comments (none posted)

Desktop Publishing

LyX 1.3.7 is released

Version 1.3.7 of LyX, a GUI front-end to the TeX typesetting system, is out. Changes include support for the new file format 245 standard, improvements to the Windows version and more.

Full Story (comments: none)

Electronics

Covered 20060109 released

Version 20060109 of Covered, a Verilog code coverage analysis tool, is available. "It has been almost a year since the last development release of Covered, but in the meantime there has been a lot of work put into the score command of Covered during this time to fix bugs, add more coverage support for various Verilog constructs, simulate more accurately, remove memory corruption/estrangement and improve the run-time speed of the score command. I think that users of Covered will appreciate the enhancements. Documentation updates have been made and build problems have been fixed (Covered now compiles cleanly for Fedora Core 3 builds)."

Comments (none posted)

Kicad 2006-01-13 released

Version 2006-01-13 of Kicad, an electronic printed circuit board CAD package, is out with a bug fix and one new feature.

Comments (none posted)

GUI Packages

pyFltk-1.1RC2 announced

Release candidate 2 of pyFltk-1.1, a Python binding to FLTK 1.1, has been announced. "This release candidate has been tested with fltk-1.1.6 and requires Python2.4."

Comments (none posted)

SPTK 3.0.12 announced

Version 3.0.12 of SPTK, the Simply Powerful ToolKit, has been announced. "SPTK 3.0.12 adds support for the database driver messages. These messages may be sent to the driver by the database server on different occasions. These messages may include the extended error information, and the messages created by a stored procedure using (for MSSQL, for instance) PRINT statement."

Comments (none posted)

Mail Clients

Thunderbird 1.5 Released

Thunderbird 1.5 is out. Changes in this release include improvements to the automatic update system, smarter address auto-completion, on-the-fly spelling checking, better searching, some simple phishing detection (covered here last October), the ability to delete attachments, and much more; see the release notes for details. (As seen on MozillaZine)

Comments (none posted)

Office Applications

Beagle Newsletter

Issue 11 of the Beagle Newsletter has been published. "Beagle is a search tool that ransacks your personal information space to find whatever you're looking for. Beagle can search in many different domains."

Comments (none posted)

HylaFAX 4.2.5 released

Version 4.2.5 of HylaFAX, a fax-modem utility, has been announced. "The HylaFAX development team is pleased to announce our 4.2.5 patch level release! This fixes the problems users have been reporting in 4.2.4, which will be removed from the FTP and web sites. As always, our sincerest thanks go to all who participate and provide feedback." Several security fixes are included in this release.

Comments (none posted)

Office Suites

Moving to OpenOffice: Batch Converting Legacy Documents (O'Reilly)

Bob DuCharme writes about the conversion of legacy documents to OO.o in an O'Reilly xml.com article. "Like its Microsoft counterpart, OpenOffice has a macro language. You can start up OpenOffice from the Linux or Windows command-line prompt with instructions to run a particular macro, and you can even pass a filename as a parameter to that macro. Adding the -invisible switch to the command line tells OpenOffice to start up without the graphical user interface (GUI). Put all these together, and you've got a command line that converts a Microsoft Office file to an OpenOffice file (or an Acrobat file) with no use of the GUI. To convert a hundred files, you can use a Perl script or other scripting language to create a batch file or shell script that has the hundred commands necessary to convert those files."
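As a minimal sketch of the batch approach described in that quote, the Python code below generates a shell script with one soffice command per input file. The macro name Standard.MyConversions.SaveAsOOO and the input file names are hypothetical placeholders: the macro has to exist in your own OpenOffice.org macro library, and the macro:/// argument form is assumed here rather than taken from the quoted text.

```python
import os


def conversion_commands(paths, macro="Standard.MyConversions.SaveAsOOO"):
    """Build one soffice command line per input file.

    The macro name is a hypothetical placeholder; it must be defined in
    your own OpenOffice.org macro library for the commands to do anything.
    """
    commands = []
    for path in paths:
        # Use absolute paths so the script works from any directory.
        full = os.path.abspath(path)
        commands.append('soffice -invisible "macro:///%s(%s)"' % (macro, full))
    return commands


# Invented file names; in practice the list could come from os.listdir().
script = "\n".join(["#!/bin/sh"] + conversion_commands(["report.doc", "thesis.doc"]))
print(script)
```

The sketch does no quoting beyond the surrounding double quotes, so file names containing quotes or parentheses would need extra care before the generated script is run.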

Comments (none posted)

Streaming Media

Gst-Python 0.10.2 announced

Version 0.10.2 of Gst-Python has been announced. "Gst-Python provides Python bindings for the GStreamer project. These bindings provide access to almost all of the GStreamer C API through an object oriented Python API. This release allows fractions in structures and added vmethods for base classes."

Comments (none posted)

Web Browsers

Minutes of the mozilla.org Staff Meeting (MozillaZine)

The minutes from the January 4, 2006 mozilla.org staff meeting have been announced. "Issues discussed include Upcoming Releases, Marketing, Thunderbird, 1.9 Roadmap, Firefox 2 Process and Calendar. The minutes have been posted to the new mozilla.dev.general newsgroup, which is accessible via news.mozilla.org."

Comments (none posted)

Languages and Tools

Caml

Caml Weekly News

The January 10-17, 2006 edition of the Caml Weekly News is out with new Caml articles. Topics include: Pickling for OCaml?, Marching Tetrahedra and A bunch of ocaml programs.

Full Story (comments: none)

Java

An Exception Handling Framework for J2EE Applications (O'ReillyNet)

O'Reilly is running an article on J2EE exception handling. "One common hassle in J2EE development is exception handling: many apps devolve into a mess of inconsistent and unreliable handling of errors. In this article, ShriKant Vashishtha introduces a strategy for predictably collecting your exception handling in one place."

Comments (none posted)

Secure Java apps on Linux using MD5 crypt (IBM developerWorks)

Vladimir Silva shows how to interface Java to the PAM system in an IBM developerWorks article. "If you are a security developer and need to interface a Java application with the local operating system user registry, what do you do? This article gives you the answer: UNIX/Linux PAM (Pluggable Authentication Module)-compatible systems that use authentication based on the GNU MD5 extensions to the crypt() system call. I'll describe these extensions and show you a Java implementation of MD5 crypt (using FreeBSD as my UNIX)."

Comments (none posted)

Lisp

GNU CLISP 2.37 released

Version 2.37 of GNU CLISP, a Common Lisp implementation, is out. "This version adds new options to SOCKET-SERVER, changes the way a proxy can be specified for EXT:HTTP-PROXY, treats named pipes correctly, and fixes a few bugs."

Full Story (comments: 1)

HEUTE 1.0 announced

The initial public release of HEUTE is available. "HEUTE (Hierarchical Extensible Unit Testing Environment for Common LISP) is a unit testing framework written in Common Lisp. It features a hierarchical approach to testing in which a test suite is represented by a CLOS class, with subclasses corresponding to sub-suites. A suite is considered passed only when its sub-suites also pass."

Full Story (comments: none)

Perl

What is Perl 6? (O'Reilly)

chromatic looks at the motivation for designing Perl 6 in an O'Reilly article. "Perhaps the biggest imperfection of Perl 5 is its internals. Though much of the design is clever, there are also places of obsolescence and interdependence, as well as optimizations that no one remembers, but no one can delete without affecting too many other parts of the system. Refactoring an eleven-plus-year-old software project that runs on seventy-odd platforms and has to retain backwards compatibility with itself on many levels is daunting, and there are few people qualified to do it. It's also exceedingly difficult to recruit new people for such a task."

Comments (none posted)

PHP

PHP 4.4.2 and 5.1.2 Released

Version 4.4.2 of PHP has been announced. "This release addresses a few small security issues, and also corrects some regressions that occurred in PHP 4.4.1. All PHP 4 users are encouraged to upgrade to this release."

Also, development version 5.1.2 of PHP is out. "This release combines small feature enhancements with a fair number of bug fixes and addresses three security issues."

Comments (none posted)

Python

pyPdf 1.1 released

Version 1.1 of pyPdf, a Python-based PDF toolkit, is out. Changes include a new page rotation capability, improved PDF reading support and PDF 1.5 support.

Comments (none posted)

Dr. Dobb's Python-URL!

The January 18, 2006 edition of Dr. Dobb's Python-URL! is online with new Python articles and resources.

Full Story (comments: none)

Ruby

Ruby Weekly News

The January 15th, 2006 edition of the Ruby Weekly News looks at the latest discussions from the ruby-talk mailing list.

Comments (none posted)

Tcl/Tk

Dr. Dobb's Tcl-URL!

The January 17, 2006 edition of Dr. Dobb's Tcl-URL! is online with the latest Tcl/Tk articles and resources.

Full Story (comments: none)

Page editor: Forrest Cook


Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds