Plotting tools for Linux: gnuplot
Gnuplot is a program for creating plots, charts, and graphs that runs on Linux as well as on a wide variety of free and proprietary operating systems. The purpose of a plot, in general, is to help you understand data or functional relationships by representing them visually. Some plotting programs, including gnuplot, can also perform calculations and massage data, which is often convenient.
Some data-plotting tools are complete solutions, standalone programs that can be controlled through a command line, a GUI, or both. Others exist as subsystems of various tools, or as libraries available for a specific programming language. This article will introduce a prominent example of the first type.
Gnuplot is one of the earliest open-source programs in wide use. It's free enough to be packaged with Debian, for example, but has an idiosyncratic license, with unusual restrictions on how modifications to the source code may be distributed. The name is not derived from the GNU project, with which it has no particular relationship, but came about when the original authors, who had decided on the name "newplot", discovered that this name was already in use.
You may already be using gnuplot without knowing it. The plotting facilities of Maxima, Octave, gretl, the Emacs graphing calculator, and statist, for example, all use gnuplot.
Most of gnuplot is written in C and is quite fast and memory-efficient. Its output is highly customizable, and can be seen in a multitude of scientific and technical publications. It's also a popular choice with system administrators who want to generate graphs of server performance, as it can be run from a script on a remote machine and forward its graphs over X11, without having to transfer the usually voluminous data sets. The same arrangement makes gnuplot useful for monitoring the progress of simulations running on remote machines or clusters.
Gnuplot has an interactive command-line prompt, can run script files stored on disk, can be controlled through a socket connection from any language, and has interfaces in everything from Fortran to Clojure. There are also several GUIs for gnuplot, including an Emacs mode, but they are not widely used, since much of gnuplot's power arises from its scriptability.
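As a rough illustration of driving gnuplot from another language, here is a minimal Python sketch. It does not use a socket; it simply pipes a script to gnuplot's standard input, which is one common way to do it. The helper names build_script and run_gnuplot are made up for this example, and the sketch assumes a gnuplot binary on the PATH that supports the dumb (ASCII art) terminal.

```python
import shutil
import subprocess


def build_script(expr="besj0(x)", title="J0", xmax=20):
    """Return a small gnuplot script as text (hypothetical helper)."""
    return "\n".join([
        "set term dumb",                   # ASCII-art output on stdout
        f"set xrange [0:{xmax}]",
        f"plot {expr} title '{title}'",
    ]) + "\n"


def run_gnuplot(script, executable="gnuplot"):
    """Pipe the script to gnuplot's stdin; return its stdout, or None if
    the executable cannot be found."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run([path], input=script, text=True,
                            capture_output=True)
    return result.stdout


if __name__ == "__main__":
    output = run_gnuplot(build_script())
    # Fall back to showing the script itself when gnuplot is not installed.
    print(output if output is not None else build_script())
```

The same script text could just as well be saved to a file and run with gnuplot file.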
Installation
Gnuplot is actively developed, with desirable new features added regularly. If you have Octave or Maxima installed, then you already have gnuplot somewhere, although you might not have a recent version. Binaries are probably available from your distribution's package management system, but they are likely to lag approximately one major version behind the shiniest.
The solution is to follow the Download link from gnuplot headquarters to get the source tarball of the latest stable release (or a pre-release version if you can't live without some feature in development). A simple ./configure and make will get you a working gnuplot, but you probably want to check for some dependencies first.
Having the right packages installed before compiling gnuplot will ensure that the resulting binary supports the "terminals" that you want to use. In gnuplot land, a terminal is the form taken by the output: either a file on disk or a (possibly interactive) display on the screen. Gnuplot is famous for the long list of output formats that it supports. You can create graphs using ASCII art on the console, in a canvas on a web page, in various ways for LaTeX and ConTeXt, as a rotatable, zoomable object in an X window, for Tektronix terminals, for pen plotters, and much else, including PostScript, EPS, PNG, SVG, and PDF.
Support for most of this will happen without any special action on your part. But you will want to make sure that you have compiled in the highest quality, anti-aliased graphics formats, using the Cairo libraries; this makes a noticeable difference in the quality of the results. You will need to have the development libraries for Cairo and Pango installed. On my Ubuntu laptop, installing the packages libcairo2-dev and libpango1.0-dev is sufficient for the latest stable gnuplot version (v. 4.6.6). Pick up libwxgtk2.8-dev while you're at it: it will add support for a wxWidgets interactive terminal that's a higher quality alternative to the venerable X11 display. Finally, if you envision using gnuplot with LaTeX, you might want the Lua development package, which enables gnuplot's tikz terminal.
Using gnuplot
Gnuplot comes with extensive help. For extra information about any of the commands used below, try typing "help command" at the gnuplot interactive prompt. For more, try the official documentation [PDF], the many examples on the web, or the two books about gnuplot: one by Philipp K. Janert and one by me. The command stanzas here can be entered as shown at the gnuplot prompt or saved in a file and executed with: gnuplot file.
Here is how to plot a pair of curves:
set title 'Bessel Functions of the First and Second Kinds'
set samp 1000
set xrange [-.05:20]
set y2tics nomirror
set ytics nomirror
set ylabel 'Y0'
set y2label 'J0'
set grid
plot besy0(x) axes x1y1 lw 2 title 'Y0', besj0(x) axes x1y2 lw 2 title 'J0'
The set ytics and set y2tics commands, together with the matching label commands, create independent sets of tics and labels on the two vertical axes. The final line illustrates the usual form of gnuplot's 2D plot command, and some of the program's support for special functions. The axes parameters tell gnuplot which pair of axes to use for each curve, lw is an abbreviation for "linewidth" (gnuplot's default is pretty thin), and each curve is assigned its own title, which is used in the automatically generated legend. The sequence of colors used to distinguish the curves is chosen automatically, but can, of course, be specified manually as well.
Gnuplot also excels at all kinds of 3D plots. Here is a surface plot with contours projected on the x-y plane. There is a vector field embedded in the surface as well.
set samp 200
set iso 100
set xrange [-4:4]
set yrange [-4:4]
set hidd front
set view 45, 75
set ztics .5
set key off
set contour base
set style arrow 1 filled lw 3 lc 'black'
f(x,y) = x**2+y**2 < 2.0 ? x**2+y**2 > 0.5 ? besj0(x**2+y**2) : NaN : NaN
splot besj0(x**2+y**2), '++' using 1:2:(f($1,$2)):\
( -.5*sin(atan2($2,$1)) ):( .5*cos(atan2($2,$1)) ):(0)\
every 4:2 w vec as 1
The set hidd front command has the effect of making the surface opaque to itself but transparent to the other elements in the plot. The set style command is an example of gnuplot's commands for defining detailed styles for lines, arrows, and anything else that can be made into a plot element. After this command is entered, arrowstyle 1 (or as 1) can be referred to wherever we want a black arrow with a filled arrowhead.
This script defines a function, f(x,y), using gnuplot's ternary notation (with an embedded ternary form to implement two conditions) in concert with NaNs, to skip a range of coordinates when plotting. The function is used on the following line to plot the vector field over only part of the surface.
Two additional details may be worth noting in this example. First, in gnuplot, NaN (for "not a number") is a special value that you can use in conditional statements where you want to disable plotting, as we did here. You can also use "1/0" and some other undefined values, but using NaN makes the code easier to understand. Second, gnuplot's ternary notation is borrowed from C. In the statement
A ? B : C
the expression evaluates to B if A is true, and to C otherwise. To test two conditions, as we do here, B is replaced by another ternary expression.
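For readers more familiar with Python, the nested ternary in f(x,y) can be sketched with Python's conditional expressions. This is only an illustration of the logic: the name ring_only is invented here, and math.cos stands in for besj0, since the Python standard library has no Bessel functions.

```python
import math


def ring_only(x, y, surface=math.cos):
    """Mirror of the gnuplot function f(x,y): return surface(x**2 + y**2)
    only where 0.5 < x**2 + y**2 < 2.0, and NaN everywhere else.

    `surface` is a stand-in for gnuplot's besj0 (an assumption made so the
    example needs nothing beyond the standard library)."""
    r2 = x**2 + y**2
    # gnuplot:  r2 < 2.0 ? r2 > 0.5 ? surface(r2) : NaN : NaN
    return (surface(r2) if r2 > 0.5 else math.nan) if r2 < 2.0 else math.nan


if __name__ == "__main__":
    for point in [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]:
        print(point, ring_only(*point))
```

Only the middle point lies inside the ring, so it is the only one that yields a number; the other two produce NaN, which gnuplot treats as "nothing to draw here".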
The splot command is the 3D version of plot. The part before the comma plots our Bessel function again, this time as a surface depending on x and y. The rest of it plots the vector field of a circular flow as an array of arrows originating on the surface. Vector plotting uses gnuplot's data graphing syntax, which refers to columns of data ($1 and $2 instead of x and y). There are six components per vector: the three spatial coordinates of the arrow's tail, followed by the three components of its displacement from tail to head. Finally, the every clause skips some grid points to avoid crowding, and we invoke our defined arrow style at the end.
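To make the six columns concrete, the following Python sketch computes one row of the vector data the way the using clause does. The function name arrow_row is hypothetical, and z is passed in rather than computed, to keep the example independent of any Bessel function implementation.

```python
import math


def arrow_row(x, y, z, length=0.5):
    """Return (x, y, z, dx, dy, dz) for one arrow of the circular flow.

    The tail sits at (x, y, z); the displacement is tangent to the circle
    through (x, y), has the given length, and stays horizontal (dz = 0)."""
    theta = math.atan2(y, x)
    dx = -length * math.sin(theta)
    dy = length * math.cos(theta)
    return (x, y, z, dx, dy, 0.0)


if __name__ == "__main__":
    row = arrow_row(3.0, 4.0, 0.2)
    print(row)
    # The displacement is perpendicular to the radius vector (x, y).
    print("dot product with radius:", row[0] * row[3] + row[1] * row[4])
```

Because each displacement is perpendicular to the line from the origin to the arrow's tail, the arrows together trace out a flow circulating around the z axis.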
LaTeX support
Gnuplot can integrate with the LaTeX document processing system in several ways. Most of these allow gnuplot to calculate and draw the graphic elements while handing off the typesetting of any text within the plot (including, of course, mathematical expressions) to LaTeX. This is desirable because, first, TeX's typesetting algorithms produce superior results, and, second, the labels that are typeset as part of the graph will harmonize with the text of the paper in which it is embedded. The results look like the figure here, which is a brief excerpt from an imaginary math textbook.
Notice that the fonts used in the figure labels and the text in the paragraph are the same — everything is typeset by LaTeX (even the numbers on the axes).
There is a two-step procedure to produce this result. First, we create the figure in gnuplot, using the cairolatex terminal:
set term cairolatex pdf
set out 'fig3.tex'
set samp 1000
set xrange [-4:4]
set key off
set label 1 '\huge$\frac{1}{\sqrt{2\pi}\sigma}\,e^{-\frac{x^2}{2\sigma^2}}$' at -3.5,.34
set label 2 '\Large$\sigma = 1$' at 0.95,.3
set label 3 '\Large$\sigma = 2$' at 2.7,.1
plot for [s=1:2] exp(-x**2/(2*s**2))/(s*sqrt(2*pi)) lw 3
set out
We've used LaTeX syntax for the labels. Running this through gnuplot creates a file called fig3.tex, which we include in the LaTeX document, listed in the Appendix.
The final step is to process the document with pdflatex. This is just one of several workflows for integrating gnuplot with LaTeX. If you use tikz to draw diagrams in your LaTeX documents, for example, you can extend it with calls to gnuplot from within the tikz commands.
Gnuplot and LaTeX share a family resemblance. They are both early open-source programs that demand a certain amount of effort on the part of the user to achieve the desired results, but that repay that effort handsomely. They're both popular with scientists and other authors of technical publications. Both programs are unusually well documented, by their creators and by a cadre of third parties. And both systems, originating in an era of more anemic hardware, do a great deal with a modest amount of machine memory. Gnuplot has a good reputation for the ability to plot large data files that cause most other plotting programs to crash or exhaust the available RAM.
Analysis
Gnuplot can do more than just plot data and functions. It can perform several types of data analysis and smoothing — nothing like a specialized statistics platform, but enough to fit functions or plot a smoothed curve through noisy data. To illustrate, we first need to create some noisy data. The Appendix contains a little Python program that will write the coordinates of a Gaussian curve to a file, called rn.dat, with some pseudorandom noise added to the ordinates.
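For readers without the Appendix at hand, a data file of this kind could be produced along the following lines. This is a sketch, not the Appendix listing: the number of points, the x range, the noise level, and the function names are all assumptions chosen for illustration.

```python
import math
import random


def noisy_gaussian(n=200, xmin=-4.0, xmax=4.0, amplitude=1.0,
                   width=1.0, noise=0.05, seed=0):
    """Return n (x, y) pairs on a Gaussian curve with Gaussian noise added
    to y. All default values are illustrative assumptions."""
    rng = random.Random(seed)          # seeded, so runs are reproducible
    step = (xmax - xmin) / (n - 1)
    points = []
    for i in range(n):
        x = xmin + i * step
        y = amplitude * math.exp(-x * x / (2.0 * width ** 2))
        points.append((x, y + rng.gauss(0.0, noise)))
    return points


def write_data(path, points):
    """Write one 'x y' pair per line, a layout gnuplot reads directly."""
    with open(path, "w") as fh:
        for x, y in points:
            fh.write(f"{x:.6f} {y:.6f}\n")


if __name__ == "__main__":
    write_data("rn.dat", noisy_gaussian())
```

With these defaults the underlying curve is exp(-x**2/2), which corresponds to a = 1 and b = 0.5 in a model of the form a*exp(-b*x**2).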
Suppose we are presented with this data and we want to fit a function to it. Since it looks bell-shaped to us, we'll attempt to fit a Gaussian. That kind of curve has two parameters, its amplitude and its width, or standard deviation. We could write a program to search the parameter space of these two numbers to optimize the fit of the curve to the data, or we could ask gnuplot to do it for us. Gnuplot's built-in fitting routine is invoked like this:
fit a*exp(-b*x**2) 'rn.dat' via a,b
When that command is typed at gnuplot's interactive prompt, gnuplot returns its best estimates for the free parameters a and b, along with its confidence in those estimates. It also remembers the estimated values, so we can plot the fit function on top of the data:
plot 'rn.dat' pointtype 7, a*exp(-b*x**2) lw 5 lc 'black'
gets us this plot:
The pointtype specifier selects the style of marker used in the scatterplot of the data. There is a different list for every terminal type, which you can see by typing test at the gnuplot prompt. We've selected a thick line width (lw 5) and a black line color (lc 'black').
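To give a feel for what a fitting routine of this kind does, here is a small Python sketch of a damped least-squares iteration (in the Levenberg-Marquardt style) for the same two-parameter model. It is not gnuplot's code and makes no attempt to reproduce its output or error estimates; the function names, starting values, and damping schedule are all assumptions made for the example.

```python
import math


def model(x, a, b):
    return a * math.exp(-b * x * x)


def sum_sq(points, a, b):
    """Sum of squared residuals between the data and the model."""
    return sum((y - model(x, a, b)) ** 2 for x, y in points)


def fit_gaussian(points, a=1.0, b=1.0, iterations=200):
    """Estimate a and b in a*exp(-b*x**2) by damped least squares."""
    lam = 1e-3                                # damping factor
    err = sum_sq(points, a, b)
    for _ in range(iterations):
        # Accumulate J^T J and J^T r, where J holds the model derivatives.
        saa = sab = sbb = ga = gb = 0.0
        for x, y in points:
            e = math.exp(-b * x * x)
            da, db = e, -a * x * x * e        # d(model)/da, d(model)/db
            r = y - a * e                     # residual
            saa += da * da
            sab += da * db
            sbb += db * db
            ga += da * r
            gb += db * r
        # Solve the damped 2x2 normal equations by Cramer's rule.
        m11, m22 = saa * (1.0 + lam), sbb * (1.0 + lam)
        det = m11 * m22 - sab * sab
        if det == 0.0:
            break
        step_a = (ga * m22 - sab * gb) / det
        step_b = (m11 * gb - sab * ga) / det
        try:
            new_err = sum_sq(points, a + step_a, b + step_b)
        except OverflowError:                 # wild step: treat as rejected
            new_err = math.inf
        if new_err < err:                     # accept step, trust model more
            a, b, err = a + step_a, b + step_b, new_err
            lam /= 10.0
        else:                                 # reject step, damp harder
            lam *= 10.0
            if lam > 1e10:
                break
    return a, b


if __name__ == "__main__":
    data = [(i * 0.1 - 4.0, model(i * 0.1 - 4.0, 2.0, 0.5)) for i in range(81)]
    print(fit_gaussian(data))
```

Steps that reduce the squared error are kept and the damping is relaxed; steps that increase it are discarded and the damping is raised, which keeps the search from overshooting when the starting guess is poor.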
Gnuplot is endowed with some simple language constructs providing blocks, loops, and conditional execution. This is enough to do significant calculation without having to resort to external programs. Using looping, you can create animations on the screen. Try the following gnuplot script to get a rotating surface plot:
set term wxt persist
set yr [-pi:pi]
set xr [-pi:pi]
end = 200.0
do for [a=1:end] {set view 70, 90*(a/end); splot cos(x)+sin(y); pause 0.1}
The first line tells gnuplot not to delete the window after the script is complete, which it will otherwise do if these commands are not run interactively. Note that end is defined as a floating-point number (200.0), so that a/end is not truncated by integer division. The last line contains the loop that creates the animation, and the pause command adds a tenth of a second delay between frames.
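The rotation schedule produced by the loop can be checked outside gnuplot. The Python sketch below lists the second view angle for each frame, and also shows what the same arithmetic would give with integer operands; the function names are invented for this illustration.

```python
def view_angles(frames=200, sweep=90.0):
    """Second argument of 'set view' for each frame, as in the script."""
    end = float(frames)
    return [sweep * (a / end) for a in range(1, frames + 1)]


def truncated_angles(frames=200, sweep=90):
    """The same schedule computed with truncating integer division."""
    return [sweep * (a // frames) for a in range(1, frames + 1)]


if __name__ == "__main__":
    angles = view_angles()
    print(angles[0], angles[-1])      # first and last frame angles
    stuck = truncated_angles()
    print(stuck[0], stuck[-1])        # integer version only moves at the end
    print("minimum running time (s):", len(angles) * 0.1)
```

With the floating-point value the view turns smoothly through a quarter turn in steps of 0.45 degrees; with integers every frame but the last would sit at zero.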
Conclusion
Gnuplot is easy to find in the wild. Its output appears in many of the math and science entries on Wikipedia; my article about calculating Fibonacci numbers; the book Mechanics by Somnath Datta, an example of a complex text with closely integrated intricate plots, using LaTeX and gnuplot; the book Modeling with Data: Tools and Techniques for Scientific Computing by Ben Klemens, using gnuplot's LaTeX terminals; and the free online text Computational Physics by Konstantinos Anagnostopoulos, just to give a few examples. In the system administrator field, check out the articles on benchmarking Apache, graphing performance statistics on Solaris, and using gnuplot with Dstat.
Gnuplot is a good choice if you have large data sets, if you prefer a language-agnostic solution, if you need to automate your graphing, and especially if you use LaTeX.
| Index entries for this article | |
|---|---|
| GuestArticles | Phillips, Lee |
