
The R Project for Statistical Computing

The R project is building an open-source, GPL-licensed language for statistical computing and graphics. R has its roots in the S language, which was originally developed at AT&T's Bell Labs. See the Evolution of S document for a complete history of the language. The R project was started at the University of Auckland and now includes a lengthy list of contributors. R is being developed under the guidance of The R Foundation for Statistical Computing.

The What is R? document describes R:

R can be considered as a different implementation of S. There are some important differences, but much code written for S runs unaltered under R. R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible. The S language is often the vehicle of choice for research in statistical methodology, and R provides an Open Source route to participation in that activity.

The R environment contains an integrated set of software tools including:

  • A data storage facility.
  • A suite of matrix and array calculation operators.
  • A collection of intermediate tools for data analysis.
  • On-screen and printed graphical output for data analysis.
  • An interpreted programming language for manipulating data.
To see R in action, take a look at some of the Screen Shots. The R project's manuals are available (in PDF format) on the project documentation page. Further information is available from the R FAQ document, including a lengthy list of add-on packages.
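
For a flavor of the tools listed above, here is a minimal sketch of an R session. It assumes only base R and the `cars` data set that ships with it; the variable names `m` and `fit` are arbitrary choices for illustration.

```r
# Data storage: 'cars' is a small data frame bundled with R
# (columns 'speed' and 'dist').
data(cars)

# Matrix and array operators: build a 2x3 matrix and multiply it
# by its transpose to get a 2x2 matrix.
m <- matrix(1:6, nrow = 2)
m %*% t(m)

# Data analysis: fit a linear model of stopping distance on speed.
fit <- lm(dist ~ speed, data = cars)
summary(fit)

# Graphics: scatter plot of the data with the fitted line drawn over it.
plot(cars)
abline(fit)
```

Typing these lines at the interactive R prompt prints each result as it is computed, since R is an interpreted language.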

Version 2.0.0 of R was released this week. "This new release marks more a coming of age than a radical change of the product. Since the release of 1.0.0 on February 29, 2000, R has developed steadily and settled on a release cycle with a 'dot-release' two times per year."

New features available in R 2.0.0 include:

  • Support for namespaces.
  • Exception handling constructs.
  • Support for formal methods and classes.
  • Improved garbage collection.
  • Generalized I/O objects.
  • A new grid subsystem for graphics.
  • A lattice package for producing multi-frame layouts.
  • A port to Mac OS X.
  • Support for Tcl/Tk-based GUI development.
  • The bundling of widely used packages.
  • Improved configuration scripts.
  • Bug fixes.
The CHANGES document has a more detailed list of what is new in this version.
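
Two of the listed items, formal classes and methods and exception handling, can be sketched in a few lines. This is only an illustration: the class `Sample` and the generic `spread` are hypothetical names invented here, not part of R or of any package.

```r
library(methods)  # formal (S4) classes and methods

# Hypothetical class for illustration: a wrapper around a numeric vector.
setClass("Sample", representation(values = "numeric"))

# Hypothetical generic plus a method that applies to 'Sample' objects.
setGeneric("spread", function(x) standardGeneric("spread"))
setMethod("spread", "Sample", function(x) diff(range(x@values)))

s <- new("Sample", values = c(2, 7, 4))
spread(s)  # difference between the largest and smallest value

# Exception handling: stop() signals an error, and tryCatch() passes
# the condition object to the 'error' handler instead of aborting.
result <- tryCatch(
  stop("bad input"),
  error = function(e) paste("caught:", conditionMessage(e))
)
result
```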

If you are looking for an extensive set of tools for visualizing data, R is certainly worth investigating. The source code for R is available from The Comprehensive R Archive Network (CRAN).



The R Project for Statistical Computing

Posted Oct 7, 2004 16:27 UTC (Thu) by dougm (guest, #4615)

I'll just mention that there is a project called PL/R that allows you to write PostgreSQL server-side functions in R. This could be useful if you need to do extensive statistics calculations on data stored in PostgreSQL...

I have no connection with the project, not even as a user, but I've heard good things about it on the mailing lists.

http://www.joeconway.com/plr/

The R Project for Statistical Computing

Posted Oct 7, 2004 19:05 UTC (Thu) by Wills (guest, #1813)

I can highly recommend R; it is probably the best statistics software for Linux. It has lots of advanced stats techniques together with its own functional programming language and a huge range of add-on packages. It also includes

Another interesting package which I highly recommend is Albert Gräf's Q (http://q-lang.sourceforge.net/), a powerful functional/equational programming language which now has a set of Q multimedia examples including audio and MIDI with a KDE interface.

R and emacs

Posted Oct 8, 2004 14:16 UTC (Fri) by ecashin (subscriber, #12040)

If you are an emacs user and an R user, there's
the "Emacs Speaks Statistics" package for emacs.

It makes R even more convenient to use. It was
a great help in managing and visualizing test data
when I was a grad student.

The R Project for Statistical Computing

Posted Oct 8, 2004 15:46 UTC (Fri) by eddelbuettel (subscriber, #7053)

Small correction: The 'new features in R 2.0.0' segment in the article
confuses most of the achievements with those added in the 4.5 years since R
1.0.0 came out -- see the original press release by the R Core team for the
difference.

That said, R is a terrific system for all kinds of data work.

Odd writing style

Posted Oct 8, 2004 16:58 UTC (Fri) by dberkholz (subscriber, #23346)

In journalism, hard news stories are written in the inverted pyramid style, with the real news at the top and background farther down. This is the opposite -- we read through all the blah blah background about R, trying to dig for what's actually new about this (2.0.0 is released).

It would be great to see stories written like that in the future.

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds