By Jonathan Corbet
June 22, 2011
It has been clear for some years now that static analysis tools have the
potential to greatly increase the quality of the software we write.
Computers are well placed to analyze source code and look for patterns
which could indicate bugs. The "Stanford Checker" (later commercialized by
Coverity) found a great many defects in a number of free software code
bases. But within the free software community itself, the tools we have
written are relatively scarce and relatively primitive. That situation may
be coming to an end, though; we are beginning to see the development of
frameworks which could become the basis for a new set of static analysis
tools.
The key enabling changes have been happening in the compiler suites.
Compilers must already perform a detailed analysis of the code in order to
produce optimized binaries; it makes sense to make use of that analysis for
other purposes as well. Some of that already happens in the compiler
itself; GCC and LLVM can produce a much wider set of warnings than was once
possible. These warnings are a good start, but much more can be done.
That is especially true if projects can create their own analysis tools for
project-specific checks; projects of any size tend to have invariants and
rules of their own which go beyond the rules for the source language in
general.
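To make the idea of a project-specific check concrete, here is a minimal sketch. It does not use any of the GCC plugin interfaces discussed in this article; it relies only on Python's standard `ast` module and checks Python source rather than C. The rule it enforces (no direct calls to a function named `unsafe_copy()`) and every name in it are assumptions invented for illustration.

```python
import ast

# Hypothetical project rule: unsafe_copy() must never be called directly.
FORBIDDEN_CALLS = {"unsafe_copy"}


def find_forbidden_calls(source, forbidden=FORBIDDEN_CALLS):
    """Return sorted (line, name) pairs for each direct call to a forbidden function."""
    tree = ast.parse(source)
    hits = []
    for node in ast.walk(tree):
        # Only plain calls such as name(...) are matched; method calls
        # and aliased imports are deliberately ignored in this sketch.
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in forbidden:
                hits.append((node.lineno, node.func.id))
    return sorted(hits)


# Invented sample input: one call breaks the rule, one does not.
SAMPLE = """\
def save(buf, data):
    unsafe_copy(buf, data)
    checked_copy(buf, data)
"""

if __name__ == "__main__":
    for line, name in find_forbidden_calls(SAMPLE):
        print(f"line {line}: direct call to {name}() violates project rule")
```

A real checker built on a compiler plugin would work from the compiler's internal representation instead of a syntax tree, which is what lets it follow data flow across statements; the sketch only shows the general shape of walking a program and reporting matches against a local rule.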
The FSF was, for years, hostile to the idea of making it easy to plug
analysis modules into GCC, fearing that a plugin mechanism would enable the
creation of proprietary modules. After some years of deliberation, the FSF
rewrote the license exception for its runtime modules in a way that
addressed the proprietary module worries; since then, GCC has had plugin
module support. The use of that feature has been relatively low, so far,
but there are signs that the situation may be beginning to change.
An early user of the plugin mechanism was the Mozilla project, which
created two modules (Dehydra and Treehydra)
to enable the writing of analysis code in JavaScript. These tools have
seen some use within Mozilla, but development there seems to have slowed to
a halt. The mailing list is moribund and the software does not appear to
have seen an update in some time.
An alternative is GCC MELT. This
project provides a fairly comprehensive plugin which allows the writing of
analysis code in a Lisp-like language. This code is translated to C and
turned into a plugin which can be invoked by the compiler. MELT is
extensively documented; there are also slides from a couple of tutorials on
its use.
MELT seems to be a capable system, but there do not appear to be a lot of
modules written for it in the wild. One does not need to look at the
documentation for long to understand why; the "basic hints" start with:
"You first need to grasp GCC main internal representations (notably
Tree & Gimple & Gimple/SSA)." MELT author Basile
Starynkevitch's 130-slide
presentation on MELT [PDF] does not get past the introductory GCC material
until slide 85. MELT, in other words, requires a fairly deep
understanding of GCC; it's not something that an outsider can pick up
quickly. The lack of easy examples to work from is not helpful either.
More recently, David Malcolm has announced
the release of a new framework which enables the creation of plugins as
Python scripts which run within the compiler. His immediate purpose is to
create tools for the development of the Python system itself; the most
significant checker in his code tries to ensure that object reference
counts are managed properly. But he sees the tool as potentially being
useful for a number of other projects and even for prototyping new features
for GCC itself.
At a first glance, David's gcc-python-plugin mechanism suffers from the
same difficulty as MELT - the initial learning curve is steep. It is also
a very young and incomplete project; David has, by his own admission, only
brought out the functionality he had immediate need for. The analysis code
seems more approachable, though, and the mechanism for running scripts
directly in the compiler seems more natural than MELT's compile-from-Lisp
approach. It may be that this plugin will attract more users and
developers than MELT as a result.
Or it may just be that your editor, being rather more proficient in Python
than in Lisp, naturally likes the Python-based solution better.
In any case, one conclusion is clear: writing static analysis plugins for
GCC is currently too hard; even capable developers who approach the problem
will need to dedicate a significant chunk of time to understanding the
compiler before they can begin to achieve anything in this area. The
efforts described above are a big step in the right direction, but it seems
clear that they are the foundations upon which more support code must be
built. It's hard to say when this work will reach the tipping point that inspires
a flood of new analysis code, but it's easy to say that we are not there
yet.
GCC is not where all the action is, though; there is also an interesting static analysis
tool which has been built with the LLVM clang compiler. Documentation
of this tool is scarce, but it appears to be capable of detecting some
kinds of memory leaks, null pointer dereferences, the computation of unused
values, and more. Some patches have been posted to add a plugin feature to
this tool, but they do not seem to have proceeded very far yet.
Back in May, John Smith ran the checker on
several open source projects to see what kind of results would emerge.
Those results have been posted on the net;
they show the kind of potential problems that can be found and the nice
HTML output that the checker can create. Some of the warnings are clearly
spurious - always a problem with static analysis tools - but others seem
worth looking into. In general, the clang static analyzer seems, like the
other tools mentioned here, to be in a relatively early state of
development. Things are moving fast, though; this tool is worth keeping an
eye on.
Actually, that is true of the static analysis area in general. The lack of
good analysis tools has been a bit of a mystery - given the number of
developers we have, one would think that a few would scratch that
particular itch. Your editor would not have minded living in a world with
one less version control system but with better analysis tools. But the
nature of free software development is that people work on problems that
interest them. As the foundations of our static analysis tools get better,
one can hope that more developers will find those foundations interesting
to build on. The entire development community will benefit from the
results.
Brief items
If you think about it GIT actually promotes anti-social software
development; development in small, disconnected silos is not how
software is developed in the real world. Most software is developed
by teams whose members have a variety of skills who need to see
what each other is doing and that's the fundamental reason why GIT
is not a threat to Subversion in the enterprise. It's fine for the
development of the Linux kernel but that model doesn't work for
most companies.
-- David Richards
We clearly want machines that perform human-like tasks. We want
computers that recognize our language and motivations and can take
hints, rather than requiring instructions enumerated in
mind-numbingly tedious detail. But whether we want them to be
conscious and volitional is another question entirely. I don't want
my self-driving car to argue with me about where we want to go
today. I don't want my robot housekeeper to spend all its time in
front of the TV watching contact sports or music videos. And I
certainly don't want to be sued for maintenance by an abandoned
software development project.
-- Charlie Stross
The Document Foundation has posted a draft proposal for a certification
program built around LibreOffice. "TDF provides LibreOffice Certification
and promotes the ecosystem
through his channels and with an aggressive marketing campaign targeted to
corporate users, in order to increase LibreOffice adoption based on
certified professional value added services for migration, integration,
development, support and training. LibreOffice Certification is fee based,
and fee might vary according to the value provided by the partner to The
Document Foundation." It's worth noting that this discussion is
just beginning; TDF is looking for comments on how such a program might
best be designed.
The MediaWiki 1.17.0 release is out. Changes in this release include a new
installer, the "ResourceLoader" framework, better category sorting, and
better Oracle database support. See the release notes for more information.
Mozilla has announced the release of Firefox 5. "The latest version of Firefox includes more than 1,000 improvements and performance enhancements that make it easier to discover and use all of the innovative features in Firefox. This release adds support for more modern Web technologies that make it easier for developers to build amazing Firefox Add-ons, Web applications and websites."
Pari/GP is "a widely used
computer algebra system designed for fast computations in number theory
(factorizations, algebraic number theory, elliptic curves...), but also
contains a large number of other useful functions to compute with
mathematical entities such as matrices, polynomials, power series,
algebraic numbers etc., and a lot of transcendental functions."
Pari 2.5.0 is the first major release in five years. There are numerous
new features which will doubtless make sense to math-intensive people;
click below for the details.
Pyro is "a library that enables you to build applications in which
objects can talk to each other over the network, with minimal programming
effort. You can just use normal Python method calls, with almost every
possible parameter and return value type, and Pyro takes care of locating
the right object on the right computer to execute the method. It is
designed to be very easy to use, and to generally stay out of your way. But
it also provides a set of powerful features that enables you to build
distributed applications rapidly and effortlessly." The 4.7 release
is out; new features include AutoProxy, asynchronous method calls,
simplified server setup, and "part of" a new manual.
"Harmattan Python" is a Python environment for the "MeeGo 1.2 Harmattan
Platform," as found in the newly-announced N9 phone. "This release
is a culmination of several years of hard work and offers the most complete
and full-featured Python programming language support on any mobile
platform." More information can be found on the Harmattan Python page.
Version 2.0 of the Tornado web server is out. "The framework is distinct
from most mainstream
web server frameworks (and certainly most Python frameworks) because it is
non-blocking and reasonably fast. Because it is non-blocking and uses epoll
or kqueue, it can handle thousands of simultaneous standing connections,
which means it is ideal for real-time web services." This release
includes templating changes, Python 3.2 support, and more; see the
release notes for details.
Newsletters and articles
Over at Linux.com, Nathan Willis explores the Media Explorer, a Linux-based media center application. "The practical upshot is that Media Explorer is very lean, and runs very fast. The overview includes five tabs: one each for audio, video, and still images, plus a search tab and custom playlist tab where you can build up a queue. The interface is smoothly animated in Clutter, and is accessible with the keyboard, mouse, or an IR remote. There is even a handy set-up tool that walks you through the process of assigning keypresses on your remote to actions inside the app."
On his blog, Jon Phillips writes about free/libre hardware that implements the
802.15.4 wireless personal area network protocol. The devices come from
Qi Hardware; one fits into the Micro-SD slot of a Ben Nanonote, the other
fits into a standard USB slot. This is all part of the Ben WPAN project,
which describes itself as follows: "Ben WPAN is a project to create an innovative patent-free wireless personal area network (WPAN) that is copyleft hardware. The primary protocol is 6LoWPAN, pronounced "SLoWPAN". The project lead is Werner Almesberger and it involves using the UBB, new testing software, and the Ben Nanonote to produce a next generation wireless personal area network." (Thanks to Paul Wise.)
Page editor: Jonathan Corbet