By Jonathan Corbet
November 19, 2007
Many programs - free and proprietary - offer a plugin interface to make it
easy to add new functionality. In many situations, the existence of a
well-defined plugin interface has been a key driver for the success of the
system as a whole; imagine Firefox, for example, without its extension
mechanism. The GNU compiler collection (GCC) is an example of a complex
system which could benefit from such an interface, but which currently
lacks one. GCC developers have been talking about adding a plugin API, but
it is far from clear that this will be done; how this decision goes may
have major consequences for how GCC works with its wider development
community and the free software community as a whole.
GCC is designed as an extended pipeline of cooperating modules.
Language-specific front-end code parses code in a specific source language and
turns it into a generic, high-level, internal representation. Various
optimization passes then operate on that representation at various levels.
At the back end, an architecture-specific module turns the optimized
internal code into something which will run on the target processor. It's
a long chain of modules; at each point in the chain, there is an
opportunity to see the code in a different stage of analysis and processing.
There can be a lot of value in hooking into an arbitrary point in that
chain. Static analysis tools need to look at a program at different levels
to get a sense for what is going on and look for problems or opportunities
for improvement. New types of optimization passes could be added at
specific points, making the compiler perform better. Project-specific
modules could look for problems (violations of locking rules, perhaps) tied
to a given code base. Language-specific modules can provide tighter
checking for certain constructs. And so on.
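The idea of hooking into a point in the chain can be illustrated with a small C sketch. Nothing here is GCC code: `struct ir`, `pipeline_insert()`, `check_locking()`, and the rest are hypothetical names, and the "internal representation" is reduced to a pair of counters. It only shows the shape of the thing: a chain of passes sharing one representation, with an extra analysis pass slotted in between optimization and code generation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a compiler's internal representation. */
struct ir {
    int stmt_count;      /* statements currently in the IR */
    int lock_violations; /* problems recorded by analysis passes */
};

/* Every stage, built-in or added later, has the same shape. */
typedef void (*pass_fn)(struct ir *);

#define MAX_PASSES 8

struct pipeline {
    pass_fn passes[MAX_PASSES];
    size_t count;
};

/* Insert a pass at position `at`, shifting later passes down.
   Returns 0 on success, -1 if the chain is full or `at` is out of range. */
int pipeline_insert(struct pipeline *p, size_t at, pass_fn fn)
{
    if (p->count >= MAX_PASSES || at > p->count)
        return -1;
    for (size_t i = p->count; i > at; i--)
        p->passes[i] = p->passes[i - 1];
    p->passes[at] = fn;
    p->count++;
    return 0;
}

/* Run every pass, in order, over the same representation. */
void pipeline_run(const struct pipeline *p, struct ir *ir)
{
    assert(p != NULL && ir != NULL);
    for (size_t i = 0; i < p->count; i++)
        p->passes[i](ir);
}

/* Built-in stages (toy behaviour only). */
void front_end(struct ir *ir) { ir->stmt_count = 10; }
void optimize(struct ir *ir)  { ir->stmt_count -= 3; }
void back_end(struct ir *ir)  { printf("emitting %d statements\n", ir->stmt_count); }

/* An added analysis pass: inspects the IR without changing the code. */
void check_locking(struct ir *ir)
{
    if (ir->stmt_count > 5)
        ir->lock_violations++;
}

/* Build the default chain, add the checker between optimization and code
   generation, run it, and return the number of violations recorded. */
int demo_pipeline(void)
{
    struct pipeline p = { .count = 0 };
    struct ir ir = { 0, 0 };

    pipeline_insert(&p, 0, front_end);
    pipeline_insert(&p, 1, optimize);
    pipeline_insert(&p, 2, back_end);
    pipeline_insert(&p, 2, check_locking);  /* hook in before the back end */

    pipeline_run(&p, &ir);
    return ir.lock_violations;
}
```

Calling `demo_pipeline()` from a `main()` runs the four passes in order. The point of the design is that the added pass needs to know only the pass signature and the representation, not how the other stages are built.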
Currently, adding this sort of extension to GCC is not a task for the faint
of heart. The GCC build system is known to be challenging, and GCC's
internal documentation is, one might say, not quite as complete as one
might like. Researcher Alexander Lamaison described it this way:
"Out of the 6 months, 4 were spent learning the GCC internals and
fighting the GCC build process, 1 was spent writing up leaving 1
month of actual productive research... I fully understand that
this can seems strange to people who know GCC like the back of
their hand, but to a newcomer it is a huge task just to write a
single useful line of code. I'm sure many give up before ever
reaching that point."
Once they have overcome these problems, developers adding extensions to GCC run
into another problem: if they want to distribute their work, they end up in
the business of shipping a whole new compiler. Brendon Costa, who works on
the EDoc++ GCC extension, noted:
"I approached the debian maintainers list with a debian package for
this project to see if they would include it in the official
repositories. It was not accepted and the reason for that is
because it includes another patched version of GCC which takes up
too much disk space. They don't want to accept these sorts of
projects because they all effectively require duplicates of the
same code(GCC)"
Both of these problems could be addressed by adding a plugin mechanism to
GCC. A well-defined API would make it relatively easy for developers to
hook a new tool into the compiler without having to understand its
internals or fight with the build process. If an off-the-shelf GCC could
accept plugins, distributors could ship those plugins without having to
include multiple copies of the compiler. Given that we would all benefit from
a more capable GCC, and given the many examples of how other systems have
benefited from a plugin architecture, one would think that the addition of
plugins to GCC would not be a controversial thing.
It seems that one would be wrong, however. In a recent discussion on plugins, two concerns were
raised:
- Adding plugins to GCC would make it easy for people to create and
distribute proprietary enhancements.
- A plugin API would have to be maintained in a stable manner, possibly
impeding further GCC development.
There were also some suggestions that, if the effort put into a plugin API
were, instead, put into documentation of GCC internals, the overall benefit
would be much higher.
The proprietary extensions concern is clearly the big stumbling block,
though. Some participants stated that Richard Stallman has blocked any sort
of GCC plugin mechanism for just this reason - though it should be noted
that Mr. Stallman has not contributed directly to this discussion. But,
given that GCC remains a GNU project, it is not hard to imagine that
anything which could lead to proprietary versions of GCC would encounter a
high level of opposition.
The attentive reader may have spied some similarities between this
discussion and the interminable debate over kernel modules. The kernel's
plugin mechanism has certainly enabled the creation of proprietary
extensions. In the GCC case, it has been suggested that any plugins would
have to be derived products and, thus, covered by the GPL. This, too, is
an argument which has been heard in the kernel context. In that case,
concerns over the copyright status of proprietary modules have kept them
out of most distributions and, in general, cast a cloud over those
modules. Something similar would probably happen to proprietary GCC
modules: they would not be widely distributed, would be the subject of
constant criticism, and would be an impetus for others to replace them with
free versions. It is hard to imagine that there would be a thriving market
for proprietary GCC extensions, just as there is no real market for
proprietary GIMP extensions - even though Photoshop has created just that
kind of market.
It has also been pointed out that the status quo has not prevented the
creation of proprietary GCC variants. As an example, consider GCCfss - GCC
for Solaris systems. This compiler is a sort of Frankenstein-like grafting
of the GCC front end onto Sun's proprietary SPARC code generator. Back
when Coverity's static analysis tools were known as the "Stanford checker,"
they, too, were a proprietary tool built on top of GCC (the current version
does not use GCC, though). People wanting to do proprietary work with GCC
have been finding ways to do so even without a plugin mechanism.
The GCC developers could also look to the kernel for an approach to the API
stability issue and simply declare that the plugin API can change. That
would make life harder for plugin developers and distributors, but it would
make it even harder for any proprietary plugin vendors. An unstable API
would not take away the value of the plugin architecture in general, but it
would avoid putting extra demands onto the core GCC developers.
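One way an explicitly unstable API might be made workable is for the compiler to refuse any plugin built against a different revision of the interface. The C sketch below is purely illustrative and assumes a scheme that is not described in the discussion: `HOST_API_VERSION`, `struct plugin_info`, and `plugin_compatible()` are hypothetical names, not anything GCC provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: bumped by the host whenever the plugin interface changes. */
#define HOST_API_VERSION 7

/* Hypothetical descriptor that each plugin would export. */
struct plugin_info {
    const char *name;
    int built_for_api;  /* HOST_API_VERSION when the plugin was compiled */
};

/* Accept a plugin only if it was built against exactly this interface.
   Returns 1 if the plugin may be loaded, 0 otherwise. */
int plugin_compatible(const struct plugin_info *info)
{
    assert(info != NULL);
    if (info->built_for_api != HOST_API_VERSION) {
        fprintf(stderr,
                "%s: built for API %d, host provides %d; rebuild required\n",
                info->name, info->built_for_api, HOST_API_VERSION);
        return 0;
    }
    return 1;
}
```

Under such a scheme, a plugin distributed with its source is rebuilt along with the compiler, while a binary-only plugin stops loading every time the interface moves.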
In general, GCC is at a sort of crossroads. There are a number of
competing compiler projects which are beginning to make some progress; they
are a long way from rivaling GCC, but betting against the ability of a free
software project to make rapid progress is almost never a good idea. There
is a pressing need for better analysis tools - it is hard to see how we
will make the next jump in code quality without them. Developers would
like to work on other enhancements, such as advanced optimization
techniques, but are finding that work hard to do. If GCC is unable to
respond to these pressures, things could go badly for the project as a
whole; GCC developer Ian Lance Taylor fears the worst in this regard:
"I have a different fear: that gcc will become increasing
irrelevant, as more and more new programmers learn to work on
alternative free compilers instead. That is neutral with regard to
freedom, but it will tend to lose the many years of experience
which have been put into gcc. In my view, if we can't even get
ourselves together to permit something as simple as plugins with an
unstable API, then we deserve to lose."
Back at the beginning of the GNU project, Richard Stallman understood that
a solid compiler would be an important building block for his free system.
In those days, even the creation of a C compiler looked like an overly
ambitious project for volunteer developers, but he made GCC one of his
first projects anyway (once the all-important extensible editor had been
released). His vision and determination, combined with a large (for the
times) testing community with a high tolerance for pain, got the job done.
When Sun decided that a C compiler was no longer something
which would be bundled with a SunOS system, GCC was there to fill in the
gap. When Linus created his new kernel, GCC was there to compile it.
It is hard to imagine how the free software explosion in the early 1990s
could have happened without the GCC platform (and associated tool chain) to
build our code with.
The vision and determination that brought us GCC have always been associated
with a certain conservatism which has held that project back, though. In
the late 1990s, frustration with the management of GCC led to the creation
of the egcs compiler; that fork proved to be so successful that it
eventually replaced the "official" version of GCC. If enough developers
once again reach a critical level of frustration, they may decide to fork
the project anew, but, this time, there are other free compiler projects
around as well. Perhaps, as some have suggested, better documentation is
all that is really required. But, somehow, the GCC developers will want to
ensure that all the energy which is going into improving GCC doesn't wander
elsewhere. GCC needs that energy if it is to remain one of the cornerstones
of our free system.