GCC unplugged
GCC is designed as an extended pipeline of cooperating modules. Language-specific front-end code parses code in a specific source language and turns it into a generic, high-level, internal representation. Various optimization passes then operate on that representation at various levels. At the back end, an architecture-specific module turns the optimized internal code into something which will run on the target processor. It's a long chain of modules; at each point in the chain, there is an opportunity to see the code in a different stage of analysis and processing.
There can be a lot of value in hooking into an arbitrary point in that chain. Static analysis tools need to look at a program at different levels to get a sense for what is going on and look for problems or opportunities for improvement. New types of optimization passes could be added at specific points, making the compiler perform better. Project-specific modules could look for problems (violations of locking rules, perhaps) tied to a given code base. Language-specific modules can provide tighter checking for certain constructs. And so on.
Currently, adding this sort of extension to GCC is not a task for the faint of heart. The GCC build system is known to be challenging, and GCC's internal documentation is, one might say, not quite as complete as one might like. Researcher Alexander Lamaison described it this way:
Once they have overcome these problems, developers adding extensions to GCC run into another problem: if they want to distribute their work, they end up in the business of shipping a whole new compiler. Brendon Costa, who works on the EDoc++ GCC extension, noted:
Both of these problems could be addressed by adding a plugin mechanism to GCC. A well-defined API would make it relatively easy for developers to hook a new tool into the compiler without having to understand its internals or fight with the build process. If an off-the-shelf GCC could accept plugins, distributors could ship those plugins without having to include multiple copies of the compiler. Given that we would all benefit from a more capable GCC, and given the many examples of how other systems have benefited from a plugin architecture, one would think that the addition of plugins to GCC would not be a controversial thing.
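To make the idea concrete, here is a toy sketch in Python of a pipeline that lets outside code hook in after any stage. It is not GCC's design, and GCC offers no such API at the time of this discussion; the stage names, the `Pipeline` class, its `register()` method, and the `lock_checker` plugin are all hypothetical, chosen only to mirror the chain of modules and the "locking rules" example described above.

```python
from collections import defaultdict

# Hypothetical stage names, loosely mirroring front end -> optimizer -> back end.
STAGES = ("parse", "optimize", "codegen")


class Pipeline:
    """Toy compiler pipeline; plugins may observe the code after each stage."""

    def __init__(self):
        self._hooks = defaultdict(list)

    def register(self, stage, callback):
        """Attach callback(stage, code) to run after the named stage."""
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self._hooks[stage].append(callback)

    def run(self, source):
        code = source
        for stage in STAGES:
            code = getattr(self, "_" + stage)(code)
            # Plugins see the representation as it exists at this point.
            for callback in self._hooks[stage]:
                callback(stage, code)
        return code

    def _parse(self, source):
        # "Front end": turn source text into a list of tokens.
        return source.split()

    def _optimize(self, tokens):
        # "Optimizer": drop no-op tokens.
        return [t for t in tokens if t != "nop"]

    def _codegen(self, tokens):
        # "Back end": emit one upper-cased instruction per line.
        return "\n".join(t.upper() for t in tokens)


def lock_checker(findings):
    """Project-specific plugin: report more 'lock' than 'unlock' tokens."""
    def check(stage, tokens):
        if tokens.count("lock") > tokens.count("unlock"):
            findings.append(f"{stage}: lock without matching unlock")
    return check


findings = []
pipeline = Pipeline()
pipeline.register("optimize", lock_checker(findings))
output = pipeline.run("lock nop load store")
```

The point of the sketch is that the checker never touches the pipeline's internals or its build: it only needs the name of a hook point and the shape of the data passed to it, which is what a well-defined plugin API would provide.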
It seems that one would be wrong, however. In a recent discussion on plugins, two concerns were raised:
- Adding plugins to GCC would make it easy for people to create and distribute proprietary enhancements.
- A plugin API would have to be maintained in a stable manner, possibly impeding further GCC development.
There were also some suggestions that, if the effort put into a plugin API were, instead, put into documentation of GCC internals, the overall benefit would be much higher.
The proprietary extensions concern is clearly the big stumbling block, though. Some participants stated that Richard Stallman has blocked any sort of GCC plugin mechanism for just this reason - though it should be noted that Mr. Stallman has not contributed directly to this discussion. But, given that GCC remains a GNU project, it is not hard to imagine that anything which could lead to proprietary versions of GCC would encounter a high level of opposition.
The attentive reader may have spied some similarities between this discussion and the interminable debate over kernel modules. The kernel's plugin mechanism has certainly enabled the creation of proprietary extensions. In the GCC case, it has been suggested that any plugins would have to be derived products and, thus, covered by the GPL. This, too, is an argument which has been heard in the kernel context. In that case, concerns over the copyright status of proprietary modules have kept them out of most distributions and, in general, cast a cloud over those modules. Something similar would probably happen to proprietary GCC modules: they would not be widely distributed, would be the subject of constant criticism, and would be an impetus for others to replace them with free versions. It is hard to imagine that there would be a thriving market for proprietary GCC extensions, just as there is no real market for proprietary GIMP extensions - even though Photoshop has created just that kind of market.
It has also been pointed out that the status quo has not prevented the creation of proprietary GCC variants. As an example, consider GCCfss - GCC for SPARC systems. This compiler is a sort of Frankenstein-like grafting of the GCC front end onto Sun's proprietary SPARC code generator. Back when Coverity's static analysis tools were known as the "Stanford checker," they, too, were a proprietary tool built on top of GCC (the current version does not use GCC, though). People wanting to do proprietary work with GCC have been finding ways to do so even without a plugin mechanism.
The GCC developers could also look to the kernel for an approach to the API stability issue and simply declare that the plugin API can change. That would make life harder for plugin developers and distributors, but it would make it even harder for any proprietary plugin vendors. An unstable API would not take away the value of the plugin architecture in general, but it would avoid putting extra demands onto the core GCC developers.
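One way to picture an explicitly unstable API is a host that makes no compatibility promise and simply refuses any plugin built against a different version of the interface. The Python sketch below is illustrative only: `HOST_API_VERSION`, `Plugin`, `IncompatiblePlugin`, and `load_plugin()` are hypothetical names, not anything GCC or the kernel provides.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical: bumped whenever the host changes its plugin interface.
HOST_API_VERSION = 7


class IncompatiblePlugin(Exception):
    """Raised when a plugin was built against a different API version."""


@dataclass
class Plugin:
    name: str
    built_for_api: int               # API version the plugin was built against
    entry_point: Callable[[], str]   # code the host runs once the plugin loads


def load_plugin(plugin, host_api=HOST_API_VERSION):
    """Run the plugin only if it targets exactly the host's API version."""
    if plugin.built_for_api != host_api:
        raise IncompatiblePlugin(
            f"{plugin.name}: built for API {plugin.built_for_api}, "
            f"host provides API {host_api}"
        )
    return plugin.entry_point()


current = Plugin("lock-checker", HOST_API_VERSION, lambda: "loaded")
stale = Plugin("old-binary-only", HOST_API_VERSION - 1, lambda: "loaded")
```

Under such a scheme, a plugin distributed as source can be rebuilt against each new host and keep working, while one shipped only in binary form stops loading as soon as the version changes; that asymmetry is why an unstable API would weigh most heavily on proprietary plugin vendors.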
In general, GCC is at a sort of crossroads. There are a number of competing compiler projects which are beginning to make some progress; they are a long way from rivaling GCC, but betting against the ability of a free software project to make rapid progress is almost never a good idea. There is a pressing need for better analysis tools - it is hard to see how we will make the next jump in code quality without them. Developers would like to work on other enhancements, such as advanced optimization techniques, but are finding that work hard to do. If GCC is unable to respond to these pressures, things could go badly for the project as a whole; GCC developer Ian Lance Taylor fears the worst in this regard:
Back at the beginning of the GNU project, Richard Stallman understood that a solid compiler would be an important building block for his free system. In those days, even the creation of a C compiler looked like an overly ambitious project for volunteer developers, but he made GCC one of his first projects anyway (once the all-important extensible editor had been released). His vision and determination, combined with a large (for the times) testing community with a high tolerance for pain, got the job done. When Sun decided that a C compiler was no longer something which would be bundled with a SunOS system, GCC was there to fill in the gap. When Linus created his new kernel, GCC was there to compile it. It is hard to imagine how the free software explosion in the early 1990s could have happened without the GCC platform (and associated tool chain) to build our code with.
The vision and determination that brought us GCC have always been associated with a certain conservatism which has held that project back, though. In the late 1990s, frustration with the management of GCC led to the creation of the egcs compiler; that fork proved to be so successful that it eventually replaced the "official" version of GCC. If enough developers once again reach a critical level of frustration, they may decide to fork the project anew, but, this time, there are other free compiler projects around as well. Perhaps, as some have suggested, better documentation is all that is really required. But, somehow, the GCC developers will want to ensure that all the energy which is going into improving GCC doesn't wander elsewhere. GCC needs that energy if it is to remain one of the cornerstones of our free system.
