By Jake Edge
March 18, 2009
As GCC nears its 4.4 release, there are a number of criteria that need to
be met before it can be released. Those
requirements—regressions requiring squashing—have been met, but
things are still stalled. A number of issues were
raised with the changes to the
runtime library
exemption, and both the
release and the branch that will open the GCC tree to new development have
been delayed until those issues are resolved. In the
meantime, however, GCC development is hardly standing still; there are
numerous interesting ideas floating around for new features.
Changing the runtime library exemption was meant to allow the creation of a plugin
API for GCC, so that developers could add additional analysis or
processing of a program as it is being transformed by the compiler. The
Free Software Foundation has long been leery of allowing such a plugin
mechanism because they feared that binary-only GCC plugins of various sorts
might be the result. In January, though, the FSF announced that it would
change the exemption—which allows proprietary programs to link to the GCC
runtime library—in order to exclude code that has been processed by
a non-GPL "compilation process". It is a bit of license trickery that will
only allow plugins that are GPL-licensed.
Shortly after the new exception was
released, there were some seemingly substantive issues raised on the
gcc-devel mailing list. Ian Taylor neatly summarized the concerns, which break down into
three separate issues:
- Code that does not use the runtime library or its interfaces at all might
not fall within the definition of an "Independent
Module". That code would not be affected directly, but combining it with
other, compliant code that does use the GCC runtime libraries would be
disallowed.
- There are questions about whether Java byte code should be
considered a "high-level, non-intermediate language". It is common to
generate Java byte code using a non-GCC compiler, but then process it with
gcj.
- There is also a hypothetical question about LLVM byte code
and whether it should be considered a "high-level, non-intermediate
language" as well.
Definitions of terms make up the bulk of the runtime library exemption, so
it is clearly important to get them right. The first issue in Taylor's
summary seems like just an oversight, easily remedied, but the
last two are a little more subtle.
By and large, the byte code produced as part of a compiler's operation is
just an intermediate form that likely shouldn't be considered a "high-level,
non-intermediate language", but Java and LLVM are a bit different. In both
cases,
the byte code is a documented language, somewhat higher level than assembly
code, which, at least in the case of LLVM, is sometimes hand-written. For
Java, non-GPL compilers are often used, but based on the current exemption
language, the byte code from those compilers couldn't be combined with the
GCC runtime libraries and
distributed as a closed source program. Since LLVM is GPL-compatible,
there are currently no issues combining its output with the GCC runtime,
but Taylor is using it as another example of byte code being generated by
non-GCC tools.
In addition to laying out the issues, Taylor recommends two possible ways
forward. One of those is to clarify the difference between a compiler
intermediate form and a "high-level, non-intermediate language". The other
is to expand the definition of an eligible compilation process to allow any
input to GCC that is created by a program that is not derived from GCC.
The former
distinction seems difficult to pin down in any way that can't be abused
down the road, so the second might be easier to implement. After all, the
GCC developers can determine what kinds of input the compiler is willing to
accept.
This may seem like license minutiae to some—and it is—but it is
important to get it right. The FSF has chosen to go this route to prevent
the—currently theoretical—problem of proprietary GCC plugins,
so they need to ensure that they close any holes.
As Dave Korn pointed out in another thread, releasing
anything using an unclear license could create problems down the road:
If there's a problem with the current
licence that would open a backdoor to proprietary plugins, and we ever release
the code under that licence, evaders will be able to maintain a fork under the
original licence no matter how we subsequently relicense it.
Meanwhile, GCC developers have been working on reducing the regressions so
that 4.4 can be released. Richard Guenther reported on March 13 that there were no
priority 1 (P1) regressions, and fewer than 100 regressions overall, which
would normally mean that a new branch for 4.4 would be created, with 4.5
development opening on the trunk.
But, because of the runtime library exception
questions, Richard Stallman asked the GCC Steering Committee (SC) to wait
for those to be resolved before branching.
The delay has been met with some unhappiness amongst GCC hackers. Without
a 4.4 release branch, interesting new features are still languishing in private
developer branches. As Steven Bosscher put
it:
But there are interactions
between the branches, and the longer it takes to branch for GCC 4.4,
the more difficult it will be to merge all the branches in for GCC
4.5. So GCC 4.5 is *also* being delayed, not just GCC 4.4.
What is also being held back, is more than a year of improvements since GCC
4.3.
Bosscher suggested releasing with the old exemption for 4.4 and fixing the
problems in the 4.5 release. While that could work, it would seem that
Stallman and the SC are willing to give FSF legal some time to clarify the
exemption. In the end, though, the point is somewhat moot as there is, as
yet, no plugin API available.
As part of the discussion of the new runtime library exception, Sean
Callanan sparked a discussion about a plugin
API by mentioning some of the plugins his research group had been
working on. That led to various thoughts about the API, including a wiki page for the plugin project
and one for the API
itself. Diego Novillo has also created a
branch to contain the plugin work.
The basic plan is to look at the existing plugins—most of which have
implemented their own API—to extract requirements for a generalized
API. In addition to the plugins mentioned by Callanan, there are others,
including Mozilla's Dehydra C++ analysis
tool, the Middle
End Lisp Translator (MELT), which is a Lisp dialect that allows the
creation of analysis and transformation plugins, and the MILEPOST self-optimizing
compiler. Once the license issues shake out, it would appear that a plugin
API won't be far behind.
There are other new features being discussed for GCC as well. Taylor has
put out a proposal to support "split
stacks" in GCC. The basic idea is to allow thread stacks to grow and
shrink as needed, rather than be statically allocated at a particular
size. Currently, applications that have enormous numbers of threads must
give each one the worst-case stack size, even when it might go unused
during the life of that thread. Split stacks could therefore reduce memory usage,
allowing more threads to run, and would also relieve
programmers of the need to consider stack size for applications with thousands or
millions of threads.
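As a rough illustration of the memory argument, the following C sketch compares the stack memory reserved when every thread gets a worst-case stack up front with the memory needed when each stack grows in fixed-size segments. It is only a back-of-the-envelope model: the function names and the segment-based accounting are assumptions made for this example, and are not taken from Taylor's proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes reserved when every thread is given the worst-case stack up front. */
uint64_t fixed_stack_bytes(uint64_t threads, uint64_t worst_case_bytes)
{
    return threads * worst_case_bytes;
}

/* Number of segments needed to hold used_bytes of stack.
 * Assumption for this model: a thread always owns at least one segment. */
uint64_t segments_needed(uint64_t used_bytes, uint64_t segment_bytes)
{
    assert(segment_bytes > 0);
    if (used_bytes == 0)
        return 1;
    /* Round up to a whole number of segments. */
    return (used_bytes + segment_bytes - 1) / segment_bytes;
}

/* Bytes in use when each thread's stack grows one segment at a time
 * and every thread uses typical_used_bytes of stack. */
uint64_t split_stack_bytes(uint64_t threads, uint64_t typical_used_bytes,
                           uint64_t segment_bytes)
{
    return threads * segments_needed(typical_used_bytes, segment_bytes)
                   * segment_bytes;
}
```

With illustrative numbers (100,000 threads, an 8MB worst case, 10,000 bytes of stack actually used per thread, and hypothetical 4KB segments), `fixed_stack_bytes()` gives roughly 839GB of reserved address space while `split_stack_bytes()` gives roughly 1.2GB. The model ignores the bookkeeping cost of growing and shrinking stacks, which a real implementation would have to pay.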
Another feature is link-time optimization (LTO), which is much further
along than split stacks. Novillo put out a call for testers of the LTO branch in late
January. There are a number of optimizations that can be performed when
the linker has access to information about all of the compilation units.
Currently, the linker only has access to the object files that are being
collected into an executable, but LTO would put the GCC-internal
representation (GIMPLE) into a special section of the object file. Then,
at link time (but not actually implemented by the linker), various
optimizations based on the state of the whole program could be performed.
The kinds of optimizations that can be done are outlined in a paper [PDF] on "Whole
Program Optimizations" (WHOPR) written by a number of GCC hackers including
Taylor and Novillo.
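To make the benefit concrete, here is a minimal sketch of the kind of code that stands to gain. The two file names in the comments are hypothetical, and the two "files" are shown in one listing only so that the example compiles as a single unit; in a real project they would be compiled separately.

```c
/* ---- util.c (hypothetical file name) ---- */
int scale(int x)
{
    return x * 3;
}

/* ---- main.c (hypothetical file name) ---- */
int scale(int x);   /* without LTO, this declaration is all main.c sees */

int compute(void)
{
    /* Compiled separately, this has to remain a real call to scale().
     * With the intermediate representation of both units available at
     * link time, an optimizer could inline scale() and fold the result
     * to a constant. */
    return scale(14);
}
```

Under traditional separate compilation, the compiler working on the second file knows nothing about the body of `scale()`, so cross-file inlining and constant folding are not possible. Whether a given LTO implementation actually performs this particular transformation depends on its heuristics, so the example shows an opportunity rather than a guarantee.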
While it is undoubtedly disappointing to delay GCC 4.4, hopefully the
license issues will be worked out soon and the integration of GCC 4.5 can
commence. In the interim, work on various features—many more
than are described here—is proceeding. The FSF has always had a
cautious approach to releases—witness the pace of Emacs—but
sooner or later, we will see GCC 4.4, presumably with a licensing change.
With luck, six months or so after that will come GCC 4.5 with some of these
interesting new features.