May 12, 2010
This article was contributed by Manuel López-Ibáñez
Version 4.5 of the GNU Compiler
Collection was released
in mid-April with many
changes under the hood, as well as a few important
user-visible features. GCC 4.5 promises faster programs using the new
link-time optimization (LTO) option, easier implementation of compiler
extensions thanks to the controversial plugin
infrastructure, stricter standards conformance for floating-point
computations, and better debugging information when compiling with
optimizations.
The GNU Compiler Collection is one
of the oldest free software projects still around. Version 1.0 of GCC was
released in 1987. More than twenty years later, GCC is still
under active development and each new version is adding important
features. Supporting these new features in such an old codebase often
requires major rewriting of substantial parts of GCC. GCC 4.0 was an important
milestone in this regard, and GCC internals are still evolving at
a rapid pace. However, these core improvements are not always
clearly visible to users. GCC 4.5 is an exception.
This article describes four new features in GCC 4.5, and also
looks at an internal feature that may radically change how GCC
is developed in the future.
Link-Time Optimization
Perhaps the most visible of the new features in GCC 4.5 is the
Link-Time Optimization option: -flto. When source files are
compiled and linked using -flto, GCC applies optimizations as
if all the source code were in a single file. This allows GCC to perform
more aggressive optimizations across files, such as inlining the body
of a function from one file that is called from a different file, and
propagating constants across files. In general, the LTO framework
enables all the usual optimizations that work at a higher level than a
single function to also work across files that are independently
compiled.
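As a minimal sketch of the kind of code that can benefit, imagine the two functions below living in separate source files. The file names in the comments and the function names are hypothetical; everything is shown in one listing only so that it compiles on its own. Without LTO, the compiler sees just a declaration of scale_sample() while compiling the caller and has to emit a real call. With -flto, it may be able to inline the body and propagate the constant argument.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in scale.c (hypothetical file name) --- */
int scale_sample(int sample, int factor)
{
    return sample * factor;
}

/* --- would live in mix.c (hypothetical file name) --- */
int scale_sample(int sample, int factor);   /* normally from a header */

long mix_samples(const int *samples, size_t count)
{
    size_t i;
    long total = 0;

    assert(samples != NULL || count == 0);
    for (i = 0; i < count; i++) {
        /* The constant 3 can only be propagated into scale_sample()
           if the optimizer sees its body, which is what LTO makes
           possible across files. */
        total += scale_sample(samples[i], 3);
    }
    return total;
}
```

Whether the call is actually inlined depends on the optimization level and on GCC's own heuristics, so this should be read as an illustration rather than a guarantee.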
The LTO option works almost like any other optimization
flag. First, one needs to use optimization (using one of the
-O{1,2,3,s} options). In cases where compilation and linking
are done in a single step, adding the option -flto is
sufficient:
gcc -o myprog -flto -O2 foo.c bar.c
This effectively deprecates the old -combine option, which
was too slow in practice and was only supported for C.
With independent compilation steps, the option -flto must
be specified at all steps of the process:
gcc -c -O2 -flto foo.c
gcc -c -O2 -flto bar.c
gcc -o myprog -flto -O2 foo.o bar.o
An interesting possibility is to combine the options -flto
and -fwhole-program. The latter assumes that the current
compilation unit represents the whole program being compiled. This
means that most functions and variables are optimized more
aggressively. Adding -fwhole-program to the final link step
in the example above makes LTO even more powerful.
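One practical consequence is that a symbol which must stay reachable from outside the program needs to be marked as such. The sketch below assumes the externally_visible function attribute described in the GCC manual; the function names are hypothetical and the attribute is specific to GCC, so other compilers may ignore it or warn about it.

```c
#include <assert.h>

/* Under -fwhole-program, GCC may treat this function as if it were
   static: it can be inlined into its callers and then dropped. */
int clamp_percent(int value)
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;
    return value;
}

/* Hypothetical entry point that code outside the program (for example a
   dynamically loaded module) must still be able to find by name.  The
   attribute asks GCC to keep the symbol visible. */
__attribute__((externally_visible))
int report_progress(int done, int total)
{
    assert(total > 0);
    return clamp_percent(done * 100 / total);
}
```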
When using multiple steps, it is strongly recommended to use exactly
the same optimization and machine-dependent options in all commands,
because conflicting options at compile time and link time may lead
to strange errors. In the best case, the options used during
compilation will be silently overridden by those used at link time. In
the worst case, the different options may introduce subtle
inconsistencies leading to unpredictable results at runtime. This, of
course, is far from ideal, and, hence, in the next minor release, GCC
will identify such conflicting options and provide appropriate
diagnostics. Meanwhile, some extra care should be taken when using
LTO.
The current implementation of LTO is only available for ELF
targets, and, hence, LTO is not available on Windows or Darwin in GCC
4.5. However, the LTO framework is flexible enough to support those
targets and, in fact, Dave
Korn has recently proposed a patch that adds LTO support for Windows
to GCC 4.5.1 and 4.6, and Steven
Bosscher has done the same for Darwin.
Finally, another interesting ongoing project, called whole program
optimization, aims to make LTO much more scalable for very large
programs (on the order of millions of functions). Currently, when compiling
and linking with LTO, the final step stores information
from all files involved in the compilation in memory. This approach does not
scale well if there are many large files. In practice, there may be
little interaction between some files, so the required information
could be partitioned and each partition optimized independently with
little performance loss, or at least with the effectiveness of LTO
degrading gracefully according to the available resources. The experimental
-fwhopr option is a first step in this direction, but this
feature is still under development and even the name of the option is
likely to change. Therefore, GCC 4.6 will probably bring further
improvements in this area.
Plugins
Another long-awaited feature is the
ability to load user code as plugins that
modify the behaviour of GCC. A substantial amount of
controversy surrounded the implementation of plugins. The
possibility of proprietary plugins was probably the main factor
stalling the development of this feature. However, the FSF recently
reworked the Runtime
Library Exception in order to prevent proprietary plugins. With the
new Runtime Library Exception in place, the development of the plugins
framework progressed rapidly. This, however, did not completely end
the controversy surrounding plugins, and while some developers think
that plugins are essential for the future of GCC and for attracting
new users and contributors, others fear that plugins may divert
efforts from improving GCC itself.
The plugin framework of GCC can work in principle on any system
that supports dynamic libraries. In GCC 4.5, however, plugins are only
supported on ELF-based platforms, that is, most Unix-like systems, but
not Windows or Darwin. A plugin is loaded with the new option
-fplugin=/path/to/file.so. GCC makes available a series of events for
which the plugin code can register its own callback functions. The
events already implemented in GCC 4.5 allow plugins to interact with
the pass manager to add, reorder and remove optimization passes
dynamically, modify the low level representation used by C and C++
front-ends, add new custom attributes and compiler pragmas, and other
possibilities described in the internal
documentation.
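The callback mechanism can be illustrated with a toy model. The sketch below is not GCC's plugin API: the event names, toy_register_callback() and toy_fire_event() are hypothetical stand-ins, written only to show the general shape of registering a function for an event and having the host invoke it later. The real interface is the one described in the internal documentation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical events a host program might expose to its plugins. */
enum toy_event {
    TOY_EVENT_START_UNIT,
    TOY_EVENT_FINISH_UNIT,
    TOY_EVENT_COUNT
};

typedef void (*toy_callback)(void *event_data, void *user_data);

struct toy_slot {
    toy_callback fn;
    void *user_data;
};

/* One callback per event keeps the sketch short. */
static struct toy_slot toy_slots[TOY_EVENT_COUNT];

/* Called by the "plugin" to hook an event. */
void toy_register_callback(enum toy_event event, toy_callback fn,
                           void *user_data)
{
    assert(event < TOY_EVENT_COUNT);
    toy_slots[event].fn = fn;
    toy_slots[event].user_data = user_data;
}

/* Called by the "host" when the event happens.
   Returns 1 if a callback ran, 0 if none was registered. */
int toy_fire_event(enum toy_event event, void *event_data)
{
    assert(event < TOY_EVENT_COUNT);
    if (toy_slots[event].fn == NULL)
        return 0;
    toy_slots[event].fn(event_data, toy_slots[event].user_data);
    return 1;
}

/* Example plugin callback: counts how many units were finished. */
void count_finished_units(void *event_data, void *user_data)
{
    (void)event_data;
    ++*(int *)user_data;
}
```

A plugin in this model would call toy_register_callback() once at load time, passing a pointer to its own state as user_data, and the host would call toy_fire_event() at the appropriate points of its work.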
Despite plugins being a new feature in GCC 4.5, several projects are already
making use of the plugin support. Among these projects are Dehydra, the
static analysis tool for C++ developed by Mozilla, and MELT, a
framework for writing optimization passes in a dialect of LISP. Also, the ICI/MILEPOST
research project relies heavily on the new plugin framework in GCC
4.5.
Variable Tracking at Assignments
The Variable
Tracking at Assignments (VTA) project aims to improve debug
information when optimizations are enabled. When GCC compiles some
code with optimizations enabled, variables are renamed, moved around, or
even completely removed. When debugging such code and trying to
inspect the value of some variable, the debugger would often report
that the variable has been optimized out. With VTA enabled,
the optimized code is internally annotated in such a way that
optimization passes transparently keep track of the value of each
variable, even if the variable is moved around or
removed.
A small example of the differences between debug information in GCC
4.5 and previous releases is the following program:
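    #include <stddef.h>    /* for NULL */

    typedef struct list {
        struct list *n;
        int v;
    } *node;

    /* Return the node whose successor is w, or NULL if there is none. */
    node find_prev (node c, node w)
    {
        while (c) {
            node opt = c;
            c = c->n;
            if (c == w)
                return opt;
        }
        return NULL;
    }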
Variable opt is removed when compiling with
optimization. Hence, in previous GCC versions, or when compiling
without VTA, one cannot inspect the value of opt even at
the highest debugging level. In GCC 4.5, however, VTA enables
inspection of the value of all variables at all points of the function.
The effect of VTA is even more noticeable for inlined
functions. Before VTA, optimizations would often completely remove
some arguments of an inlined function, making it impossible to inspect
their values when debugging. With VTA, these optimizations still take
place, but appropriate debug information is generated for
the missing arguments.
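A minimal sketch of the inlining case follows; the function names are hypothetical. Once weighted() is inlined into the loop, its weight argument is simply the constant 3 and has no storage of its own, which is the kind of argument that used to be reported as optimized out. Whether a particular debugger session can display it still depends on the compiler version, the options used, and the debugger.

```c
#include <assert.h>
#include <stddef.h>

/* After inlining, "weight" is just the constant 3 inside the loop, so the
   optimizer has no reason to keep it in a register or on the stack. */
static inline int weighted(int value, int weight)
{
    return value * weight;
}

int weighted_sum(const int *values, size_t count)
{
    size_t i;
    int sum = 0;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++)
        sum += weighted(values[i], 3);
    return sum;
}
```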
Finally, the VTA project has brought another feature, the new
-fcompare-debug option, which tests that the code
generated by GCC with and without debug information is identical. This
option is mainly used by GCC developers to test the compiler, but it
may be useful for users to check that their program is not affected by a
bug in GCC, though at a significant cost in compilation
time.
Standard-conforming excess precision
Perhaps the most reported bug in GCC is bug 323. The symptoms appear when
different optimization levels produce different results in
floating-point computations, and when two ways of performing the same
calculation do not produce the same result. Although this is an
inherent limitation of floating-point numbers, users are still
surprised that different optimization levels lead to markedly different
results. One of the main culprits is the excess
precision arising from the use of the x87 floating-point unit
(FPU). That is, operations performed in the FPU have more precision
than double-precision numbers stored in memory. Hence, the final
result of a computation may depend significantly on whether
intermediate results are kept in FPU registers or stored in memory.
This leads
to some unexpected and counter-intuitive results. For example, the
same piece of code may produce different results using the same
compilation flags and the same machine depending on changes of
seemingly unrelated code, because the unrelated code forces the
compiler to save some intermediate result in memory instead of keeping
it in an FPU register. One workaround to this behavior is the option
-ffloat-store, which stores every floating-point variable in
memory. This has, however, a significant cost in computation time. A
more fine-grained workaround is to use the volatile qualifier
on variables suffering from this problem.
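The volatile workaround can be sketched as follows; the function names are hypothetical. The first function shows the kind of comparison that can surprise users on an x87 FPU, and the second forces both values through memory so that each is rounded to double before the comparison. On targets that do not use the x87 unit for double arithmetic, both functions are expected to behave the same way.

```c
#include <assert.h>

/* Compares a stored quotient with a freshly computed one.  On an x87 FPU
   the fresh value may still carry extended precision while the stored one
   may have been rounded to double, so this can return 0 even though both
   sides come from the same expression. */
int quotient_matches_naive(double x, double y)
{
    double q;

    assert(y != 0.0);
    q = x / y;
    return q == x / y;
}

/* Same comparison, but both values are forced through memory with
   volatile, which rounds each of them to double before comparing. */
int quotient_matches_volatile(double x, double y)
{
    volatile double stored;
    volatile double fresh;

    assert(y != 0.0);
    stored = x / y;
    fresh = x / y;
    return stored == fresh;
}
```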
While this problem will never be solved in computers with inexact
representation of floating-point numbers, GCC 4.5 helps improve the
situation by adding a new option -fexcess-precision=standard,
currently
only available for C, that handles floating-point excess precision
in a way that conforms to ISO C99. This option is also enabled
with standards conformance options such as -std=c99. However,
standards-conforming precision incurs an extra cost in computation
time. Therefore, users more interested in speed may
wish to disable this behavior using the option
-fexcess-precision=fast.
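In ISO C99 terms, excess precision may be kept within an expression but is discarded by casts and assignments, and the FLT_EVAL_METHOD macro from float.h reports how expressions are evaluated. The sketch below, with hypothetical function names, marks the rounding points explicitly; it illustrates the C99 rules and does not describe how GCC implements the option.

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* FLT_EVAL_METHOD (C99, <float.h>): 0 means expressions are evaluated in
   the type of their operands, 2 means in long double, and -1 means the
   method is indeterminable. */
int evaluation_method(void)
{
    return (int)FLT_EVAL_METHOD;
}

/* Sums an array, rounding to double after every addition.  Under C99 the
   assignment alone already discards excess precision; the cast only makes
   the rounding point visible to the reader. */
double sum_rounded(const double *values, size_t count)
{
    size_t i;
    double sum = 0.0;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++)
        sum = (double)(sum + values[i]);
    return sum;
}
```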
C++ compatible
GCC 4.5 is the first release of GCC that can be compiled with a C++
compiler. This may not seem very interesting or useful at the moment
(but take a look at the much improved -Wc++-compat
option). However, this is only the first step of an ongoing project to use C++ as the
implementation language of GCC. Except for some front-end bits
written in other languages, notably Ada, most of GCC is implemented in
C. The internal structures of GCC are undergoing continuous improvement
and modularization aimed at creating cleaner interfaces, and many GCC
developers think that this work would be easier using C++ than
C. However, this proposal is not free of controversy, and it is not
clear whether the switch would occur in GCC 4.6, later, or ever.
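To give an idea of what staying in the common subset of C and C++ involves, here is a minimal sketch with hypothetical names. The GCC manual gives the implicit conversion from void * to another pointer type as an example of what -Wc++-compat reports; the sketch avoids it with an explicit cast and also avoids a C++ keyword as a field name.

```c
#include <assert.h>
#include <stdlib.h>

struct counter {
    int hits;
    int kind;   /* naming this field "class" would be valid C but not C++ */
};

struct counter *counter_new(int kind)
{
    /* C accepts "struct counter *c = malloc(...)" without a cast, but C++
       does not convert void * implicitly, so the cast keeps the code in
       the common subset of both languages. */
    struct counter *c = (struct counter *)malloc(sizeof *c);

    if (c != NULL) {
        c->hits = 0;
        c->kind = kind;
    }
    return c;
}

void counter_hit(struct counter *c)
{
    assert(c != NULL);
    c->hits++;
}
```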
Other improvements
The above are only some examples of the many improvements and new
features in GCC 4.5. A
few other features are worth mentioning:
- GCC now makes better use of the information provided by the restrict
keyword, which is also supported in C++
as an extension, to generate better optimized code.
- The libstdc++
profile mode tries to identify suboptimal uses of the
standard C++ library, and suggests alternatives that improve
performance.
- Previous versions of GCC incorporated the MPFR library in order to consistently
evaluate math functions with constant
arguments at compile time. GCC 4.5 extends this feature to complex math functions by
incorporating the MPC
library.
- Many improvements have been made in the specific language front-ends, in particular from
the very active Fortran
front-end project. Also worth mentioning is the increasing support for the
upcoming ISO C++ standard (C++0x).
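As an illustration of the first item in the list above, the following sketch uses the C99 restrict qualifier; the function name is hypothetical. The qualifier is a promise from the programmer that the three arrays do not overlap, which leaves the compiler free to reorder or vectorize the loads and stores. Whether a given GCC version actually does so depends on the target and the optimization options.

```c
#include <assert.h>
#include <stddef.h>

/* dst, a and b are promised not to alias each other, so a store to dst[i]
   cannot change any element of a or b that is read later in the loop. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t count)
{
    size_t i;

    assert(dst != NULL && a != NULL && b != NULL);
    for (i = 0; i < count; i++)
        dst[i] = a[i] + b[i];
}
```

Calling add_arrays() with overlapping arrays would break the promise and result in undefined behavior, so the qualifier should only be used where the caller can guarantee it.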
Conclusion
We are living in interesting
times on the compiler front, and GCC 4.5 is an indication that we
can still expect new developments in the future. The release of GCC
4.5 brings to its users several important, and somewhat
controversial, features. It also includes the typical long list of
small fixes and improvements, where most will be able to find at
least one thing to their liking. GCC 4.5 may well be a transition
point, where the foundational work that has been done during the 4.x
release series is starting to show up in user-visible features that
would have been impossible in the GCC 3.x release series. It is
difficult to say at this moment what GCC 4.6 will bring us a year
from now, as it will depend on what the contributors decide.
Anyone can contribute to
the future of GCC. This is free
software after all.
Acknowledgments
I would like to thank in general the community of GCC developers, and
in particular, Ian Lance Taylor, Diego Novillo, and Alexandre Oliva,
for their helpful comments and suggestions when writing this article.