A new linker is not generally something that arouses much interest outside
of the hardcore development community—or even inside it—unless
it provides something especially eye-opening. A newly released linker
called gold has just that kind of feature, though, because it runs
up to five times as fast as its competition. For developers who do a lot
of compile-link-test cycles, that kind of speedup can
significantly improve their efficiency.
Linking is an integral part of code development, but it can be invisible,
as the linker is often invoked by the compiler. The sidebar accompanying this
article is meant for non-developers or those in need of a refresher about
linker operation.
For those who want to know even more, the author of gold, Ian Lance
Taylor, has a twenty-part series about linker internals on his weblog,
starting with this entry.
For Linux systems, the GNU Compiler
Collection (GCC) has been the workhorse,
providing a complete toolchain to build programs in a number of different
languages. It uses the ld linker from the binutils collection. Now
that gold has been added to binutils, there are two
choices for linking GCC-compiled programs.
A linker overview
For non-developers, a quick overview of the process that turns source code
into executable programs may be helpful.
Compilers are programs that turn C—or other high-level
languages—into object code. Linkers then collect up object
code and produce an executable. Usually the linker will not only operate
on object code created from a project's source, but will also reference
libraries of object code—the C runtime library libc for
example. From those objects, the linker creates an executable program that
a user can invoke from the command line.
The linker allows program code in one file
to refer to a code or data object in another file or library. It arranges
that those references are usable at run time by
substituting an address for
the reference to an object. This "links" the two properly in the executable.
Things get more complicated when
considering shared libraries, where the library code is shared by multiple
concurrent executables, but this gives a rough outline of the basics of
linker operation.
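To make the address substitution described above concrete, here is a minimal sketch of a toy linker in Python. It is an illustration only and is not how ld or gold is implemented: the ObjectFile class, the link() function, the word-per-slot "code" layout, and the 0x1000 base address are all hypothetical names and simplifications invented for this example.

```python
from dataclasses import dataclass


@dataclass
class ObjectFile:
    """Hypothetical stand-in for a compiled object file."""
    name: str
    code: list         # one "word" per slot; 0 is a placeholder to be patched
    defines: dict      # symbol name -> offset of its definition in code
    relocations: dict  # offset in code -> symbol name whose address goes there


def link(objects, base=0x1000):
    """Toy linker: lay objects out back to back, then patch references."""
    # Pass 1: assign every defined symbol an absolute address.
    symbols = {}
    address = base
    for obj in objects:
        for sym, offset in obj.defines.items():
            if sym in symbols:
                raise ValueError(f"duplicate symbol: {sym}")
            symbols[sym] = address + offset
        address += len(obj.code)

    # Pass 2: copy the code and substitute an address for each reference.
    image = []
    for obj in objects:
        words = list(obj.code)
        for offset, sym in obj.relocations.items():
            if sym not in symbols:
                raise ValueError(f"undefined reference to {sym}")
            words[offset] = symbols[sym]
        image.extend(words)
    return image, symbols


# main.o calls helper(), which is defined in util.o.
main_o = ObjectFile("main.o", ["call", 0, "ret"], {"main": 0}, {1: "helper"})
util_o = ObjectFile("util.o", ["nop", "ret"], {"helper": 0}, {})

image, symbols = link([main_o, util_o])
print(hex(symbols["helper"]))  # 0x1003
```

The placeholder in main.o ends up holding the address where helper landed in the combined image, which is the "link" between the two files. Real linkers handle many relocation types, sections, and libraries, none of which this sketch attempts.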
The intent is for gold to be a complete drop-in replacement for
ld—though it is not quite there yet. It is currently
lacking support for some command-line options, and Linux kernels that are
linked with it do not boot, but those things will come. It also currently
only supports x86 and x86_64 targets, but for many linker
jobs, gold seems to be working well. The speed seems to be very
appealing to some developers, with Bryan O'Sullivan saying:
When I switched to using gold as the linker, I was at first a little
surprised to find that it actually works at all. This isn't especially
common for a complicated program that's just been committed to a source
tree. Better yet, it's as fast as Ian claims: my app now links in 2.6
seconds, almost 5.4 times faster than with the old binutils linker!
Performance was definitely the goal that Taylor set for gold
development. It supports ELF (Executable
and Linking Format) objects and runs on UNIX-like operating systems
only. Supporting only one object/executable format, a fresh
start, and an explicit performance goal are some of the reasons that
gold outperforms ld.
Tom Tromey likes the
looks of the code:
I looked through the gold sources a bit. I wish everything in the GNU
toolchain were written this way. It is very clean code, nicely commented,
and easy to follow. It shows pretty clearly, I think, the ways in which C++
can be better than C when it is used well.
Because the implementation is geared for speed, Taylor used techniques,
such as a state machine implementation and high-level locking, that
may confuse some.
He has some concerns
about the maintainability of his implementation:
While I think this is a reasonable approach, I do not yet know how
maintainable it will be over time. State machine implementations can be
difficult for people to understand, and the high-level locking is
vulnerable to low-level errors. I know that one of my characteristic
programming errors is a tendency toward code that is overly complex, which
requires global information to understand in detail. I've tried to avoid it
here, but I won't know whether I succeeded for some time.
Overall, it seems to be getting a nice reception from the community, with
O'Sullivan commenting that he is "looking forward to the point where
gold entirely supplants the existing binutils linker. I expect that won't
take too long, once Mozilla and KDE developers find out about the
performance boost." Taylor is already thinking about concurrent
linking (running the compiler and linker at the same time) as
the next big step once gold gets to that point.
There are two other ongoing projects that are working with the greater GCC
ecosystem in interesting ways: quagmire and ggx. Quagmire is an effort to
replace the GNU configure and build system—consisting of autoconf,
automake, and libtool—with something that depends
solely on GNU make. Currently, that system uses
various combinations of the shell, m4, and portable makefiles to make the
building and installation of programs easy—the famous
"./configure; make" command line. The tools were written that way
to try to ensure that users did not need to install additional packages to
configure and build GNU tools.
Quagmire, which has roots in a
posting by Taylor,
recognizes that GNU make is ubiquitous, so basing a
system around it makes a great deal of sense.
The ggx project is Anthony Green's step-by-step procedure to create an
entire toolchain that can build programs for a processor architecture that he is
creating as a thought
experiment. The basic idea is to design the instruction set based on
the needs of the compiler, in this case GCC, rather than the needs of the
hardware designers. He is using GCC's ability to be retargeted for new
architectures, along with its simulation capabilities, to create a CPU that
he can write programs for. As of this writing, he has a "hello world"
program working, along with large chunks of the GCC test suite passing.
Well worth a look.