July 25, 2006
This article was contributed by John Richard Moser
Everyone wants their computer to start faster. Distribution maintainers
are thus always looking for things to decrease boot time and program load
time, using mechanisms such as readahead, boot reordering, even parallel
init to more efficiently distribute CPU and disk I/O and get the user to
the login screen faster.
One such tool used to decrease boot time and program load time is prelink [PDF], developed by Jakub Jelínek
at Red Hat and discussed in an
earlier LWN article. Prelinking is not the only way to improve start-up times, however; there
are also a number of existing and developing optimizations the GNU linker
can apply to give substantial gains. Besides existing GNU linker hash
table optimizations, Michael
Meeks has done substantial works on three
separate optimizations in the GNU linker in a quest to make OpenOffice.org load
quickly [PDF].
1.0 The -Wl,-O1 Linker Options
In typical operation, symbols are stored in hash tables in ELF binaries;
these hash tables are kept small, and symbols that fall into the same
bucket—hash to the same value—are compared by a simple string
comparison. Unfortunately, symbols in the same bucket with the same prefix
will need a long string comparison, which can be slow; if there are lots of
symbols in the same bucket, then the dynamic linker has to string compare
with each of them until it finds a match (Fig. 1.0-1).
|
Fig. 1.0-1: A hash table of symbols in a library. The bucket
0x00b1 contains three symbols; the red characters will have to be tested
against the symbol name before the symbol 'foodragon' is resolved.
|
It was recommended in a paper [PDF] by Ulrich Drepper to utilize a
GNU linker optimization that focuses more on producing short hash chains
than a small hash table size. Although the hash table may grow by a few
kilobytes, the shortened length of hash chains means that symbol look-ups
do not have to perform as many string comparisons. Further, the chances of
symbols sharing long prefixes landing in the same bucket are greatly
reduced, making walking the chains faster (Fig. 1.0-2).
|
Fig. 1.0-2: A hash table of symbols in a library. This table is optimized, so symbols hash to buckets containing shorter hash chains; this reduces the number of string comparisons done.
The table is a bit bigger, though.
|
This linker optimization can be activated by passing -Wl,-O1 to
gcc at link time. This flag can also be passed during any compilation;
however, optimizing hash tables can be slow, and there are no benefits to
doing this when creating object files that are just going to be linked into
a main executable or shared object.
Currently this linker optimization is used
by GNOME in its official builds. It has also been discussed on the Gentoo forums; and is
used by
the Ubuntu distribution, to name a few. An old but good technique.
2.0 -Bdirect Linking
Michael Meeks also proposed an
optimization
to GNU binutils and glibc which functions similar to direct binding in
Solaris. By passing -Bdirect at link time, the build process
can cause many symbols to
be directly linked, allowing the dynamic linker to severely decrease the
search space during lookup.
Libraries have unresolved symbols when they use functions or global
variables from other libraries; these symbols are resolved during dynamic
linking. The dynamic linker locates every unresolved symbol each library
needs by searching the symbol table of each loaded library in order from
first loaded to last loaded. This can become quite time consuming,
especially in cases such as GNOME or OpenOffice.org where 50-150 libraries
are loaded and up to 200,000 symbols must be resolved.
|
Fig. 2.0-1: Normal binding
is done by checking through all libraries until the symbol is found.
|
Direct binding shortens the path the linker has to take to resolve a symbol
by drawing it straight from whatever ELF object needs the symbol to the ELF
object containing the symbol. A section is added to the ELF header which
associates each symbol with an entry in the DT_NEEDED table—the list
of libraries needed by the ELF object. The linker uses this information to
go straight to the library containing the symbol and search only there,
instead of searching through each library sequentially.
|
Fig. 2.0-2: Direct binding
is done by checking the library that the symbol table says the binding is located in. This gives a much shorter path for any symbol lookup; three are shown above.
|
Meeks' -Bdirect linking method operates slightly differently from
Solaris direct binding. With the Solaris linker, direct binding binds all
symbols directly to the relevant libraries. With Michael Meeks'
implementation of -Bdirect linking, vague symbols are detected and
linked as normal. This is particularly important with C++, as vague C++
symbols are fairly common. It is not possible to guarantee that a
vague symbol is linked to a specific library, so direct binding of vague
symbols is not technically feasible.
-Bdirect binding can, to a degree, work with shared objects that are loaded with dlopen() rather than run-time linking; this is reported to greatly aid with OpenOffice.org and KDE start-up times. This is interesting because the performance improvements per library are comparable to but not quite as substantial as prelink, which does not affect libraries loaded in this manner.
The current implementation unfortunately has a few rough edges. Some areas could be better optimized; and there is an issue with certain libraries and programs breaking due to direct binding. Sometimes multiple libraries supply the same symbol, and the first loaded is the one used; this is called interposing, and is
the reason why Meeks' -Bdirect linking patch does not optimize vague symbols. When direct binding is used, the symbol is chosen by the GNU linker at link time rather than by the dynamic linker at run time. This can be both a good and a bad thing, depending on the situation.
In most cases direct binding is a good thing, as it guarantees that libraries dependent on other libraries will always find the proper symbols, even if multiple libraries are loaded with the same symbols. Unfortunately, sometimes this very functionality also changes the way existing programs link, which just happen to work fine based on using the symbol from the first library supplying it.
This is usually disastrous, as the symbol now used is typically a
variable or function meant for a different purpose; the program no
longer does what the programmer intended, and likely ends up simply
crashing.
Not everybody believes that the use of interposing is the right thing to
do. Still, getting things like this working would really require finding a
work-around that does not break compatibility with existing code. Michael
Meeks suggested during e-mail discussion a link-map extension to handle
direct binding per symbol:
I would instead say: writing code that explicitely relies on
interposing is not sensible. However - to make this work all
that is needed is to implement an extension to link-maps (as
done in the Solaris linker) to allow certain symbols to be
marked as interposers (and hence not direct linked). That's
most useful for (eg.) in glibc's use of pthreads - done by
interposing.
This would allow direct binding to be used with libraries that currently
fail with it, by only doing a long symbol search for vague symbols.
Further, Meeks' finterpose tool can be
used to find interposers between libraries; this not only shows where
-Bdirect binding may fail, but also has exposed a few bugs in
software such as GTK+
and gstreamer.
The unfortunately large number of interposers presents a slight roadblock,
but should not stop Meeks' -Bdirect linking from eventually
becoming fully functional.
This optimization has an uncertain future, however. Relevant glibc code has so far been
rejected
by Ulrich Drepper. Despite this, -Bdirect linking has already
found uses in Gentoo (with a portage overlay), Pardus Linux, and
Solaris/OpenSolaris.
The patch has also been committed to OpenSUSE, although it is uncertain if it will be used to build upcoming OpenSUSE releases.
3.0 dynsort
Another optimization posted by Michael Meeks adds the dynsort keyword to the GNU
linker. By passing -zdynsort at link time, the .dynsym
and .dynstr sections as well as relocations are all sorted by ELF
hash and by the position where they land in the hash bucket.
When symbols are looked up in an ELF object, a hash table has to be
searched. Hash collisions are effectively stored as a linked list, which
then has to be walked to find the appropriate entry. With
dynsort, the symbols that have to
be examined while walking a bucket are all adjacent to each other. This
reduces the number of L1 and L2 cache misses, allowing the CPU to utilize
its facilities much more efficiently during dynamic linking.
The dynsort optimization has gotten a good bit of attention since
it was
unveiled. Meeks is working on improving it in various ways, such as moving
data for undefined symbols away from the rest of the data. It was this
patch that got Meeks
offered a branch in binutils cvs, although as of this writing one
hasn't been set up. It can be
used in Gentoo with an overlay, but may soon
be available everywhere via the --hash-style option.
4.0 Precomputed Hash Values
A few weeks after the dynsort patch, Meeks posted another optimization the binutils mailing list. This one adds pre-computed ELF hash values to the elf header when the -hashvals switch is given. Normally the dynamic linker has to compute a
hash value of a symbol and then use that to find the index in the symbol tables
of the other libraries. The -hashvals optimization pre-computes these,
removing a series of cache misses and mathematical computations from the process.
The visible effect of the -hashvals switch is a large speedup in symbol look-up; Meeks measured dlopen() on libsvx as requiring 40% less time, and in some places a 51% reduction was realized. This is due not only to avoiding the calculation of symbol hash values; but also to much more efficient L2 cache utilization.
Only a single 32-bit value is accessed for each symbol, rather than a long string; and locality is increased, since only the .hashvals section is accessed instead of both .dynsym and .dynstr, which reduces L2 cache misses.
This patch has a very favorable future; it almost immediately gained support for inclusion in upstream binutils. Several months later, a rewrite done by Ulrich Drepper and Jakub Jelínek was posted, implementing sorting and precomputed hashvals via --hash-style. And of course, Gentoo users have had access to it for a while with various portage overlays.
Conclusion
There are a lot of optimizations possible to reduce dynamic linking time, creating substantial gains in application load time. Future versions of the GNU linker and glibc may allow distributions to boot and load programs much faster, getting users to the desktop and on their applications in substantially less time.
Comments (17 posted)
System Applications
Audio Projects
Version 0.9.69 of Rivendell, a radio station automation system, is out.
"
This release, featuring
the debut of full voicetracking and log customization capabilities, marks a
significant milestone for the project, with all modules now feature-complete."
Full Story (comments: none)
Database Software
Version 1.4.3 of pgAdmin III, a GUI PostgreSQL administration tool,
has been announced.
"
v1.4.3 is primarily a bug fix release".
Comments (none posted)
DNS Software
Version 1.4.0 of dnspython
has been announced.
"
dnspython is a DNS toolkit for Python. It supports almost all record types. It can be used for queries, zone transfers, and dynamic updates. It supports TSIG authenticated messages and EDNS0.
dnspython provides both high and low level access to DNS. The high level classes perform queries for data of a given name, type, and class, and return an answer set. The low level classes allow direct manipulation of DNS zones, messages, names, and records."
Comments (none posted)
Version 0.4.4 of smbind, a PHP-based tool for managing DNS
zones for BIND via the web,
is available.
"
A number of fixes have been included in this release for both smbind and smbind-slave. A problem with user delegation of zones was fixed (thanks to an anonymous SF user report), along with a minor fix for zone deletion, which wasn't getting committed to the config file as expected. Deleting a user was supposed to transfer ownership from that user over to the admin user, but was broken because of the expected admin user's id in the database. A few documentation updates and grammatical changes have also been made."
Comments (none posted)
Interoperability
Version 3.0.23a of Samba has been announced.
"
This is the latest stable release of Samba. This is the version
that production Samba servers should be running for all current
bug-fixes."
Full Story (comments: none)
LDAP Software
Version 1.1.5 of LAT, the LDAP Administration Tool, is available.
"
This release is the
6th of the 1.1.x development cycle which will eventually become v1.2. If
you need a stable release stick with the 1.0 branch."
Full Story (comments: none)
Printing
Version 1.2.2 of CUPS
has been announced.
"
CUPS 1.2.2 fixes several build, platform, notification, and printing bugs."
Comments (none posted)
Security
FTimes version 3.7.0
has been announced.
"
Version 3.7.0 is a minor release of FTimes, a system baselining and evidence
collection tool. The primary purpose of ftimes is to gather and/or develop
topographical information and attributes about specified directories and
files in a manner conducive to intrusion and forensic analysis."
Comments (none posted)
Version 0.26 of Sussen, a vulnerability and configuration file scanner,
is out with new features, code cleanup and bug fixes.
Full Story (comments: none)
Web Site Development
The Apache Software Foundation has
announced the release of Apache Geronimo Version
1.1, an open-source J2EE application server.
"
Along with many new features, Apache Geronimo Version 1.1 introduces
several structural changes designed to improve scalability, portability and
overall organization. An easy-to-use configuration and management console
provides access to the new innovative plug-in architecture, allowing
advanced control over the rich modularity of the Apache Geronimo server as
well as simplifying day-to-day operational management tasks."
Comments (none posted)
Version 2.0 Milestone 1 of OpenReports
has been announced.
"
OpenReports, the leading open source web reporting solution, is pleased to announce the availability of OpenReports 2.0 Milestone 1.
OpenReports 2.0 features new export formats, ChartReports, scheduling improvements, support for JasperReports 1.2.5 and Hibernate 3.1.3, and many other bug fixes and enhancements."
Comments (none posted)
Version 2.5 of the Plone Content Management System
has been announced.
"
With the addition of powerful caching technologies, Plone 2.5 enables websites to run 10 to 40 times faster than in previous versions. Plone 2.5 focuses on streamlining code, strengthening stability, and increasing flexibility. The release incorporates the latest generation of the underlying Zope application server, setting the groundwork for Plone Foundations anticipated 3.0 release, available early 2007. Plone 3.0 will substantially increase ease-of-use and efficiency through user interface improvements."
Comments (none posted)
Web Services
Version 1.3 of the Web Service Modeling Toolkit (WSMT),
a collection of tools for Semantic
Web Services intended for use with the Web Service Modeling Ontology,
is available.
"
The main aim of this release has been to improve the functionality of the WSML Text Editor and Reasoner Views with respect to syntax completion. In the previous release
only keywords where recommended and this keyword recommendation was not sensitive to the current location in the document. This release sees the addition of full context sensitive syntax completion."
Comments (none posted)
Desktop Applications
Audio Applications
The initial version 0.1 release of
Jokosher,
a multi-track audio studio application for GNOME,
has been announced.
"
The Jokosher team are proud to announce our very first 0.1 release of Jokosher, a simple, usability focused Open Source multi-track studio. Since the original design and conception of the project in February, a team of developers, documentation writers, artists, testers and packagers have worked together to create a compelling first release."
Comments (none posted)
Desktop Environments
GNOME 2.15.90 (otherwise known as the first GNOME 2.16 beta) is out. Click
below for download details and pointers to information on what is changing
in 2.16.
Full Story (comments: none)
The following new GNOME software has been announced this week:
You can find more new GNOME software releases at
gnomefiles.org.
Comments (none posted)
The following new KDE software has been announced this week:
You can find more new KDE software releases at
kde-apps.org.
Comments (none posted)
KDE.News
has announced
the July 23, 2006 edition of the
KDE Commit-Digest. Here is the content summary:
"
KDevelop gets new configuration framework functionality. The start of a Satellite tracks feature in KStars. Support for PDF data extraction, and speed optimisations in Strigi. New features in KPhotoAlbum (KPhotoAlbum is the new name for KimDaBa). Perspective grid support in Krita, with the implementation of a Bezier tool becoming feature-complete. More work on unit conversions in KRecipes. Porting of KRDC to KDE 4."
Comments (none posted)
Electronics
Version 0.4.6 of Covered, a Verilog code coverage utility,
has been announced.
"
This release contains several bug fixes".
Comments (none posted)
Games
The WorldForge game project
has announced
the creation of a new human male simulation.
"
Im trying to get a new version of Ember out, hopefully this week. In the meantime I want to show some screenshots of the new male mesh that Jayr has been working on. By using the model definition system in Ember, its possible to define different parts of the model, such as /torso/cloth/green or /torso/cloth/red. These parts use the same submesh (in the example the torso submesh) but with different materials."
Comments (1 posted)
Graphics
Version 1.1 of OpenSceneGraph, an open-source 3D scene graph project
based on OpenGL, is available with a long list of new capabilities.
Full Story (comments: none)
Mail Clients
Srinivasa Ragavan's
GNOME blog
covers changes to the Evolution mail client.
"
Four months, in GNOME 2.16 cycle, We have added a lot of UI improvements to Evolution to make it look much better. Not just features and lot of bug fixes too!!! I have blogged them in parts. Im summarising all of them."
Comments (none posted)
Music Applications
Marcos Guglielmetti has announced a new drum kit sample set for the hydrogen
drum machine.
"
Colombo drums are handcraft drums made in La Plata, Argentina by a man called Colombo."
Full Story (comments: none)
Office Suites
OpenOffice.org and other software is now available via Metalink using
aria2.
"
OpenOffice.org is using a new Web/P2P hybrid called Metalink to distribute
its free office suite. Other open source software and Linux ISOs are
available at Metalink Packages Resources."
Full Story (comments: none)
Languages and Tools
Python
Version 0.6 of Crunchy Frog
has been announced.
"
Crunchy Frog is an application that transforms an html-based Python
tutorial into an interactive session within a browser window."
Comments (none posted)
Tcl/Tk
The July 25, 2006 edition of Dr. Dobb's Tcl-URL! is online with new
Tcl/Tk articles and resources.
Full Story (comments: none)
Page editor: Forrest Cook
Next page: Linux in the news>>