Your editor has, once again, had the opportunity to add to his collection
of Ottawa Linux Symposium T-shirts. OLS2006 was a fun and interesting
event, a testament to the increasing professionalism of its organizers,
speakers, and attendees. And also, of course, to the energy and vitality
which drives the Linux community.
Interesting things can be seen by looking at the people who attend an
event like this one. Not that long ago, the preferred attire was a shirt
from a Linux event - the older, the better. While those shirts are still
very much in evidence, shirts of the button-down variety are on the rise.
Fortunately, there are still very few neckties to be seen (James Bottomley
- next year's OLS keynote speaker - being the exception that proves the rule
in this regard). There were also quite a few attendees who had clearly
made the trip from Asia.
LinuxWorld may be the place to go to see what companies are doing, but OLS
has clearly established itself as the event to attend to learn about what
the development community - and the kernel development community in
particular - is up to.
This year's schedule reveals
some things about what the community is interested in. Virtualization
remains a hot topic, but the emphasis has changed: Xen, the king of
paravirtualization, was well represented, but was far from the whole
story. Ian Pratt's Xen talk was held in one of the smaller rooms this
year. The hotter topic appeared to be containers - lightweight
virtualization which runs under the same kernel as the host. There is a
lot of development activity around containers at the moment, and many of
the people involved were at OLS to talk about it.
The previous year's schedule featured exactly one filesystem talk - an update
on ext3. This year, a quick scan shows no fewer than nine filesystem talks, plus a
few on related topics (shared subtrees, for example). Expect to see some
interesting development work in the filesystems area in the coming years.
This year's keynote speaker was Greg Kroah-Hartman. Greg has posted the text
of his talk along with the slides; it is such a clear representation of
what was said that your editor sees no point in writing up a separate
summary. The talk covered topics like hardware support (Linux is now
second to none, says Greg), the illegal and unethical nature of closed
source kernel modules, various aspects of the kernel development process,
and more. The talk is very much worth a read.
For those who have not seen the article by Arjan van de Ven mentioned in
Greg's talk: Arjan's doomsday
scenario is also worth reading.
For the curious, the slides from
LWN editor Jonathan Corbet's talk are available.
OLS has always been a kernel-oriented event, and the 2006 version was
perhaps the most kernel-heavy yet. A look at the schedule shows almost no
non-kernel talks - and most of the exceptions were concerned with the git
and mercurial source control systems. The Desktop Developers' Conference
was held immediately before OLS (at the same time as the Kernel Summit),
but speakers from that conference did not speak at OLS. Their
presence was very much felt, however, and there were some good
conversations held between developers responsible for various levels of the
full Linux system. Next year, though, it would be nice to hear more from the
desktop people at OLS.
The fact that such a small complaint is the first that comes to mind speaks
loudly. OLS remains a top-notch technical conference with
interesting speakers, good organization (even the traditionally late final
keynote almost started on time this year), great conversations, and
a murderous closing party. The annual Ottawa pilgrimage remains an
important event for many in the development community.
Dave Jones's OLS talk, titled "Why user space sucks," was certain to be
popular at a setting like this. Many of the people in the standing-room-only
crowd might well have wondered why this talk was not scheduled into the
larger room. Perhaps the powers that be feared that a non-kernel talk would
not draw a large audience - even when given by a well-known kernel hacker.
Dave set out to reduce the time it took his Fedora system to boot. In an
attempt to figure out what was taking so long, he instrumented the kernel
to log certain basic file operations. As it turned out, the boot process
involved calling stat() 79,000 times, opening 27,000 files, and
running 1,382 programs. That struck him as being just a little excessive;
getting a system running shouldn't require that much work. So he looked
further. Here are a few of the things he found:
- HAL was responsible for opening almost 2000 files. It will read
various XML files, then happily reopen and reread them multiple
times. The bulk of these files describe hardware which has never been
anywhere near the system in question. Clearly, this is an application
which could be a little smarter about how it does things.
- Similar issues were found with cups, which feels the need to open the
PPD files for every known printer. The result: 2500 stat()
calls and 400 opens. On a system with no attached printer.
- X.org, says Dave, is "awesome." It attempts to figure out where a
graphics adapter might be connected by attempting to open almost any
possible PCI device, including many which are clearly not present on
the system. X also is guilty of reopening library files many times.
- Gamin, which was written to get poll() loops out of
applications, spends its time sitting in a high-frequency
poll() loop. Evidently the real offender is in a lower-level
library, but it is the gamin executable which suffers. As Dave points
out, it can occasionally be worthwhile to run a utility like
strace on a program, even if there are no apparent bugs. One
might be surprised by the resulting output.
- Nautilus polls files related to the desktop menus every few seconds,
rather than using the inotify API which was added for just this purpose.
- Font files are a problem in many applications - several applications
open them by the hundred. Some of those applications never present
any text on the screen.
- There were also various issues with excessive timer use. The kernel
blinks the virtual console cursor, even if X is running and nobody
will ever see it. X is a big offender, apparently because the
gettimeofday() call is still too slow and maintaining time
stamps with interval timers is faster.
There were more examples, and members of the audience had several more of
their own. It was all great fun; Dave says he takes joy in
collecting train wrecks.
The point of the session was not (just) to bash on particular applications,
however. The real issue is that our systems are slower than they need to
be because they are doing vast amounts of pointless work. This situation
comes about in a number of ways; as applications become more complex and
rely on more levels of libraries, it can be hard for a programmer to know
just what is really going on. And, as has been understood for many years,
programmers are very bad at guessing where the hot spots will be in their
creations. That is why profiling tools so often yield surprising results.
Programs (and kernels) which do stupid things will always be with us. We
cannot fix them, however, if we do not go in and actually look for the
problems. Too many programmers, it seems, check in their changes once they
appear to work and do not take the time to watch how their programs
work. A bit more time spent watching our applications in operation might
lead to faster, less resource-hungry systems for all of us.
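The kind of watching Dave advocates need not involve kernel instrumentation. As a minimal userspace sketch (the specific files and counts here are invented for illustration), Python's audit hooks (available since Python 3.8) can tally how many files a piece of code actually opens:

```python
import os
import sys
import tempfile

counts = {"open": 0}

def hook(event, args):
    # Tally file-open events - a tiny analog of Dave's kernel-side logging
    if event == "open":
        counts["open"] += 1

sys.addaudithook(hook)

# Touch a few files and see how many opens were recorded
with tempfile.TemporaryDirectory() as d:
    for i in range(3):
        with open(os.path.join(d, "f%d" % i), "w") as f:
            f.write("x")

print(counts["open"])
```

Running a real application under such a hook (or under strace, as Dave suggests) can reveal surprising amounts of redundant file traffic.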
The GNU Compiler Collection
(GCC) is a
fundamental part of our free operating system. Licenses may make the
software free, but it's GCC which lets us turn that software into something
our computers can run. GCC's strengths and weaknesses will, thus,
influence the quality of a Linux system in a big way. GCC is, however, an
opaque tool for many Linux users - and for many developers as well.
It is a black box, full of compiler magic, which, one hopes, just works.
For those interested in looking a little more deeply into GCC, however,
Diego Novillo's OLS talk
was a welcome introduction.
According to Diego, GCC has been at a bit of a turning point over the last
couple of years. On one hand, the software is popular and ubiquitous. On
the other, it is a pile of 2.2 million lines of code, initially
developed by "people who didn't know about compilers" (that comment clearly
intended as a joke), and showing all of
its 15 years of age. The code is difficult to maintain, and even harder to
push forward. Compiler technology has moved forward in many ways, and GCC
is sometimes having a hard time keeping up.
The architecture of GCC has often required developers to make changes
throughout the pipeline. But the complexity of the code is such that
nobody is really able to understand the entire pipeline. There are simply
too many different tasks being performed. Recent architectural
improvements are changing that situation, however, providing better
isolation between the various pipeline stages.
GCC has a steering committee for dealing with "political stuff." There is,
at any given time, one release manager whose job is to get the next release
together; it is, says Diego, a thankless job. Then, there is a whole set
of maintainers who are empowered to make changes all over the tree. The
project is trying to get away from having maintainers with global commit
privileges, however. Since building a good mental model of the entire
compiler is essentially impossible, it is better to keep maintainers within
their areas of expertise.
The (idealized) development model works in three stages. The first two
months are for major changes and the addition of major new features. Then,
over the next two months, things tighten down and focus on stabilization
and the occasional addition of small features. Finally, in the last two
months, only bug fixes are allowed. This is, Diego says, "where everybody
disappears" and the release manager is forced to chase down developers and
nag them into fixing bugs. Much of the work in this stage is driven by
companies with an interest in the release.
In the end, this ideal six-month schedule tends not to work out quite so
well in reality. But, says Diego, the project is able to get "one good
release" out every year.
GCC development works out of a central Subversion repository with many
development branches. Anybody wishing to contribute to GCC must assign
copyrights to the Free Software Foundation.
The compiler pipeline looks something like this:
- Language-specific front ends are charged with parsing the input
source and turning it into an internal language called "Generic."
The Generic language is able to represent programs written in any
language supported by GCC.
- A two-stage process turns Generic into another language called
Gimple. As part of this process, the program is simplified in a
number of ways. All statements are rewritten to get to a point where
there are no side effects; each statement performs, at most, one
assignment. Quite a few temporary variables are introduced to bring
this change about. Eventually, by the time the compiler has
transformed the program into "low Gimple," all control structures have
been reduced to if tests and gotos.
- At this point, the various SSA ("static single assignment") optimizers
kick in. There are, according to Diego, about 100 passes made over
the program at this point. The flow of data through the program is
analyzed and used to perform loop optimizations, some vectorization
tasks, constant propagation, etc. Much more information on SSA can be
found in this LWN article.
- After all this work is done, the result is a form of the program
expressed in "register transfer language" or RTL. RTL was originally
the only internal language used by GCC; over time, the code which uses
RTL is shrinking, while the work done at the SSA level is growing.
The RTL representation is used to do things like instruction
pipelining, common subexpression elimination, and no end of other low-level
transformations.
- The final output from gcc is an assembly language program, which can
then be fed to the assembler.
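The flavor of the SSA-level work Diego described - constant propagation and folding, for instance - can be illustrated in miniature. The toy three-address IR and the pass below are invented for illustration and bear no relation to GCC's actual data structures; they show only the general idea:

```python
# A toy constant-propagation pass over straight-line three-address code,
# illustrating (in miniature) what GCC's SSA optimizers do at scale.
# Each instruction is (dest, op, a, b); operands are ints or variable names.

def const_prop(instrs):
    env = {}    # variables currently known to hold constants
    out = []
    for dest, op, a, b in instrs:
        a = env.get(a, a)    # substitute known constant values
        b = env.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            val = a + b if op == "+" else a * b
            env[dest] = val                  # fold and remember the constant
            out.append((dest, "const", val, None))
        else:
            env.pop(dest, None)              # dest is no longer a known constant
            out.append((dest, op, a, b))
    return out

# x = 2 + 3; y = x * 4; z = y + n   (n is unknown at compile time)
prog = [("x", "+", 2, 3), ("y", "*", "x", 4), ("z", "+", "y", "n")]
print(const_prop(prog))
```

A real SSA pass works over a full control-flow graph with phi nodes; the straight-line case above is the simplest instance of the technique.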
The effect of recasting GCC into the above form is a compiler which is more
modular and easier to work with.
Future plans were touched on briefly. There is currently a great deal of
interest in static analysis tools. The GCC folks would like to support
that work, but they do not want to weigh down the compiler with a large
pile of static analysis tools. So they will likely implement a set of
hooks which allow third party tools to get the information they need from
the compiler. Inevitably, it was asked what sort of license those tools
would need to have to be able to use the GCC hooks; evidently no answer to
that question exists yet, however.
Another area of interest is link-time optimization and the ability to deal
with multiple program units as a whole. There is also work happening on
dynamic compilation - compiling to byte codes which are then interpreted by
a just-in-time compiler at run time. Much more information on current GCC
development can be found on the GCC project's web site.
This session was highly informative. Unfortunately, its positioning on
the schedule (in the first Saturday morning slot, when many of those who
participated in the previous evening's whiskey tasting event were notably
absent) may have reduced attendance somewhat. This was, however, a talk
worth getting up for.
Page editor: Jonathan Corbet
Inside this week's LWN.net Weekly Edition
- Security: Scatterchat; New vulnerabilities in kdelibs, mysql, ruby, ...
- Kernel: Reconsidering network channels; revoke() and frevoke(); A proposal for a new networking API.
- Distributions: Linux From Scratch and Beyond; Ubuntu "edgy eft" Knot 1; Pie Box Enterprise Linux 3 Update 8; Musix 0.50
- Development: Optimizing Linker Load Times,
new versions of Rivendell, pgAdmin, dnspython, smbind, Samba, LAT, CUPS,
FTimes, Sussen, Apache Geronimo, OpenReports, Plone, WSMT, Jokosher,
Covered, OpenSceneGraph, Crunchy Frog.
- Press: The Open Graphics Project, AMD acquires ATI, India rejects OLPC, OSCON coverage, IBM to support
SUSE, FOSS at the United Nations, the Nao robot, Xubuntu's Jani Monoses
interviewed, audio and 64-bit Linux, BIND configuration, Thunderbird tips,
Live migration of Xen domains, Apache Geronimo review, WiFi Radar review,
the Nokia 770 Internet Tablet.
- Announcements: DesktopSecure for Linux,
The Socialtext Open wiki, GnuPG and freenigma, OpenDocument 1.0 second edition,
Real-Time Linux Workshop CFP, ETel 2007 CFP, Akademy 2007 Call for Location,
OpenDocument Day at Akademy 2006, ESC Boston, LinuxWorld Healthcare Day,
OO.o conf program, Ireland PyPy sprint, the Linux Quilt.