
LWN.net Weekly Edition for July 27, 2006

The 2006 Ottawa Linux Symposium

Your editor has, once again, had the opportunity to add to his collection of Ottawa Linux Symposium T-shirts. OLS2006 was a fun and interesting event, a testament to the increasing professionalism of its organizers, speakers, and attendees. And also, of course, to the energy and vitality which drive the Linux community.

Interesting things can be seen by looking at the people who attend an event like this one. Not that long ago, the preferred attire was a shirt from a Linux event - the older, the better. While those shirts are still very much in evidence, shirts of the button-down variety are on the rise. Fortunately, there are still very few neckties to be seen (James Bottomley - next year's OLS keynote speaker - being the exception that proves the rule in this regard). There were also quite a few attendees who had clearly made the trip from Asia.

LinuxWorld may be the place to go to see what companies are doing, but OLS has clearly established itself as the event to attend to learn about what the development community - and the kernel development community in particular - is up to.

This year's schedule reveals some things about what the community is interested in. Virtualization remains a hot topic, but the emphasis has changed: Xen, the king of paravirtualization, was well represented, but was far from the whole story. Ian Pratt's Xen talk was held in one of the smaller rooms this year. The hotter topic appeared to be containers - lightweight virtualization which runs under the same kernel as the host. There is a lot of development activity around containers at the moment, and many of the people involved were at OLS to talk about it.

Last year's schedule featured exactly one filesystem talk - an update on ext3. This year, a quick scan shows no fewer than nine filesystem talks, plus a few on related topics (shared subtrees, for example). Expect to see some interesting development work in the filesystems area in the coming years.

[Greg KH] This year's keynote speaker was Greg Kroah-Hartman. Greg has posted the text of his talk along with the slides; it is such a clear representation of what was said that your editor sees no point in writing up a separate summary. The talk covered topics like hardware support (Linux is now second to none, says Greg), the illegal and unethical nature of closed source kernel modules, various aspects of the kernel development process, and more. The talk is very much worth a read.

For those who have not seen the article by Arjan van de Ven mentioned in Greg's talk: Arjan's doomsday scenario is also worth reading.

For the curious, the slides from LWN editor Jonathan Corbet's talk are available.

OLS has always been a kernel-oriented event, and the 2006 version was perhaps the most kernel-heavy yet. A look at the schedule shows almost no non-kernel talks - and most of the exceptions were concerned with the git and mercurial source control systems. The Desktop Developers' Conference was held immediately before OLS (at the same time as the Kernel Summit), but speakers from that conference did not appear on the OLS program. Their presence was very much felt, however, and there were some good conversations between developers responsible for the various levels of the full Linux system. Still, it would be nice to hear more from the desktop people at OLS next year.

The fact that such a small complaint is the first that comes to mind speaks loudly. OLS remains a top-notch technical conference with interesting speakers, good organization (even the traditionally late final keynote almost started on time this year), great conversations, and a murderous closing party. The annual Ottawa pilgrimage remains an important event for many in the development community.


OLS: On how user space sucks

Dave Jones's OLS talk, titled "Why user space sucks," was certain to be popular in a setting like this. Many of the people in the standing-room-only crowd might well have wondered why this talk was not scheduled into a larger room. Perhaps the powers that be feared that a non-kernel talk would not have a large audience - even when it is given by a well-known kernel hacker.

Dave set out to reduce the time it took his Fedora system to boot. In an attempt to figure out what was taking so long, he instrumented the kernel to log certain basic file operations. As it turned out, the boot process involved calling stat() 79,000 times, opening 27,000 files, and running 1,382 programs. That struck him as being just a little excessive; getting a system running shouldn't require that much work. So he looked further. Here are a few of the things he found:

  • HAL was responsible for opening almost 2000 files. It will read various XML files, then happily reopen and reread them multiple times. The bulk of these files describe hardware which has never been anywhere near the system in question. Clearly, this is an application which could be a little smarter about how it does things.

  • Similar issues were found with CUPS, which feels the need to open the PPD files for every known printer. The result: 2,500 stat() calls and 400 opens - on a system with no attached printer.

  • X.org, says Dave, is "awesome." It tries to figure out where a graphics adapter might be connected by attempting to open almost every possible PCI device, including many which are clearly not present on the system. X is also guilty of reopening library files many times.

  • Gamin, which was written to get poll() loops out of applications, spends its time sitting in a high-frequency poll() loop. Evidently the real offender is in a lower-level library, but it is the gamin executable which suffers. As Dave points out, it can occasionally be worthwhile to run a utility like strace on a program, even if there are no apparent bugs. One might be surprised by the resulting output.

  • Nautilus polls files related to the desktop menus every few seconds, rather than using the inotify API which was added for just this purpose (a minimal sketch of the inotify approach follows this list).

  • Font files are a problem in many applications - several applications open them by the hundred. Some of those applications never present any text on the screen.

  • There were also various issues with excessive timer use. The kernel blinks the virtual console cursor, even if X is running and nobody will ever see it. X is a big offender, apparently because the gettimeofday() call is still too slow and maintaining time stamps with interval timers is faster.
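
For those who have not used it, the inotify interface mentioned above is not difficult to adopt. Below is a minimal sketch of the non-polling approach; the watched path is an invented placeholder and error handling is kept to a bare minimum:

    /* watchdir.c - a minimal sketch of the inotify approach; the watched
     * path below is a made-up placeholder, not anything Nautilus actually
     * monitors.  Instead of waking up every few seconds to poll, the
     * program sleeps in read() until the kernel reports a change. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/inotify.h>

    int main(void)
    {
        char buf[4096];
        int fd = inotify_init();

        if (fd < 0 ||
            inotify_add_watch(fd, "/path/to/menus",
                              IN_CREATE | IN_DELETE | IN_MODIFY) < 0) {
            perror("inotify");
            return 1;
        }

        for (;;) {
            ssize_t len = read(fd, buf, sizeof(buf));
            char *p = buf;

            if (len <= 0)
                break;
            while (p < buf + len) {
                struct inotify_event *ev = (struct inotify_event *) p;

                printf("event 0x%x on %s\n", (unsigned) ev->mask,
                       ev->len ? ev->name : "(the watched directory)");
                p += sizeof(*ev) + ev->len;
            }
        }
        return 0;
    }

The read() call simply blocks until something actually changes, so an application written this way does no work at all while the watched files are quiet.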

There were more examples, and members of the audience had several more of their own. It was all great fun; Dave says he takes joy in collecting train wrecks.

The point of the session was not (just) to bash on particular applications, however. The real issue is that our systems are slower than they need to be because they are doing vast amounts of pointless work. This situation comes about in a number of ways; as applications become more complex and rely on more levels of libraries, it can be hard for a programmer to know just what is really going on. And, as has been understood for many years, programmers are very bad at guessing where the hot spots will be in their creations. That is why profiling tools so often yield surprising results.

Programs (and kernels) which do stupid things will always be with us. We cannot fix them, however, if we do not go in and actually look for the problems. Too many programmers, it seems, check in their changes once they appear to work and do not take the time to watch how their programs work. A bit more time spent watching our applications in operation might lead to faster, less resource-hungry systems for all of us.
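
One does not need a patched kernel to start doing this sort of watching. As a purely illustrative sketch - it only counts calls which actually reach the open() symbol, so it is far cruder than the kernel instrumentation Dave used - a small LD_PRELOAD shim can tally the open() calls made by a single program:

    /* opencount.c - a hypothetical LD_PRELOAD shim which counts the
     * open() calls made by one program.  A rough user-space stand-in
     * for kernel-side instrumentation, not a recreation of it. */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <fcntl.h>
    #include <stdarg.h>
    #include <stdio.h>
    #include <sys/types.h>

    typedef int open_fn(const char *, int, ...);

    static open_fn *real_open;
    static unsigned long open_calls;

    int open(const char *path, int flags, ...)
    {
        mode_t mode = 0;

        if (!real_open)
            real_open = (open_fn *) dlsym(RTLD_NEXT, "open");

        if (flags & O_CREAT) {
            /* A mode argument is only passed when O_CREAT is given. */
            va_list ap;
            va_start(ap, flags);
            mode = va_arg(ap, mode_t);
            va_end(ap);
        }
        open_calls++;
        return real_open(path, flags, mode);
    }

    static void report(void) __attribute__((destructor));

    /* Runs automatically when the traced program exits. */
    static void report(void)
    {
        fprintf(stderr, "open() was called %lu times\n", open_calls);
    }

Built with "gcc -shared -fPIC -o opencount.so opencount.c -ldl" and loaded by setting LD_PRELOAD=./opencount.so, it prints a total when the traced program exits - often a surprisingly large one.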


OLS: GCC: present and future

The GNU Compiler Collection (GCC) is a fundamental part of our free operating system. Licenses may make the software free, but it's GCC which lets us turn that software into something our computers can run. GCC's strengths and weaknesses will, thus, influence the quality of a Linux system in a big way. GCC is, however, an opaque tool for many Linux users - and for many developers as well. It is a black box, full of compiler magic, which, one hopes, just works. For those interested in looking a little more deeply into GCC, however, Diego Novillo's OLS talk was a welcome introduction.

According to Diego, GCC has been at a bit of a turning point over the last couple of years. On one hand, the software is popular and ubiquitous. On the other, it is a pile of 2.2 million lines of code, initially developed by "people who didn't know about compilers" (that comment clearly intended as a joke), and showing all of its 15 years of age. The code is difficult to maintain, and even harder to push forward. Compiler technology has moved forward in many ways, and GCC is sometimes having a hard time keeping up.

The architecture of GCC has often required developers to make changes throughout the pipeline. But the complexity of the code is such that nobody is really able to understand the entire pipeline. There are simply too many different tasks being performed. Recent architectural improvements are changing that situation, however, providing better isolation between the various pipeline stages.

GCC has a steering committee for dealing with "political stuff." There is, at any given time, one release manager whose job is to get the next release together; it is, says Diego, a thankless job. Then, there is a whole set of maintainers who are empowered to make changes all over the tree. The project is trying to get away from having maintainers with global commit privileges, however. Since building a good mental model of the entire compiler is essentially impossible, it is better to keep maintainers within their areas of expertise.

The (idealized) development model works in three stages. The first two months are for major changes and the addition of major new features. Then, over the next two months, things tighten down and focus on stabilization and the occasional addition of small features. Finally, in the last two months, only bug fixes are allowed. This is, Diego says, "where everybody disappears" and the release manager is forced to chase down developers and nag them into fixing bugs. Much of the work in this stage is driven by companies with an interest in the release.

In the end, this ideal six-month schedule tends to not work out quite so well in reality. But, says Diego, the project is able to get "one good release" out every year.

GCC development works out of a central Subversion repository with many development branches. Anybody wishing to contribute to GCC must assign copyright to the Free Software Foundation.

The compiler pipeline looks something like this:

  1. Language-specific front ends are charged with parsing the input source and turning it into an internal language called "Generic." The Generic language is able to represent programs written in any language supported by GCC.

  2. A two-stage process turns Generic into another language called Gimple. As part of this process, the program is simplified in a number of ways. All statements are rewritten to get to a point where there are no side effects; each statement performs, at most, one assignment. Quite a few temporary variables are introduced to bring this change about. Eventually, by the time the compiler has transformed the program into "low Gimple," all control structures have been reduced to if tests and gotos.

  3. At this point, the various SSA ("static single assignment") optimizers kick in. There are, according to Diego, about 100 passes made over the program at this point. The flow of data through the program is analyzed and used to perform loop optimizations, some vectorization tasks, constant propagation, etc. A small worked example of the Gimple and SSA forms follows this list; much more information on SSA can be found in this LWN article from 2004.

  4. After all this work is done, the result is a form of the program expressed in "register transfer language" or RTL. RTL was originally the only internal language used by GCC; over time, the code which uses RTL is shrinking, while the work done at the SSA level is growing. The RTL representation is used to do things like instruction pipelining, common subexpression elimination, and no end of machine-specific tasks.

  5. The final output from gcc is an assembly language program, which can then be fed to the assembler.
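
To make the Gimple and SSA stages a little more concrete, here is a small hand-written sketch. The lowered forms shown in the comments are simplified approximations for illustration only, not actual GCC dump output:

    /* A trivial C function, followed by rough sketches of what the
     * compiler's intermediate forms look like. */
    int f(int a, int b)
    {
        int x = 0;
        while (a < b) {
            x = x + a * 2;
            a = a + 1;
        }
        return x;
    }

    /* After gimplification, each statement does at most one thing,
     * temporaries carry intermediate results, and the loop has been
     * reduced to an if test and gotos, roughly:
     *
     *         x = 0;
     *     L1: if (a >= b) goto L2;
     *         t1 = a * 2;
     *         x = x + t1;
     *         a = a + 1;
     *         goto L1;
     *     L2: return x;
     *
     * In SSA form, every assignment creates a new version of a
     * variable, and "phi" nodes pick the right version where control
     * flow paths join:
     *
     *         x_1 = 0;
     *     L1: x_2 = PHI <x_1, x_3>;
     *         a_2 = PHI <a_1, a_3>;
     *         if (a_2 >= b_1) goto L2;
     *         t1_1 = a_2 * 2;
     *         x_3 = x_2 + t1_1;
     *         a_3 = a_2 + 1;
     *         goto L1;
     *     L2: return x_2;
     *
     * With each variable assigned exactly once, optimizations such as
     * constant propagation become straightforward walks over the
     * versioned names. */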

The effect of recasting GCC into the above form is a compiler which is more modular and easier to work with.

Future plans were touched on briefly. There is currently a great deal of interest in static analysis tools. The GCC folks would like to support that work, but they do not want to weigh down the compiler with a large pile of static analysis tools. So they will likely implement a set of hooks which allow third party tools to get the information they need from the compiler. Inevitably, it was asked what sort of license those tools would need to have to be able to use the GCC hooks; evidently no answer to that question exists yet, however.

Another area of interest is link-time optimization and the ability to deal with multiple program units as a whole. There is also work happening on dynamic compilation - compiling to byte codes which are then interpreted by a just-in-time compiler at run time. Much more information on current GCC development can be found on the GCC wiki.

This session was highly informative. Unfortunately, its positioning on the schedule (in the first Saturday morning slot, when many of those who participated in the previous evening's whiskey tasting event were notably absent) may have reduced attendance somewhat. This was, however, a talk worth getting up for.


Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Scatterchat; New vulnerabilities in kdelibs, mysql, ruby, ...
  • Kernel: Reconsidering network channels; revoke() and frevoke(); A proposal for a new networking API.
  • Distributions: Linux From Scratch and Beyond; Ubuntu "edgy eft" Knot 1; Pie Box Enterprise Linux 3 Update 8; Musix 0.50
  • Development: Optimizing Linker Load Times, new versions of Rivendell, pgAdmin, dnspython, smbind, Samba, LAT, CUPS, FTimes, Sussen, Apache Geronimo, OpenReports, Plone, WSMT, Jokosher, Covered, OpenSceneGraph, Crunchy Frog.
  • Press: The Open Graphics Project, AMD acquires ATI, India rejects OLPC, OSCON coverage, IBM to support SUSE, FOSS at the United Nations, the Nao robot, Xubuntu's Jani Monoses interviewed, audio and 64-bit Linux, BIND configuration, Thunderbird tips, Live migration of Xen domains, Apache Geronimo review, WiFi Radar review, the Nokia 770 Internet Tablet.
  • Announcements: DesktopSecure for Linux, The Socialtext Open wiki, GnuPG and freenigma, OpenDocument 1.0 second edition, Real-Time Linux Workshop CFP, ETel 2007 CFP, Akademy 2007 Call for Location, OpenDocument Day at Akademy 2006, ESC Boston, LinuxWorld Healthcare Day, OO.o conf program, Ireland PyPy sprint, the Linux Quilt.

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds