
Distributions

LinuxCon: A tale of two bootcharts

August 25, 2010

This article was contributed by James M. Leddy

Attendees at LinuxCon 2010 were lucky enough to have not just one, but two presentations devoted to boot speed. The first was "How We Made Ubuntu Faster", by Upstart creator Scott James Remnant; the other was "Improving Android Boot-Up Time", by Tim Bird of Sony. As expected, Scott's talk centered on netbooks running Ubuntu, while Tim focused on different development boards running Android. Nevertheless, the two projects had some things in common.

Ubuntu

No discussion of boot speed would be complete without mentioning the 5 second boot achieved by Arjan van de Ven and Auke Kok of Intel's Open Source Technology Center. In fact, much of Scott's session assumed some knowledge of that effort by Intel.

[Ubuntu bootchart]

Good metrics are pivotal for improving boot time, and to get good metrics one must standardize the variables. The hardest of these is the machine, because every computer has a different mix of faster and slower components. The Ubuntu team realized they would have to buy a whole bunch of "standard" computers. They chose the Dell Inspiron Mini 10 netbook, dubbed the "touchpad from hell" by Scott because it was hard to use without the pointer jumping around. The laptop meets the key requirement of being available in both SSD and rotational media configurations, and it is cheap enough to keep the project under budget.

The next important piece is to have a goal in mind. They chose 10 seconds, by "doubling the numbers that Arjan came up with". The kernel and initramfs get a total of two seconds. The same is allocated for platform initialization such as init scripts. The X server gets another two seconds, and the desktop environment, Gnome, gets four. It turns out these numbers weren't accurate predictors in the long run, but in some cases, such as the kernel, the team was able to beat its target.
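The budget can be written down as a small table and checked against measurements. The following is only a sketch: the stage names are illustrative labels, not identifiers from the Ubuntu tooling, and the only figures used are the ones quoted in the talk.

```python
# Per-stage boot budget in seconds, as quoted in the talk.
# The dictionary keys are illustrative names, not from any real tool.
BUDGET_SECONDS = {
    "kernel_and_initramfs": 2,
    "platform_init": 2,
    "x_server": 2,
    "desktop": 4,
}


def over_budget(measured):
    """Return {stage: seconds over budget} for each stage that overran."""
    return {
        stage: measured[stage] - allowed
        for stage, allowed in BUDGET_SECONDS.items()
        if measured.get(stage, 0) > allowed
    }


total_budget = sum(BUDGET_SECONDS.values())
```

Feeding in a 10 second desktop start, the figure the talk later reported for Gnome, shows that stage alone running six seconds over its allotment.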

In order to create an automated system to measure the changes over time, the team threw together a pretty elaborate configuration where the system would reinstall the latest nightly builds, and then profile the resulting boot automatically. They compiled all the results and put them on Scott's people page.

One of the big portions of the Moblin kernel improvement was the early use of asynchronous kernel threads. They improved boot time by initializing the SATA controller, to handle storage, at the same time as the USB host adapters. Canonical built upon this work by moving populate_rootfs(), the function responsible for unpacking the initramfs, to yet another asynchronous thread.

Though Intel claimed a speed boost from statically compiling modules into the kernel, the Canonical team has to support more than just Intel netbooks, so it cannot drop the initramfs. Instead, they cleaned up some of the slower parts of the init script, for example replacing a 10 millisecond poll of the blkid binary with an event-based call to libudev. In the end the team was able to beat its 2 second target, even with the requirement to use an initramfs.

Scott took some time here to plug Upstart. Though Intel ultimately settled on a hand-tuned invocation of the old System V init daemon to improve boot, Scott insisted that an event-based system is better than "thousands of lines of shell script". This is even more true today because pretty much every system on the market has more than one CPU.

The Gnome environment took a bit of time to boot as well. Ubuntu uses Compiz by default, and this took almost half of the time allocated for the desktop environment. The audience asked if Compiz could be eliminated, but there are too many features of Ubuntu that depend on its inclusion. Other large offenders were gnome-panel and Nautilus. Altogether these components contributed to a 10 second Gnome start up, more than double their 4 second allotment.

Their research revealed that storage is the ultimate bottleneck. "Hard drives suck, but SSDs suck too" was the specific wording. To improve the situation, Scott used a well known tool called readahead. Initially developed at Red Hat, readahead is a tool that will log the filename for every instance of open() and execve() for the first 60 seconds of boot. Then on the next boot, a readahead process is spawned early that pulls all files in the list into the page cache, ensuring it's just a simple memory access when they're read later on.
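The replay half of this idea is simple enough to sketch. The code below is not the readahead tool itself. It is a minimal illustration that assumes the list of paths was recorded on an earlier boot, and the function name warm_page_cache() is made up for the example.

```python
def warm_page_cache(paths, chunk_size=1 << 20):
    """Read every listed file once so its pages land in the page cache.

    Returns the number of bytes read. Files that are missing or unreadable
    are skipped, since a list recorded on an earlier boot may be stale.
    """
    total = 0
    for path in paths:
        try:
            # Unbuffered binary reads; the data itself is thrown away,
            # the point is only to make the kernel fetch it.
            with open(path, "rb", buffering=0) as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
        except OSError:
            continue
    return total
```

Run early in boot from a background process, a loop like this trades disk idle time for later cache hits. It reads whole files, which is the behavior the next iteration of the tool set out to improve.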

Intel improved Red Hat's readahead with super-readahead, or sreadahead. This does the same thing, but was modified to only preload the blocks that will be read in, instead of the entire file. Since it's assumed to be a MeeGo system running on SSDs, and all SSDs have negligible seek time, the order of blocks on disk is not taken into account. Using an SSD, the Ubuntu system can read all blocks necessary for boot in 3 seconds.

[Readahead graph]

However, Ubuntu has to run on rotating media as well, so Scott created yet another iteration called über-readahead. The daemon was modified so that it reads blocks in order when using a rotating hard drive. The graph of this optimization shows a few random metadata reads, followed by a smooth linear path across the platter. On rotational media, all of the necessary pages can be read into the page cache in less than 7 seconds. Scott said that things could go even faster if the initial metadata reads could be sorted and done in order before performing readahead on the file contents. There were a few file system patches sent to LKML, but inclusion does not seem likely at this point.
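The difference between the SSD and rotational strategies comes down to the order in which recorded extents are read. The sketch below assumes the on-disk position of each extent is already known (a real tool would have to ask the file system for it). The Extent type and both function names are hypothetical, not taken from any readahead implementation.

```python
from typing import List, NamedTuple


class Extent(NamedTuple):
    path: str
    physical_start: int  # position on disk in blocks (assumed known)
    length: int          # number of blocks


def plan_reads(extents: List[Extent], rotational: bool) -> List[Extent]:
    """Order extents for preloading.

    On rotational media, sort by disk position so the head sweeps in one
    direction. On an SSD, seeks are cheap, so the recorded order is kept.
    """
    if not rotational:
        return list(extents)
    return sorted(extents, key=lambda e: e.physical_start)


def seek_distance(plan: List[Extent]) -> int:
    """Total head travel in blocks, starting from block 0."""
    distance = 0
    position = 0
    for e in plan:
        distance += abs(e.physical_start - position)
        position = e.physical_start + e.length
    return distance
```

Comparing seek_distance() for the two plans on the same extent list gives a rough feel for why the sorted order produces the smooth line in the graph.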

Scott concluded the discussion by admitting they didn't achieve their goal. With the desktop environment alone taking about ten seconds, a sub-ten-second overall boot was out of reach. Note this is on a Dell netbook, so the numbers will likely be better for systems with beefier processors and I/O subsystems, which includes almost all desktops and traditional laptops. The presentation abstract states that some machines boot Ubuntu in as little as 5 seconds. The good news is that the kernel now takes less than 2 seconds to initialize, even with the initramfs requirement. And Scott did a lot of useful work that can be used by the larger community. Only time will tell if the other distributions take advantage of his work.

Android

Tim ran into a unique set of problems with Android handsets. Whereas Scott's problems were already well known in the much wider desktop Linux community, Tim is working with a suite of tools with names like Dalvik and Zygote, whose source code has rarely been modified outside of Google. As such, Tim focused on getting an initial performance profile to find which parts of the boot process would yield the largest reduction in time, and therefore deserve the most developer effort.

[Android board]

He profiled three different platforms: the ADP1 and Nexus One from HTC, and an EVM OMAP3 board from Mistral Solutions. The overall time for these machines to boot was 57, 36, and 62 seconds, respectively. Though these are all ostensibly development machines, those numbers still seemed huge compared to the netbook boot times, though it should be mentioned that these boards have much slower processors and storage. By the same token, you can do a lot more with a fully functional Gnome desktop than with a smartphone. Tim pointed out that "it's really sad that you can use a stopwatch to accurately measure a phone booting up".

The boot chart for the EVM board revealed a number of areas for improvement. Android uses a rewritten Java-esque virtual machine called Dalvik in all of its phones. For optimal user experience, all of the classes must be preloaded before the phone is used. Zygote, the utility responsible for doing this work, spends about 21 seconds in I/O wait. The application classes don't have to be preloaded; one can choose to load them on demand, but this just pushes the problem back and causes longer application load times. Worse, there is a memory penalty, because each class now has to be loaded into a different heap, so the memory for identical classes can't be shared.

[Android bootchart]

A potential solution is to figure out how to load every class into Zygote's heap, so that you can have something akin to shared libraries in a conventional OS. Another possible solution is to make Zygote threaded, and have one thread use the CPU while the other is reading from storage. A more far out possibility avoids reading in the classes at all and loads the heap as a binary blob, though this would take the most development effort and would require a rebuild when new classes are installed.
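The threaded option is a classic producer/consumer arrangement. The sketch below is written in Python rather than anything resembling Zygote's actual code. The read_class and define_class callables are hypothetical stand-ins for "fetch the class data from storage" and "do the CPU work to set the class up".

```python
import queue
import threading


def load_classes(names, read_class, define_class, depth=8):
    """Overlap I/O and CPU: one thread reads while the caller processes.

    read_class(name) is assumed to block on storage; define_class(name, data)
    is assumed to be CPU-bound. Both are placeholders for this illustration.
    """
    pending = queue.Queue(maxsize=depth)  # bounded, so reads can't run far ahead
    done = object()                       # sentinel marking the end of the stream

    def reader():
        try:
            for name in names:
                pending.put((name, read_class(name)))
        finally:
            # Always signal completion so the consumer never hangs,
            # even if a read fails part-way through.
            pending.put(done)

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()

    loaded = {}
    while True:
        item = pending.get()
        if item is done:
            break
        name, data = item
        loaded[name] = define_class(name, data)
    thread.join()
    return loaded
```

While the consumer is busy with one class, the reader is already waiting on storage for the next, which is the overlap the proposal is after. How much it would help depends on how the 21 seconds of I/O wait compare with the CPU time, which the talk did not break down.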

The other potential speed gain lies in the package scanning tool. The purpose of this tool wasn't exactly obvious to Tim, but he illustrated its complexity by showing the call tree. At the end of it all is the parseZipArchive() function, which is called 138 times. There is some low-hanging fruit there; for example, Tim shaved off a few seconds by commenting out a sanity check of the zip file headers. Just above that is a ZipFileRO::open() call, which will mmap() the zip file into memory. The problem is that parseZipArchive() walks the mapped region and builds a hash table to make subsequent accesses easier, causing page faults for the entire archive. All this is done just to extract one file, AndroidManifest.xml, so the time and memory spent to fault in all those pages and build the hash table is essentially wasted.
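To make the waste concrete, here is a rough sketch of pulling a single entry out of a zip archive without building an index of every entry. It is not Android's code and not a proposed patch. It walks the central directory until it finds the wanted name and then touches only that entry's data. The function names are invented for the example, and ZIP64 archives, encryption, and CRC checking are deliberately left out.

```python
import mmap
import struct
import zlib

# Fixed-size record layouts from the zip file format.
EOCD = struct.Struct("<4s4H2LH")     # end of central directory (22 bytes)
CDFH = struct.Struct("<4s6H3L5H2L")  # central directory file header (46 bytes)
LFH = struct.Struct("<4s5H3L2H")     # local file header (30 bytes)


def _extract(m, local_offset, method, compressed_size):
    """Return the decompressed data for the entry at local_offset."""
    header = LFH.unpack_from(m, local_offset)
    if header[0] != b"PK\x03\x04":
        raise ValueError("corrupt local file header")
    data_at = local_offset + LFH.size + header[9] + header[10]
    raw = m[data_at:data_at + compressed_size]
    if method == 0:   # stored
        return raw
    if method == 8:   # deflate, raw stream without zlib header
        return zlib.decompress(raw, -15)
    raise ValueError("unsupported compression method")


def read_one_entry(path, wanted):
    """Find one named entry and return its contents, stopping at first match."""
    name = wanted.encode("utf-8")
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        eocd_at = m.rfind(b"PK\x05\x06")
        if eocd_at < 0:
            raise ValueError("not a zip archive")
        eocd = EOCD.unpack_from(m, eocd_at)
        entry_count, pos = eocd[4], eocd[6]
        for _ in range(entry_count):
            h = CDFH.unpack_from(m, pos)
            if h[0] != b"PK\x01\x02":
                raise ValueError("corrupt central directory")
            name_len, extra_len, comment_len = h[10], h[11], h[12]
            start = pos + CDFH.size
            if m[start:start + name_len] == name:
                # h[16]: local header offset, h[4]: method, h[8]: compressed size
                return _extract(m, h[16], h[4], h[8])
            pos = start + name_len + extra_len + comment_len
    raise KeyError(wanted)
```

Because the file is mapped rather than read, only the pages holding the central directory and the one matching entry should be faulted in. Calling read_one_entry("app.apk", "AndroidManifest.xml") on a hypothetical package would return the manifest bytes without any per-archive hash table.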

There is an emerging consensus within the Android development community that a lot of time can be shaved off if readahead is used. But Tim thought it was masking the underlying problem, and that some of the blocks shouldn't be read at all, much less used to populate the page cache. Scott, who was in the audience at this discussion, noted that readahead isn't really about masking poor locality of reference or "papering over problems"; it is about using the CPU while populating the page cache. Tim still felt that using readahead would make the problems with the code less noticeable, and developers wouldn't be as motivated to fix them. They both agreed that when this ships on a device to consumers, it should have readahead enabled.

Unfortunately, there have been no significant speedups in boot time yet, but there is still work to do. Interested readers are encouraged to sign up for the Android mailing lists, and check out the eLinux wiki.

Conclusions

Though Android and Dalvik are a departure from the traditional GNU userspace that Ubuntu uses, they do have some commonalities. Firstly, because the kernel is not impacted by user-space differences, kernel improvements will be available for any and all Linux devices. Tim didn't mention the kernel because there are already a lot of well known techniques to boot the kernel faster, so it was outside of the scope of his talk. Presumably, the techniques covered in the Ubuntu presentation would also help the Android system boot more quickly.

Some improvements in user space carry over as well. Readahead is a generic enough technique that it can be included in pretty much any environment. Similarly, profiling techniques like bootchart and ftrace can be run in both environments. However, generic GNU userspace has the advantage of more code sharing and reuse than third party environments like Android. Improvements to booting the X server, for example, will be felt across Ubuntu, MeeGo, and the other desktop Linuxes out there. That isn't the case for Android.

Even so, the developer community for Android is growing and Tim is evidence of that. The problem of slow Android boots has probably not been thought about much outside of Google's office walls, but that is changing. The potential for improvement is there, especially in Android-specific places like the package scanner and Zygote. For desktop distributions and even specialized distributions like MeeGo, the fast boot story may be largely coming to an end. For Android, it's only just beginning.

[ Bootcharts, graph, and photo courtesy of Scott James Remnant of Canonical and Tim Bird of Sony. ]

Comments (43 posted)

New Releases

Announcing the release of Fedora 14 Alpha

The Fedora 14 "Laughlin" Alpha release is available for testing. "We need your help to make Fedora 14 the best release yet, so please take a moment of your time to download and try out the Alpha and make sure the things that are important to you are working. If you find a bug, please report it -- every bug you uncover is a chance to improve the experience for millions of Fedora users worldwide."

Full Story (comments: 22)

Lunar Linux 1.6.5 (i686 & x86_64) ISO's released

Lunar Linux is one of those distributions with a rolling release model. Development is active but ISO images are rare. In fact the last ISO release (v1.6.4) was in December 2008. Now fresh ISO images of Lunar Linux 1.6.5 are available. New features include kernel 2.6.35.3 and glibc 2.11.2, hybrid ISO support, and support for EXT4. Click below for more details.

Full Story (comments: none)

Nexenta Core Platform 3.0 Released

The Nexenta project has announced the availability of the Nexenta Core Platform 3.0. "The move to NCP 4.0 will be in 2 phases. The first immediate change would be to move from OpenSolaris b134 to a recent Illumos build. With this the Nexenta project will change it's base from OpenSolaris to Illumos."

Full Story (comments: none)

Distribution News

Distribution quotes of the week

There is some irony in the fact that the most common questions I get asked are "What is new in Fedora $CURRENT?" and "What is coming in Fedora $CURRENT+1", and the most common complaint I hear is "Why did you change $FOO in Fedora $CURRENT?!?".
-- Tom "spot" Callaway

I predict that Ubuntu 11.10 will be named "Ostentatious Ocelot".

If I'm right, someone owes me a dollar. And maybe royalties.

-- Greg DeKoenigsberg

Poor openSUSE was nearly forgotten. Their fifth birthday fell on August 9 and only one Website remembered. OMG! SUSE! offered their birthday wishes and gave a few milestones. Mentioned was the initial release of openSUSE, 10.0, in October 2005 and there have been seven major releases since. The most recent was 11.3.
-- Susan Linton by way of Linux Journal.

Comments (none posted)

Debian GNU/Linux

Meeting Minutes for the IRC Release Team Meeting on August 23, 2010

Click below for the minutes of the August 23, 2010 meeting of the Debian release team. Topics include release architectures and the status of the release notes. The next release team meeting will be in Paris, October 2-3.

Full Story (comments: none)

Bits from the MIA team

Debian's Missing In Action (MIA) team tries to track maintainers who seem to be neglecting their duties. Click below for a report from the team.

Full Story (comments: none)

Bits from the Debian Women project

The Debian Women project is recruiting women to participate in Debian, whether that be as packagers, bug reporters, technical documentation writers, bug fixers, translators, artists or any other area that helps the development of Debian. "There have been at least 38 women that have contributed in packaging software for Debian, and there are currently 11 female Debian Developers and 1 Debian Maintainer. We'd like to raise those numbers to 50 packagers by the end of 2011, and 20 Debian Developers by the end of 2012."

Comments (none posted)

Fedora

Fedora Board Recap 2010-08-20

Click below for a recap of the August 20, 2010 meeting of the Fedora Advisory Board. Topics include Fedora 14 Alpha, a discussion regarding Fedora governance, and a review of open tickets.

Full Story (comments: none)

Red Hat Enterprise Linux

Red Hat Enterprise Linux Extended Life Cycle Support

Red Hat has announced Red Hat Enterprise Linux Extended Life Cycle Support (ELS), which allows customers to continue use of Red Hat Enterprise Linux (RHEL) major releases such as RHEL 3 beyond the regular 7-year life cycle. "Available as an add-on option, Extended Life Cycle Support complements the customers in-place Red Hat Enterprise Linux subscription and is available in single-year subscriptions that allow customers to extend the total use of given major releases by extending the overall supported life cycle from 7 years up to a total of 10 years. This new offering requires the customer to have an existing RHEL subscription with equivalent subscription terms and support level."

Comments (26 posted)

Ubuntu family

Allison Randal becomes the Ubuntu Technical Architect

Allison Randal, known to many for her work in the Perl and Parrot communities, has announced that she has taken on a new role as the Technical Architect for the Ubuntu distribution. "Right at the start, I should make it clear that I am not the SABDFL. I'm here to help turn his vision into reality. That's what architects do, translate between the potential for a building and carefully measured graphite on paper, then act as a resource for the whole crew as they work together to translate an abstract plan into hard steel, warm brick, and shining glass. I'm here to champion the community's vision for Ubuntu, to facilitate conversations as we integrate multiple perspectives and balance multiple needs, to ask good questions that help us find better solutions. I'm here to help smooth some of the bumps in the road, because no road worth traveling is ever completely easy."

Comments (3 posted)

Ubuntu drops support for ia64 and sparc

The Ubuntu tech board adopted a policy in June that the ia64 and sparc architectures would be dropped if nobody stepped forward to maintain them. Now comes the follow-through: "No one has stepped forward to take ownership. The Tech Board has subsequently voted (and approved) to decommission support for both these ports archs. The following set of patches drop support for ia64 and sparc from our Maverick kernel."

Full Story (comments: none)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

DebConf and Debian

Richard Darst has a series of blog posts (start here) about DebConf, to document the planning process. "At DebConf10, there was a BOF entitled "DebConf & Debian", discussing about the similarities and differences of Debian and DebConf. At times this became a little bit heated as people wanted more integration and transparency. I don't think the differences are that great. I don't even see them as separate. I see DebConf as a Debian team, one that has to release every year on a short time frame. Things are rushed, things are not perfect, but we welcome anyone who wants to help." (Thanks to Paul Wise)

Comments (none posted)

Spotlight on Linux: Parsix 3.6 (RC) (Linux Journal)

Linux Journal takes a quick look at Parsix GNU/Linux. "Parsix is based on Debian Testing with elements of Kanotix and Knoppix still around here and there. It defaults to English, but supports several other languages such as Finnish, French, German, Italian, and, of course, Persian. The image ships as an installable live DVD and versions are available for 32-bit or 64-bit systems. The installer is easy-to-use, fast, and reliable. Handy but perhaps slightly outdated documentation is available such as the User's Guide and an Installation walkthrough. There is also an online Forum for chatting or seeking assistance."

Comments (none posted)

Page editor: Rebecca Sobol


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds