LWN.net Weekly Edition for September 15, 2011
LPC: Booting and systemd
At the 2011 Linux Plumbers Conference, it was not entirely unusual to hear complaints that the sessions were not as energetic and discussion-oriented as they were in previous LPCs. Chances are good that anybody talking that way did not attend the "Boot and Init" session, which was an occasion for vigorous - if good-natured - debate. This article will cover two of the topics discussed there: booting the system and the init process.
Reworking the boot sequence
According to Harald Hoyer, Linux does not lack for available boot loaders; indeed, we have far too many of them. These boot loaders are becoming more complex in unwelcome ways. GRUB and GRUB2, for example, contain reimplementations of a number of filesystems used in Linux. GRUB developers work hard to keep up, but they often find themselves one step behind what is being done in the kernel. GRUB2 has made things worse by turning its configuration file into a general-purpose scripting language; that adds a bunch of complexity to the bootstrap process. We also see battles between distributions (and non-Linux operating systems) over who controls the master boot record (MBR) on the disk.
Harald had a proposal for improving the situation: rather than add complexity to boot loaders in an attempt to keep up with the kernel, why not just boot a simple generic Linux kernel and let it deal with the rest? His idea is to create a /firstboot directory with a simple filesystem and populate it with a single Linux kernel and an initramfs image whose sole purpose is to find the real kernel and boot that. This kernel will naturally understand Linux filesystems, and it will support a user space with enough power to run whatever scripts are needed to find other bootable images on the system. Meanwhile, the initial boot loader can be made to be as simple as possible and distributions can stop messing with the MBR.
The idea has some clear appeal, but it was not universally accepted by the others in the room. To many, it looks like trying to solve the boot problem by adding an extra level of indirection. In the process, it adds another kernel bootstrap which, in turn, will make the boot process longer and, arguably, more likely to fail. It is safe to say that no consensus was reached in the room; the work will presumably continue and will be judged on its merits when it is more advanced.
Systemd
The bulk of the time in this session was spent discussing Systemd. Lennart Poettering talked at length about what has been accomplished in the last year and where Systemd can be expected to go in the future. Suffice to say that, as always, he does not lack ambition or a willingness to stir things up.
In the last year, Lennart said, we have seen the first release of a distribution using Systemd by default - Fedora 15. Mandriva has recently released a Systemd-based version, and others (including openSUSE and "a couple of others") are in the works. He seemed well pleased with the adoption of Systemd so far.
Systemd is now able to boot a system without invoking any shells at all. Under "ideal circumstances" it can get to a running user space less than one second after startup. Not everybody gets to run under ideal circumstances; the goal for the rest of us is less than ten seconds. There are some significant challenges in the way of getting there, though; for example, just loading the SELinux policy can take a few seconds by itself. Starting up the logical volume manager (LVM) can also take a while; Lennart proposes to fix that one by just removing LVM and using the volume management features in Btrfs instead.
Lennart paused here to make the point that Systemd is now a capable init system. But that's not where it stops; the plan is for Systemd to be a platform on which a number of interesting things can be built.
There was a bit of discussion over moving functionality into Systemd. For example, not everybody is happy with moving the setting of the host name into the program. Lennart's position is that this task is done with a single system call; invoking a separate binary for that just isn't worthwhile. Others disagreed with this assessment. There was similar disagreement over setting the system clock directly instead of using hwclock; once again Lennart thinks it is too simple a task to require a separate program, especially one as filled with legacy cruft as hwclock is. Scott James Remnant asserted that said cruft is what makes hwclock actually work for all systems and asked whether Lennart planned to refuse to support older machines. Lennart's response was that older hardware is fine; what he is not supporting is older kernels that lack proper realtime clock drivers.
In general, though, he said that, while Systemd is trying to simplify common initialization tasks and make them fast, there is nothing preventing people from using external programs like hwclock if that is what they want to do. Systemd carefully avoids taking away the ability to use older tools if that's what's needed. When those tools are not necessary, though, Systemd aims to be able to completely boot a system with only a very small number of other packages (such as glibc, d-bus, and util-linux) installed. In the process, he hopes to standardize the boot process across distributions, getting rid of lots of little differences that do not need to be there.
A moment was spent defending Systemd against the charge of being bloated. It is not bigger than it needs to be, Lennart said, and embedded developers can use configuration options to trim it down considerably if there is functionality that they do not want. Systemd is being picked up by embedded distributions like Yocto and Ångström.
There are a number of interesting changes coming into Systemd in the near future. One of those is the elimination of getty processes at startup time. Instead, Systemd will start a getty on demand if and when the user switches to a virtual console. The user experience will be the same, but there will be fewer processes cluttering the system.
All services started by Systemd will have their standard output and error streams connected to syslog by default. That makes it easier to write services; it even supports the severity notation used by printk() in the kernel. The downside is that verbose processes can clog the system log, but, Lennart said, that should be fixed by shutting those processes up.
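As an illustration of the severity notation, a service could prefix each line it writes to standard error with a kernel-style "<N>" marker. What follows is only a minimal sketch, assuming the usual syslog level numbers (3 for errors, 6 for informational messages); the helper names tag() and log() are hypothetical and are not part of Systemd.

```python
import sys

# Syslog/printk severity numbers (lower is more severe).
LEVELS = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
          "warning": 4, "notice": 5, "info": 6, "debug": 7}


def tag(level, message):
    """Prefix a message with a printk-style "<N>" severity marker."""
    return "<%d>%s" % (LEVELS[level], message)


def log(level, message, stream=sys.stderr):
    # One message per line; the init system is assumed to parse the prefix
    # and hand the rest of the line to syslog at that severity.
    print(tag(level, message), file=stream, flush=True)


if __name__ == "__main__":
    log("info", "service started")
    log("err", "could not open configuration file")
```

A service written this way needs no syslog library of its own; run from a terminal, it simply prints the prefixed lines.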
"Presets" are another upcoming feature. Each distribution has its own policy regarding whether services should be started by default and where the exceptions are. Fedora, for example, requires explicit administrator action to start any service, while Debian tends to assume that, if a service is installed, it is meant to be run. The preset feature allows the distributor to encapsulate that policy in a single file outside of the packages for those services. Spins or derivative distributions can use it to create a different policy without needing to modify the packages themselves, and administrators can impose their own policy if they wish.
Further in the future is the idea of using Systemd to manage sessions. The problems encountered at that level, Lennart said, are quite similar to those encountered at initialization time. It's really just a matter of starting a set of programs and keeping track of them. He had hoped to have session management ready for Fedora 16, but that didn't happen, so the current target is Fedora 17.
As part of this work, Lennart would really like the kernel to present a single view of an "application," which can involve any number of processes. For example, it would be nice to give specific applications access to certain ports through the firewall. Control groups handle this task reasonably well, so that is what Systemd is using. He is also trying to create a unified view of a "session" encompassing its control group, desktop, login information, PAM credentials, etc.
Specific goals for Fedora 17 include finishing this user session work. There should also be multi-seat support. Imagine plugging in a USB hub with keyboard, mouse, audio port, and frame buffer device; Systemd will pick it up, start a GDM session, and all of it will just work with no configuration work required at all. This feature will be nice for settings like schools where one system can easily handle multiple users; he also noted that it can be highly useful for debugging embedded systems. Once upon a time, all Unix systems were multi-seat; he is, he said, just bringing back a feature that was in Unix at the very beginning. One side effect of this work will be the removal of ConsoleKit.
There was some talk of removing the cron daemon, but that seems unlikely to happen. What may happen instead is a movement of all the standard system cron jobs to Systemd, with the result that cron becomes an optional utility. There was some interesting talk of using wakeup timers to set up jobs that can actually power up the system to run. Cron itself is a useful tool with a nice-enough interface, though, so there doesn't seem to be any real reason to replace it; it will probably only be started if actual configuration files are found.
Finally, there was a bit of talk about Systemd's socket activation mechanism and security. Evidently "the SELinux folks" (not named in the discussion) do not like this feature because Systemd represents a third, uncontrolled process in the connection between client and server. But Lennart pointed out that Systemd never reads data from sockets; it simply uses them in the activation process. And, in any case, Systemd is charged with loading the SELinux policy in the first place; if it cannot be trusted, the system has larger problems.
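To make the hand-off concrete, here is a rough sketch of the receiving side of socket activation. It assumes the convention in which the service manager sets the LISTEN_PID and LISTEN_FDS environment variables and passes the already-listening sockets as file descriptors starting at 3. The function name inherited_fds() is made up for this example, and the code requires a reasonably recent Python 3.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor under the assumed convention


def inherited_fds(environ=None, pid=None):
    """Return the file descriptors handed over by the service manager."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the descriptors were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # variables missing or malformed: not socket-activated
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


if __name__ == "__main__":
    fds = inherited_fds()
    if fds:
        # Wrap the socket that is already listening; this process, not the
        # service manager, is the one that accepts and reads the connection.
        server = socket.socket(fileno=fds[0])
        conn, _ = server.accept()
        conn.sendall(b"hello\n")
        conn.close()
    else:
        print("not started via socket activation")
```

The point relevant to the discussion above is that the manager only creates the socket and passes it along; all reads and writes happen in the service.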
The overall picture was of a project that is on a roll, gaining features and users at a fast rate. The Systemd view of the world has not yet won over everybody, but the opposition seems to be fading. Systemd looks like the init system of the future (and more) for a lot of high-profile distributions.
Transifex expands its offerings
Transifex, the web-based collaborative string translation hub, has rolled out a major update. There are several new tools and features aimed at developers and translation teams, but the most fundamental change is that the project is now offering paid accounts for those who wish to work on closed-source projects—providing funding that will help further development of the project.
Development projects can link Transifex to their existing version control system (VCS), and Transifex will pull in and parse supported file types that contain user-visible strings. On the Transifex site, translators can start language-specific translation efforts, entering translations for each string and (if the project managers allow it) pushing the results back to the original VCS. Project managers can leave their projects relatively free-form, or set up more structured translation teams, with approval required to check changes back in.
We covered Transifex in 2009, and the service has improved considerably since that time. The 1.0 release in 2010 brought the largest change set, adopting an internal storage engine that is agnostic to the upstream VCS used by the project. The Transifex server retrieves files over HTTP, so the upstream files to be translated must be publicly accessible in raw form.
Once files are fetched, the server parses the file, saving the original version as a template with its initial strings designated as the "source language." Translators are presented with the source language strings, and can enter their translated versions in a web-based editor — which was also new in 1.0. When changes are sent back to the upstream VCS, the Transifex server uses the template as a model, inserting the new strings where appropriate, and modifying the file metadata to indicate new language support and translator identities.
1.1 features
This level of automation requires building support for specific VCSes and new importer/exporter models for each new file type added. All of the major VCSes are supported now, but the number of file formats supported is still growing. The new 1.1 release, which landed in June 2011 — although the public Transifex.net server was not updated at that time — includes several new ones, most notably Freedesktop.org .desktop launcher files and the XML Localization Interchange File Format (XLIFF).
Most of the supported formats are for software development, such as .po and .pot files for Gettext, .strings files for Mac OS X and iOS, .resx for Windows, and the various formats for Android, Java, and Qt applications. It is surprising to some, but Transifex supports file formats designed for other purposes as well. Support for XHTML, PHP arrays, and YAML enables projects to work on translation of web content, and support for the .srt, .sub, and .sbv media subtitle formats enables video caption translations.
For translators, the biggest new feature of Transifex 1.1 is "translation memory." This is a database of other translations which can be used for reference when editing a new string. Older translations from the specific project are available as a sort of local phrasebook, but the more interesting development is that the translations of other projects are accessible as suggestions, too. That could be particularly helpful when starting out a new project — if there are several possibilities for an uncommon term, it would be useful to see what other projects chose. Transifex presents the translations of similar text culled from among the other hosted projects.
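The article does not describe how Transifex finds "similar text," so the following is only a sketch of the general idea, using the fuzzy matcher in Python's standard difflib module rather than whatever Transifex actually does. The suggest() function and the small German phrasebook are invented for illustration.

```python
import difflib


def suggest(source, memory, limit=3, cutoff=0.6):
    """Return (known_source, translation) pairs whose source resembles `source`.

    `memory` maps previously translated source strings to their translations.
    Results are ordered from most to least similar.
    """
    matches = difflib.get_close_matches(source, list(memory),
                                        n=limit, cutoff=cutoff)
    return [(match, memory[match]) for match in matches]


# Hypothetical translation memory gathered from earlier work.
MEMORY = {
    "Open file": "Datei öffnen",
    "Open folder": "Ordner öffnen",
    "Quit": "Beenden",
}

if __name__ == "__main__":
    for known, translation in suggest("Open files", MEMORY):
        print("%s -> %s" % (known, translation))
```

Raising the cutoff trades fewer suggestions for closer ones; a per-project memory and a shared cross-project memory could be queried the same way.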
Picky or secretive project managers can disable the cross-project sharing feature, but still access previously-used suggestions from the individual project. A primitive version of this feature was available in Transifex 1.0, but it required making an explicit search query; the automatic suggestions are simpler to use. A spell-checker is also now built into the web-based editor, which is especially important because it auto-saves translation work.
Two new features are available from the project manager side of the interface. First, developers can enter comments on the translatable resources directly within the Transifex web editor. Those comments could include explanatory notes on specific terms, or general instructions for translators.
A bit more interesting is the "pseudo-file" auto-generation feature. Transifex can create translation files in a dummy language, which a developer can then run and use to spot any strings that somehow escaped into the interface without being marked for translation. This type of pseudo-file is called the "Dot language," which substitutes a period for every original character. There is also an option that inserts random characters into the file, which Transifex creator Dimitris Glezos said could prove useful in testing UI layouts.
#: addons/cla/handlers.py:58
msgid "License text"
msgstr "Lïקïcéקénséקé téקéxt"
By "tall characters" Glezos primarily seems to mean accented letters, which are uncommon in English, the source language of most software projects. Extra-long and extra-tall strings have the potential to push interface widgets, menus, and column text out of vertical and horizontal alignment — a problem that can be difficult to test for without changing languages. Of course, a UI that breaks or becomes mis-aligned when the strings are too short is also a possibility, but in most cases the Dot language option would reveal those problems.
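The Dot-language substitution is simple enough to sketch in a few lines. This version assumes that whitespace is left alone so that word boundaries stay visible; that choice, and the function names, are assumptions made for the example rather than a description of Transifex's implementation.

```python
def dot_language(text):
    """Replace every non-whitespace character with a period."""
    return "".join(ch if ch.isspace() else "." for ch in text)


def pseudo_catalog(strings):
    """Map each source string to its Dot-language pseudo-translation."""
    return {source: dot_language(source) for source in strings}


if __name__ == "__main__":
    for source, dotted in pseudo_catalog(["License text", "Save"]).items():
        print("%r -> %r" % (source, dotted))
```

Any text that still shows up as readable words when the application runs with such a catalog was never marked for translation.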
Finally, Transifex has also added an "Explore" interface to the site itself, which lets visitors browse featured and active projects, in order to foster community development. Right now, the explore feature highlights only the most-active and largest projects on the site. If encouraging new translators to join is part of the goal, hopefully other views will follow — such as which projects or languages are most in need of help. The individual project pages currently display this information, showing the completion-percentage for each language.
Freemium blend
Transifex was born out of Glezos's efforts to improve translation in the Fedora project, and the emphasis of the service is still squarely on open source projects. But 1.1 offers developers the ability to connect private and proprietary projects to the web service.
There are multiple pricing plans to choose from, which vary in the number of contributors and the number of source words that are allowed in private or proprietary projects. Free accounts can connect to a total of two users and amass 2,000 source words. The 30-Euro-per-month plan ups the numbers to five users and 10,000 words, and the 300-Euro plan to 20 users and 50,000 words. The 300-Euro plan also includes a distinct yourproject.transifex.net subdomain, and priority ticket and telephone support. All of the plans, free and premium, can host an unlimited number of open source projects, with no limits on the number of user accounts associated or the size of the content.
Placing premium-based restrictions on the number of user accounts and the number of words associated with private/proprietary projects seems like an odd choice — after all, one would have to count the total number of words in the strings of a project, which may be hard to predict if it is still in development. But then again, the Transifex server software is available under the GPLv2, so any entity interested in maintaining a large, closed translation effort could simply install the code on its own private server.
In the blog announcement heralding the arrival of 1.1, Transifex noted that in years past, self-hosting has been the project's answer when someone asked about running a private server. It can now offer an alternative, which parent company Indifex indicates will be used to fund further work on the main Transifex code base.
Last words
Compared to the Rosetta online translation editor offered by Launchpad, Transifex appears to be evolving at a faster pace. Rosetta supports a smaller set of file formats (just Gettext and Mozilla .xpi at the moment), although Rosetta has supported translation suggestions based on other projects' strings for several years, and supports maintaining multiple translation efforts for concurrent branches of the same project. Unless I have missed it in the interface, that last feature is not yet supported in Transifex.
It will be interesting to watch how the tiered, freemium price plans affect Transifex's development. Paid users on high-end plans get guaranteed two-day turnaround of support tickets; the crux will be whether that includes feature requests as well as bugs. The Enterprise Edition already supports more file formats and "performance optimizations".
Although Indifex says that revenue from the paid plans will be funneled back into the development of Transifex, we have all seen examples in the past where paid or "enterprise" users either begin to drive the development of new features, or get access to the new features before they are available to the free service or source code repository. Development is already underway for the next Transifex release, of course; given the large number of open source projects that now depend on it, hopefully it will be able to avoid the pitfalls of diverging free/paid interests.
LibreOffice and Apache OpenOffice.org one year later
It's hard to believe that it's been almost one year since The Document Foundation (TDF) came into existence. In that time, the foundation has made significant progress, Oracle has handed the OpenOffice.org keys to the Apache Foundation, and the LibreOffice team has been working hard to improve the suite.
OpenOffice.org has, itself, had a long strange trip. The suite began as a proprietary office suite called StarOffice developed and published by StarDivision. StarDivision was eventually snapped up by Sun Microsystems, which was ultimately swallowed by Oracle in 2010. After Oracle took over, little happened and it was unclear what plans (if any) the software giant had for OpenOffice.org.
Oracle's inaction, plus impatience over promises to create a vendor neutral foundation for OpenOffice.org, led to the decision to fork. Predictably, Oracle was not pleased and showed TDF members the door in October, 2010. Louis Suarez-Potts told the members "your role in the Document Foundation and LibreOffice makes your role as a representative in the OOo CC untenable and impossible", and gave them the option of disassociating themselves from TDF or resigning. Very little else happened with OpenOffice.org in the meantime until Oracle proposed OpenOffice.org to Apache as an Incubator project on June 1st.
LibreOffice developers didn't sit on their hands after announcing the intent to fork. LibreOffice was put on an aggressive time-based release plan, with two major releases a year. The first stable release (3.3.0) landed just four months after the split, with a number of new features. Development has continued at a fair clip, and the LibreOffice team continues to push out point releases on a regular basis. Meanwhile, most if not all Linux distributions have made the transition from OpenOffice.org to LibreOffice without any major headaches.
LibreOffice Goals Met?
When LibreOffice launched, longtime OO.org developer Michael Meeks talked to LWN about the goals for LibreOffice. Meeks said that he wanted LibreOffice to have an "All Contributions Welcome and Valued" sign welcoming contributions, to clean up the LibreOffice code, and to "target tackling many of the problems that have traditionally made it hard to develop with, such as the arcane and monolithic build system".
In February 2011, the project started fundraising to set up TDF as a legal entity. It took only eight days to raise the €50,000 that the foundation sought to incorporate the legal entity in Germany. More than 2,000 contributors donated.
At six months, TDF member Florian Effenberger observed the milestone with a post tallying the project's accomplishments. More than 6,000 people subscribed to LibreOffice mailing lists, more than 150 new contributors checked in code for LibreOffice, and the project picked up more than 50 translators as well.
The foundation is holding its first election, with voting running through October 10, to fill seven board seats and three deputy positions.
How about contributions? A snapshot of contributors to LibreOffice 3.4.2 shows that about 25% came from SUSE, about 25% were brought in from OpenOffice.org (attributed to Oracle), and about 20% from Red Hat. Contributors not affiliated with one of the big vendors also account for about 25% of the contributions. According to the post, 3.4.2 received more than 23,000 commits from 300 contributors. This may not reflect all work on LibreOffice, but it does show a pattern of heavy contribution.
(Re)-Bootstrapping OpenOffice.org
While LibreOffice continues to churn out releases, the slow work of transitioning OpenOffice.org to Apache is continuing. The incubator site is up on Apache.org, and things like the mailing lists have been put in place. The project has more than 70 committers listed, and commits have started coming in as well.
However, according to the clutch status page for Apache Incubator projects, the project has not added any committers since it was established. The project also lacks an issue tracker. There are no releases for Apache OpenOffice, not even a beta, though code is available in Apache's repository. This is not surprising, since much of the discussion on the list involves trying to build AOO successfully. The project blog has been relatively quiet, with only two posts. The first post, in June, announced the addition of Apache OpenOffice.org to the incubator. The second, on September 1st, announced an IRC-based developer education event for building OpenOffice.org on Linux.
The developer list for the AOO podling has been fairly active, though much of the recent conversation has concerned the community governance problems that need to be solved in moving from an established project to the Apache structure and new management.
The Future
Apache OpenOffice.org is still putting together its plans for builds and releases. The plans for the first Apache release include phasing out the old binary format for OpenOffice.org but not much in the way of new features. LibreOffice also will be doing away with the old binary StarOffice formats in the 4.0 timeframe. Assuming AOO.org does come online and start pouring out new features, they may be difficult to share with LibreOffice according to Meeks. This has been raised as an issue by Rob Weir on the AOO.org list.
The LibreOffice team recently had a hackfest in Munich. Some of the concrete features that came out of that include support for importing Visio format, a feature for editing headers and footers in Writer, and an initial Gerrit setup for code review. The project has also launched an extension and template repository for LibreOffice and compatible suites. The sites are in beta testing at the moment, put into place in cooperation with the Plone community.
In October, the first LibreOffice conference will take place in Paris. The conference will run from 12 to 15 October, and includes everything from media training for LibreOffice volunteers to a presentation about LibreOffice Online (LOOL) by Michael Meeks. Unfortunately, no details are provided regarding the plans Meeks has for the presentation. Perhaps we'll see a libre competitor to Google Docs at some point from the LibreOffice folks.
Coming in 3.5
The LibreOffice 3.5.0 release is planned for December. The work-in-progress release notes indicate some of the features that may appear in 3.5. Currently there's a plan to include two new numbering types for bullets (Persian words, and Arabic Abjad sequence) in Writer, and display non-printable characters at the end of a line if desired.
Calc may increase support to 32,000 sheets thanks to features from Markus Mohrhard, and users will be able to specify how many sheets are available in a new Calc document thanks to Albert Thuswaldner. There are also improvements to line drawing in Chart, and Kohei Yoshida has added some performance improvements for importing Excel documents.
Miklos Vajna has been improving import for RTF and DOCX formats, which should land in 3.5 as well. The proposed release notes also list a few GUI improvements, such as getting rid of the unused toolbar menus and sorting menus in a natural sort order (so, as an example in formatting, Heading 10 would follow Heading 9 instead of Heading 1).
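For readers unfamiliar with the term, natural sorting compares runs of digits as numbers rather than character by character. The short sketch below shows one common way to do that; it is not taken from the LibreOffice code, and the natural_key() helper is a name chosen for this example.

```python
import re


def natural_key(label):
    """Build a sort key in which digit runs compare as integers.

    re.split() with a capturing group yields alternating text and digit
    chunks, so odd positions always hold the numbers.
    """
    parts = re.split(r"(\d+)", label)
    return [int(part) if index % 2 else part.lower()
            for index, part in enumerate(parts)]


if __name__ == "__main__":
    headings = ["Heading 10", "Heading 9", "Heading 1", "Heading 2"]
    print(sorted(headings))                   # plain character-by-character order
    print(sorted(headings, key=natural_key))  # natural order
```

With the plain sort, "Heading 10" lands directly after "Heading 1"; with the natural key, it follows "Heading 9".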
One year after the split, LibreOffice looks like a fairly healthy and viable community. Apache OpenOffice.org may also grow into a viable project, though it's a bit too early to tell whether it has legs.
Page editor: Jonathan Corbet