LWN.net Weekly Edition for February 24, 2011

The Freedom Box gets off the ground

By Jake Edge
February 23, 2011

The Freedom Box is starting to roll, with a fundraising drive that met its goals in a few short days, along with a newly formed foundation to oversee its development. What started as an idea in a talk given by Eben Moglen just over a year ago has more recently gained a lot of momentum. What can we expect to see from this "personal server running a free software operating system, with free applications designed to create and preserve personal privacy", and when can we expect to see it?

The "when" question may have become somewhat clearer since the "Push the FreedomBox Foundation from 0 to 60 in 30 days" Kickstarter fundraising effort has clearly been a success. The fundraising drive was set up on February 17, with the goal of getting $60,000 in donations in 30 days, but it has exceeded that—and quickly. As of this writing, there are more than 650 supporters who have donated over $64,000 in just five or six days. Based on the Kickstarter appeal, reaching the goal (and quite possibly far surpassing it) should result in a software release in six months. With luck, that means we will see the first Freedom Box release in August or so.

It should be noted that, perhaps a bit oddly, the project is called "Freedom Box", but the foundation is the "FreedomBox Foundation".

Like the Diaspora fundraising drive last May, the FreedomBox effort shows that there is a pool of money available for privacy-respecting tools and applications. So far, Diaspora, which is an attempt to provide a privacy-respecting Facebook alternative, has delivered some code and is running a private alpha. Whether Diaspora gains any sort of traction remains to be seen, but it may fall flat because the vast majority of internet users do not seem to put privacy anywhere near the top of their priority lists.

But, clearly some internet users do have a privacy focus and are willing to fund projects they see as advancing that agenda. There are also a large number of people whose privacy may be more than just a preference and is, instead, a life or death matter. For those folks, what will the Freedom Box offer? The high-level goals are spelled out on the foundation's website; the basic idea is to decentralize web applications and services, so that governments, companies, and other organizations will find it difficult to disrupt or eavesdrop on Freedom Box users' communications. To accomplish that, the project's goals are quite ambitious.

The goals

Unlike some other projects, Freedom Box is not just a software solution. It is targeting various types of low-end hardware servers to run a Debian-derived Linux system that implements its plans. The current targets are so-called "plug computers" (or "plug servers"), which are small, low-cost, low-power computers that often have the form factor of a "wall wart" power supply. These devices would be always-on gateways to the internet, with an interface that allows them to be used by both technically savvy and less sophisticated users.

While providing "safe social networking" is one of the aims of the Freedom Box, it is only part of the picture. The project wants to protect users' data as well as their communications, including internet traffic, email, and voice. Beyond that, Freedom Box is specifically targeted at routing around ISPs' restrictions on the types of traffic they will carry, as well as attempts by governments to impose similar restrictions. In short, the goals of the Freedom Box live up to Moglen's original vision, as spelled out in his February 2010 talk at the New York branch of the Internet Society and in a more recent talk at FOSDEM 2011: it is geared towards restoring users' freedoms.

Those freedoms are best guarded by keeping our data safe within the walls of our homes, because there are typically more legal protections there than there are when storing data on some company's servers. We have already seen that companies will often bow to governmental pressure in ways that would be more difficult to orchestrate when the data is spread out across the net. To that end, Freedom Box also plans to provide ways to securely back up encrypted data on friends' and neighbors' servers. In addition, it will provide ways for those under repressive regimes to anonymously publish information, such that those regimes will find it difficult to stop or track down the publishers. If the Freedom Box is going to handle all of these kinds of things, obviously the security of the device itself is paramount, but it is also targeted at protecting other systems in the home that live "behind" the Freedom Box.

Did we mention that it is an ambitious vision? It is that, without question, and will certainly not be fully delivered in the six-month time frame. One would guess it will be a few years before it fulfills all of its goals, but those goals are important.

Development

Development, or at least planning, has been taking place on the Debian wiki's Freedom Box project page. One would guess that the infusion of some funding will accelerate the process, but there is already a fair amount of information about the parts and pieces that could come together as the Freedom Box. As Moglen has said, almost all of those pieces needed for the project already exist in one form or another. In some sense, the project will be an integration effort for many different free software projects. That part will be tricky for sure, but fairly straightforward; the harder part will be getting the user interface "right".

The Debian Freedom Box "vision statement" describes that part of the problem well:

In order to bring about the new network order, it is paramount that it is easy to convert to it. The hardware it runs on must be cheap. The software it runs on must be easy to install and administrate by anybody. It must be easy to transition from existing services.

There are a number of projects working to realize a future of distributed services; we aim to bring them all together in a convenient package.

Making all of the envisioned functionality easy to configure and use will be an enormous challenge. Focusing on just a few—or even one—hardware platform(s) will help with that process, but there are a lot of disparate pieces to be integrated—and to be made to mostly "just work". It would appear that the planning for that part has barely started, but there has been some work done on defining and describing the underlying guts of the system.

The "Design and ToDos" page outlines the base system as well as the extensions—based on existing free software tools—that will replace various "cloud" services (Facebook, Twitter, Flickr, Dropbox, Google Calendar and Reader, and so on) that are in use today. It also has a list of issues that underscores the amount of work to be done.

The base system will be based on Debian (obviously) with encrypted filesystems (which immediately raises a question about key/password management for users), a web server, AppArmor for security, a configuration system possibly based on Config::Model, and Tor for anonymous communications. The server extensions that are listed cover all kinds of different services including web-based email (Roundcube, SquirrelMail, ...), blogging (WordPress, Drupal), file sharing (SparkleShare, ownCloud, ...), telephony (Asterisk, Yate), social networking a la Facebook (Appleseed, Jappix, Diaspora), and so on. The extension list seems to cover most or all of the web applications and services that folks are using today, but it's a little hard to say if, for example, SquirrelMail is truly an acceptable Gmail alternative.

The project mailing list starts back in August, but the posting volume trailed off late last year. Since the advent of the FreedomBox Foundation, along with Moglen's FOSDEM talk, things have rapidly picked back up. Discussions there have mostly centered on high-level requirements, thoughts, and plans.

Funding and the role of the foundation

One of the more interesting postings to freedombox-discuss was a transcription of an IRC question and answer session with Ian Sullivan, who is helping to coordinate the activities of the foundation. The Q&A was held on February 18 on the #freedombox channel on OFTC, and outlined some of the goals of the foundation along with the plans for the funds that are being raised:

The biggest part of the work is getting a team together with solid integration and technical design skills so that we can start coming together on general design ideas and roadmaps. Coordinating that is the biggest role for the foundation at this step. But as we've all seen, there are so many different places to start and so many different angles, it is easy to get stymied and lose the initiative. So the kickstarter goal is to get the foundation enough resources to enable it to start filling that role.

Presumably, how the funds will be used will be dependent on how much is raised. The current plan is not to hire full-time developers—$60,000 wouldn't go very far in doing so anyway—but to use the funds as something of a seed to get more people involved. Sullivan mentioned the idea of "buying plug computers and sending them to developers who promise to work on the project" as one possibility for using the funds. But, part of the idea of the funding drive is to increase the visibility of the project and, hopefully, increase the enthusiasm of potential contributors:

There are a lot of people who have expressed interest in the project, and even more firm commitments of time and effort, but it is too easy for all of that to keep in a holding pattern with everyone thinking that they will move after person X has moved or milestone X has been reached. If we can raise this funding, it will enable us to get some full time support and will shake up a lot of people who have been interested, but who are not yet convinced that now is the right time.

Clearly the project and the foundation are in their early stages, with much left to be worked out—not just technically, but organizationally as well. The foundation's web page notes that "in coming weeks we will be announcing here the technical leads for Freedom Box and its component projects". The foundation is incorporated as a Delaware non-profit and will seek non-profit recognition by the US Internal Revenue Service (IRS) "as soon as the paperwork is ready", Sullivan said.

Sense of urgency

Recent unrest in the Middle East, along with the internet shutdowns by the Egyptian and Libyan governments, has clearly increased the sense of urgency around the need for a device like the Freedom Box, as the Kickstarter appeal makes clear:

What we need is the glue to hold all of that together, the architecture of which pieces stack together in which way to turn a collection of possibilities into an appliance so easy to use that you forget you even have one, at least until that moment when you really need it. The FreedomBox Foundation was built to put this all together. It was started by community leaders with long track records and lives as a community project. But the past few months have shown us all that there are millions of people around the world who need such a device now and we need to pick up the pace and get them made so that next time, our friends have some help.

In the end, $60,000 is not a lot of money for a project of this scope. Even if the amount doubles (or more) before the Kickstarter campaign ends, it's really just a drop in the bucket. Moglen was quoted in the New York Times as saying that "slightly north of $500,000" would be enough to develop Freedom Box 1.0 in a year, so one might guess that the foundation has some other fundraising plans—perhaps approaching well-heeled individuals, other foundations, or companies to make up the difference. The interest and enthusiasm shown by the Kickstarter effort may be enough to shake loose some bigger donations.

The problem that the Freedom Box is seeking to solve is real, and recent events have only helped clarify that. We will have to wait and see whether the project and foundation are successful in solving it. Even if they fail, which is an outcome few would hope for, all of the work that is done will be available to others who want to head down that path. That is just another example of the freedom inherent in free software.

Python 3.2: toward the future of the language

By Jonathan Corbet
February 23, 2011

The Python 3.2 release was announced on February 20, exactly 20 years after 0.9.0, which was the first public Python release. Given that Python 2.x remains the version of the language used by most programmers and most existing code, one might be tempted to write off this release as being relatively unimportant. But the 3.2 release has some changes which will be important to Python developers going forward, so, even if one isn't planning on moving to Python 3 right away, this release merits a quick look.

Since Python is under a moratorium on the addition of new language features, one might think that a new release - even a major release - would be relatively boring. But the moratorium only applies to the core language; the libraries - which is where much of the interesting action is to be found - are unaffected. A look at the What's new in Python 3.2 document indicates that the libraries are evolving quickly indeed. Some of the more significant changes include:

  • A new "argparse" module for the handling of command-line options. Those of us still using getopt have been left far behind; the current "optparse" module has also been deprecated as of version 2.7. Argparse would appear to go beyond mundane argument parsing into the creation of command-line languages. It can probably handle more details than most people will ever want to use.

  • There is an ongoing effort to gather concurrency-related modules under the "concurrent" namespace. The first addition there is concurrent.futures, a mechanism for the submission and management of tasks in multi-threaded and multi-process environments.

  • The handling of compiled .pyc files has changed to reflect an environment where multiple Python runtimes coexist. They now have the interpreter name and version built into their names and have been banished into a separate __pycache__ directory. There is a similar mechanism for the handling of shared libraries.

  • Many other modules have seen significant improvements; see the "what's new" document for details.
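As a rough illustration of the library additions listed above, here is a minimal sketch that parses a command line with argparse, hands the work to a thread pool from concurrent.futures, and asks the interpreter where the compiled file for a source file would be cached. The function names (build_parser, square, main) and the file name example.py are hypothetical, chosen only for this example. Note also that the sketch assumes a later 3.x release: importlib.util.cache_from_source was added after 3.2, which offered the same helper in the imp module.

```python
import argparse
import importlib.util
from concurrent.futures import ThreadPoolExecutor


def build_parser():
    """Build a small command-line parser (hypothetical options)."""
    parser = argparse.ArgumentParser(description="Square some integers.")
    parser.add_argument("numbers", type=int, nargs="+",
                        help="integers to square")
    parser.add_argument("--workers", type=int, default=2,
                        help="number of worker threads")
    return parser


def square(n):
    """The task submitted to the pool."""
    return n * n


def main(argv):
    """Parse argv and compute the squares on a thread pool."""
    args = build_parser().parse_args(argv)
    with ThreadPoolExecutor(max_workers=args.workers) as pool:
        # Executor.map() yields results in the order of the inputs.
        return list(pool.map(square, args.numbers))


# Pass an explicit argument list rather than relying on sys.argv.
print(main(["--workers", "2", "3", "4", "5"]))  # prints [9, 16, 25]

# Path of the cached bytecode file for a hypothetical example.py; the
# interpreter name and version are embedded in the file name.
cached = importlib.util.cache_from_source("example.py")
print(cached)
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor is meant to be a small change, since the two share the Executor interface, though a process pool additionally requires that the tasks and their arguments be picklable.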

A couple of the most significant improvements may be elsewhere, though. One of those is the definition of a stable ABI for extension modules. Anybody who has been through a Python version update knows that the associated rebuilding of extension modules is not a lot of fun. As of version 3.2, modules which restrict themselves to a subset of the extension module ABI should continue to work indefinitely into the future. It's not yet clear how many real-world modules can live within the restrictions of this ABI; also unclear is how much that ABI could be extended without slowing further development of the language. But it's a step in the right direction toward the solution of a real problem.

Another partial solution to an ongoing problem can be found in the rewrite of the global interpreter lock (GIL). The GIL is Python's equivalent to the kernel's Big Kernel Lock; it ensures that only one thread can be executing in the bytecode interpreter at any given time. Since running bytecode is what Python programs do, the GIL can be seen as a rather significant constraint on how much concurrency is possible in a multi-threaded environment. Some extension modules release the GIL while they are doing extensive computations, and the GIL (like the BKL) is released while waiting for I/O, but that doesn't solve the real problem. The failure to remove (or at least reduce the role of) the GIL during the Python 3 development process is, for many developers, one of the biggest disappointments of Python 3.

The 3.2 GIL rewrite does not change the fundamental nature of the GIL, but it does reduce its impact somewhat. As described by Antoine Pitrou, the principal hacker behind this work, two significant changes have been made:

  • Previously, the GIL would be passed from one contending thread to the next after a certain number of opcodes had been executed. But opcodes do not execute in constant time, and some of them (such as calls into an extension module) can execute for a long time indeed. The new GIL is, instead, passed on after a bounded time period (5ms by default).

  • The GIL is implemented in an inherently unfair manner; once it has been released, any thread which comes along can claim it. Prior to 3.2, that "any thread" was often the thread which had just released the lock. That thread is supposed to wait before attempting to reacquire the GIL, but the fact that it is running and cache-hot means it's still likely to get there first. The new GIL is still unfair, but it will at least force the releasing thread to wait until a contending thread has acquired the lock. That should fix some of the long latencies seen by Python programmers in some situations.
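The time-based handoff described above is exposed to programs through sys.getswitchinterval() and sys.setswitchinterval(), both added in 3.2. The following is a minimal sketch, not a benchmark: it reads the interval, shortens it while two CPU-bound threads run, then restores it. The helper names count_down and run_two_threads, and the loop count, are hypothetical.

```python
import sys
import threading


def count_down(n):
    """CPU-bound loop; the thread holds the GIL until asked to yield."""
    while n > 0:
        n -= 1


def run_two_threads(n):
    """Run two contending CPU-bound threads to completion."""
    threads = [threading.Thread(target=count_down, args=(n,))
               for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


# The interval is expressed in seconds; CPython's default is 0.005 (5ms).
original = sys.getswitchinterval()
print("switch interval (seconds):", original)

try:
    # Ask for more frequent handoffs between contending threads.
    sys.setswitchinterval(0.001)
    shortened = sys.getswitchinterval()
    run_two_threads(100000)
finally:
    # Restore the previous setting, since it applies interpreter-wide.
    sys.setswitchinterval(original)
```

A shorter interval can reduce latency for threads waiting on the GIL at the cost of more frequent switching; it does not allow two threads to run bytecode at the same time.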

Given the scalability limitations inherent in a single, global lock, one might think that eliminating that lock would be a priority for the Python developers. The Python glossary suggests that this isn't the case:

Past efforts to create a "free-threaded" interpreter (one which locks shared data at a much finer granularity) have not been successful because performance suffered in the common single-processor case. It is believed that overcoming this performance issue would make the implementation much more complicated and therefore costlier to maintain.

The addition of fine-grained locking which did not hurt single-threaded code could certainly be a bit of work; it might well involve techniques like run-time patching of the interpreter. For a system which is supposed to run on many operating systems, such a solution could indeed be brittle and hard to maintain. In its absence, though, the scalability of multi-threaded Python programs will continue to be limited.

That said, Python 3 is clearly getting better. Over time, adoption appears to be on the increase; the number of distributions and modules which support the language is growing. Python 3 continues to be a sufficiently hard sell that a group of developers recently contemplated reopening feature-oriented development on version 2.x, but that idea fell by the wayside when it became clear that the developer interest wasn't there. Python 3 thus appears to be the future for those who want a language which continues to evolve. Based on what can be seen in the 3.2 release, that evolution is going full speed, even in the face of a moratorium on new core features.

Easy, powerful, stable: Pick two with OpenShot 1.3

February 21, 2011

This article was contributed by Joe 'Zonker' Brockmeier.

OpenShot is a video editor for Linux that aspires to be simple, powerful, and "the very best open source video editor." OpenShot 1.3, which was released on February 13, brings it a little closer to that goal. This release brings a theme for the UI, support for adding multiple clips, new 3D animations, and a wizard for uploading video directly to YouTube or Vimeo. It may be the best open source video editor, but only if one is willing to overlook some stability issues.

Video editing is an area where Linux has lagged somewhat behind Windows and Mac OS X. This isn't to say that Linux users have had no options for editing video on Linux, but the selection of tools is not as broad, nor in many cases as full-featured or well-polished. Mac users have tools like Apple's iMovie that are very easy to use — though inflexible and decidedly unfriendly to open formats like Ogg Theora. Professional and advanced amateur users have quite a few options on Mac and Windows, depending on what they'd like to achieve and how much they're willing to spend.

Linux, on the other hand, has just a handful of viable alternatives. There's Cinelerra and its offshoot Cinelerra-CV, which are very capable editors, but also extremely complex and likely to intimidate most hobbyists. Kdenlive is another effort to provide a free software video editor for Linux (as well as FreeBSD and Mac OS X), and it is much easier to use than Cinelerra. It might be a bit more intimidating than, say, iMovie, but it's usable by mere mortals.

Another editor that aims to be intuitive, but full-featured, is PiTiVi (which was reviewed here in June 2009). This is an LGPLed effort sponsored in part by Collabora and developed around the GStreamer framework. It is relatively easy to use, and is currently the default video editor for Ubuntu. Development of PiTiVi seems somewhat slow, though, and the developers seem to be struggling to find contributors.

There's also Kino, which does (or did) a fair job of balancing features and functionality — but its development seems to have slowed to a crawl if not entirely stopped. The last release came out in September of 2009.

Those are just a few of the standouts. You'll find quite a few video editors for Linux in various states of completion and competence, but the landscape is littered with half-baked editors that are not entirely suitable for "prime time" when it comes to usability or ability to produce professional-quality videos.

OpenShot is a relative newcomer. Development is led by Jonathan Thomas, a software and Web developer who had his first taste of Ubuntu Linux in 2008 and found no video editors he felt were easy, powerful, and stable. So Thomas started with Python and the Media Lovin' Toolkit, and set off to try to realize an easy, powerful, and stable editor with OpenShot.

Easy, Powerful, but Stable?

The 1.3 release of OpenShot is available for Ubuntu in a Personal Package Archive (PPA), so I installed it and started practicing with a handful of pictures, a few short movies shot with my phone, and an MP3 to provide a background track. The project is also on the AV Linux LiveDVD, but the 1.3 release is not yet included with the live DVD. There's also an installer for Fedora 11 through 13, but it's not clear if the packages will work with Fedora 14. Naturally, source is also available.

For background, I don't claim any great skill in the area of non-linear video editing beyond having pieced together a number of videos from conference interviews using Kino in 2006 and 2007. Many years ago, I spent about a year working for a small television station (KTVO) in Kirksville, Missouri — which included linear video editing for broadcast news using antiquated (even at the time) 3/4" U-Matic tape.

[OpenShot interface]

OpenShot has a very simple interface. The left-hand corner holds a set of tabs for files, transitions, effects, and history. The files tab holds all the clips (pictures, video, and audio) used to create videos. Effects are filters to modify audio, still images, or video; they can be used to give a video clip a sepia tone, for example, or to apply an echo effect to audio. The transitions are, as you'd expect, a way to provide transitions from one piece of video to another. OpenShot has everything from simple dissolve and clock transitions to more elaborate star wipes or fractals.

On the right-hand side, OpenShot has a video preview. The bottom part of the window holds the timeline, which shows the clips from the project files that have been integrated into the working video, organized by track and file, along with a small selection of tools for manipulating clips.

OpenShot is particularly easy to get started with. Drop a few video clips, pictures, and/or sound files into the project files tab and start dragging them into the order you'd like in the clips pane. For very simple projects involving just a few clips and music, it's possible to whip something together in ten minutes even if you've never used OpenShot before.

The OpenShot 1.3 release features a new theme for the interface and uses the stock desktop icons. I hadn't used OpenShot prior to the 1.3 release, but looking at screenshots of older releases, it does look like the new theme is an improvement.

But there's more than just a facelift for the project. This release is supposed to feature improved stability over previous releases of OpenShot, as well as auto saving. In my testing of OpenShot 1.3 from the PPA on Ubuntu 10.10, on a system with 4GB of RAM and a Core 2 Duo processor, it was still a bit fragile. When I first started experimenting with OpenShot, it crashed because I tried to apply the Resize tool (apparently meant for still clips) to a video. It also crashed when applying an effect to a video, though that worked just fine after a restart, and it crashed a couple of times when exporting movies, though in those cases the export finished successfully before OpenShot simply stopped responding. OpenShot 1.3 may be more stable than prior releases, but it's certainly not bulletproof.

[OpenShot export dialog]

One usability enhancement in the 1.3 release is a simplified export dialog. The Simple tab is really intuitive, offering a choice of Blu-Ray, DVD, Device, or Web. Blu-Ray and DVD have several options that are reasonable for those formats, while Device has presets for Xbox 360, Apple TV, and Nokia nHD. The Web preset features options like Wikipedia (Ogg Theora), FlickrHD, and a few options for Vimeo and YouTube. The Advanced tab exposes just about any option that most users would want when exporting a project, certainly more than enough for the bulk of home users looking to edit family vacation videos or funny pet videos. It may not offer every option that professionals may want, but it's certainly a good start.

The 1.3 release also simplifies organizing and finding files. Since I was only juggling about 10 files at a time, I didn't really find it hard to keep track — but this would be a useful feature for more ambitious projects.

OpenShot 1.3 also adds an "Add to Timeline" feature for pulling in multiple files. For example, you can select a handful of still images, and use the Add to Timeline feature to pop them into the timeline at the precise start time you'd like, as well as setting transitions and/or fades between the clips. The tool also gives the option of re-arranging the order of the files, or just shuffling them if you'd prefer a random order. This would be a very nice tool for creating a video out of family photos.

There's very little that I miss about working with linear video editing equipment at KTVO, but I do miss the physical controls for working with video. The shuttle (dial) for moving back and forth through video frame by frame gives a lot more control than trying to use the mouse and a slider. It is a bit surprising, perhaps, but the scroll wheel doesn't work for single-stepping either. Thankfully, OpenShot has keyboard shortcuts for frame stepping, pausing, etc., that allow just as much control while editing (though they lack the feel).

One thing that OpenShot does very well, and that I missed in Kino, is creating titles. Whether you need credits at the start and end of a video, or overlays (like captions) over a portion of the video, OpenShot makes it very easy to do. If you have Blender installed, OpenShot will let you create animated 3D titles. Unfortunately, Ubuntu 10.10 ships with Blender 2.49, while OpenShot 1.3 expects 2.56. Herein lies one of the strengths and weaknesses of open source video packages that I've encountered over the years: many video editing packages build on readily available libraries or supporting packages (like Blender), but tend to be fussy about versions. Getting all of the dependencies right is quite a headache for users who want to use the most recent releases. Waiting for downstream projects means being several months behind the curve, which is also undesirable given the amount of catching up that open source video editors have to do. A feature like uploading to YouTube, which is new in OpenShot 1.3, is expected in a commercial package.

For anyone who's going to be at the upcoming Southern California Linux Expo, Thomas will be providing an in-depth look at OpenShot covering basic video editing to advanced effects. There's also a comprehensive guide for those new to video editing or just new to OpenShot. Developers interested in becoming involved should see the Launchpad project and mailing list.

OpenShot 1.3 represents significant progress on the open source video editing front. While it has some work to do in terms of stability, its feature set is certainly at the "good enough" point for many users. OpenShot is worth a serious look by anyone who's interested in doing video editing on Linux.

New LWN feature: HTTPS-only access

We would like to announce a new LWN feature that readers may find useful: forcing HTTPS (i.e. SSL/TLS) connections for every page in the site. Enabling the feature will help prevent man-in-the-middle eavesdropping and session hijacking. If you want to give it a try, go to the "My Account" page, then to "Customize your account", and the "Force SSL" option is the first listed. You will have to log out and log back in for it to take effect and, because of Google ads, you may get a browser popup complaining about insecure content the first time you access the site. If you run into any problems with the new feature, please let us know at lwn@lwn.net.

Page editor: Jonathan Corbet

Security

CentOS 5, RHEL 5.6, and security updates

By Jake Edge
February 23, 2011

CentOS is forever in catch-up mode. That's because it repackages Red Hat's Enterprise Linux (RHEL) for those who would prefer an enterprise distribution without the costs associated with RHEL. That regularly puts the distribution in something of a pinch, because Red Hat, quite reasonably, follows its own schedule for updates. That pinch is being felt strongly right now with two RHEL releases in quick succession (6.0 followed by 5.6). But it isn't just the distribution developers who are being pinched, as the security updates for CentOS 5 have also been held up by the ongoing work to release CentOS 5.6 and 6.0.

CentOS had already been struggling for a bit in its efforts to put out CentOS 6 after the release of RHEL 6 in November. Then, on January 13, Red Hat released its latest update for RHEL 5, 5.6. At that point, CentOS was faced with a bit of a dilemma: should it focus on 6.0 or work on 5.6 first? The decision was made to work on 5.6 and 6.0 more or less in parallel. That meant that CentOS had two fairly large jobs at hand.

With each Red Hat release, the CentOS developers need to go through the packages and remove any Red Hat-specific elements: artwork, trademarks, %description lines in RPM spec files, and so on. Once that's done, there is a QA process that the packages go through before a final release can be done. Turning RHEL 6 into CentOS 6 is a time-consuming process, and the same is true of 5.6. But there is an additional problem with 5.6: security updates.

Normally, CentOS follows along with Red Hat security updates, releasing its versions as quickly as it can after the RHEL update is released. But 5.6 (or any "point" release of RHEL) comes with a whole slew of updated packages, any of which might have a security update—or be a dependency of a package updated for security reasons. Since there are no CentOS 5.6 packages (yet), these security updates fall into a crack in the CentOS development process. CentOS can either backport the fixes into the 5.5 package, or release an updated 5.6 package along with all of its dependencies, some of which may not have passed the QA process yet.

Except for those updates that Red Hat has marked as "critical", CentOS has chosen to do neither of the above, according to lead developer Karanbir Singh. That may leave its users vulnerable to a number of potentially exploitable security holes. In email, Singh said that the CentOS team is looking at Red Hat's security updates to fix those that are deemed "remotely-exploitable", but that doesn't seem to jibe with what is getting released for CentOS 5. Since the release of RHEL 5.6, there have been no CentOS 5 security updates.

In fact, the last CentOS 5 security update was for the kernel on January 6, a week before RHEL 5.6 dropped. In the interim, Red Hat has released 22 updates, most with "low" or "moderate" impact, but a few that are "important" (two for the kernel, and one each for openoffice.org, krb5, and java-1.6.0-openjdk), and three "critical" bugs (java-1.5.0-ibm, flash-plugin, and java-1.6.0-sun). [Update: As pointed out in the comments (and by Singh), those last three packages are closed source and thus not distributed by CentOS.] There is also a pre-5.6 wireshark vulnerability that has yet to be patched. The full list can be seen here.

It may well be that some of those vulnerabilities only apply to the updated packages that came with RHEL 5.6, but it is extremely unlikely that's true of all of them. The critical Java updates are perhaps the most worrisome, since they come with vague vulnerability descriptions (e.g. "Unspecified vulnerability in the Swing component in Oracle Java SE and Java for Business 6 Update 21, 5.0 Update 25, 1.4.2_27, and 1.3.1_28 allows remote attackers to affect confidentiality, integrity, and availability via unknown vectors."). The critical flash-plugin update is also of concern, but one would guess that there aren't all that many CentOS users running browsers with Flash on a server-oriented distribution.

It's also not at all clear that none of these vulnerabilities are remotely exploitable, as there are, at least, some remote denial of service flaws (which can sometimes be turned into remote exploits). For CentOS installations with untrusted users, there are plenty of locally exploitable flaws in the list. Even without untrusted users, a flaw in a content management system or other web application, for example, may provide an attacker the local access they need to use a local exploit to potentially compromise the entire system.

CentOS is pretty clearly dropping the ball on security updates here, which is probably not what its users expect. While the project is understaffed and is always looking for additional contributors, CentOS 5 users may not be aware that nearly two dozen security updates (so far) have gone by the wayside while the QA process for 5.6 is ongoing. The CentOS FAQ clearly states that the goal is to have updates available within 72 hours after Red Hat puts them out and, by and large, the project meets that goal, except during the point-release gap.

That gap has stretched longer than the project would like, as Singh notes:

Our goal is to meet the 2 - 4 weeks for a point release. And we have slipped a bit for the last couple of releases. There are plans underfoot to make sure that this sort of a thing is reduced as much as possible, but make changes within a framework that does not break user trust, process and machine integrity.

According to Singh, the 5.6 release is imminent ("within the next few days"), which will allow the project to release the updates soon after. There have been complaints in the past about updates that didn't exactly track the upstream RHEL release (i.e. changes for CentOS 5.5 that are not in RHEL 5.5), but doing so in previous releases (e.g. the 5.4 to 5.5 transition) "was the right thing to do" and when there is a "serious threat to user deployed machines, we would do it again", he said.

There has been some discussion of the problem on the centos-devel mailing list, with former CentOS developer Dag Wieers being particularly critical of the delays. He is concerned that users are being misled: "I don't think most of the users ever expected to be without security updates for 10 weeks or more when choosing CentOS, and that is an important characteristic." Singh and others agree that there are things that the project could be doing better, but do not see this as the right time to address those problems. As Singh puts it:

The fact that there are disfunctional setups in place is not something that anyone ( I for one ) are [denying]. But the fact that a call for help got zero traction for weeks is also worth considering. We could go back, stop everything that is going on at the moment and try to process engineer a better setup before we again start working on CentOS-6, I'd say a target of 2 to 3 months would be reasonable if we did that.

On the other hand, we can just get this done out of the door and then look at process engineering for the future. We are better, stronger as a group with a much larger contributor base than ever before - I see no reason why we could not strengthen that even further and split the roles out.

As part of that discussion, though, Singh muddies the waters further about which kinds of security fixes are actually being considered for CentOS 5:

all updates to the /5/ tree are monitored and anything which has a remote or local exploit will get pushed into the /5/ tree; things in 5.6 and against 5.6 that [don't] meet that criteria wait for 5.6 release. build order, linking, inheriting upstream testing etc etc to blame.

But the reality seems to be rather different, as all manner of vulnerabilities are still languishing in the CentOS 5 tree.

It is a difficult situation for the project. It must necessarily trail the Red Hat releases, and keeping up with security updates while trying to push two releases out is difficult. Doing so would likely push back the releases even further. On the other hand, though, CentOS users may well be unaware that there have been potentially significant updates while they wait for CentOS 5.6. Unless those users follow the RHEL update announcements, they don't even know that there are vulnerabilities they may need to be aware of.

While there are no guarantees about security updates for CentOS (or any other community distribution for that matter), enterprise distribution users tend to expect regular updates, without significant, somewhat arbitrary, gaps. The biggest problem here is really one of communication: the CentOS team should try to make it widely known that security updates are being held back. It probably also makes sense for the project to try to figure out a way to keep up with the update stream even in the point release gap.

Another alternative would be to put CentOS 6 on hold, while focusing on CentOS 5.6. There are, after all, no CentOS 6 users yet, while CentOS 5 has many. It would also be nice to see some of the companies that benefit from CentOS (like various hosting providers, for example) put some effort into helping the project. Those companies are getting an awful lot from CentOS without, visibly at least, putting much back in.

Brief items

Security quotes of the week

There was nothing I could do, and it was no help that I recommended a website where a knowledgeable chemist explains, in delightfully comedic detail, what it would take to manufacture a workable bomb from binary liquid ingredients, working for several hours in the aircraft loo, using copious quantities of ice, in relays of champagne coolers helpfully supplied by the cabin staff.

The prohibition against taking more than very small quantities of liquids or unguents on planes is demonstrably ludicrous. It started as one of those "Look at us, we're taking decisive action" displays, the ones designed to cause maximum inconvenience to the public in order to make the dimwitted Dundridges who rule our lives feel important and look busy.

-- Richard Dawkins at Boing Boing

But say a scientist from the facility uses a memory stick to carry data home at night, and that he plugs the memory stick into his laptop on occasion. You can now get a piece of custom spyware into the facility by putting a copy on the memory stick—if you can first get access to the laptop. So you tail the scientist and follow him from his home one day to a local coffee shop. He steps away to order another drink, to go to the bathroom, or to talk on his cell phone, and the tail walks past his table and sticks an all-but-undetectable bit of hardware in his laptop's ExpressCard slot. Suddenly, you have a vector that points all the way from a local coffee shop to the interior of a secure government facility.
-- ars technica looks more deeply into HBGary's government-sponsored activities

On the flip side, the difficulty of securing a complex enterprise hardly applies to specialized, well-funded security outlets: that one problem is easy to fix. These companies should have an abundance of expertise and resources to tightly manage and monitor their relatively small and self-contained networks. Similarly, their employees can be reasonably expected to exercise above-average restraint and a good dose of common sense. It is an uncomplicated matter of living up to your own bold claims.

From this perspective, the purported details of the attack on HBGary - a horribly vulnerable, obscure CMS; unpatched internal systems; careless password reuse across corporate systems and Twitter or LinkedIn; and trivial susceptibility to e-mail phishing - are a truly fascinating detail. These tidbits seem to imply either extreme cynicism of their staff... or an [unbelievable] level of cluelessness. And from a broader perspective, both of these options are pretty scary.

-- Michal Zalewski

New vulnerabilities

aptdaemon: security restriction bypass

Package(s): aptdaemon
CVE #(s): CVE-2011-0725
Created: February 22, 2011
Updated: February 23, 2011
Description: From the Ubuntu advisory:

Sergey Nizovtsev discovered that Aptdaemon incorrectly filtered certain arguments when using its D-Bus interface. A local attacker could use this flaw to bypass security restrictions and view sensitive information by reading arbitrary files.

Alerts:
Ubuntu USN-1068-1 2011-02-22

awstats: arbitrary command execution

Package(s): awstats
CVE #(s): CVE-2010-4367
Created: February 21, 2011
Updated: February 23, 2011
Description: From the CVE entry:

awstats.cgi in AWStats before 7.0 accepts a configdir parameter in the URL, which allows remote attackers to execute arbitrary commands via a crafted configuration file located on a (1) WebDAV server or (2) NFS server.

Alerts:
Mandriva MDVSA-2011:033 2011-02-21

bind: denial of service

Package(s): bind9
CVE #(s): CVE-2011-0414
Created: February 23, 2011
Updated: April 8, 2011
Description:

From the Ubuntu advisory:

It was discovered that Bind incorrectly handled IXFR transfers and dynamic updates while under heavy load when used as an authoritative server. A remote attacker could use this flaw to cause Bind to stop responding, resulting in a denial of service.

Alerts:
Pardus 2011-65 2011-04-07
SUSE SUSE-SR:2011:005 2011-04-01
Debian DSA-2208-1 2011-03-30
openSUSE openSUSE-SU-2011:0135-1 2011-02-25
Ubuntu USN-1070-1 2011-02-23
Gentoo 201206-01 2012-06-02

gitolite: arbitrary code execution

Package(s): gitolite
CVE #(s):
Created: February 22, 2011
Updated: April 11, 2011
Description: From the Fedora advisory:

Dylan Alex Simon discovered and reported a directory traversal flaw in the way Gitolite restricted access to admin defined commands ("ADC"). An authenticated attacker could execute arbitrary code with privileges of Gitolite server user using specially crafted command name.

Alerts:
Debian DSA-2215-1 2011-04-09
Fedora FEDORA-2011-1644 2011-02-16

java: multiple vulnerabilities

Package(s): java-1.6.0-openjdk
CVE #(s): CVE-2010-4448 CVE-2010-4450 CVE-2010-4465 CVE-2010-4469 CVE-2010-4470 CVE-2010-4472 CVE-2010-4471
Created: February 17, 2011
Updated: July 22, 2011
Description:

From the Red Hat advisory:

A flaw was found in the Swing library. Forged TimerEvents could be used to bypass SecurityManager checks, allowing access to otherwise blocked files and directories. (CVE-2010-4465)

A flaw was found in the HotSpot component in OpenJDK. Certain bytecode instructions confused the memory management within the Java Virtual Machine (JVM), which could lead to heap corruption. (CVE-2010-4469)

A flaw was found in the way JAXP (Java API for XML Processing) components were handled, allowing them to be manipulated by untrusted applets. This could be used to elevate privileges and bypass secure XML processing restrictions. (CVE-2010-4470)

It was found that untrusted applets could create and place cache entries in the name resolution cache. This could allow an attacker targeted manipulation over name resolution until the OpenJDK VM is restarted. (CVE-2010-4448)

It was found that the Java launcher provided by OpenJDK did not check the LD_LIBRARY_PATH environment variable for insecure empty path elements. A local attacker able to trick a user into running the Java launcher while working from an attacker-writable directory could use this flaw to load an untrusted library, subverting the Java security model. (CVE-2010-4450)

A flaw was found in the XML Digital Signature component in OpenJDK. Untrusted code could use this flaw to replace the Java Runtime Environment (JRE) XML Digital Signature Transform or C14N algorithm implementations to intercept digital signature operations. (CVE-2010-4472)

Note: All of the above flaws can only be remotely triggered in OpenJDK by calling the "appletviewer" application.

Alerts:
Gentoo 201111-02 2011-11-05
SUSE SUSE-SU-2011:0823-1 2011-07-22
SUSE SUSE-SR:2011:008 2011-05-03
CentOS CESA-2011:0281 2011-04-14
Mandriva MDVSA-2011:054 2011-03-27
SUSE SUSE-SA:2011:014 2011-03-22
Ubuntu USN-1079-3 2011-03-17
Red Hat RHSA-2011:0364-01 2011-03-17
Red Hat RHSA-2011:0357-01 2011-03-16
Ubuntu USN-1079-2 2011-03-15
openSUSE openSUSE-SU-2011:0155-1 2011-03-07
Ubuntu USN-1079-1 2011-03-01
SUSE SUSE-SA:2011:024 2011-05-13
SUSE SUSE-SA:2011:010 2011-02-22
openSUSE openSUSE-SU-2011:0126-1 2011-02-22
Red Hat RHSA-2011:0282-01 2011-02-17
Fedora FEDORA-2011-1645 2011-02-16
Fedora FEDORA-2011-1631 2011-02-16
Red Hat RHSA-2011:0281-01 2011-02-17
Red Hat RHSA-2011:0490-01 2011-05-05
Debian DSA-2224-1 2011-04-20

java-1.6.0-sun: multiple unspecified vulnerabilities

Package(s): java-1.6.0-sun
CVE #(s): CVE-2010-4422 CVE-2010-4447 CVE-2010-4451 CVE-2010-4452 CVE-2010-4454 CVE-2010-4462 CVE-2010-4463 CVE-2010-4466 CVE-2010-4467 CVE-2010-4468 CVE-2010-4473 CVE-2010-4475
Created: February 17, 2011
Updated: July 22, 2011
Description:

From the Red Hat advisory:

CVE-2010-4475 JDK unspecified vulnerability in Deployment component

CVE-2010-4473 JDK unspecified vulnerability in Sound component

CVE-2010-4468 JDK unspecified vulnerability in JDBC component

CVE-2010-4467 JDK unspecified vulnerability in Deployment component

CVE-2010-4466 JDK unspecified vulnerability in Deployment component

CVE-2010-4463 JDK unspecified vulnerability in Deployment component

CVE-2010-4462 JDK unspecified vulnerability in Sound component

CVE-2010-4454 JDK unspecified vulnerability in Sound component

CVE-2010-4452 JDK unspecified vulnerability in Deployment component

CVE-2010-4451 JDK unspecified vulnerability in Install component

CVE-2010-4447 JDK unspecified vulnerability in Deployment component

CVE-2010-4422 JDK unspecified vulnerability in Deployment component

Alerts:
Gentoo 201111-02 2011-11-05
SUSE SUSE-SU-2011:0823-1 2011-07-22
SUSE SUSE-SR:2011:008 2011-05-03
SUSE SUSE-SA:2011:014 2011-03-22
Red Hat RHSA-2011:0364-01 2011-03-17
Red Hat RHSA-2011:0357-01 2011-03-16
SUSE SUSE-SA:2011:024 2011-05-13
SUSE SUSE-SA:2011:010 2011-02-22
openSUSE openSUSE-SU-2011:0126-1 2011-02-22
Red Hat RHSA-2011:0282-01 2011-02-17
Red Hat RHSA-2011:0490-01 2011-05-05

mailman: cross site scripting

Package(s): mailman
CVE #(s): CVE-2011-0707
Created: February 21, 2011
Updated: May 17, 2011
Description: From the Debian advisory:

A cross site scripting vulnerability was discovered in Mailman, a web-based mailing list manager, that allows an attacker to retrieve session cookies via inserting crafted JavaScript into confirmation messages.

Alerts:
SUSE SUSE-SR:2011:007 2011-04-19
CentOS CESA-2011:0307 2011-04-14
openSUSE openSUSE-SU-2011:0312-1 2011-04-07
SUSE SUSE-SR:2011:009 2011-05-17
Fedora FEDORA-2011-2125 2011-02-24
Fedora FEDORA-2011-2102 2011-02-24
openSUSE openSUSE-SU-2011:0424-1 2011-05-03
CentOS CESA-2011:0307 2011-03-02
Red Hat RHSA-2011:0308-01 2011-03-01
Red Hat RHSA-2011:0307-01 2011-03-01
Mandriva MDVSA-2011:036 2011-02-23
Ubuntu USN-1069-1 2011-02-22
Debian DSA-2170-1 2011-02-18

openafs: multiple vulnerabilities

Package(s): openafs
CVE #(s): CVE-2011-0430 CVE-2011-0431
Created: February 17, 2011
Updated: February 23, 2011
Description:

From the Debian advisory:

CVE-2011-0430: Andrew Deason discovered that a double free in the Rx server process could lead to denial of service or the execution of arbitrary code.

CVE-2011-0431: It was discovered that insufficient error handling in the kernel module could lead to denial of service.

Alerts:
Debian DSA-2168-1 2011-02-16

python-django: directory traversal

Package(s): python-django
CVE #(s): CVE-2011-0698
Created: February 21, 2011
Updated: February 23, 2011
Description: From the Mandriva advisory:

Directory traversal vulnerability in Django 1.1.x before 1.1.4 and 1.2.x before 1.2.5 on Windows might allow remote attackers to read or execute files via a / (slash) character in a key in a session cookie, related to session replays.

Alerts:
Mandriva MDVSA-2011:031 2011-02-18

telepathy-gabble: man-in-the-middle audio/video interception

Package(s): telepathy-gabble
CVE #(s): CVE-2011-1000
Created: February 17, 2011
Updated: April 19, 2011
Description:

From the Debian advisory:

It was discovered that telepathy-gabble, the Jabber/XMMP connection manager for the Telepathy framework, is processing google:jingleinfo updates without validating their origin. This may allow an attacker to trick telepathy-gabble into relaying streamed media data through a server of his choice and thus intercept audio and video calls.

Alerts:
SUSE SUSE-SR:2011:007 2011-04-19
openSUSE openSUSE-SU-2011:0303-1 2011-04-07
Fedora FEDORA-2011-1903 2011-02-21
Fedora FEDORA-2011-1903 2011-02-21
Pardus 2011-46 2011-02-21
Ubuntu USN-1067-1 2011-02-17
Debian DSA-2169-1 2011-02-16

webkitgtk: multiple vulnerabilities

Package(s): webkitgtk
CVE #(s): CVE-2010-4492 CVE-2010-4493 CVE-2011-0482 CVE-2010-4199 CVE-2010-4578 CVE-2010-4042
Created: February 18, 2011
Updated: August 23, 2011
Description: From the CVE entries:

Use-after-free vulnerability in Google Chrome before 8.0.552.215 allows remote attackers to cause a denial of service or possibly have unspecified other impact via vectors involving SVG animations. (CVE-2010-4492)

Use-after-free vulnerability in Google Chrome before 8.0.552.215 allows remote attackers to cause a denial of service via vectors related to the handling of mouse dragging events. (CVE-2010-4493)

Google Chrome before 8.0.552.237 and Chrome OS before 8.0.552.344 do not properly perform a cast of an unspecified variable during handling of anchors, which allows remote attackers to cause a denial of service or possibly have unspecified other impact via a crafted HTML document. (CVE-2011-0482)

Google Chrome before 7.0.517.44 does not properly perform a cast of an unspecified variable during processing of an SVG use element, which allows remote attackers to cause a denial of service or possibly have unspecified other impact via a crafted SVG document. (CVE-2010-4199)

Google Chrome before 8.0.552.224 and Chrome OS before 8.0.552.343 do not properly perform cursor handling, which allows remote attackers to cause a denial of service or possibly have unspecified other impact via unknown vectors that lead to "stale pointers." (CVE-2010-4578)

Google Chrome before 7.0.517.41 does not properly handle element maps, which allows remote attackers to cause a denial of service or possibly have unspecified other impact via vectors related to "stale elements." (CVE-2010-4042)

Alerts:
Ubuntu USN-1195-1 2011-08-23
SUSE SUSE-SR:2011:009 2011-05-17
Debian DSA-2188-1 2011-03-10
openSUSE openSUSE-SU-2011:0482-1 2011-05-13
Fedora FEDORA-2011-1224 2011-02-09

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 2.6.38-rc6, released on February 21. Linus says:

Diff-wise, the most noticeable thing here is removal of the /proc interface from the target code (so that we don't make a release with deprecated interfaces). But ignoring that (and some arm mach/map.h cleanup patches), the diffs really are pretty small. But what is probably actually noticeable is a lot of small fixups, mostly in drivers. Nothing really exciting, I'm afraid. Or not afraid, since excitement at this stage in the -rc series is a bad thing.

The short-form changelog is in the announcement, or see the full changelog for the details.

Stable updates: the 2.6.32.29, 2.6.36.4, and 2.6.37.1 stable kernels were released on February 17. They contain lots of fixes all over the tree. Also, this will be the last of the 2.6.36.x stable kernels: "[...] you should move to the .37 kernel series as this is the last .36 kernel to be released. It's now 'end of life', 'dead', 'buried', 'pining for the fjords', or whatever term you and your company uses for things that are no more."

The 2.6.37.2 update is in the review process as of this writing. It contains 70 fixes, and should be released on or after February 24.

Quotes of the week

The structure passed is the structure abused.
-- Al Viro

Distributed systems are tricky. That's one reason I work in security, it is so much simpler.
-- Casey Schaufler

Incidentally, many Linux filesystem implementations don't have especially robust error handling for failures during attempts to mount corrupt filesystems. As an example, I have a deliberately corrupted btrfs filesystem that triggers a BUG() if you attempt to mount it. I formatted a USB stick with this filesystem, so now I have a USB stick that will panic the kernels of distributions that support auto-mounting, in some cases even when the screen is locked.
-- Dan Rosenberg

A free GMA500 graphics driver

Intel's GMA500 graphics chipset has been a source of pain for a few years; unlike almost everything else from Intel, it lacks a free driver. That situation appears to be changing, though: Alan Cox has posted a new GMA500 driver for the staging tree. Alan says "Currently it's unaccelerated but still pretty snappy even compositing with the frame buffer X server." It seems that quite a bit of work is needed (the driver is going into staging for a reason), and it's not clear when (or how) proper 3D support will be added, but it's a step in the right direction.

Making FIEMAP and delayed allocation play well together

By Jonathan Corbet
February 22, 2011
The FIEMAP ioctl() command can be used to learn about how a file's blocks are laid out on the disk. It's useful for determining fragmentation, optimizing boot-time readahead order, and a number of other things. One of those other things, though, has turned up bugs in how a couple of important filesystems implement FIEMAP.

The cp application, it seems, has recently been taught to use FIEMAP to find holes in files. The idea is to optimize the copying of such files by not even reading the holes; that way, the need to zero-fill pages (in the kernel) and compare them against pages full of zeros (in user space) can be eliminated. It seems like a better way of doing things.

Somewhere along the way, Chris Mason got word that cp was corrupting files on btrfs filesystems. The problem, naturally enough, was that FIEMAP was reporting holes where none should exist. The root cause was that FIEMAP was not prepared to deal with regions of a file which have been written to, but which do not actually have blocks assigned yet. The delayed allocation mechanism used by most contemporary filesystems will create exactly that kind of situation, so this is not a theoretical concern.

Chris fixed the problem for btrfs, then decided to see how other filesystems handled the same situation. From his report, xfs handled things well, but ext4 had similar bugs in situations where delayed allocation and real holes came together in the same file. Certain types of bugs, it seems, are likely to turn up in more than one context.

Chris's fix should get into 2.6.38 before the final release; chances are good that an ext4 fix will be fast-tracked as well. Expect stable kernel backports too. In the meantime, be careful when copying recently-written files with new versions of cp on those filesystems.
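
For readers who have not used FIEMAP, here is a minimal Python sketch of the kind of query involved. The ioctl number, structure layouts, and flag values are believed to match the kernel's linux/fiemap.h; the helper names (fiemap(), parse_extents(), holes()) are invented for this example, and this is not the code that cp uses. The point it illustrates is that a hole finder must count extents flagged FIEMAP_EXTENT_DELALLOC as data; if the filesystem fails to report such a region at all, it is indistinguishable from a hole.

```python
import fcntl
import struct

FS_IOC_FIEMAP = 0xC020660B        # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x0001         # ask the kernel to sync the file first
FIEMAP_EXTENT_LAST = 0x0001       # last extent in the file
FIEMAP_EXTENT_DELALLOC = 0x0004   # data written, blocks not yet allocated

HEADER = struct.Struct("=QQIIII")     # struct fiemap, 32 bytes
EXTENT = struct.Struct("=QQQQQIIII")  # struct fiemap_extent, 56 bytes


def parse_extents(buf):
    """Decode a filled-in FIEMAP buffer into (logical, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]          # fm_mapped_extents
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[2], fields[5]))
    return extents


def fiemap(fd, max_extents=64, flags=0):
    """Query up to max_extents extents for an open file (Linux only)."""
    buf = bytearray(HEADER.size + max_extents * EXTENT.size)
    # fm_start=0, fm_length=max offset, fm_flags, mapped=0, extent_count
    HEADER.pack_into(buf, 0, 0, 2**64 - 1, flags, 0, max_extents, 0)
    fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    return parse_extents(buf)


def holes(extents, size):
    """Byte ranges in [0, size) covered by no extent at all.

    Delayed-allocation extents count as data here, which is only
    correct if the filesystem actually reports them.
    """
    gaps, pos = [], 0
    for logical, length, _flags in sorted(extents):
        if logical > pos:
            gaps.append((pos, logical))
        pos = max(pos, logical + length)
    if pos < size:
        gaps.append((pos, size))
    return gaps
```

Passing FIEMAP_FLAG_SYNC in flags asks the kernel to write the file out before mapping it, which sidesteps the delayed-allocation question at the cost of extra I/O. Note that this sketch does not loop when a file has more than max_extents extents; a real tool would need to.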

Kernel development news

Notes from the block layer

By Jonathan Corbet
February 22, 2011
Over the last week or so a number of interesting topics have come up with regard to the low-level functioning of the block layer. This article will survey a few of these topics.

Enforcing read-only: The block layer has a mechanism by which a driver can mark a specific device (or partition) as being read-only. This flag may be set if the physical device is write-locked; it can also be set by higher-level code (the DM or MD layers, for example) when the administrator creates a read-only device. Tejun Heo discovered an interesting thing, though: this flag is not enforced within the block layer. An attempt to open a write-protected device for write access will succeed, and the block layer will happily issue write operations to a read-only device. That struck Tejun as wrong, so he put in a patch for 2.6.38 which addresses part of the problem: an attempt to open a read-only device for write access will be blocked.

It turns out, though, that this check breaks things. Since enforcement of read-only status has never been done, developers have been careless about how they open block devices. So, with this patch in place, the loop device, device mapper, and MD all break when trying to open a read-only device, even if the ultimate goal is read-only access. Breaking things on this scale is not one of the stated goals of the 2.6.38 development cycle, so Chuck Ebbert has posted a patch reverting the change; some version of this patch is likely to be merged before the final 2.6.38 release.

In-kernel code which is careless about open permissions can easily be fixed, but fixing the user-space utilities will take rather longer. So this check probably cannot be put into the open() path anytime soon. Beyond that, as Linus pointed out, it may never really be the right thing to do; there are times when it may be necessary to open a read-only device for write access. Real enforcement of read-only status, if it is to be done in the block layer, probably needs to happen when operations are actually submitted to the device. How many things would break in that case remains to be seen.
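
The difference between the two enforcement points can be shown with a toy model. The Python sketch below is purely illustrative: the BlockDevice class and its check_at_open and check_at_submit switches are invented here and do not correspond to any kernel interface. It models a careless caller that opens a device for write access but only ever reads from it.

```python
class BlockDevice:
    """Toy model of a block device with two possible enforcement points."""

    def __init__(self, read_only, check_at_open=False, check_at_submit=False):
        self.read_only = read_only
        self.check_at_open = check_at_open
        self.check_at_submit = check_at_submit
        self.completed = []

    def open(self, for_write):
        # Open-time enforcement: reject the request before any I/O happens.
        if self.check_at_open and self.read_only and for_write:
            raise PermissionError("read-only device opened for write")
        return self

    def submit(self, op):
        # Submit-time enforcement: reject only operations that really write.
        if self.check_at_submit and self.read_only and op == "write":
            raise PermissionError("write submitted to read-only device")
        self.completed.append(op)


def careless_reader(dev):
    """Open for write out of habit, then only read. True if it works."""
    try:
        dev.open(for_write=True).submit("read")
        return True
    except PermissionError:
        return False


def writer(dev):
    """Open for write and actually write. True if the write goes through."""
    try:
        dev.open(for_write=True).submit("write")
        return True
    except PermissionError:
        return False
```

In this model, with no checks at all the writer succeeds against a read-only device; with the open-time check the careless reader breaks along with the writer; with the submit-time check only the real write is refused.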

Stable pages: Linux has had support for block data integrity checking since 2008. In short, this feature takes advantage of suitably-equipped hardware to ensure that data is not corrupted between the host and its destination in persistent storage. Before writing a block to a device, the kernel will calculate a checksum and send it with the data; if the data, once written by the device, no longer matches the checksum, the device will signal an error. This mechanism can increase overall confidence that the system is storing data without corrupting it.

There is one little problem, though. Imagine a sequence of events where the kernel calculates a checksum for a specific block, issues a write operation, then goes on to do more interesting things. Before the block controller gets around to acting on the request, some process comes along and changes the contents of the block. At this point, the checksum will no longer match, and the operation can fail. What is the best way to respond to (or, better, prevent) this outcome?

Darrick Wong has addressed this problem with a patch which takes a possibly heavy-handed approach: when integrity checking is in use, blocks will be copied before the checksum is calculated and the I/O operation initiated. The rest of the system can then do anything it wants with the original data; the data as it existed when the write operation was queued will be written to the device. This approach will certainly work, but the cost is clear: an extra copy operation is added to the write path. That is not a cost that sits well with all developers.
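
The race and the copy-based fix can be reproduced in a few lines of user-space Python. This is a sketch under stated assumptions, not kernel code: zlib.crc32 merely stands in for whatever checksum the storage hardware expects, and queue_write() is a hypothetical helper.

```python
import zlib


def queue_write(page, copy_first):
    """Checksum a page and return (checksum, payload) as queued for I/O.

    With copy_first, the payload is a private snapshot; otherwise it is
    the live page, which other code may still modify.
    """
    payload = bytes(page) if copy_first else page
    return zlib.crc32(payload), payload


def write_survives_modification(copy_first):
    """Dirty the page after queueing; report whether the checksum still matches."""
    page = bytearray(b"A" * 512)
    checksum, payload = queue_write(page, copy_first)
    page[0:4] = b"oops"          # some process changes the block's contents
    return zlib.crc32(payload) == checksum


print(write_survives_modification(copy_first=False))
print(write_survives_modification(copy_first=True))
```

Without the copy, the queued payload changes underneath its checksum and the comparison fails; with the copy, the snapshot still matches, at the price of duplicating every block on the write path.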

The proper way to solve this, for some value of "proper," is implementing "stable pages" within the filesystem code. In essence: a page which is under writeout becomes immutable; any process trying to change that page's contents will block until the write operation is complete. This solution is not universally popular either; it is said to have an adverse impact on at least one benchmark regardless of whether integrity checking is in use. As Jan Kara noted, the best-performing approach will not be the same for everybody:

In fact what is going to be faster depends pretty much on your system config. If you have enough CPU/RAM bandwidth compared to storage speed, you're better [off] doing copying. If you can barely saturate storage with your CPU/RAM, waiting is probably better for you.

Some people also like the fact that the block-copying approach puts the pain on users of the integrity-checking features while not hurting other users - assuming that the cost of all those page allocations and copies doesn't affect anybody else. That said, stable pages look like they will be the approach taken in the future; as Martin Petersen pointed out, there are a number of filesystem features - encryption, for example - which depend on it. Work is underway to add this capability to a number of filesystems; at the moment, only Btrfs has proper stable page support.

Comprehensive block I/O throttling coverage: Last week's Kernel Page featured hierarchical I/O scheduling; that work fills in an important feature, but the limitations of the (quite new) bandwidth controller don't stop there. One of its larger shortcomings is that it only really works with I/O submitted directly from process context. When I/O is initiated by the kernel (in particular, when the writeback code flushes dirty pages to disk), the controller is unable to associate the pages with the process that dirtied them. Since on many (or most) systems most block I/O writes are generated that way, it is easy to see that the block I/O controller's coverage is somewhat limited at the moment.

Andrea Righi has posted a patch set which is meant to lift that limitation by tracking the ownership of all dirty pages in the system. There is code in the kernel now which can do that ownership tracking; the memory usage controller needs that information to do its job. So Andrea's patch generalizes the ownership tracking code and makes it serve the I/O controller's purposes as well. Half of the existing flags field in struct page_cgroup is taken to hold an index describing which control group the page belongs to. That makes the block controller's use of this structure different from the memory controller's - the latter stores a direct pointer to its mem_cgroup structure - but it does have the advantage of not increasing the size of the page_cgroup structure.
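
Sharing one word between flag bits and an index is ordinary bit packing. The Python sketch below assumes, purely for illustration, a 32-bit word with the flags in the lower 16 bits and the control group index in the upper 16; the real field width and layout in the patch may differ, and the function names are invented.

```python
FLAG_BITS = 16                       # assumed: lower half keeps the flags
FLAG_MASK = (1 << FLAG_BITS) - 1
ID_MAX = (1 << FLAG_BITS) - 1        # largest index the upper half can hold


def set_cgroup_id(word, cgroup_id):
    """Store cgroup_id in the upper half without disturbing the flags."""
    if not 0 <= cgroup_id <= ID_MAX:
        raise ValueError("control group index does not fit")
    return (word & FLAG_MASK) | (cgroup_id << FLAG_BITS)


def get_cgroup_id(word):
    """Extract the control group index from the upper half."""
    return word >> FLAG_BITS


def get_flags(word):
    """Extract the flag bits from the lower half."""
    return word & FLAG_MASK


word = set_cgroup_id(0b101, 42)
print(get_cgroup_id(word), bin(get_flags(word)))
```

The cost of the trick is visible in ID_MAX: an index can only name as many groups as the borrowed bits allow, whereas a pointer has no such limit but would enlarge the structure.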

That advantage is not to be undervalued: struct page_cgroup shadows struct page, so one can exist for almost every page in the system. Even a little bit of overhead adds up quickly when such large quantities are involved. That overhead will be the biggest disadvantage of this new feature; anybody who wants to throttle block I/O bandwidth, and who is not also using the memory controller, will pay a significant cost in increased kernel memory use. The payback is that block I/O throttling actually works as intended; without page tracking, it can only give approximate results.

debugfs: rules not welcome

By Jonathan Corbet
February 22, 2011
The kernel's debugfs filesystem is meant to be a place where kernel developers can place any information which seems to be of value to somebody. Unlike the other kernel virtual filesystems (/proc, /sys), debugfs has an explicit "no rules" rule. Anything developers want to put there is fair game, without regard for taste, (hypothetically) ABI stability, or perceived usefulness. "No rules" does not extend as far as compromising the security of the system, though, which has led to an attempt to lock debugfs down.

Eugene Teo recently posted a request for CVE numbers for 20 separate vulnerabilities involving world-writable files in debugfs and sysfs. Some of the debugfs vulnerabilities would seemingly allow any local user to write arbitrary values into device registers - a situation from which little good can be expected to emerge. Expect yet another set of kernel updates in the near future as these holes are closed and fixes are made available to users.

In response to these vulnerabilities, Kees Cook posted a patch which would cause debugfs to be mounted with root-only access permissions. That way, any future mistakes in debugfs would be inaccessible to nonprivileged users and, thus, would not be a new vulnerability in need of fixing. The patch was not received well; it looks suspiciously like a rule in a land where there are supposed to be no rules. Greg Kroah-Hartman responded:

It's just stupid mistakes being made here, don't try to lock down the whole filesystem for just a handful of bugs.

Kees suggested that these mistakes could keep on happening, and that "no rules" might not be the best approach, but Alan Cox responded:

It's a debugging fs, it needs to be "no rules" other than the obvious "don't mount it on production systems"

There is one little problem with the idea of not mounting debugfs on production systems, though: there is useful stuff in that filesystem. At the top of the list must certainly be the control files for perf and ftrace; most of our nice, new tracing infrastructure will not work without debugfs. There are also knobs for tweaking scheduler features, interfaces for the "usbmon" tool, interfaces used by Red Hat's kvm_stat tool, and so on. There is enough useful stuff in debugfs that it can be found mounted well outside of kernel debugging environments; it has reached the point that Greg challenges the idea that debugfs should not be mounted on production systems:

No, not true at all, the "enterprise" distros all mount debugfs for good reason on their systems.

"No rules" and "mounted on enterprise systems" seem like a bad combination; it would be nice to make things more secure. A number of proposals have been floated to do that, including:

  • Teach the checkpatch.pl tool to look for world-writable debugfs files and complain about them. This step has already been taken; the version of checkpatch.pl found in 2.6.38 will point out world-writable files in either debugfs or sysfs.

  • Disallow world-writable files in debugfs. A patch has been posted to this effect; so far, there have been few comments to indicate whether such a restriction would look too much like a rule for debugfs or not.

  • Move generally useful interfaces out of debugfs to a place with a bit less of a wild-west flavor, then leave debugfs unmounted on most systems. This is an idea which makes a lot of sense on the face of it, but it can also run into practical difficulties. Moving interfaces requires possibly cleaning them up, making a stronger commitment to ABI compatibility going forward and, importantly, breaking tools which depend on the current location of those interfaces.

The last concern could be a show stopper; it could force developers to maintain both the old and new interfaces in parallel for some years. Many developers, faced with that sort of task, may just decide to leave the interface where it is. Debugfs is not supposed to have any ABI guarantees, but, as has become clear in the past, such a policy does not necessarily prevent the creation of an ABI which must be maintained going forward.

So debugfs on production systems seems likely to be with us for some time. Given that, there is no alternative to making it more secure. The checkpatch.pl change is a good start, but it cannot take the place of proper code review. Reviewers have a tendency to skip over debugfs code, but, if that code is to run on important systems, that tendency must be fought. Debugfs code must uphold the security of the system just like any other kernel code.
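The checkpatch.pl test works on source code, but administrators can make a similar check against a running system. The following is a minimal sketch of such a check from user space; the function name is made up for this example, and the mount point mentioned in the note below it is only the conventional location, which may differ on a given system.

```python
import os
import stat


def world_writable_files(root):
    """Yield regular files under root that any local user may write to."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # file vanished or is not accessible; skip it
            # S_IWOTH is the "write permission for others" bit
            if stat.S_ISREG(mode) and mode & stat.S_IWOTH:
                yield path
```

Running it as root over the debugfs mount point (often /sys/kernel/debug) and over /sys would list candidate files for review. It cannot say whether a given file is actually dangerous; that still takes a look at the code behind it.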

Comments (13 posted)

Optimizing Linux with cheap flash drives

February 18, 2011

This article was contributed by Arnd Bergmann

Flash drives are getting larger and cheaper; as a result, they are showing up in an increasing number of devices. These drives are not the same as the rotating-media drives which preceded them, and they have different performance characteristics. If Linux is to make proper use of this class of hardware, it must drive it in a way which is aware of its advantages and disadvantages.

This article will review the properties of typical flash devices and list some optimizations that should allow Linux to get the most out of low-cost flash drives. The kernel working group of the Linaro project is currently researching this topic as an increasing number of embedded designs move away from raw NAND flash devices to embedded MMC or SD drives that hide the NAND interface and provide a simplified linear block device. This drives down system design complexity and cost but also means that regular block-oriented filesystems are used instead of the Linux MTD layer that can talk to raw flash.

Most filesystems and the block layer in Linux are highly optimized for rotating media, in particular by organizing all accesses to avoid seeks. It has become clear that some of these optimizations are pointless or even counterproductive with solid-state storage media. In recent kernels, there is a per-device flag for non-rotational devices that treats these slightly differently, by assuming that all seeks are free, but is that really enough to get good I/O performance on solid state drives? High-end drives are getting fast enough to make optimizations for CPU load more interesting than optimizations for ideal access patterns. In contrast, the more common SD cards and USB flash drives are very sensitive to specific access patterns and can show very high latencies for writes unless they are used with the preformatted FAT32 file layout.

As an example, a desktop machine using a 16 GB, 25 MB/s CompactFlash card to hold an ext3 root filesystem ended up freezing the user interface for minutes during phases of intensive block I/O, despite having gigabytes of free RAM available. Similar problems often happen on small embedded and mobile machines that rely on SD cards for their file systems.

To understand why this happens, it is important to find out how the embedded controllers on these cards work. Since very little information is publicly documented, most of the following information had to be gathered using reverse engineering based on timing data collected from a large number of SD cards and other devices.

Pages, erase blocks and segments

All NAND flash chips are physically organized into "pages" and "erase blocks." A page is the smallest unit that can be addressed in a single read or write operation by the embedded microcontroller on a managed flash device, and it has an effective size between 2KB and 32KB in current consumer flash drives. This means that while a single 512-byte access is possible on the host interface (USB, ATA, MMC, ...), it takes almost the same time as a full page access inside of the drive.

Although it is usually possible to write single pages, the data cannot be overwritten without being erased first, and erasing is only possible in much larger units, typically between 128KB and 2MB. The controllers group these erase blocks into even larger segments, called "erase block groups," "allocation units," or simply "segments." The most common size for these segments is 4MB for drives in the multi-gigabyte class, and all operations on the drive happen in these units; in particular, the drive will never erase any unit smaller than a segment.

The drives have a single lookup table which contains a mapping between logical segments and physical segments. On a typical 8GB SD card using 4MB segments, this table contains a little under 2000 entries, which is small enough to be kept in the RAM of the card's microcontroller at all times. A small number of physical segments is set aside in a pool to handle wear leveling, bad blocks and garbage collection.
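The size of that table follows directly from the capacity and the segment size. As a quick check, assuming that the advertised 8GB is a decimal figure (8 x 10^9 bytes) while segments are 4MB in binary units, the arithmetic works out as follows; the function name is made up for this example.

```python
MIB = 1024 * 1024


def segment_table_entries(capacity_bytes, segment_bytes=4 * MIB):
    """Number of whole segments, and thus lookup-table entries, on a drive."""
    return capacity_bytes // segment_bytes


# Advertised capacity in decimal gigabytes (an assumption about the card)
print(segment_table_entries(8 * 10**9))    # 1907
# The same card if it really held 8 binary gigabytes
print(segment_table_entries(8 * 1024**3))  # 2048
```

The first figure matches the "little under 2000 entries" mentioned above; the segments set aside for the spare pool come on top of that.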

Ideally, the drive expects all data to be written in full segments, which is what happens when recording a live video or storing a music collection on a FAT32 filesystem.

[Bar chart] The way the physical characteristics of the card make themselves felt can be seen in the plot to the right (click on the thumbnail for the full-size version), which summarizes the results of a number of tests on an SDHC memory card. The best-case read throughput is 13.5MB/s, while the linear write throughput is 11.5MB/s. The results show that the segment size is 4MB; any properly-aligned, 4MB write will be fast. The smallest efficient block size for reads and writes is 64KB; all accesses smaller than that are significantly slower. Individual pages are 8KB; the costs of extra garbage collection caused by smaller writes can be seen. The card as a whole has been optimized for linear write operations; random writes are much slower. Additionally, only one segment can be open at a time; alternating between two segments will cause garbage collection at every access, slowing write speeds to a mere 33KB/s. That said, the FAT file table area (from 4MB to 8MB) is managed differently, enabling small writes to be done efficiently there.

[Performance plot] The second image to the right shows a plot of read access times, in page granularity, on the first 32MB of a Panasonic Class 10 SDHC card. This plot illustrates various properties of the card. The segment size of 4MB can clearly be seen from the various changes in performance at the boundaries between segments. All closed segments have the same read performance, as do all erased segments, which are a little faster to read. The FAT area in the second segment is a bit slower when reading because it uses a block remapping algorithm. One segment was opened for writing by writing a few blocks in the middle before the read test; that segment can be seen as being a little faster to read on this specific card. Also, an effect of multi-level-cell (MLC) flash is that it alternates between slightly slower and faster pages, which the plot shows as two parallel lines for some segments.

Wear leveling

When a segment that already contains data is written to, a new segment is allocated from the free pool and the drive writes the new data into that segment. Once the segment has been written to from start to finish, the lookup table will be updated to point to the new segment, while the old segment is put into the free pool and erased in the background.

By always allocating a new segment, the drive can avoid wearing out a single physical segment in cases where the host always writes to the same block addresses. Instead, all writes are statistically distributed to all the segments that get written to from time to time. The better memory cards and SSDs also do static wear leveling, meaning they occasionally move a logical segment that contains static data to a physical segment that has been erased many times to even out the wear and increase the expected lifetime of the card. However, the vast majority of cheap memory cards do not do this but, instead, rely on the host software to write to every segment of the drive at some time or other.

[Segment mapping diagram] The diagram to the right shows how this mapping works in a typical flash drive; click on it for an animated version.

To improve wear leveling, the host can also issue trim or erase commands on full segments to increase the size of the free pool. However, file systems in Linux do not know the segment size and typically issue trim commands on partial segments, which can improve write performance inside that segment but does not help wear leveling across segments.
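The allocate-from-the-pool scheme described in this section can be modeled in a few lines. What follows is a simplified sketch rather than the firmware of any real card: the class and its names are invented for this example, the free pool is treated as a simple FIFO, and the erase is counted at the moment a segment is returned to the pool.

```python
from collections import deque


class SegmentMap:
    """Toy model of a logical-to-physical segment table with a spare pool."""

    def __init__(self, logical_segments, spare_segments):
        total = logical_segments + spare_segments
        # Initially, logical segment i lives in physical segment i
        self.table = list(range(logical_segments))
        # The remaining physical segments form the free pool
        self.free_pool = deque(range(logical_segments, total))
        self.erase_count = [0] * total

    def rewrite(self, logical):
        """Rewrite a whole logical segment; return its new physical segment."""
        new = self.free_pool.popleft()   # allocate from the free pool
        old = self.table[logical]
        self.table[logical] = new        # table now points at the new copy
        self.erase_count[old] += 1       # old segment is erased ...
        self.free_pool.append(old)       # ... and returned to the pool
        return new
```

With four logical segments and two spares, rewriting logical segment 0 six times leaves the erase counts at [2, 0, 0, 0, 2, 2]. The wear is shared among the rewritten segment and the pool, while the three segments holding static data are never touched, which is why cards without static wear leveling depend on the host writing everywhere from time to time.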

Garbage Collection

In real life, writing 4 MB segments at once is more the exception than the rule, so drives need to cope with partial updates of segments. While data gets written to a logical segment, the controller normally has an old and a new physical segment associated with it. In order to free up the extra segment, it has to combine all the logical blocks in that segment into physical blocks on only one segment and discard all the previously used physical blocks, a process called garbage collection. A number of garbage collection techniques can be observed in current drives, including special optimizations using caching in RAM or NOR flash and dynamically adapting to the access patterns.

Most drives however use a very simple garbage collection method, typically one of the following three. Each description below is accompanied by a diagram which, when clicked, will lead to an animated version showing how the technique works.

Linear-access optimized garbage collection. Drives that are advertised as being ideal for video storage usually expect long, contiguous reads and writes. They always write a physical segment from start to end, so, if the first write into a segment does not address the first logical block inside it, the drive copies all blocks in front of it from the old segment before writing the new data. Similarly, a subsequent write to a block that is not logically contiguous to the previously written one requires the drive to copy all intermediate blocks.

[Linear access diagram] Garbage collection simply fills the new segment up to the end with copies of the unchanged blocks from the old segment.

The advantage is optimum performance for all reads and for long writes, but the disadvantage is that the drive ends up copying almost an entire segment for each block that gets written in the wrong order, for instance when the block elevator algorithm writes the blocks in reverse order attempting to avoid long seeks. Also, writing linear data smaller than the minimum block size of the drive makes it write the same block twice, which forces an immediate garbage collection. The minimum block size that the drive expects here is normally the cluster size of the preformatted FAT32 filesystem, between 4KB and 32KB, but on SD cards, it can be even larger than that.

Drives that are hardwired to linear-access optimized segments are, for this reason, basically useless for ext3 and most other Linux filesystems: those filesystems keep small data structures like inodes and block bitmaps in front of the actual data and need to seek back to them in order to write new small files.
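The cost of out-of-order writes on such a drive can be estimated with a small model. The sketch below counts how many blocks the drive has to copy from the old physical segment for a given sequence of single-block writes to one logical segment. It is a simplification based only on the behavior described above; the function name is made up, and real controllers will differ in detail.

```python
def linear_gc_copies(write_order, blocks_per_segment):
    """Count blocks copied by a linear-access drive for writes to one segment.

    write_order lists the logical block numbers in the order the host
    writes them. The drive is assumed to fill each new physical segment
    strictly from start to end.
    """
    copies = 0
    pointer = None                    # next block to fill; None = nothing open
    for block in write_order:
        if pointer is not None and block < pointer:
            # Going backward: fill the rest of the open segment with
            # copies of the old data (garbage collection), then start over
            copies += blocks_per_segment - pointer
            pointer = None
        if pointer is None:
            pointer = 0               # open a fresh physical segment
        copies += block - pointer     # copy the blocks in front of the write
        pointer = block + 1
    if pointer is not None:
        copies += blocks_per_segment - pointer   # final garbage collection
    return copies
```

For a segment of eight blocks, writing blocks 0 through 7 in order needs no copies at all, while writing the same blocks in reverse order needs 56: each block written costs the drive a copy of the other seven.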

Block remapping. Fortunately, a significant number of flash drives support random access within a logical segment, by remapping logical blocks to free physical blocks as they get written. [Block remapping diagram] Since this requires maintaining another lookup mechanism, both read and write accesses are slightly slower than the ideal linear-access behavior, and a small amount of out-of-band data needs to be reserved to store the lookup table.

This method also does not allow efficient writing in any small units when the manufacturers optimize for larger blocks in order to keep the size of the lookup table small. Writing the same block repeatedly still requires a full garbage-collection, which makes this method unsuitable for storing an ext3 journal or any other data that frequently gets written to the same area on the drive.

Data logging. The best random-access behavior is provided by using the same approach that log-structured filesystems like jffs2, logfs or nilfs2 and block-remappers like UBI in Linux use. Data that is written anywhere in the logical segment always goes to the next free block in the new physical segment, and the drive keeps a log of all the writes cached. Once the last free block is used up, a garbage collection is performed using a third physical segment.

[Data logging diagram] In the end, writing this way is slower than the other two approaches in the best case, because every block is written at least twice, but the worst case is much better.

This approach is normally used only in the first few segments on the drive, which contain the file allocation table in FAT32 preformatted drives. Some drives are also able to use this mode when they detect access patterns that match writes to a FAT32 style directory entry.

Obviously, any such optimizations don't normally do the right thing when a different filesystem is used on the drive than it was intended for, but there is some potential for optimization, e.g. by ensuring that the ext3 journal uses the blocks that are designed to hold the FAT.
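The data logging scheme can be sketched as follows. This is a toy model built from the description above, not a reconstruction of any particular controller; the class, its names, and the choice to count one physical write per block rewritten during garbage collection are all assumptions.

```python
class DataLogSegment:
    """Toy model of one logical segment managed by data logging."""

    def __init__(self, blocks_per_segment):
        self.n = blocks_per_segment
        self.data = [None] * self.n   # merged contents (the "old" segment)
        self.log = []                 # (logical_block, value) in arrival order
        self.physical_writes = 0
        self.collections = 0

    def write(self, block, value):
        """Append the write to the next free block of the log segment."""
        self.log.append((block, value))
        self.physical_writes += 1
        if len(self.log) == self.n:   # last free block used up
            self._collect()

    def read(self, block):
        """Return the newest version of a block, consulting the log first."""
        for logged_block, value in reversed(self.log):
            if logged_block == block:
                return value
        return self.data[block]

    def _collect(self):
        """Merge old data and log into a third segment, then drop the log."""
        for block, value in self.log:  # later entries overwrite earlier ones
            self.data[block] = value
        self.physical_writes += self.n
        self.log = []
        self.collections += 1
```

With four blocks per segment, four single-block writes in any order result in eight physical block writes and exactly one garbage collection. That is slower than a perfectly linear write, but the cost does not depend on the order or on how often the same block is rewritten.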

Restrictions on open segments

One major difference between the various manufacturers is how many segments they can write to at any given time. Starting to write a segment requires another physical segment, or two in case of a data logging algorithm, to be reserved, and requires some RAM on the embedded microcontroller to maintain the segment. [SSD thrashing] Once that limit is reached, writing to a new segment will cause garbage collection on a previously open segment. That can lead to thrashing as the drive must repeatedly switch open segments; see the animation behind the diagram to the right for a visualization of how that works.

On many of the better drives, five or more segments can be open simultaneously, which is good enough for most use cases, but some brands can only have one or two segments open at a time, which causes them to constantly go through garbage collection when used with most of the common filesystems other than FAT32.

When a drive reserves the segments specifically to hold the FAT, these will always be open to allow updating it while writing streaming data to other segments.
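The effect of the open-segment limit can be shown with a short simulation. The sketch below counts how often a drive is forced to garbage-collect an open segment in order to open another one. It assumes that the drive closes the least recently written segment, which is a guess; actual controllers may choose differently. The function name is invented for this example.

```python
from collections import OrderedDict


def count_forced_collections(segment_sequence, max_open):
    """Count garbage collections forced by a limit on open segments.

    segment_sequence lists the segment number touched by each write.
    """
    open_segments = OrderedDict()     # ordered from least to most recent
    collections = 0
    for segment in segment_sequence:
        if segment in open_segments:
            open_segments.move_to_end(segment)   # already open: just write
            continue
        if len(open_segments) == max_open:
            open_segments.popitem(last=False)    # close the oldest segment
            collections += 1                     # which costs a collection
        open_segments[segment] = True
    return collections
```

Eight writes alternating between two segments force seven collections on a drive that can keep only one segment open, and none on a drive that can keep two open. A filesystem that interleaves journal, metadata, and data writes in different parts of the drive produces this kind of pattern all the time.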

Partitioning

When a filesystem wants to optimize its block allocation to the geometry of a flash drive, it needs to know the position of the segments on the drive. On partitioned media, this also implies that each partition is aligned to the start of a segment, and this is true for all preformatted SD cards and other media that require special care for segment optimizations.

Unfortunately, the fdisk and sfdisk tools from util-linux make it particularly hard to do this correctly, because they try to preserve an archaic geometry of 255 "heads" and 63 "sectors" and, by default, align partitions to "cylinder" boundaries. None of these units have any significance on today's hard drives or flash drives, but they are kept for backwards compatibility with existing software. The result is that most partitions are as misaligned as possible: they start on an odd-numbered 512-byte sector, which defeats all optimizations that a filesystem can do to align its accesses to logical blocks and segments inside of the partition.

The same problem has been discussed a lot in the light of hard drives with 4KB sectors, but it is much more significant when dealing with flash media. Current versions of fdisk ask the kernel about physical sector (BLKPBSZGET) and optimum I/O size (BLKIOOPT), but currently these are rarely reported correctly by the kernel for flash drives, because the kernel itself does not have the necessary information. SDHC cards report the segment size in sysfs, but this is not used by any partitioning tools, and all cards currently seem to report 4MB segments, even those that actually use 2MB or 8MB segments internally.

The linaro-media-create tool (from Linaro Image Tools) has recently been changed to align partitions to 4 MB boundaries when installing to a bootable SD card, to work around this problem.
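Checking or fixing the alignment of a partition is simple arithmetic. The sketch below assumes 512-byte sectors and a 4MB segment size; the helper names are made up for this example, and the segment size of a given card may well be different.

```python
SECTOR = 512
MIB = 1024 * 1024


def is_aligned(start_sector, segment_bytes=4 * MIB):
    """True if a partition starting at start_sector begins on a segment."""
    return (start_sector * SECTOR) % segment_bytes == 0


def align_up(start_sector, segment_bytes=4 * MIB):
    """Round start_sector up to the next segment boundary."""
    sectors_per_segment = segment_bytes // SECTOR
    # -(-a // b) is ceiling division for non-negative integers
    return -(-start_sector // sectors_per_segment) * sectors_per_segment
```

A partition starting at sector 63, the traditional result of a 63-sector "track", is not aligned; align_up(63) gives sector 8192, which is exactly 4MB into the drive.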

Future work

There is a huge potential for optimizing Linux to better deal with the deficiencies of flash media in various places in the kernel and elsewhere. With the storage and filesystem summit coming up this April, there is hopefully time to discuss these and other ideas:

  • All partition tools should default to a much larger alignment, e.g. 4 MB or what the drive itself reports, for flash media and ignore cylinder boundaries.

  • The page cache could benefit from the fact that larger accesses end up taking less time than accesses shorter than a flash page. When a drive reads 16KB, the kernel may as well add all of it to the page cache.

  • The elevator and I/O scheduler algorithms can do much better than they do today for drives that only do linear access. Ideally, all outstanding writes to one segment should be submitted in order within a segment before moving to another segment.

  • A stacked block device can be used to reorder blocks during write, creating a copy-on-write log-structured device on top of drives that can only write to one segment at a time. A first draft design for such a device is available on the FlashDeviceMapper page at Linaro.

  • The largest potential is probably in the block allocation algorithm in the filesystem. The filesystem can ensure that it submits writes in the correct order to avoid garbage collection most of the time. Btrfs, nilfs2 and logfs get this right to a certain degree, but could probably get much better.
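The reordering idea from the I/O scheduler item above can be sketched in a few lines. This is only an illustration of the principle, not a proposal for how the kernel's elevator code should look; the function name is invented, and it works on plain block numbers rather than real request structures.

```python
def order_writes_by_segment(pending_blocks, blocks_per_segment):
    """Reorder pending block writes so each segment is written in one pass.

    Segments are visited in the order in which they were first touched;
    blocks inside a segment are written in ascending order.
    """
    by_segment = {}
    for block in pending_blocks:
        segment = block // blocks_per_segment
        by_segment.setdefault(segment, []).append(block)
    ordered = []
    for segment in by_segment:        # dicts keep first-seen order
        ordered.extend(sorted(by_segment[segment]))
    return ordered
```

With eight blocks per segment, the pending writes [9, 1, 8, 0, 10, 3] become [8, 9, 10, 0, 1, 3]: the drive sees one linear pass over each segment instead of five switches between the two.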

Resources

More information about specific measurements can be found in the Linaro flash card survey. Readers are welcome to add data about their memory cards and USB drives to the list.

The tool that was used to do all measurements is available from git://git.linaro.org/people/arnd/flashbench.git.

Comments (93 posted)

Page editor: Jonathan Corbet

Distributions

Btrfs by default in Fedora 16?

By Jonathan Corbet
February 23, 2011
Btrfs is, by many accounts, the next-generation filesystem for Linux systems. It has, in fact, been the next-generation filesystem for a few years now; progress in the filesystem area often seems to be slow. But btrfs has been maturing to the point that it is highly usable in a number of environments. The list of btrfs users may grow significantly toward the end of this year if a proposal to make it the default filesystem for the Fedora 16 release is adopted.

That proposal was made by btrfs developer Josef Bacik. Josef would not only like to see btrfs as the default F16 filesystem; he would like to stop using the LVM volume manager in favor of the internal volume management built into btrfs. As he notes: "Fedora 16 is a very aggressive target, which is also why I'm bringing it up now. I think we will be ready by then." There are a few things that will have to happen, though, for that to be true.

For example, there is the little problem that there is still no filesystem checker for btrfs. That is, according to btrfs creator Chris Mason, the biggest missing feature in the filesystem at the moment. He also said that he is working on it full time, so this particular gap should be filled sometime in the near future. That said, filesystem checkers, like the filesystems themselves, require a certain amount of time to mature into the kind of rock-solid behavior that users tend to expect. The btrfsck program will certainly have to evolve over time as the various ways in which filesystems can become corrupted are discovered.

Josef noted that support in the GRUB bootloader is another open problem. Patches adding support to GRUB1 exist, but they would have to be carried by Fedora, since there is no functioning upstream for GRUB1. The alternative is to move the distribution to GRUB2, a tool which has the added advantage of an existing upstream development community. Fedora is already contemplating moving to GRUB2, but, with regard to btrfs, there is, naturally, a catch. As was discussed in this article last December, GRUB2 is licensed under GPLv3, while the btrfs code is GPLv2-only. As has been done with ZFS, getting btrfs support into GRUB2 would require relicensing some of the code to GPLv2+. There are now enough contributors to btrfs to make that relicensing an interesting problem, but it is probably still feasible at this point.

Another potential issue is that the developers of Anaconda (the Fedora installer) have complained that they already have a lot of things to work on for the Fedora 16 release. They don't relish the idea of more work, but, according to Chris Lumens, "perhaps we can find some time somewhere." Simply installing to btrfs should be a relatively easy change; reworking volume management to be done within btrfs sounds like rather more work.

Jon Masters opposed the idea of switching away from LVM, saying:

Yes, BTRFS can do a lot of volume-y things, and these are growing by the day, but I don't want my filesystem replacing a full volume manager and I am concerned that this will lead to less testing and exposure to full LVM use within the Fedora community.

He did not find much support on the list, though; most participants in the discussion seem interested in pushing forward and using the interesting features that btrfs has to offer. Lennart Poettering would like to take things further by splitting the installed drive into three subvolumes, one of which (holding the root filesystem) would be mounted read-only. This scheme would separate the system software from user files and protect the system from changes most of the time, but there would be no need to worry about filesystem sizing since btrfs can expand any of the subvolumes when needed.

That, of course, would be a significant change; having to remount the root filesystem for write access to install an update or make a configuration change could get old relatively quickly. But it could also improve the security of a running system and may be a good configuration for a number of environments. The ability to take snapshots of the root partition and roll the system back in case of trouble would be a nice added bonus.

The one other thing to be kept in mind is that btrfs, despite the speed with which it is maturing, will certainly have a surprise or two in store for its users still. Such is the nature of a new filesystem. But that is also the nature of free software: at some point widespread real-world testing is required to shake out the last round of bugs. Fedora seems like it could be a good place for this level of testing. Whether the ambitious Fedora 16 target will be met remains to be seen, but, if a btrfs default does not happen then, it can probably be expected soon thereafter.

Comments (11 posted)

Brief items

Distribution quote of the week

Dear FSF, thanks for your appreciation of Debian Squeeze achievements in getting rid of non-free firmware blobs. We still disagree on the overall freeness assessment of Debian, but I'm positive that steps like this one can further future collaboration, in the interest of both projects.
-- Stefano Zacchiroli

Comments (none posted)

Red Hat Enterprise Linux 4.9

Red Hat has announced the availability of Red Hat Enterprise Linux 4.9. "This is the final minor release for Red Hat Enterprise Linux 4. With this release Red Hat Enterprise Linux 4 will be entering Production 3 phase during which qualified security Errata Advisories of critical impact, as well as, selected urgent priority bug-fix errata may be released."

Comments (none posted)

Ubuntu 10.04.2 LTS released

The second maintenance release of Ubuntu/Kubuntu/Xubuntu 10.04 LTS is available. "Numerous updates have been integrated, and updated installation media has been provided so that fewer updates will need to be downloaded after installation. These include security updates and corrections for other high-impact bugs, with a focus on maintaining stability and compatibility with Ubuntu 10.04 LTS."

Full Story (comments: none)

Splashtop OS Now Available

Splashtop OS is a lightweight system optimized for notebooks and netbooks. It does not include any native applications beyond the Chromium-based browser, and it boots in seconds directly into a start screen featuring a Bing-powered search box. "The new Splashtop OS, version 1.0, provides broad support for a wide array of PC models, as well as an updated version of the Chromium-based browser featuring one-click access to the Chrome Web Store for easy installation [of thousands] of web apps, games, extensions and themes."

Full Story (comments: 3)

Distribution News

Debian GNU/Linux

Event and merchandise handling in Debian

Luca Capello covers some changes in merchandising for Debian events. "Debian, as other players in the Free Software world, is often present at different meetings, expos or conferences around the world, in different manners. Beside having Debian attendees or speakers, most of the time there is a Debian booth, where you can usually find merchandise on sale. It can be a bit strange to read that Debian people are interested in merchandising. However, for a volunteer project like Debian handling merchandise has different goals."

Full Story (comments: none)

Fedora

Notes from the Fedora board - SQLninja is back in

The minutes from the February 21 Fedora board meeting have been posted. The big issue appears to be a disagreement with the GNOME project on appropriate wallpaper for the Fedora 15 release, but the board also, after talking with Red Hat's legal department, decided to lift the ban on SQLninja which was imposed last November.

Full Story (comments: 7)

openSUSE

Spread openSUSE DVDs

The openSUSE project has special promotional DVDs that contain live versions of both the KDE and GNOME desktops to try and install. "So today the openSUSE Project would like to encourage you make use of your second free software freedom and help us to spread promo DVDs! It's easy, think about where you could have the chance to spread them, then order them from us and do it."

Full Story (comments: none)

Ubuntu family

Ubuntu Wiki to be re-licensed to CC BY-SA

Canonical has decided to transition the content in the Ubuntu Wiki to the CC-BY-SA license. "In the absence of a substantial number of objections, this change will be made to the Ubuntu wiki after approximately one month."

Full Story (comments: none)

Other distributions

Unity and Yoper, a tale of two distros...

The developers of Unity Linux and Yoper Linux are working on a new project called ubuild. "A collaboration effort of core developers of Unity Linux and Yoper Linux to produce a 'better' buildsystem than either the Unity Linux community or Yoper Linux currently have. It also sets out to lower the entrance barriers for users to contribute to development tasks for any distribution that chooses to use ubuild for its development."

Comments (none posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Minty Fresh KDE and LXDE: Linux Mint LXDE and KDE 10 (Linux.com)

Joe 'Zonker' Brockmeier reviews the release candidates of two Linux Mint 10 editions. "The biggest difference for Mint vs. Kubuntu is that the KDE edition comes with KDE 4.6. Since Kubuntu 10.10 shipped before 4.6 was ready, the release came with a 4.5 series release. Note that you can get 4.6 via a Personal Package Archive (PPA) for Kubuntu, though. At any rate, it's nice to see Mint shipping with the very latest KDE. And I do like what they've done with KDE. The theme is nice and easy on the eyes, and the selection of applications is spot on - though I wish they'd shipped Firefox 4 beta and LibreOffice instead of OpenOffice.org."

Comments (1 posted)

Review: CrunchBang Linux (TechWorld)

TechWorld has a review of #! (CrunchBang). "But then, CrunchBang is not a handholding Linux distribution decked with eye candy; it's a delightfully sleek distro which by default delivers a minimalist (but still useable) desktop operating system. CrunchBang is based on Debian, and the latest version is 10, or 'Statler' (after the Muppet). (For this review I used the 32-bit crunchbang-10-20110207-openbox-i686 ISO. An Xfce-based version of CrunchBang is also available.)"

Comments (none posted)

Page editor: Rebecca Sobol

Development

Rethinking interactive fiction games with Curveship

February 23, 2011

This article was contributed by Nathan Willis

Interactive fiction (IF) is one of the longest continuing computer game traditions; it dates back to the mid-1970s and Will Crowther's "Adventure." Over the years that followed, text-based games became popular because the lack of graphics made them playable on almost any computer system, from minicomputer to PC to mainframe. Just because graphical games have taken over the bulk of the commercial game market doesn't mean text-based games aren't evolving, however: the worlds within them grow ever larger and more complex, and the interpreters can parse increasingly complicated sentence structure. Still, Nick Montfort's recently released Curveship IF system adds an entirely new wrinkle: the ability to re-flow the narration itself, changing the order in which events are retold, who the principal actor is, and other fundamental characteristics of the story.

Curveship versus existing IF

Traditionally, IF games are written to target a "virtual machine" that maintains the state of the game world, characters, and objects, and implements some base-level logic to simplify the writing process (e.g., defining global rules like "you cannot place any object inside itself" or "you cannot pick up an object that you are already holding"). The most famous is probably the Z-machine specified by Infocom and named after its early hit "Zork" series (which was based on "Adventure").

An actual game is distributed as "story code" files, which are run on a Z-machine interpreter, but game authors would create the game by writing in a higher-level language like Inform. Up through version 6, Inform used a syntax that resembled a structured programming language, complete with includes, object declarations, and conditional expressions. It might be easy to read Inform 6 code and understand what it describes, but it is hardly convenient to write.

In 2006, Inform 7 switched over to a natural-language-like syntax, so that the author could describe the game world in simple English statements like "The Billiards Room is a room. The Billiards Room is north of the Dining Room. Colonel Mustard is a man in the Dining Room." This new syntax makes it easier to design and write a game, but the gameplay is still limited to second-person, present tense narration of the player's actions and how the world responds to them.

Curveship throws out those limitations. The easiest change to understand is that the author can create a world and then "play" it from the perspective of any character by starting the game with a different set of options, which Curveship calls a "spin." A Curveship game can also change the tense and tenor of the narration, without requiring the author to re-work the game.

For example, an action the player takes could shift the style of the narration from "You wake up in the Conservatory. A candlestick is lying on the floor." to "You recall seeing a candlestick lying on the floor. You remember waking up in the Conservatory." Presumably this marks a pivotal moment in the plot, not just a random change in style. The advantage for the author is that it takes place automatically, triggered by some change in state of an in-world object, a choice the player makes, or some other event. Changing the narration style on-the-fly, Montfort says, allows authors to tell "the same underlying events in different ways."

Exploring Curveship

Curveship has its own game-writing syntax and its own interpreter (written in Python). The code is hosted at GitHub, where you can download a tar archive of the initial release, time-stamped February 1st. The download includes a brief README, the interpreter and supporting Python libraries, six sample games in the "fiction" directory, and seven sample spins in the "spins" directory. The package is released under the BSD-like ISC License.

As distributed, the Curveship example games can only be played from the command line using the Python interpreter. Four of the example games are simple demonstrations of Curveship's special functionality: Artmaking, Cloak of Darkness, Lost One, and Robbery. Lost One, for instance, involves a player character waiting for a friend in an outdoor plaza. If the player simply stands and waits, looking around at the environment, the friend eventually arrives. But if the player wanders around too much, Curveship slowly begins shifting the narrative structure, making it third-person, past-tense, and increasingly vague.

Robbery is a non-interactive story; it describes a sequence of events involving a bank holdup. The fun comes from replaying the story with different spins. To launch the game as-is, run python curveship.py fiction/robbery.py. Appending one or more of the spin files to the end of the command runs the story with a different narrative structure:

  • python curveship.py fiction/robbery.py spin/prophecy.py tells it in future tense.

  • python curveship.py fiction/robbery.py spin/prophecy.py spin/hesitant.py tells it in future tense and sprinkles the narration with interjections.

  • python curveship.py fiction/robbery.py spin/told_and_focalized_by_guard.py tells it in first-person from the bank guard's point of view.

The last example shows off another of Curveship's features: the world model keeps track of what each character in the story knows. The plain-and-straightforward account of the bank robbery starts with the bank robber putting on a "Dora The Explorer" mask, then entering the bank. The guard's account begins only when the teller laughs at the robber's mask, an event that wakes the guard up from his nap.

It is interesting enough that the game engine can do this in a non-interactive story, but imagine the possibilities when applied to a full-blown IF game. The author only needs to create the world once, but the player could experience very different gameplay, including starting at different points in the same story, just by specifying a different spin at run-time.
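The way several spin files stack on one command line can be pictured with a small sketch. It is only an illustration of the idea: the dictionary keys, the default values, and the combine_spins() helper are all hypothetical, and nothing here is taken from Curveship's actual spin format.

```python
# Hypothetical model of a "spin": a plain dict of narration settings.
# None of these keys come from Curveship's real spin files.
DEFAULT_SPIN = {
    "tense": "present",
    "narrator": None,      # None stands for an unnamed, outside narrator
    "hesitant": False,
}

def combine_spins(*spins):
    """Merge spins left to right; later spins override earlier ones."""
    combined = dict(DEFAULT_SPIN)
    for spin in spins:
        combined.update(spin)
    return combined

prophecy = {"tense": "future"}
hesitant = {"hesitant": True}

settings = combine_spins(prophecy, hesitant)
print(settings["tense"], settings["hesitant"])  # future True
```

Under this assumption the world description never changes; only the small settings record handed to the narrator does, which is what lets one story be retold in many ways.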

The last two sample games are Curveship adaptations of other IF games: Cloak of Darkness Plus (which expands the oft-repeated Cloak of Darkness IF demo), and Adventure In Style (a remake of Crowther's original "Adventure").

World structure and syntax

The price for Curveship's mutable narration is that you cannot author a game in the near-English natural language of Inform 7. Montfort describes Curveship's syntax as "strings-with-slots." It follows basic English sentence structure, but replaces subjects, objects, and verbs with templates that include the base word plus clues for the text generator to adapt when changing tense or style.

The principal (or player) subject is represented as [*/s]; other subjects replace the asterisk with a noun, such as [butler/s] or [hobbit/s]. Objects are likewise enclosed in brackets, but end with the /o closing tag, such as [pantry/o] or [shuttlecraft/o]. Basic variations to specify elements like possessive form are provided, so the construction:

    [*/s] notice the [cobra/o] in [*'s] [shoe/o]

would be rendered as "You notice the cobra in your shoe" in generic second-person, present tense. Running a spin from some other character's perspective would replace "your shoe" with "the detective's shoe" or whatever the appropriate designation for the principal character happens to be.
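The subject, object, and possessive slots can be mimicked with a few lines of Python. This is a toy sketch and not Curveship's text generator: the render() function, its person and protagonist parameters, and the substitution rules are assumptions made for illustration.

```python
import re

# Toy renderer for the subject/object slots described above.
# render(), "person", and "protagonist" are hypothetical names.
SLOT = re.compile(r"\[([^\]/]+?)('s)?(?:/([so]))?\]")

def render(template, person="second", protagonist="detective"):
    def fill(match):
        word, possessive, _role = match.groups()
        if word == "*":
            # The asterisk stands for the principal character.
            if person == "second":
                return "your" if possessive else "you"
            noun = "the " + protagonist
            return noun + "'s" if possessive else noun
        return word + "'s" if possessive else word

    text = SLOT.sub(fill, template)
    return text[0].upper() + text[1:]

line = "[*/s] notice the [cobra/o] in [*'s] [shoe/o]"
print(render(line))                  # You notice the cobra in your shoe
print(render(line, person="third"))  # The detective notice the cobra in the detective's shoe
```

The third-person output leaves the bare verb "notice" unconjugated, which is exactly the gap that verb slots are meant to close.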

Verbs are a bit more complex. In a real game, the verb "notice" in the previous example would be written as [notice/1/v] — the "v" marks the word as a verb, the "1" indicates that the subject is singular. Verbs are generally written in present tense, so that the text generator can adapt them as necessary (e.g., "You noticed the cobra ... "), but if the circumstances of the story demand it, you can force the past tense or present participle by adding qualifiers like /ed — [notice/ed/1/v] would be rendered as "you noticed the cobra," for example.

Combining these techniques, you can construct sentences that work when rendered from any perspective, such as:

    [elf/s] [smack/1/v] [dwarf/o] because [elf/s] [notice/ed/1/v] [cobra/o]

The result could be generated as "The elf smacks the dwarf because the elf noticed the cobra," or "The dwarf is smacked by the elf because the elf noticed the cobra." In the latter case, you'll notice, the perspective of the sentence is turned around, but the structure of the action it describes stays the same — the same creature receives the smack each time.
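A verb slot can be sketched in the same spirit. The code below handles only regular English verbs with a third-person subject, and it treats any number other than "1" as plural; that reading of the number field, along with the helper names and the tense argument, is an assumption rather than documented Curveship behavior. Passive-voice rewriting is not attempted.

```python
import re

# Toy conjugator for verb slots such as [smack/1/v] and [notice/ed/1/v].
# Regular verbs and third-person subjects only; all names are hypothetical.
VERB = re.compile(r"\[(\w+)(/ed)?/(\d)/v\]")

def past(base):
    return base + "d" if base.endswith("e") else base + "ed"

def present(base, number):
    # Assumption: "1" marks a singular subject, anything else a plural one.
    if number != "1":
        return base
    if base.endswith(("s", "sh", "ch", "x")):
        return base + "es"
    return base + "s"

def conjugate(template, tense="present"):
    def fill(match):
        base, forced_past, number = match.groups()
        # A verb marked /ed stays in the past whatever the narration tense.
        if forced_past or tense == "past":
            return past(base)
        return present(base, number)

    return VERB.sub(fill, template)

sentence = "the elf [smack/1/v] the dwarf because the elf [notice/ed/1/v] the cobra"
print(conjugate(sentence))
# the elf smacks the dwarf because the elf noticed the cobra
print(conjugate(sentence, tense="past"))
# the elf smacked the dwarf because the elf noticed the cobra
```

Keeping the base form in the template and deferring conjugation to render time is what allows the same sentence to survive a change of tense.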

Sadly, Curveship does not have a full syntax description or writer's manual at this time, but it does have a decent introduction to the game world's theoretical model and a general glossary of terms, which illuminate some of the assumptions that make it powerful.

For example, each "actor" (either a player or non-player character) is a top-level item in the world, as is each physical object or room in the environment. That's typical for an IF model. But each actor also has a concept, which tracks what portion of the world the character is aware of. That feature is what enables Curveship to re-orient the narrative around a different actor and come up with a different sequence of events. If an actor does not see an object or other actor at a particular point in the story, it is not in his or her concept.

If that seems nebulous, a good way to get more familiar with concepts is to print them out while running one of the example games. At the python prompt while in-game, typing "world tree" will list the entire item tree of the game, starting with "cosmos" at the top, followed by each room, actor, and all of the objects, nested by their locations. You can view an individual's concept by typing "concept @actor_name tree".

If you do that with the Robbery example game, you'll discover something interesting. The robber and the guard both know that the teller in the vestibule is holding a black bag. Running "concept @robber tree" displays the following:

    @cosmos: nature []
        @street: street outside the bank [of]
        @lobby: lobby [of]
            @robber: twitchy man [in]
                @fake_gun: gun-shaped object [of]
                @mask: Dora the Explorer mask [on]
            @guard: burly guard [in]
                @pistol: pistol [of]
        @vestibule: vestibule [of]
            @slips: deposit slips [in]
            @bag: black bag [in]
            @teller: bank teller [in]

But running "concept @teller tree" reveals that the black bag doesn't contain what the robber is hoping for:

    @cosmos: nature []
        @vestibule: vestibule [of]
            @slips: deposit slips [in]
            @bag: black bag [in]
                @fake_money: fake money [in]
            @teller: bank teller [in]
        @lobby: lobby [of]
            @robber: twitchy man [in]
                @fake_gun: gun-shaped object [of]
                @mask: Dora the Explorer mask [on]
            @guard: burly guard [in]
                @pistol: pistol [of]

It is a small touch, but one that gives authors an interesting tool with which to craft stories.
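One way to picture a concept is as a filtered copy of the world tree. The sketch below is a minimal illustration under that assumption: the Item class, show(), and concept_of() are hypothetical, and the set of known items is supplied by hand, whereas Curveship presumably derives it from what each actor perceives during the story. The tags and link labels are borrowed from the Robbery listings.

```python
# Minimal sketch of a world tree and a per-actor "concept".
# Item, show(), and concept_of() are hypothetical names.
class Item:
    def __init__(self, tag, name, link="", children=()):
        self.tag, self.name, self.link = tag, name, link
        self.children = list(children)

def show(item, depth=0):
    """Return the tree as indented '@tag: name [link]' lines."""
    lines = ["    " * depth + "@%s: %s [%s]" % (item.tag, item.name, item.link)]
    for child in item.children:
        lines.extend(show(child, depth + 1))
    return lines

def concept_of(item, known):
    """Copy the tree, keeping only children whose tags are in `known`."""
    kept = [concept_of(c, known) for c in item.children if c.tag in known]
    return Item(item.tag, item.name, item.link, kept)

world = Item("cosmos", "nature", "", [
    Item("vestibule", "vestibule", "of", [
        Item("bag", "black bag", "in", [
            Item("fake_money", "fake money", "in"),
        ]),
        Item("teller", "bank teller", "in"),
    ]),
])

# The robber has seen the bag but not what is inside it.
robber_knows = {"vestibule", "bag", "teller"}
print("\n".join(show(concept_of(world, robber_knows))))
```

Because each concept is a separate structure, narrating from the robber's point of view can never mention the fake money, while narrating from the teller's can.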

At the moment, the dearth of documentation — particularly on compiling Curveship source into executable Python code, and on the syntax for writing spins — limits what you can do with Curveship. For Curveship to take off as a full-fledged IF tool, it will need to build up its documentation, and possibly find support from other open source IF engines. Although Python is widespread and cross-platform, installing games currently requires unpacking them into multiple subdirectories of the Curveship distribution.

But then again, this public release is just three weeks old. Montfort has evidently been working on it as a graduate student for around five years, and the publication history indicates that a lot of IF theorists think it is interesting work. So far the development community is small; there is just one outside contribution — a spin called "mopey" that makes all of a game's narration downcast and negative. Montfort recommends interested users visit the intfiction.org IF forum to learn more, where it looks as if Curveship has its share of fans. IF has been going strong for 35 years so far; with the new POV and narrative techniques Curveship brings to the table, it will be interesting to see what the die-hard IF authors manage to come up with next.

Comments (6 posted)

Brief items

Quotes of the week

If we can create a database, someone can get libedit to work 100%! There is no excuse for this not being done, seeing that libreadline has been (viral) GPL forever and has changed APIs regularly and broken things for us.
-- Bruce Momjian

The second reason [for the delayed GIMP 2.8 release] is that we develop features directly on the main branch. That this is a problem is highly related to that developers come and go. In fact, it is the reason we have long development cycles overall. There is almost always a feature on the main branch that is incomplete. This has the sad side effect that if someone contributes a complete feature in the beginning of a development cycle, it will literally take years before that features reaches a wide audience.
-- Martin Nordholts

Comments (3 posted)

GNOME Shell 2.91.90 released

The GNOME Shell 2.91.90 release is out; this release, it is said, "just about concludes user interface changes anticipated before GNOME 3.0." Those changes include reworked workspace handling, a new PolicyKit authentication agent, the removal of the "minimize" and "maximize" buttons from window title bars (explanation here), and a lot more. This release represents the experience the GNOME developers are expecting to have for 3.0; see the announcement (click below) for details.

Full Story (comments: 74)

libchamplain 0.9.0 released

libchamplain is a clutter-based mapping library; the 0.9.0 release is now available. "We're a bit late as usual but I believe we'll be in time for the Gnome 3.0 release. In this release there have been many changes of layers, markers and polygons (now paths) which means that if your application uses one of these, it will most probably need an update." See the announcement for a list of changes.

Full Story (comments: none)

Mixxx 1.9.0

Version 1.9.0 of the "Mixxx" DJ system has been released. "Mixxx 1.9.0 adds several major new features including Shoutcast support, direct deck outputs for external mixers, and ReplayGain normalization. We've also added many enhancements to the library, a revamped default skin, and more."

Full Story (comments: none)

WordPress 3.1 released

The WordPress 3.1 release is out. "This release features a lightning fast redesigned linking workflow which makes it easy to link to your existing posts and pages, an admin bar so you're never more than a click away from your most-used dashboard pages, a streamlined writing interface that hides many of the seldom-used panels by default to create a simpler and less intimidating writing experience for new bloggers (visit Screen Options in the top right to get old panels back), and a refreshed blue admin scheme available for selection under your personal options." Strangely, the code name for this release is "Django". (Update: the code name has since been changed to "Reinhardt".)

Comments (7 posted)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Tom's Definitive Linux Software Roundup: Audio Apps (Tom's Hardware)

Tom's Hardware has an extensive guide to Linux audio applications. The article has mini-reviews of a number of different kinds of audio applications, including music managers, audio players, CD players, CD rippers, tag editors, and so on. "Originally, we intended to create a single article on Linux-based audio applications. However, it soon became apparent that the sheer number of audio production apps would not permit this. So that we don't bore casual users with audio production jabber, this article is split in two: content consumption and content creation. Most end-users will be more interested in this article, while musicians and audio professionals should look to the next one for their Linux audio needs."

Comments (6 posted)

Page editor: Jonathan Corbet

Announcements

Articles of interest

How to root a Nook Color to transform it into an Android tablet (ars technica)

Here's a detailed article on rooting the Nook Color reader device to expose its full Android capabilities. "There are two different approaches to turning the Nook Color into a tablet: you can root the Nook Color's default software environment and extend it with third-party applications, or you can run a conventional Android environment by booting a custom ROM image from a microSD card. The custom ROM images are an appealing option because they offer the ability to get relatively close to the stock upstream user experience. Unfortunately, the custom ROMs are still highly experimental and aren't quite yet ready for day-to-day use."

Comments (5 posted)

Building the Technology Stack for Internet Freedom (Gigaom)

Gigaom reviews an attempt by the New America Foundation to obtain government funding to create a more freedom-friendly net using a number of existing projects. "[Commotion] is a fairly new project that seeks to make distributed communications easier by turning any device from a phone to a router into a node on a mesh network. This can be used to create a wireless LAN for Serval-enabled handsets to run on top of, or it can be used to create an access network in general. The point here is that it's distributed, as opposed to every connection going back to a central wireless or wireline provider."

Comments (16 posted)

Haldar: The Cognitive Style of Unix

Vivek Haldar examines research that suggests that making software too easy-to-use impedes a user's ability to learn. "One of the most deeply held beliefs in the culture of *nix (and everything that springs from it) is that the steep learning curve pays off. Yes, the tools seem cryptic and "hard-to-use", with hardly any crutches for the beginner. But if you stick with it and keep learning you will be rewarded. When you grok the power of economical command lines, composability and extensibility, you're glad you didn't run back to the arms of the GUI on the first day. It was worth it. There is another belief that goes deeper, and it is the reason that after decades of existence and millions of newbie-suffering-hours, the learning curve has not become any easier, or gone away. That belief is: the learning curve has value, it is essential for learning, and it needs to be preserved, not whittled away in the name of "ease-of-use."" (Thanks to Jay Ashworth)

Comments (120 posted)

Microsoft bans free software from Windows Phone Marketplace (The H)

The H looks at the terms of Microsoft's Windows Phone marketplace and finds that it bans software released under various free software licenses. "The ban, in section 5.e of the terms, forbids any software which is subject to an "Excluded Licence"; it defines that in section 1.l as any licence which requires, as a condition of distribution, that the source code for the application be made available, or allow the creation of derivative works or redistribution at no charge. It specifically names GPLv3 licences and includes the General Public [License] (GPL) version 3, the GNU Affero GPL version 3, and the GNU Lesser GPL version 3 as examples of excluded licences." Microsoft's "open source friendly" stance only goes so far, it seems.

Comments (45 posted)

German Foreign Office drops Linux (The H)

The H provides some background on the German Foreign Office decision to migrate its desktop and notebook computers back to Windows. "How did Linux get into the German Foreign Office in the first place? In 2001, the authority began to set up a secure intranet to connect the more than 200 German embassies with their headquarters in Germany. At the time, the decision to build a VPN using free software was based on financial considerations. In 2004, the government authority began to introduce open source solutions on desktop computers, at first with OpenOffice, Firefox and Thunderbird under Windows. In 2005, Linux was introduced as the only operating system on mobile computers, and in a dual-boot configuration with Windows on desktop PCs. This decision was made for security reasons."

Comments (58 posted)

Best Practices in Open Source Foundation Governance - Part I (The Standards Blog)

Andy Updegrove looks at the importance of foundations for open source projects. "For some time now, I have been meaning to write a series of blog posts setting forth my views on best practices in forming and governing open source foundations. Why? Because despite the increasing reliance of just about every part of our modern world (government, finance, defense, and so on) on open source software (OSS) and Free and Open Source Software (FOSS), there has been very little written on the subject. That means that neither a community nor a corporation has much to refer to in creating the kind of governance structure most likely to ensure that the intentions of the founders are carried out, that the rights of contributors are respected, and that the code upon which end users will rely is properly maintained into the future."

Comments (none posted)

Education and Certification

LPI Exam Labs with FOSSFA/ict@innovation in South Africa

Linux Professional Institute (LPI) has announced promotional exam labs for their Linux Professional Institute Certification (LPIC) with the Free and Open Source Software Foundation for Africa (FOSSFA), to take place March 5, 2011 in Johannesburg, South Africa.

Full Story (comments: none)

Calls for Presentations

Linux Plumbers Conference looking for more track proposals

The Linux Plumbers Conference, which will be held in Santa Rosa, California September 7-9, is looking for more microconference track proposals. So far there are five proposals (init/boot, audio, filesystem/storage, mobile, and development tools) but several more are needed to round out the schedule. For more information on submitting a proposal, see the Participate page and the FAQ.

Full Story (comments: none)

OpenCms Days 2011 - Call for papers

OpenCms Days 2011 will take place in Cologne, Germany, May 9-10, 2011. "Session proposals are welcome. Each session will be 60 minutes long. A typical presentation session should be 45 minutes, followed by a 15 minutes discussion. Other session formats, such as workshops, roundtable discussions and BOF meetings, are also possible." The call for papers closes March 1.

Full Story (comments: none)

PyCon Australia 2011 - Call for Participation

The second PyCon AU will be held in Sydney, Australia, August 20-21, 2011. "We are looking for proposals for Talks on all aspects of Python programming from novice to advanced levels; applications and frameworks, or how you have been involved in introducing Python into your organisation. We're especially interested in short presentations that will teach conference-goers something new and useful." The deadline for proposal submission is May 2, 2011.

Full Story (comments: none)

Upcoming Events

Events: March 3, 2011 to May 2, 2011

The following event listing is taken from the LWN.net Calendar.

Date(s)             Event                                                                           Location
March 5             Open Source Days 2011 Community Edition                                         Copenhagen, Denmark
March 7-10          Drupalcon Chicago                                                               Chicago, IL, USA
March 9-11          ConFoo Conference                                                               Montreal, Canada
March 9-11          conf.kde.in 2011                                                                Bangalore, India
March 11-13         PyCon 2011                                                                      Atlanta, Georgia, USA
March 19            Open Source Conference Oita 2011                                                Oita, Japan
March 19            OpenStreetMap Foundation Japan Mappers Symposium                                Tokyo, Japan
March 19-20         Chemnitzer Linux-Tage                                                           Chemnitz, Germany
March 21-22         Embedded Technology Conference 2011                                             San Jose, Costa Rica
March 22-24         OMG Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems  Washington, DC, USA
March 22-24         UKUUG Spring 2011 Conference                                                    Leeds, UK
March 22-25         Frühjahrsfachgespräch                                                           Weimar, Germany
March 22-25         PgEast PostgreSQL Conference                                                    New York City, NY, USA
March 23-25         Palmetto Open Source Software Conference                                        Columbia, SC, USA
March 26            10. Augsburger Linux-Infotag 2011                                               Augsburg, Germany
March 28            Perth Linux User Group Quiz Night                                               Perth, Australia
March 28 - April 1  GNOME 3.0 Bangalore Hackfest | GNOME.ASIA SUMMIT 2011                           Bangalore, India
March 29-30         NASA Open Source Summit                                                         Mountain View, CA, USA
April 1-3           Flourish Conference 2011!                                                       Chicago, IL, USA
April 2             Texas Linux Fest 2011                                                           Austin, Texas, USA
April 2-3           Workshop on GCC Research Opportunities                                          Chamonix, France
April 4-5           Camp KDE 2011                                                                   San Francisco, CA, USA
April 4-6           SugarCon ’11                                                                    San Francisco, CA, USA
April 4-6           Selenium Conference                                                             San Francisco, CA, USA
April 6-8           5th Annual Linux Foundation Collaboration Summit                                San Francisco, CA, USA
April 8-9           Hack'n Rio                                                                      Rio de Janeiro, Brazil
April 9             Linuxwochen Österreich - Graz                                                   Graz, Austria
April 9             Festival Latinoamericano de Instalación de Software Libre
April 11-13         2011 Embedded Linux Conference                                                  San Francisco, CA, USA
April 11-14         O'Reilly MySQL Conference & Expo                                                Santa Clara, CA, USA
April 13-14         2011 Android Builders Summit                                                    San Francisco, CA, USA
April 16            Open Source Conference Kansai/Kobe 2011                                         Kobe, Japan
April 25-26         WebKit Contributors Meeting                                                     Cupertino, USA
April 26-29         OpenStack Conference and Design Summit                                          Santa Clara, CA, USA
April 28-29         Puppet Camp EU 2011: Amsterdam                                                  Amsterdam, Netherlands
April 29            Ottawa IPv6 Summit 2011                                                         Ottawa, Canada
April 29-30         Professional IT Community Conference 2011                                       New Brunswick, NJ, USA
April 30 - May 1    LinuxFest Northwest                                                             Bellingham, Washington, USA

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds