LWN.net Weekly Edition for March 8, 2012
Getting multiarch dpkg into Debian
"Multiarch" is Debian's name for its approach to supporting multiple architecture ABIs on a single system; LWN recently covered the multiarch work in some detail. Multiarch support is one of the goals for the upcoming "wheezy" release; since that release is getting closer to its freeze time, Debian developers naturally want to have all the necessary code merged into the distribution and ready for testing. A disagreement over the uploading of one of the crucial pieces of the multiarch puzzle recently threatened that goal and provided an interesting view of the higher levels of the Debian project's governance mechanisms in action.Clearly, a functioning multiarch implementation will require extensive support from the packaging mechanism. To that end, a number of developers have been working on extending the dpkg utility for some time; patch sets adding multiarch support to dpkg were posted as early as 2009. Much of this work has been supported by Canonical and Linaro, so it is not entirely surprising that it shipped first with the Ubuntu distribution; the 11.04 release included basic multiarch support. Until very recently, though, the necessary dpkg changes had not found their way into Debian - not even into the experimental repository.
A number of attempts have been made over time to push the dpkg changes into experimental. The holdup each time appears to have been the same: dpkg maintainer Guillem Jover blocked the push until he could do a full review of the relevant code. The problem, it seems, is that this review never happened; Guillem is a busy developer and has not been able to find the time to do that work. So the multiarch-compatible dpkg remained out of the repository; only those willing to build it from source were able to test the multiarch changes. Playing with fundamental packaging system changes is scary enough without having to build the tools, so this situation can only have led to less testing of multiarch support in Debian as a whole.
In October, 2011, Philipp Kern, representing the release team, expressed concern about the delay, saying:
Currently nobody can test multiarch with in-archive software. The multi arch patches did not even land in experimental, despite some pokes from fellow project members.
Philipp requested that the new dpkg find its way into experimental immediately, with a move to unstable two weeks later. But things did not happen that way; instead, Guillem blocked another attempt to upload the work into experimental. Debian project leader Stefano Zacchiroli responded with some concerns of his own:
Guillem did not respond to this message until March 3, 2012 - over four months after it was sent. Meanwhile, the dpkg work languished outside of the Debian repositories. In January, 2012, Cyril Brulebois pushed the new dpkg using the Debian non-maintainer update (NMU) process. An NMU can be a way to route around an unresponsive or distracted maintainer, but the process has its limits; in this case, Guillem simply reverted the NMU, leaving the situation as it was before.
At the beginning of February, Stefano apparently decided that the situation had gone on for too long. The powers of the project leader are famously limited, though; there was nothing Stefano could do on his own to force a solution to the problem. The Debian Technical Committee, however, is a different story; it is empowered by the Debian Constitution as the final decision maker in the case of otherwise unresolvable technical disputes. So Stefano referred the problem to the committee, asking that it decide whether one of the dpkg maintainers can block progress indefinitely in such a situation.
The committee did not need long to reach a decision; it concluded unanimously that Guillem's blocking of the dpkg upload should be overridden. Guillem was given a set of deadlines by which to get the code into experimental (then unstable); if those deadlines were not met, one of the other dpkg maintainers (Raphaël Hertzog) was given the power to do the upload instead. As it happens, Guillem met the deadline and Debian now has a multiarch-capable package manager. The decision has spawned another massive email thread - but, in Debian, that is business as usual. Multiarch in wheezy is back on track.
While the Technical Committee resolved this particular conflict, it did not (and probably could not hope to) resolve the larger question. Debian, like many free software projects, tries to make its decisions by consensus whenever possible. But the project also gives developers something close to absolute power over the packages that they maintain. Occasionally, those two principles will come into conflict, as was the case here. Resolving such conflicts will never be easy.
In this case, everybody was working toward the same goals. Even those who were most critical of Guillem's behavior stopped far short of suggesting that he was deliberately trying to delay the multiarch work. He was simply trying to ensure the highest level of technical excellence for a package he maintains - a package that is critical to the functioning of the distribution as a whole. The problem is that he ran afoul of a project that has been trying - with significant success - to bring a bit more predictability to the release process in recent years. Overriding him was probably necessary if the release goals were to be met, but it must still feel harsh to a developer who has put a lot of time into improving the project's core infrastructure.
A large project like Debian, in the end, will have occasional conflicts involving developers who, for whatever reason, are seen as holding up important changes. So there needs to be some sort of mechanism for avoiding and dealing with such blockages. Debian's approach, which includes having teams of developers (instead of a single person) working on important packages and the Technical Committee as a court of last resort, appears to work pretty well. The benevolent dictator model used by a number of other projects can also be effective, depending on the quality of the dictator. In the end, our community does not often have serious problems in this area; we are able to manage the interactions of thousands of dispersed developers with surprisingly little friction. But it's good to know that these conflicts can be resolved when they do arise.
The Perl 5 release process
LWN recently published an article, "The unstoppable Perl release train?", that raised some questions about the Perl 5 release process. In particular, the article questioned how the Perl 5 Porters handled a recent bug report from Tom Christiansen about Unicode support. This article aims to provide the reader with a better understanding of the current Perl release process, how it benefits Perl 5 users, and how it has helped invigorate Perl 5 development.
First, a disclaimer. I am decidedly pro-Perl. I am paid to program in Perl, and I've been doing so for over ten years now. I've contributed to the Perl core, mostly in the form of documentation, and I was also the release manager for the December 2011 development release of Perl, Perl 5.15.6.
Who are the Perl 5 Porters?
I'll be mentioning this group regularly throughout the article, so you should know who they are. The Perl 5 Porters are the volunteers who develop and maintain the Perl 5 language. They consist of anyone who cares enough about Perl 5 to participate in discussions on the perl5-porters email list. There's no formal membership, and influence is based on a combination of hacking skill and the ability to not be a jerk. The more you contribute and the more you communicate, the more control you have over the future of Perl 5. This should sound familiar if you know how other FOSS projects work.
While there are many people on the list, there are not that many people who actively contribute to the Perl core's C code. Currently, there are two paid developers, Nicholas Clark and Dave Mitchell, who are funded by grants from the Perl Foundation. In addition, there are usually another two to six people actively hacking on the C core at any time. Many more people contribute in the form of small patches, documentation, tests, and updates to core Perl modules.
Perl 5 and Perl 6
There's a reason this article's title specifically says Perl 5. Larry Wall launched the Perl 6 project over ten years ago. When it was first being discussed, many Perl users thought that Perl 6 would supplant Perl 5 in much the same way that Perl 5 supplanted Perl 4.
Since then, that expectation has changed. Perl 5 and Perl 6 are both being actively developed, and there's no expectation that Perl 6 will replace Perl 5 any time soon. We now say that both Perl 5 and Perl 6 are part of the Perl language family.
The two languages co-exist peacefully, and ideas cross-pollinate between the two communities regularly.
A brief history of Perl 5
Larry Wall released Perl 1 in December 1987. He released Perl 5.000 in October 1994. Perl 5 is the Perl most of us know and love (or hate) today. It introduced features like object-orientation, the ability to dynamically load extensions written in C (using the "XS" mechanism), modules, closures, and more.
After the Perl 5.000 release, other Porters started making releases of Perl. Larry was no longer the sole keeper of Perl, and Perl development adopted the notion of a "Pumpking". Initially, the Pumpking (there has not been a Pumpqueen yet, unfortunately) was the person responsible for shepherding the development of each stable release. However, Larry was still Benevolent Dictator for Life, and Perl development still followed Rules 1 and 2, which can be paraphrased as "Larry has the final say on Perl, and can contradict himself whenever he feels like it."
As Larry started spending more time on Perl 6 development, he became less involved with Perl 5. These days, his focus is almost exclusively on Perl 6. While Rules 1 and 2 still apply in theory, in practice Larry does not make decisions involving Perl 5. Instead, the Pumpking is in charge of the language. The first Pumpking to take on this responsibility was Jesse Vincent. Recently, Ricardo Signes took over the Pumpking position.
For a long time Perl's release schedule was "release it when it's done". There was no specific list of features for a given release, and it wasn't always clear what should block a release from going out. In some cases, major releases ended up coalescing around a big feature, like threads (5.6.0) or Unicode (5.6.0 and 5.8.0), and that drove the release. In other cases, releases were repeatedly delayed due to a vicious cycle of adding a feature, fixing its bugs, and then adding a new feature during the bug fixing stage.
The development process also failed to clearly distinguish between major and minor releases. There was a nearly two-year gap between the release of 5.10.0 and 5.10.1. In hindsight, the 5.10.1 release was not a minor release, and deserved to be called 5.12.0.
The current release process and support schedule
After the long gap between 5.10.0 and 5.10.1, many Porters wanted to streamline the development process. When Jesse Vincent took over as Pumpking for Perl 5.12.0, the Porters adopted a "timeboxed" release schedule. The theory is that in April of each year, the Pumpking will package up the code in the "blead" branch (Perl's name for "master") and ship it as the new major stable release.
Each major release is followed by a series of minor releases. Minor releases are minor! They include a small number of changes that are focused on fixing the most critical bugs. New features or incompatibilities are never introduced in a minor release.
Each major release series receives bug fixes for two years, and critical security fixes for three years. David Golden has a great write-up of this support policy on his blog.
All release numbers start with "5.". The second number represents the major version number. The last number is the minor release number. Even major version numbers indicate stable releases. Odd major versions indicate development releases (e.g. 5.11.x). Minor development releases are made each month for community review and testing.
Minor stable releases are always backward compatible with other minor releases in the same stable series, down to binary compatibility for XS (C-based) extensions. Major releases may break backward compatibility, but Perl 5 is conservative, and backward incompatibilities are not introduced lightly.
The first release to follow this schedule was Perl 5.12.0, released on April 12, 2010.
Aiming for stability
The release schedule includes three well-defined freeze points. The first is the "contentious" code freeze, which happens in December before the stable release. This means that any changes that don't have widespread agreement from the Porters at this time must wait until after the stable release to be merged. The next freeze point is for user-visible changes, such as API changes, new features, backward incompatibilities, etc. This freeze occurs in February. In March, a "full" code freeze goes into effect in order to finalize the April Release Candidate and subsequent stable release.
There's a lot of work that goes into making this possible. While the number of people with commit bits has expanded, there is much stronger pressure to do work in per-feature or per-bugfix branches. This lets blead stay stable enough for releases to actually happen on schedule.
The Perl release train is only somewhat unstoppable. It stops for release-blocking issues. Deciding whether an issue should block a release is more of an art than a science. Some questions that can help determine whether an issue is a blocker are:
- Is the issue a regression as compared to the last stable release of Perl?
- Does the issue represent an unintentional break in backward compatibility?
- Does the issue break a huge number of CPAN modules, or a few really important ones?
- Does the issue cause a major performance problem with commonly occurring code?
- Does the issue break the ability to install modules from CPAN, or the ability to install Perl itself?
- Is someone willing to commit to fixing this before the scheduled release?
The exact decision as to whether an issue is a blocker is made by the Pumpking in consultation with the Porters. Typically, a few months before the stable release, the Pumpking will ask the Porters to trawl through the Perl 5 issue tracker and nominate issues for blocker status. After discussion on the mailing list, the Pumpking comes up with a "final" list of issues. This list isn't really final. If someone found a major regression the day before the release, that could be declared a blocker.
So why weren't the bugs that Tom Christiansen found declared to be release blockers? This is answered by the first question above. These bugs exist in the current stable release of Perl. They are not regressions. Releasing Perl 5.16.0 with these bugs does not make Perl's Unicode support any worse than it already is. However, delaying Perl 5.16.0 does mean that users would not have access to many other useful bug fixes and features.
Timeboxed releases ensure that users can expect improvements and bug fixes on a predictable schedule, without the long gaps that were common in Perl's earlier history.
As for the security bug discussed in the same thread, a fix is being worked on. It's possible that the fix will be ready for the 5.16.0 release. If it's not ready in time for 5.16.0 then there will be a 5.16.1 release when the fix is ready, as well as 5.14.3 and 5.12.5 releases.
Perl 5 has come a long way in the last few years. Every month, a new development release comes out, shepherded by a different release manager. Once a year, a new major version is released. Releasing a new major version is no longer a months-long ordeal. When the new release schedule was first proposed, many of us wondered if such an ambitious plan could work. Now, it almost seems routine.
The life story of the XInput multitouch extension
The XInput multitouch extension provides for multitouch input events to be sent from the device to the appropriate window on the desktop. Multitouch events can then be used for gesture recognition, multi-user interactions, or multi-point interactions such as finger painting. While the general concepts behind delivering multitouch events through a window server are fairly well defined, there are many devils hiding in the details. Here, we provide a look into the development of the multitouch extension and many of the issues encountered along the way.
Motivations
For Henrik Rydberg, it began as an attempt to make the trackpad on his Apple MacBook work just as well on Ubuntu as it does on OS X. For Stéphane Chatty, it began as a research project to develop new user interface paradigms. For your author, it began as a quest to enable smooth scrolling from an Apple Magic Mouse.
Like many undertakings in open source, multitouch on the Linux desktop is the culmination of the many efforts of people with disparate goals. With the release of the X.org server 1.12, we now have a modern multitouch foundation for toolkits and applications to build upon.
The kernel
The beginning of multitouch support for Linux can be traced back to the 2.6.30 kernel. Henrik had just merged additions to the evdev input subsystem for multitouch along with support for the trackpads found in all Apple unibody MacBooks. Stéphane then added multitouch support to some existing Linux drivers, such as hid-ntrig, and some new drivers, such as hid-cando.
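To give a sense of what that interface looks like, here is a minimal sketch of a program reading multitouch events directly from an evdev device node; the device path is illustrative and error handling is mostly omitted:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/input.h>

int main(void)
{
    struct input_event ev;
    int fd = open("/dev/input/event5", O_RDONLY);  /* illustrative path */

    if (fd < 0)
        return 1;
    /* Each contact reports ABS_MT_* values, terminated by SYN_MT_REPORT. */
    while (read(fd, &ev, sizeof(ev)) == sizeof(ev)) {
        if (ev.type == EV_ABS && ev.code == ABS_MT_POSITION_X)
            printf("touch x: %d\n", ev.value);
        else if (ev.type == EV_SYN && ev.code == SYN_MT_REPORT)
            printf("-- end of contact --\n");
    }
    close(fd);
    return 0;
}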
Some developers started playing around with the new Linux multitouch support. Over time, specialized media and user-interface toolkits such as libavg and Kivy added multitouch support based on the evdev interface. However, there was a glaring issue: the absence of window-based event handling. Applications had to assume that they were full-screen, and all touch events were directed to them exclusively. This was a fair assumption for games, which were the main impetus for libavg touch support. However, it was clear we needed to develop a generic multitouch solution working through the X window system server.
The X.org server and the X gesture extension
Discussions began on how to incorporate touch events into the X input model shortly after kernel support was present. Initial work by Benjamin Tissoires and Carlos Garnacho extended XInput 2.0's new multi-device support for multitouch. Each time a touch began, a new pointing "device" would be created. Alternatively, a pool of pre-allocated touch "devices" could be used. However, this approach broke many assumptions about how devices and events should be handled. As a simple example, a "touch begin" event would appear to the client as though the pointer had moved to a new location. How would the client know that the previous touch hadn't simply moved, as opposed to a new touch starting? At this point Peter Hutterer, the X.org input subsystem maintainer, decided we needed completely new semantics for touch input through X.
Around the same time, Canonical was interested in adding multitouch interfaces to the Linux desktop. The uTouch team, of which your author is a member, was formed to develop a gesture system that could recognize and handle system-level and application-level gestures. Since X did not have touch support yet, the team focused on providing gestures through the X.org server using a server extension. The result was shipped in Ubuntu 10.10 and the extension was proposed for upstream X.org.
While many developers were enthusiastic about the potential for gesture support through the X.org server, it was not meant to be. X.org as a foundation holds backward compatibility in high regard. Applications written over 20 years ago should still function properly today, in theory. Though backward compatibility has benefits, it is a double-edged sword. Any new functionality must be thoroughly reviewed, and inclusion in one X.org release means inclusion in all future releases. Even to this day, gesture support is not a settled technology. It is highly probable that an X gesture extension created a year and a half ago would not be sufficient for use cases we are coming up with today, let alone potentially years from now. So the X developers are reluctant to include gesture support at this time.
XInput multitouch was born
Those concerns notwithstanding, the need for touch through the X server grew stronger. Peter and Daniel Stone developed a first draft of the XInput 2.1 protocol, which later became XInput 2.2, in which touches generate events separate from traditional pointer motion and button press events. Three event types ("touch begin," "update," and "end") were specified. However, the need to support system-level gestures added a requirement for a new method of event handling: the touch grab.
X11 input device grabs allow for one client to request exclusive access to a device under certain criteria. A client can request an active grab so that all events are sent to it. A client can also request a passive grab, where events are sent to it when a button on the mouse is pressed while the cursor is positioned over a window, or when a key is pressed on a keyboard while a window is focused. Passive grabs allow for raising a clicked window in a click-to-focus desktop environment, for example. When the user presses the mouse button over a lower window, the window manager receives the event first through a passive grab. It raises the window to the top and then replays the event so the application can receive the button press. However, X only allows a passively grabbing client to receive one event before it must decide whether to accept that event (and all subsequent events, up to a release event) or to ask the server to replay the event so that another client can receive it.
This mechanism has been adequate for decades, but doesn't quite work for system-level gestures. Imagine that the window manager wants to recognize a three-touch swipe. It is impossible to know if a three-touch swipe has been performed if the window manager can only view touch begin events; it must be able to receive the subsequent events to determine whether the user is performing a swipe or not. The idea behind touch grabs is that the grabbing client can receive all events until it makes a decision about whether to accept or reject a touch sequence. Now, the window manager can listen for all touches that begin around the same time and watch them as they move. If there are three touches and they all move in the same direction, the window manager recognizes a drag gesture and accepts the touch sequences. No one else will see the touch events. However, if the touches don't match for any reason, the window manager rejects the touch sequences so other clients, such as a finger painting application, can receive the events.
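As a rough sketch of how a window manager might use this mechanism with the XInput 2.2 client library (names and structure simplified; this is not taken from any real window manager):

#include <X11/Xlib.h>
#include <X11/extensions/XInput2.h>

/* Establish a passive touch grab on the root window so that all touch
   sequences are delivered to the window manager first. */
static void grab_touches(Display *dpy, Window root, int deviceid)
{
    unsigned char m[XIMaskLen(XI_LASTEVENT)] = { 0 };
    XIEventMask mask = { deviceid, sizeof(m), m };
    XIGrabModifiers mods = { XIAnyModifier, 0 };

    XISetMask(m, XI_TouchBegin);
    XISetMask(m, XI_TouchUpdate);
    XISetMask(m, XI_TouchEnd);
    XIGrabTouchBegin(dpy, deviceid, root, False, &mask, 1, &mods);
}

/* Once the gesture recognizer has made up its mind, accept the touch
   sequence (no other client sees it) or reject it (other clients, such
   as a finger painting application, receive the events instead). */
static void decide_touch(Display *dpy, int deviceid, unsigned int touchid,
                         Window root, int is_gesture)
{
    XIAllowTouchEvents(dpy, deviceid, touchid, root,
                       is_gesture ? XIAcceptTouch : XIRejectTouch);
}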
This works great for system-level gesture recognition. However, it necessarily imposes lag between a physical touch occurring and an application receiving the touch events if the system is attempting to recognize gestures. At the X Developer Summit 2010, your author presented an overview of the vision for an XInput multitouch-based uTouch gesture stack. One afternoon, while eating lunch and discussing things over beer, the issue of the potential for lag came up. Between those at the table, including Peter, Kristian Høgsberg, and myself, the solution was elusive. However, at some point later in the conference the issue came up again on IRC. Keith Packard made the suggestion that touch events be sent to all clients, even before they become the owner of touch sequences. With the idea at hand, your author scurried home and drafted up the beginning of what would later become ownership event handling.
As Nathan Willis explained in his overview of the XInput 2.2 protocol, a client may elect to receive events for a touch sequence before it becomes the owner of the sequence by requesting touch ownership events alongside touch begin, update, and end events. The client will receive touch events without delay, but must watch for notification of ownership. Once a touch ownership event is received for a sequence, the client owns the sequence and may process it as normal. Alternatively, if a preceding touch grab is accepted, the client will receive a touch end event for the touch sequence without ever receiving a touch ownership event. This mechanism allows for a client to perform any processing as touch events occur, but the client must take care to undo any state if the touch sequences are ultimately accepted by some other client instead.
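In client code, opting into this early delivery is a matter of adding the ownership event to the touch mask when selecting for events; a minimal sketch, assuming an open display and a target window:

#include <X11/Xlib.h>
#include <X11/extensions/XInput2.h>

static void select_touch_events(Display *dpy, Window win)
{
    unsigned char m[XIMaskLen(XI_LASTEVENT)] = { 0 };
    XIEventMask mask = { XIAllMasterDevices, sizeof(m), m };

    XISetMask(m, XI_TouchBegin);
    XISetMask(m, XI_TouchUpdate);
    XISetMask(m, XI_TouchEnd);
    /* Request delivery before ownership is resolved; a later
       XI_TouchOwnership event confirms that the touch is ours. */
    XISetMask(m, XI_TouchOwnership);

    XISelectEvents(dpy, win, &mask, 1);
}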
With the basic concepts hammered out, your author, with an initial base of work contributed by Daniel Stone, began a prototype implementation that shipped in Ubuntu 11.04 and 11.10. The uTouch gesture system based around XInput multitouch began to take form. This was enough to prove that the protocol was reasonably sound, and efforts began in earnest on an upstream implementation for the X.org server 1.12 release.
It is interesting to note how XInput multitouch compares to other window server touch handling. On one end of the spectrum are phones and tablets, which run most applications full screen. This, and the lack of support for indirect touch devices, e.g. touchpads, means mobile device window manager multitouch support is much simpler. On the other end of the spectrum are desktop operating systems. Windows 7 shipped with multitouch support, but only for touchscreens. For an unknown reason, Windows also only supports raw multitouch events or gesture events on a window, but not both. As an example of the consequences of this shortcoming, Qt had to build its own gesture recognition system so it could support both raw multitouch events and gestures at the same time. OS X only supports touchpads, but this simplification alone ensures that touches are only ever sent to one window at a time. The event propagation model they chose would not work for touchscreens. In comparison, the XInput multitouch implementation allows for system- and application-level gestures and raw multitouch events at the same time across both direct and indirect touch devices. In your author's biased opinion, this is a key advantage of Linux on the desktop.
A few bumps in the road
Although development of multitouch through X took more time than anyone wanted, it was shaping up well for the 1.12 X.org server release. Many complex issues, such as pointer emulation for touchscreens, were behind us. However, touchpad support had yet to be finalized. Two large issues surfaced involving scrolling and other traditional touchpad gestures.
The first issue involved the ability to scroll in two separate windows while leaving one finger on the touchpad at all times. Imagine there are two windows side by side. The user positions the cursor over one window and begins a two-touch scroll motion on the trackpad. The user then lifts one finger and uses the remaining finger to move the cursor over the second window. The second finger is then placed on the trackpad again, and a second scroll motion is performed. Under the XInput multitouch protocol, a touch sequence is locked to a window once it begins. If two-touch scrolling is performed through gesture recognition based on XInput touch events, the touch that began over the first window could not be used for a scroll gesture over the second window because the touch events would remain locked to the first. In order to resolve this difficulty, it was decided that, when only one touch is active on a touchpad, no touch events are sent. To avoid sending two events for one physical action, pointer motion was also prevented when more than one touch was present on a touchpad.
This fix resolved pointer motion, but other traditional touchpad gestures are even more problematic. Particularly troublesome is two-finger scrolling. When mice with scroll wheels were first introduced, they had discrete scroll intervals. The wheels often clicked up and down. This led to an unfortunate API design for scroll events in the X server. The X core protocol cannot send pointer events with arbitrary values, such as a scroll amount. To provide for scrolling through the X core protocol, buttons 4, 5, 6, and 7 were redefined from general-purpose buttons to scroll up, down, left, and right. When the user scrolls up using a scroll wheel, the X server sends a button 4 press event and then a button 4 release event. As an aside, this is the reason why we don't yet have smooth scrolling on the Linux desktop.
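From a client's point of view, wheel motion therefore arrives as ordinary button events; a core-protocol event handler looks something like the following sketch (the scroll_*() and handle_click() handlers are invented for illustration):

#include <X11/Xlib.h>

extern void scroll_up(void), scroll_down(void);
extern void scroll_left(void), scroll_right(void);
extern void handle_click(XButtonEvent *ev);

static void handle_button_press(XButtonEvent *ev)
{
    switch (ev->button) {
    case Button4: scroll_up();    break;  /* wheel up */
    case Button5: scroll_down();  break;  /* wheel down */
    case 6:       scroll_left();  break;  /* wheel left; no symbolic name in X.h */
    case 7:       scroll_right(); break;  /* wheel right */
    default:      handle_click(ev);       /* an actual button press */
    }
}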
The problem for multitouch lies in the possibility of sending two separate events for one physical action. This would occur if we sent touch events at the same time we sent scroll button events. It was decided that touch events may not be sent while the server is also sending other events derived from touches. This means that if the user enables two-finger scrolling, touch events are inhibited unless three touches are active on the touchpad. Likewise, if the user performs a two-finger tap to emit a right click, touch events are also inhibited unless three touches are active on the touchpad, and so on.
Many workarounds were considered, but nothing provided an air-tight solution. The double-edged sword of backward compatibility prevents X from supporting scroll events, click emulation, and touch events at the same time. Your author hopes this situation will end up hastening support for traditional trackpad gestures on the client side of X instead of the server side.
Wrapping up
The development of the multitouch extension finished with the release of the X.org server 1.12 on March 5, 2012. Many upcoming distribution releases, including Ubuntu 12.04 LTS, will be shipping it soon. Although this is the end of the X.org multitouch story, it is only the beginning for toolkits and applications. GTK+ recently merged an API for handling raw touch events for 3.4, and your author hopes to merge raw touch support for Qt in the near future. Next on the roadmap will be gesture support included in standard toolkit widgets and APIs for application developers. There is still plenty of work to do, but the will of those hoping to bring smooth scrolling support to the Apple Magic Mouse and many other multitouch features is quite strong.
Security
GitHub incident spawns Rails security debate
On March 4, a GitHub user attracted considerable attention with a controversial attempt to provoke the Rails project into changing an easily exploitable setting in Rails's default configuration. He did it by demonstrating the problem in the wild, granting himself commit privileges to the Rails master repository. Within a few hours, the hole was patched on GitHub and a fix deployed in Rails, but the debate rolls on about which parties are responsible, and about what other sites remain vulnerable.
Mass assignments
At the center of the trouble is Rails's ActiveRecord::Base#attributes= method, which is widely used in applications for updating all sorts of database records. The method accepts a set of fields (or attributes) to change, concatenated together using the straightforward &someAttribute=someNewValue syntax found in HTTP POST requests.
By default, Rails allows any attributes to be updated through this method, which means that attackers can change any and all attributes of their choosing, simply by appending extra assignments to unvalidated form input. For example, injecting &created_at=1955-11-05 into an HTTP request would overwrite the value of an attribute that should be off-limits, but the update sails through unimpeded. Rails does offer a whitelisting macro called attr_accessible and a blacklisting macro called attr_protected, with which developers can restrict access to critical attributes, but neither is active by default.
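To make the attack concrete: a request that is expected to update only a user's profile fields can simply carry extra parameters. A hypothetical example (the field names are invented for illustration):

POST /users/update HTTP/1.1
Content-Type: application/x-www-form-urlencoded

name=Mallory&email=mallory@example.com&created_at=1955-11-05&admin=true

Unless the application's model protects created_at and admin with attr_accessible or attr_protected, all four attributes are written to the database.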
This "mass assignment" situation has been regarded as a security weakness for years — it is discussed, among other places, in a 2008 article on Rail Spikes, which provides pointers to a mass assignment audit tool and advice for protecting one's applications — and points out that four of the five most popular Rails applications are vulnerable to mass assignment exploits.
On March 1, Egor Homakov opened issue 5228 against Rails (which is hosted at GitHub) calling attention to the problem, and looking for a fix that would force developers to use attr_accessible. The initial comment asks only for ideas:
It is after this initial bug report that opinion begins to diverge. The issue was closed and reopened quickly by a core Rails developer, one developer accused Homakov of trolling, and there was little in the way of in-depth discussion. Then Homakov opened issue 5239 against Rails on March 2, exploiting GitHub's own mass assignment vulnerability to overwrite the timestamp and peg the bug report as coming from the year 3012. In a comment, he apologized for the "inconvenience" and called the stunt "naughty testing" (though observing that he could use the exploit to close the issue himself).
Rails developer Xavier Noria then closed issue 5228 with the comment that "the consensus is the pros of the default configuration outweigh the pros of the alternative." Homakov protested that issue 5239 proved that the problem was widespread and that the Rails project should assume responsibility for offering secure defaults — at least by blacklisting certain obvious attributes, such as the creation-date overwritten in issue 5239. Noria replied that Homakov was "not discovering anything unknown, we already know this stuff and we like attr protection to work the way it is", and that "it is your responsibility to secure your application. It is your responsibility to avoid XSS, to ensure that the user is editing a resource that belongs to him, etc."
On March 4, Homakov re-used the mass assignment vulnerability to grant commit privileges to his account for the Rails repository, and used it to add a file named "hacked" to Rails's master, containing the line "another showcase of rails apps vulnerability."
Responses
That commit is what was widely reported as the "hacking" or compromise of GitHub. Within an hour or so, GitHub pushed out a fix to the vulnerability, suspended Homakov's GitHub account, and published a blog announcement that it had "started an investigation."

To many commenters on the blog post and on issue 5228, that seemed like an overreaction, since it seemed clear that Homakov had only attempted to call attention to the security hole, and had not taken any destructive action. Later in the day, GitHub reinstated Homakov's account (at that point referring to it as a temporary suspension that had been made "pending a full investigation"), after concluding that he did not act with malicious intent. The second blog post also said that Homakov had privately reported a security flaw to GitHub on March 2, and that GitHub "worked with him to fix it in a timely fashion" — in contrast to the March 4 commit, which the blog post characterized as "without responsible disclosure".
It is not clear what the March 2 vulnerability report was; GitHub user Max Bernstein said he had exchanged email with Homakov, who said he wrote to GitHub about the vulnerability and GitHub had not responded. Nevertheless, the GitHub ship was righted relatively quickly.
Almost as quickly, Michael Koziarski committed a fix to Rails that requires developers to explicitly whitelist all attributes with attr_accessible — fixing the original issue.
Finger-pointing
Despite the speedy technical resolution, the debate over the incident continued to rage — over whether Homakov had acted irresponsibly, whether the GitHub team had overreacted in suspending his account, and whether Rails or GitHub was ultimately at fault for the presence of the vulnerability in the first place.
It is the latter question that has the most far-reaching implications. After all, GitHub is offering a commercial service to the public, and has a responsibility to write secure code. But Rails, like any software framework, arguably exists to simplify the process of writing quality (in this case, secure) code. As GitHub user rainyday put it, "One of the reasons people use frameworks in the first place is because this type of thing is supposed to be done for you minimizing the chance of human error." Likewise, user Douwe Maan said "I'm disappointed that GitHub made such an obvious mistake" — but ultimately said Rails's defaults, which jeopardize many other sites, were to blame.

User Eric Mill commented on the GitHub-is-to-blame position, arguing that "The mechanism to secure Github is there without any code changes to Rails, which is how Github could fix it within minutes" — but also made the counterpoint, saying:
The "disconnect" between the Rails development team and Rails application authors was echoed by others. As a commenter on LWN's own March 5 news item about the incident said:
The whole affair reminded many of PHP's experience with register_globals. Although it was finally removed in PHP 5.4.0, the register_globals directive (enabled by default until PHP 4.2.0) allowed uninitialized variables to be injected into an application by many means — including HTML forms. The arguments for keeping it were that many existing applications expected it to work that way, and that anyone who was concerned about the security implications could switch it off at will.
Ultimately, PHP bowed to public concern over the security of register_globals in the wild. Thus, although Rails did close the vulnerability after Homakov's stunt, GitHub user Karl Baron expressed surprise that the lesson needed to be learned again, saying in the issue 5228 comments: "Nobody here sees the irony in Rails redoing what PHP was ridiculed for for so long? Never. inject. user. input. by. default."
Finally, although the mass assignment problem may have been fixed in Rails's master, and repaired on GitHub, those fixes do not mark the end of the issue. If nothing else, the publicity surrounding the event has raised awareness — but certainly not everyone who has heard the news is free of malicious intent, and there are still scores of Rails applications vulnerable to the attack. Case in point: Chris Acky posted his own analysis of the events at Posterous (another Rails-based service), and shortly thereafter, comments began appearing with hacked timestamps, including one from Homakov stamped eight years in the past — and two others, apparently from someone else altogether.
The fact that the mass assignment vulnerability is so widespread in the wild illustrates why opinions are mixed on whether Homakov's attention-grabbing stunt ought to be regarded as heroically daring or recklessly irresponsible. The publicity does not change the vulnerability of any site, but it greatly increases the likelihood of an attack. Yet it is also a clear reminder that web frameworks should provide sensible and secure defaults for their users — and that even popular frameworks like Rails have a responsibility to learn from the mistakes of others.
Brief items
Security quotes of the week
Github compromised
The Github repository site has been compromised; this posting contains a small amount of information. "At 9:53am Pacific Time this morning we rolled out a fix to the vulnerability and started an investigation into the impact of the attack. Database and log analysis have shown that the user compromised three accounts (rails and two others that appear to have been proofs of concept). All affected parties have been or will be contacted once we are certain of the findings." Anybody hosting a repository there should probably check its integrity just to be sure.
New vulnerabilities
apt: man-in-the-middle attack
Package(s): apt
CVE #(s): CVE-2012-0214
Created: March 6, 2012    Updated: March 7, 2012
Description: From the Ubuntu advisory: Simon Ruderich discovered that APT incorrectly handled repositories that use InRelease files. The default Ubuntu repositories do not use InRelease files, so this issue only affected third-party repositories. If a remote attacker were able to perform a man-in-the-middle attack, this flaw could potentially be used to install altered packages.
bugzilla: cross-site request forgery
Package(s): bugzilla
CVE #(s): CVE-2012-0453
Created: March 7, 2012    Updated: March 7, 2012
Description: Bugzilla does not properly validate form attributes passed to xmlrpc.cgi, enabling cross-site request forgery attacks.
file: crash from malformed CDF files
Package(s): file
Created: March 1, 2012    Updated: March 7, 2012
Description: From the Debian advisory: The file type identification tool, file, and its associated library, libmagic, do not properly process malformed files in the Composite Document File (CDF) format, leading to crashes.
gnash: information disclosure
Package(s): gnash
CVE #(s): CVE-2011-4328
Created: March 7, 2012    Updated: March 14, 2012
Description: The gnash flash player stores cookies in world-readable files with predictable names.
httpd: denial of service
Package(s): httpd
CVE #(s): CVE-2007-6750
Created: March 7, 2012    Updated: March 8, 2012
Description: The Apache HTTPD server is subject to denial-of-service attacks using partial requests.
imagemagick: code execution
Package(s): imagemagick
CVE #(s): CVE-2012-0247 CVE-2012-0248
Created: March 6, 2012    Updated: March 8, 2012
Description: From the Gentoo advisory: Two vulnerabilities have been found in ImageMagick.
kernel: denial of service
Package(s): linux-ti-omap4
CVE #(s): CVE-2011-3619
Created: March 6, 2012    Updated: March 7, 2012
Description: From the Ubuntu advisory: A flaw was discovered in the Linux kernel's AppArmor security interface when invalid information was written to it. An unprivileged local user could use this to cause a denial of service on the system.
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2012-1090
Created: March 7, 2012    Updated: June 1, 2012
Description: The CIFS filesystem can be made to leak open files; the resulting dentry reference count mismatch causes an oops when the filesystem is unmounted.
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2011-4594
Created: March 7, 2012    Updated: March 7, 2012
Description: The sendmmsg() system call accesses user-space memory improperly, enabling denial-of-service attacks.
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2012-0879
Created: March 7, 2012    Updated: April 3, 2012
Description: The clone() system call does not properly handle the CLONE_IO option, enabling denial-of-service attacks.
kernel: denial of service
Package(s): kernel
CVE #(s): CVE-2011-4621
Created: March 7, 2012    Updated: March 7, 2012
Description: In some situations, user-space processes can cause specific kernel threads to block.
libxml-atom-perl: unintended read access
Package(s): libxml-atom-perl
Created: March 5, 2012    Updated: March 7, 2012
Description: From the Debian advisory: It was discovered that the XML::Atom Perl module did not disable external entities when parsing XML from potentially untrusted sources. This may allow attackers to gain read access to otherwise protected [resources], depending on how the library is used.
libxslt: denial of service
Package(s): libxslt
CVE #(s): CVE-2011-3970
Created: March 1, 2012    Updated: October 4, 2012
Description: From the Mandriva advisory: libxslt allows remote attackers to cause a denial of service (out-of-bounds read) via unspecified vectors (CVE-2011-3970).
lightdm: permission bypass/denial of service
Package(s): lightdm
Created: March 5, 2012    Updated: March 7, 2012
Description: From the Ubuntu advisory: Austin Clements discovered that Light Display Manager incorrectly leaked file descriptors to child processes. A local attacker can use this to bypass intended permissions and write to the log file, cause a denial of service, or possibly have another unknown impact.
movabletype-opensource: multiple vulnerabilities
Package(s): movabletype-opensource
Created: March 5, 2012    Updated: March 7, 2012
Description: From the Debian advisory: Several vulnerabilities were discovered in Movable Type, a blogging system:
- Under certain circumstances, a user who has "Create Entries" or "Manage Blog" permissions may be able to read known files on the local file system.
- The file management system contains shell command injection vulnerabilities, the most serious of which may lead to arbitrary OS command execution by a user who has a permission to sign-in to the admin script and also has a permission to upload files.
- Session hijack and cross-site request forgery vulnerabilities exist in the commenting and the community script. A remote attacker could hijack the user session or could execute arbitrary script code on victim's browser under the certain circumstances.
- Templates which do not escape variable properly and mt-wizard.cgi contain cross-site scripting vulnerabilities.
python-sqlalchemy: SQL injection
Package(s): python-sqlalchemy
CVE #(s): CVE-2012-0805
Created: March 7, 2012    Updated: September 27, 2012
Description: The SQLAlchemy object-relational mapper does not properly sanitize offset and limit values, enabling SQL injection attacks.
spamdyke: arbitrary code execution
Package(s): spamdyke
CVE #(s): CVE-2012-0802
Created: March 6, 2012    Updated: March 7, 2012
Description: From the Gentoo advisory: Boundary errors related to the "snprintf()" and "vsnprintf()" functions in spamdyke could cause a buffer overflow. A remote attacker could possibly execute arbitrary code or cause a Denial of Service.
stunnel: code execution
Package(s): stunnel
CVE #(s): CVE-2011-2940
Created: March 1, 2012    Updated: March 7, 2012
Description: From the Gentoo advisory: An unspecified heap vulnerability was discovered in stunnel. The vulnerability may possibly be leveraged to perform remote code execution or a Denial of Service attack.
ubuntuone-couch: certificate validation flaw
Package(s): ubuntuone-couch
Created: March 1, 2012    Updated: March 7, 2012
Description: From the Ubuntu advisory: It was discovered that Ubuntu One Couch did not perform any server certificate validation when using HTTPS connections. If a remote attacker were able to perform a man-in-the-middle attack, this flaw could be exploited to alter or compromise confidential information.
uzbl: information disclosure
Package(s): uzbl
CVE #(s): CVE-2012-0843
Created: March 7, 2012    Updated: March 7, 2012
Description: The uzbl web browser stores cookies in a world-readable file.
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 3.3-rc6, released on March 3. Linus says that things are calming down: "In fact, it's been calm enough that this *might* be the last -rc, but we'll see how the upcoming week goes. If it stays calm (and hopefully even calms down some more), there doesn't seem to be any major reason to drag out the release cycle any more."
Stable updates: The 2.6.32.58 stable kernel update was released on March 4. It contains a long set of important fixes, as usual. Greg says: "This is the last 2.6.32 kernel I will be releasing. The 2.6.32 kernel is now in 'extended-longterm' maintenance, with no set release schedule from now on. I STRONGLY encourage any users of the 2.6.32 kernel series to move to the 3.0 series at this point in time." Willy Tarreau has reaffirmed that he will take over 2.6.32 maintenance - but at a slower pace.
Quotes of the week
Kernel development news
Two approaches to kernel memory usage accounting
The kernel's memory usage controller allows a system administrator to place limits on the amount of memory used by a given control group. It is a useful tool for systems where memory usage policies must be applied - often systems where virtualization or containers are being used - but it has one notable shortcoming: it only tracks user-space memory. The memory used by the kernel on behalf of a control group is not tracked. For some workloads, the amount of memory involved may be considerable; a control group that accesses large numbers of files, for example, will create a lot of entries in the kernel's directory entry ("dentry") cache. Without the ability to control this kind of memory use in the kernel, the memory controller remains a partial solution.

Given that, it should not be surprising that a patch set adding the ability to track and limit kernel memory use exists. What may be a little more surprising is the fact that two independent patch sets exist, each of which adds that feature in its own way. Both were posted for consideration in late February.
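As a reminder of the user-space side that this work extends, a memory limit is applied by writing to the controller's control files; a minimal example, assuming the memory controller is mounted in the usual location:

# mkdir /sys/fs/cgroup/memory/mygroup
# echo 512M > /sys/fs/cgroup/memory/mygroup/memory.limit_in_bytes
# echo $$ > /sys/fs/cgroup/memory/mygroup/tasks

The limit set above constrains user-space pages only; the patches described below aim to account for kernel-side allocations as well.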
The first set was posted by Glauber Costa, the author of the related per-cgroup TCP buffer limits controller. Glauber's patch works at the slab allocator level; only the SLUB allocator is supported at this time. With this approach, developers must explicitly mark a slab cache for usage tracking with this interface:
struct memcg_cache_struct {
    int index;
    struct kmem_cache *cache;   /* the slab cache to be tracked */
    /* Called when the owning control group exceeds its limit; should
       attempt to free memory allocated from this cache. */
    int (*shrink_fn)(struct shrinker *shrink, struct shrink_control *sc);
    struct shrinker shrink;
};

void register_memcg_cache(struct memcg_cache_struct *cache);
Once a slab cache has been passed to register_memcg_cache(), it is essentially split into an array of parallel caches, one belonging to each control group managed by the memory controller. With some added infrastructure, each of these per-cgroup slab caches is able to track how much memory has been allocated from it; this information can be used to cause allocations to fail should the control group's limits be exceeded. More usefully, the controller can, when limits are exceeded, call the shrink_fn() associated with the cache; that function's job is to find memory to free, bringing the control group back below its limit.
Glauber's patch set includes a sample implementation for the dentry cache. When a control group creates enough dentries to run past its limits, the shrinker function can clean some of them up. That may slow down processes in the affected control group, but it should prevent a dentry-intensive process from affecting processes in other control groups.
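What that might look like in practice is sketched below; the prune_dcache_memcg() helper and the initialization function are invented for the example, with only memcg_cache_struct and register_memcg_cache() coming from the posted interface:

/* Reclaim callback: called when the owning control group exceeds its
   limit; it should free dentries charged to that group. */
static int dentry_memcg_shrink(struct shrinker *shrink,
                               struct shrink_control *sc)
{
    return prune_dcache_memcg(sc->nr_to_scan);  /* hypothetical helper */
}

static struct memcg_cache_struct dentry_memcg = {
    .cache     = dentry_cache,      /* the existing dentry slab cache */
    .shrink_fn = dentry_memcg_shrink,
};

static int __init dentry_memcg_init(void)
{
    register_memcg_cache(&dentry_memcg);
    return 0;
}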
The second patch set comes from Suleiman Souhlal. Here, too, the slab allocator is the focus point for memory allocation tracking, but this patch works with the "slab" allocator instead of SLUB. One other significant difference with Suleiman's patch is that it tracks allocations from all caches, rather than just those explicitly marked for such tracking. There is a new __GFP_NOACCOUNT flag to explicitly prevent tracking but, as a whole, it's an opt-out system rather than opt-in. One might argue that, if tracking kernel memory usage is important, one should track all of it. But, as Suleiman acknowledges, the ability to track allocations from all caches "is also the main source of complexity in the patchset".
Under this scheme, slab caches operate as usual until an allocation is made from a specific cache while under the control of a specific cgroup. At that point, the cache is automatically split into per-cgroup caches without the intervention (or knowledge) of the caller. Of course, this splitting requires taking locks and allocating memory - activities that can have inconvenient results if the system is running in an atomic context at the time. In such situations, the splitting of the cache will be pushed off into a workqueue while the immediate allocation is satisfied from the pre-split cache. Much of the complexity in Suleiman's patch set comes from this magic splitting that works regardless of the calling context.
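In outline, the allocation path under this scheme behaves something like the sketch below; every function name here is invented, since the real patch set does this work inside the slab allocator itself:

static void *account_slab_alloc(struct kmem_cache *cachep, gfp_t flags)
{
    struct mem_cgroup *memcg = current_memcg();       /* hypothetical */
    struct kmem_cache *percg;

    if (!memcg || (flags & __GFP_NOACCOUNT))
        return kmem_cache_alloc(cachep, flags);

    percg = lookup_memcg_cache(cachep, memcg);        /* hypothetical */
    if (!percg) {
        /* Splitting takes locks and allocates memory, so it cannot
           happen in atomic context; queue the split on a workqueue
           and satisfy this allocation from the pre-split cache. */
        schedule_memcg_cache_split(cachep, memcg);    /* hypothetical */
        return kmem_cache_alloc(cachep, flags);
    }
    return kmem_cache_alloc(percg, flags);
}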
There is no shrinker interface in this patch set, though that is clearly planned for the future.
When a control group is deleted, both implementations shift the accounting up to the parent group. That operation, too, can involve some complexity; the processes that performed the allocation may, like their control group, be gone when the allocations are finally freed. Glauber's patch does no tracking for the root control group; as a result of that decision (and some careful programming), the cost of the kernel memory tracking feature is almost zero if it is not actually being used. Suleiman's patch does track usage for the root cgroup, but that behavior can be disabled with a kernel configuration option.
Neither patch appears to be ready for merging into the mainline prior to the 3.5 development cycle - and, probably, not even then. There are a lot of details to be worked out, the mechanism needs to work with both slab and SLUB (at least), and, somehow, the two patch sets need to turn into a single solution. The two developers are talking to each other and express interest in working together, but there will almost certainly need to be guidance from others before the two patches can be combined. If users of this feature feel that tracking allocations from all slab caches is important, then, clearly, whatever is merged will need to have that feature. If, instead, picking a few large users is sufficient, then a solution requiring the explicit marking of caches to be tracked will do. Thus far, there has not been a whole lot of input from people other than the two developers; until that happens, it will be hard to know which approach will win out in the end.
Statistics for the 3.3 development cycle
As of this writing, the 3.3 development cycle is at 3.3-rc6 and things are starting to look pretty stable. So it must be about time for our traditional summary of interesting statistics for the 3.3 kernel. It has been an active cycle, with some 10,350 changesets merged from just over 1,200 developers. Some 563,000 lines of code were added to the kernel, but 395,000 lines were removed, for a net growth of about 168,000 lines.

The most active developers this time around were:
Most active 3.3 developers
By changesets Mark Brown 281 2.7% Mauro Carvalho Chehab 271 2.6% Axel Lin 240 2.3% Al Viro 200 2.0% Tejun Heo 123 1.2% Tomi Valkeinen 101 1.0% Russell King 100 1.0% Matthew Wilcox 99 1.0% Ben Skeggs 94 0.9% Johannes Berg 93 0.9% Stanislaw Gruszka 93 0.9% Kuninori Morimoto 92 0.9% Eliad Peller 90 0.9% Takashi Iwai 88 0.9% Eric Dumazet 87 0.8% Dan Carpenter 86 0.8% Franky Lin 77 0.8% Kalle Valo 74 0.7% Lars-Peter Clausen 73 0.7% Artem Bityutskiy 68 0.7%
By changed lines

| Developer | Lines | Percent |
|---|---|---|
| Greg Kroah-Hartman | 88664 | 11.7% |
| Stanislaw Gruszka | 38012 | 5.0% |
| Mathieu Desnoyers | 25968 | 3.4% |
| Mauro Carvalho Chehab | 21063 | 2.8% |
| Alan Cox | 20948 | 2.8% |
| Kumar Gala | 12083 | 1.6% |
| Aurelien Jacquiot | 9998 | 1.3% |
| Mark Brown | 9208 | 1.2% |
| Evgeniy Polyakov | 7979 | 1.1% |
| David Daney | 7684 | 1.0% |
| Manuel Lauss | 7316 | 1.0% |
| Kuninori Morimoto | 7115 | 0.9% |
| Dmitry Kasatkin | 6880 | 0.9% |
| Jussi Kivilinna | 6861 | 0.9% |
| Ben Skeggs | 6699 | 0.9% |
| Axel Lin | 6251 | 0.8% |
| Jesse Gross | 5940 | 0.8% |
| Takashi Iwai | 5140 | 0.7% |
| Rob Clark | 4962 | 0.7% |
| Bart Van Assche | 4711 | 0.6% |
Mark Brown regularly appears in the list of top contributors; for 3.3, he contributed large numbers of patches in the sound and multi-function device subsystems. Mauro Carvalho Chehab is usually better known for routing vast numbers of Video4Linux2 changes into the kernel; this time, he wrote a substantial portion of those patches himself. Axel Lin's contributions were also in the sound subsystem. So, in other words, the top three contributors to the 3.3 kernel were all working with multimedia, which is, clearly, an area with a lot of development going on. Al Viro is not a media developer; his work was mostly cleaning up interfaces deep within the virtual filesystem layer. Tejun Heo continues to dig into code all over the kernel; this time around he fixed up the memblock allocator, made a number of process freezer improvements, reworked the CFQ block I/O scheduler, and made a number of control group changes.
In the lines-changed column, Greg Kroah-Hartman heads the list again; almost all of his changes were deletions, as a lot of code was removed from the staging tree this time around, often because it graduated to the mainline kernel. Stanislaw Gruszka made a lot of changes to the iwlegacy network driver. Mathieu Desnoyers made the list for having added the LTTng tracing subsystem; unfortunately, that code was subsequently removed and will not appear in the 3.3 release. Alan Cox made the top five for his work with the gma500 graphics driver and its move out of the staging tree.
Just over 200 companies have been identified as having supported contributions to the 3.3 kernel. The most active companies this time around were:
Most active 3.3 employers
By changesets

| Employer | Changesets | Percent |
|---|---|---|
| (None) | 1322 | 12.9% |
| Red Hat | 1290 | 12.6% |
| Intel | 897 | 8.8% |
| (Unknown) | 524 | 5.1% |
| Novell | 450 | 4.4% |
| Texas Instruments | 422 | 4.1% |
| IBM | 357 | 3.5% |
| Wolfson Microelectronics | 282 | 2.8% |
| Qualcomm | 249 | 2.4% |
| (Consultant) | 243 | 2.4% |
| MiTAC | 240 | 2.3% |
| Broadcom | 231 | 2.3% |
| Samsung | 216 | 2.1% |
| | 211 | 2.1% |
| Oracle | 183 | 1.8% |
| Freescale | 163 | 1.6% |
| Wizery Ltd. | 111 | 1.1% |
| Parallels | 108 | 1.1% |
| Renesas Electronics | 104 | 1.0% |
| (Academia) | 102 | 1.0% |
By lines changed

| Employer | Lines | Percent |
|---|---|---|
| Novell | 113910 | 15.1% |
| Red Hat | 111338 | 14.7% |
| (None) | 81133 | 10.7% |
| Intel | 68378 | 9.0% |
| Texas Instruments | 35696 | 4.7% |
| Samsung | 27220 | 3.6% |
| EfficiOS | 25990 | 3.4% |
| Freescale | 22266 | 2.9% |
| (Unknown) | 19307 | 2.6% |
| (Consultant) | 18529 | 2.5% |
| IBM | 16026 | 2.1% |
| Wolfson Microelectronics | 13688 | 1.8% |
| Qualcomm | 11736 | 1.6% |
| Broadcom | 11180 | 1.5% |
| Mellanox | 8856 | 1.2% |
| Cavium | 7903 | 1.0% |
| Renesas Electronics | 7574 | 1.0% |
| | 7135 | 0.9% |
| MiTAC | 6491 | 0.9% |
| Nicira Networks | 6004 | 0.8% |
This table has yielded few surprises in recent years; for the most part, the companies listed here remain the same from one cycle to the next. The continued growth in contributions from companies in the mobile and embedded areas is worth calling out, though. These companies are not just contributing support for their hardware; increasingly, they are also contributing to the core kernel and driving its evolution in the directions needed for their particular market. Once upon a time, it was common to hear that Linux kernel development was dominated by the needs of large enterprise deployments; few people make that claim now.
One other trend your editor has noted over time is a slow decline in the percentage of changes coming from people working on their own time. Here is a chart showing the numbers for all kernels since 2.6.25:
[Chart: percentage of changes from volunteer developers, by kernel release, 2.6.25 through 3.3]
The numbers are somewhat noisy, but the trend over the last four years suggests that volunteers are not contributing as much as they once were. It is unclear why that might be. One possibility is that the kernel has reached a point where there are few easy jobs left; the complexity of contemporary kernel development may be discouraging volunteers. Or it may simply be that anybody who demonstrates an ability to get code into the kernel tends not to remain a volunteer for long unless that is what they really want to be; all the rest end up getting hired. The truth may be a combination of both - or something else altogether.
Volunteer developers are important; they help tie the kernel to the wider community and some of them will become next year's professional developers and subsystem maintainers. A kernel that is unattractive to volunteers may find itself short of developers in the future. Thus far, there is nothing to suggest that any such developer shortage is happening; the 3.3 kernel, with 1,200 contributors, is as strong as any in that regard. That said, this trend is worth watching.
As a whole, though, the kernel remains a fast-paced and seemingly healthy project. The 3.3 release should happen sometime in mid-March, right on schedule. There is already a lot of interesting code lining up for merging in 3.4; expect to see another set of big numbers when the 3.4 version of this article appears in roughly 80 days' time.
The x86 NMI iret problem
Interrupts are a source of unpredictable concurrency that can cause no end of trouble for kernel developers. Even most kernel hackers, though, do not need to deal with non-maskable interrupts (NMIs), which bring some additional challenges of their own. Some shortcomings in the NMI implementation in x86 processors have long imposed limits on what can be done in NMI handlers. Recently, those limits have been lifted. This article describes the difficulties imposed by NMIs, covers why the related limitations were getting in the way, and discusses the solution in gory detail.
Normal interrupts
The CPU of a computer is a complex machine that appears to process instructions in the order they are laid out in memory. The hardware may reorder how instructions are actually fetched and executed but, from the point of view of the current processor, the CPU behaves as if it executes the instructions in the order the programmer placed them. When an event happens on an external device, such as a USB drive, a network card, or a timer, the device needs to notify the CPU that it must stop its current sequence of instructions and jump to another set of instructions to process the new event. This new sequence of instructions is called a handler, and the device uses an interrupt to notify the CPU.
If an interrupt arrives while the CPU is executing instructions that use data also used by the interrupt handler, the handler could corrupt the data that the CPU was in the middle of modifying. To prevent that from happening, the programmer disables interrupts for the duration of the critical section that touches the vulnerable data. With normal interrupts, then, the code running in the normal workflow of the CPU can be synchronized with the code in the interrupt handler simply by disabling the interrupt.
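As a concrete illustration, the usual kernel idiom takes a lock and disables interrupts on the local CPU in a single step; this is a minimal sketch, not drawn from any particular driver:

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(counter_lock);
static unsigned long counter;   /* data shared with an interrupt handler */

void update_counter(void)
{
        unsigned long flags;

        /* Disable local interrupts and take the lock; the interrupt
         * handler cannot run on this CPU (or acquire the lock from
         * another CPU) until the critical section is finished. */
        spin_lock_irqsave(&counter_lock, flags);
        counter++;
        spin_unlock_irqrestore(&counter_lock, flags);
}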
Non-maskable interrupts
There are some special interrupts that can trigger even when the CPU has interrupts disabled. These non-maskable interrupts are used by tools such as profilers and watchdogs. A profiler records information about where the CPU is spending its time; because the NMI ignores the interrupts-disabled state, the profiler can account for time spent with interrupts disabled, which a profiler based on normal interrupts could never report. Similarly, a watchdog needs to detect whether the kernel is stuck in a location where interrupts are disabled; a watchdog based on normal interrupts would never trigger in exactly those situations, making it useless.
As you can imagine, code that can trigger at any time requires special care when it is written. For one thing, it cannot take any locks that are used anywhere else (it can take locks that are used only in NMI context to synchronize NMIs across CPUs, but even that should be avoided if possible). Ideally, an NMI handler should be as simple as possible to prevent race conditions in code that is not expecting to be re-entered.
Although NMIs can trigger when interrupts are disabled, and even when the CPU is processing a normal interrupt, there is one specific time when an NMI will not trigger: when the CPU is processing another NMI. On most architectures, new NMIs must wait until the handler for the first NMI has completed, so NMI handlers do not need to worry about nesting; the Linux NMI handlers are written with this fact in mind.
The x86 NMI iret flaw
[Diagram: First NMI on x86_64]
On x86, as on other architectures, the CPU will not execute another NMI until the first is complete. The problem with the x86 architecture, with respect to NMIs, is that an NMI is considered complete when an iret instruction is executed. iret is the x86 instruction used to return from an interrupt or exception. When an interrupt or exception triggers, the hardware automatically pushes information onto the stack that allows the handler to return to what it interrupted, in the state that it was interrupted; the iret instruction uses that information to restore the state.
The flaw on x86 is that an NMI will be considered complete if an exception is taken during the NMI handler, because the exception will return with an iret. If the NMI handler triggers either a page fault or a breakpoint, the iret used to return from that exception will re-enable NMIs. The NMI handler will not be put back into the state it was in when the exception triggered; instead, it will be put back into a state that allows new NMIs to preempt the running NMI handler. If another NMI comes in, it will jump into code that is not designed for re-entrancy. Even worse, on x86_64, when an NMI triggers, the stack pointer is set to a fixed (per-CPU) address. If another NMI comes in before the first NMI handler is complete, the new NMI will write all over the preempted NMI's stack; the result is a very nasty crash on return to the original NMI handler. The NMI handler for i386 uses the current kernel stack, like normal interrupts do, and does not have this specific problem.
A common way to run into this flaw is to add a dump of all tasks' stacks to an NMI handler. To debug lockups, a kernel developer may put a show_state() call (which shows the state of all tasks, as sysrq-t does) into the NMI watchdog handler; when the watchdog detects that the system is locked up, show_state() prints the stack trace of every task. The reading of each task's stack is done carefully, because a stack frame may point to a bad memory area and trigger a page fault.
The kernel expects that a fault may happen there and handles it appropriately, but the page fault handler still executes an iret instruction, re-enabling NMIs. The printing of all the tasks may take some time, especially if the output is going over a serial port, making it highly likely that another NMI will trigger before the output is complete and crash the system. The poor developer will be left with a partial dump and no backtrace of all the tasks; there is a good chance that the task that caused the problem will not be displayed, and the developer will have to come up with another means to debug it.
Because of this x86 NMI iret flaw, NMI handlers must neither trigger a page fault nor hit a breakpoint. It may sound like page faults should be easy to avoid, but this restriction prevents NMI handlers from using memory allocated with vmalloc(), which maps memory into the kernel's virtual address space. The problem is that such memory is mapped into a given task's page tables only when it is first used; if an NMI handler touches the memory, and that happens to be the first time the current task (the one executing when the NMI arrived) has referenced it, a page fault will be triggered.
[Diagram: Nested NMI on x86_64]
As breakpoints also return with an iret, they must not be placed in NMI handlers either. This prevents kprobes, which are used by ftrace, perf, and several other tracing tools to insert dynamic tracepoints into the kernel, from being placed in NMI handlers. If a kprobe were added to a function called by an NMI handler, the iret executed by the breakpoint handler would re-enable NMIs and open the door to re-entrancy.
Why do we care?
For years NMIs were not allowed to take page faults or hit breakpoints; why do we want them to today? In July 2010, the issue came up on linux-kernel when Mathieu Desnoyers proposed solving the problem of using vmalloc() memory in NMIs. Desnoyers's solution was to make the page fault handler NMI-aware: on return from a page fault, the handler would check whether the fault was triggered in NMI context and, if so, not execute an iret but instead use a normal ret instruction. ret is the x86 instruction to return from a function; unlike iret, it only pops the return address off the stack and does not put the system back into its original state. In Desnoyers's solution, the state would be restored directly, with added instructions to get back to the NMI handler from the page fault without the need for an iret.
Linus Torvalds was not happy with this solution. NMIs, because they can happen anywhere, need special treatment unlike that found in other areas of the kernel; Torvalds did not want that treatment to spread to places like the page fault handler. He preferred to make the NMI code even more complex, as long as the complexity was contained there. NMIs are a special case anyway, and are not used in the normal operation of the kernel, whereas page faults are a crucial hot path that should not be encumbered with NMI handling.
The immediate solution was to change perf so that it did not need vmalloc() memory in its NMI handler. Of course, Desnoyers's goal was not just to fix perf, but to give LTTng the ability to use vmalloc() memory in an NMI handler. Handling page faults in NMI handlers is not the only reason to fix the x86 NMI iret problem, though; there is also a strong reason to allow NMI handlers to use breakpoints.
Removing stop machine
There are a few areas in the kernel that require the use of stop_machine(), which is one of the most intrusive acts that the kernel can do to the system. In short, a call to stop_machine() stops execution on all other CPUs so that the calling CPU has exclusive access to the entire system. For machines with thousands of CPUs, a single call to stop_machine() can introduce a very large latency. Currently one of the areas that uses stop_machine() is the runtime modification of code.
The Linux kernel has a history of using self-modifying code: code that changes itself at run time. For example, distributions do not like to ship more than one kernel, so self-modifying code is used to change the kernel at boot to optimize it for its environment. In the old days, distributions would ship one kernel for uniprocessor machines and another for multiprocessor machines, and likewise a paravirtualized kernel (one that can only run as a guest) alongside a kernel for real hardware. Because the maintenance burden of supporting multiple kernels is high, work has been done to modify the kernel at boot instead: if the kernel finds itself running on a uniprocessor machine, spinlocks and other multiprocessor synchronization instructions are changed into nops; if it is loaded as a guest in a paravirtualized environment, its low-level instructions are converted to use hypercalls.
Modifying code at boot time is not that difficult. The modifications are performed early on, before other processors are initialized and before other services are started; at this stage of the boot process, the system behaves just like a uniprocessor system. Changing the instruction text is simple, since there is no need to worry about flushing the caches of other processors.
Today, there are several utilities in the Linux kernel that modify the code after boot. These modifications can happen at any time, generally due to actions by the system's administrator. The ftrace function tracer can change the nops that are stubbed at the beginning of almost every function into a call to trace those functions. Netfilter, which is used by iptables, uses jump labels to enable and disable filtering of network packets. Tracepoints used by both perf and ftrace also use jump labels to keep the impact of tracepoints to a minimum when they are not enabled. Kprobes uses breakpoints to place dynamic tracepoints into the code, but when possible it will modify the code into a direct jump in order to optimize the probe.
Modifying code at run time takes much more care than modifying it at boot. On x86 and some other architectures, if code is modified on one CPU while it is being executed on another, the executing CPU can take a General Protection Fault (GPF), usually resulting in a system crash. The way around this is to call stop_machine() so that all CPUs stop what they are doing and a single CPU can modify the code as if on a uniprocessor. (Handling NMIs that happen on the stopped CPUs adds a little more complexity, but that is out of scope for this article.)
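For reference, the stop_machine() interface itself is small; here is a minimal sketch of how a code-patching path might use it, with apply_code_patch() as a hypothetical stand-in for the actual patching work:

#include <linux/stop_machine.h>

static int patch_fn(void *data)
{
        /* Runs while every other CPU spins: safe to modify text. */
        apply_code_patch(data);         /* hypothetical helper */
        return 0;
}

static void patch_with_stop_machine(void *patch)
{
        /* Halts all other CPUs for the duration of patch_fn(). */
        stop_machine(patch_fn, patch, NULL);
}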
Being able to modify code without stop_machine() is thus a very desirable goal. There happens to be a way to do the modification without requiring that the rest of the system stop what it is doing and wait for the modification to finish; that solution requires the use of breakpoints.
The way it works is to insert a breakpoint at the location to be changed. A breakpoint on x86 is only one byte, while the instruction being changed is usually five bytes: a jump to some location, or a five-byte nop. The breakpoint can therefore be substituted for the first byte of the instruction without disrupting the other CPUs; if another CPU executes that instruction, it will trigger the breakpoint, and the breakpoint handler will simply return to the next instruction, skipping the instruction that is in the process of being changed.
Ftrace nop
55 push %rbp
48 89 e5 mov %rsp,%rbp
0f 1f 44 00 00 nop (5 bytes)
65 48 8b 04 25 80 c8 mov %gs:0xc880,%rax
Add breakpoint
55 push %rbp
48 89 e5 mov %rsp,%rbp
cc 1f 44 00 00 <brk> nop
65 48 8b 04 25 80 c8 mov %gs:0xc880,%rax
After the insertion of the breakpoint, a sync of all CPUs is required in order to make sure that the breakpoint can be seen across the CPUs. To synchronize the CPUs, an interprocessor interrupt (IPI) with an empty handler is sent to all the other CPUs. The interrupt on a CPU will flush the instruction pipeline. When another CPU reads the breakpoint it will jump to the breakpoint handler without processing the other 4 bytes of the instruction that is about to be updated. The handler will set the instruction pointer to return to the instruction after the one being modified. This keeps the modification of the rest of the instruction out of the view of the other CPUs.
After all the other CPUs have had their pipelines flushed by the IPI, the rest of the instruction (the remaining four bytes) may be modified:
Replace end of instruction
55 push %rbp
48 89 e5 mov %rsp,%rbp
cc af 71 00 00 <brk> <mcount>
65 48 8b 04 25 80 c8 mov %gs:0xc880,%rax
Another sync is called across the CPUs. Then the breakpoint is removed and replaced with the first byte of the new instruction:
Remove breakpoint with new instruction
55 push %rbp
48 89 e5 mov %rsp,%rbp
e8 af 71 00 00 callq <mcount>
65 48 8b 04 25 80 c8 mov %gs:0xc880,%rax
This works because adding or removing a breakpoint does not cause a GPF on other CPUs. The syncs are required between each step because the other CPUs must always have a consistent view of the rest of the instruction. Since tracepoints and function tracing may change code that is executed within an NMI handler, breakpoints must be safe to hit in NMI context.
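Put together, the whole sequence looks roughly like the sketch below; patch_bytes() and sync_all_cpus() are hypothetical stand-ins for the kernel's actual text-poking and IPI machinery:

static const unsigned char brk = 0xcc;  /* int3: the one-byte breakpoint */

/* Illustrative sketch of the three-step patching sequence. */
static void live_patch_insn(unsigned char *addr,
                            const unsigned char *new_insn, size_t len)
{
        /* Step 1: arm a breakpoint on the first byte. Any CPU that
         * executes this address now traps, and the breakpoint handler
         * simply skips to the following instruction. */
        patch_bytes(addr, &brk, 1);
        sync_all_cpus();        /* empty-handler IPI flushes pipelines */

        /* Step 2: rewrite the tail of the instruction. Other CPUs
         * never see a half-written instruction; they trap on the
         * breakpoint and jump over the bytes being changed. */
        patch_bytes(addr + 1, new_insn + 1, len - 1);
        sync_all_cpus();

        /* Step 3: replace the breakpoint with the new first byte. */
        patch_bytes(addr, new_insn, 1);
        sync_all_cpus();
}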
Handling the x86 NMI iret flaw
Torvalds did not just reject Desnoyers's proposal and leave the problem unsolved; he came up with a solution of his own. Torvalds's solution was to create a per-CPU pointer to the NMI stack frame in use. When an NMI comes in, it checks the per-CPU NMI stack frame pointer. If that pointer is NULL, the NMI is not nested; it updates the pointer to hold its return stack frame and continues on to the NMI handler. This part of the NMI is never in danger of nesting, as no breakpoints or page faults can happen there. If, instead, the stack frame pointer is already set, the new NMI is a nested NMI: a previous NMI triggered an exception that returned with an iret, allowing another NMI to nest. The nested NMI then updates the data in the per-CPU NMI frame pointer such that the interrupted NMI will fault on its return.
Then, the nested NMI returns to the previous NMI without executing an iret, keeping the CPU in NMI context (and preventing new NMIs). When the first NMI returns, it triggers a fault. In hardware, if an NMI triggers while the CPU is still handling a previous NMI (before an iret is issued), a latch is set, to be released when the iret is issued; that causes the CPU to run the NMI handler again. In Torvalds's solution, the fault from the iret acts as a software version of that latch, with the fault handler re-running the NMI handler. The atomic nature of iret would also prevent races when returning from the first NMI.
Torvalds's solution seemed like the perfect workaround until an effort was made to implement it. Torvalds suggested having the nested NMI handler cause the preempted NMI handler to fault when it issued the iret, then have the fault handler for the iret repeat the NMI as it would only fault if a nested NMI had happened. Unfortunately, this is not the case.
The iret path of all exceptions, including NMIs, already has a fault handler. User-space applications can set their stack pointers to arbitrary values; as long as an application does not dereference its stack pointer, it will run fine. If an interrupt or NMI comes in, though, the iret that restores the stack pointer may fault because of the application's bad stack pointer. Thus iret already has a fault handler to deal with this case; entering that fault handler can be caused not only by nested NMIs but by other cases as well, and determining which case actually occurred is not a trivial task.
Another issue was that it required access to per-CPU data. This is a bit tricky from the NMI handler because of the way Linux implements per-CPU data on x86. That data is referenced by the %gs register. Because NMIs can trigger anywhere, it takes a bit of work to validate the %gs register. That would make for too much wasted effort just to know if the NMI is nested or not.
So, in coming up with a solution to the problem, it was best not to go with a faulting iret; other tricks are available to the NMI handler. Because the NMI stack is per-CPU, Peter Zijlstra suggested using a portion of that stack to hold a variable, essentially turning it into a poor man's per-CPU variable. When the first NMI comes in, it copies its interrupt stack frame (the information needed to return to the interrupted state) onto its stack, and not just once: it makes two copies. Then it sets the special variable on the stack, which indicates that an NMI handler is running. On return, it clears the variable and executes the iret to return to the state that it interrupted.
Now, if the NMI triggers a page fault or hits a breakpoint, the iret of the exception re-enables NMIs. If another NMI comes in after that happens, it first checks the special variable on the stack. If the variable is set, then this is definitely a nested NMI, and a jump is made to the code that handles nested NMIs. Astute readers will realize that this is not enough: what happens if the nested NMI triggers after the first NMI has cleared the variable, but before it has returned with the iret?
If such a nested NMI were to continue as if it were not nested, it would still corrupt the first NMI's stack, and the first NMI's iret would incorrectly read the nested NMI's stack frame. So, if the variable is not set, the saved stack pointer on the stack is examined; if that stack pointer is within the current stack (the per-CPU NMI stack), this too is considered a nested NMI. Checking the stack alone would not be enough, because there are cases where the NMI handler may change its stack pointer; both the special on-stack variable and the location of the interrupted stack must be examined before concluding that an NMI is nested.
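Condensed into C, the combined test might look like the sketch below. It is purely illustrative, since the real test is done in assembly on the NMI entry path; nmi_stack_base and NMI_STACK_SIZE stand in for the bounds of this CPU's NMI stack:

#include <linux/types.h>

#define NMI_STACK_SIZE 4096UL
static unsigned long nmi_stack_base;    /* start of this CPU's NMI stack */

static bool nmi_is_nested(unsigned long saved_sp, bool nmi_executing)
{
        /* The on-stack "NMI executing" variable is definitive. */
        if (nmi_executing)
                return true;

        /* Otherwise, this is a nested NMI only if the interrupted
         * stack pointer lies within the per-CPU NMI stack. */
        return saved_sp >= nmi_stack_base &&
               saved_sp < nmi_stack_base + NMI_STACK_SIZE;
}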
Processing a nested NMI
OK, so it is determined that the NMI that came in is nested; now what? To simulate the hardware's behavior, the first NMI must somehow be told to repeat the NMI handler. To perform this atomically, the return stack of the first NMI is updated to point into a trampoline. This is similar to what Torvalds proposed except that, instead of using the exception handling of a faulting iret, the information on the stack is updated so that the iret simply jumps to the code that handles a nested NMI.
Note that a nested NMI must not update the previous NMI if the previous NMI is executing in this trampoline area. The nested NMI examines the instruction pointer and, if it determines that it preempted a previous NMI that was on the trampoline, it simply returns without doing anything; the previous NMI is about to trigger another run of the NMI handler anyway. This is still similar to what the hardware does: if more than one NMI triggers while the CPU is processing an NMI, only one is repeated. The first NMI can only be on the trampoline if it was previously interrupted by a nested NMI, so a second NMI arriving while the first is on the trampoline may be discarded, just as the hardware would discard the second of two NMIs arriving while a previous NMI was being processed.
Remember that the initial NMI saved its stack frame twice, meaning that there are three copies of the interrupt stack frame. The first is the frame written by the hardware on entry to the NMI; if a nested NMI occurs, that frame will be overwritten by it. One copy is used to return from the NMI handler; if there is a nested NMI, it updates that copy so that the first NMI returns not to the place where it triggered, but to the trampoline that sets up the repeat of the NMI. In that case, the third copy is needed to restore the return information, so that the NMI can eventually return to the location where the first NMI triggered.
The trampoline sets the special variable on the stack again, to notify new incoming NMIs that an NMI is in progress, then jumps back to start the NMI handler again. This time, NMIs are still enabled, because the nested NMI returned with an iret rather than using the ret trick to keep the CPU in NMI context; doing otherwise would have made the code even more complex, and, now that there is a safe way to handle nested NMIs, there is no real reason to prevent new ones after a previous NMI has triggered. Future code may change things to keep NMI context when returning from a nested NMI to the first one.
The above craziness was the solution for x86_64; what about i386 (x86_32)? Since i386 does not have a separate per-CPU stack for NMIs, instead using the current kernel stack wherever it was interrupted, the solution is pretty straightforward and, even better, is handled in C code. On entry to the C NMI handler (called from assembly), a check is made against a per-CPU variable to see whether this NMI is nested. If it is, the variable is set to NMI_LATCHED and the handler returns; otherwise the variable is set to NMI_EXECUTING. On exit from the NMI, a cmpxchg() is used to atomically update the per-CPU variable from NMI_EXECUTING to NMI_NOT_RUNNING; if the update fails, a nested NMI came in, and the C handler must be repeated.
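A condensed sketch of that logic, with a hypothetical handle_nmi() standing in for the real handler body, might look like:

enum nmi_states { NMI_NOT_RUNNING, NMI_EXECUTING, NMI_LATCHED };
static DEFINE_PER_CPU(enum nmi_states, nmi_state);

void do_nmi(struct pt_regs *regs)
{
        if (this_cpu_read(nmi_state) != NMI_NOT_RUNNING) {
                /* Nested NMI: latch it; the first NMI will repeat. */
                this_cpu_write(nmi_state, NMI_LATCHED);
                return;
        }
        do {
                this_cpu_write(nmi_state, NMI_EXECUTING);
                handle_nmi(regs);       /* the real handler body */
                /* If a nested NMI latched while the handler ran, the
                 * cmpxchg fails and the handler is repeated. */
        } while (this_cpu_cmpxchg(nmi_state, NMI_EXECUTING,
                                  NMI_NOT_RUNNING) != NMI_EXECUTING);
}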
Conclusion
NMI handling has always been a bane of kernel development; NMIs are an anomaly that has cost many fine developers hours of sleep. Linus Torvalds is correct: it is best to keep the beast contained. This code may sound like a horrible hack but, because it is tightly contained within the NMI code itself, it is actually an elegant solution; if critical code throughout the kernel had to accommodate NMIs, the kernel would soon become an unmaintainable disaster. Keeping the NMI code in one place also makes clear where to look if a problem arises. All this work has been well worth the effort, as it has opened the door to the removal of yet another bane of the kernel source: stop_machine().
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Development tools
Device drivers
Documentation
Filesystems and block I/O
Memory management
Networking
Security-related
Virtualization and containers
Benchmarks and bugs
Miscellaneous
Page editor: Jonathan Corbet
Distributions
MINIX 3.2: A microkernel with NetBSD applications
After almost a year and a half of development, MINIX saw a new stable release on 2012's leap day. MINIX 3.2.0 has been updated with a lot of code from NetBSD, it now uses Clang as its default compiler, and the developers have migrated to Git for version control.
The MINIX operating system was originally developed by computer science professor Andrew Tanenbaum at Vrije Universiteit Amsterdam for educational use. He used the code as examples in his textbook "Operating Systems: Design and Implementation". MINIX has a microkernel architecture with a kernel of about 10,000 lines of code. The result is that device drivers and services like filesystems, memory management, process management, and so on are running as user-space processes. This resembles the approach taken by GNU Hurd.
These days, MINIX is not only an academic project; it is also focused on achieving high reliability for embedded systems through fault tolerance and self-healing techniques. It is distributed with a BSD-type license. A downside of that choice is that it's harder to find out how it is being used, so we don't know much about the use of MINIX outside academia. However, in a FOSDEM interview in 2010, Tanenbaum said: "I believe it was used in one embedded system for managing commercial parking garages."
NetBSD code
The most obvious theme in the release notes for MINIX 3.2.0 is the many mentions of NetBSD. Because the MINIX userland had grown outdated, many of its tools have been replaced by their counterparts from NetBSD. This project (a 2011 Google Summer of Code project carried out by Vivek Prakash) was also an opportunity to expand the MINIX userland by porting additional utilities from NetBSD.
As can be seen from Vivek's GSoC status report, many of the simpler tools just required some trivial changes to their Makefiles to port them to MINIX. However, some of the tools required libraries that were not present in MINIX, so Vivek had to port these too. Moreover, if a NetBSD tool lacked an important flag that its MINIX counterpart had, he had to port this missing functionality from the MINIX tool. Of course Vivek also submitted the changes upstream. As part of this userland porting effort, MINIX also migrated its /etc/passwd file to the NetBSD format.
Vivek's minix-userland Git repository shows the result of this porting effort. The /usr/src/commands directory contains the sources for the MINIX tools, while the tools that are ported from NetBSD go into /usr/src/bin, /usr/src/sbin, /usr/src/usr.bin, and /usr/src/usr.sbin, which are the corresponding locations in the NetBSD source tree. This way you can clearly see which tools are originally from the MINIX userland and which ones are ported from NetBSD. The plan is to track the development of the NetBSD stable releases for the ported tools. MINIX 3.2 also has adopted NetBSD's C library and boot loader.
Other features
MINIX now also has experimental support for SMP (symmetric multiprocessing), as well as multi-threading and Native Command Queuing (NCQ) support in the AHCI (Advanced Host Controller Interface) driver for SATA adapters. The virtual filesystem (VFS) server is now asynchronous and multi-threaded. The developers also added a /proc filesystem and a proper /etc/fstab file. Thanks to the integration of ext2 support, MINIX can now be installed on an ext2 filesystem. There is also FUSE support for user-space filesystems, implemented by Evgeniy Ivanov as part of the 2011 Google Summer of Code (GSoC).
Reliability also has been improved. When block device I/O fails in a filesystem, for instance when reading a file from a hard drive, MINIX will transparently retry. In the same way, it can transparently recover from the crash of a block device driver. These improvements are examples of the self-healing nature of MINIX. Moreover, servers and drivers now run as unprivileged users, which should further lessen the damage when something goes wrong. The kernel mediates whether particular servers and drivers can access the hardware.
Development
There are also a lot of changes for developers. The most visible is that the project's code base has been moved to Git. The project also has set up extensive documentation for developers about using Git and an explanation of the MINIX 3 git workflow.
LLVM's Clang frontend has been adopted as the default compiler for MINIX 3.2, although GCC is still supported by setting the environment variable CC=gcc. Currently Clang runs slower than GCC for building the MINIX source and its packages, but it reports more build warnings on the MINIX code base. The plan is to fix all potential bugs found by Clang's more extensive warnings.
The default executable format in MINIX is now ELF (Executable and Linkable Format). Debugging also has been improved: MINIX 3.2 supports GDB and core dumps, which was a GSoC project by Adriana Szekeres. Moreover, tracing support for block devices has been added.
Getting started
MINIX 3.2 is available as a CD image. The command line installer is spartan, but it does the job. Most of the automatically suggested answers to questions are fine, such as the network cards detected by MINIX, the use of DHCP, and so on. When the installation is finished, it's striking how fast MINIX boots. One of the reasons is that MINIX is a really bare-bones installation, which doesn't even include OpenSSH.
When the login prompt shows up, log in as root without a password. After that, you have to do some manual post-installation steps: set your root password and time zone, add users, and so on. You can update the package database with pkgin update and then install a package with pkgin install foobar; some of the first packages to install would probably be openssh, vim, and x11. Pkgin is an apt/yum-like front end to NetBSD's pkgsrc package management system.
There's a Getting started document on the website that guides you through all of this. The project also has a lot of documentation and a wiki, which has a User Guide, including pages about the installation, post-installation steps, and an introduction to X.
Developers are not left out in the cold either: there's documentation about the MINIX community, a Developers Guide, a list of who is working on what, and suggestions about how to contribute.
However, hardware support seems to be quite limited in MINIX. Currently, only IDE and SATA disks are supported, and there's no support for USB or FireWire peripherals. Moreover, only a limited number of Ethernet cards work. None of my systems had all the required hardware to be able to either install or run X, so I wasn't able to get a graphical desktop on MINIX. Even in a virtual machine in VirtualBox, I couldn't configure the X server of MINIX, although I followed the configuration tips for running MINIX in VirtualBox.
From academic exercise to general-purpose operating system
The migration to the NetBSD userland and C library is part of an ongoing effort to make MINIX more usable outside academia. This started with the release of MINIX 3.0 in 2005, which added X11 and over 400 common UNIX tools. In the previous release (3.1.8), the MINIX developers adopted NetBSD's package management system pkgsrc, which was implemented as a GSoC project in 2010. Thanks to pkgsrc, MINIX users potentially have access to over 8000 packages. Currently only 250 seem to be available as binary packages through pkgin, but you can build many more from source (although not all of the 8000 packages in pkgsrc will compile on MINIX).
The result of these ongoing efforts is that MINIX is much more of a general-purpose operating system than a few years ago. This should also make porting MINIX to other architectures easier. There have been efforts to port MINIX to PowerPC and ARM in the past, but these were not successful because the developers lost interest. However, currently the Vrije Universiteit Amsterdam is looking for a full-time programmer with embedded systems experience to port MINIX to ARM. So maybe in a few years we'll see MINIX on embedded systems that are currently the playing field of Linux and the BSDs.
Brief items
An alpha ARM full of Beefy Miracle
The Fedora project has announced the availability of the Fedora 17 alpha release for the ARMv7hl and ARMv5tel architectures. "It’s not for the faint of heart! If you don’t like to roll your sleeves up and get dirty, it’s time to back away and go and buy an iPad. While the ARM team is happy to help out with queries we don’t have the time to walk you through step by step. Google is your friend!" It's clearly an early-stage distribution, but it will be of interest for those wanting to experiment with Fedora in the ARM world.
Kubuntu 12.04 to be Supported for 5 Years
The Kubuntu project has announced that the 12.04 LTS version of the KDE Ubuntu flavor will be supported for five years. "Kubuntu has always been and always will be a community made project. The Kubuntu Council and community of developers would like to reaffirm their [commitment] to provide the same level of support for Kubuntu 12.04 as in previous releases, and to ensure that Canonical's staffing constraints will not affect the level and quality of support that Kubuntu offers to users. Our 11.10 release was also made without a staff member from Canonical and our future ones will be as well. The Kubuntu contributor community is dedicated to the project and will continue to support and release the latest KDE Software along with Kubuntu every six months."
Oracle Linux Release 5.8 released
Oracle Linux 5.8 is available for the x86 and x86_64 architectures. It comes with three kernel packages; Oracle's enterprise kernel boots by default. This update also includes lots of bug fixes and driver updates.
Ubuntu 12.04 LTS (Precise Pangolin) Beta 1 Released
The Ubuntu team has announced the first beta release of Ubuntu 12.04 LTS (Long-Term Support) Desktop, Server, Cloud, and Core products. Beta versions of Kubuntu, Edubuntu, Xubuntu, Lubuntu, and Ubuntu Studio are also available, as well as new images for armhf.
Distribution News
Debian GNU/Linux
bits from the DPL for February 2012
Debian Project Leader Stefano Zacchiroli shares a few bits about his doings for the month of February. Some highlights include the upload of multiarch-enabled dpkg to the archive, new virtual machines for the powerpc ports, Google Summer of Code, and several other topics.
Debian Project Leader Elections 2012
The 2012 Debian Project Leader (DPL) elections are underway. Nominations close March 10; campaigning will take place March 11-31, followed by a two-week voting period.
Make DebConf12 a success: donate or become a sponsor
DebConf 12 will take place in Managua, Nicaragua, in July. "At DebConf we try to bring together as many Debian contributors as possible, including those who could not afford to attend from their own resources. You can help make DebConf12 a success by your organisation becoming a sponsor, or by donating money as an individual."
Switzerland to host Debian Conference 2013
The DebConf committee has chosen a venue for DebConf13: the 2013 conference will take place on the shores of Lake Neuchâtel in Switzerland, most likely in the middle of August.
Fedora
FUDCon APAC 2012 will be in lovely Kuala Lumpur, Malaysia
FUDCon APAC 2012 will be in Kuala Lumpur, Malaysia. "Please stay tuned for further announcements clarifying the details of attendance, including dates/times/places, registration, as well as information for those seeking subsidies for attendance."
Newsletters and articles of interest
Distribution newsletters
- Debian Misc Developer News #29 (March 1)
- Debian Project News (March 5)
- DistroWatch Weekly, Issue 446 (March 5)
- Maemo Weekly News (March 5)
- Ubuntu Weekly Newsletter, Issue 255 (March 4)
Shuttleworth on the Ubuntu 12.04 desktop
Mark Shuttleworth declares victory for the 12.04 desktop and thanks Ubuntu users for sticking with the distribution through the transition. "For the first time with Ubuntu 12.04 LTS, real desktop user experience innovation is available on a full production-ready enterprise-certified free software platform, free of charge, well before it shows up in Windows or MacOS. It’s not ‘job done’ by any means, but it’s a milestone. Achieving that milestone has tested the courage and commitment of the Ubuntu community – we had to move from being followers and integrators, to being designers and shapers of the platform, together with upstreams who are excited to be part of that shift and passionate about bringing goodness to a wide audience."
CeBIT 2012: Knoppix 7.0 presented (The H)
The H covers the release of KNOPPIX 7.0, which was announced at CeBIT. "A majority of the changes, [Klaus] Knopper says, are under the hood. These include changes to the boot sequence that load, for example, the graphics, keyboard and mouse drivers, before those for webcams and other hardware. Other changes are said to improve the distribution's overall performance."
BackTrack 5 update expands security toolkit (The H)
The H looks at the latest update of Backtrack. "Based on a custom-built 3.2.6 Linux kernel with improved wireless support, BackTrack 5 R2 upgrades a number of the existing tools and adds more than 40 new tools. These include a "special BackTrack edition" of the open source Maltego intelligence and forensics application for data mining, version 4.2.0 of the Community Edition of the Metasploit exploit framework, an updated release of the Browser Exploitation Framework (BeEF) and version 3.0 of the Social-Engineer Toolkit (SET), a social-engineering penetration testing framework. Other new tools include the findmyhash Python script for cracking hashes using online services, Goofile CLI filetype search, LibHijack, used for injecting arbitrary code and shared objects into a process during runtime, and sucrack for cracking local user accounts."
Page editor: Rebecca Sobol
Development
CUPS 1.6 shaking up Linux printing
Developers of the CUPS printing system raised a few eyebrows when it was revealed in February that the impending 1.6 release would drop several features heavily used on Linux systems (and other platforms) in order for the project to focus more on Mac OS X, per the wishes of CUPS's corporate parent, Apple. The Linux community has already adapted to the 1.6 changes, however — in fact, the past year has been an active one for other open source printing projects as well.
Apple purchased the CUPS code base in 2007 from Easy Software Products and hired CUPS creator Michael Sweet, who still orchestrates the project. On a typical Linux box, CUPS is responsible for multiple pieces of the printing workflow. To a client machine, it provides uniform access to shared printers around the network, and submits print jobs to the user's choice of print queue.
On a machine operating as a print server, CUPS provides filter-chains used to convert the print job to a format suitable for final output on the device (including converting to PostScript or PDF, rasterization, and applying any transformations like 2-up layout), and it runs the backends for many printers — although it can also hand off the final job to another driver, such as one from Gutenprint or a proprietary offering supplied by a device vendor.
CUPS 1.6
The changes landing in CUPS 1.6 affect several points in that client-server workflow; public attention was drawn to them when Red Hat's Tim Waugh posted a summary to the Fedora-devel list in late January.
First, existing versions of CUPS allow client machines to browse for printers accessible on the network. In this system, printers announce their availability using short messages sent on UDP port 631. Mac OS X, however, uses DNS Service Discovery (DNS-SD) to locate network printers instead, a feature introduced with CUPS 1.3 in 2007. CUPS 1.6 will drop the UDP-based CUPS Browsing feature, and make DNS-SD the only method for "automatic" network printer discovery.
This causes several practical challenges for Linux and other non-Apple OSes. For starters, although CUPS already works with Bonjour (Apple's implementation of DNS-SD), the announcements it sends don't work with the Linux equivalent, Avahi. Since both the print server and the client must be running DNS-SD for browsing to work, this prevents Linux print servers from being discoverable by Apple clients, and vice-versa. Waugh has submitted patches to CUPS to enable Avahi support, but they have not yet been integrated.
But the second wrinkle is that reliance on DNS-SD for printer discovery will dictate that Avahi run on all print servers and clients, which amounts to a policy-changing decision for every distribution. This means a new package dependency, but as Waugh discussed in the comments on his blog, it will also mean an adjustment to the default firewall rules, which (at least for Fedora) are accustomed to blocking Avahi.
The second change arriving in 1.6 is the elimination of all CUPS filters that are not of interest to Apple. Obviously, were they to disappear, that would strand non-Apple users. Fortunately, the OpenPrinting project immediately announced that it would maintain the filter set as a separate cups-filters package (which is already available). The filter list includes filters for image-to-PDF, PDF-to-PDF, text-to-PDF, PDF-to-raster, PDF-to-IJS (Hewlett-Packard's InkJet Server format), and PDF-to-OPVP (OpenPrinting's vector format) conversion.
OpenPrinting developments: PDF, IPP, and CPD
OpenPrinting is a vendor- and distribution-neutral workgroup overseen by the Linux Foundation that provides software support and standards for printing on Linux systems. But the adoption of the cups-filters package is not simply an effort to archive valuable code — OpenPrinting is developing the filters as part of its effort to migrate away from PostScript as the standard format for print jobs, and toward PDF. As the project's site explains, the PDF format allows for easier post-processing, newer features like transparency and high bit-depth color, and a simpler printing pipeline (considering the popularity of PDF as a document format).
The major large-scale open source projects (GTK+, Qt, Mozilla, and LibreOffice) all support PDF print queues now, and PDF became the system-wide default in Ubuntu 11.10. But Apple's lack of interest in continuing the PDF-workflow filters led some to speculate that OS X and Linux may be on diverging roads, such that a fork of CUPS will eventually be required. Waugh commented that such a move had been considered, but that "for the time being it isn't beneficial to do that".
Till Kamppeter from Canonical (and currently a Linux Foundation Fellow) manages OpenPrinting, and sent his own email summary of CUPS 1.6's changes to the OpenPrinting printing-architecture list. In it, he cites another significant change, the deprecation of the PostScript Printer Description (PPD) file format.
In previous generations, PPDs served as driver interfaces for PostScript printers. Kamppeter initially said 1.6 would drop support for PPDs in an effort to shift to the IEEE Printer Working Group's IPP Everywhere model, but Sweet later corrected him: existing PPDs will continue to be supported, but new PPDs will not be added. Nevertheless, Sweet said that IPP Everywhere remains the long-term plan, that "the goal is to have IPP equivalents for what we currently provide in the PPD APIs", and thus "when we are able to pull the plug [applications] won't notice a thing".
Regardless of whether OS X and Linux CUPS workflows are diverging, OpenPrinting is also attempting to unify other key pieces of the standard printing job. Right now, the emphasis is on creating a common printing dialog (CPD) so that every application can present the same interface and set of print options to the user. The project has been in development for quite some time, but took steps forward in 2011, with a Google Summer of Code project to implement a color management interface, and the first all-Qt implementation.
Color management
The CPD is not the only Linux printing endeavor to tackle color management: both CUPS and Ghostscript successfully integrated support for reading and processing ICC color profiles during the printing process.
The CUPS support was implemented by Waugh, and uses the colord daemon (we covered colord in September 2011). Colord provides a D-Bus service for applications to look up the color profiles associated with hardware devices. CUPS registers the ICC profile of each available print queue with colord, and the print filters query colord for the appropriate profile when rasterizing a print job for its final output.
Ghostscript, the PostScript interpreter often used by CUPS as the rasterizing filter during that final step, has also improved its ICC color profile support in recent releases. Support for applying a profile when rasterizing a job (which is what makes the CUPS usage of Ghostscript mentioned above function as needed) was already in place, but the recent point releases have added additional capabilities. As Libre Graphics World explained, February's 9.05 added support for soft-proofing (i.e., simulated output), attaching individual profiles to embedded images, and device link profiles, which enable additional transformation options of interest mainly to professional print services.
Printing is one of those services that is easy to take for granted while everything runs according to plan. But, as the news of the CUPS 1.6 changes brought to the community's attention, the ease with which complacency can set in does not mean that no active development or new work is being done. In the past 12 months or so, the Linux printing toolchain has gained several new features (including color management and the ability to use PDF as the default format) that will offer an improved experience. Some of the other changes currently in the pipeline, such as DNS-SD and IPP Everywhere, may make some uncomfortable but, ultimately, modernizing a workflow always requires pushing forward, even when it feels disruptive.
Brief items
Quotes of the week
It's duct tape and bailing wire. And we love it for that.
If the app is useful enough, it might even get cleaned up. Or just more duct tape and bailing wire is applied, more likely. :-)
(Or else, if you want to play this game, there is PyPy's sandboxing, which is just an unpolished proof of concept so far. I can challenge anyone to attack it, and this time it includes attempts to consume too much time or memory, to crash the process in any other way than a clean "fatal error!" message, and more generally to exploit issues that are dismissed by pysandbox as irrelevant.)
PHP 5.4.0 released
The PHP 5.4.0 release is available. "This release is a major leap forward in the 5.x series, which includes a large number of new features and bug fixes." Those new features include traits, some array syntax improvements, a built-in web server, performance improvements, and more. See the changelog for lots of details.
Python 3.3.0 alpha 1
The first Python 3.3 alpha release is out and available for testing. Those who are curious about what is new in this release can find a lot of information in the What's new in Python 3.3 document. New features include a new, flexible internal representation for Unicode strings, a new yield from expression, the return of u'unicode' literals (to help with porting from Python 2), and a number of reworked modules.
Wine 1.4 released
Version 1.4 of the Wine Windows compatibility layer is out. "This release represents 20 months of development effort and over 16,000 individual changes. The main highlights are the new DIB graphics engine, a redesigned audio stack, and full support for bidirectional text and character shaping." There's a lot more; click below for a long list of new features and improvements.
Newsletters and articles
Development newsletters from the last week
- Caml Weekly News (March 6)
- GCC 4.8.0 status report (March 2)
- The LilyPond Report (March 4)
- Perl Weekly (March 5)
- PostgreSQL Weekly News (March 4)
An open-source robo-surgeon (Economist)
The Economist has an article about the "Raven," a Linux-based robotic surgeon designed to allow cooperative research and development. "Universities across America took delivery of the first brood of Ravens in February. At Harvard, Rob Howe and his team hope to use a Raven to operate on a beating heart, by automatically compensating for its motion. At the moment, heart surgery requires that the organ be stopped and then restarted. At the University of California, Los Angeles, meanwhile, Warren Grundfest is working on ways to give the robot a sense of touch that is communicated to the operator. Pieter Abbeel and Ken Goldberg at the University of California, Berkeley, will try teaching the robot to operate autonomously by mimicking surgeons."
Scheidler: Project Lumberjack to improve Linux logging
In his blog, syslog-ng developer Balázs Scheidler writes about the birth of Project Lumberjack, which is an effort to improve Linux logging. "In a lively discussion at the RedHat offices two weeks ago in Brno, a number of well respected individuals were discussing how logging in general, and Linux logging in particular could be improved. As you may have guessed I was invited because of syslog-ng, but representatives of other logging related projects were also in nice numbers: Steve Grubb (auditd), Lennart Poettering (systemd, journald), Rainer Gerhards (rsyslog), William Heinbockel (CEE, Mitre) and a number of nice people from the RedHat team."
Page editor: Jonathan Corbet
Announcements
Brief items
OIN Announces Broad Expansion of Linux System Definition
Open Invention Network (OIN) has announced that it has expanded and updated the Linux System technologies covered in its protective network of royalty-free patents. "Over 700 new software packages – including popular packages such as KVM, Git, OpenJDK, and WebKit – will now receive coverage. In addition, coverage for over 1,000 existing software packages has been updated."
New members for the Linux Foundation
The Linux Foundation has announced that Fluendo, Lineo Solutions, Mocana and NVIDIA have joined as members. "'NVIDIA is strongly committed to enabling world-class experiences and innovation with our GPU and mobile products. Membership in The Linux Foundation will accelerate our collaboration with the organizations and individuals instrumental in shaping the future of Linux, enabling a great experience for users and developers of Linux,' said Scott Pritchett, VP of Linux Platform Software at NVIDIA."
Articles of interest
FSFE Newsletter - March 2012
The March edition of the Free Software Foundation Europe Newsletter covers Free Your Android press coverage, a report from "I love Free Software day", an FSFE FOSDEM report, and more.
Raspberry Pi interview: Eben Upton reveals all (Linux User)
Linux User has an interview with Raspberry Pi co-founder Eben Upton. In it he talks about the design of the low-cost ARMv6-based board, the non-profit status of the company, and its competition. "It’s not entirely clear to me why the Beagleboard is so expensive. Somebody in that Beagleboard value chain has got to be making a pile of money – I mean, $175 for a Pandaboard or $100 for a Beagleboard? Somebody’s got to be amassing a pile of cash there, because that’s a $10 chip in that device. I don’t know why they’re so expensive. Raspberry Pi, in terms of multimedia, outperforms any other dev board in existence – which is nice. [...] In terms of general purpose computing, it’s got this 700MHz ARM11, and our benchmark shows it’s about 20 per cent slower than a Beagleboard for general purpose computing. But, you know, it’s a quarter of the price – somewhere between a sixth and a quarter of the price – so yeah, I expect that our first customers are going to be Beagleboard-type customers."
Education and Certification
LPI announces Linux Essentials Program
The Linux Professional Institute (LPI) has announced "Linux Essentials," a new program measuring foundation knowledge in Linux and Open Source Software. "Targeted at new technology users, the "Linux Essentials" program is set to be adopted by schools, educational authorities, training centers and others commencing June 2012."
LPI announces Linux Training Program with the International Telecommunication Union in the League of Arab States
The Linux Professional Institute (LPI) has announced a certification and training project in partnership with the International Telecommunication Union (ITU) throughout the 22 countries in the League of Arab States. "The three year ITU project, called "Establishment of Training Centres in Linux Curricula and Certification", will establish 132 Linux "train-the-trainer" centers on all three levels of Linux Professional Institute Certification."
Calls for Presentations
Call for music -- LAC 2012 video-trailer soundtrack
Linux Audio Conference (LAC) 2012 will take place April 12-15 in Stanford, California. This year the conference video recordings will be prefixed by a short trailer video, and the organizers have opened a call for music to accompany it. Submissions are due by April 1.
Flossie 2012 CfP
Flossie 2012 is a free, two-day event for women who work with, or are interested in, Software Libre/FOSS in Open Data, Knowledge, Digital Arts, and Education. The conference takes place in London, UK on May 25-26, 2012. The call for papers is open until March 12.
EuroPython 2012: Call for Proposals
EuroPython 2012 will take place July 2-8 in Florence, Italy. The call for proposals is open until March 18. "We're looking for proposals on every aspect of Python: programming from novice to advanced levels, applications and frameworks, or how you have been involved in introducing Python into your organisation."
LSM embedded and open hardware track 2012 CfP
The 2012 Libre Software Meeting (LSM) will take place in Geneva, Switzerland July 7-12. The call for presentations for the "Embedded Systems and Open Hardware" session is open until March 31.
Upcoming Events
Events: March 8, 2012 to May 7, 2012
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| March 6–10 | CeBIT 2012 | Hannover, Germany |
| March 7–15 | PyCon 2012 | Santa Clara, CA, USA |
| March 10–11 | Open Source Days 2012 | Copenhagen, Denmark |
| March 10–11 | Debian BSP in Perth | Perth, Australia |
| March 16–17 | Clojure/West | San Jose, CA, USA |
| March 17–18 | Chemnitz Linux Days | Chemnitz, Germany |
| March 23–24 | Cascadia IT Conference (LOPSA regional conference) | Seattle, WA, USA |
| March 24–25 | LibrePlanet 2012 | Boston, MA, USA |
| March 26–April 1 | Wireless Battle of the Mesh (V5) | Athens, Greece |
| March 26–29 | EclipseCon 2012 | Washington D.C., USA |
| March 28–29 | Palmetto Open Source Software Conference 2012 | Columbia, SC, USA |
| March 28 | PGDay Austin 2012 | Austin, TX, USA |
| March 29 | Program your own open source system-on-a-chip (OpenRISC) | London, UK |
| March 30 | PGDay DC 2012 | Sterling, VA, USA |
| April 2 | PGDay NYC 2012 | New York, NY, USA |
| April 3–5 | LF Collaboration Summit | San Francisco, CA, USA |
| April 5–6 | Android Open | San Francisco, CA, USA |
| April 10–12 | Percona Live: MySQL Conference and Expo 2012 | Santa Clara, CA, USA |
| April 12–15 | Linux Audio Conference 2012 | Stanford, CA, USA |
| April 12–19 | SuperCollider Symposium | London, UK |
| April 12–13 | European LLVM Conference | London, UK |
| April 13 | Drizzle Day | Santa Clara, CA, USA |
| April 16–18 | OpenStack "Folsom" Design Summit | San Francisco, CA, USA |
| April 17–19 | Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems | Paris, France |
| April 19–20 | OpenStack Conference | San Francisco, CA, USA |
| April 21 | International Openmobility Conference 2012 | Prague, Czech Republic |
| April 23–25 | Lustre User Group | Austin, TX, USA |
| April 25–28 | Evergreen International Conference 2012 | Indianapolis, IN, USA |
| April 27–29 | Penguicon | Dearborn, MI, USA |
| April 28 | Linuxdays Graz 2012 | Graz, Austria |
| April 28–29 | LinuxFest Northwest 2012 | Bellingham, WA, USA |
| May 2–5 | Libre Graphics Meeting 2012 | Vienna, Austria |
| May 3–5 | Utah Open Source Conference | Orem, UT, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
