
LWN.net Weekly Edition for April 28, 2011

ABS: The guts of Android

By Jake Edge
April 27, 2011

In a fairly fast-paced talk, Karim Yaghmour presented the internals of Android systems at the Android Builders Summit. The talk focused on things like Android's application development model, the parts and pieces that make it up, and its startup sequence. It gave an overall picture of a system that is both familiar and not.

[Karim Yaghmour] Yaghmour is the lead author of Building Embedded Linux Systems and has done Linux kernel development along the way. He developed the original Linux Trace Toolkit (LTT) in 1999, which has since been taken over by another École Polytechnique de Montréal graduate, Mathieu Desnoyers, as LTT next generation (LTTng). Desnoyers is "doing a much better job" with it than he did, he said with a chuckle. He also developed relayfs, which is an efficient way to relay large amounts of data from kernel to user space. He is now doing Android development and training.

Android internals

With a slide showing Kirk and Spock from the original Star Trek, and the line "it's Linux, Jim, but not as we know it", Yaghmour pointed out that Android is a "strange beast" that "looks weird, feels weird, and acts weird". That's because it doesn't have the traditional Linux user space, but instead has its own user space that sits atop a somewhat non-standard Linux kernel.

For example, there is no "entry point" to an application for Android. Developers create components that get bundled together as applications. An application consists of multiple components, some of which may be shared with other applications. Each application is hosted in a Linux process. Any of those components can disappear while the system is running because its application process goes away. That can be because the application is no longer needed or because of memory pressure from other applications being loaded. That means that components need to implement "lifecycle management". Essentially, components need to be able to come back again the way they were before being killed.

Android also uses messages called "intents" that are sent between components. Yaghmour said they are like "polymorphic Unix signals" that can be sent to a specific component or service, but can also be broadcast. Applications can register their interest in various intents by specifying Intent Filters in their manifest files.
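Intent filters are declared in XML in each application's manifest. A hypothetical fragment (the activity name and filter details here are illustrative, not taken from the talk) might look like:

```xml
<!-- AndroidManifest.xml fragment: this activity declares interest in
     VIEW intents carrying http URLs -->
<activity android:name=".MapActivity">
    <intent-filter>
        <action android:name="android.intent.action.VIEW" />
        <category android:name="android.intent.category.DEFAULT" />
        <data android:scheme="http" />
    </intent-filter>
</activity>
```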

Remote procedure calls (RPCs) and inter-process communication (IPC) are done using the "Binder" in Android because "System V IPC sucks", at least according to comments in the Android code. The Binder allows components to talk to services and services to talk to each other. The Binder is not used directly, however; instead, interfaces are defined using an interface definition language (IDL). Those IDL definitions are fed to a tool that generates Java code to handle the communication.
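Android's IDL is called AIDL, and it looks much like a Java interface declaration; the aidl tool turns it into Java proxy and stub classes. A minimal, hypothetical example (the interface and method names are made up for illustration):

```
// IRemoteService.aidl
interface IRemoteService {
    // The aidl tool generates the Java code that marshals this
    // call through the Binder driver to the service process.
    int getPid();
}
```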

The development environment for Android is "fantastic", Yaghmour said. The Android SDK provides everything that is needed to create applications. The problems come when trying to develop a device that runs Android, he said, because the "glue that allows all these APIs to talk to the kernel is not documented anywhere". For a "normal" embedded Linux system, you generally just need the kernel, a C library, and BusyBox, which is generally enough to allow you to build any custom applications, but for Android, the picture is much more complex.

It is still the Linux kernel at the bottom, but that is about the only thing it shares with a normal Linux system. The kernel has numerous patches applied to it for things like wakelocks and the Binder, but it is recognizably Linux. Above that, things start to change. There are a number of libraries available, some of which appear in other systems (Linux, BSD, etc.), like WebKit and SQLite, but some are Android-specific, like the libc replacement, Bionic.

Android has its own init, based neither on System V init nor on BusyBox's; that is unsurprising, since Android does not use BusyBox at all. In its place, Android has something called Toolbox that fills the same role, but not as well. Yaghmour said that the first thing he does on an Android system is to replace Toolbox with BusyBox. It was a political decision not to use BusyBox, rather than a technical one, he said. There are also various libraries to support hardware like audio devices, cameras, GPS devices, and so on, all of which are implemented in C or C++.

Android uses the Java Native Interface (JNI) to talk to any of that lower-level code from the Java-based code that makes up (most of) the rest of user space. The Dalvik virtual machine uses JNI to call those libraries. The system classes (in the android.* namespace), as well as the Apache Harmony-based standard Java classes (in java.*), sit atop Dalvik, as does the all-important System Server. Above those are the stock Android applications along with any other applications installed by the user (from the Market or elsewhere).

Android replaces the Java virtual machine with Dalvik, and the JDK with classes from Apache Harmony. To create .dex files for Dalvik, the Java source is first compiled by the standard Java tools into .class files, which are then post-processed by dx to produce the files used by Dalvik. One interesting thing noted by Yaghmour is that .dex files are typically half the size of the equivalent .class files.

The layout of the native Android user space is very different from that of standard Linux as well. There is no /bin or /etc, which nearly every standard Linux tool expects to find. The two main directories in Android are /data (for user data) and /system (for the core system components). But some of the expected directories are present, like /dev and /proc.

Android startup

After that relatively high-level overview of the Android system, Yaghmour looked at the Android startup sequence, starting with the bootloader. That bootloader implements a protocol called "fastboot" that is used to control the boot process over USB using a tool of the same name on a host system. The bootloader contains code to copy various portions of the code around on the flash (for returning to a stock image for example), and allows users to boot from a special partition that contains a recovery program (via a magic key sequence at boot time). Some of these are features that might make their way into U-Boot or other bootloaders, he said.

The flash layout of a typical device has multiple partitions to support Android, including "boot" (which is where the kernel resides), "system", "userdata", and "cache" (the latter three corresponding to the mounted /system, /data, and /cache filesystems on a running Android system). Yaghmour noted that Android does not follow the Filesystem Hierarchy Standard (FHS) with its filesystems, but it also doesn't conflict with that standard, which allows folks to install FHS filesystems alongside Android's.

Newer Android releases are using the ext4 filesystem, rather than the yaffs2 filesystems used on earlier devices. That's because the latest devices are not using a classic NAND flash chip, and instead are using something that looks like an SD card to the processor. The kernel treats it like a block device. Because the newer devices have multicore processors, Google wanted a filesystem that is optimized for SMP, he said, thus the move from yaffs2 to ext4.

Once the kernel has booted, it "starts one thing and one thing only" and that is the init process. Init parses /init.rc and runs what it finds there. Typically that means it creates mount points and mounts filesystems, starts a low-memory handler that is specific to Android (and runs before the kernel's out-of-memory (OOM) killer), starts a bunch of services (including the servicemanager which manages Binder contexts), and starts the "root" of the Java process tree, Zygote (aka app_process).
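An abridged, paraphrased /init.rc fragment in the spirit of what Android's init parses (simplified and not copied from any particular release) looks something like this:

```
# create mount points and mount the core filesystems
on init
    mkdir /system
    mkdir /data 0771 system system
on fs
    mount yaffs2 mtd@system /system ro

# the Binder context manager
service servicemanager /system/bin/servicemanager
    user system
    critical

# the root of the Java process tree
service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    socket zygote stream 666
```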

Zygote is the parent process of all application processes in the system. It "warms up the cache of classes" so that applications start quickly, and starts the System Service. The System Service is a key part of the Android system, but one that is not very well documented. "Anything that is important that is running in the system is housed inside the System Service", Yaghmour said. That includes services for various hardware devices (battery, lights, vibrator, audio, sensors, etc.), as well as managers for things like alarms, notifications, activities, and windows (i.e. the window manager).

The System Service starts the ActivityManager to do what its name implies, manage activities. It is "one of the most important services" in Android, he said. It handles starting activities, broadcasting events, and more. He likened it to the kernel's scheduler. When an activity needs to be started, for example, the ActivityManager asks Zygote over a socket to do so.

The hardware devices are generally accessed via their services, which call into an underlying library that is typically provided by a vendor. For things like LEDs, Android provides a .h file that describes the interface and vendors create a C program that implements it. It is similar for other devices like GPS, audio, camera, and so on. For WiFi, wpa_supplicant is used, while Bluetooth support comes from BlueZ.

Because of GPL concerns, Android talks to BlueZ via D-Bus, which may be controversial in some quarters, Yaghmour said. In answer to a question from an audience member, he noted that Google wanted to avoid having its partners have to explain licensing to their engineers. So, it chose not to use GPL-covered software other than the kernel to keep user space "free" of licensing concerns. That gets a little blurrier with the inclusion of BlueZ, but the "theory" is that the GPL does not apply to code that talks to it via D-Bus. Some may disagree with that theory, he said.

It must have been hard to pull together a reasonable look at Android's guts that would fit into a 50-minute slot, but Yaghmour largely succeeded in doing so. There were undoubtedly lots of details passed over, but attendees definitely got a good feel for what goes on inside the phone that resides in many of their pockets.

Comments (26 posted)

Exploring the globe with Marble 1.1

April 27, 2011

This article was contributed by Joe 'Zonker' Brockmeier.

Marble, as part of the KDE Software Collection (KSC), typically sees releases in-line with major KDE releases. However, thanks to the efforts of students working with the KDE Project for the Google Code-in, Marble picked up enough new features that it was worth releasing 1.1 mid-cycle and getting its new features out early. With 1.1 the 3D mapping application picks up plugin configuration, map editing, and voice navigation if you happen to be using Marble on the Nokia N900.

Marble is a 3D virtual globe, part of the KDE application set — but also available in a Qt-only version for Linux users who prefer not to include KDE-only dependencies or for Mac and Windows users. Since LWN last looked in on Marble, it's come a long way. The basic interface is still the same — but Marble has picked up quite a few features since the 0.5 days.

[St. Louis map] Since 1.1 is out of step with KDE SC releases, it may not turn up as a package for any of the major distributions right away. To test it out, I decided to compile it from source on openSUSE 11.4. As mentioned, you have the choice of compiling the Qt-only version of Marble or the full KDE version — I opted for the full KDE version. The 1.1 release library is meant to be ABI compatible with the 1.0 release, which means that other KDE applications that depend on it should work as expected.

One word of warning if you do opt to compile Marble on your own — make sure to uninstall the prior Marble package as well. Forgetting this simple and obvious step could lead to some odd behavior, or so we've heard.

Using Marble 1.1

After compiling Marble 1.1, I set about exploring the Marble interface and checking out some of the new features. For exploring the globe and generally poking around, Marble is fantastic. The interface is easy to use, it offers a variety of map views (flat, Mercator projection, and your standard globe), and quite a few themes. The themes are things like a satellite view of Earth, OpenStreetMap (OSM) data, Earth at night (which shows city lights from space), and so on.

Marble is a good tool for students, hence its position in the KDE Educational software project. You can click on a city and see two tabs. One contains the Wikipedia entry for the city and the other contains a basic data sheet provided with Marble itself — though I found no description in any of the data sheets for any of the cities that I checked. Each had coordinates, country, time zone, etc., but Marble seems to rely on Wikipedia for any actual description.

[Earthquake map]

One of the interesting new features in Marble 1.1 is an online service that displays earthquakes that have happened in a given spot with their magnitude. It's worth noting that this feature was completed during the Code-in and is not related to or inspired by the earthquakes that caused so much damage in Japan. It was surprising to see just how many earthquakes have been recorded in the US Midwest, though of minor magnitude, since 2006. Unfortunately, Marble doesn't provide a link to any additional information about the events online — the data is simply provided as a colored circle with the magnitude. The color and size of the circle are determined by the magnitude of the earthquake, with larger quakes being a darker red and having a larger diameter. Hovering the mouse over the circle will display the date and depth — but that's all.

For users who have a Nokia N900, Marble should provide voice navigation. Unfortunately, I don't have a Nokia N900 handy, and wasn't able to test this feature. Users who are interested in voice navigation will need to convert a TomTom voice for use with Marble, as it doesn't ship with any at the moment. The Marble folks would welcome contributions, so if you're a non-developer with a pleasant voice this may be an opportunity to contribute.

Marble will open maps or map data in GPS Exchange Format (GPX) and Keyhole Markup Language (KML). I didn't do a lot with importing GPX or KML map data, but did grab a few GPX files online and viewed them in conjunction with OpenStreetMap data. This seemed to work very well.

Where Marble falls down a bit is with routing. Marble allows you to search maps for street addresses and create routes between addresses, but tends to be hit or miss when it comes to actually creating a route or finding some street addresses. For example, I tried creating a route between my home in St. Louis and Bradenton, Florida or between my home in St. Louis and my parents' old home less than 100 miles from St. Louis. Between St. Louis and Florida, Marble was unable to generate a route at all. Marble was also unable to find my old home address, though I could create a route from my current address to my old hometown that was mostly sane.

At home cartography

One of the major new features for 1.1 is the ability to edit maps or create your own. Users can import map data from a server that provides data via Web Map Service (WMS), via a bitmap stored locally, or from a static service like OSM.

The process is laid out in a tutorial on the KDE UserBase, but is not terribly intuitive as of yet. It does work, it's just a bit clunky and certainly will be non-obvious to most users. The tutorial also provides a few pointers for WMS servers and other resources, which will be useful to anyone who wants to learn how to make a map without already having a free map service in mind. According to Dennis Nienhüser, one of the Marble developers, an updated (and more intuitive) wizard is on its way for Marble 1.2.

When using OSM maps, users can actually right-click on the map and open it in an external editor to edit the map. Marble supports the Flash-based Potlatch editor, as well as Merkaartor and JOSM, for editing maps.

Up the Marble road

Though the 1.1 release was pushed out so the world could have the new features early, one shouldn't worry that Marble 1.2 won't hit on schedule. The 1.2 release will be back in sync with the KSC release, so it's expected with the KDE 4.7 release scheduled for July. One of the things that is on the drawing board is an OpenGL mode for Marble. This doesn't mean that Marble would leave 2D systems behind — but it would add OpenGL support for platforms that have it enabled.

[Moon map]

Nienhüser also says that more mobile platforms are in the future for Marble, as well as making Marble one of the "Plasma Active" enabled applications. Which mobile platforms? Nienhüser says he's looking at MeeGo first, and "if MeeGo does not kick off, I guess Android is the next target."

Marble also has a couple of Google Summer of Code projects in the works, according to Nienhüser. One is vector rendering of OSM data (it's currently using bitmapped data — which requires quite a hefty download), the other is a QML version of Marble that would target MeeGo.

Though it's still rough around a few of the edges, Marble has come a very long way since its early days — and looks to be headed for uncharted territory as one of the most usable free software mapping tools.

Comments (3 posted)

A victory for the trolls

By Jonathan Corbet
April 25, 2011
For many years we have heard warnings that software patents pose a threat to the free software community. Repeated warnings have a tendency to fade into the noise if they are not followed by real problems; to many, the patent threat may have seemed like one of those problems we often hear about but never experience. The recent ruling in the US that Google is guilty of infringing a software patent held by a patent troll named "Bedrock Computer Technologies" serves as a reminder that the threat is real, and that solutions will not be easy to come by.

The patent in question is #5,893,120 - "Methods and apparatus for information storage and retrieval using a hashing technique with external chaining and on-the-fly removal of expired data." The independent claims from the patent are short and simple; #3 reads like this:

A method for storing and retrieving information records using a linked list to store and provide access to the records, at least some of the records automatically expiring, the method comprising the steps of:

  • accessing the linked list of records,
  • identifying at least some of the automatically expired ones of the records, and
  • removing at least some of the automatically expired records from the linked list when the linked list is accessed.

Needless to say, numerous people who are "skilled in the art" have concluded that there is little that is original or non-obvious in this claim. In its defense, Google argued that the technique is, indeed, obvious (to the point that it should be invalidated under the Bilski ruling), that the patent is invalid due to prior art, and that Linux did not infringe upon the patent in any case. All of those arguments were pushed aside by the jury, which found Google guilty and granted an award of $5 million, a small fraction of the $183 million requested by Bedrock.
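Stripped of its legalese, claim 3 describes little more than pruning expired records from a hash chain while walking it, something many programmers have written independently. A minimal Python sketch of the technique (purely illustrative; it bears no relation to the kernel's actual route.c code):

```python
import time

class ExpiringHashTable:
    """Hash table with external chaining; expired records are
    pruned on the fly whenever a chain is traversed."""

    def __init__(self, nbuckets=64, ttl=60.0):
        self.buckets = [[] for _ in range(nbuckets)]
        self.ttl = ttl

    def _chain(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        # Each record carries its own expiry time.
        self._chain(key).append((key, value, time.monotonic() + self.ttl))

    def get(self, key):
        chain = self._chain(key)
        now = time.monotonic()
        # Walk the list, dropping expired records as we go: the
        # "on-the-fly removal" of the patent's claim 3.
        live = [rec for rec in chain if rec[2] > now]
        chain[:] = live
        for k, v, _expiry in live:
            if k == key:
                return v
        return None
```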

The full set of docket entries - almost 800 of them - is listed on the net. Many of the interesting ones are sealed, though, and unavailable to the public. We are all affected by this ruling, but we are unable to read most of the testimony that led up to it. Instead, the bulk of the publicly-available information has to do with the various bits of legal jousting which set timetables and which control the evidence that can be presented. Thus, for example, we learn that a late attempt to bring in Alan Cox to testify on his early routing table work was pushed back and eventually abandoned. Still, there are some interesting things to be learned by plowing through these documents.

The code

The code in question is that which maintains the networking stack's routing cache - some of the oldest code in the kernel; it can be found in .../net/ipv4/route.c. This code maintains a hash table of linked lists containing routing entries; as the world changes, entries must be added or deleted. Bedrock claims that its patent is infringed by this code, though even Bedrock has, more or less, admitted [PDF] that any infringement will have been done inadvertently, with no knowledge that the patent existed. The various defendants (Google is only one of the companies targeted) have made various arguments, starting with the claim that the code does not use the algorithm described in the patent at all; see this brief [PDF] for a summary of that argument:

The accused instrumentalities - servers using versions of the Linux kernel prior to 2.6.25 - do not meet all elements of the '120 patent because: (1) removal of records does not occur "when the linked list is accessed" ('120 patent claims 1, 3, 4, 5, 7, and 8); (2) the removed records are not "expired" ('120 patent claims 1, 3, 5, and 7); (3) there is no "dynamically determining maximum number" of expired records to remove ('120 patent claims 2, 4, 6, and 8) (for all accused versions); (4) the accused code does not remove an expired record while using the record search means to search for a record to delete (for all accused versions); and (5) there is no evidence that the accused code has ever executed, as required by all asserted claims.

What one learns early on is that how terms like "when the linked list is accessed" are defined is crucial in a decision regarding infringement. That is where the "claim construction" process comes into play; for the full, gory details of how it was done in this case, see docket #369 [PDF]. There was a big fight, for example, over whether "removing from the linked list" required deallocation of the entry that was removed; Bedrock won that one and got a ruling that deallocation is a separate operation. The biggest fight seemed to be over whether removal "when the linked list is accessed" meant that the structure needed to be removed during the traversal of the list; Bedrock seemed to think that removal at some later time qualified, but the court disagreed. That should have been a decisive victory for the defense, but it appears to not have been enough.

Invalidation attempts

There was also a determined effort to have the patent ruled invalid due to prior art. It is interesting to note that, in early 2010, a separate challenge to this patent was raised at the US Patent and Trademark Office, citing four other patents as prior art; the patent was, in fact, invalidated by the PTO last July. But Bedrock was then allowed to tweak the wording of the claims until the PTO agreed that the modified patent was, once again, valid. This history shows why attempts to kill patents so rarely achieve the desired results: patents can never truly be killed this way. Instead, the owner is allowed to make changes, resulting in zombie patents that return from the dead to terrorize again and again. A second challenge to the patent was filed in January of this year; it cites two more patents as prior art; a ruling has not yet been made in this case.

The defendants' attempt to invalidate the patent does not depend on that prior art at all, interestingly; instead, this challenge [PDF] is based on the Linux code itself. They claim that the code in route.c has not changed materially since the 1.3.x days and that, in particular, the 2.0.1 version was quite close to what we have now. These prior versions, it is claimed, include all of the claims of Bedrock's patent, and thus serve as prior art invalidating it. One might find some amusing irony in the claim that older code implemented the patented technique while current code - said to be about the same - does not. The point is, of course, that if the current code is said to infringe, the older code should be said to implement the patent in the same way. Either both versions implement the patented algorithm (in which case it's invalid due to prior art) or neither does.

The argument seems strong enough. We cannot know how Bedrock argued against this reasoning, though - its response is sealed and inaccessible. It is also worth noting that the US PTO has not considered older Linux releases as prior art when reevaluating this patent; it would appear that the challengers have not asked it to.

In the midst of all this, Red Hat has filed a suit of its own against Bedrock. It seems that some Red Hat customers have been getting nervous about Bedrock's activity and asked for help; Red Hat responded by filing a preemptive suit asking that the patent be ruled invalid and that Red Hat's products be deemed to be non-infringing. That case is still in the works; Red Hat also tried to jump into the Google et al. case [PDF], but that attempt was denied by the judge. In reading the filings, one also learns that Match.com (another defendant in the suit) made a deal with Bedrock and was allowed to drop out.

Now what?

This verdict has been widely publicized as a big defeat for Linux. Perhaps it is, but not for the reasons being cited - this particular patent is not a huge problem, but the fact that patent trolls can win judgments against Linux is problematic indeed. If need be, the kernel's routing table code can be tweaked to avoid infringing Bedrock's patent; indeed, Docket #445 [PDF] lets slip the fact that Google has already changed its kernels to that effect. There could be a case for past infringement, but there need be no real fear that Bedrock will be out there collecting rents from Linux users in the future, even if the ruling stands.

We can hope that the ruling will, in fact, not stand. If Red Hat prevails in its separate case, the verdict against Google will have to be reevaluated. Even in the absence of a victory there, Google's defense was strong enough to warrant an appeal. Google is just one of a number of companies which cannot let it be thought that Linux is an easy target for shakedowns by patent trolls; there is a strong incentive for the company to keep on fighting, even if that fight is likely to cost more than the (relatively small) $5 million it has been told to pay Bedrock. For all of our sake, we must hope that all of the companies involved in this case find it worth their while to get the ruling reversed.

If Bedrock loses in the end, other potential trolls will hopefully be deterred from jumping in with suits of their own. But there can be no doubt that more of these cases will come along; that is really just the nature of the software patent system. Until we can get some sort of high-level reform, we will always have to fear trolls wielding patents on elementary techniques.

Comments (92 posted)

"Maniacal supporter" subscription level now available

For a while now, certain LWN subscribers have been asking us to add a more expensive subscription level. We are happy to announce that, at long last, we have done it; the new "maniacal supporter" level is now available for those of you who are feeling sufficiently maniacal to pay $50/month for LWN. The additional benefits of this level are small in number, but we assure you that they can be had nowhere else; from the LWN FAQ:

Subscribers at this level have all the access given to "project leader" subscribers; they are also credited as supporters in their comment postings. LWN staff will happily buy supporters a beer (or other beverage of their choice) at any conference where they may meet.

In the end, this option is the result of a rule of thumb which has never steered us wrong: always do what Rusty says. We are most curious to see how many of our supporters are willing to take this next step to help keep LWN going.

Thanks to all of you for supporting LWN at any level.

Comments (11 posted)

Page editor: Jonathan Corbet

Security

Python vulnerability disclosure

By Jake Edge
April 27, 2011

Vulnerability disclosure is often a bit tricky. There are those who would like to see the information closely held until updated packages are available for most users, while others prefer to see users get information about a security hole more quickly. Because free software development takes place in the open, with publicly accessible code repositories, it is that much harder to completely bottle up information about those holes. A recent discussion on the python-dev mailing list highlights the problem well.

A bug in Python's urllib and urllib2 URL handling libraries was fixed by Guido van Rossum in late March. The problem was fairly straightforward, but did have security implications. Basically, those libraries were not properly sanitizing HTTP redirects, which allowed redirects to URL schemes other than http://, https://, and ftp://. In particular, the bug report notes that a redirection to file:///etc/passwd could potentially improperly disclose the contents of that file. Other misuses are possible as well.
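The fix amounts to restricting the schemes that a redirect may target. A simplified sketch of that kind of check (illustrative only; the actual patch modifies urllib's redirect handler, and the function name here is made up):

```python
from urllib.parse import urlparse

# Schemes a redirect is allowed to target; anything else
# (file://, data:, etc.) is rejected.
ALLOWED_REDIRECT_SCHEMES = {"http", "https", "ftp"}

def redirect_allowed(target_url):
    """Return True only if the redirect target uses a safe scheme."""
    return urlparse(target_url).scheme in ALLOWED_REDIRECT_SCHEMES
```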

The fix led to a posting by Brian Curtin on the Python development blog. The posting described the problem and the fix, along with some useful information on reporting Python security flaws. It also noted that an updated version of Python 2.5 would be coming soon, but that maintenance releases for 2.6, 2.7, 3.1, and 3.2 had yet to be scheduled.

Gustavo Narea expressed concerns about the posting, though, wondering why the vulnerability was being disclosed prior to updated packages being available: "My understanding is that the whole point of asking people not to report security vulnerability publicly was to allow time to release a fix." But, as several people pointed out in the thread, once a fix has been committed to a public repository—and a public bug report is created—there is nothing to be gained by keeping the vulnerability "secret". Curtin replied to Narea to that effect:

To me, the fix *was* released. Sure, no fancy installers were generated yet, but people who are susceptible to this issue 1) now know about it, and 2) have a way to patch their system *if needed*.

Jesse Noller was even more explicit, noting that the "bad guys" likely already know about the vulnerability, and that publicity is exactly what is needed:

The code is open source: Anyone watching the commits/list know that this issue was fixed. It's better to keep it in the public's eyes, so they know *something was fixed and they should patch* than to rely on people *not* watching these channels.

Assume the bad guys already knew about the exploit: We have to spread the knowledge of the fix as far and as wide as we can so that people even know there is an issue, and that it was fixed. This applies to users and *vendors* as well.

Whether or not Python puts out an immediate release to address the issue, it is important that users get the information they need to make a decision about how (or whether) to address the problem. Without that knowledge, it may well be that the only people who know about it (outside of those working on a fix) are the ones likely to try to exploit it. In essence, this comes down to the age-old split between those who advocate "full disclosure" and those who believe that "responsible disclosure" (or some other disclosure policy) is the right way to go.

The boundaries between full and responsible disclosure become a bit fuzzy in the free software world. Certainly any "bad guys" that follow the Python development tree would have noted the fix going in well before the announcement was made. Waiting to disclose it until a release is done would obviously just make that worse. But, not committing the fix until a release is ready is also untenable. In the end, free software projects, by their very nature, are better off being toward the "full disclosure" end of the axis.

A related question that came up in the discussion was about how the information about the vulnerability was disseminated. There is currently no "official" channel for Python to publicize any vulnerabilities that arise. Clearly Curtin's blog post helped get the word out, but without an official channel (like the distribution security announcement lists), it may have been something of a hit-or-miss approach. As Antoine Pitrou put it:

Also, I think Gustavo's whole point is that if we don't have a well-defined, deterministic procedure for security announcements and releases, then it's just as though we didn't care about security at all. Saying "look, we mentioned this one on our development blog" isn't really reassuring for the target group of people.

The vulnerability has already been assigned CVE-2011-1521, which is currently just a reserved entry, but others have associated it with the urllib flaws. So there are multiple ways for bad or good guys to find out about the problem, but none that are officially associated with the Python project. This particular vulnerability may not be serious enough to force a "drop everything and push out a release" fire drill, but others may be. Distributions and others interested should have a way to be informed of these kinds of flaws that doesn't involve closely following the commits, CVEs, or the development blog.

The consensus in the thread, at least, seemed to be in favor of a security-announce kind of list for Python. Though Narea's original email concerned premature release of the vulnerability information, the end result of the discussion was that the information was probably not disseminated widely enough. Other projects may want to consider this discussion when formulating their own security vulnerability disclosure plans.

Comments (3 posted)

Brief items

Security quote of the week

Under the proposed approach, a covert channel is used to encode the sensitive information by modifying the fragmentation patterns in the cluster distribution of an existing file. As opposed to existing schemes, the proposed covert channel does not require storage of any additional information on the filesystem. Moreover, the channel provides two-fold plausible deniability so that an investigator without the key cannot prove the presence of hidden information.
-- from the abstract of Designing a cluster-based covert channel to evade disk investigation and forensics

Comments (1 posted)

Android phones keep location cache, too, but it's harder to access (ars technica)

After the recent discovery that iPhones/iPads were recording location data on the device, folks have been looking for similar data on Android phones—and have now found some. "Another important difference, according to developer Mike Castleman, is that Android keeps less data overall than iOS devices. 'The main difference that I can see is that Android seems to have a cache versus iOS's log,' Castleman, who contributed some code improvements to [Magnus] Eriksson's tool, told Ars. That is, Android appears to limit the caches to 50 entries for cell tower triangulation and 200 entries for WiFi basestation location. iOS's consolidated.db, on the other hand, seems to keep a running tally of data since iOS is first installed and activated on a device. iOS will also keep multiple records of the same tower or basestation, while Android only keeps a single record."

Comments (15 posted)

New vulnerabilities

asterisk: denial of service and code execution

Package(s):asterisk CVE #(s):CVE-2011-1147 CVE-2011-1507 CVE-2011-1599
Created:April 27, 2011 Updated:May 17, 2011
Description: The asterisk telephony system suffers from one denial of service vulnerability (CVE-2011-1507), one remote code execution vulnerability (CVE-2011-1147), and one local privilege escalation problem (CVE-2011-1599).
Alerts:
Gentoo 201110-21 asterisk 2011-10-24
Debian DSA-2225-1 asterisk 2011-04-25
Fedora FEDORA-2011-6225 asterisk 2011-04-29
Fedora FEDORA-2011-6208 asterisk 2011-04-29

Comments (none posted)

fail2ban: conflicts with selinux

Package(s):fail2ban CVE #(s):
Created:April 26, 2011 Updated:April 27, 2011
Description: From the Fedora advisory:

fail2ban used predictable /tmp files which a local user can allocate before fail2ban does. All tmp files have been moved to /var/lib/fail2ban. This also helps with selinux policies.

Another security related fix is that fail2ban defaulted to gamin which conflicts with selinux, so users had to typically choose between fail2ban and selinux. fail2ban now defaults to inotify (thanks to Jonathan Underwood).

Alerts:
Fedora FEDORA-2011-5151 fail2ban 2011-04-10
Fedora FEDORA-2011-5153 fail2ban 2011-04-10

Comments (none posted)

perl: arbitrary command execution

Package(s):perl CVE #(s):CVE-2011-1487
Created:April 25, 2011 Updated:June 21, 2011
Description: From the Red Hat bugzilla:

A security flaw was found in the way Perl performed laundering of tainted data. A remote attacker could use this flaw to bypass Perl TAINT mode protection mechanism (leading to commands execution on dirty arguments or file system access via contaminated variables) via specially-crafted input provided to the web application / CGI script.

Alerts:
Gentoo 201311-17 perl 2013-11-28
Debian DSA-2265-1 perl 2011-06-20
Pardus 2011-72 perl 2011-05-02
Ubuntu USN-1129-1 perl 2011-05-03
Red Hat RHSA-2011:0558-01 perl 2011-05-19
Fedora FEDORA-2011-4918 perl 2011-04-06
SUSE SUSE-SR:2011:009 mailman, openssl, tgt, rsync, vsftpd, libzip1/libzip-devel, otrs, libtiff, kdelibs4, libwebkit, libpython2_6-1_0, perl, pure-ftpd, collectd, vino, aaa_base, exim 2011-05-17
openSUSE openSUSE-SU-2011:0479-1 perl 2011-05-13
Mandriva MDVSA-2011:091 perl 2011-05-18

Comments (none posted)

rdesktop: directory traversal

Package(s):rdesktop CVE #(s):CVE-2011-1595
Created:April 22, 2011 Updated:October 19, 2012
Description: From the Slackware advisory:

Patched a traversal vulnerability (disallow /.. requests).

Alerts:
Gentoo 201210-03 rdesktop 2012-10-18
Fedora FEDORA-2011-7697 rdesktop 2011-05-30
Fedora FEDORA-2011-7694 rdesktop 2011-05-30
Fedora FEDORA-2011-7688 rdesktop 2011-05-30
SUSE SUSE-SR:2011:010 postfix, libthunarx-2-0, rdesktop, python, viewvc, kvm, exim, logrotate, dovecot12/dovecot20, pure-ftpd, kdelibs4 2011-05-31
Mandriva MDVSA-2011:102 rdesktop 2011-05-28
Ubuntu USN-1136-1 rdesktop 2011-05-25
openSUSE openSUSE-SU-2011:0530-1 rdesktop 2011-05-24
CentOS CESA-2011:0506 rdesktop 2011-05-12
Red Hat RHSA-2011:0506-01 rdesktop 2011-05-11
Pardus 2011-69 rdesktop 2011-05-02
Slackware SSA:2011-110-01 rdesktop 2011-04-22
openSUSE openSUSE-SU-2011:0528-1 rdesktop 2011-05-24

Comments (none posted)

tinyproxy: access restriction bypass

Package(s):tinyproxy CVE #(s):CVE-2011-1499
Created:April 21, 2011 Updated:September 23, 2013
Description:

From the Debian advisory:

Christoph Martin discovered that incorrect ACL processing in TinyProxy, a lightweight, non-caching, optionally anonymizing HTTP proxy, could lead to unintended network access rights.

Alerts:
Fedora FEDORA-2013-16225 tinyproxy 2013-09-21
Debian DSA-2222-1 tinyproxy 2011-04-20

Comments (none posted)

wireshark: two buffer overflows

Package(s):wireshark CVE #(s):CVE-2011-1590 CVE-2011-1591
Created:April 27, 2011 Updated:July 8, 2011
Description: The wireshark protocol analyzer suffers from buffer overflows (possibly leading to remote code execution vulnerabilities) in the x.509if and DECT dissectors.
Alerts:
Oracle ELSA-2013-1569 wireshark 2013-11-26
CentOS CESA-2012:0509 wireshark 2012-04-24
Oracle ELSA-2012-0509 wireshark 2012-04-23
Scientific Linux SL-wire-20120423 wireshark 2012-04-23
Red Hat RHSA-2012:0509-01 wireshark 2012-04-23
Gentoo 201110-02 wireshark 2011-10-09
Debian DSA-2274-1 wireshark 2011-07-07
openSUSE openSUSE-SU-2011:0602-1 wireshark 2011-06-07
openSUSE openSUSE-SU-2011:0599-1 wireshark 2011-06-07
Pardus 2011-77 wireshark 2011-05-26
Fedora FEDORA-2011-5569 wireshark 2011-04-19
Fedora FEDORA-2011-5529 wireshark 2011-04-18
Mandriva MDVSA-2011:083 wireshark 2011-05-12

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 2.6.39-rc5, released on April 26. According to Linus:

We have slightly fewer commits than in -rc4, which is good. At the same time, I have to berate some people for merging some dubious regression fixes. Sadly, the 'people' I have to berate is me, because -rc5 contains what technically _is_ a regression, but it's a performance thing, and it's a bit scary. It's the patches from Andi (with some editing by Eric) to make it possible to do the whole RCU pathname walk even if you have SElinux enabled.

See the full changelog for all the details.

Stable updates: the 2.6.38.4 update was released on April 21; 2.6.32.39 and 2.6.33.12 followed one day later; all contain another long list of important fixes.

The 2.6.27.59 and 2.6.35.13 updates are in the review process as of this writing; they can be expected on or after April 28.

Comments (none posted)

Quotes of the week

Can't be helped. No one has ever written a polite application regarding disk usage. Applications are like seagulls, scanning for free disk blocks and chanting "Mine! Mine!".
-- Casey Schaufler

That works. But Greg might see us doing it, so some additional mergeable patches which *need* that export will keep him happy. (iow, you're being extorted into doing some kernel cleanup work)
-- Andrew Morton

I'd been offline since Mar 25 for a very nasty reason - popped aneurysm in right choroid artery. IOW, a hemorrhagic stroke. A month in ICU was not fun, to put it very mildly. A shitty local network hadn't been fun either... According to the hospital folks I've ended up neurologically intact, which is better (for me) than expected.

Said state is unlikely to continue if I try to dig through ~15K pending messages in my mailbox; high pressure is apparently _the_ cause for repeated strokes.

-- Al Viro's welcome return

Comments (3 posted)

Dcache scalability and security modules

By Jonathan Corbet
April 27, 2011
The dentry cache scalability patch set was merged for the 2.6.38 kernel; it works by attempting to perform pathname lookup with no locks held at all. The read-copy-update (RCU) mechanism is used to ensure that dentry structures remain in existence for long enough to perform the lookup. This patch set has removed a significant scalability problem from the kernel, improving lookup times considerably. Except, as it turns out, it doesn't always work that way. A set of patches merged for 2.6.39-rc5 - rather later in the development cycle than one would ordinarily expect - has helped to address this problem.

The fact that the pathname lookup fast path runs under RCU means that no operation can block. Should it turn out that the lookup cannot be performed without blocking (if a directory entry must be read from disk, for example), the fastpath lookup is aborted and the whole process starts over in the slow mode. In the 2.6.38 lookup code, the mere fact that security modules have been built into the kernel will force a fallback to slow mode, even if no actual security module is active. Things were done this way because nobody had taken the time to verify whether the security module inode_permission() checks were RCU-safe or not. So, if security modules are enabled, the result is not just that the scalability advantages over 2.6.37 are not available; in fact, the code runs slower than it did in 2.6.37.

Enterprise distributions have a tendency to enable security modules, so this performance problem is a real concern. In response, Andi Kleen took a look at the code and found that improving the situation was not that hard; his patches led to what was merged for 2.6.39. Andi started by allowing individual security modules to decide whether they could perform the inode permissions check safely in the RCU mode or not, with the default being to fall back to slow mode. Since the default inode_permission() check does nothing, it could easily be made RCU safe; with just that change, systems with security modules enabled but with no module active can make use of the fast lookup path.

Looking further, Andi discovered that both SELinux and SMACK already use RCU for their permissions checking. Given that the code is already RCU-safe, extending it to do RCU-safe permission checks was relatively straightforward. The only remaining glitch is situations where auditing is enabled; auditing is not RCU-safe, so things will still slow down on such systems. Otherwise, though, the advantages of the dcache scalability work should now have been extended to systems with security modules enabled - assuming that the late-cycle patches do not result in regressions that cause them to be reverted.

Comments (3 posted)

Kernel development news

The return of SEEK_HOLE

By Jonathan Corbet
April 26, 2011
Back in 2007, LWN readers learned about the SEEK_HOLE and SEEK_DATA options to the lseek() system call. These options allow an application to map out the "holes" in a sparsely-allocated file; they were originally implemented in Solaris for the ZFS filesystem. At that time, this extension was rejected for Linux; the Linux filesystem developers thought they had a better way to solve the problem. In the end, though, it may have turned out that the Solaris crew had the better approach.

Filesystems on POSIX-compliant systems are not required to allocate blocks for files if those blocks would contain nothing but zeros. A range within a file for which blocks have not been allocated is called a "hole." Applications which read from a hole will get lots of zeros in response; most of the time, applications will not be aware that the actual underlying storage has not been allocated. Files with holes are relatively rare, but some applications do create "sparse" files which are more efficiently stored if the holes are left out.

Most of the time, applications need not care about holes, but there are exceptions. Backup utilities can save storage space if they notice and preserve the holes in files. Simple utilities like cp can also, if made aware of holes, ensure that those holes are not filled in any copies made of the relevant files. Thus, it makes sense for the system to provide a way for applications which care to learn about where the holes in a file - if any - may be found.

The interface created at Sun used the lseek() system call, which is normally used to change the read/write position within a file. If the SEEK_HOLE option is provided to lseek(), the offset will be moved to the beginning of the first hole which starts after the specified position. The SEEK_DATA option, instead, moves to the beginning of the first non-hole region which starts after the given position. A "hole," in this case, is defined as a range of zeroes which need not correspond to blocks which have actually been omitted from the file, though in practice it almost certainly will. Filesystems are not required to know about or report holes; SEEK_HOLE is an optimization, not a means for producing a 100% accurate map of every range of zeroes in the file.

When Josef Bacik posted his 2007 SEEK_HOLE patch, it was received with comments like:

I stand by my belief that SEEK_HOLE/SEEK_DATA is a lousy interface. It abuses the seek operation to become a query operation, it requires a total number of system calls proportional to the number holes+data and it isn't general enough for other similar uses (e.g. total number of contiguous extents, compressed extents, offline extents, extents currently shared with other inodes, extents embedded in the inode (tails), etc.)

So this patch was not merged. What we got instead was a new ioctl() operation called FIEMAP. There can be no doubt that FIEMAP is a more powerful operation; it allows the precise mapping of the extents in the file, with knowledge of details like extents which have been allocated but not written to and those which have been written to but which do not, yet, have exact block numbers assigned. Information for multiple extents can be had with a single system call. With an interface like this, it was figured, there is no need for something like SEEK_HOLE.

Recently, though, Josef has posted a new SEEK_HOLE patch with the comment:

Turns out using fiemap in things like cp cause more problems than it solves, so lets try and give userspace an interface that doesn't suck.

A quick search on the net will turn up a long list of bug reports related to FIEMAP. Some of them are simply bugs in specific filesystem implementations, like the problems related to delayed allocation that were discovered in February. Others have to do with the rather complicated semantics of some of the FIEMAP options and whether, for example, the file in question must be synced to the disk before the operation can be run. And others just seem to be related to the complexity of the system call itself. The end result has been a long series of reports of corrupted files - not the sort of thing filesystem developers want to find in their mailboxes.

It seems that FIEMAP is a power tool with sharp edges which has been given to applications which just wanted a butter knife. For the purpose of simply finding out which parts of a file need not be copied, a simple interface like SEEK_HOLE seems to be more appropriate. So, one assumes, this time the interface will likely get into the kernel.

That said, it looks like a few tweaks will be needed first. The API as posted by Josef does not exactly match what Solaris does; to add an API which is not compatible with the existing Solaris implementation makes little sense. There is also the question of what happens when the underlying filesystem does not implement the SEEK_HOLE and SEEK_DATA options; the current patch returns EINVAL in this situation. A proposed alternative is to have a VFS-level implementation which just assumes that the file has no holes; that makes the API appear to be supported on all filesystems and eliminates one error case from applications.

Once these details are worked out - and appropriate man pages written - SEEK_HOLE should be set to be merged this time around. FIEMAP will still exist for applications which need to know more about how files are laid out on disk; tools which try to optimize readahead at bootstrap time are one example of such an application. For everything else, though, there should be - finally - a simpler alternative.

Comments (29 posted)

ARM, DMA, and memory management

By Jonathan Corbet
April 27, 2011
As the effort to bring proper abstractions to the ARM architecture and remove duplicated code continues, one clear problem area that has arisen is in the area of DMA memory management. The ARM architecture brings some unique challenges to this area, but the problems are not all ARM-specific. We are also seeing an interesting view into a future where more complex hardware requires new mechanisms within the kernel to operate properly.

One development in the ARM sphere is the somewhat belated addition of I/O memory management units (IOMMUs) to the architecture. An IOMMU sits between a device and main memory, translating addresses between the two. One obvious application of an IOMMU is to make physically scattered memory look contiguous to the device, simplifying large DMA transfers. An IOMMU can also restrict DMA access to a specific range of memory, adding a layer of protection to the system. Even in the absence of security worries, a device which can scribble on random memory can cause no end of hard-to-debug problems.

As this feature has come to ARM systems, developers have, in the classic ARM fashion, created special interfaces for the management of IOMMUs. The only problem is that the kernel already has an interface for the management of IOMMUs - it's the DMA API. Drivers which use this API should work on just about any architecture; all of the related problems, including cache coherency, IOMMU programming, and bounce buffering, are nicely hidden. So it seems clear that the DMA API is the mechanism by which ARM-based drivers, too, should work with IOMMUs; ARM maintainer Russell King recently made this point in no uncertain terms.

That said, there are some interesting difficulties which arise when using the DMA API on the ARM architecture. Most of these problems have their roots in the architecture's inability to deal with multiple mappings to a page if those mappings do not all share the same attributes. This is a problem which has come up before; see this article for more information. In the DMA context, it is quite easy to create mappings with conflicting attributes, and performance concerns are likely to make such conflicts more common.

Long-lasting DMA buffers are typically allocated with dma_alloc_coherent(); as might be expected from the name, these are cache-coherent mappings. One longstanding problem (not just on ARM) is that some drivers need large, physically-contiguous DMA areas which can be hard to come by after the system has been running for a while. A number of solutions to this problem have been tried; most of them, like the CMA allocator, involve setting aside memory at boot time. Using such memory on ARM can be tricky, as it may end up being mapped as if it were device memory, and may run afoul of the conflicting attributes rules.

More recently, a different problem has come up: in some cases, developers want to establish these DMA areas as uncached memory. Since main memory is already mapped into the kernel's address space as cached, there is no way to map it as uncached in another context without breaking the rules. Given this conflict, one might well wonder (as some developers did) why uncached DMA mappings are wanted. The reason, as explained by Rebecca Schultz Zavin, has to do with graphics. It's common for applications to fill memory with images and textures, then hand them over to the GPU without touching them further. In this situation, there's no advantage to having the memory represented in the CPU's cache; indeed, using cache lines for that memory can hurt performance. Going uncached (but with write combining) turns out to give a significant performance improvement.

But nobody will appreciate the higher speed if the CPU behaves strangely in response to multiple mappings with different attributes. Rebecca listed a few possible solutions to that problem that she had thought of; some have been tried before, and none are seen as ideal. One is to set aside memory at boot time - as is sometimes done to provide large buffers - and never map that memory into the kernel's address space. Another approach is to use high memory for these buffers; high memory is normally not mapped into the kernel's address space. ARM-based systems have typically not needed high memory, but as more systems ship with 1GB (or more) of memory, we'll see more use of high memory. The final alternative would be to tweak the attributes in the kernel's mapping of the affected memory. That would be somewhat tricky; that memory is mapped with huge pages which would have to be split apart.

These issues - and others - have been summarized in a "to do" list by Arnd Bergmann. There's clearly a lot of work to be done to straighten out this interface, even given the current set of problems. But there is another cloud on the horizon in the form of the increasing need to share these buffers between devices. One example can be found in this patch, which is an attempt to establish graphical overlays as proper objects in the kernel mode setting graphics environment. Overlays are a way of displaying (usually) high-rate graphics on top of what the window system is doing; they are traditionally used for tasks like video playback. Often, what is wanted is to take frames directly from a camera and show them on the screen, preferably without copying the data or involving user space. These new overlays, if properly tied into the Video4Linux layer's concept of overlays, should allow that to happen.

Hardware is getting more sophisticated over time, and, as a result, device drivers are becoming more complicated. A peripheral device is now often a reasonably capable computer in its own right; it can be programmed and left to work on its own for extended periods of time. It is only natural to want these peripherals to be able to deal directly with each other. Memory is the means by which these devices will communicate, so we need an allocation and management mechanism that can work in that environment. There have been suggestions that the GEM memory manager - currently used with GPUs - could be generalized to work in this mode.

So far, nobody has really described how all this could work, much less posted patches. Working all of these issues out is clearly going to take some time. It looks like a fun challenge for those who would like to help set the direction for our kernels in the future.

Comments (none posted)

ELC: A PREEMPT_RT roadmap

By Jake Edge
April 27, 2011

Thomas Gleixner gets asked regularly about a "roadmap" for getting the realtime Linux (aka PREEMPT_RT) patches into the mainline. As readers of LWN will know, it has been a multiple-year effort to move pieces of the realtime patchset into the mainline—and one that has been predicted to complete several times, though not for a few years now. Gleixner presented an update on the realtime patches at this year's Embedded Linux Conference. In the talk, he showed a roadmap—of sorts—but more importantly described what is still lurking in that tree, and what approach the realtime developers will be taking to get those pieces into the mainline. [Thomas Gleixner]

Gleixner started out by listing the parts of the realtime tree that have already made it into the mainline. That includes high-resolution timers, the mutex infrastructure, preemptible and hierarchical RCU, threaded interrupt handlers, and more. Interrupt handlers can now be forced to run as threads by using a kernel command line option. There have also been cleanups done in lots of places to make it easier to bring in features from the realtime tree, including cleaning up the locking namespace and infrastructure "so that sleeping spinlocks becomes a more moderate sized patch", he said.

Missing pieces

What's left are the "tough ones" as all of the changes that are "halfway easy to do" are already in the mainline. The next piece that will likely appear is the preemptible mmu_gather patches, which will allow much of the memory management code to be preemptible. Gleixner said that it was hoped that code could make it into 2.6.39; that didn't happen, but it should go in for 2.6.40.

Per-CPU data structures are a current problem that "makes me scratch my head a lot", Gleixner said. The whole idea is to keep the data structures local to a particular CPU and avoid cache contention between CPUs, which requires that any code modifying those data structures stay running on that CPU. In order to do that, the code disables preemption while modifying the per-CPU data. If that code "just did a little fiddling" with preemption disabled, it would not be a problem, but currently there are often thousands of lines of code executed. The realtime developers have talked with the per-CPU folks and they "see our pain". The right solution is to use inline functions to annotate the real atomic accesses, so that the preemption-disabled window can be reduced. "Right now, there is a massive amount of code protected by preempt_disable()", he said.

The next area that needs to be addressed is preemptible memory and page allocators. Right now, the realtime tree uses SLAB because the others are "too hard to deal with". There has been talk about creating a memory allocator specifically for the realtime tree, but some recent developments in the SLUB allocator may have removed the need for that. SLUB has been converted to be completely lockless for the fast path and Christoph Lameter has promised to deal with the slow path, which is "good news" for the realtime developers. The page allocator problem is "not that hard to solve", Gleixner said. Some developers have claimed that a fully preemptible, lockless page allocator is possible, so he is not worried about that part.

Another area "that we still have to twist our brain around" is software interrupts, he said. Those currently disable preemption, but then can be interrupted themselves, leading to unbounded latency. One possibility is to split up the software interrupts into different threads and to wake them up when an interrupt is generated, whether it comes from kernel or user space. There are performance implications with that, however, because there is a context switch associated with the interrupt. There are some other "nasty implications" as well, because it will be difficult to tune the priorities of the interrupt threads correctly.

Another possibility would be to add an argument to local_bh_disable() that would indicate which software interrupts should be held off. But cleaning up the whole tree to add those new arguments is "nothing I can do right now", he said. There are tools to help with adding the argument itself, but figuring out which software interrupts should be disabled is a much bigger task.

The "last thing" that is still pending in the realtime tree is sleeping spinlocks. That work is fairly straightforward, he said, only requiring adding one file and patching three others. But that will only come once the other problems have been solved, he said.

Mainline merging

So, when will the merge to mainline be finished? That's a question Gleixner and the other realtime developers have been hearing for seven years or so. The patchset is huge and "very intrusive in many ways", he said. It has been slowly getting into the mainline piece by piece, but it will probably never be complete, because people keep coming up with new features at roughly the same rate as things move into the mainline. As always, Gleixner said, "it will be done by the end of next year".

Gleixner used a 2010 quote from Linus Torvalds ("The RT people have actually been pretty good at slipping their stuff in, in small increments, and always with good reasons for why they aren't crazy.") to illustrate the approach taken by the realtime developers. The realtime changes are slipped into "nice Trojan horses" that are useful for more than just realtime. Torvalds is "well aware that we are cheating, but he doesn't care" because the changes fix other problems as well.

The realtime tree has been pinned to kernel 2.6.33 for some time now (with 2.6.33.9-rt having been released just prior to Gleixner's talk). There are plans to update to 2.6.38 soon. There are several reasons why the realtime tree is not updated very rapidly, starting with a lack of developer time. The tree also requires a long stabilization phase, partly because "some of the bugs we find are very complex race conditions", and those bugs can have serious impacts on filesystems or other parts of the kernel. Typically the problem is not fixing those kinds of bugs, but finding them, as they can be quite hard to reproduce.

Another problem is that because the realtime changes aren't in the mainline Gleixner "can't yell at people yet" when they break things. Also, other upstream work and merging other code often takes priority over work in the realtime tree. But he is "tired of maintaining that thing out of tree", so work will progress. Often getting a piece of the realtime tree accepted requires lots of work elsewhere in the tree, which consumes a lot of time and brain power. "People ship crap faster than you can fix it", he said.

There are about 20 active contributors to the realtime tree, as well as large testing efforts going on at Red Hat, IBM, OSADL, and Gleixner's company Linutronix.

Looking beyond the current code, Gleixner outlined two potential future features. The first is non-priority-based scheduling, which is needed to solve certain kinds of problems, but brings with it a whole new set of problems. Even though priorities are not used, there are still "priority-inversion-like problems" that will have to be solved with mechanisms similar to priority inheritance. Academics have proved that such schedulers can work on uni-processor systems, but have just now started to "understand that there is this thing called SMP". He did, however, specifically exclude a group in Pisa, Italy, which is working on deadline scheduling, from his complaints about academic researchers.

The other new feature is CPU isolation, which is not exactly realtime work, but the realtime developers have been asked to look into it. The idea is to hand over a CPU to a particular task, so that it gets the full use of that CPU. In order to do that, the CPU must be removed from the timer interrupt and the RCU pool among other things. The problem isn't so much that users want to be able to run undisturbed for an hour on a CPU or core, but that they then want to be able to interact with the rest of the kernel to send data over the network or write to disk. In general, it's fairly clear what needs to be done to implement CPU isolation, he said.
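The user-space half of that picture can already be sketched with standard tools. Here is a minimal example, assuming a Linux system with util-linux installed; the isolcpus= boot parameter and the taskset utility exist today, while detaching the CPU from the timer tick and RCU is the part described as still to be done:

```shell
# Boot-time (comment only -- this goes on the kernel command line):
# isolcpus=3 keeps the scheduler from placing ordinary tasks on CPU 3.
#
# Run-time: taskset pins a task to a chosen CPU.  Here a short-lived
# command is pinned to CPU 0 and reads back its own affinity mask:
taskset -c 0 awk '/Cpus_allowed_list/ {print $2}' /proc/self/status
# prints: 0
```

The hard part Gleixner described is everything this sketch does not do: keeping the timer interrupt, RCU callbacks, and other kernel housekeeping off the isolated CPU while still letting the task talk to the network and the disk.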

Roadmap

[RT roadmap]

It is obvious that Gleixner is tired of being asked for a roadmap for the realtime patches. Typically it isn't engineers working on devices or other parts of the kernel who ask for it but, instead, their managers who are looking for such a thing. There are several reasons why there is no roadmap, starting with the fact that kernel developers don't use PowerPoint. More seriously, though, the realtime developers are making their own road into the kernel, so they have no existing road to follow. But, so that it could no longer be said that he hadn't shown a roadmap, Gleixner presented one (shown at right) to much laughter.

He also fielded quite a few audience questions about the realtime tree, what others can do to help it progress, and why some of the troublesome Linux features couldn't be eliminated to make it easier to get the code merged. In terms of help, the biggest need is for more testing. In particular, Gleixner encouraged people to test the realtime patches atop Greg Kroah-Hartman's 2.6.33 stable series.

Software interrupts are still required in various places in the kernel, in particular the network and block layers. Any change to try to remove them would require changes in too much code. On the other hand, counting semaphores are mostly gone, though some uses come in through the staging tree. Those are mostly cleaned up before the staging code moves out of that tree, he said. From time to time, he looks through the staging tree for significant new users of counting semaphores and doesn't really find any, so he is not concerned about those, but is more concerned about read-write semaphores.

As for the choice of 2.6.38 as the basis for the next realtime tree, Gleixner said that he picks the "most convenient" tree when making that decision. It depends on what is pending for the mainline, and what went into the various kernel versions, because he does not want to backport things into the realtime tree: "I'm not insane", he said.

The realtime tree got started partially because of a conference he attended in 2004, where the assembled academics agreed that it was not possible to turn a general-purpose operating system into a realtime one. He started working on it because of that technical challenge. Along the same lines, when asked what he would do with all the free time he would have once the realtime code was upstream, Gleixner replied that he would like to eliminate jiffies in the kernel. He has a "strong affinity to mission impossible", he said.

One should be careful about choosing the realtime kernel, he said, and only use it when the latency guarantees are really needed. Smartphone kernels, for example, might not have any real need for it, though if the baseband stack were to move to the main CPU, it might make sense to look at using the realtime code. One "should only run such a beast if you really need it". That said, he rattled off a number of different projects that were using the realtime kernel, including military, banking, and automation applications. He closed with a short description of a gummy-bear-sorting machine that used the realtime kernel; it was quite fancy, but after watching it for a bit, you wouldn't want to see gummy bears again for a year.

Comments (2 posted)

Patches and updates

Kernel trees

Architecture-specific

Build system

Core kernel code

Development tools

Device drivers

Filesystems and block I/O

Memory management

Networking

Security-related

Virtualization and containers

Page editor: Jonathan Corbet

Distributions

The Amnesic Incognito Live System: A live CD for anonymity

April 27, 2011

This article was contributed by Koen Vervloesem

The Amnesic Incognito Live System (Tails) is a specialized live Linux distribution aimed at preserving the user's privacy and anonymity. It does this job primarily by forcing all outgoing internet connections to go through the Tor network, and by leaving no trace on local storage devices unless the user asks for this explicitly. Tails is the merger of two projects, Incognito LiveCD and Amnesia. The latest version, Tails 0.7, is built on top of Debian Squeeze (6.0) and bundles some applications with customized configurations to protect the user's privacy and anonymity. It can be run as a live distribution from a CD or USB stick.

Tails relies heavily on Tor for its anonymity, and its website warns that Tor is still experimental software that cannot guarantee strong anonymity. Users who want to know what they can expect from Tor (and Tails) with respect to their anonymity are advised to read the About Tor page. Tails 0.7 bundles Tor 0.2.1.30. An additional warning is in order here: Tor does not support IPv6 yet.

When the live CD has booted into the graphical environment, the user is greeted by a fairly typical GNOME desktop environment, including access to applications like Gimp, Inkscape, Scribus, OpenOffice.org, Claws Mail, Iceweasel, Pidgin, Liferea, Audacity, Brasero, and so on. This is not a bare-bones distribution, but a system you could start working with immediately.

Tails gets its security updates from Debian's repositories, but the live CD neither downloads updates automatically nor alerts the user to do so. So a manual

    sudo apt-get update && sudo apt-get upgrade

before each use of the live CD is needed to stay on the safe side, because the distribution doesn't support persistent storage. Of course users could also download a new ISO image from time to time, but that seems like a waste of bandwidth.

Anonymous browsing

The first thing that gives away the goal of this distribution is the Vidalia window, which is a graphical controller program for Tor that gets started after the user logs in. It shows the status of the user's connection to Tor, and has buttons that allow a user to view a bandwidth graph, a message log, or a map of the Tor network. It also has a button to start using a new identity for subsequent connections to make them appear as if they are coming from another computer. Vidalia's GNOME panel also gives access to some settings for Tor, but only advanced users who know what they are doing should change them.

The web browser Iceweasel 3.5.16 has the HTTPS Everywhere extension installed to automatically use HTTPS on many web sites, AdBlock Plus to browse ad-free, CS Lite to control cookie permissions, FireGPG to encrypt webmail messages, FoxyProxy, which completely replaces Firefox's limited proxying capabilities, and the Monkeysphere extension to validate certificates via an OpenPGP web of trust. The offline cache and geolocation are also disabled in Firefox to prevent leaks; the latter means that requests from web sites that want to know the user's location are denied.

[Tails desktop] Out of curiosity, your author tried EFF's Panopticlick, which tests how unique the user's browser is based on the information it shares with visited web sites. Surfing to the site with Iceweasel in Tails gives a slightly lower number of bits of identifying information, primarily because the browser has no plugins installed (which can be detected) and hence cannot expose the presence of Flash or Java, nor the system font list that those plugins would reveal.

Torbutton is also installed, but instead of being used to enable or disable Tor in Firefox, the Tails developers have customized the extension to enable or disable a lot of JavaScript stuff that could help pierce the user's anonymity. When the status bar indicates "Tor enabled", this extra protection is turned on; when the user toggles the status to "Tor disabled", the browser still uses Tor but without the additional protection. As expected, thanks to Tor, surfing on the web in Tails is noticeably slower compared to a direct connection. A test download of the Tails ISO image turned out to be roughly four times slower, and complex sites such as Gmail load noticeably slower too, but it's not unbearable: it's a price users may be willing to pay to be anonymous.

Security-conscious developers

Tails not only bundles privacy-preserving software and browser extensions, but the developers have also customized the Debian system and pre-configured many of its applications with security in mind. For instance, Tails is protected against memory recovery: on shutdown or when the boot medium is physically removed, the computer's memory is wiped. The process is explained in detail on the wiki. In short: when the memory erasure process is triggered, a new Linux kernel is booted with kexec and all free memory is overwritten once with zeros. This way, each part of the memory is either overwritten by loading the new kernel in it or erased explicitly once the new kernel is loaded.

Tails has configured its firewall to drop incoming packets by default and to forbid queries to DNS resolvers on the LAN, as this can result in leaks. DNS queries go through Tor instead. Automatic media mounting is disabled to protect against vulnerabilities, although the developers still think that manually mounting internal disks may be too easy.
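That policy can be illustrated with a few iptables rules. The following is a hypothetical sketch of such a configuration, not Tails's actual ruleset:

```shell
# Illustrative only -- not the real Tails firewall configuration.
# Drop unsolicited incoming packets by default.
iptables -P INPUT DROP
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Refuse direct DNS queries (port 53) so that no resolver on the LAN
# is ever consulted; name resolution is expected to go through Tor.
iptables -A OUTPUT -p udp --dport 53 -j REJECT
iptables -A OUTPUT -p tcp --dport 53 -j REJECT
```

The real ruleset also has to redirect the remaining outgoing traffic into Tor's transparent proxy, which is where most of the subtlety lies.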

The developers are really serious about security, as you can see on their security page. In the "Probable holes" section, they write:

Until an audit of the bundled network applications is done, information leakages at the protocol level should be considered as — at the very least — possible.

They clarify this on the applications audit page:

Any included networked application needs to be analyzed for possible information leakages at the protocol level, e.g. if IRC clients leak local time through CTCP, if email clients leak the real IP address through the EHLO/HELO request etc.

It's interesting to read what they have done to audit some applications and to change their default configuration in Tails appropriately. For instance, thanks to their Claws Mail configuration, the mail client doesn't leak the network's domain in the EHLO command, and HTML rendering is fully disabled. For Iceweasel, they rely on the security measures of the Torbutton extension. And if you're using Pidgin for IRC, CTCP is disabled completely to prevent leaking your local time. Their attention to detail is also visible when Tails is started in a virtual machine: the distribution shows a big warning that the host operating system and the virtualization software are able to monitor what the user is doing.

Contribute

Users are explicitly encouraged on the web site to contribute to Tails, and newcomers are not left out in the cold: there's a Git merge policy (with rules like "Documentation is not optional" and "Do not break the build"), extensive documentation about how to work on the code, information about the Git repositories, and even a set of easy tasks on the project's to-do list. These tasks do not require deep knowledge of the Tails internals and should be a good starting point for newcomers to learn how to contribute to the distribution. Tails also has a good relationship with Debian and other upstream projects: it tries to diverge as little as possible from them by pushing its changes upstream. For instance, it contributes to Debian Live on a regular basis.

The project's documentation is extremely comprehensive and in-depth, although sometimes a little out-of-date. Even the release process is spelled out in detail, as well as the tests that the developers try out to see that all programs work as they should. For instance, for Iceweasel they test whether web browsing is really "torified", and whether the exposed User-Agent HTTP header field matches the one that Torbutton generates.

Roadmap

The things to do list on the web site is long and unfortunately not that structured, so it's not easy to see which of these items have priority for the developers, but there's a concise roadmap. The next big feature will be persistence: although Tails is explicitly designed to avoid leaving any trace, in some circumstances it could be interesting to save (some) data, such as GPG/SSH/VPN/OTR configurations, instant messenger and mail user agent configurations, SSL certificates, and so on.

The developers are also working on a better way to install Tails on a USB stick. This is already possible, as Tails ships hybrid ISO images that can be copied using dd to a USB stick, but then the USB stick's content is lost in the operation and the unused storage on the stick is wasted. Also, such a USB stick cannot be used to host both a Tails installation and a persistent storage volume. The developers are evaluating ways to solve this.

The roadmap also lists some unordered goals. One of these is support for other architectures. For now, the Tails live CD image only comes in a 32-bit x86 version. However, the developers are already working on a PowerPC release for pre-Intel Macs. They have already built a PowerPC ISO, but still have to test it. The release candidate of the next Tails release will probably include a PowerPC image, to be tested by users. Another goal is better integration of Monkeysphere for validating HTTPS certificates using the GnuPG web of trust.

There are also some network-related goals. One of these is support for Tor bridges. Merely trying to use Tor might be dangerous in more authoritarian countries, as the use of Tor can be detected. With Tor bridges, users may be able to hide the fact that they are communicating with the Tor network by relaying their Tor traffic through a bridge node that is not listed in Tor's directory. The Tails developers are thinking about adding a boot menu option that would first set up the connection to the bridge and only then start Tor through it.
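Tor's half of such a setup is already expressible in its configuration file. A minimal torrc fragment might look like this (the bridge address here is fictitious):

```
# Fragment of /etc/tor/torrc -- the address below is made up.
UseBridges 1
Bridge 192.0.2.17:443
```

The missing piece is the user interface: asking for a bridge address at boot time and delaying Tor startup until the bridge has been configured.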

The project's wish list also mentions the idea of a two-layered virtualized system, which isolates applications in a virtual machine to prevent leaking the user's identity due to security holes. The page also looks at Qubes from the Polish security researchers at Invisible Things Lab, which has a similar architecture: it uses Xen to isolate applications in several virtual machines. One proposal on the Tails wiki is to build a future version of Tails on Qubes.

Easy-to-use anonymization

If you don't mind that your internet traffic is being monitored by your internet service provider, the police, or other surveillance agencies, you're not the target user of Tails. However, if you do mind, Tails might be just what you need. The developers have jumped through hoops to preserve their users' anonymity as well as possible. This distribution is an easy way to use Tor for surfing, and it pre-configures a lot of applications so that you have to worry less about accidental information leaks.

Comments (2 posted)

Brief items

Mageia 1 beta 2 available

The second (and final) beta release for Mageia 1 (a fork of the Mandriva distribution) has been announced; the project is looking for lots of testing. "We froze the software package versions last week. This means that no new, big, upstream code changes will be accepted in Mageia until our final release in June; then we will re-open the doors. We will now focus on fixing and reducing our bugs lists and refining and polishing the user experience."

Comments (none posted)

Distribution News

Debian GNU/Linux

Debian Project mourns the loss of Adrian von Bidder

The Debian News page notes the passing of longtime developer Adrian von Bidder. "Adrian was one of the founding members and current secretary of debian.ch, he sparked many ideas that made Debian Switzerland be what it is today. Adrian also actively maintained software in the Debian package archive, and represented the project at numerous events. Even to those, who haven't worked with him directly, he was well known for his sometimes thoughtful, sometimes funny blog posts."

Comments (6 posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Poettering: systemd for Administrators, Part VIII

Lennart Poettering is back with another edition of "systemd for Administrators". In it, he outlines some changes to configuration filenames and locations as part of an effort to standardize them across distributions. "Many of these small components are configured via configuration files in /etc. Some of these are fairly standardized among distributions and hence supporting them in the C implementations was easy and obvious. Examples include: /etc/fstab, /etc/crypttab or /etc/sysctl.conf. However, for others no standardized file or directory existed which forced us to add #ifdef orgies to our sources to deal with the different places the distributions we want to support store these things. All these configuration files have in common that they are dead-simple and there is simply no good reason for distributions to [distinguish] themselves with them: they all do the very same thing, just a bit differently."

Comments (51 posted)

Barnes & Noble treats Nook Color to Froyo (ZDNet)

Here's a ZDNet article on the Nook Color 1.2.0 update. "There's no need to hack the Nook Color into an Android tablet anymore as B&N is giving out the power for free. The biggest feature found in the v1.2 firmware update is the inclusion of Android 2.2. Additionally, alongside this upgraded operating system, there is yet another mobile app store open for business: Nook Apps." Nook owners who don't want to wait for the over-the-air update can update manually anytime.

Comments (none posted)

Developer Interview: Ronald "wattOS" Ropp (Linux Journal)

Michael Reed talks with Ronald Ropp about his work on wattOS. "From the beginning, my intent for wattOS (which I first released in July 2008) was to create a simple, fast desktop that can leverage the large Debian/Ubuntu knowledge base and repositories. I've tried to keep it somewhat minimal, while being as functional as possible for the average user. I don't want them to have to do a ton of command line work just to do the basics such as web, email, music, video, print, photos, word processing, chat, etc."

Comments (none posted)

Spotlight on Linux: Toorox (Linux Journal)

Susan Linton shines a spotlight on Toorox. "Toorox is sometimes compared to another Gentoo-based distribution, Sabayon. This comparison may be legitimate on the surface, but differences emerge when looking deeper. Sabayon is indeed based on Gentoo as Toorox [is], but Sabayon is primarily a binary distribution. Package installation almost always involves installing binary Sabayon packages. While this is convenient and often preferred, Toorox compiles and install[s] software from Gentoo sources. Toorox begins life on your computer as a binary installation with all its advantages, such as fast, easy, and ready at boot, but subsequent package installation compiles source packages. So Toorox is perfect for users that would like a source-based distribution, but don't want the initial time and effort investment. Either over time or with a[n] all-at-once effort, one can fairly easily transform Toorox to a full source install."

Comments (1 posted)

Page editor: Rebecca Sobol

Development

MathML, Firefox, and Firemath

April 27, 2011

This article was contributed by Nathan Willis

The World Wide Web Consortium (W3C) blessed version 3.0 of the MathML standard as an official "recommendation" in October of 2010. There are no major surprises in the revision, but software support has only started to catch up in recent months. Firefox 4 is the first browser to support MathML 3.0 directly, and new releases of the Firemath equation editor and STIX fonts enable in-browser editing and rendering, respectively. Widespread use of MathML may continue to elude scientists and educators in the near term, however, due to lackadaisical support in the other major browsers.

Let there be math

MathML is an XML-based specification for serving and rendering mathematical notation on the web. There are two ways to mark up mathematics in MathML, known as "presentation" MathML and "content" MathML. Presentation MathML focuses on making expressions human-readable, for use in web pages intended only to be read and understood. As a result, its fundamentals offer considerable control over the layout of the math, but do not try to capture its semantic meaning. Content MathML is designed to encode the semantic mathematical meaning of every expression, so it could be parsed and correctly understood by a computer algebra system.

For example, it is sufficient in presentation MathML to write x² as:

    <msup> <mi>x</mi> <mn>2</mn></msup>

which simply renders the 2 as a superscript. But content MathML must encode the meaning of that relationship, and thus uses a different set of elements entirely:

    <apply> <power/><ci>x</ci> <cn>2</cn> </apply>

Both markup systems delimit individual "tokens" including literal numbers, variables, and operators, plus separator characters such as parentheses, braces, fraction lines, and matrix brackets. In most cases, whether the notation or the functional structure is more important dictates which system the author should use, but they can be mixed together so long as sufficient care is taken.

[TeX vs. MathML]

Given that web-based computer algebra systems are scarce, presentation MathML is where most of the effort is focused. MathML's major historical competitor is TeX formatting, which predates MathML considerably. There are numerous conversion scripts that render TeX and LaTeX expressions into inline images, an approach generally regarded as the "safe" option for self-publishing as well as for content management systems. In recent years, however, scripts for transforming TeX notation into MathML have been making a strong showing.

MathML 3.0 does not introduce major changes to MathML syntax, although there are new elements in presentation MathML to support elementary-level math notation, such as vertical addition, multiplication, and long division. The mstack element is used to align such stacked rows of digits and operators; the related msline and mscarries elements support horizontal lines between rows and special borrow/carry rows. The mlongdiv element is used to mark up a long division, with divisor, quotient, and properly-aligned intermediate calculation steps. Because long division notation varies considerably between regions, the element supports a longdivstyle attribute with ten variations.

Most of MathML 3.0's changes address outstanding roadblocks to its adoption in the field. For starters, MathML 3.0 is officially part of HTML5, which means it can be delivered inline in HTML documents. MathML 2 required that the document be XHTML, delivered by the web server as application/xhtml+xml, and linked to the W3C's MathML DTD. Until more HTML5 browsers support MathML, the old DTD as well as a new Relax NG schema are still available, and MathML MIME types have been registered, which is expected to push forward adoption in office applications and other non-browser environments.
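The practical difference is easy to show. A minimal HTML5 document (a hand-written illustration, not an example from the specification) can now carry an expression inline, with no DTD reference or special MIME type:

```html
<!DOCTYPE html>
<html>
  <body>
    <p>The expression
      <math><msup><mi>x</mi><mn>2</mn></msup></math>
      renders inline in a MathML-capable HTML5 browser.</p>
  </body>
</html>
```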

The specification's definitions of content MathML elements have also been reorganized to better explain their alignment with the OpenMath semantic algebra system — an alignment that was part of previous revisions of MathML, but poorly documented. Finally, 3.0 explicitly supports mixing MathML with bi-directional text, closing a longstanding bug with the right-to-left text layout used in Arabic and Hebrew.

Browser support

The Gecko 2.0 rendering engine at the heart of Firefox 4 supports the majority of presentation MathML 3.0, including tokens and general layout elements, and most (but not all) modifying attributes. Still unimplemented are the new elementary math layout elements, line-breaking attributes, and the ability to embed hyperlinks within MathML expressions. Mozilla maintains a status page where users can follow the progress of individual elements.

A more practical resource for those needing MathML support today is the demos page, which links to pages showcasing support for individual elements, plus tests of Firefox's CSS support for MathML. By default, Gecko renders different MathML tokens according to different rules. Single letters (which often appear as variables) are rendered in italics, while multi-character strings (which often represent functions, such as sin() or tan()) are rendered in non-italic style. But MathML allows the page author to style the typography of any element, which is useful for highlighting one term or part of an equation, but also provides fine control where necessary, such as to differentiate between an italic n and a bold n.
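One way to express that per-token control is MathML's mathvariant attribute. A short, illustrative fragment that distinguishes a scalar from a (bold) vector of the same name:

```html
<math>
  <mi>n</mi>                    <!-- single letter: italic by default -->
  <mo>&#x2260;</mo>
  <mi mathvariant="bold">n</mi> <!-- forced bold, e.g. a vector -->
</math>
```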

MathML itself supports writing non-alphanumeric symbols via HTML character codes, but to take advantage of them, the browser needs fonts that supply Unicode's mathematical operators and symbols. Mozilla's demos are written to support the limited character coverage of the Symbol font, but the project's recommended solution is to install the STIX font set, which provides a much larger set of symbols, in multiple styles. Internally, Firefox chooses the font used to display MathML through the font.mathfont-family configuration variable, a "fallback" list of math-supporting fonts from which the most appropriate option is selected, so neither the page author nor the user needs to set a font preference manually.

STIX also supports "stretchy" elements, such as integrals, radicals, vertical bars, various braces, and arrows that are expected to expand to whatever size is needed to delimit the accompanying expression. In late 2010, however, a regression crept into Firefox's STIX support that broke stretchy separators. The fix required both a patch to Firefox (which made it into the 4.0 release) and an update to the STIX fonts. The STIX fonts are available under the SIL Open Font License, but as of 2011 are provided only in OpenType format. Type 1 (for PostScript) and TrueType versions are on the roadmap for future releases.

Equation editing

Rendering support is excellent in Firefox, but MathML's syntax is still verbose enough that writing it by hand is infeasible for all but the simplest expressions. In addition to the weight added by marking up individual tokens, operators, and layout elements, MathML uses invisible elements such as mrow, mfenced, msub/msup/msubsup, and mover/munder/munderover to carefully align the elements in every expression vertically and horizontally.

[Firemath]

Fortunately, the Firemath extension provides a WYSIWYG MathML expression editor as a Firefox add-on. The latest edition, version 0.4.0.2, was released in March, with support for Firefox 3 and 4. Once installed, Firemath can be launched from the Tools menu. The editor runs inside a tab or browser window; the basic usage scenario consists of creating an equation or expression in the Firemath tab, copying it to the clipboard, then pasting the result as MathML into a page or external editor.

As one might anticipate, the editor itself reserves a large swath of space for a table of math operator and element buttons — enough are provided that the main library panel breaks them into six tabs. The most common expressions and functions, though, are broken out into a separate toolbar at the top-left. The editor component consists of three panes: the active WYSIWYG editor, a "live" render pane showing the contents of the editor, and a source view, which shows the underlying code.

The WYSIWYG editor displays a rendering of the current expression, too, but it places small pink dots at available entry points in the MathML. Some of the toolbar buttons insert operators or elements into the expression at these entry points, while others (such as sub- or super-scripts) require you to select an existing element first. Thus, writing an expression can be tricky if you are a Firemath newbie. The interface does its best to distinguish between these different classes of button by separating them with empty space, but it still takes some getting used to.

Editing an existing expression is also on the tricky side, as the keyboard arrow keys permit movement but not selection, and selecting the right element with the mouse can be difficult (particularly with stacked and nested elements). There is also a toolbar delete key, but it is not mapped to the keyboard delete, which seems like an oversight. In my own experiments, I found it frustrating that the normal copy-paste-cut and undo/redo functions are not supported in the editor, and that several of the toolbar buttons used bitmaps rather than text labels — bitmaps obviously designed for a light-colored background and unreadable in my dark GTK theme.

Still, one must expect to learn one's way around a new tool, and to that end the Firemath online documentation and examples proved helpful, but sparse. It is clear that some of the supported editing conventions grew out of necessity, such as holding down the Control key while inserting "fence" elements such as braces. Using modifier keys to place objects on the screen is a common choice in graphics editors; it is simply uncommon in a text editor. On the other hand, a more complete glossary of the supported functions and operators would be helpful — not everyone will have the entire mathematical-symbol Unicode block memorized — and it would help those less skilled in MathML to show the cursor position in the source code view as well as in the WYSIWYG editor. It is easy to misjudge a few pixels of vertical spacing on a pink dot, but much harder to misjudge the location of the actual markup tags.

In addition to the basic elements, you can click the toolbar's "attribute" button to bring up a separate editor where you can change MathML attributes, including font script, size, foreground and background color, and even add user-defined CSS classes. This is also the only way to make horizontal spacing adjustments, through changing the attributes of the mspace element.

When your work is perfected in the editor, Firemath can copy it to the clipboard as MathML or as an image (in PNG or JPEG format), or save it with the same output options. The MathML output can be exported as fully marked-up HTML5, XHTML, an inline equation, or a block-level math element.

Firemath only produces presentation MathML, and has not yet added support for MathML 3.0's new elements. Its output, however, is valid in HTML5 documents, since the syntax it produces is unchanged from MathML 2. You can also manually add newer attributes through the attribute editor, if desired, and Firemath will preserve them.

In just a few minutes of working with Firemath, the distinction between presentation and content MathML becomes crystal clear (if it wasn't already) — you can use the editor to craft any mathematical-looking expression your heart desires, even if it is semantic nonsense (such as this example). Forcing you into writing semantically correct math is not the job of a display technology like MathML; rendering it correctly with respect to size, position, and alignment is. Firemath makes it easy to write expressions the way you want them to look, which is all that an editor is responsible for doing.

Infinity and beyond

As Mozilla points out in its MathML documentation, one of the most exciting aspects of Firefox 4's MathML support is that mathematical expressions are now simply another part of the page. That opens the door to off-the-wall uses of MathML beyond generic in-line equations. The MathML demos show how to manipulate MathML with JavaScript, CSS, and the DOM, including tooltips, resizing expressions on mouse clicks, and head-scratchers like using images and form fields inside of equations.

For now, though, all of this fun remains essentially Firefox-only. Despite more than a decade of W3C Recommendation status, Firefox remains the only major browser to support rendering MathML in any meaningful sense. Opera can emulate some MathML with JavaScript and CSS add-ons, and a plug-in was available for earlier versions of IE — but it has not been maintained for IE 9. WebKit, interestingly enough, has MathML support in its renderer, but none of the WebKit-based browsers (including Chrome and Safari) enable it.

In the sciences, expecting visitors to use Firefox in order to see your published content is probably not a difficult hurdle, but online courseware packages like Moodle target a wider range of users, including school computer labs that may have locked-down environments. Moodle currently uses TeX to format equations in its content editor (rendering them to inline images), but as indicated on the wiki and elsewhere, there is growing interest in supporting MathML, either directly or via TeX-to-MathML converters. For other content management systems, the W3C maintains a list of known MathML plugins, including some for general-purpose CMSes like Wordpress, but the timestamps on many of the entries are getting dangerously old.

Despite those challenges, the MathML community is optimistic. David Carlisle from the MathML Working Group wrote in a blog posting in October that the adoption of the standard into HTML5 will constitute a "massive boost to getting Mathematics into web-based systems." The browser vendors' commitment to HTML5 may indeed bring more support for MathML, but so far Safari, Chrome, and IE have yet to make any public statement to that effect. Which means that, at the moment, you have exactly one option for robust MathML rendering.

Comments (29 posted)

Brief items

Quotes of the week

When a large fraction of the world economy is run by the creations of lousy programmers, and when embedded systems are increasingly capable of killing people, do we raise the bar and demand that programmers pay attention to pointless details such as leap seconds, or do we remove leap seconds?
-- Poul-Henning Kamp

What are the fundamental differences between Gnome and KDE development? There's lots of little differences (2006: Gnome parties on a beach. Akademy has melted ice cream in the rain) but they're both basically communities made up of people who are interested in developing a functional and interesting desktop experience. So why do the end results have so little in common?
-- Matthew Garrett

Comments (2 posted)

CodeInvestigator 1.5.0

CodeInvestigator is a tracing tool for Python program development. "Program flow, function calls, variable values and conditions are all stored for every line the program executes. The recording is then viewed with an interface consisting of the code." Version 1.5.0 is out; changes include a number of user interface improvements, connections between print statements and the resulting output, and faster data collection.
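The kind of per-line recording described there can be approximated with Python's own sys.settrace hook. The sketch below is a minimal illustration of the technique, not CodeInvestigator's actual implementation: it logs the line number and a snapshot of the local variables for every line a traced function executes.

```python
import sys

def trace_lines(func, *args):
    """Run func(*args), logging (line number, locals) for each executed line."""
    log = []

    def local_tracer(frame, event, arg):
        # "line" events fire just before each line of the function runs
        if event == "line":
            log.append((frame.f_lineno, dict(frame.f_locals)))
        return local_tracer

    def global_tracer(frame, event, arg):
        # only descend into the one function we were asked to trace
        if frame.f_code is func.__code__:
            return local_tracer
        return None

    sys.settrace(global_tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(None)   # always remove the hook
    return result, log
```

A real tracer stores far more context (calls, conditions, output), but the hook above is the mechanism that makes line-by-line recording possible in pure Python.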

Full Story (comments: 1)

GNOME 3.0.1 released

The first update to GNOME 3.0 has been released: "It contains the usual mixture of bug fixes, translations updates and documentation improvements that are the hallmark of stable GNOME releases." This is the last planned 3.0 release; development is moving on to the 3.2 series.

Full Story (comments: none)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Interview: Michiel de Jong

The Free Software Foundation Europe has posted a fellowship interview with Michiel de Jong, creator of the Unhosted project (which was covered here in January). "In order to write Free Software, all that is required is the time and skills of the developers concerned. But there is no way to make Free Software available to the world online which doesn't involve a monetary cost, because doing so requires the use of servers, and whoever owns those servers will charge you a monthly fee. Our architecture for separating code and data, leaving the processing in the browser, fixes that: it makes it very cheap to host Free Software web applications because all you have to host is the application logic, the code files, not the data that drives it."

Comments (none posted)

First ownCloud Sprint (KDE.News)

KDE.News has a report on the first ownCloud sprint. ownCloud is a KDE initiative to put "cloud" services under the control of users. We looked at ownCloud 1.0 last June. "Probably the most important work was done on refactoring ownCloud's initial concept and work by Jakob Sack. These changes will help to make the codebase clearer, easier to maintain and a lot more flexible. ownCloud will be easy to extend with additional applications and plugins. In the very near future ownCloud's web-based storage system will be capable of being enriched by extensions like Music Streamers, Photo Galleries or PIM functionality — basically anything a PHP application can do."

Comments (1 posted)

Page editor: Jonathan Corbet

Announcements

Brief items

Fun patent application of the day: location history

It would appear that Apple has not just implemented a location database on its devices; it is trying to patent the entire concept. Claim 1 reads: "A computer-implemented method performed by a location aware device, the method comprising: configuring a processor of the location aware device to collect network information broadcast from a number of network transmitters over a time span; and storing the network information and corresponding timestamps in a database as location history data." Perhaps it is in everybody's interest not to challenge this one.

Comments (9 posted)

GNOME Project Announces new Outreach Program for Women Interns

The GNOME project is continuing its outreach program aimed at recruiting women. "The GNOME 3.0 release has far more contributions by women than any previous release in GNOME history. This is largely thanks to the hard work of the first round of the Outreach Program for Women interns, who participated in the program from December 15, 2010 to March 15, 2011. All eight participants had their work included in the main branches of their projects and therefore included in GNOME 3.0. Following on the heels of the successful first round, the GNOME Project is delighted to announce the participants of a new round of the Outreach Program for Women internships."

Comments (none posted)

Students announced for 2011 Google Summer of Code

Google has announced the students that have been accepted for this year's Summer of Code. "We have announced the 1,116 students that will be participating in this year's Google Summer of Code program. Students will now start the community bonding period where they will get to know their mentors and prepare for the program by reading documentation, hanging out in the IRC channel and familiarizing themselves with their new community before beginning their actual coding at the end of May."

Comments (2 posted)

MIPS Technologies Now Porting Android 'Honeycomb' Platform

MIPS Technologies has announced that it has official source access to Android 3.0, also known as "Honeycomb". MIPS Technologies is now porting Honeycomb to the MIPS architecture.

Full Story (comments: none)

The Open Invention Network announces lots of new members

The Open Invention Network has announced that 74 companies have become licensees in the first quarter of 2011. "During the first quarter, OIN welcomed companies such as Hewlett-Packard, Facebook, Juniper, Fujitsu General, and others. We've also successfully licensed a number of the leading Linux distributions in the first quarter. Finally, we greatly appreciate the large number of activist individuals who signed an OIN license in the quarter."

Comments (none posted)

The WebM Community Cross-License Initiative launches

The WebM video format project has announced the WebM Community Cross-License Initiative, a sort of patent pool for WebM users. "CCL members are joining this effort because they realize that the entire web ecosystem--users, developers, publishers, and device makers--benefits from a high-quality, community developed, open-source media format. We look forward to working with CCL members and the web standards community to advance WebM's role in HTML5 video." There are 17 members at the outset.

Comments (21 posted)

Articles of interest

Barnes & Noble Charges Microsoft with Misusing Patents (Groklaw)

Groklaw has the response by Barnes & Noble to Microsoft's Android-related patent suit. "Microsoft has a scheme, Barnes & Noble asserts, to dominate Android and make it undesirable to device manufacturers and customers by demanding 'exorbitant license fees and absurd licensing restrictions' -- a license fee that it says is more than Microsoft charges for its entire operating system for mobile devices, Windows 7. Others have, it believes, signed it. Barnes & Noble says the deal with Nokia is in furtherance of this scheme."

Comments (3 posted)

1+1 (pat. pending) - Mathematics, Software and Free Speech (Groklaw)

Groklaw has posted a lengthy guest article arguing that software is simply mathematics and should not be subject to patents. "I will now substantiate the idea that software is mathematics. Let's cast aside the effect of the real world semantics on the patentability of mathematics for the moment. I will return to this question when I explain why mathematics is speech. Then I will explain that patenting a mathematical computation of the basis of its semantics is granting exclusive rights on speech. For now the focus is on showing that the patented software method is always a mathematical algorithm."

Comments (25 posted)

The FTC weighs in on patent reform (opensource.com)

Over at opensource.com, Red Hat's VP and assistant general counsel Rob Tiller writes about a US Federal Trade Commission report [PDF] on some of the problems with software patents. "The FTC report recognized that in the IT industry, it's virtually impossible to do clearance searches to verify that new products don't infringe existing patents. The lack of clarity in individual patents combined with the sheer numbers of existing patents make clearance cost prohibitive. Because IT products can contain a large number of components that might each be covered by one or more patents, the number of potentially applicable patents is also large. The report went so far as to characterize clearance as 'a virtual perfect storm of difficulties.'"

Comments (11 posted)

Google Linux servers hit with $5m patent infringement verdict (The Register)

The Register is reporting that Google (and Linux) has been found to infringe on a patent on open hashing with automatic expiration—not the most innovative idea ever, even in 1997 when the patent was filed. The damages, $5M, seem rather small in comparison to some that we have seen coming out of east Texas. "Asked to comment, a Google spokeswoman said: 'Google will continue to defend against attacks like this one on the open source community. The recent explosion in patent litigation is turning the world's information highway into a toll road, forcing companies to spend millions and millions of dollars defending old, questionable patent claims, and wasting resources that would be much better spent investing in new technologies for users and creating jobs.'"
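For readers wondering what "open hashing with automatic expiration" amounts to in practice, the Python sketch below illustrates the general idea of a hash table whose entries lapse after a time-to-live. It is a hedged illustration of the concept only, not the patented method or Google's code; the clock parameter is an assumption added to make the behavior easy to demonstrate.

```python
import time

class ExpiringDict:
    """A dictionary whose entries expire ttl seconds after insertion."""

    def __init__(self, ttl, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock          # injectable clock, useful for testing
        self._data = {}             # key -> (value, insertion time)

    def __setitem__(self, key, value):
        self._data[key] = (value, self.clock())

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, stamp = entry
        if self.clock() - stamp > self.ttl:
            del self._data[key]     # lazily evict stale entries on access
            return default
        return value
```

Lazy eviction on lookup, as above, is one common way to expire entries; a background sweep over the buckets is another.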

Comments (81 posted)

New Books

Crafting Rails Applications--New from Pragmatic Bookshelf

Pragmatic Bookshelf has released "Crafting Rails Applications", by Jose Valim.

Full Story (comments: none)

Calls for Presentations

KVM Forum 2011: Call For Participation

The KVM Forum will take place August 15-16, 2011 in Vancouver, Canada, co-located with LinuxCon North America. The call for participation is open for technical talks, end-user talks and birds of a feather sessions, until May 16.

Full Story (comments: none)

Upcoming Events

Events: May 5, 2011 to July 4, 2011

The following event listing is taken from the LWN.net Calendar.

May 3 - May 6: Red Hat Summit and JBoss World 2011 (Boston, MA, USA)
May 4 - May 5: ASoC and Embedded ALSA Conference (Edinburgh, United Kingdom)
May 5 - May 7: Linuxwochen Österreich - Wien (Wien, Austria)
May 6 - May 8: Linux Audio Conference 2011 (Maynooth, Ireland)
May 9 - May 11: SambaXP (Göttingen, Germany)
May 9 - May 10: OpenCms Days 2011 Conference and Expo (Cologne, Germany)
May 9 - May 13: Linaro Development Summit (Budapest, Hungary)
May 9 - May 13: Ubuntu Developer Summit (Budapest, Hungary)
May 10 - May 13: Libre Graphics Meeting (Montreal, Canada)
May 10 - May 12: Solutions Linux Open Source 2011 (Paris, France)
May 11 - May 14: LinuxTag - International conference on Free Software and Open Source (Berlin, Germany)
May 12: NLUUG Spring Conference 2011 (ReeHorst, Ede, Netherlands)
May 12 - May 15: Pingwinaria 2011 - Polish Linux User Group Conference (Spala, Poland)
May 12 - May 14: Linuxwochen Österreich - Linz (Linz, Austria)
May 16 - May 19: PGCon - PostgreSQL Conference for Users and Developers (Ottawa, Canada)
May 16 - May 19: RailsConf 2011 (Baltimore, MD, USA)
May 20 - May 21: Linuxwochen Österreich - Eisenstadt (Eisenstadt, Austria)
May 21: UKUUG OpenTech 2011 (London, United Kingdom)
May 23 - May 25: MeeGo Conference San Francisco 2011 (San Francisco, USA)
June 1 - June 3: Workshop Python for High Performance and Scientific Computing (Tsukuba, Japan)
June 1: Informal meeting at IRILL on weaknesses of scripting languages (Paris, France)
June 1 - June 3: LinuxCon Japan 2011 (Yokohama, Japan)
June 3 - June 5: Open Help Conference (Cincinnati, OH, USA)
June 6 - June 10: DjangoCon Europe (Amsterdam, Netherlands)
June 10 - June 12: Southeast LinuxFest (Spartanburg, SC, USA)
June 13 - June 15: Linux Symposium'2011 (Ottawa, Canada)
June 15 - June 17: 2011 USENIX Annual Technical Conference (Portland, OR, USA)
June 20 - June 26: EuroPython 2011 (Florence, Italy)
June 21 - June 24: Open Source Bridge (Portland, OR, USA)
June 27 - June 29: YAPC::NA (Asheville, NC, USA)
June 29 - July 2: 12º Fórum Internacional Software Livre (Porto Alegre, Brazil)
June 29: Scilab conference 2011 (Palaiseau, France)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds