
LWN.net Weekly Edition for April 8, 2010

IBM and the labors of TurboHercules

By Jonathan Corbet
April 6, 2010
Once upon a time, IBM was seen as the dark force in the computing industry - Darth Vader in a Charlie Chaplin mask. More recently, though, the company has come across as a strong friend of Linux and free software. It contributes a lot of code and has made a point of defending against SCO in ways which defended Linux as a whole. But IBM still makes people nervous, a feeling which is not helped by the company's massive patent portfolio and support for software patents in Europe. So, when the word got out that IBM was asserting its patents against an open-source company, it's not surprising that the discussion quickly got heated. But perhaps it's time to calm down a bit and look at what is really going on.

The story starts with the Hercules emulator, which lets PC-type systems pretend to be IBM's System/370 and ESA/390 mainframe architectures. Hercules is good enough to run systems like z/OS or z/VM, and, according to the project's FAQ, it has been used in production at times, even if that's not its stated purpose. The project is licensed under the OSI-certified Q Public License.

Enter TurboHercules SAS, which seeks to commercialize the Hercules system. The company offers supported versions of Hercules - optionally bundled with hardware - aimed at the disaster recovery market. Keeping a backup mainframe around is an expensive proposition; keeping a few commodity systems running Hercules is rather cheaper. It's not hard to imagine why companies which are stuck with software which must run on a mainframe might be tempted by this product - as a backup plan or as a way to migrate off the mainframes entirely.

The problem is that systems like z/OS and z/VM are proprietary software, subject to the usual obnoxiousness. In particular, IBM's licensing does not allow these systems to be run on anything but IBM's hardware. So when TurboHercules tried to get IBM to license its operating system to run on Hercules-based boxes, IBM refused. TurboHercules responded by filing a complaint with the European Commission alleging antitrust violations. According to TurboHercules, IBM's licensing restrictions amount to an illegal tying of products.

One need not agree with IBM's position to understand it. IBM understands well the power of commoditizing its competitors' proprietary technology - that's what its support for Linux is all about, in the end. Emulated mainframes running on generic Linux or Windows boxes can only look like an attempt to commoditize one of IBM's cash cows. The fact that this product requires running IBM's proprietary software gives the company a lever with which to fight back. Whether one feels that refusing to license that software in this situation is a proper action or not, one should agree that it's unsurprising that IBM exercised that option.

TurboHercules evidently sent IBM a letter questioning whether IBM actually owned any useful intellectual property in this area. IBM responded with a letter listing 175 patents owned or applied for, all of which are said to apply to IBM's mainframe architectures. Two of these patents, it turns out, are on the list of patents which IBM explicitly pledged not to assert against the free software community.

To many, this looked like the dark side of IBM coming through at last. Florian Mueller wrote:

This proves that IBM's love for free and open source software ends where its business interests begin. In market segments where IBM has nothing to lose, open source comes in handy and the developer community is courted and cherished. In an area in which IBM generates massive revenues (an estimated $25 billion annually just on mainframe software sales!), any weapon will be brought into position against open source. Even patents, which represent to open source what nuclear arms are in the physical world.

Those are strong words, and they strike a chord with anybody in the community who is concerned about the software patent threat. But it is also worth considering IBM's response, as reported in this eWeek article:

In response to a query from eWEEK, IBM issued the following statement: 'IBM sent TurboHercules a non-exhaustive list of patents that pertain to our mainframe technology. We did not make any explicit assertions or claims that TurboHercules had violated them. We were merely responding to TurboHercules' surprise that IBM had intellectual property rights on a platform we've been developing for more than 40 years. We stand behind the pledge we made in 2005, and also our rights to protect our significant investments in mainframe technology.'

There are a couple of ways in which one could interpret this statement. Perhaps somebody within IBM has realized that this whole affair does not look very good and has started furiously backpedaling. Or, perhaps, it should be accepted on its face; there is, indeed, no assertion of infringement in any publicly-available communication from IBM - though the March 11 letter, listing patents "that would, therefore, be infringed," comes close. Either way, perhaps this whole thing has been blown just a little bit out of proportion.

At this time, what we have is an argument over proprietary software licensing and European antitrust law. IBM has engaged in the sort of unpleasant behavior which is common to proprietary software; it is a classic example of why many of us try to avoid dealing with that world whenever possible. In response, a formal complaint has been brought against IBM, which has responded with some intemperate rhetoric claiming that TurboHercules is a Microsoft-funded "cheap knock-off" of its mainframe products. And while the waving around of patents is disconcerting, no direct assertion of patent infringement has been made. If IBM were to make such an assertion against Hercules, its credibility and trust within the free software community would suffer considerably. Until that happens, though, it might be best to avoid jumping to conclusions and encourage these companies to work out their proprietary software squabble on their own.


Ogg and the multimedia container format struggle

April 7, 2010

This article was contributed by Nathan Willis

Video codecs attract most of the attention in the multimedia format wars, from Theora adoption in HTML5 to debates about the subjective quality and objective technical demands of Dirac versus H.264. But the oft-overlooked container format is just as important; it adds overhead, it determines seekability, subtitle support, and other important features, and it can introduce patent-licensing issues for open source projects. Xiph.org's Ogg container format is the most well-known in open source, though as recent events show it has its critics and its competition.

Ogg has been under development since the beginnings of Xiph.org in 1994, and was originally designed for use with the Vorbis audio codec. As the Xiph project undertook additional codecs, Ogg continued to evolve to support them. FFmpeg developer Mans Rullgard posted a lengthy criticism of Ogg on his personal blog in March, accusing the format of falling short in six areas: poor generality, excessive overhead, high end-to-end latency, lack of random-access seeking, ill-defined timestamps, and unnecessary complexity. Rullgard cites examples from the specification and several "typical usage" numbers to support his claims.

Ogg debates

As the blog post was picked up, debate about the merits of the complaints quickly erupted in the comments and on web discussion forums. Several commenters chided Rullgard for claiming that Ogg's latency and seek times were unsuitably "bad" without citing any numbers from other formats for comparison, and for overstating the size of the problem (such as the overhead created by Ogg's headers at close to 1%) or its relative importance.
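
For a rough sense of where a figure like "close to 1%" can come from, the per-page cost can be worked out from the page layout in the Ogg specification (RFC 3533): a 27-byte fixed header, plus one "lacing" byte for every 255 bytes (or part thereof) of each packet. The sketch below is only back-of-the-envelope arithmetic; the function name and the 4096-byte packet are assumptions chosen for illustration, not numbers taken from Rullgard's post or the replies to it.

```python
FIXED_HEADER_BYTES = 27   # capture pattern through segment count (RFC 3533)
MAX_SEGMENTS = 255        # a page's segment table holds at most 255 lacing values


def page_overhead(packet_sizes):
    """Return (header_bytes, overhead_fraction) for one Ogg page.

    Assumes every packet starts and ends inside this page. Each packet
    needs len // 255 + 1 lacing values: one per full 255-byte segment,
    plus a final value below 255 that marks the end of the packet.
    """
    segments = sum(size // 255 + 1 for size in packet_sizes)
    if segments > MAX_SEGMENTS:
        raise ValueError("packets do not fit in a single page")
    header = FIXED_HEADER_BYTES + segments
    payload = sum(packet_sizes)
    return header, header / (header + payload)


if __name__ == "__main__":
    # Hypothetical page carrying a single 4096-byte packet.
    header, fraction = page_overhead([4096])
    print(f"{header} header bytes, {fraction:.2%} overhead")
```

With these assumed inputs the header comes to 44 bytes on a 4140-byte page, or a little over 1%. Larger pages amortize the fixed 27 bytes better, while pages full of tiny packets fare worse, which is part of why the two sides could look at the same format and disagree about how much the overhead matters.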

On some points, there was more of a genuine disagreement on principle, however. Rullgard said that Ogg wastes space by using a full byte for the "version" field, where a single flag bit would suffice. Xiph's Greg Maxwell explained in the Reddit discussion of the article that a byte is used for the field in order to keep the header structure byte-aligned for simplicity. Maxwell also disagreed with Rullgard's assertion that Ogg's 32-bit checksum was a waste of space, noting that Ogg also uses a 32-bit capture pattern at the beginning of each page, as opposed to the 64-bit capture pattern in FFmpeg's NUT format — thus using the same number of bits, but providing error-detection "for free."
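
The fields under discussion are easier to picture with the fixed page header laid out in code. The following is a minimal sketch based on the header layout in the Ogg specification (RFC 3533); the function and dictionary key names are assumptions made for this example, and checksum verification is deliberately left out.

```python
import struct

# Fixed part of an Ogg page header (RFC 3533), little-endian, 27 bytes:
#   4s  capture pattern "OggS" (32 bits)
#   B   stream structure version (the full byte Rullgard objects to)
#   B   header type flags
#   q   granule position
#   I   bitstream serial number
#   I   page sequence number
#   I   CRC checksum (32 bits)
#   B   number of entries in the segment table that follows
PAGE_HEADER = struct.Struct("<4sBBqIIIB")


def parse_page_header(data):
    """Decode the fixed header fields at the start of an Ogg page."""
    if len(data) < PAGE_HEADER.size:
        raise ValueError("need at least 27 bytes")
    (capture, version, header_type, granule,
     serial, sequence, crc, n_segments) = PAGE_HEADER.unpack_from(data)
    if capture != b"OggS":
        raise ValueError("capture pattern not found")
    if version != 0:
        raise ValueError("unsupported stream structure version")
    return {
        "header_type": header_type,
        "granule_position": granule,
        "serial": serial,
        "sequence": sequence,
        "crc": crc,              # not verified in this sketch
        "segments": n_segments,
    }


if __name__ == "__main__":
    # Hand-built header with made-up field values, for demonstration only.
    raw = PAGE_HEADER.pack(b"OggS", 0, 2, 0, 1234, 0, 0, 1) + b"\x1e"
    print(parse_page_header(raw))
```

Because every field sits on a byte boundary, the whole fixed header decodes with a single unpack call, which is the simplicity argument Maxwell makes. The 32-bit capture pattern and the 32-bit CRC together occupy the 64 bits he compares against NUT's capture pattern.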

Rullgard also argues that Ogg's ability to concatenate different streams one-after-another creates undue complexity for the decoder, without providing any practical benefit. But one blog commenter claimed to take advantage of this feature when ripping CDs with seamless track transitions.

Xiph's Christopher "Monty" Montgomery replied at length in the Slashdot discussion of the critique, admitting that Ogg has its flaws, and conceding that several design decisions made years ago would be made differently today, but attributing more of Rullgard's complaints to long-standing bad blood between the projects. Moreover, he said, even with its flaws, Ogg remains the best free option for important cases like streaming video. Neither the popular MP4 nor Matroska container formats are well-suited for streaming (particularly live streaming), and MP4 is also patent-encumbered. Additionally, he said, making changes to the Ogg format as suggested by Rullgard might improve performance at high-bitrate video, but would be detrimental to low-bitrate and audio payloads where Ogg excels.

Further work

Montgomery said that after the Rullgard blog post gained attention, Xiph decided that part of the problem with its reception was poor documentation on the Ogg format. He subsequently began rewriting and expanding the documentation, some of which is already available online. There are changes that Xiph would like to make, he added, as well as ongoing work in the metadata layer. "One of the legitimately weird things about Ogg is that we knew metadata was going to be a source of constant flux, so we moved as much as we possibly could out of the container itself. The Ogg container only does framing and delivery. [ ...] Most folks are used to this being part of the container, and so consider it 'part of Ogg' which it isn't really."

The Ogg Skeleton project is the primary focus in this area. Skeleton is essentially a "metadata track" that can hold information like MIME-types, protocol messages, and timestamps to allow the decoder to easily seek within the media. A Skeleton track could then be multiplexed or interleaved within an Ogg container file, alongside video and audio tracks.

Skeleton's timestamping capabilities are documented at the Ogg Index page, and are introduced in Skeleton 3.3. A sample indexer called OggIndex is available, and both the ffmpeg2theora converter and development builds of Firefox support it.
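
The idea behind such an index can be sketched in a few lines: keep a sorted table of (timestamp, byte offset) pairs for points where decoding can start, and look up the last entry at or before the requested time. The code below illustrates that lookup only; the table layout, function name, and sample values are assumptions made here, and do not reflect the on-disk format used by Skeleton or OggIndex.

```python
import bisect


def find_seek_offset(keypoints, target_ms):
    """Return the byte offset of the last keypoint at or before target_ms.

    keypoints is a list of (timestamp_ms, byte_offset) tuples sorted by
    timestamp. If target_ms precedes the first keypoint, the first
    keypoint's offset is returned.
    """
    if not keypoints:
        raise ValueError("empty index")
    timestamps = [t for t, _ in keypoints]
    i = bisect.bisect_right(timestamps, target_ms) - 1
    return keypoints[max(i, 0)][1]


if __name__ == "__main__":
    # Made-up index: a keypoint every two seconds.
    index = [(0, 0), (2000, 48000), (4000, 101000)]
    print(find_seek_offset(index, 3500))
```

Without a table of this kind, a player typically has to guess a byte position, read a timestamp there, and repeat until it converges on the target. With one, a seek becomes a single lookup followed by a single read.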

Montgomery concludes his Slashdot comments by noting that breaking compatibility with the existing hardware and software Ogg decoders (most of which see only Vorbis and Theora content) is probably not going to happen until the next major new codec release from Xiph.org.

The competition

Regardless of whether one finds any or all of Rullgard's criticisms valid, there are other container format options out there for content providers. The most popular on the Internet writ large is the .MP4 file, which is properly known as MPEG-4 Part 14, and was approved by ISO as ISO/IEC 14496-14:2003. A part of the larger MPEG-4 specification, MPEG-4 Part 14 is a revision of two earlier standards, MPEG-4 Part 1 and MPEG-4 Part 12. Part 14 is based on the QuickTime container format created by Apple and recognized by the .MOV file extension. It can hold content in any codec (including free codecs like Vorbis and Theora), although there is a "registered" codec list maintained by the MPEG.

There is a degree of uncertainty regarding the ability to write MPEG-4 Part 14 decoders, however. The rest of the MPEG-4 specification, like all MPEG standards, is patented, and implementations must adhere to the license terms made available by the MPEG-LA licensing authority. Part 14 was once available as part of the MPEG-4 Systems patent pool, which has subsequently been withdrawn. Many people on mailing lists and discussion forums assume that the format is free to implement since it is not explicitly mentioned in the remaining MPEG-4 patent pools, "MPEG-4 Visual" and "AVC/H.264," but this is not officially stated. MPEG-LA makes it difficult to find information about specific patents in its technologies, preferring to steer all customers toward the "patent pool" products instead. The ISO specification, which should document specific patent claims, is only available to paying customers. When asked, MPEG-LA representatives said that they did not know the specific status of Part 14 in the current patent pools.

The Matroska format, like Ogg, was created to serve as an open, patent-unencumbered container. The two formats do differ in emphasis, a fact that both projects readily acknowledge. Whereas Ogg was designed alongside Vorbis with streaming audio as its primary use case, Matroska was designed to support as many codecs as possible. Xiph.org says that Matroska has better support for seeking, editing files, and using menus and chapter markers, while Matroska says that Ogg is superior at streaming media delivery (for example, Matroska only recently added support for interleaving frames from different tracks).

Matroska was launched in 2002 as a derivative of the older Multimedia Container Format (MCF). The copyright on the specification and the trademark of the name Matroska are both held by CoreCodec, Inc., but the specification is available free-of-charge. A reference library is available for download under the LGPL, and a "core parser" is offered upon request under the BSD license. The format is generally seen with the .MKV file extension for video content, although .MKA for audio is also valid.

The NUT file format Maxwell mentioned on Reddit was created by developers on the FFmpeg and MPlayer teams, but appears to be supported only in those projects. The NUT project site is sparse, with a broken link to the actual specification, but there is a mailing list that indicates that development is still underway, albeit slowly. Montgomery described it as very Ogg-like in its design, incorporating some design choices that would improve Ogg, such as a simpler way of encoding the packet-length in each header (which was one of Rullgard's complaints).

Container formats are far less exciting than multimedia codecs, but the choice of containers has a very real impact on what a content provider can do. Quickly and accurately seeking within a file — while important — is just one example; another active topic right now is support for subtitle tracks. As multimedia content on the Internet grows, having subtitles accessible in their own track (or tracks, with multiple languages supported) has implications for accessibility, internationalization, and subtitle-based searching. For the record, Ogg, MP4, Matroska, and NUT all support subtitles.

As usual, the right choice depends on the usage. To some, non-free formats like MP4 ought to be avoided at all costs, even if MPEG-LA is not likely to request licensing fees. If streaming, audio-only, or low-bitrate performance are important, Ogg remains the simplest and probably best option. For seeking, video editing, menu/chapter support, or combining a wide array of codecs, Matroska offers functionality Ogg cannot. Moving forward, the relative weight of those factors may shift as either the codecs or the container formats evolve — but until then, choice is good.


On projects and their goals

By Jonathan Corbet
April 5, 2010
Recently, we have seen two projects come under considerable criticism for the development directions that they have taken. Clearly, the development space that a project chooses to explore says a lot about what its developers' interests are and where they see their opportunities in the future. These decisions also have considerable impact on users. But, your editor would contend, it's time to give these projects a break. There is both room and need for different approaches to free software development.

The Subversion project recently posted a "vision and roadmap proposal" describing where this popular source code management system can be expected to go in the future. The Subversion developers have made some clear decisions; these include not even trying to compete in the distributed version control system space, a reworked storage layer, rename tracking, better conflict handling, and more. The mission of the Subversion project is not to chase after the flashier distributed systems which are displacing it in a number of contexts; instead, Subversion will exist to serve the needs of users who feel the need for a simple tool with a centralized repository.

This announcement drew well over 100 comments on LWN, and similar numbers elsewhere. Quite a few of them were of the "here's a nickel, get a real SCM" variety; it seems that many see Subversion as old, unfashionable, and past its expiration date. Others were critical of Subversion's users, claiming that there's no reason why they couldn't move to a proper distributed system like all of the cool people have. Quite a few people, it seems, would be happy to see Subversion curl up and die; others think that the decision not to pursue distributed features will cause that to happen.

But there are plenty of distributed version control systems out there, a few of which have accumulated substantial user and developer communities. The Subversion developers are right to believe that they would be hard put to create a credible offering in that "market" at this point; they would have to create something which is demonstrably better than the existing systems, bearing in mind that those systems are improving quickly. Beyond that, there truly are large numbers of Subversion users who are mostly happy with what they have. Those users may have "look into distributed version control" on their long-term to-do lists, but, meanwhile, they have projects to manage. They are best served by a plan which calls for improvements in the Subversion they are using now.

Subversion is mature software. There will certainly be no shortage of things which can be improved in it, but its period of rapid development may be well behind it. There is nothing wrong with the developers saying so; in fact, there is much to commend there. Developers looking for fast-moving, distributed systems have a variety of offerings to choose from. Subversion, instead, will focus on what it does best: better serving the users that it has now. It seems entirely likely that there will be quite a few of them for some time yet.

A very different discussion has surrounded the minor user interface changes planned for the upcoming Ubuntu 10.04 release. Here, instead, we're told that users like the way things are now, and that trying to make changes is a mistake. It's tempting to throw all of the complaints into the "bike shed" category, but this is a shed that Ubuntu users will be staring at all day long. These changes risk creating gratuitous differences between distributions and causing confusion in users who are used to finding their window buttons in a different place. Might not it be better to leave well enough alone?

Note the difference, though: while there is probably limited scope for innovations in the problem space that Subversion has chosen for itself, anybody who tries to argue that our desktop system usability problems have been solved will face a skeptical audience indeed. We have come a long way, but "usability" as a problem in general is far from solved. It makes sense to be conducting experiments in this area, especially for a distribution like Ubuntu, which has always had a focus on desktop usability. Other Ubuntu experiments - less intrusive desktop notifications, for example - have found their way into other distributions as well.

This line of reasoning can be taken farther: we desperately need more experimentation with usability in the free software space. We have spent years trying to catch up to proprietary alternatives; this work, for the most part, is done. At this point, we can focus on trying to match usability changes made by others, or we can try to come up with interesting new stuff of our own. Your editor clearly prefers the latter.

Given the scale of the problem, the biggest complaint with moving window buttons to the left might well be: why spend so much time and energy on such little things? We're not at the stage where we work for months to yield a 1% improvement; it's time to be a bit more bold. Projects like MeeGo seem much more interesting in this regard; those developers are seriously trying to rethink how specific groups of users will use their computers in the future. Android, too, has done some interesting work toward the creation of finger-friendly interfaces. And so on. That is the kind of experimentation we need to have.

Two other criticisms have been aimed at the Ubuntu changes: that user interface changes require the participation of human-computer interaction experts, and the top-down decision mechanism is not particularly community-oriented. On the first charge your editor - who made human-computer interaction the focus of his Master's degree work - has a bit of sympathy. But that claim also sounds vaguely reminiscent of the SCO Group's assertion that the Linux community could never have come up with an enterprise-class server operating system on its own; one should never underestimate what our community can do. In the end, the real key to usability is to pay attention to the users. Free software developers have a high degree of access to their users; those who take advantage of that access will have a higher chance of creating successful interfaces.

Beyond that, we do also have usability experts in our community.

On the second charge: undoubtedly Mark Shuttleworth's ability to direct Ubuntu by decree will be irksome to some. The "behind closed doors" nature of some Ubuntu development is also annoying and detrimental to the creation of a true developer community. On the other hand, it's a rare distribution which makes these decisions in a democratic way; even Debian doesn't hold general resolutions on window button placement. There comes a time when it's best to make the decision and move on; individual users can always fix things they don't like.

In summary: Ubuntu could certainly be more open about the changes it is trying to make and, perhaps, more open-minded about accepting input from its user community. But Ubuntu's work toward improving usability is desperately needed, and any interaction changes are certain to upset some users. Even if the specific change in question is not necessarily the best, experimenting with this kind of change needs to be done, regardless of the inevitable complaints.

More generally, every project has to have some idea of the problem it is trying to solve. In some ways, that's a far more important part of a project than any specific body of code or any specific developer. One of the best things about free software is that it's alive; it will evolve and, with any luck, be better tomorrow. A project's goals say a lot about how it can be expected to evolve. In your editor's opinion, both Subversion and Ubuntu have set worthwhile goals, and both seem to be trying to work toward those goals. These are good things; our community is richer for the existence of both.


Page editor: Jonathan Corbet

Security

Enabling Intel TXT in Fedora

By Jake Edge
April 7, 2010

Intel's Trusted Execution Technology (TXT) has always been somewhat controversial because it enables the complete lockdown of a computer system. For the DRM-loving crowd, that is seen as a feature, of course, but others, who might want to make their own choices about what code runs on their hardware, do not see it quite the same way. TXT was added to Linux in 2.6.32 without much in the way of complaints (though there were some concerns about protests), and now Fedora is discussing whether to enable it for its kernels. The sticking point is not the DRM-lockdown that TXT allows, but is, instead, the fact that it requires an opaque binary "blob" in order to operate.

TXT is a means for ensuring that the code running on a system is what is intended to be run there. By looking at all of the code that the system runs, including things like BIOS, option ROMs, the bootloader, the kernel, and the initrd image, TXT can determine whether any of that code or data has been altered. The idea is to protect the integrity of the system as a whole, and to thwart rootkits or offline attacks, such as swapping in a new hard disk or BIOS for systems like voting machines, medical devices, or ATMs. As mentioned, though, it can also be used to ensure that only code signed by some authority is allowed to run on the device. For ATMs, that's probably a good thing, but if it becomes widespread, it could become a serious impediment to freedom.
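
The measurement idea can be illustrated with a hash chain of the kind used by a TPM 1.2 platform configuration register: each component is hashed, and the running value is replaced by the hash of the old value concatenated with that digest. The sketch below is a simplified model, not Intel's TXT implementation or tboot code; the component names and contents are placeholders, and on real systems these steps are carried out by hardware and firmware rather than a script.

```python
import hashlib


def extend(register, component):
    """Fold one component into the register: SHA1(register || SHA1(component))."""
    digest = hashlib.sha1(component).digest()
    return hashlib.sha1(register + digest).digest()


def measure(components):
    """Measure components in boot order, starting from an all-zero register."""
    register = bytes(20)
    for component in components:
        register = extend(register, component)
    return register


if __name__ == "__main__":
    # Placeholder byte strings standing in for real boot components.
    boot_chain = [b"bios image", b"bootloader", b"kernel", b"initrd"]
    expected = measure(boot_chain)

    tampered = [b"bios image", b"bootloader", b"kernel + rootkit", b"initrd"]
    print("unmodified matches:", measure(boot_chain) == expected)
    print("tampered matches:  ", measure(tampered) == expected)
```

Because each step folds in the previous value, changing any component, or the order in which components load, yields a different final value. Whether a mismatch merely gets reported or actually blocks the boot is a policy decision, which is where the lockdown concerns come in.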

As described in an article from a year ago, there are two separate components that collaborate to provide the TXT integrity checking: the tboot "trusted boot" hypervisor and an Authenticated Code Module. The latter, often referred to as the "SINIT AC", is distributed as a binary-only object, which is signed by Intel.

Because there is no source available for SINIT AC—even if there were, without Intel's keys users couldn't build and use their own—some Fedora developers are leery of enabling TXT in Fedora kernels. Stephen Smalley's request to enable TXT, which he sent to the fedora-kernel list in October 2009 shortly after TXT was added to the kernel, was quickly shot down. Eric Paris explained:

After some discussion with a couple of people on the Fedora kernel team on IRC they decided that we should not enable CONFIG_INTEL_TXT until it is useful for something other than a closed source binary blob which Fedora is unable to distribute. We have messaged that Fedora was unable to include the binary blob from Intel and it has been suggested that they create an open module rather than forcing Linux users to trust some part of their system security to an unknown binary blob. Hopefully you can add your weight to that discussion and help intel see the need for an open source blob.

More recently, IBM has agreed to move the blob into the BIOS of its xSeries servers. That would alleviate the problem of needing to ship a binary blob to make TXT work—though it does nothing to open up the code, of course. But, that announcement led Paris to reopen the discussion on enabling TXT. In a fairly long message, he lays out the case for enabling the feature. Because xSeries users will be able to use TXT without installing the Intel blob, he sees it as a desirable feature for Fedora:

This config option allows a user to download new (open source) software (tboot) along with other third party software to verify the correctness of the BOOTED system. This allows us to build future solutions such as utilizing the TPM chip in many laptops to harden the disk encryption key. It can be used as root of trust for the verification of the software originally loaded on a machine before it is allowed network access (aka machines with a rootkit couldn't get on the network.) The technology can also be extended to provide usefulness to system integrity checkers like aide or IMA for tamper proof software integrity logging. These are all things which are impossible to do with today's kernels.

But Fedora engineering manager Tom "spot" Callaway is less enthusiastic. He notes that IBM is just taking the same binary blob and stuffing it into the BIOS. He is also concerned about supporting Fedora users:

For the rest of the x86/x86_64 computing universe, this means binary blobs, and I think you're fooling yourself if you think that all the other hardware vendors will be so willing to shove prebuilt code from a third party into their BIOS (or even have room to do so).

In the non-IBM Xseries case (which is by far, the more common one for Fedora), we would be enabling this option solely to enable proprietary binary blobs during the boot process. In my opinion, given that it is not possible at all for us to troubleshoot or bug fix systems in such a scenario, we should not imply to our userbase that it is supportable by enabling this kernel option.

Smalley thinks that getting TXT into Fedora would allow more testing, but Callaway isn't convinced that's necessarily a good thing:

We enable this in Fedora. This sends a message to Fedora's users that altering their bootup configuration to support SINIT (whether loaded from BIOS or from a binary-only blob that Intel will be so happy to provide) is _Supported_.

And then, it breaks. And we get bugs filed. Which we have absolutely 0 chance of being able to fix.

Others see the SINIT AC blob as no different than the firmware blobs that are required to make various hardware function—and are shipped by Fedora. Callaway counters that the firmware "is the only way to enable that hardware to work." But, as Chris Wright points out, that leads to an inconsistency: "And TXT needs SINIT AC to work. It's just inconsistent reasoning."

If the proposal were to distribute SINIT AC with Fedora, the situation would be more "analogous", Callaway said, "but Intel already tried that, and it doesn't meet the strict guidelines we have defined in Fedora for what is considered acceptable firmware". Red Hat has apparently tried to convince Intel to open up the SINIT AC code, but without success.

The core difference, at least in Callaway's mind, seems to be that users will be depending on this code, which they cannot inspect, for the security of their systems. Faulty firmware for other hardware may make the system unstable or fail entirely, but that firmware isn't vouching for the security of the whole system as the SINIT AC does. TXT "requires that we explicitly trust a third party vendor for security. [...] This makes me extremely uncomfortable, and also makes me wonder why the NSA seems comfortable with such a scenario in practice."

Callaway is referring to the US National Security Agency (NSA), which is where Smalley works. But, as Smalley points out, adding TXT doesn't really change anything: "And you were already dependent on Intel for correct operation of their hardware. Nothing new to see here, move along..."

Red Hat's James Morris, who seems a bit surprised that the TXT code made it into the kernel without any ACKs from the security subsystem folks, is also a bit concerned about the secrecy surrounding the code: "I really hope the secrecy of the AC module is not part of its security design." He also noted that bugs in the SINIT AC were recently used to break TXT, but he doesn't see any technical barriers to enabling it in the Fedora kernel. The security of TXT is not reliant on "keeping the SINIT module closed source", according to Smalley, but Intel "adamantly" refuses to open source it, Callaway said.

It's not clear why Intel is being so secretive, nor why there isn't support for other signing keys on AC modules. That, at least, would allow others to potentially create alternative AC modules. Intel may believe that "security through obscurity" will help prevent exploits, though there is good reason to believe that it won't—and hasn't.

No conclusion was reached in the thread, though one would guess that Callaway's opinions would carry a fair amount of weight. Had Intel originally placed SINIT AC in the BIOS, rather than providing it as a separate—and separately upgradable—component, it seems likely that this issue would not have reared its head. Certainly users who really want TXT support can build their own kernels, as was suggested, but then they will be on their own for support. That may not be much of an issue for Fedora users, who don't have much of a support plan beyond what the distribution provides, but it will affect RHEL users—and that may be the real target of this effort.

Depending on hardware vendors for security solutions is not without pitfalls, but we are already dependent on them for the correct functioning of our systems, which includes security. It's a question of how far one wants to follow the rabbit hole. Until there are fully free hardware solutions, there will always be hardware dependencies. It's hard to imagine that RHEL, at least, doesn't get TXT support at some point; Fedora would make a good testbed for that support.

Comments (13 posted)

Brief items

Microsoft: Google Chrome doesn't respect your privacy (ars technica)

Ars technica reports on a Microsoft offensive against Google's Chrome browser, which is contained in a video presentation by IE product manager Pete LePage. While some of the complaints are, perhaps unsurprisingly, disingenuous, there is a real privacy issue in the way that Chrome handles the address bar. With only one box to type in, Chrome sends all keystrokes, even when typing a URL, to the search provider, which potentially leaks information about which sites are being visited. "It's worth taking a closer look at LePage's first accusation. Even though he didn't really elaborate, the reason for the striking difference for IE8's and Chrome's behaviors is really that simple: IE8 has two boxes and Chrome has one. LePage makes an important mistake in his accusation against Google: his statement should not be 'Chrome sends a request back to Google' but it should be 'Chrome sends a request back to the search provider.'"

Comments (20 posted)

Unknown root certificate in Firefox

The Mozilla project has disclosed that Firefox currently contains a root certificate authority that nobody knows anything about. "I have not been able to find the current owner of this root. Both RSA and VeriSign have stated in email that they do not own this root. Therefore, to my knowledge this root has no current owner and no current audit, and should be removed from NSS." It seems past time for the user community to start paying more attention to the root certificates accepted by our browsers.

Comments (9 posted)

Mozilla "unknown root certificate" followup

Here's a post from the Mozilla Security Blog explaining what was going on with the mysterious root certificate accepted by Mozilla. "The confusion stems from a comment made in the newsgroup threads discussing the removal which suggested that the root didn't have a current owner. We know where the root came from, it was added at RSA's request several years ago and vetted according to our inclusion guidelines." A look at the original discussion shows that they only (re)verified the origin of that certificate on April 6; prior to that, nobody was really sure.

Comments (none posted)

ClamAV 0.94.x end of life announcement

The ClamAV developers have sent out a reminder that the end is near for version 0.94.x - and they really mean the end: "This is a reminder that starting from 15 April 2010 our CVD will contain a special signature which disables all clamd installations older than 0.95 - that is to say older than 1 year." Time for anybody who has not yet upgraded to do it.

Full Story (comments: 1)

New vulnerabilities

firefox: multiple vulnerabilities

Package(s):firefox CVE #(s):CVE-2010-0173 CVE-2010-0181
Created:April 1, 2010 Updated:June 14, 2010
Description:

From the Mozilla advisories: [1, 2]

CVE-2010-0173: Mozilla developers identified and fixed several stability bugs in the browser engine used in Firefox and other Mozilla-based products. Some of these crashes showed evidence of memory corruption under certain circumstances, and we presume that with enough effort at least some of these could be exploited to run arbitrary code.

CVE-2010-0181: phpBB developer Henry Sudhof reported that when an image tag points to a resource that redirects to a mailto: URL, the external mail handler application is launched. This issue poses no security threat to users but could create an annoyance when browsing a site that allows users to post arbitrary images.

Alerts:
Mandriva MDVSA-2010:070-1 2010-04-20
SuSE SUSE-SR:2010:013 2010-06-14
Mandriva MDVSA-2010:070 2010-04-13
SuSE SUSE-SA:2010:021 2010-04-14
Ubuntu USN-921-1 2010-04-09
Fedora FEDORA-2010-5515 2010-04-01
Fedora FEDORA-2010-5840 2010-04-03
Slackware SSA:2010-090-03 2010-04-01
Slackware SSA:2010-090-02 2010-04-01
Fedora FEDORA-2010-5506 2010-04-01
Fedora FEDORA-2010-5539 2010-04-01
Fedora FEDORA-2010-5526 2010-04-01
Gentoo 201301-01 2013-01-07

Comments (none posted)

gnome-screensaver: unauthorized access

Package(s):gnome-screensaver CVE #(s):CVE-2010-0732
Created:April 7, 2010 Updated:May 27, 2010
Description: Hitting the "return" key repeatedly can cause an X error, causing gnome-screensaver to exit.
Alerts:
Mandriva MDVSA-2010:109 2010-05-27
SuSE SUSE-SR:2010:008 2010-04-07

Comments (none posted)

hamlib: arbitrary code execution

Package(s):hamlib CVE #(s):CVE-2009-3736
Created:April 5, 2010 Updated:April 7, 2010
Description: From the Red Hat bugzilla:

CERT reported a vulnerability in libltdl (part of libtool) where it could, in some cases, load and execute code from a library in the current directory (or the system's shared library search path) instead of the library that was requested with an absolute path. Systems which don't enforce specific naming for loadable objects, or which search for loadable objects in insecure directories (such as the current working directory), or don't require that loadable objects be signed in some way or have their execute bits set, are particularly vulnerable, and are trivial to exploit via an uploaded file.

Alerts:
Fedora FEDORA-2010-4352 2010-03-13
Fedora FEDORA-2010-4407 2010-03-13

Comments (none posted)

horde: cross-site scripting

Package(s):horde CVE #(s):CVE-2008-3824
Created:April 1, 2010 Updated:April 7, 2010
Description:

From the Red Hat bugzilla entry:

oCERT reported an XSS vulnerability discovered by Alexios Fakos affecting horde:

Horde relies on code similar to Popoon's externalinput.php to filter out potential XSS attacks on user-supplied input. This filter, and the original, fail to fully sanitize user data. In particular, this filter fails to protect against '/'s acting as spaces in both Microsoft Internet Explorer and Mozilla Firefox.

For example, the following snippet, supplied by the reporter, is treated as valid by the browsers but safe by the filter: <body/onload=alert(/w00w00/)>
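
To see how a slash can slip past a filter of this kind, consider the following minimal Python sketch. The function names and both regular expressions are hypothetical, written only for illustration; they are not the code used by Horde or Popoon. The first pattern assumes that an event-handler attribute is always preceded by whitespace, which is the assumption the reported snippet violates.

```python
import re

# Hypothetical filter pattern: flags event-handler attributes (onload=,
# onclick=, ...) only when whitespace precedes them inside a tag.
WHITESPACE_ONLY = re.compile(r"<[^>]*\s+on\w+\s*=", re.IGNORECASE)

# Hypothetical stricter variant: also accepts '/' as an attribute separator.
SLASH_AWARE = re.compile(r"<[^>]*[\s/]+on\w+\s*=", re.IGNORECASE)


def naive_filter_flags(html: str) -> bool:
    """Return True if the whitespace-only pattern considers the input dangerous."""
    return WHITESPACE_ONLY.search(html) is not None


def slash_aware_filter_flags(html: str) -> bool:
    """Return True if the slash-aware pattern considers the input dangerous."""
    return SLASH_AWARE.search(html) is not None


reported = "<body/onload=alert(/w00w00/)>"   # snippet quoted in the report
conventional = "<body onload=alert(1)>"      # the form a naive filter expects
```

Here naive_filter_flags(conventional) is true while naive_filter_flags(reported) is false, so the reported snippet passes through untouched. The slash-aware variant catches this one case, but pattern-based blocklists of this sort tend to remain fragile.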

Alerts:
Fedora FEDORA-2010-5520 2010-04-01
Fedora FEDORA-2010-5483 2010-04-01

Comments (none posted)

ikiwiki: cross-site scripting

Package(s):ikiwiki CVE #(s):
Created:April 1, 2010 Updated:April 7, 2010
Description:

From the Red Hat bugzilla entry:

Ivan Shmakov pointed out that the htmlscrubber allowed data:image/* urls, including data:image/svg+xml. But svg can contain javascript, so that is unsafe.

This hole was discovered on 12 March 2010 and fixed the same day with the release of ikiwiki 3.20100312. A fix was also backported to Debian etch, as version 2.53.5. I recommend upgrading to one of these versions if your wiki can be edited by third parties.

Alerts:
Fedora FEDORA-2010-4933 2010-03-23
Fedora FEDORA-2010-4884 2010-03-23

Comments (none posted)

imlib2: arbitrary code execution

Package(s):imlib2 CVE #(s):CVE-2008-6079
Created:April 6, 2010 Updated:July 2, 2010
Description: From the Debian advisory:

It was discovered that imlib2, a library to load and process several image formats, did not properly process various image file types. Several heap and stack based buffer overflows - partly due to integer overflows - in the ARGB, BMP, JPEG, LBM, PNM, TGA and XPM loaders can lead to the execution of arbitrary code via crafted image files.

Alerts:
Mandriva MDVSA-2010:127 2010-07-02
Debian DSA-2029-1 2010-04-05

Comments (none posted)

java-1.6.0-sun: multiple vulnerabilities

Package(s):java-1.6.0-sun CVE #(s):CVE-2010-0082 CVE-2010-0084 CVE-2010-0085 CVE-2010-0087 CVE-2010-0088 CVE-2010-0089 CVE-2010-0090 CVE-2010-0091 CVE-2010-0092 CVE-2010-0093 CVE-2010-0094 CVE-2010-0095 CVE-2010-0837 CVE-2010-0838 CVE-2010-0839 CVE-2010-0840 CVE-2010-0841 CVE-2010-0842 CVE-2010-0843 CVE-2010-0844 CVE-2010-0845 CVE-2010-0846 CVE-2010-0847 CVE-2010-0848 CVE-2010-0849
Created:April 1, 2010 Updated:September 21, 2010
Description:

From the Red Hat advisory. The first number is a reference to the Red Hat bugzilla bug number.

575736 - CVE-2010-0082 OpenJDK Loader-constraint table allows arrays instead of only the base-classes (6626217)

575740 - CVE-2010-0084 OpenJDK Policy/PolicyFile leak dynamic ProtectionDomains. (6633872)

575747 - CVE-2010-0085 OpenJDK File TOCTOU deserialization vulnerability (6736390)

575755 - CVE-2010-0088 OpenJDK Inflater/Deflater clone issues (6745393)

575756 - CVE-2010-0091 OpenJDK Unsigned applet can retrieve the dragged information before drop action occurs(6887703)

575760 - CVE-2010-0092 OpenJDK AtomicReferenceArray causes SIGSEGV -> SEGV_MAPERR error (6888149)

575764 - CVE-2010-0093 OpenJDK System.arraycopy unable to reference elements beyond Integer.MAX_VALUE bytes (6892265)

575769 - CVE-2010-0094 OpenJDK Deserialization of RMIConnectionImpl objects should enforce stricter checks (6893947)

575772 - CVE-2010-0095 OpenJDK Subclasses of InetAddress may incorrectly interpret network addresses (6893954)

575775 - CVE-2010-0845 OpenJDK No ClassCastException for HashAttributeSet constructors if run with -Xcomp (6894807)

575808 - CVE-2010-0838 OpenJDK CMM readMabCurveData Buffer Overflow Vulnerability (6899653)

575818 - CVE-2010-0837 OpenJDK JAR "unpack200" must verify input parameters (6902299)

575846 - CVE-2010-0840 OpenJDK Applet Trusted Methods Chaining Privilege Escalation Vulnerability (6904691)

575854 - CVE-2010-0841 OpenJDK JPEGImageReader stepX Integer Overflow Vulnerability (6909597)

575865 - CVE-2010-0848 OpenJDK AWT Library Invalid Index Vulnerability (6914823)

575871 - CVE-2010-0847 OpenJDK ImagingLib arbitrary code execution vulnerability (6914866)

578430 - CVE-2010-0846 JDK unspecified vulnerability in ImageIO component

578432 - CVE-2010-0849 JDK unspecified vulnerability in Java2D component

578433 - CVE-2010-0087 JDK unspecified vulnerability in JWS/Plugin component

578436 - CVE-2010-0839 CVE-2010-0842 CVE-2010-0843 CVE-2010-0844 JDK multiple unspecified vulnerabilities

578437 - CVE-2010-0090 JDK unspecified vulnerability in JavaWS/Plugin component

578440 - CVE-2010-0089 JDK unspecified vulnerability in JavaWS/Plugin component

Alerts:
SUSE SUSE-SR:2010:017 2010-09-21
Red Hat RHSA-2010:0574-01 2010-07-29
Pardus 2010-59 2010-05-10
SuSE SUSE-SR:2010:011 2010-05-10
Red Hat RHSA-2010:0383-01 2010-04-29
Mandriva MDVSA-2010:084 2010-04-28
Fedora FEDORA-2010-6039 2010-04-09
Fedora FEDORA-2010-6025 2010-04-09
SuSE SUSE-SR:2010:008 2010-04-07
Ubuntu USN-923-1 2010-04-07
Red Hat RHSA-2010:0339-01 2010-03-31
Red Hat RHSA-2010:0338-01 2010-03-31
Red Hat RHSA-2010:0337-01 2010-03-31
Gentoo 201006-18 2010-06-04
SUSE SUSE-SA:2010:028 2010-07-06
SuSE SUSE-SA:2010:026 2010-07-01
Red Hat RHSA-2010:0489-01 2010-06-17
CentOS CESA-2010:0339 2010-06-12

Comments (none posted)

krb5: denial of service

Package(s):krb5 CVE #(s):CVE-2010-0629
Created:April 7, 2010 Updated:October 18, 2010
Description: The kadmind daemon contains a use-after-free vulnerability which can be exploited by a remote, authenticated user to cause a crash.
Alerts:
rPath rPSA-2010-0065-1 2010-10-17
CentOS CESA-2010:0343 2010-05-28
Pardus 2010-53 2010-04-20
Mandriva MDVSA-2010:071 2010-04-13
SuSE SUSE-SR:2010:009 2010-04-14
Debian DSA-2031-1 2010-04-11
Fedora FEDORA-2010-6108 2010-04-09
Ubuntu USN-924-1 2010-04-07
Red Hat RHSA-2010:0343-01 2010-04-06
Gentoo 201201-13 2012-01-23

Comments (none posted)

libnids, dsniff: remotely triggerable null pointer dereference

Package(s):dsniff, libnids CVE #(s):
Created:April 1, 2010 Updated:April 7, 2010
Description: libnids 1.24 (Mar 14 2010): - fixed another remotely triggerable NULL dereference in ip_fragment.c
Alerts:
Fedora FEDORA-2010-5535 2010-04-01
Fedora FEDORA-2010-5545 2010-04-01

Comments (none posted)

libnss-db: information disclosure and possible privilege escalation

Package(s):libnss-db CVE #(s):CVE-2010-0826
Created:April 1, 2010 Updated:May 28, 2010
Description:

From the Ubuntu advisory:

Stephane Chazelas discovered that libnss-db did not correctly set up a database environment. A local attacker could exploit this to read the first line of arbitrary files, leading to a loss of privacy and possibly privilege escalation.

Alerts:
CentOS CESA-2010:0347 2010-05-28
Fedora FEDORA-2010-6361 2010-04-10
Fedora FEDORA-2010-6331 2010-04-10
Mandriva MDVSA-2010:077 2010-04-17
Red Hat RHSA-2010:0347-01 2010-04-13
Ubuntu USN-922-1 2010-03-31

Comments (none posted)

mahara: SQL injection

Package(s):mahara CVE #(s):CVE-2010-0400
Created:April 7, 2010 Updated:April 7, 2010
Description: The mahara electronic portfolio system does not properly escape input when generating user names, enabling an SQL injection attack and the compromise of the database.
Alerts:
Debian DSA-2030-1 2010-04-06

Comments (none posted)

openssl: denial of service

Package(s):openssl CVE #(s):CVE-2010-0740
Created:April 1, 2010 Updated:April 20, 2010
Description:

From the CVE entry:

The ssl3_get_record function in ssl/s3_pkt.c in OpenSSL 0.9.8f through 0.9.8m allows remote attackers to cause a denial of service (crash) via a malformed record in a TLS connection that triggers a NULL pointer dereference, related to the minor version number. NOTE: some of these details are obtained from third party information.

Alerts:
Gentoo 201110-01 2011-10-09
Mandriva MDVSA-2010:076-1 2010-04-19
Mandriva MDVSA-2010:076 2010-04-15
Pardus 2010-46 2010-04-06
Slackware SSA:2010-090-01 2010-04-01

Comments (none posted)

pidgin-sipe: unspecified vulnerability

Package(s):pidgin-sipe CVE #(s):
Created:April 5, 2010 Updated:April 7, 2010
Description: See the comments to this update: The security update is "NTLMv2 and NTLMv2 Session Security support (pier11)" -- previously it only supported the insecure NTLMv1.
Alerts:
Fedora FEDORA-2010-4830 2010-03-20
Fedora FEDORA-2010-4848 2010-03-20

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel remains 2.6.34-rc3, but -rc4 can be expected at almost any time. Quite a few patches have gone in since -rc3; most are fixes, but there's also a 4200-file cleanup from Tejun Heo and a new driver for Chelsio T4-based Ethernet adapters.

Stable updates: Greg Kroah-Hartman has announced the release of four separate stable kernels: 2.6.27.46, 2.6.31.13, 2.6.32.11, and 2.6.33.2. These are fairly sizable updates, weighing in at 45, 89, 116, and 156 patches respectively (at review time anyway, a few patches may have been dropped). As usual, all users are strongly encouraged to upgrade. In addition, it sounds like stable updates for the 2.6.31 series are nearing their end, so users of that kernel should move to .32 or .33.

Comments (none posted)

Quotes of the week

4208 files changed, 3717 insertions(+), 717 deletions(-)
-- Tejun Heo casts a wide net for -rc4

I have two machines that show very different performance numbers. After digging a little I found out that the first machine has, in /proc/cpuinfo:

model name : Intel(R) Celeron(R) M processor 1.00GHz

while the other has:

model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz

and that seems to be the main difference. Now the problem is that /proc/cpuinfo is read only. Would it be possible to make /proc/cpuinfo writable so that I could do:

echo -n "model name : Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz" > /proc/cpuinfo

in the first machine and get a performance similar to the second machine?

-- Paulo Marques

It's probably buggy as hell, I don't dare try to actually boot the crap I write.
-- Linus Torvalds

Comments (4 posted)

A "live mode" for perf

By Jake Edge
April 7, 2010

The perf tracing tool has evolved quickly. When last we looked, Tom Zanussi had added Python and Perl scripting to perf. Next up would seem to be perf "live mode", where perf no longer requires two steps: record the data, then analyze. Live mode will allow perf trace record and perf trace report to operate via a pipe, which allows instantaneous, as well as continuously updating (a la top), output.

So that no existing perf users need to change their scripts, Zanussi only added the new capabilities when perf recognizes that its record output is going to stdout or report input is coming from stdin. In that case, perf handles the data through a pipe, and uses special synthesized events to provide header information. This will also allow perf to operate over the network by piping its record output to netcat, and then reading it via netcat on another system and piping it into report.
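
The idea of carrying header information in-band can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the JSON-lines format, the record() and report() function names, and the event fields are invented and bear no relation to perf's real on-the-wire format. The point is only that, when the data flows through a pipe, the reader learns what it needs from a synthesized first event rather than from a file header.

```python
import json
import os


def record(write_fd, samples):
    """Writer side: emit a synthesized header event, then one event per sample."""
    with os.fdopen(write_fd, "w") as out:
        # Hypothetical header event; the reader cannot seek in a pipe, so the
        # metadata has to arrive as part of the stream itself.
        header = {"type": "header", "event": "raw_syscalls:sys_enter"}
        out.write(json.dumps(header) + "\n")
        for syscall_id in samples:
            out.write(json.dumps({"type": "sample", "id": syscall_id}) + "\n")


def report(read_fd):
    """Reader side: consume the header first, then count samples by syscall id."""
    counts = {}
    with os.fdopen(read_fd) as inp:
        header = json.loads(inp.readline())
        for line in inp:
            event = json.loads(line)
            counts[event["id"]] = counts.get(event["id"], 0) + 1
    return header, counts


# A tiny stream fits in the pipe buffer, so both ends can run in one process.
read_fd, write_fd = os.pipe()
record(write_fd, [0, 1, 1, 3, 1])
header, counts = report(read_fd)
```

In a real deployment the two ends would be separate processes (or separate machines joined by netcat), with the reader updating its output as events arrive.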

All of the scripts that are installed in the standard perf location (i.e. those which are listed in perf trace -l) are automatically able to be run in live mode:

  $ perf trace syscall-counts
will run both ends of the syscall-counts script with a pipe in between, a more usable shorthand for:
  $ perf trace record syscall-counts -o - | perf trace report syscall-counts -i -
which itself is shorthand for:
  perf record -c 1 -f -a -M -R -e raw_syscalls:sys_enter -o - | \
  perf trace -i - -s ~/libexec/perf-core/scripts/python/syscall-counts.py

Zanussi also included several sample top-style scripts that can be used to monitor read/write or system call activity updated every three seconds. It looks to be a very useful addition to perf, which is rapidly becoming the "swiss army knife" of kernel monitoring.

Comments (4 posted)

The NO_BOOTMEM patches

By Jonathan Corbet
April 7, 2010
Every kernel development cycle seems to involve one set of patches which turn out to be more trouble than had been expected. With 2.6.34, that award should probably go to the patches found under the somewhat confusing CONFIG_NO_BOOTMEM option.

"Bootmem" is a simple, low-level memory allocator used by the kernel during the early parts of the bootstrap process. One might think that the kernel does not need yet another allocator, but the memory management code used during operation requires that much of the kernel already be functional before it can be called. Getting to that point involves a chain of increasingly complicated memory allocation mechanisms; on the x86 architecture, those begin with the "early_res" mechanism which takes over from the BIOS "e820" facility. Once things get a little farther, the architecture-independent bootmem allocator takes over, followed, eventually, by the full buddy allocator.

Yinghai Lu came to the conclusion that things could be simplified considerably if the bootmem stage were taken out of the picture. The result was a series of patches which extends the use of the early_res mechanism for long enough to bootstrap the buddy allocator. These changes were merged for 2.6.34, but the old bootmem-based code was left behind. The CONFIG_NO_BOOTMEM option controls which allocator is used, with the default being to short out bootmem.

This is a significant change to the crucial and tricky early bootstrap code, so few people were surprised when some regressions were reported against 2.6.34-rc1. When the reports continued to arrive after -rc3, though, the level of irritation began to grow, to the point that Linus started talking about reverting the whole thing. Nobody seemed to dislike the objectives of the patches, but system-killer regressions after -rc3, along with the twisted mess of #ifdefs created by the patch and the fact that it was on by default, led to some grumpiness.

Normally, new features are expected to be configured out by default; to the greatest extent possible, a new kernel should behave like its predecessors when the default options are taken. In this case, the default led to significant changes and problems. The purpose of this option was twofold: to allow the new code to be configured out when it proved to be problematic, and to ensure that it was well tested in the mean time. Certainly it was successful on both fronts, even if some of the testers proved to be not entirely willing.

As of this writing, it would appear that the worst problems have been fixed; talk of removing the no-bootmem code has subsided. Eventually, perhaps, all architectures will make similar changes and the bootmem code can be removed entirely. Meanwhile, Yinghai has a new set of changes on the horizon for 2.6.35: replacing the early_res code with the "logical memory block" allocator currently used by some other architectures. That change looks even more disruptive than the bootmem elimination was.

Comments (3 posted)

Kernel development news

Memory management for virtualization

By Jonathan Corbet
April 7, 2010
For some time now, your editor has asserted that, at the kernel level, the virtualization problem is mostly solved. Much of the remaining work is in the performance area. That said, making virtualized systems perform well is not a small or trivial problem. One of the most interesting aspects of this problem is in the interaction between virtualized guests and host memory management. A couple of patch sets under discussion illustrate where the work in this area is being done.

The transparent huge pages patch set was discussed here back in October. This patch seeks to change how huge pages are used by Linux applications. Most current huge page users must be set up explicitly to use huge pages, which, in turn, must be set aside by the system administrator ahead of time; see the recent series by Mel Gorman for more information on how this is done. The "some assembly required" nature of huge pages limits their use in many situations.

The transparent huge page patch, instead, works to provide huge pages to applications without those applications even being aware that such pages exist. When large pages are available, applications may have their scattered pages joined together into huge pages automatically; those pages can also be split back apart when the need arises. When the system operates in this mode, huge pages can be used in many more situations without the need for application or administrator awareness. This feature turns out to be especially beneficial when running virtualized guests; huge pages map well to how guests tend to see and use their address spaces.

The transparent huge page patches have been working their way toward acceptance, though it should be noted that some developers still have complaints about this work. Andrew Morton recently pointed out a different problem with this patch set:

It appears that these patches have only been sent to linux-mm. Linus doesn't read linux-mm and has never seen them. I do think we should get things squared away with him regarding the overall intent and implementation approach before trying to go further... [T]his is a *large* patchset, and it plays in an area where Linus is known to have, err, opinions.

It didn't take long for Linus to join the conversation directly; after a couple of digressions into areas not directly related to the benefits of the transparent huge pages patch, he realized that this work was motivated by the needs of virtualization. At that point, he lost interest:

So I thought it was a more interesting load than it was. The virtualization "TLB miss is expensive" load I can't find it in myself to care about. "Get a better CPU" is my answer to that one.

He went on to compare the transparent huge page work to high memory, which, in turn, he called "a failure". The right solution in both cases, he says, is to get a better CPU.

It should be pointed out that high memory was a spectacularly successful failure, extending the useful life of 32-bit systems for some years. It still shows up in surprising places - your editor's phone is running a high-memory-enabled kernel. So calling high memory a failure is something like calling the floppy driver a failure; it may see little use now, but there was a time when we were glad we had it.

Perhaps, someday, advances in processor architecture will make transparent huge pages unnecessary as well. But, while the alternative to high memory (64-bit processors) has been in view for a long time, it's not at all clear what sort of processor advance might make transparent huge pages irrelevant. So, should this code get into the kernel, it may well become one of those failures which is heavily used for many years.

A related topic under discussion was the recently-posted VMware balloon driver patch. A balloon driver has an interesting task; its job is to "inflate" within a guest system, taking up memory and making it unavailable for processes running within the guest. The pages absorbed by the balloon can then be released back to the host system which, presumably, has a more pressing need for them elsewhere. Letting "air" out of the balloon makes memory available to the guest once again.
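
The accounting behind ballooning can be sketched in a few lines of Python. This is a toy model under stated assumptions: the Guest and Host classes, their method names, and the page counts are all hypothetical, and nothing here reflects the interface of the VMware driver or of any hypervisor.

```python
class Host:
    """Toy host that tracks how many pages it can hand out."""

    def __init__(self, free_pages):
        self.free_pages = free_pages


class Guest:
    """Toy guest whose balloon can absorb some of its own memory."""

    def __init__(self, total_pages):
        self.total_pages = total_pages
        self.balloon_pages = 0

    @property
    def usable_pages(self):
        # Pages held by the balloon are unavailable to guest processes.
        return self.total_pages - self.balloon_pages

    def inflate(self, host, pages):
        """Grow the balloon and release the absorbed pages to the host."""
        pages = min(pages, self.usable_pages)
        self.balloon_pages += pages
        host.free_pages += pages
        return pages

    def deflate(self, host, pages):
        """Let air out: take pages back from the host for guest use."""
        pages = min(pages, self.balloon_pages, host.free_pages)
        self.balloon_pages -= pages
        host.free_pages -= pages
        return pages


host = Host(free_pages=100)
guest = Guest(total_pages=1000)
guest.inflate(host, 300)    # host is short of memory: squeeze the guest
after_inflate = (guest.usable_pages, host.free_pages)
guest.deflate(host, 100)    # pressure eases: give some memory back
after_deflate = (guest.usable_pages, host.free_pages)
```

What the model leaves out is the hard part: deciding how many pages to ask for, and getting the guest to give them up without hurting its workload.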

The purpose of this driver, clearly, is to allow the host to dynamically balance the memory needs of its guest systems. It's a bit of a blunt instrument, but it's the best we have. But Andrew Morton questioned the need for a separate memory control mechanism. The kernel already has a function, called shrink_all_memory(), which can be used to force the release of memory. This function is currently used for hibernation, but Andrew suspects that it could be adapted to the needs of virtualization as well.

Whether that is really true remains to be seen; it seems that the bulk of the complexity lies not with the freeing of memory but in the communication between the guest and the hypervisor. Beyond that, the longer-term solution is likely to be something more sophisticated than simply applying memory pressure and watching the guest squirm until it releases enough pages. As Dan Magenheimer put it:

Historically, all OS's had a (relatively) fixed amount of memory and, since it was fixed in size, there was no sense wasting any of it. In a virtualized world, OS's should be trained to be much more flexible as one virtual machine's "waste" could/should be another virtual machine's "want".

His answer to this problem is the transcendent memory patch, which allows the operating system to designate memory which is available for the taking should the need arise, but which can contain useful data in the mean time.

This is clearly an area that needs further work. The whole point of virtualization is to isolate guests from each other, but a more cooperative approach to memory requires that these guests, somehow, be aware of the level of contention for resources like memory and respond accordingly. Like high memory and transparent huge pages, balloon drivers may eventually be consigned to the pile of failed technologies. Until something better comes along, though, we'll still need them.

Comments (13 posted)

Receive flow steering

By Jake Edge
April 7, 2010

Today's increasing bandwidth and faster networking hardware have made it difficult for a single CPU to keep up. Multiple cores and packages have helped matters on the transmit side, but the receive side is trickier. Tom Herbert's receive packet steering (RPS) patches, which we looked at back in November, provide a way to steer packets to particular CPUs based on a hash of the packet's protocol data. Those patches were applied to the network subsystem tree and are bound for 2.6.35, but now Herbert is back with an enhancement to RPS that will attempt to steer packets to the CPU on which the receiving application is running: receive flow steering (RFS).

RFS uses the RPS hash table to store the CPU of an application when it calls recvmsg() or sendmsg(). Instead of picking an arbitrary CPU based on the hash and a CPU mask optionally set by an administrator, as RPS does, RFS tries to use the CPU where the receiving application is running. Based on the hash calculated on the incoming packet, RFS can look up the "proper" CPU and assign the packet there.
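
The difference between the two selection policies can be sketched in Python. This is a rough model, not the kernel's code: the CRC32 hash stands in for whatever hash the kernel computes, the modulo mapping onto the allowed CPUs is an assumption, and the names flow_hash(), rps_cpu(), note_app_cpu(), rfs_cpu(), and TABLE_SIZE are invented for the example.

```python
import zlib

TABLE_SIZE = 256          # hypothetical size of the flow table
sock_flow_table = {}      # table index -> CPU where the application last ran


def flow_hash(src_ip, src_port, dst_ip, dst_port):
    """Stand-in flow hash over the packet's addresses and ports."""
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key)


def rps_cpu(h, allowed_cpus):
    """RPS-style choice: an arbitrary but stable CPU from the allowed set."""
    return allowed_cpus[h % len(allowed_cpus)]


def note_app_cpu(h, cpu):
    """Record the application's CPU, as would happen on recvmsg()/sendmsg()."""
    sock_flow_table[h % TABLE_SIZE] = cpu


def rfs_cpu(h, allowed_cpus):
    """RFS-style choice: prefer the application's CPU, else fall back to RPS."""
    cpu = sock_flow_table.get(h % TABLE_SIZE)
    return rps_cpu(h, allowed_cpus) if cpu is None else cpu


cpus = [0, 1, 2, 3]
h = flow_hash("192.0.2.1", 40000, "192.0.2.2", 80)
before = rfs_cpu(h, cpus)   # no application has touched this flow yet
note_app_cpu(h, 3)          # the application calls recvmsg() on CPU 3
after = rfs_cpu(h, cpus)    # packets now follow the application
```

Because the table is indexed by a hash, unrelated flows can share an entry; the sketch ignores that, as well as the ordering problem described next.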

The RPS CPU masks, which can be set via sysfs for each device (and queue for devices with multiple queues), represent the allowable CPUs to assign for a packet. But dynamically changing those values introduces the possibility of out-of-order packets. For RPS, with largely static CPU masks, it was not necessarily a big problem. For RFS, however, multiple threads trying to read from the same socket, while potentially bouncing around to different CPUs, would cause the CPU value in the hash table to change frequently, thus increasing the likelihood of out-of-order packets.

For RFS, that was considered to be a "non-starter", Herbert said, so a different approach was required. To eliminate the out-of-order packets, two types of hash tables are created, both indexed by the hash calculated from the packet information. The global rps_sock_flow_table is populated by the recvmsg() or sendmsg() call with the CPU number where the application is running (this is called the "desired" CPU). Each device queue then gets a rps_dev_flow_table which contains the most recent CPU used to handle packets for that connection (which is called the "current" CPU). In addition, the value of the tail queue counter for the current CPU's backlog queue is stored in the rps_dev_flow_table entry.

The two CPU values are compared when deciding which CPU to process the packet on (which is done in get_rps_cpu()). If the current CPU (as determined from the rps_dev_flow_table hash table) is unset (presumably for the first packet) or that CPU is offline, the desired CPU (from rps_sock_flow_table) is used. If the two CPU values are the same, obviously, that CPU is used. But if they are both valid CPU numbers, but different, the backlog tail queue counter is consulted.

Backlog queues have a queue head counter that gets incremented when packets are removed from the queue. Using that and the queue length, a queue tail counter value can be calculated. That is what gets stored in rps_dev_flow_table. When the kernel makes its decision about which CPU to assign the packet to, it needs to consider both the current (really last used by the kernel) CPU and the desired (last used by an application for sending or receiving) CPU.

The kernel compares the current CPU's queue tail counter (as stored in the hash table) with that CPU's queue head counter. If the tail counter is less than or equal to the head counter, that means that all packets that were put on the queue by this connection have been processed. That in turn means that switching to the desired CPU will not result in out-of-order packets.
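
The decision just described can be modeled in a short Python sketch. It is a simplification built on assumptions: the Backlog and DevFlow classes and the choose_cpu() and steer() functions are hypothetical names, the desired CPU is assumed to be valid, and none of the kernel's locking or per-CPU details are represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Backlog:
    """Toy per-CPU backlog queue, represented only by its counters."""
    head: int = 0          # incremented each time a packet is dequeued
    length: int = 0        # packets currently waiting
    online: bool = True

    @property
    def tail(self):
        # Tail counter derived from the head counter and the queue length.
        return self.head + self.length

    def enqueue(self):
        self.length += 1
        return self.tail   # position just past the newly queued packet

    def dequeue(self):
        if self.length:
            self.length -= 1
            self.head += 1


@dataclass
class DevFlow:
    """Toy per-flow entry: the current CPU and the tail counter saved there."""
    cpu: Optional[int] = None
    last_tail: int = 0


def choose_cpu(flow, desired, backlogs):
    current = flow.cpu
    if current is None or not backlogs[current].online:
        return desired     # first packet, or the current CPU went away
    if current == desired:
        return current
    if flow.last_tail <= backlogs[current].head:
        return desired     # everything queued for this flow was processed
    return current         # packets still pending: switching could reorder


def steer(flow, desired, backlogs):
    cpu = choose_cpu(flow, desired, backlogs)
    flow.cpu = cpu
    flow.last_tail = backlogs[cpu].enqueue()
    return cpu


backlogs = {0: Backlog(), 1: Backlog()}
flow = DevFlow()
first = steer(flow, 0, backlogs)    # application running on CPU 0
second = steer(flow, 1, backlogs)   # application moved; CPU 0 still has work
backlogs[0].dequeue()
backlogs[0].dequeue()               # CPU 0 drains both queued packets
third = steer(flow, 1, backlogs)    # now safe to follow the application
```

In this run the flow stays on CPU 0 for the second packet, even though the application has moved, and migrates to CPU 1 only once CPU 0's backlog has drained past the saved tail value.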

Herbert's current patch is for TCP, but RFS should be "usable for other flow oriented protocols". The benefit is that it can achieve better CPU locality for the processing of the packet, both by the kernel, and the application itself. Depending on various factors—cache hierarchy and application are given as examples—it can and does increase the packets per second that can be processed as well as lowering the latency before a packet gets processed. But, interestingly, "on simple benchmarks, we don't necessarily see improvement and sometimes see degradation".

For more complex benchmarks, the performance increase looks to be significant. Herbert gave numbers for a netperf run where the transactions per second went from 104K without either RFS or RPS, to 290K for the best RPS configuration, and to 303K with RFS and RPS. A different test, with 100 threads handling an RPC-like request/response with some user-space work being done, was even more dramatic. That test showed 103K, 174K, and 223K respectively, but also showed a marked decrease in the latency for both RPS and RPS + RFS.

These patches are coming from Google, which has been known to process a few packets using the Linux kernel. If RFS is being used on production systems at Google, that would seem to bode well for its reliability and performance beyond just benchmarks. The patches were posted April 2, and seemed to be generally well-received, so it's a little early to tell when they might make it into the mainline. But it seems rather likely that we will see them in either 2.6.35 or 2.6.36.

The padata parallel execution mechanism

By Jonathan Corbet
April 6, 2010
One day, Andrew Morton was happily reading linux-kernel when he encountered a patch fixing a minor problem with the "padata" code. Andrew, it seems, had never heard of padata, which was merged during the 2.6.34 merge window. So he asked: "OK, on behalf of thousands I ask: what the heck is kernel/padata.c?" On behalf of those same thousands, your editor set out to learn what this new bit of core kernel code does and how to use it.

In short: padata is a mechanism by which the kernel can farm work out to be done in parallel on multiple CPUs while retaining the ordering of tasks. It was developed for use with the IPsec code, which needs to be able to perform encryption and decryption on large numbers of packets without reordering those packets. The crypto developers made a point of writing padata in a sufficiently general fashion that it could be put to other uses as well, but that requires knowing that the API is there and how to use it. Unfortunately, they made a bit less of a point of updating the documentation directory.

The first step in using padata is to set up a padata_instance structure for overall control of how tasks are to be run:

    #include <linux/padata.h>

    struct padata_instance *padata_alloc(const struct cpumask *cpumask,
                                         struct workqueue_struct *wq);

The cpumask describes which processors will be used to execute work submitted to this instance. The workqueue wq is where the work will actually be done; it should be a multithreaded queue, naturally.

There are functions for enabling and disabling the instance:

    void padata_start(struct padata_instance *pinst);
    void padata_stop(struct padata_instance *pinst);

These functions literally do nothing beyond setting or clearing the "padata_start() was called" flag; if that flag is not set, other functions will refuse to work. There must be some perceived value in this functionality, but the only current padata user (crypto/pcrypt.c) does not make use of it. So padata_start() looks like one of those exercises in pointless bureaucracy that we all have to cope with sometimes.

The list of CPUs to be used can be adjusted with these functions:

    int padata_set_cpumask(struct padata_instance *pinst,
                           cpumask_var_t cpumask);
    int padata_add_cpu(struct padata_instance *pinst, int cpu);
    int padata_remove_cpu(struct padata_instance *pinst, int cpu);

Changing the CPU mask has the look of an expensive operation, though, so it probably should not be done with great frequency.

Actually submitting work to the padata instance requires the creation of a padata_priv structure:

    struct padata_priv {
        /* Other stuff here... */
        void (*parallel)(struct padata_priv *padata);
        void (*serial)(struct padata_priv *padata);
    };

This structure will almost certainly be embedded within some larger structure specific to the work to be done. Most of its fields are private to padata, but the structure should be zeroed at initialization time, and the parallel() and serial() functions should be provided. Those functions will be called in the process of getting the work done, as we will see momentarily.

The submission of work is done with:

    int padata_do_parallel(struct padata_instance *pinst,
                           struct padata_priv *padata, int cb_cpu);

The pinst and padata structures must be set up as described above; cb_cpu specifies which CPU will be used for the final callback when the work is done; it must be in the current instance's CPU mask. The return value from padata_do_parallel() is a little strange; zero is an error return indicating that the caller forgot the padata_start() formalities. -EBUSY means that somebody, somewhere else is messing with the instance's CPU mask, while -EINVAL is a complaint about cb_cpu not being in that CPU mask. If all goes well, this function will return -EINPROGRESS, indicating that the work is in progress.
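
Since success is reported as -EINPROGRESS and zero is an error, a caller may find it helpful to decode the result explicitly. The following is a small user-space sketch of that mapping as described above; the submit_status enumeration and classify_submit() are hypothetical helpers, not part of the padata API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of padata_do_parallel() return values,
 * following the description in this article (2.6.34-rc3 behavior). */
enum submit_status {
    SUBMIT_QUEUED,        /* -EINPROGRESS: the work was accepted */
    SUBMIT_NOT_STARTED,   /* 0: padata_start() was never called */
    SUBMIT_MASK_BUSY,     /* -EBUSY: the CPU mask is being changed */
    SUBMIT_BAD_CB_CPU,    /* -EINVAL: cb_cpu is not in the CPU mask */
    SUBMIT_UNKNOWN        /* anything else */
};

static enum submit_status classify_submit(int rc)
{
    switch (rc) {
    case -EINPROGRESS:
        return SUBMIT_QUEUED;
    case 0:
        return SUBMIT_NOT_STARTED;
    case -EBUSY:
        return SUBMIT_MASK_BUSY;
    case -EINVAL:
        return SUBMIT_BAD_CB_CPU;
    default:
        return SUBMIT_UNKNOWN;
    }
}
```

The point of the sketch is simply that the usual "zero means success" test would get this function exactly backward.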

Each task submitted to padata_do_parallel() will, in turn, be passed to exactly one call to the above-mentioned parallel() function, on one CPU, so true parallelism is achieved by submitting multiple tasks. The workqueue is used to actually make these calls, so parallel() runs in process context and is allowed to sleep. The parallel() function gets the padata_priv structure pointer as its lone parameter; information about the actual work to be done is probably obtained by using container_of() to find the enclosing structure.
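
The embedding pattern can be tried out in user space. In the sketch below, struct padata_priv is a mock containing only the two public callbacks, container_of() is a simplified stand-in for the kernel macro, and the checksum_job structure and its two functions are hypothetical examples of a caller's own work item.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Mock of the public fields only; the real structure has private ones too. */
struct padata_priv {
    void (*parallel)(struct padata_priv *padata);
    void (*serial)(struct padata_priv *padata);
};

/* Hypothetical work item with the padata_priv structure embedded in it. */
struct checksum_job {
    const unsigned char *data;
    size_t               len;
    unsigned int         sum;
    int                  done;
    struct padata_priv   padata;
};

/* Runs in parallel with other jobs: recover the job and do the work. */
static void checksum_parallel(struct padata_priv *padata)
{
    struct checksum_job *job = container_of(padata, struct checksum_job, padata);
    size_t i;

    job->sum = 0;
    for (i = 0; i < job->len; i++)
        job->sum += job->data[i];
    /* In the kernel, padata_do_serial(padata) would be called here. */
}

/* Runs later, in submission order: publish the result. */
static void checksum_serial(struct padata_priv *padata)
{
    struct checksum_job *job = container_of(padata, struct checksum_job, padata);

    job->done = 1;
}
```

Both callbacks receive only the padata_priv pointer, so everything else they need has to be reachable from the enclosing structure.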

Note that parallel() has no return value; the padata subsystem assumes that parallel() will take responsibility for the task from this point. The work need not be completed during this call, but, if parallel() leaves work outstanding, it should be prepared to be called again with a new job before the previous one completes. When a task does complete, parallel() (or whatever function actually finishes the job) should inform padata of the fact with a call to:

    void padata_do_serial(struct padata_priv *padata);

At some point in the future, padata_do_serial() will trigger a call to the serial() function in the padata_priv structure. That call will happen on the CPU requested in the initial call to padata_do_parallel(); it, too, is done through the workqueue, but with local software interrupts disabled. Note that this call may be deferred for a while since the padata code takes pains to ensure that tasks are completed in the order in which they were submitted.
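
The deferral can be pictured with a toy reorder buffer. This is a user-space illustration of the ordering guarantee only; the reorder structure and task_finished() are hypothetical and say nothing about how padata actually tracks outstanding work.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_TASKS 8

/* Hypothetical reorder buffer, indexed by submission sequence number. */
struct reorder {
    bool finished[MAX_TASKS];  /* parallel work for task i has completed */
    int  next;                 /* next sequence number allowed to serialize */
    int  order[MAX_TASKS];     /* sequence numbers in the order they serialized */
    int  nr_serialized;
};

/* Mark task 'seq' as finished, then serialize every task that is now
 * unblocked, strictly in submission order. */
static void task_finished(struct reorder *r, int seq)
{
    assert(seq >= 0 && seq < MAX_TASKS);
    r->finished[seq] = true;

    while (r->next < MAX_TASKS && r->finished[r->next]) {
        r->order[r->nr_serialized++] = r->next;
        r->next++;
    }
}
```

If task 2 finishes first, nothing is serialized; once tasks 0 and 1 finish as well, the serial step runs for 0, 1, and 2 in that order.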

The one remaining function in the padata API should be called to clean up when a padata instance is no longer needed:

    void padata_free(struct padata_instance *pinst);

This function will busy-wait while any remaining tasks are completed, so it might be best not to call it while there is work outstanding. Shutting down the workqueue, if necessary, should be done separately.

The API as described above is what can be found in the 2.6.34-rc3 kernel. As was seen back at the beginning of this article, padata is just coming into more general awareness, and some developers are asking questions about the API. So changes are possible - but, then, that is true of any internal kernel interface.

Page editor: Jonathan Corbet

Distributions

News and Editorials

The role of the Debian ftpmasters

April 2, 2010

This article was contributed by Joe 'Zonker' Brockmeier.

Linux distributions don't simply appear on mirrors and BitTorrent networks fully formed. A great deal of work goes on behind the scenes before a release sees the light of day. Linux users who aren't involved in the production of a Linux distribution may not fully appreciate all of that work. Take, for example, the work done by Debian's ftpmasters team.

While it's obvious, or should be, that someone has to actually create the packages that go into Debian and one assumes that there is Quality Assurance (QA) and so forth, the ftpmasters team is largely invisible to users. This was highlighted recently by Joerg Jaspert's call for new volunteers for the ftpmaster team.

The essence of the job is maintaining the Debian archive, accepting new packages (NEW), maintaining the scripts for processing incoming packages, and pulling packages when asked by QA, among other things. The ftpmasters also move packages from testing to stable when the time comes, though the decision to do this is made by the release managers team. The canonical description of the ftpmaster job is more detailed (unfortunately ftpmaster.debian.org was down at the time this article was written), but the basic gist is that the team deals with new packages and keeps the archive going.

The job is unique to Debian, in that other distributions don't have an exact analog to the ftpmasters team. openSUSE, for example, has an autobuild team to ensure package quality, and Ciaran Farrell looks over licenses when there's a question. The various teams decide what packages do or don't go in, and if there's a dispute at some point it can be handled by the release manager and product manager for openSUSE.

Debian is, as Jaspert alluded to, "not getting smaller" and managing the number of new packages is a "kind of Sisyphean task." The Debian archive contains thousands of packages, and the NEW queue can have hundreds of packages awaiting approval. NEW packages are those entering Debian for the first time, which do not have source packages in the archive, or those adding new binary packages. New versions of existing packages are moved automatically into the pool.

Currently, there are fewer actual "masters" than Jaspert would like, which is to say only Jaspert and Mark Hymers are currently serving as ftpmasters. The team also has six assistants at the moment: Barry deFreese, Chris Lamb, Frank Lichtenheld, Mike O'Connor, Alexander Reichle-Schmehl, and Torsten Werner. Though it doesn't sound like a lot, Jaspert says he'd really just like to add one more ftpmaster, as the bulk of the work is done by the assistants:

[Three] is a good number for masters. Though two also works, as long as its not just one. The majority of work is with the assistants, NEW and removals and overrides, masters usually have the background work. Keeping the archive running. Merging patches to the software and making sure the archive still runs when deploying it. Such stuff.

It takes some time to bring new assistants, and masters, into the fold. It requires a fair amount of knowledge to do the job. Jaspert says it's not only necessary to have a basic understanding of "just about every programming language you can imagine" but also have a love of reading and dealing with legal texts. The team is responsible for digging through new packages and sussing out all manner of problems (particularly legal ones). Though it is not responsible for the actual QA work, the ftpmasters team is the line of defense before packages enter the archive, which is an enormous responsibility. Before one is added to the assistants team, there's a training period to learn the ropes of working with packages in NEW:

The way this setup works is simply letting trainees access the ftpmaster machine and the NEW queue. You can look at packages and their source as any other team member. But trainees can not do the actual ACCEPT or REJECT. Instead you have a special ability to leave notes about the packages, explaining what action you would take and why. The other team members will then review those notes and either follow your advice or tell you why they decided to do something different.

After a while we and you will know if you actually fit the team, but more important we (and you yourself) will know if you should (want to) continue doing NEW and will promote you up to assistant. We set ourself a time limit of 6 months as a maximum stay in the trainee group, but none of the current team members has ever stayed in trainee that long. The longest is 3 months, the shortest is 6 days.

While one may be able to graduate from trainee to assistant in six days, Jaspert says that six months is the minimum stay to graduate to master:

For masters its also 6 months, but this time a minimum before we look at "upgrading" them. And then its a discussion between the candidate and the existing masters. Not all of the assistants want to become masters, some are happy with the assistant role, for various reasons. Some of them private, some of them just do not want more power, various. We accept that and be happy that they are assistants and do their share of work in that role.

So the short summary for "graduation": Both, the candidate and the existing masters are happy with it. Then we go the usual road in Debian and voila, one more is there.

The ftpmasters team wields a considerable amount of power over what does, and doesn't, make it into the Debian package archive. For the vast majority of packages, the decisions may be cut and dried. At least there are relatively clear-cut guidelines based on known license problems, lack of licensing information, failure of packages to build from source, policy violations, or any number of other known issues.

The ftpmasters also have room for discretion in applying the rules and may reject packages for other reasons. Consider, for example, the decision to reject qmail packages from inclusion. This was less about Debian Policy and more, apparently, about the ftpteam's opinion of qmail.

Though the reasons for rejecting qmail or other packages may not be specifically enumerated in the guidelines, Jaspert says that it had various policy and Filesystem Hierarchy Standard (FHS) violations, and that the guidelines are not all-inclusive. "Basically its an 'apply common sense' [policy] and nothing one can hardcode. 'Is this package one thats about to go bitrot even before its in the next stable release?'"

There have also been other issues centered around the team. For this reason, the team not only needs to have a depth of programming knowledge and interest in licensing, but also a very thick skin. As Jaspert writes, the team needs to be able to deal with unpopular decisions and take some flames. But he also says that the team "doesn't (usually) bite" and hopes that people will talk to the team when there's a disagreement over decisions made by them.

By and large, the work done by the ftpmasters is invisible to most users, but crucial to the success of Debian and all its downstream projects.

New Releases

Mandriva Linux 2010 Spring Beta 1 available

Mandriva has announced the first beta release of Mandriva Linux 2010 Spring (2010.1). "This release is including GNOME 2.30 (released on April 1st) and a preview edition of GNOME-Shell, which will be part of GNOME 3 (which is planned for release on september 2010). Of course, KDE 4.4.2 is also available, as well as various updates for many programs in the distribution."

Puredyne 9.11 (Carrot and Coriander+)

Puredyne 9.11 "Carrot and Coriander +" has been released. "Puredyne is a GNU/Linux live distribution aimed at creative people, looking for tools outside the standard. It provides the best experimental creative applications alongside a solid set of graphic, audio and video tools in a fast, minimal package. For everything from sound art to innovative filmmaking."

Full Story (comments: none)

Distribution News

Debian GNU/Linux

(Final) Bits from (this) DPL

Steve McIntyre presents his final bits as Debian Project Leader. "It's been a very long time since I've written a summary of what I've been up to, and I apologise for that. I've been plenty busy enough, but rubbish at reporting back regularly. As it's coming up to the end of my term and I'll be handing over to a new DPL very soon, I should rectify that. Here's some more details, and some links to other stuff that's been going on."

Full Story (comments: none)

Bits from the Release Team: Scheduling, transitions, how to help

Click below for an update from Debian's release team. "The situation of the release is not as good as we had hoped, but it looks like we can do the release in a few months if we all work together. Below is a list of bigger transitions and issues we are currently aware of." They are currently hoping for a freeze in late May or June. As always, they could use some extra help.

Full Story (comments: none)

Debian Project Leader Elections 2010: Call for votes

Voting is open for this year's Debian Project Leader Election. Votes must be received by April 15, 2010.

Full Story (comments: none)

Let's resurrect Debian Weekly News (act the second)

Alexander Reichle-Schmehl has a proposal to resurrect the Debian Weekly News. There has been some discussion on the debian-project mailing list on working with news.debian.net so the two efforts can share submissions.

Full Story (comments: none)

New mips porter box available

The Debian Project has a new mips porter box available. "I am pleased to inform you that we have a new powerful mips porterbox available. It is a Movidis Revolution x16 box, called gabrielli.debian.org. Movidis donated four of those boxes, so two or three of them will run as buildd and one as porter box."

Full Story (comments: none)

Fedora

Fedora 13 slips by one week

The Fedora 13 beta release has slipped by one week and is now scheduled for April 13. Because it is the second slip in the F13 release schedule, the final release will also be pushed back and is now scheduled for May 18. "This does not mean we will be pulling in a bunch of 'nice to have' updates, we will instead be concentrating only on release blocking issues in order to produce an RC that achieves Beta release criteria. Promotion of builds into 'stable' for F13 will continue to be extremely targeted until an RC goes 'GOLD'." Click below for the full announcement.

Full Story (comments: none)

FUDCon North America 2011 -- bids opening

The Fedora Project has opened up the bidding for a venue for FUDCon North America, to be held in December 2010 or January 2011. "The bid process will be open for a period of approximately 3 weeks. At that point the FPL and Community Architecture teams, as major stakeholders in the event, will go through the bids and make a decision on where we'll locate FUDCon North America."

Full Story (comments: none)

Fedora Board Recap 2010-04-01

Click below for a recap of the April 1, 2010 meeting of the Fedora Advisory Board. Topics include Strategic Working Group output and unfinished issues.

Full Story (comments: none)

SWG Meeting recap for 2010-04-05

Click below for a recap of the April 5, 2010 meeting of the Fedora Board's Strategic Working Group. Topics include User Base, SWG Spins Pages Resolution, and Default Offering.

Full Story (comments: none)

Ubuntu family

Shuttleworth: Shooting for the Perfect 10.10 with Maverick Meerkat

The next Ubuntu development cycle, evidently, will be called Maverick Meerkat. "This is a time of change, and we're not afraid to surprise people with a bold move if the opportunity for dramatic improvement presents itself. We want to put Ubuntu and free software on every single consumer PC that ships from a major manufacturer, the ultimate maverick move. We will deliver on time, but we have huge scope for innovation in what we deliver this cycle. Once we have released the LTS we have plenty of room to shake things up a little. Let's hear the best ideas, gather the best talent, and be a little radical in how we approach the next two year major cycle."

Ubuntu's window buttons to stay on the left

Mark Shuttleworth has rendered a verdict on the location of the window controls for Ubuntu 10.04 (Lucid Lynx) and presumably beyond. They will stay in the upper left, but the order will change from how they appeared in the beta; now it will be (from left to right): close, minimize, maximize. We looked at the topic last week, as well as a discussion of what might be done with the newly clear upper right part of the window border. "This bug is now marked wontfix. Please focus ongoing participation on the opportunities for innovation that this opens up. The decision as to the window controls location and order itself is now final, and as they say in the old newspapers, no further correspondence will be entered into."

Minutes from the Ubuntu Technical Board meeting

Click below for the minutes from the April 6, 2010 meeting of the Ubuntu Technical Board. Topics include 10.10 technical direction, Review progress of DMB, and more.

Full Story (comments: none)

Other distributions

What Is Unity Linux? (Yet Another Linux Blog)

Yet Another Linux Blog takes a look at Unity Linux. "Unity Linux is not a conventional distribution of Linux. It's a core on which developers can build their own distribution of Linux. We've set out from the start to provide an excellent minimum graphical environment that gave developers "just enough graphics" for them to create something. The smaller, the better. We elected to go with Openbox because of it's size and stability. We selected using Mandriva as our base because of the number of packages they provide and the quality of those packages. We pushed lxpanel as a minimal panel because it provides just enough functionality for distro developers to see what they've installed after they've installed it...it also is familiar to most people whereas Openbox right click menu's may not be. All in all, our target for the core release is developers."

Distribution Newsletters

CentOS Pulse #1002

The CentOS Pulse newsletter for April 1, 2010 is out. "In this issue we talk about the ongoing build of CentOS 5.5 and we have another interview with someone from the community. Furthermore, we bring you updates concerning this amazing Operating System."

DistroWatch Weekly, Issue 348

The DistroWatch Weekly for April 5, 2010 is out. "A variety of topics, ranging from Sony's controversial decision to remove Linux support from PlayStation to Ubuntu's announcement about "Maverick Meerkat", are discussed in this week's issue of your favourite distro-related magazine. The publication starts with a first-look review of Asturix 2.0 "Business" edition, a relatively new, Spanish distribution based on Ubuntu, before it continues with the usual round-up of news and links to interesting articles of the past week, including a story about the upcoming beta release of Red Hat Enterprise Linux 6, an update about Puppy Linux 5 series, and a link to an overview of Unity Linux, a minimalist Mandriva-based operating system. Then we have the regular Questions and Answers section which looks at a simple way of converting an RPM package into a DEB for easy installation on any Debian-based system. Finally, the Site News section presents the latest DistroWatch donation which goes to Libre Graphics Meeting, before it introduces Puredyne, an Ubuntu-based distribution designed for creative artists. Happy reading!"

Fedora Weekly News 219

The Fedora Weekly News for March 30, 2010 is out. "As there were no announcements during the past week, we kick this issue off with news from the Fedora Planet, including news on a new virtualization tool for resizing VM disks, details on a new Fedora Mini, to discuss Fedora on platforms such as Sugar, Moblin and Maeon (MeeGo), and thoughts on how to use Wikipedia better in the classroom. Ambassadors news brings us an event report from Open Fest 2010 in Athens, Greece. In Quality Assurance, details on this past week's Test Day on printing, and this week's two test days on SSSD implementation and ABRT, the automated bug report tool, as well as the first test compose for Fedora 13 beta, as well as many other great tidbits! In Artwork/Design Team news, details on a Fedora 13 countdown banner, details on preparing beta artwork, and a new icon set submission for F13. This week's issue wraps up with security advisories for Fedora 11, 12 and 13 released over the past week. Enjoy FWN!"

Full Story (comments: none)

openSUSE Weekly News/117

This issue of the openSUSE Weekly News has news from Novell and the openSUSE community, as well as a bit of April foolishness.

Ubuntu Weekly Newsletter #187

The Ubuntu Weekly Newsletter for April 3, 2010 is out. "In this issue we cover: Mark Shuttleworth: Shooting for the Perfect 10.10 with Maverick Meerkat, Ubuntu 10.04 beta 2 freeze now in effect, Ubuntu 8.10 reaches End-Of-Life April, 30, 2010, Call for Session Leaders for Ubuntu Open Week, Ubuntu Manual Team call for help, LoCo Directory: Team Events app Rocks, Ubuntu Ireland Global Jam Review, Help Translate the main LoCo Council page, Ubuntu One contacts, now with merging, Kubuntu Netbook Edition ScreenKast, At Home With Jono Bacon Podcast, Better sounding music with Rhythmbox, Ubuntu-UK Podcasts, and much, much more!"

Full Story (comments: none)

Distribution reviews

Hoogland: Android vs Maemo - Hands on Review

Jeff Hoogland has posted a detailed, comparative review of Android and Maemo. "Using Maemo on the other hand feels like you are holding a full computer in your hand. It is easy to keep track of multiple applications you have open on Maemo because you can tap a single button to view/switch between all open applications at any given time. Similar to Android, Maemo also has four work spaces on which you can place widgets, application launchers, and contacts for quick access. Like a full Linux distro however Maemo's desktops allow you to flow one into the next, continuously in a loop. Maemo also allows you to easily edit the number of workspaces available to you in case four is too many for your needs."

Page editor: Rebecca Sobol

Development

Visualizing open source projects and communities

April 7, 2010

This article was contributed by Koen Vervloesem

Visualization is a critical tool for exploring and understanding large amounts of data. Thanks to the computing power of the 21st century it has become possible to visualize ever-expanding amounts of data. Because the open source development model is massively decentralized and network-centric, it is by its nature the perfect domain for graph-based visualizations. Connections or dependencies between projects, communities, and code commits can be explored and displayed in many ways. These visualizations can give us a unique perspective on open source projects and communities, such as fundamental differences in their approach.

Your author has a longstanding interest in visualizations, especially of non-numerical information. The classic books about visualizing complex data are of course Edward Tufte's works, beginning with The Visual Display of Quantitative Information. Recently, your author has enjoyed reading more programming-oriented books like Ben Fry's Visualizing Data and Toby Segaran's and Jeff Hammerbacher's Beautiful Data. What's even better is that a lot of open source software exists to put this theory into practice. We'll look at a few of the most interesting open source visualization programs and their application to open source projects and communities.

Code_swarm

[Apache code_swarm]

Michael Ogawa, a Ph.D. student in the Visualization & Interface Design Innovation group of UC Davis, conducted some interesting research about software visualization. The purpose of this research is to help understand the relationship between the communication among developers and the evolution of the source code. In 2007, Michael published a paper about Visualizing Social Interaction in Open Source Software Projects [PDF]. In 2008, he presented StarGate [PDF], a system that grouped developers of a software project visually into clusters corresponding to the areas of the file repository they work on the most. In both visualization methods, he used Apache and PostgreSQL as case studies. Interested readers should consult the papers for some illustrative insights into these projects.

Michael's most popular visualization method is code_swarm, which shows the history of commits in an open source project as a video. Both developers and files are represented as moving elements. When a developer commits a file, it lights up and flies towards that developer. Files are colored according to their purpose, such as whether they are source code or a document. If files or developers have not been active for a while, they fade away. The design of code_swarm is explained in the paper code_swarm: A Design Study in Organic Software Visualization [PDF], which shows some case studies of Python, Eclipse, Apache, and PostgreSQL. Videos generated by code_swarm for these projects are also available on the web site.

The code for code_swarm, written in Ben Fry's Java-based open source programming environment Processing, is available under the GPL v3. It supports various types of repositories: Subversion, CVS, Git, Mercurial, Perforce, VSS, Starteam, and Darcs. The wikiswarm add-on even allows visualizing Wikipedia page histories and user contributions. By downloading and executing the code, everyone can create their own software visualizations, and there's a mailing list for help. The project's wiki also has some documentation, such as a step-by-step guide on how to generate a video, a FAQ, and a gallery of third-party code_swarm videos.

Gource

[Python Gource]

At the end of 2009, New Zealand software developer Andrew Caudwell presented his software visualization project Gource (a play on Source and Gorse) on his computer graphics blog The Alpha Blenders. Gource takes the logs from a version control system of a software project and displays them as an animated tree with the root directory of the project at its center. Directories appear as branches with files as "leaves", represented by spheres that are colored according to their file extension. Developers currently contributing to the project can be seen floating near the files they are modifying. The whole visualization looks organic and is interactive, as the user can rotate the view and move the camera position.

The code for Gource is available under the GPL v3. It's designed for use with Git, Mercurial, or Bazaar, but it also has scripts to support CVS and Subversion. It needs a 3D accelerated video card and uses OpenGL for rendering. The wiki has some documentation, such as how to show Gravatar images for developers or how to change the appearance. The wiki also explains how to produce a video and shows some example videos and screenshots. In January, Andrew showed some of his visualizations at linux.conf.au.

In the last few months, several enthusiasts have been experimenting with Gource. For example, Michael DeHaan used Gource to create a visualization of Red Hat's provisioning server Cobbler and he explained that it can be really useful to evaluate an open source project:

When evaluating OSS software for use in business, you always need to know if the community is solid and self sustaining. [Gource] allows you to watch a short video and find out. Coupled with looking through the mailing list archives, that's a pretty good check. It can also help identify interesting patterns of large scale refactoring, new development, or stagnation.

Michael's visualization inspired Daniel Berrange to do the same exercise for libvirt:

It is clear from the video just how much development of libvirt has been expanding over the past 4 years, particularly with the expansion to cover VirtualBox and VMWare ESX server as hypervisor targets.

Daniel also produced a visualization of libvirt using code_swarm, which makes it easy to compare the merits of both methods.

Gephi

[CPAN Explorer]

A third, more research-oriented and more general graph visualization tool is Gephi, which is Java-based and is distributed under the GPL v3. Users of this tool call it "like PhotoShop for graphs", or should we say "like GIMP for graphs"? It's a very powerful and interactive tool for exploring, manipulating and visualizing graphs: users can manipulate the structure, shapes, colors, locations of the nodes, and so on, but they can also find the shortest path between nodes and compute graph metrics, find clusters, and conduct a lot of advanced graph analyses. Gephi was originally created in 2007 at WebAtlas, a French non-governmental organization involved in mapping the web and data mining, but is now developed by an international consortium of open source developers.

Gephi is not oriented specifically towards visualizing software projects: the wiki page about data sets that can be explored and visualized with Gephi gives examples such as the structure of the internet, the topology of the Western States Power Grid of the United States, airlines, a network of disorders and disease genes linked by known disorder-gene associations, and a couple of social networks.

However, the social networks section shows some interesting data sets of open source projects, which have all been visualized in Gephi by Franck Cuny, a Perl hacker working at the French social media agency Linkfluence. He developed the CPAN Explorer web site, an interactive visualization for analyzing relationships between developers and packages of CPAN (Comprehensive Perl Archive Network). The authors page shows relationships between developers inside CPAN: each developer is represented by a node with a size proportional to the number of modules the developer has released on CPAN. An edge between two developers is created when one developer uses a module from the other developer. The more a developer's modules are used by other developers, the bigger the label.

One can deduce some interesting facts about developers from this graph: for example, Adam Kennedy gets a big node with a small label, because he has released a lot of modules on CPAN, but few of them are used by other CPAN developers. In contrast, Gisle Aas has a small node with a big label, because he has few modules, but some very popular ones like LWP and URI.
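The node and label sizing described above can be sketched in a few lines of Python. This is only an illustration of the idea, not CPAN Explorer's actual code: the author names, module names, and the `build_author_graph` function are all made up for the example, and the real site may count "uses" differently.

```python
# Hypothetical sample data (not real CPAN authors or modules):
# which author released which modules.
RELEASED = {
    "alice": ["Foo::Bar", "Foo::Baz", "Foo::Qux"],
    "bob": ["Net::Thing"],
    "carol": ["App::Tool"],
}

# Hypothetical sample data: which modules each author's code uses.
USES = {
    "alice": ["Net::Thing"],
    "bob": [],
    "carol": ["Net::Thing", "Foo::Bar"],
}


def build_author_graph(released, uses):
    """Return (node_size, label_size, edges) for an author graph.

    node_size:  number of modules each author has released
    label_size: number of times other authors use that author's modules
    edges:      set of (user, author) pairs, one per "uses" relationship
    """
    # Map every module back to the author who released it.
    owner = {mod: author for author, mods in released.items() for mod in mods}

    node_size = {author: len(mods) for author, mods in released.items()}
    label_size = {author: 0 for author in released}
    edges = set()

    for user, modules in uses.items():
        for module in modules:
            author = owner.get(module)
            # Skip unknown modules and authors using their own code.
            if author is None or author == user:
                continue
            edges.add((user, author))
            label_size[author] += 1

    return node_size, label_size, edges


if __name__ == "__main__":
    sizes, labels, edges = build_author_graph(RELEASED, USES)
    for author in sorted(sizes):
        print(author, "node:", sizes[author], "label:", labels[author])
```

In this toy data set, "alice" plays the role of the prolific author with a big node and a small label, while "bob" has a single module that everybody else depends on, giving him a small node and a big label.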

The community page shows web sites that write about Perl. Each website is represented by a node with a size dependent on the number of inbound links. A hyperlink between two websites is represented by an edge. The official Perl community web sites get a purple node, bloggers a green node, open source web sites a red node, companies a black node, and CPAN author pages a blue node. The last visualization on CPAN Explorer shows the dependencies between CPAN module distributions. The CPAN Explorer developers offer static PDF and SVG versions of all these graphs and dynamic JavaScript visualizations, as well as the original data sets in the GEXF file format to explore in Gephi.

Last month, Franck introduced a new project, Github Explorer. He explained that this was a very natural choice:

I wanted to do something similar again, but not with the same data. So I took a look at what could be a good subject. One of the things that we saw from the map of the websites is the importance github is gaining inside the Perl community. Github provides a really good API, so I started to play with it.

This time, Franck didn't aim for the Perl community only, but the whole community of users of Github, a popular web-based hosting service for projects that use the Git revision control system. He warns that Github doesn't represent the whole open source community and that he has collected only a selection of all user profiles, but nevertheless it gives us a good picture. Each profile is represented by a node, and a link between two profiles is represented by an edge. The weight of the edge is incremented each time the person forks code from the target profile.
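The edge weighting is easy to picture with a small Python sketch. The fork events, the profile names, and the `fork_edge_weights` function below are hypothetical; how Franck actually gathered and processed the data from the Github API is not shown here.

```python
from collections import Counter

# Hypothetical fork events: (profile that forked, profile forked from).
FORKS = [
    ("dave", "erin"),
    ("dave", "erin"),
    ("frank", "erin"),
    ("erin", "dave"),
]


def fork_edge_weights(forks):
    """Count forks per directed (source, target) pair of profiles."""
    weights = Counter()
    for source, target in forks:
        # Ignore self-forks; they would not connect two profiles.
        if source != target:
            weights[(source, target)] += 1
    return weights


if __name__ == "__main__":
    for (source, target), weight in sorted(fork_edge_weights(FORKS).items()):
        print(f"{source} -> {target}: weight {weight}")
```

Each key of the resulting counter corresponds to one directed edge in the graph, and its value is the weight that a tool like Gephi could use to draw thicker lines between profiles that fork from each other often.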

[Github Explorer PHP]

On his blog, Franck shows some Github visualizations he made with Gephi, colored by country and split according to the programming language. He has some thought-provoking analyses about some of the languages. For example, the Perl community is clearly split between the 'west' and Japan. In the Python community, there is clearly one main project, the web framework Django. PHP is the only community on Github where the visualization shows clusters of people working together on a specific project. The Ruby graph looks like a big ball of yarn with a couple of isolated countries. In other visualizations, Franck split the graphs by country. He offers the data he gathered for use in Gephi, he has published all the graphs on Flickr, and he will soon offer printed versions for sale as A2 and A1 posters.

Towards a better understanding of open source communities

Many of these visualizations are beautiful, but that's not the point. To paraphrase Richard Hamming's dictum "The goal of computation is not numbers, but insight", your author would say "The goal of visualization is not beauty, but understanding." Visualization tools can help us understand the internals and the dynamics of open source projects and communities. While code_swarm and Gource can show users a lot about patterns and evolutions in the development of a specific project, including how the developer community works together, CPAN Explorer and Github Explorer are about visualizing global connections between a lot of open source projects, which is also an important factor in open source communities. Now we just have to wait for some creative minds to visualize the SourceForge or Launchpad communities.

Comments (1 posted)

Brief items

Quotes of the week

Q: O Great Rabbi, Perl has so many precedence laws I feel I shall never learn them all. Which is the most important of these commandments?

A: As it was in the beginning, is now, and ever shall be, the First Commandment is the Law of Algebraic Precedence:

#1 MULTIPLICATIVE OPERATORS BIND MORE TIGHTLY THAN ADDITIVE ONES.

The Second Commandment is to think of those who come after you, most preferably before they do so:

#2 DON'T BE A DAMNED FOOL: OTHERWISE USE PARENTHESES!

Follow these two Commandments and all the days of your life will be blessed, for your code shall be ever right[eous] and all shall love you for it.

-- Tom Christiansen

Since Emacs is just an editor, not a god, it cannot do miracles.
--Richard Stallman

Comments (2 posted)

Firefox 3.6.3 security update now available

Mozilla has released Firefox 3.6.3 to address a critical security issue that could allow remote code execution.

Full Story (comments: 3)

Grease 0.2 Released

Grease is a Python-based 2D game engine and development framework, focused on quick development, good performance, and fun. The project's documentation includes a detailed tutorial on the creation of an asteroids-style game. "Grease does not attempt to provide one-size-fits-all solutions. Instead it provides pluggable components and systems than can be configured, adapted and extended to fits the particular needs at hand."

Full Story (comments: none)

Notmuch 0.1 released

Version 0.1 of the "Notmuch" email client (recently reviewed on LWN) has been released. "In trying to get notmuch to grow up a little bit, I've just added a version number (0.1 initially) and have started doing releases." More informative release notes are promised for the future.

Full Story (comments: 7)

StretchPlayer 0.500

The initial release of StretchPlayer is available. StretchPlayer is an audio file player with time-stretching and pitch-shifting features. The intended audience would appear to be musicians who want to slow down a song to learn how to play with it. More information can be found on the project's home page.

Full Story (comments: none)

A proposed Subversion vision and roadmap

A group of Subversion developers recently met in New York in an attempt to come up with a plan for the future development of this source code management system; a summary of that meeting has now been posted. "Subversion has no future as a DVCS tool. Let's just get that out there. At least two very successful such tools exist already, and to squeeze another horse into that race would be a poor investment of energy and talent. What's more, huge classes of users remain categorically opposed to the very tenets on which the DVCS systems are based. They need centralization. They need control. They need meaningful path-based authorization. They need simplicity. In short, they desperately need Subversion. It's this class of user -- the corporate developer -- that stands to benefit hugely from what Subversion brings to the party." Read the whole thing for details on how they plan to meet that developer's needs.

Full Story (comments: 227)

UFRaw 0.17 released

UFRaw is a utility for the processing of raw images from digital cameras. The biggest addition in the 0.17 release would appear to be the incorporation of the lensfun library, allowing UFRaw to correct for lens distortion using a database of hundreds of lenses. Also in this release are a new despeckling algorithm, hot pixel elimination by default, better zoom support, and more.

Full Story (comments: none)

X server 1.9 release thoughts

Fresh from the X.Org server 1.8 release, Keith Packard is pondering making some changes for the next time around. At the top of his list is shortening the release cycle to something closer to three months as a way of getting new hardware support to users more quickly. That proposal is not universally loved, though, so it's not clear if it will be adopted or not. He is proposing that the 1.9 release happen in late August. "I don't think there are any major changes planned for this release, so this shorter merge window seems like it should be sufficient. Nor do I necessarily think that this would also mean that the X.org release date should be moved in; having the X server ready *before* the X.org release seems like a good idea to me."

Full Story (comments: none)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Georges Auberger: Songbird Singing A New Tune

Georges Auberger reports that the Songbird media player is dropping Linux support. "After careful consideration, we've come to the painful conclusion that we should discontinue support for the Linux version of Songbird. Some of you may wonder how a company with deep roots in Open Source could drop Linux and we want you to know it isn't without heartache. We have a small engineering team here at Songbird, and, more than ever, must stay very focused on a narrow set of priorities. Trying to deliver a raft of new features around all media types, and across a growing list of devices, we had to make some tough choices." An untested and unsupported version of Songbird for Linux will still be available for developers.

Comments (9 posted)

Page editor: Jonathan Corbet

Announcements

Non-Commercial announcements

Fedora: A Case Study of Design in a FLOSS Community

Máirín Duffy will be presenting a paper on design in the free software community at the upcoming ACM SIGCHI conference. That paper has been posted as a 12-page PDF file, and she is looking for feedback. It's an interesting read from somebody who has been very effective at getting things done in that environment. "Contributors in FLOSS projects, among whom the majority are volunteers, come to the project with different and sometimes conflicting visions and goals. The absence of a central driving vision is challenging to a designer. It is necessary in design practice to balance the needs and requirements of different stakeholders, but it can be substantially more difficult to do so in a FLOSS project where the very goals of the entire project itself may not be agreed upon."

Comments (19 posted)

Articles of interest

Mueller: IBM breaks the taboo and betrays its promise to the FOSS community

Florian Mueller writes about the dispute between IBM and TurboHercules SAS; TurboHercules is trying to commercialize an open-source emulator for IBM's mainframe systems. "To add insult to injury, the list of patents with which IBM tries to intimidate the Hercules project even includes two of the 500 patents IBM originally 'pledged' to the open source community. Patent numbers U.S. 5613086 and U.S. 5220669 appear on page 4 of IBM's 2005 'patent pledge', and also appear as patents #83 and #106 in the letter IBM sent to TurboHercules. This betrayal of the promise is unbelievable, but I never believed that IBM was sincere about that pledge in the first place."

A Scribd link to IBM's patent-threat letter is provided in the article, but it can be read without Flash on this page. It's worth noting that TurboHercules has been involving the lawyers on its side as well.

Comments (15 posted)

IBM Denies Breaking Its Open Source Promise (eWeek)

Here's an eWeek article about the battle of words between IBM and TurboHercules. "In response to a query from eWEEK, IBM issued the following statement: 'IBM sent TurboHercules a non-exhaustive list of patents that pertain to our mainframe technology. We did not make any explicit assertions or claims that TurboHercules had violated them. We were merely responding to TurboHercules' surprise that IBM had intellectual property rights on a platform we've been developing for more than 40 years. We stand behind the pledge we made in 2005, and also our rights to protect our significant investments in mainframe technology.'"

Comments (11 posted)

Five open source alternatives to the iPad (opensource.com)

Opensource.com briefly looks at iPad alternatives, which provide a less locked-down experience than the new tablet from Apple. Several are not yet for sale—or are theoretical like the Google tablet—but they start to give an idea of what will be coming in the free-software-based tablet world. "A 2-pound tablet for 300-400€ with a Linux OS designed specifically for a touchscreen with usability in mind. That's the completely open source iFreeTablet developed at the University of Cordoba in Spain. And you get the Flash support and multitasking you can only dream of with that shiny iPad. It's biggest shortcoming is the mere 2.5 hour battery life."

Comments (14 posted)

Emacs & the birth of the GPL (The H)

The H delves into the history of emacs and the GPL. "Emacs has become emblematic of Lisp, Unix and free software, but was originally written by Richard Stallman, with contributions from Guy Steele, Dave Moon, Richard Greenblatt and Charles Frankston, as an extension to the TECO editor on MIT's AI Lab Incompatible Timesharing System (ITS) which ran on PDP-6 and PDP-10 machines, somewhere around 1974."

Comments (17 posted)

People of Lava Want to Put a Big Android in Your Living Room (TechNewsWorld)

TechNewsWorld takes a look at an Android-powered television. "The much-anticipated 'Google TV' may be in the works, but a Swedish company has already won the race to the proverbial finish line with the world's first Android-based TV. Due to hit stores this fall, People of Lava's Scandinavia is a fully interactive Internet TV that combines the functionality of an Android smartphone with that of a high-end, full-HD LED TV set, according to the company."

Comments (1 posted)

Resources

CE Linux Forum Newsletter: March 2010

The March 2010 CE Linux Forum Newsletter covers ELC 2010 Program Highlights Update, CELF Architecture Group Project Proposals - Evaluation in Progress, 32nd Japan Technical Jamboree Report, ARM Device Tree Work, and -ffunction-sections Work.

Full Story (comments: none)

Interviews

QA with IBM's Dan Frye: "Everything Has Changed" (Linux.com)

Linux.com has an interview with IBM's Dan Frye as a prelude to his upcoming keynote at the Collaboration Summit. In the interview he talks about how things have changed since IBM joined the Linux community over ten years ago along with some thoughts on where Linux goes from here. "Linux plays a significant role in IBM's smarter planet initiative. As the world becomes more instrumented, interconnected and intelligent, Linux will be a fundamental element of most cloud infrastructures in the future because of the same characteristics that have drawn customers to Linux over the last decade. The open nature of Linux and its ability to run on a wide variety of platforms is ideal for spanning an enterprise and virtualizing the aggregated computing resources. This capability makes it an ideal building block for a smarter planet."

Comments (none posted)

QA with Parallels CEO: Prioritizing Kernel-Level Contributions (Linux.com)

Linux.com has an interview with Serguei Beloussov, CEO of Parallels. "Parallels is joining The Linux Foundation today and speaking at the Collaboration Summit next week. Why are you investing in these activities? Beloussov: Since Parallels was founded in 2000, we have been a strong contributor and supporter of Linux - in fact we did not support any other platforms until 2005. Parallels enables its partners to become profitable providers of cloud services. This space - of traditional hosting and cloud services providers for small businesses - has always been dominated by Linux and it's not a coincidence. Delivering profitable cloud service requires a number of capabilities where Linux and Open Source have traditionally been strong - cost structure, flexibility, scale, etc."

Comments (2 posted)

Calls for Presentations

PyCon Australia Call For Proposals

PyCon Australia will take place June 26-27, 2010 in Sydney. The deadline for proposal submission is the 29th of April. "We are looking for proposals for Talks on all aspects of Python programming from novice to advanced levels; applications and frameworks, or how you have been involved in introducing Python into your organisation."

Full Story (comments: none)

Registration for LVEE 2010 is open

The sixth international conference of developers and users of free software "Linux Vacation / Eastern Europe" (LVEE 2010) will take place in Belarus on July 1-4, 2010. The call for participation is open until June 12, 2010. Abstracts of reports are due by May 24, 2010.

Full Story (comments: none)

Hack.lu 2010 CfP

The hack.lu 2010 security conference will be held in the Grand-Duchy of Luxembourg, October 27-29, 2010. The call for papers is open until June 1, 2010.

Full Story (comments: none)

Upcoming Events

O'Reilly Open Source Convention Reveals Program and Opens Registration

OSCON, the O'Reilly Open Source Convention, will be held in Portland, Oregon July 19-23, 2010. Program chairs Allison Randal and Edd Dumbill have announced the program, and registration has opened. Early registration discounts apply until June 2, 2010.

Full Story (comments: none)

Events: April 15, 2010 to June 14, 2010

The following event listing is taken from the LWN.net Calendar.

| Date(s) | Event | Location |
|---|---|---|
| April 12-15 | MySQL Conference & Expo 2010 | Santa Clara, CA, USA |
| April 14-16 | Linux Foundation Collaboration Summit | San Francisco, USA |
| April 14-16 | Lustre User Group 2010 | Aptos, California, USA |
| April 16 | Drizzle Developer Day | Santa Clara, CA, United States |
| April 16-17 | R/Finance 2010 Conference - 2nd Annual | Chicago, IL, US |
| April 23-25 | FOSS Nigeria 2010 | Kano, Nigeria |
| April 23-25 | QuahogCon 2010 | Providence, RI, USA |
| April 24 | Festival Latinoamericano de Instalación de Software Libre | Many, Many |
| April 24 | Open Knowledge Conference 2010 | London, UK |
| April 24-25 | OSDC.TW 2010 | Taipei, Taiwan |
| April 24-25 | BarCamb 3 | Cambridge, UK |
| April 24-25 | Fosscomm 2010 | Thessaloniki, Greece |
| April 24-25 | LinuxFest Northwest | Bellingham WA, USA |
| April 24-26 | First International Workshop on Free/Open Source Software Technologies | Riyadh, Saudi Arabia |
| April 25-29 | Interop Las Vegas | Las Vegas, NV, USA |
| April 28-29 | Xen Summit North America at AMD | Sunnyvale, CA, USA |
| April 29 | Patents and Free and Open Source Software | Boulder, CO, USA |
| May 1-2 | OggCamp | Liverpool, England |
| May 1-2 | Devops Down Under | Sydney, Australia |
| May 1-4 | Linux Audio Conference | Utrecht, NL |
| May 3-6 | Web 2.0 Expo San Francisco | San Francisco, CA, USA |
| May 3-7 | SambaXP 2010 | Göttingen, Germany |
| May 6 | NLUUG spring conference: System Administration | Ede, The Netherlands |
| May 7-8 | Professional IT Community Conference | New Brunswick, NJ, USA |
| May 7-9 | Pycon Italy | Firenze, Italy |
| May 10-14 | Ubuntu Developer Summit | Brussels, Belgium |
| May 17-21 | Fourth African Conference on FOSS and the Digital Commons | Accra, Ghana |
| May 18-21 | PostgreSQL Conference for Users and Developers | Ottawa, Ontario, Canada |
| May 24-25 | Netbook Summit | San Francisco, CA, USA |
| May 24-26 | DjangoCon Europe | Berlin, Germany |
| May 24-30 | Plone Symposium East 2010 | State College, PA, USA |
| May 27-30 | Libre Graphics Meeting | Brussels, Belgium |
| June 1-4 | Open Source Bridge | Portland, Oregon, USA |
| June 3-4 | Athens IT Security Conference | Athens, Greece |
| June 7-9 | German Perl Workshop 2010 | Schorndorf, Germany |
| June 7-10 | RailsConf 2010 | Baltimore, MD, USA |
| June 9-11 | PyCon Asia Pacific 2010 | Singapore, Singapore |
| June 9-12 | LinuxTag | Berlin, Germany |
| June 10-11 | Mini-DebConf at LinuxTag 2010 | Berlin, Germany |
| June 12-13 | SouthEast Linux Fest | Spartanburg, SC, USA |

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds