LWN.net Weekly Edition for August 12, 2010

GUADEC: Owen Taylor on GNOME Shell

By Jake Edge
August 11, 2010

The biggest user-visible piece of GNOME 3 will be the GNOME Shell, which manages the desktop experience. At GUADEC, Owen Taylor updated the assembled GNOME hackers on the current status of GNOME Shell and the work still to be done. He also demonstrated some of the new functionality and compared the Shell with where it was when he presented at the Gran Canaria Desktop Summit (GCDS) in 2009.

[Owen Taylor]

Since GCDS, "we wrote code", Taylor said, consisting of some 1362 commits, 1174 of which were code along with 188 translation commits. The project has also added new contributors and four of the top ten contributors since GCDS were new to the project. He was happy to see that those new contributors were not only prolific, but also added "very significant" new features.

In preparing for his talk, Taylor went back to the version from GCDS and was "surprised that it still built". The user interface has been redesigned since then and, in some cases, things have been rewritten three or four times in the interim. He found it interesting that certain features that he thought were there from the start were missing, while others that he thought were newer actually appeared in the year-old version.

The Shell now sports a "sleeker, black look" that fades into the background because it blends into the monitor bezel, he said. The "mess in the upper right hand corner" has been cleaned up, and the task list from last year is gone. In addition, the menus are not GTK-based, but are instead styled with the Shell, giving them a more integrated look.

Switching between applications and workspaces has also undergone some major changes. Alt-Tab now groups all of the windows for an application into a single entry, so that you choose between applications, rather than individual windows. Last year, activities were represented by a grid of workspaces with a dashboard to launch new applications. Now it is "slicker", with a linear view of the workspaces that can be scrolled horizontally to view additional workspaces.

The application browser has switched from a "straight reimplementation from GNOME 2" to a gridded view with fewer categories. Searching has also been integrated into the dashboard. The message tray now slides up from the bottom and you no longer have to dismiss each notification as you did in GNOME 2. It is also integrated with Empathy so that replying to a message no longer requires switching to the application itself—the reply can be typed into the message notification.

The "hot corner"—simply moving the mouse to the upper left corner of the screen—is another new feature. Moving there brings up the activities overview that shows workspaces, places, and recent documents. It's very useful, Taylor said, so much so that "if you go back to GNOME 2, you will keep going to the hot corner".

Looking Glass is an integrated JavaScript console, inspector, and debugger, which acts like "Firebug for the Shell". Taylor did a live demonstration by typing one-line JavaScript expressions into Looking Glass. With that (fairly) simple JavaScript, he was able to change menu titles, scale them to different sizes, and even have them run the scaling as a transition so that the text continuously grew in size.

GNOME Shell is based on the Clutter toolkit, but because "Clutter itself only has four actors that are useful" for the Shell, it uses the Shell toolkit (St) atop Clutter. St is descended from Moblin's MX toolkit, but is more focused on the needs of the Shell. St emphasizes CSS with transitions, property inheritance, shadows, rounded corners, and so on, which make for a "pretty powerful set of capabilities". MX has many more widgets and is more powerful overall. The separate evolution of St and MX is "not a good thing long-term" and the team aims to reunite them at some later point.

Taylor also talked about the development process for GNOME Shell, which is very reliant on code review. The normal GNOME model is "code ownership" he said, but the Shell does "code review of everything". There is no formal structure to the review process, but they get two pairs of eyes on all code changes. The process has its "good parts and not-so-good parts", but overall it works well because it spreads out the knowledge of the code among multiple people in the project. It can lead to bottlenecks, where "patches sit around for a while", but he definitely recommends that development model for other projects.

GNOME Shell is in "good shape" for basic functionality, like window switching and launching applications, Taylor said, though there are still bugs and other things to fix. It makes for a "pretty coherent whole" that can be used on a day-to-day basis. The status area in the upper right hand corner is still a work in progress as is the integration of the Shell with the rest of the system. He pointed to the log out and lock screen dialogs as things that were not yet rendered in the Shell style, and still look like the GNOME 2 versions.

Those changes will come relatively soon, but there are some other things that are a bit further out. The "recently used documents" in the activities overview is just a placeholder right now. There are no customization options for those who want to change the styles or behavior of the Shell. The plan is to add an extension API like Firefox has, but other than some basic infrastructure, that isn't nearly ready. In addition, there is no fall-back support if 3D graphics—required by Clutter—are not available. Some way to fall back to the GNOME 2 look in that case is desired.

Based on that status, it was probably fairly obvious to those in the room that GNOME Shell might not be completely ready for the September release. That foreshadowed the next presentation, in which Vincent Untz and the release team delayed GNOME 3.0. GNOME Shell definitely looks like more than an incremental change to the desktop experience, and Taylor's demo with Looking Glass showed the latent potential for theming and other customization that underlies the Shell. With an additional six months to work on it, focusing on completing a coherent whole, GNOME Shell seems quite likely to impress.

LinuxCon: The state of MeeGo today

August 11, 2010

This article was contributed by Joe 'Zonker' Brockmeier.

It seems like longer, but it's only been six months since Intel and Nokia announced that they'd be joining the Maemo and Moblin communities into MeeGo. A lot has happened in the interim, and MeeGo community manager Dawn Foster was on hand at LinuxCon to provide an update about the state of MeeGo and its community.

Foster started the presentation by talking about the basics of MeeGo, its history, and the reasons for the merger. MeeGo's scope is everything from in-vehicle infotainment (IVI) systems to handsets and netbooks. MeeGo releases have been staggered so far, with the netbook developer release coming first, followed by the handset and then the IVI release. However, Foster says that this is not the long-term plan. The MeeGo project is moving to a "cadence" of six-month releases starting in November.

Foster talked, in general terms, about the MeeGo focus on contributing back to upstream as part of the project goals. She said that the goal was to contribute all work back to the upstream projects used by MeeGo.

[Dawn Foster]

Why the merger? Intel and Nokia realized they had similar projects with similar ideas and goals, so it didn't make sense to pursue the two projects separately. The decision was made from a technical perspective, and Foster acknowledged that it was an internal decision between Intel and Nokia and not a community driven decision.

The MeeGo merger was not, shall we say, universally well-received. Development communities on both sides were surprised by the move and unhappy with some technical decisions. Foster noted some of the challenges that the project has had since its inception, including architectural issues like the packaging format (choosing RPM over Debian packages), governance challenges, and figuring out who would be responsible for various tasks. With Maemo and Moblin, people had well-defined areas of responsibility on each side, and Foster noted (without specifics) that, after the merger, it was necessary to choose one person from either Moblin or Maemo to take responsibility.

This has brought on significant social and community challenges. The Maemo project had many interested users of mobile devices, while Moblin was focused on netbooks. Foster said that it's taken a lot of adjustment on both sides. "You have a new community that is different than the original communities. And there was frustration [...] early on with all these things and the timeline required to do this."

Foster then ran through the timeline of MeeGo development since its announcement in February. March 31st was "day one code" for the core operating system (everything below the user experience). May 25th was the netbook project code release that included the user experience for netbooks. June 30th was the release of handset day one code, targeted at developers. On August 2nd, MeeGo made the first IVI release. MeeGo will now move to a regular six-month update and release schedule, starting with the 1.1 release in November, when all of the releases will converge. The only reason for the staggered releases initially, says Foster, is that the project is on an aggressive schedule.

MeeGo has been solving technical challenges quickly but social and community challenges take more time. Foster talked about the community growth and tools that have been put into place since its inception like mailing lists, forums, bugzilla, and so on. Foster noted that the community has been frustrated with the time required to determine the governance model and resolve other issues, but that has been mitigated with the code releases and having more clarity around the roles in MeeGo. Still, she says that there's a lot of work yet to be done.

Looking at the numbers, the MeeGo community does seem to be growing at a reasonable clip. The community now has more than 11,000 members, which is up from 9,626 in June. There have been about 7,400 posts on the developer mailing lists since the project started in March, and more than 7,000 wiki edits and nearly 7,000 forum posts since the start a few months ago. Foster also said that there were about 430 people in the #meego IRC channel on the morning of the talk. Metrics are public, and can be found on the MeeGo wiki where Foster puts up monthly statistics.

Next, Foster focused on where MeeGo needs help and is looking to recruit contributors. In particular, Foster said that the "best" contributions were applications and noted that they need to get people "excited" about building applications for MeeGo. MeeGo does start with a fair base of applications. When Foster demonstrated MeeGo for the audience, she pointed out that she'd had pretty good success just installing things, OpenOffice.org for instance, using RPMs. However, the handset and IVI editions of MeeGo are unlikely to run random RPMs or applications like OpenOffice.org.

Foster talked briefly about the MeeGo Software Development Kit (SDK) to point out that developers could work on MeeGo apps on Linux and Windows.

Foster also stressed non-development contributions and noted that MeeGo could use people to update and edit the wiki, report bugs, write documentation and FAQs, and work on translations and localization.

What about core contributions? I asked whether MeeGo had any core or significant contributions outside developers employed by Intel and Nokia. So far, not much. Foster did mention that Novell had been "very active" and that a few developers from other companies had been involved but not very many.

Another question about MeeGo from the audience was the state of open source drivers. Foster says that MeeGo can't control the drivers from OEMs, but "the goal is to have an environment that is fully functional from an OSS perspective, but we can't control that." What about drivers from Intel or Nokia? Foster, and another Intel employee in the audience, noted that they were making a good effort to ensure that hardware from Intel came with open drivers. However, Foster says that MeeGo is run by a software unit inside Intel, while the bulk of Intel is (of course) focused on hardware. Thus, it requires negotiation with other business units to try to make sure that hardware is always released with open drivers. Unfortunately, they don't always succeed, and can't guarantee 100% success in the future.

Foster demonstrated a MeeGo system for the audience, taking about 10 minutes to walk through the interface and features on a Toshiba that had originally shipped with Windows. Overall, the interface is looking pretty good. Some details need to be ironed out, however. For example, MeeGo currently does not expose any way to turn the system off or reboot through software. Foster says this is an area of contention within the MeeGo community, with some passionately arguing for or against the presence of power controls in software versus only featuring a hardware button for power off. This can be very confusing when software updates prompt the user to reboot.

Contributors interested in working on MeeGo can join the meetings on IRC, and should consider attending the first MeeGo conference to be held in Dublin, Ireland. The conference is to be held November 15 through 17, though apparently it will end early on the 17th to make way for a football (or "soccer" as recognized by those in the U.S.) match. A tour of the Guinness facility is also in the offing. The conference is capped at 600 people, and travel sponsorship may be available for those with significant contributions beyond just employees of Intel and Nokia. Proposals are welcome through August 23rd.

Overall, the update shows a community that is still in a nascent stage. Foster signaled willingness to address community issues and try to include developers outside Nokia and Intel's walls, though specifics were a bit lacking. It would have been interesting to hear more details about MeeGo's governance and plans to include contributors outside the corporate walls, which is going to be fairly important if MeeGo is going to succeed as a legitimate community project. As it stands, it does seem that MeeGo has taken some reasonable steps toward addressing community concerns and trying to include external contributors in the long term.

The LinuxCon media panel

By Jonathan Corbet
August 11, 2010
A common event at conferences is a panel of developers with reporters listening from the audience; your editor moderated just this kind of panel at LinuxCon 2010. This time around, though, we also saw the tables turned: there was a panel of journalists facing the developers that they write about. The panelists were Joe "Zonker" Brockmeier, Jason Brooks, Sean Michael Kerner, Ryan Paul, and Steven Vaughan-Nichols; it was an interesting opportunity to see how things look from the other side of the keyboard.

The opening question was simple: what was the most significant Linux-related story in the last ten years? Sean wasted no time in naming the SCO case - it is, he says, "the story that keeps on giving"; seven years later and he's still writing about it. Steven, instead, cited IBM's endorsement of Linux and statement that it would be investing $1 billion in the platform. That announcement, he says, legitimized the platform and made it possible for people in companies worldwide to consider using it without getting into trouble. Jason pointed at the birth of Red Hat Enterprise Linux, while Ryan talked about the onset of "mobile ubiquity" and the near dominance that Linux has in that area. Ryan also mentioned MeeGo as an example of how large companies have come to appreciate the value of collaboration.

Zonker, instead, nominated the rise of Ubuntu. He said that Ubuntu forced the other distributors to focus on community, something they had not been doing well before; this is a claim which was not universally accepted by the audience. At this point, Sean jumped in to say that Ubuntu only took off because of the seemingly unending delays in the Debian Sarge release. Had Sarge gone out on time, he says, we would not be hearing so much about Ubuntu now. Steven added that Ubuntu succeeded because it was an attempt at commercializing Debian from the outside; earlier attempts from the inside (he mentioned Ian Murdock in particular) were seriously attacked by the community and didn't get very far.

Moving on: what is the big story for Linux today? The consensus answer seemed to be "Android." Steven claims that ChromeOS is going to be a big deal. He also mentioned the license compliance program just announced by the Linux Foundation which, he says, will speed Linux adoption.

The reporters were then asked about numbers from analysts, which, when it comes to Linux, are somewhat controversial. How do they cope with that uncertainty? Ryan responded that these numbers (covering Linux adoption and such) are not really illustrative and are missing a lot of context. Beyond that, they are the product of companies with conflicts of interest; analyst firms have paying customers who have an interest in how those numbers come out, so the result is not objective. Sean said he does not trust the numbers; they are always wrong, so he does not use them. Jason wished for better numbers on enterprise subscription sales, while Zonker criticized analyst firms for refusing to come up with a solid methodology for counting unpaid Linux use. Steven asked simply: who cares about these numbers anyway?

The next question: it seems to be harder to get reporters' attention for Linux-related stories in recent years; what are reporters looking for? Sean suggested that there are really only ten Linux stories that he writes and rewrites repeatedly; one of them is "Mark Shuttleworth said..." He also said that he always covers what the big vendors are doing, but news of the form "application X now runs on Linux" is not really interesting. Zonker noted that, while more reporters (with less expertise) are covering Linux due to its increasingly mainstream nature, a lot of reporters have also been laid off in recent years. Steven said that we're seeing a natural progression; like the radio magazines of the 1920s or the Internet magazines of the 1990s, much Linux news has simply become mundane and boring.

Steven also said that there is little interest by publishers in "serious" stories about Linux, a statement that Zonker seconded. It is necessary to write "popular" stories that will draw advertisers. Linux companies, it seems, are not big buyers of online advertising; that affects coverage too. Several of the panelists said that there is still a firm wall between advertising and editorial, but that claim (in your editor's opinion) seems somewhat contradicted by the fact that they have a hard time pitching stories which do not appeal to advertisers. Jason said that the publishing business model is, in general, in trouble and hasn't yet figured out the changes that have come with the Internet.

As an aside, Zonker asked how many members of the audience run AdBlock (quite a few hands were raised). Those people were told that they are "killing publishing, seriously."

Next question: who is the audience for what the panelists are writing? Ryan said that Ars Technica has a highly diverse audience, since it is not just a Linux-related site. Its readers are technology enthusiasts who (advertisers are told) will take what they learn to the workplace. Jason writes for enterprise information technology workers, while Zonker writes for a number of different publications (including LWN) with a variety of audiences. Steven, too, writes for many audiences.

What about companies becoming their own publishers? Steven claimed that people are becoming confused by publications which really just carry the company line, as opposed to what a real reporter would say. Readers are not asking often enough where a particular bit of news comes from. Ryan noted that open source companies are much more transparent than many others, so information tends to be more accessible; community members can use that information to get the word out, reducing the need for traditional journalism. But Steven noted that these companies always have something that they are not saying - he mentioned silent fixes in Mozilla releases - so there is still a need for people who will dig through stuff. Zonker said that what's often missing is context; he suggested that people will wander into (for example) the GNOME census story without understanding all that's going on.

At that point, time ran out for this standing-room-only session. In your editor's opinion, it was an interesting look at how the more traditional media sees our community and the pressures that reporters are working under. Those people, too, are operating in a rapidly changing world; they have the challenging task of documenting those changes while being very much in the middle of them.

Page editor: Jonathan Corbet

Security

EFF analyzes SSL certificates and certificate authorities

August 11, 2010

This article was contributed by Nathan Willis

The Electronic Frontier Foundation (EFF) undertook a lengthy study of the web's publicly-visible SSL certificates earlier this year, logging the certificates and analyzing their makeup. EFF's Senior Staff Technologist Peter Eckersley and iSEC Partners' Jesse Burns presented the project at DEFCON 18, paying particular attention to what the data reveals about SSL certificate authorities (CAs), which are the entities that sign SSL certificates and are the basis of the trust that web browsers place in certificates. As one might suspect, potential vulnerabilities were discovered "in the wild", encompassing everything from cryptographically weak certificates to CAs that employ troublesome signing behavior.

The data

Over a period of three months, the SSL Observatory project collected SSL certificate data from around the Internet. The process began by scanning with Nmap for hosts that were listening on port 443 and logging the results. A Python client then initiated an SSL connection handshake with each host, dropping the connection before the key exchange, but saving the certificate and other data for later analysis.

The certificates then needed to be parsed, which is not a trivial task due to the many quirks, options, extensions, missing fields, and other oddities found in real-world certificates. The certificate data was then stored in a MySQL database that the team could query to examine particular facets of the certificates or the CAs themselves.

The Observatory collected more than 4.3 million certificates, but for analysis purposes pared the list down to only those certificates that could be verified by a browser as valid (based on CA signature, date, key usage, and other attributes), and removed duplicates. That process left 1,377,067 individual certificates.

The individual certificates were then mined for CA trust attributes. SSL uses the X.509 standard, in which each "leaf" certificate can be authenticated by a signature from a trusted CA. The browser retrieves the CA certificate, uses the CA's public key to verify the signature found on the leaf, and then verifies the CA certificate itself by checking the signatures made on it by higher-authority CAs, following such a path all the way back to a "root" CA.
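The path-following part of that check can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the `Cert` record and the `chain_to_root` function are hypothetical names invented here, certificates are matched by subject name alone, and no cryptographic signature is actually verified, which a real validator must do at every step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    """Toy stand-in for an X.509 certificate (hypothetical, no real crypto)."""
    subject: str
    issuer: str


def chain_to_root(leaf, known_certs, trusted_roots, max_depth=10):
    """Follow issuer links from a leaf toward a trusted root.

    Returns the list of subjects on the path, or None if no trusted root
    is reached. A real validator would also verify each signature, the
    validity dates, and the key-usage constraints along the way.
    """
    by_subject = {c.subject: c for c in known_certs}
    path = [leaf.subject]
    current = leaf
    for _ in range(max_depth):
        # A trusted root is self-signed and on the browser's predefined list.
        if current.subject in trusted_roots and current.issuer == current.subject:
            return path
        parent = by_subject.get(current.issuer)
        if parent is None or parent.subject in path:
            return None  # unknown issuer, or a loop in the issuer links
        path.append(parent.subject)
        current = parent
    return None


# Invented example data: a leaf, one subordinate CA, and one root CA.
root = Cert("Example Root CA", "Example Root CA")
sub = Cert("Example Sub CA", "Example Root CA")
leaf = Cert("www.example.org", "Example Sub CA")
print(chain_to_root(leaf, [root, sub], {"Example Root CA"}))
```

The depth limit and the loop check are design choices of this sketch, not something described in the talk; they simply keep a malformed set of certificates from causing an endless walk.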

Browsers and operating systems come with a predefined list of trusted root CAs. The DEFCON slides note that Mozilla browsers recognize 124 trust roots, and Microsoft lists 19 in Windows — although the Windows list can be updated, silently, on demand. The reason for the large size discrepancy between the two sets is not explained; Mozilla makes its list public, however, and Microsoft periodically publishes a PDF list of trusted "root certificate program members" on its support site. Combined, those two sources of CA trust construct a graph of 1482 total CAs (including roots and subordinates), which was found by the SSL Observatory data to come from 651 distinct organizations when corporate ownership relationships are accounted for.

Analysis

The invalid certificates are a security problem in their own right (such as phishing sites masquerading as well-known organizations, or mobile network operators operating WAP gateways with wildcard certificates), but the Observatory project was more interested in exploring the makeup of the valid certificates. The web of trust created by the CA model is taken for granted by most users and developers, so problems with the valid certificates constitute a potentially more damaging security risk.

The first peculiarity found in the certificate data is that a handful of certificates account for a startlingly high percentage of the Internet's signed leaf certificates. Of the 1,377,067 valid certificates, 300,224 of them were signed by a single CA certificate from web hosting company GoDaddy. Another 244,185 were signed by one certificate from Equifax. 89,216 were signed by a Thawte certificate that lacks a Subject Key Identifier (i.e. SKID, a required element that browsers are supposed to use to match the signing key's ID against the key ID found in the signatures on signed certificates), and 85,440 by a single Comodo certificate. Together, these four CA certificates account for signatures on 52.2% of all valid SSL certificates.
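The combined share can be checked directly from the counts quoted above. The short Python calculation below uses only those figures; the dictionary labels are just shorthand chosen for this example.

```python
# Counts of valid leaf certificates, as quoted in the text above.
valid_total = 1_377_067
signed_by = {
    "GoDaddy": 300_224,
    "Equifax": 244_185,
    "Thawte (no SKID)": 89_216,
    "Comodo": 85_440,
}

combined = sum(signed_by.values())
share = 100 * combined / valid_total
print(f"{combined} of {valid_total} valid certificates = {share:.1f}%")
# prints "719065 of 1377067 valid certificates = 52.2%"
```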

The data also revealed a large number of unused certificates and valid CA certificates that share the same public signing key — situations that may be acceptable, but also entail risks. Unused CA certificates, for example, may be made available as a "safety switch" for the CA, allowing it to revoke a compromised certificate and switch to another one that is already known to the world's browsers. There are also situations where it makes sense for multiple CAs to share the same signing key, such as in the case of corporate mergers or acquisitions, where the CAs become legally one entity but continue to use their original issuer identities in business.

But these situations can also cause trouble for browsers, which should not, for example, retain certificates that are unused because the CA has ceased operations. The DEFCON talk also explains how a CA can use the same signing key to artificially extend the life of a signature by creating a second CA certificate identical to the original, but with a later expiration date. The data set found 80 distinct keys used in more than one CA certificate.
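Finding keys that appear in more than one CA certificate amounts to grouping certificates by their public key. The sketch below shows one way that could be done in Python; it is not the Observatory's actual query, and the function name, the `(certificate label, key fingerprint)` record layout, and the sample values are all assumptions made for illustration.

```python
from collections import defaultdict


def keys_in_multiple_ca_certs(ca_certs):
    """Group CA certificates by public-key fingerprint.

    `ca_certs` is an iterable of (cert_label, key_fingerprint) pairs, a
    hypothetical layout where cert_label uniquely identifies one
    certificate. Returns only the keys found in more than one certificate.
    """
    by_key = defaultdict(list)
    for cert_label, fingerprint in ca_certs:
        by_key[fingerprint].append(cert_label)
    return {fp: labels for fp, labels in by_key.items() if len(labels) > 1}


# Invented sample: the same key reissued with a later expiration date.
sample = [
    ("Example CA (expires 2012)", "aa:11"),
    ("Example CA (expires 2020)", "aa:11"),
    ("Other CA", "bb:22"),
]
shared = keys_in_multiple_ca_certs(sample)
print(len(shared), "key(s) used in more than one CA certificate")
```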

Practical concerns and vulnerabilities

There are also examples of CAs signing what should not be considered valid certificates, such as RFC1918-reserved IP addresses or unqualified host names. There are, for example, more than 6000 certificates validated by trusted CAs for "localhost."
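A check for those two cases is easy to sketch with Python's standard `ipaddress` module. The function name `suspicious_subject` and the rule that a name without a dot counts as unqualified are assumptions of this example, not the Observatory's own criteria.

```python
import ipaddress

# The three private IPv4 ranges reserved by RFC 1918.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def suspicious_subject(name):
    """Return a reason string if `name` looks out of place in a publicly
    trusted certificate, or None if neither check fires."""
    try:
        addr = ipaddress.ip_address(name)
    except ValueError:
        addr = None  # not an IP address literal, treat it as a host name
    if addr is not None:
        if any(addr in net for net in RFC1918_NETWORKS):
            return "RFC 1918 private address"
        return None
    if "." not in name.rstrip("."):
        return "unqualified host name"
    return None


for subject in ["localhost", "192.168.1.1", "www.example.org"]:
    print(subject, "->", suspicious_subject(subject))
```

The sketch deliberately lists the RFC 1918 ranges explicitly rather than relying on the module's broader `is_private` property, which also covers loopback and other special-purpose ranges.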

Other problem areas include countries that do not use their own national CA for signing government web sites, certificates that feature conflicting settings (e.g., a certificate that indicates the subject is not a CA, but whose key advertises that it is a CA), and certificates with weak public keys. The presentation identified at least two trusted leaf certificates that use 508-bit RSA keys, signed by Equifax and Thawte.
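Two of those checks, weak keys and conflicting CA settings, can be illustrated on toy records. Everything in this sketch is an assumption made for illustration: the dictionary layout, the function name, the 1024-bit threshold, and the reading of "conflicting settings" as a certificate whose basic-constraints CA flag is false while its key usage still permits certificate signing (`keyCertSign`).

```python
def audit_certificate(cert, min_rsa_bits=1024):
    """Return a list of problem descriptions for one toy certificate record.

    `cert` is a hypothetical dict with the keys "rsa_bits" (int), "is_ca"
    (bool, standing in for the basicConstraints CA flag), and "key_usage"
    (a set of key-usage names).
    """
    problems = []
    if cert["rsa_bits"] < min_rsa_bits:
        problems.append(f"weak RSA key ({cert['rsa_bits']} bits)")
    if not cert["is_ca"] and "keyCertSign" in cert["key_usage"]:
        problems.append("not marked as a CA, but key usage allows certificate signing")
    return problems


# Invented records: one with a 508-bit key, one with conflicting settings.
weak = {"rsa_bits": 508, "is_ca": False, "key_usage": {"digitalSignature"}}
conflicted = {"rsa_bits": 2048, "is_ca": False, "key_usage": {"keyCertSign"}}
print(audit_certificate(weak))
print(audit_certificate(conflicted))
```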

A more serious problem is certificates that use keys generated by the vulnerable OpenSSL package included in Debian between 2006 and 2008. SSL Observatory cataloged around 28,000 vulnerable certificates, although only 500 are still valid today. Eckersley said that EFF is working with the sites found to fix the vulnerability before it releases the SSL Observatory data set for public consumption.

Aside from the issue of sites still relying on such vulnerable certificates, however, the analysis also found that some CAs still have not issued revocations of the vulnerable certificates that they themselves signed.

Finally, the Observatory team generated a "CA map" connecting the trust relationships between the various root CAs and the subordinate CAs found in the data set. A PDF version of the map is available on the SSL Observatory site. Analysis of that graph, the team noted, reveals some potential problems, including the large number of subordinate-of-subordinate CAs, and important entities that are subordinate CAs — such as Google and the US Department of Homeland Security. There are only 46 countries that have valid root CAs, and there are some surprising absences, such as the Russian Federation and the United Arab Emirates.

The precise risks of a large or important entity not operating its own trust root are probably open for discussion. Surely in the national governments' cases, though, it is unwise to outsource the government's trust to a private third party, especially a foreign one. Corporations with only a subordinate CA may be at risk of being "held up" by a company beyond their control.

The future of CA trust

The DEFCON presentation closes by posing questions about the future of CA-based SSL. As the site puts it, "the security of HTTPS is only as strong as the practices of the least trustworthy/competent CA," so perhaps the security model needs to be revised, at least reducing complexity and cost.

The project stops short of endorsing specific reforms, instead saying that it hopes the data it has collected so far will serve as a useful research tool for others, and foster more openness and accountability of CAs. Eckersley said that the full data set will be published on August 23, after the team has completed its work to privately discuss the insecure certificates it uncovered.

Mozilla's Johnathan Nightingale applauds the project on his blog, discussing the considerable effort the browser maker puts into maintaining its trusted root CA list, and hopes that the EFF will continue to collect certificates and update the data set on an ongoing basis.

At the very least, this batch of work serves an important purpose, pulling back the curtain on a security layer the web depends on, but which rarely gets discussed. The concerns raised by the analysis thus far cover a wide range: some, like the 6000 "localhost" certificates, are bright red flags, but others are not so straightforward. Is there an attack vector that exploits the fact that more than 300,000 sites all rely on the same GoDaddy CA certificate? Perhaps not, but the mess that would be created if GoDaddy's signing key were to fall into nefarious hands tonight would be tremendous in scope, causing widespread confusion to the majority of the public when the registrar revokes the key and has to re-sign all of its customers' certificates.

The CA-based SSL certificate system has been the target of usability work at Mozilla for quite some time — such as the introduction of "Larry the Passport Officer" in Firefox 3. But there is still a long way to go before the average user will be well-informed about all of the potential risks. X.509 certificates themselves may be more complicated than the average layperson wants to understand, so high-level analysis like SSL Observatory's helps simply by shedding light on the potential problems.

Brief items

Security quotes of the week

Q: What could such a worm do on my phone?

A: Anything. It could do anything you can do on your phone, and more. So it could destroy or steal all of your data. Track your location. Spam your friends. Listen to your phone calls. Dial the presidents of every country in the world. Anything. And you would pay for all the charges it would create, too.

-- F-Secure's FAQ on the JailbreakMe iPhone vulnerabilities

It's also clear that users should have different rights with respect to each data type. We should be allowed to export, change, and delete disclosed data, even if the social networking sites don't want us to. It's less clear what rights we have for entrusted data -- and far less clear for incidental data. If you post pictures from a party with me in them, can I demand you remove those pictures -- or at least blur out my face?
-- Bruce Schneier in "A Revised Taxonomy of Social Networking Data"

The EFF SSL Observatory

The EFF has put up a new page for a project which it calls the SSL Observatory. The project has spent months collecting information about SSL certificates across the net; as one might expect, it has found some interesting things. The results are only available as a set of slides [PDF] for now, but they are worth a look. It seems there are over 6,000 valid certificates out there for "localhost"...

Weekend Project: Secure Instant Messaging with Off The Record (Linux.com)

Linux.com has an article by Nathan Willis about Off The Record (OTR). "Instant messaging, just like email or VoIP traffic, needs to be secure from eavesdroppers, man-in-the-middle attackers, and other security threats. Many IM clients can tunnel messages over transport layer security (TLS) to provide encryption, including XMPP (a.k.a. Jabber), IRC, and the OSCAR protocol used by AIM. TLS provides authentication and encryption at a low level, but a considerably more secure solution for IM is a protocol called Off The Record (OTR). Pull up a chair and secure your instant messaging today."

New vulnerabilities

base-files: arbitrary code execution

Package(s): base-files    CVE #(s): CVE-2010-0834
Created: August 6, 2010    Updated: August 11, 2010
Description: From the Ubuntu advisory:

It was discovered that the Ubuntu image shipped on some Dell Latitude 2110 systems was accidentally configured to allow unauthenticated package installations. A remote attacker intercepting network communications or a malicious archive mirror server could exploit this to trick the user into installing unsigned packages, resulting in arbitrary code execution with root privileges.

Alerts:
Ubuntu USN-968-1 2010-08-05

dbus-glib: denial of service

Package(s): dbus-glib    CVE #(s): CVE-2010-1172
Created: August 11, 2010    Updated: May 27, 2011
Description: From the Red Hat advisory:

It was discovered that dbus-glib did not enforce the "access" flag on exported GObject properties. If such a property were read/write internally but specified as read-only externally, a malicious, local user could use this flaw to modify that property of an application. Such a change could impact the application's behavior (for example, if an IP address were changed the network may not come up properly after reboot) and possibly lead to a denial of service.

Alerts:
Ubuntu USN-1138-2 2011-05-27
Ubuntu USN-1138-1 2011-05-26
SUSE SUSE-SR:2011:007 2011-04-19
openSUSE openSUSE-SU-2011:0300-2 2011-04-08
openSUSE openSUSE-SU-2011:0300-1 2011-04-06
MeeGo MeeGo-SA-10:32 2010-10-09
SUSE SUSE-SR:2010:022 2010-11-30
openSUSE openSUSE-SU-2010:0969-1 2010-11-23
openSUSE openSUSE-SU-2010:0968-1 2010-11-23
SUSE SUSE-SR:2010:020 2010-11-03
Fedora FEDORA-2010-12911 2010-08-17
Fedora FEDORA-2010-12863 2010-08-17
CentOS CESA-2010:0616 2010-08-11
Red Hat RHSA-2010:0616-01 2010-08-10
SUSE SUSE-SR:2010:019 2010-10-25

freetype: arbitrary code execution

Package(s): freetype    CVE #(s): CVE-2010-1797
Created: August 6, 2010    Updated: October 21, 2010
Description: From the Red Hat advisory:

Two stack overflow flaws were found in the way the FreeType font engine processed certain Compact Font Format (CFF) character strings (opcodes). If a user loaded a specially-crafted font file with an application linked against FreeType, it could cause the application to crash or, possibly, execute arbitrary code with the privileges of the user running the application.

Alerts:
Debian DSA-2105-1 2010-09-07
SUSE SUSE-SR:2010:016 2010-08-26
openSUSE openSUSE-SU-2010:0549-1 2010-08-25
Fedora FEDORA-2010-15705 2010-10-05
Ubuntu USN-972-1 2010-08-17
CentOS CESA-2010:0607 2010-08-16
Pardus 2010-114 2010-08-12
Mandriva MDVSA-2010:149 2010-08-12
CentOS CESA-2010:0607 2010-08-06
Red Hat RHSA-2010:0607-02 2010-08-05
Mandriva MDVSA-2010:201 2010-10-13
Gentoo 201201-09 2012-01-23
SUSE SUSE-SU-2012:0553-1 2012-04-23

git: arbitrary code execution

Package(s): git    CVE #(s): CVE-2010-2542
Created: August 9, 2010    Updated: February 22, 2011
Description: From the Pardus advisory:

An exploitable buffer overrun was fixed in git. In particular, if an attacker were to create a crafted working copy where the user runs any git command, the attacker could force execution of arbitrary code.

Alerts:
SUSE SUSE-SR:2011:004 2011-02-22
openSUSE openSUSE-SU-2011:0115-1 2011-02-16
MeeGo MeeGo-SA-10:26 2010-09-03
Debian DSA-2114-1 2010-09-26
Fedora FEDORA-2010-15534 2010-09-30
Mandriva MDVSA-2010:194 2010-10-03
Pardus 2010-103 2010-08-09
Fedora FEDORA-2010-15501 2010-09-30

kernel: multiple vulnerabilities

Package(s): kernel    CVE #(s): CVE-2008-7256 CVE-2010-1436 CVE-2010-1643 CVE-2010-2492
Created: August 5, 2010    Updated: January 21, 2011
Description: From the Ubuntu advisory:

Junjiro R. Okajima discovered that knfsd did not correctly handle strict overcommit. A local attacker could exploit this to crash knfsd, leading to a denial of service. (Only Ubuntu 6.06 LTS and 8.04 LTS were affected.) (CVE-2008-7256, CVE-2010-1643)

Mario Mikocevic discovered that GFS2 did not correctly handle certain quota structures. A local attacker could exploit this to crash the system, leading to a denial of service. (Ubuntu 6.06 LTS was not affected.) (CVE-2010-1436)

Andre Osterhues discovered that eCryptfs did not correctly calculate hash values. A local attacker with certain uids could exploit this to crash the system or potentially gain root privileges. (Ubuntu 6.06 LTS was not affected.) (CVE-2010-2492)

Alerts:
Red Hat RHSA-2011:0007-01 2011-01-11
MeeGo MeeGo-SA-10:38 2010-10-09
openSUSE openSUSE-SU-2010:0664-1 2010-09-23
Mandriva MDVSA-2010:188 2010-09-23
Debian DSA-2110-1 2010-09-17
Mandriva MDVSA-2010:172 2010-09-09
Mandriva MDVSA-2010:198 2010-10-07
Red Hat RHSA-2010:0631-01 2010-08-17
Ubuntu USN-966-1 2010-08-04
CentOS CESA-2010:0723 2010-09-30
Red Hat RHSA-2010:0723-01 2010-09-29

kernel: denial of service

Package(s): kernel    CVE #(s): CVE-2010-2248 CVE-2010-2521
Created: August 6, 2010    Updated: March 21, 2011
Description: From the Red Hat advisory:

* a flaw was found in the CIFSSMBWrite() function in the Linux kernel Common Internet File System (CIFS) implementation. A remote attacker could send a specially-crafted SMB response packet to a target CIFS client, resulting in a kernel panic (denial of service). (CVE-2010-2248, Important)

* buffer overflow flaws were found in the Linux kernel's implementation of the server-side External Data Representation (XDR) for the Network File System (NFS) version 4. An attacker on the local network could send a specially-crafted large compound request to the NFSv4 server, which could possibly result in a kernel panic (denial of service) or, potentially, code execution. (CVE-2010-2521, Important)

Alerts:
Mandriva MDVSA-2011:051 2011-03-18
Ubuntu USN-1083-1 2011-03-03
Ubuntu USN-1074-2 2011-02-28
Ubuntu USN-1074-1 2011-02-25
SUSE SUSE-SA:2010:060 2010-12-14
Red Hat RHSA-2010:0907-01 2010-11-23
Red Hat RHSA-2010:0893-01 2010-11-16
openSUSE openSUSE-SU-2010:0664-1 2010-09-23
Mandriva MDVSA-2010:188 2010-09-23
SUSE SUSE-SA:2010:040 2010-09-13
SUSE SUSE-SA:2010:038 2010-09-03
SUSE SUSE-SA:2010:036 2010-09-01
CentOS CESA-2010:0606 2010-08-27
Ubuntu USN-1000-1 2010-10-19
Mandriva MDVSA-2010:198 2010-10-07
Debian DSA-2094-1 2010-08-19
Red Hat RHSA-2010:0631-01 2010-08-17
Pardus 2010-112 2010-08-12
CentOS CESA-2010:0610 2010-08-11
Red Hat RHSA-2010:0610-01 2010-08-10
Red Hat RHSA-2010:0606-01 2010-08-05

openconnect: man-in-the-middle attack

Package(s): openconnect    CVE #(s):
Created: August 11, 2010    Updated: August 11, 2010
Description: From the Fedora advisory:

This update enables validation of the VPN server's SSL certificate by default, to defend against a potential man-in-the-middle attack.

Alerts:
Fedora FEDORA-2010-12253 2010-08-07
Fedora FEDORA-2010-12257 2010-08-07

php5: denial of service

Package(s): php5    CVE #(s): CVE-2010-1917
Created: August 6, 2010    Updated: December 2, 2010
Description: From the Debian advisory:

The fnmatch function can be abused to conduct denial of service attacks (by crashing the interpreter) by the means of a stack overflow.

Alerts:
Gentoo 201110-06 2011-10-10
CentOS CESA-2010:0919 2010-12-01
CentOS CESA-2010:0919 2010-11-30
Red Hat RHSA-2010:0919-01 2010-11-29
SUSE SUSE-SR:2010:017 2010-09-21
Ubuntu USN-989-1 2010-09-20
openSUSE openSUSE-SU-2010:0599-1 2010-09-10
Slackware SSA:2010-240-04 2010-08-30
Fedora FEDORA-2010-11428 2010-07-27
Fedora FEDORA-2010-11481 2010-07-27
Debian DSA-2089-1 2010-08-06
openSUSE openSUSE-SU-2010:0678-1 2010-09-29
SUSE SUSE-SR:2010:018 2010-10-06

socat: arbitrary code execution

Package(s): socat    CVE #(s): CVE-2010-2799
Created: August 9, 2010    Updated: March 7, 2011
Description: From the Debian advisory:

A stack overflow vulnerability was found in socat that allows an attacker to execute arbitrary code with the privileges of the socat process.

Alerts:
Fedora FEDORA-2011-0098 2011-01-04
Mandriva MDVSA-2010:183 2010-09-15
Fedora FEDORA-2010-13403 2010-08-24
Fedora FEDORA-2010-13412 2010-08-24
Debian DSA-2090-1 2010-08-06

strongswan: code execution

Package(s): strongswan    CVE #(s): CVE-2010-2628
Created: August 10, 2010    Updated: August 17, 2010
Description: From the openSUSE advisory:

Remote unauthenticated attackers could cause a buffer overflow in strongswan's IKE daemon by using specially crafted certificates or identity information. Attackers could potentially exploit that to execute code.

Alerts:
SUSE SUSE-SR:2010:015 2010-08-17
openSUSE openSUSE-SU-2010:0496-1 2010-08-10

wget: code execution

Package(s): wget    CVE #(s): CVE-2010-2252
Created: August 5, 2010    Updated: October 14, 2011
Description: From the Debian advisory:

It was discovered that wget, a command line tool for downloading files from the WWW, uses server-provided file names when creating local files. This may lead to code execution in some scenarios.

Alerts:
Gentoo 201110-10 2011-10-13
Pardus 2011-29 2011-02-12
MeeGo MeeGo-SA-10:36 2010-11-03
Mandriva MDVSA-2010:170 2010-09-02
Ubuntu USN-982-1 2010-09-02
Debian DSA-2088-1 2010-08-05

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The 2.6.36 merge window is still open, so no development kernel release is available yet. See the article below for a summary of the merges made in the last week.

Four stable kernels were released on August 10: 2.6.27.50, 2.6.32.18, 2.6.34.3, and 2.6.35.1.

Quotes of the week

I don't think the situation is in fact deteriorating. We're shipping decent releases, growing our user base, within and without the kernel developer community, and still have plenty of major feature areas to work on. We have not seen regressive LKML obstructions, though admittedly that is a low standard when it comes to serving the community.
-- SystemTap maintainer Frank Eigler

If my corporate overlords told me I had to use my Exchange "messaging" account for external email communication, they would get a quite clear 'no' in response. My response may also contain suggestions that they use certain other objects for purposes for which they were not designed.

Seriously, just use an external email account and ignore the broken corporate policy. 'Policy' is just a euphemism for not having to think for yourself.

-- David Woodhouse

Kernel development news

2.6.36 merge window: the sequel

By Jonathan Corbet
August 11, 2010
As of this writing, some 6700 non-merge changesets have been accepted for the 2.6.36 development cycle. These changes bring a lot of fixes and a number of new features, some of which have been in the works for some time. The most interesting changes since last week's summary are summarized here.

User-visible changes include:

  • The ext3 filesystem, once again, defaults to the (safer) "ordered" mode at mount time. This reverses the change (to "writeback" mode) made in 2009, which was typically overridden by distributions.

  • The out-of-memory killer has been rewritten. The practical result is that the system may choose different processes to kill in out-of-memory situations, and the user-space API for adjusting how attractive processes appear to the OOM killer has changed.

  • The fanotify mechanism has been merged. Fanotify allows a user-space daemon to obtain notification of file operations and, perhaps, block access to specific files. It is intended for use with malware scanning applications, but there are other potential uses (hierarchical storage management, for example) as well.

  • There is a new system call for working with resource limits:

        int prlimit64(pid_t pid, unsigned int resource, 
                      const struct rlimit64 *new_rlim, struct rlimit64 *old_rlim);
    

    It is meant to (someday) replace setrlimit(); the differences include the ability to modify limits belonging to other processes and the ability to query and set a limit in a single operation.

  • The TTY driver has gained support for the EXTPROC mode supported by BSD for the last 20 years or so. This option was originally developed to facilitate telnet's "linemode", but it is useful for contemporary protocols as well.

  • New drivers:

    • Processors and systems: Ingenic JZ4740 SOC systems, Trapeze ITS GPR boards, ifm PDM360NG boards, Freescale P1022DS reference boards, TQM mcp8xx-based boards, TI TNETV107X-based systems, OMAP4430-based PandaBoards, NVIDIA Tegra-based systems, and Tilera TILEPro and TILE64 processors (a whole new architecture).

    • Block: QLogic ISP82XX host adaptors, AppliedMicro 460EX processor on-chip SATA controllers, Samsung S3C/S5P board PATA controllers, and Moorestown NAND Flash controllers.

    • Media: EasyCAP USB video adapters, Softlogic 6x10 MPEG codec cards, Winbond/Nuvoton NUC900-based audio controllers, Cirrus Logic CS42L51 codecs, Cirrus Logic EP93xx series audio devices, Marvell Kirkwood I2S audio devices, Ingenic JZ4740-based audio devices, SmartQ board audio devices, Wolfson Micro WM8741 codecs, and Samsung S5P FIMC video postprocessors.

    • Miscellaneous: Silicon Image sil164 TMDS transmitters, TI DSP bridge devices, PCILynx TSB12LV21/A/B controllers (as a FireWire sniffer; the user-space side has also been added under tools/firewire), Bosch Sensortec BMP085 digital pressure sensors, ROHM BH1780GLI ambient light sensors, Honeywell HMC6352 compasses, Summit Microelectronics SMM665 six-channel active DC output controller/monitor devices, JEDEC JC 42.4 compliant temperature sensors, Intel Topcliff PCH DMA controllers, Intel Moorestown DMAC1 and DMAC2 controllers, Intel Moorestown MAX3110 and MAX3107 UARTs, Intel Medfield UARTs, Quatech SSU-100 USB serial ports, and ARM Primecell SP805 watchdog timers.
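
The query-and-set behavior of prlimit64() described in the list above can be tried from user space. What follows is only a minimal sketch, assuming a Linux system running 2.6.36 or later and Python 3.4 or later, whose resource.prlimit() wraps the C library's prlimit() call. The helper name swap_soft_limit() is made up for this illustration and is not part of any API.

```python
import resource


def swap_soft_limit(pid, which, new_soft):
    """Set the soft limit of `which` for `pid`; return the old (soft, hard) pair.

    `new_soft` is expected to be a finite, non-negative value.
    """
    # Query first, so that the existing hard limit can be left untouched.
    _, hard = resource.prlimit(pid, which)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged caller may not raise the soft limit above the hard one.
        new_soft = min(new_soft, hard)
    # This single call installs the new pair and hands back the previous one.
    return resource.prlimit(pid, which, (new_soft, hard))
```

Calling swap_soft_limit(0, resource.RLIMIT_CORE, 0) disables core dumps for the calling process (a pid of zero means "this process") and returns the previous limits. Pointing it at another process should work as well, provided the caller has the needed privileges.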

Changes visible to kernel developers include:

  • The SCSI layer now supports runtime power management, but almost no work has been done (yet) to push that support down into individual drivers.

  • The MIPS architecture now has kprobes support.

  • The KGDB debugger is now supported with the Microblaze architecture.

  • There are a few new build-time configuration commands: listnewconfig outputs a list of new configuration options, oldnoconfig sets all new configuration options to "no" without asking, alldefconfig sets all options to their default values, and savedefconfig writes a minimal configuration file in defconfig. (The patch adding the first two options above also introduces a new Whatevered-by: patch tag, with unknown semantics.)

  • There is a new scripts/coccinelle directory containing a number of Coccinelle "semantic patches" which perform various useful checks. They can be run with "make coccicheck".

  • The kmemtrace ftrace plugin is gone; "perf kmem" should be used instead. The ksym plugin has also been superseded by perf, and, thus, removed.

  • There is a new function for short, blocking delays:

        void usleep_range(unsigned long min, unsigned long max);
    

    This function will sleep (uninterruptibly) for a period between min and max microseconds. It is based on hrtimers, so the timing will be more precise than obtained with msleep().

  • The new IRQF_NO_SUSPEND flag for request_irq() will cause the interrupt line not to be disabled during suspend; IRQF_TIMER can no longer be (mis)used for this purpose.

  • The concurrency-managed workqueues patch set has been merged, completely changing the way workqueues are implemented. One immediate user-visible result will be that there should be far fewer kernel threads running on most systems. All users of the "slow work" API have been converted to concurrency-managed workqueues, so the slow work mechanism has been removed from the kernel.

  • The cpuidle mechanism has been enhanced to allow for the set of available idle states to change over time. Details can be found in this patch.

  • The Blackfin architecture has gained dynamic ftrace support.

  • There is a new super_operations method called evict_inode(); it handles all of the necessary work when an in-core inode is being removed. It should be used instead of clear_inode() and delete_inode().

  • The inotify mechanism has been removed from inside the kernel; the fsnotify mechanism must be used instead. (Of course, the user-space inotify interface is still supported).

  • The Video4Linux2 layer has gained a new framework which simplifies the handling of controls; see this commit and Documentation/video4linux/v4l2-controls.txt for details.

  • The open() and release() functions in struct block_device_operations are now called without the big kernel lock held. Additionally, the locked_ioctl() function has gone away; all block drivers must implement their own locking there as well.

  • The domain name resolution code has been pulled out of the CIFS filesystem and made generic. It works by using the key mechanism to request DNS resolution from user space; see Documentation/networking/dns-resolver.txt for details.

The merge window remains open as of this writing, so we may yet see more interesting features merged for 2.6.36. Watch this space next week for the final merge window updates for this development cycle.

The 2010 Linux Storage and Filesystem Summit, day 1

By Jonathan Corbet
August 9, 2010
The fourth Linux storage and filesystem summit was held August 8 and 9 in Boston, immediately prior to LinuxCon. This time around, a number of developers from the memory management community were present as well. Your editor was also there; what follows are his notes from the first day of the summit.

Testing tools

The first topic of the workshop was "testing and tools," led by Eric Sandeen. The 2009 workshop identified a generic test suite as something that the community would very much like to have. One year later, quite a bit of progress has been made in the form of the xfstests package. As the name suggests, this test suite has its origins in the XFS filesystem, and it is still somewhat specific to XFS. But, with the addition of generic testing last May, about 70 of the 240 tests are now generic. Xfstests is concerned primarily with regression testing; it is not, generally, a performance-oriented test suite. Tests tend to get added when somebody stumbles across a bug and wants to verify that it's fixed - and that it stays fixed. Xfstests also does not have any real facility for the creation of test filesystems; tools like Impressions are best used for that purpose.

About 40 new tests have been added to xfstests over the last year; it is now heavily used in ext4 development. Most tests look for specific bugs; there isn't a whole lot of coverage for extreme situations - millions of files in one directory and such. Those tests just tend to take too long.

It was emphasized that just running xfstests is not enough on its own; tests must be run under most or all reasonable combinations of mount options to get good coverage. Ric Wheeler also pointed out that different types of storage have very different characteristics. Most developers, he fears, tend to test things on their laptops and call the result good. Testing on other types of storage, of course, requires access to the hardware; USB sticks are easy, but not all developers can test on enterprise-class storage arrays.
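
The number of runs implied by "most or all reasonable combinations" grows quickly, which a few lines of code make concrete. This is only a sketch: the option axes below are ordinary ext3/ext4 mount options chosen as an assumption for the example, the function name is hypothetical, and how the resulting strings are handed to a test suite is left out entirely.

```python
import itertools

# Example option axes (an assumption); adjust for the filesystem under test.
OPTION_AXES = [
    ["data=ordered", "data=writeback", "data=journal"],
    ["barrier=1", "barrier=0"],
    ["relatime", "noatime"],
]


def mount_option_matrix(axes):
    """Yield one comma-separated mount-option string per combination."""
    for combo in itertools.product(*axes):
        yield ",".join(combo)
```

Three small axes already yield twelve mount configurations, each of which would need a full test run; that is before different storage types are taken into account.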

Tests which exercise more of the virtual memory and I/O paths would also be nice. There is one package which covers much of this ground: FIO, available from kernel.dk. Destructive power failure testing is another useful area which Red Hat (at least) is beginning to do. There has also been some work done using hdparm to corrupt individual sectors on disk to see how filesystems respond. A wishlist item was better latency measurement, with an emphasis on seeing how long I/O requests sit within drivers which do their own queueing. It was suggested that what is really needed is some sort of central site for capturing wishlist ideas for future tests; then, whenever somebody has some time available, those ideas are available.

In an attempt to better engage the memory management developers in the room, it was asked: how can we make tests which cover writeback? The key, it seems, is to choose a workload which is large enough to force writeback, but not so large that it pushes the system into heavy swapping. One simple test is a large tar run; while that is happening, monitor /proc/vmstat to see when writeback kicks in, and when things get bad enough that direct reclaim is invoked. An arguably more representative test can be had with sysbench; again, the key is to tune it so that the shared buffers fit within physical memory.
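
Watching /proc/vmstat during such a run can be automated. The following is a rough sketch rather than a finished tool: it assumes that the counters of interest start with nr_dirty, nr_writeback, or allocstall (the last being a counter of direct reclaim stalls), but the exact set of names differs between kernel versions. The function names are invented for this example.

```python
import time

# Counter-name prefixes of interest; exact names vary between kernel versions.
WATCHED_PREFIXES = ("nr_dirty", "nr_writeback", "allocstall")


def parse_vmstat(text, prefixes=WATCHED_PREFIXES):
    """Return {name: value} for vmstat lines whose name matches a prefix."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0].startswith(prefixes):
            stats[fields[0]] = int(fields[1])
    return stats


def watch_vmstat(interval=1.0, samples=10, path="/proc/vmstat"):
    """Print the watched counters that changed since the previous sample."""
    previous = {}
    for _ in range(samples):
        with open(path) as f:
            current = parse_vmstat(f.read())
        changed = {k: v for k, v in current.items() if previous.get(k) != v}
        if changed:
            print(changed)
        previous = current
        time.sleep(interval)
```

Running watch_vmstat() in one terminal while the tar job runs in another shows when dirty and writeback page counts start to move, and whether the direct reclaim counters begin to climb.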

But, as Nick Piggin pointed out, anything that dirties memory is, in the end, a writeback test. The key is to find ways of making tests which adequately model real-world workloads.

Memory-management testing

Your editor is messing with the timing here, but the session on testing of memory management changes fits well with the above. So please ignore the fact that this session actually happened after lunch.

The question here is simple: how can memory management changes be tested to satisfy everybody? This is a subject which has been coming up for years; memory management changes seem to be especially subject to "have you tested with this other kind of workload?" questions. Developers find this frustrating; it never seems to be possible to do enough testing to satisfy everybody, especially since people asking for testing of different workloads are often unable or unwilling to supply an actual test program.

It was suggested that the real test should be "put the new code on the Google cluster and see if the Internet breaks." There are certain practical difficulties with this approach, however. So the question remains: how can a developer conclude that a memory management change actually works? Especially in a situation where "works" means different things to different people? There is far too wide a variety of workloads to test them all. Beyond that, memory management changes often involve tradeoffs - making one workload better may mean making another one worse. Changes which make life better for everybody are rare.

Still, it was agreed that a standard set of tests would help. Some suggestions were made, including hackbench, netperf, various database benchmarks (pgbench or sysbench, for example), and the "compilebench" test which is popular with kernel developers. There was also some talk of microbenchmarks; Nick Piggin noted that microbenchmarks are terrible when it comes to arguing for the inclusion of a change, but they can be useful for the detection of performance regressions.

Sometimes running a single benchmark is not enough; many memory management problems are only revealed when the system comes under a combination of stresses. And Andrea Arcangeli made the point that, in the end, only one test really matters: how much time does it take the system to complete running a workload which exceeds the amount of physical RAM available?

There was some discussion of the challenges involved in tracking down problems; Mel Gorman stated that the debuggability of the virtual memory subsystem is "a mess." Tracepoints can be useful for this purpose, but they are hard to get merged, partly due to Andrew Morton's hostility to tracepoints in general. There is also ongoing concern about the ABI status of tracepoints; what happens when programs (perhaps being run by large customers of enterprise distributions) depend on tracepoints which expose low-level kernel details? Those tracepoints may no longer make sense after the code changes, but breaking them may not be an option.

Filesystem freeze/thaw

The filesystem freeze feature enables a system administrator to suspend writes to a filesystem, allowing it to be backed up or snapshotted while in a consistent state. It had its origins in XFS, but has since become part of the Linux VFS layer. There are a few issues with how freeze works in current kernels, though.

The biggest of these problems is unmounting - what happens when the administrator unmounts a frozen filesystem? In current kernels, the whole thing hangs, leaving the system with an unusable, un-unmountable filesystem - behavior which does not further the Linux World Domination goal. So four possible solutions were proposed:

  1. Simply disallow the unmounting of frozen filesystems. Al Viro stated that this solution is not really an option; there are cases where the unmount cannot be disallowed. One such case arises when the final process exits the namespace in which the filesystem is mounted. Disallowing unmounts would also break the useful umount -l option, which is meant to work at all times.

  2. Keep the filesystem frozen across the unmount, so that the filesystem would still be frozen after the next mount. The biggest problem here is that there may be changes that the filesystem code needs to write to the device; if the system reboots before that can happen, bad things can result.

  3. Automatically thaw filesystems on unmount.

  4. Add a new ioctl() command which will cause the thawing of an unmounted filesystem.

Al suggested a variant on #3, in the form of a new freeze command. The proper way to handle freeze is to return a file descriptor; as long as that file descriptor is held open, the filesystem remains frozen. This solves the "last process exits" problem because the file descriptor will be closed as the process exits, automatically causing the filesystem to be thawed. Also, as Al noted, the kill command is often the system recovery tool of choice for system administrators, so having a suitably-targeted kill cause a frozen filesystem to be thawed makes sense.

There seemed to be a consensus that the file descriptor approach is the best long-term solution. Meanwhile, though, there are tools based on the older ioctl() commands which will take years to replace in the field. So we might also see an implementation of #4, above, to help in the near future.
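
The difference between the two approaches can be seen by wrapping the existing ioctl() commands in a scoped helper. This is a user-space approximation only, not the interface Al proposed: it uses the FIFREEZE and FITHAW commands from <linux/fs.h>, computes their numbers with the generic ioctl bit layout (an assumption that holds on x86 but not on every architecture), and needs root to run. The names frozen() and _iowr() are invented for the sketch.

```python
import contextlib
import fcntl
import os


def _iowr(type_char, nr, size):
    """Encode an _IOWR() ioctl number using the generic Linux bit layout."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


# From <linux/fs.h>: _IOWR('X', 119, int) and _IOWR('X', 120, int).
FIFREEZE = _iowr("X", 119, 4)
FITHAW = _iowr("X", 120, 4)


@contextlib.contextmanager
def frozen(mountpoint):
    """Freeze the filesystem at `mountpoint` for the duration of a with-block.

    Unlike a kernel-managed file descriptor, this cannot thaw the
    filesystem if the process is killed with SIGKILL inside the block.
    """
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        try:
            yield
        finally:
            # Runs on normal exit and on exceptions, but not on SIGKILL.
            fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)
```

A backup script could then write "with frozen('/mnt/data'): take_snapshot()", where both the path and take_snapshot() are placeholders. The weak spot is the one the file descriptor proposal addresses: a user-space cleanup handler never runs after a kill -9, whereas the kernel always closes a dying process's descriptors.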

Barriers

Contemporary filesystems go to great lengths to avoid losing data - or corrupting things - if the system crashes. To that end, quite a bit of thought goes into writing things to disk in the correct order. As a simple example, operations written to a filesystem journal must make it to the media before the commit record which marks those operations as valid. Otherwise, the filesystem could end up replaying a journal with random data, with an outcome that few people would love.

All of that care is for nothing, though, if the storage device reorders writes on their way to the media. And, of course, reordering is something that storage devices do all the time in the name of increasing performance. The solution, for years now, has been "barrier" operations; all writes issued before a barrier are supposed to complete before any writes issued after the barrier. The problem is that barriers have not always been well supported in the Linux block subsystem, and, when they are supported, they have a significant impact on performance. So, even now, many systems run with barriers disabled.

Barriers have been discussed among the filesystem and storage developers for years; it was hoped that this year, with the memory management developers present as well, some better solutions might be found.

There was some discussion about the various ways of implementing barriers and making them faster. The key point in the discussion, though, was the assertion that barriers are not really the same as the memory barriers they were patterned after. There are, instead, two important aspects to block subsystem barriers: request ordering and forcing data to disk. That led, eventually, to one of the clearest decisions in the first day of the summit: barriers, as such, will be no more. The problem of ordering will be placed entirely in the hands of filesystem developers, who will ensure ordering by simply waiting for operations to complete when needed. There will be no ordering issues as such in the block layer, but block drivers will be responsible for explicitly flushing writes to physical media when needed.

Whether this decision will lead to better-performing and more robust filesystem I/O remains to be seen, but it is a clearer description of the division of responsibilities than has been seen in the past.
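
The "wait for completion, then flush" pattern has a familiar user-space analogue, shown here purely as an illustration; it is not the kernel code, and the function name and record format are invented. The journal records are written and flushed with fsync() before the commit record is issued, so the commit can never reach the media ahead of the operations it validates.

```python
import os


def journaled_write(path, records, commit=b"COMMIT\n"):
    """Append `records`, wait for them to reach storage, then commit."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for record in records:
            os.write(fd, record)
        # Ordering by waiting: do not proceed until the records are durable.
        os.fsync(fd)
        # Only now is the commit record issued, followed by a second flush.
        os.write(fd, commit)
        os.fsync(fd)
    finally:
        os.close(fd)
```

The cost is visible in the structure: two synchronous flushes per transaction. That is the sort of performance impact that has led many systems to run with barriers disabled.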

Transparent hugepages

At this point, the summit split into three tracks for storage, filesystem, and memory management developers. Your editor followed the memory management track; with luck, we'll eventually have writeups from the other tracks as well.

Andrea Arcangeli presented his transparent hugepages work, starting with a discussion of the advantages of hugepages in general. Hugepages are a feature of many contemporary processors; they allow the memory management subsystem to use larger-than-normal page sizes in parts of the virtual address range. There are a number of advantages to using hugepages in the right places, but it all comes down to performance.

A hugepage takes much of the pressure off the processor's translation lookaside buffer (TLB), speeding memory access. When a TLB miss happens anyway, a 2MB hugepage requires traversing three levels of page table rather than four, saving a memory access and, again, reducing TLB pressure. The result is a doubling of the speed with which initial page faults can be handled, and better application performance in general. There can be some costs, especially when entire hugepages must be cleared or copied; that can wipe out much of the processor's cache. But this cost tends to be overwhelmed by the performance advantages that hugepages bring.
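
The arithmetic behind those claims is easy to check. The sketch below assumes x86-64 style paging: 48-bit virtual addresses, 4KB base pages, and nine address bits resolved per page-table level. The 64-entry TLB used in the usage note is a made-up figure for illustration, and the function names are hypothetical.

```python
BASE_PAGE = 4 * 1024          # 4KB base page
HUGE_PAGE = 2 * 1024 * 1024   # 2MB hugepage
VA_BITS = 48                  # assumed virtual address width (x86-64)
INDEX_BITS = 9                # address bits resolved per page-table level


def walk_levels(page_size):
    """Page-table levels traversed to translate an address in such a page."""
    offset_bits = page_size.bit_length() - 1
    return (VA_BITS - offset_bits) // INDEX_BITS


def tlb_reach(entries, page_size):
    """Bytes of address space covered by `entries` TLB entries."""
    return entries * page_size
```

Under these assumptions walk_levels() gives four levels for a base page and three for a hugepage, and a hypothetical 64-entry TLB covers 256KB with base pages but 128MB with hugepages, a factor of 512.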

Those advantages, incidentally, are multiplied when hugepages are used on systems hosting virtualized guests. Using hugepages in this situation can eliminate much of the remaining cost of running virtualized systems.

Hugepages on Linux are currently accessed through the hugetlbfs filesystem, which was discussed in great detail by Mel Gorman on LWN earlier this year. There are some real limitations associated with hugetlbfs, though: hugepages are not swappable, they must be reserved at boot time, there is no mixing of page sizes in the same virtual memory area, etc. Many of these problems could be fixed, but, as Andrea put it, hugetlbfs is becoming a sort of secondary - and inferior - Linux virtual memory subsystem. It is time to turn hugepages into first-class citizens in the real Linux VM.

Transparent hugepages eliminate much of the need for hugetlbfs by automatically grouping together sections of a process's virtual address space into hugepages when warranted. They take away the hassles of hugetlbfs and make it possible for the system to use hugepages with no need for administrator intervention or application changes. There seems to be a fair amount of interest in the feature; a Google developer said that the feature is attractive for internal use.

At the core of the patch is a new thread called khugepaged, which is charged with scanning memory and creating hugepages where it makes sense. Other parts of the VM can split those hugepages back into normally-sized pieces when the need arises. Khugepaged works by allocating a hugepage, then using the migration mechanism to copy the contents of the smaller component pages over. There was some talk of trying to defragment memory and "collapse in place" instead, but it doesn't seem worth the effort at this point. The amount of work to be done would be about the same except in the special case where a hugepage had been split and was being grouped back together before much had changed - a situation which is expected to be relatively rare.

Andrea put up a number of benchmarks showing how transparent hugepages improve performance; the all-important kernel compile benchmark (seen as a sort of worst case for hugepages) is 2.5% faster. Various other benchmarks show bigger improvements.

Transparent hugepages, it seems, will be enabled by default in the RHEL 6 kernel. Andrea would really like to get the feature into 2.6.36, but the merge window is already well advanced and it's not clear that things will work out that way. There is still a need to convince Linus that the feature is worthwhile, and, perhaps, some work to be done to enable the feature on SPARC systems.

mmap_sem

The memory map semaphore (mmap_sem) is a reader-writer semaphore which protects the tree of virtual memory area (VMA) structures describing each address space. It is, Nick Piggin says, one of the last nasty locking issues left in the virtual memory subsystem. Like many busy, global locks, mmap_sem can cause scalability problems through cache line bouncing. In this case, though, simple contention for the lock can be a problem; mmap_sem is often held while disk I/O is being performed. With some workloads, the amount of time that mmap_sem is held can slow things down significantly.

Various groups, including developers at HP and Google, have chipped away at the mmap_sem problem in the past, usually by trying to drop the semaphore in various paths. These patches have all run into the same problem, though: Linus hates them. In particular, he seems to dislike the additional complication added to the retry paths which must be followed when things change while the lock is dropped. So none of this work has gotten into the mainline.

There have also been some unfair rwsem proposals aimed at reducing mmap_sem contention; these have run aground over fears of writer starvation.

According to Nick, the real problem is the red-black tree used to track allocated address space; the data structure is cache-unfriendly and requires a global lock for its protection. His idea is to do away with this rbtree and associate VMAs directly with the page table entries, protecting them with the PTE locks. This approach would eliminate much of the locking entirely, since the page tables must be traversable without locks, and solve the mmap_sem problem.

That said, there are some challenges. A VMA associated with a page table entry can cover a maximum of 2MB of address space; larger areas would have to be split into (possibly a large number of) smaller VMAs. It's not clear how this mechanism would then interact with hugepages. The instantiation of large VMAs would require the creation of the full range of PTEs, which is not required now; that could hurt applications with very sparsely-populated memory areas. Growing VMAs would have its own challenges. There is also the issue of free space allocation, a problem which might be solved by preallocating ranges of addresses to each thread sharing an address space. In summary, the list of obstacles to be overcome before this idea becomes practical looks somewhat daunting.

The developers in the room seemed to not be entirely comfortable with this approach, but nobody could come up with a fundamental reason why it would not work. So we'll probably be seeing patches from Nick exploring this idea in the future.

copyfile()

The reflink() system call was originally proposed as a sort of fast copy operation; it would create a new "copy" of a file which shared all of the data blocks. If one of the files were subsequently written to, a copy-on-write operation would be performed so that the other file would not change. LWN readers last heard about this patch last September, when Linus refused to pull it for 2.6.32. Among other things, he didn't like the name.
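The sharing semantics can be illustrated with a toy model. This is only a sketch: the ToyFile class and its methods are hypothetical names invented here, blocks are modeled as immutable Python bytes objects, and nothing about it reflects how any filesystem actually implements the operation.

```python
class ToyFile:
    """A file modeled as a list of references to immutable data blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)

    def reflink(self):
        # The new file gets its own block list, but every entry refers
        # to the same underlying block object: no data is copied.
        return ToyFile(self.blocks)

    def write_block(self, index, data):
        # Copy-on-write: point this file at a new block and leave the
        # old block, still referenced by the other file, untouched.
        self.blocks[index] = data

    def shared_with(self, other):
        """Count block positions where both files reference the same block."""
        return sum(1 for a, b in zip(self.blocks, other.blocks) if a is b)


original = ToyFile([b"aaaa", b"bbbb", b"cccc"])
clone = original.reflink()
clone.write_block(1, b"XXXX")   # only the clone sees this change
```

After the write, the two files still share their first and third blocks; only the modified block has been duplicated.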

So now reflink() is back as copyfile(), with some proposed additional features. It would make the same copy-on-write copies on filesystems that support it, but copyfile() would also be able to delegate the actual copy work to the underlying storage device when it makes sense. For example, if a file is being copied on a network-mounted filesystem, it may well make sense to have the server do the actual copy work, eliminating the need to move the data over the network twice. The system call might also do ordinary copies within the kernel if nothing faster is available.

The first question that was asked is: should copyfile() perhaps be an asynchronous interface? It could return a file descriptor which could be polled for the status of the operation. Then, graphical utilities could start a copy, then present a progress bar showing how things were going. Christoph Hellwig was adamant, though, that copyfile() should be a synchronous operation like almost all other Linux system calls; there is no need to create something weird and different here. Progress bars neither justify nor require the creation of asynchronous interfaces.

There was also opposition to the mixing of the old reflink() idea with that of copying a file. There is little perceived value in creating a bad version of cp within the kernel. The two ideas were mixed because it seems that Linus wants it that way, but, after this discussion, they may yet be split apart again.

Dirty limits

Jan Kara led a short discussion on the problem of dirty limits. The tuning knob found at /proc/sys/vm/dirty_ratio contains a number representing a percentage of total memory. Any time that the number of dirty pages in the system exceeds that percentage, processes which are actually writing data will be forced to perform some writeback directly. This policy has a couple of useful results: it helps keep memory from becoming entirely filled with dirty pages, and it serves to throttle the processes which are creating dirty pages in the first place.

The default value for dirty_ratio is 20, meaning that 20% of memory can be dirty before processes are conscripted into writeback duties. But that turns out to be too low for a number of applications. In particular, it seems that many Berkeley DB applications exhibit behavior where they dirty a lot of pages all over memory; setting dirty_ratio too low causes a lot of excessive I/O and serious performance issues. For this reason, distributions like RHEL raise this limit to 40% by default.
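For a sense of scale, the sketch below turns a dirty_ratio value into a byte count, following the simplified description above (a percentage of total memory). The function names are invented for this example, and the kernel's own accounting may differ in detail.

```python
GIB = 1024 ** 3

def dirty_limit_bytes(total_memory_bytes, dirty_ratio_percent):
    """Dirty-memory threshold implied by a dirty_ratio percentage."""
    if not 0 <= dirty_ratio_percent <= 100:
        raise ValueError("dirty_ratio is a percentage between 0 and 100")
    return total_memory_bytes * dirty_ratio_percent // 100

def writer_is_throttled(dirty_bytes, total_memory_bytes, dirty_ratio_percent):
    """True once dirty memory exceeds the threshold, at which point
    writing processes are made to perform writeback themselves."""
    return dirty_bytes > dirty_limit_bytes(total_memory_bytes, dirty_ratio_percent)
```

On a hypothetical 10GiB machine, the default of 20 allows 2GiB of dirty pages, while a setting of 40 doubles that to 4GiB.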

But 40% is not an ideal number either; it can lead to a lot of wasted memory when the system's workloads are mostly sequential. Lots of dirty pages can also cause fsync() calls to take a very long time, especially with the ext3 filesystem. What's really needed is a way to set this parameter in a more automatic, adaptive way, but exactly how that should be done is not entirely clear.

What is likely to happen in the short term is that a user-space daemon will be written to experiment with various policies for dirty_ratio. Some VM tracepoints can be used to catch events and tune things accordingly. A system which is handling a lot of fsync() calls should probably have a lower value of dirty_ratio, for example. In the absence of reasons to the contrary, the daemon can try to nudge the limit higher and try to see if applications perform better. This kind of heuristic experimentation has its hazards, but there does not seem to be a better method on offer at the moment.
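One shape such a policy might take is sketched below. Everything in it is an assumption made for illustration: the function name, the fsync-rate threshold, the step size, and the bounds are all invented, and no such daemon was presented at the summit.

```python
def next_dirty_ratio(current, fsyncs_per_sec, *,
                     fsync_threshold=50, step=5, low=5, high=40):
    """Hypothetical policy step for a dirty_ratio tuning daemon.

    Back off when the system is handling many fsync() calls; otherwise
    nudge the limit upward to see whether applications do better.
    All default values here are assumptions, not measured settings.
    """
    if fsyncs_per_sec > fsync_threshold:
        proposed = current - step
    else:
        proposed = current + step
    # Keep the result inside the assumed safe range.
    return max(low, min(high, proposed))
```

A real daemon would write the result to /proc/sys/vm/dirty_ratio, observe application behavior for a while, and repeat; that feedback loop is where the hard problems live and is omitted here.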

Topology and alignment

There was a brief session on storage device topology issues; unfortunately, it was late in the day and your editor's notes are increasingly fuzzy. Much of the discussion had to do with 4K-sector disks. There are still issues, it seems, with drives which implement strange sector alignments in an attempt to get better performance with some versions of Windows. Linux can cope with these drives, but only if the drives themselves export information on what they are doing. Not all hardware provides that information, unfortunately.

Meanwhile, the amount of software which does make use of the topology information exported through the kernel is increasing. Partitioning tools are getting smarter, and the device mapper now uses this information properly. The readahead code will be tweaked to create properly-aligned requests when possible.

Lightning talks

The last session of the day was dedicated to three lightning talks. The first, by Matthew Wilcox, had to do with merging of git trees. Quite a bit of work in the VM/filesystem/storage area depends on changes made in a number of different trees. Making those trees fit together can be a bit of a challenge. That problem can be solved in linux-next, but those solutions do not necessarily carry over into the mainline, where trees may be pulled in just about any order - or not at all. The result is a lot of work and merge-window scrambling by developers, who are getting a little tired of it.

So, it was asked, is it time for a git tree dedicated to storage as a whole, and a storage maintainer to go with it? The idea was to create something like David Miller's networking tree, which is the merge point for almost all networking-related patches. James Bottomley made the mistake of suggesting that this kind of discussion could not go very far without a volunteer to manage that tree; he was then duly volunteered for the job.

The discussion moved on to how this tree would work, and, in particular, whether its maintainer would become the "overlord of storage," or whether it would just be a more convenient place to work out merge conflicts. If its maintainer is to be a true overlord, a fairly hardline approach will need to be taken with regard to when patches would have to be ready for merging. It's not clear whether the storage community is ready to deal with such a maintainer. So, for the near future, James will run the tree as a merge point to see whether that helps developers get their code into the mainline. If it seems like there is need for a real storage maintainer, that question will be addressed after a development cycle or two.

Dan Magenheimer presented his Cleancache proposal, mostly with an eye toward trying to figure out a way to get it merged. There is still some opposition to it, and its per-filesystem hooks in particular. It's hard to see how those hooks can be avoided, though; Cleancache is not suitable for all filesystems and, thus, may not be a good fit for the VFS layer. The crowd seemed reasonably amenable to merging the patches, but the chief opponent - Christoph Hellwig - was not in the room at the time. So no real conclusions have been reached.

The final lightning talk came from Boaz Harrosh, who talked about "stable pages." Currently, pages which are under writeback can be modified by filesystem code. That's potentially a data integrity problem, and it can be fatal in situations where, for example, checksums of page contents are being made. That is why the RAID5 code must copy all pages being written to an array; changing data would break the RAID5 checksums. What, asked Boaz, would break if the ability to change pages under writeback were withdrawn?

The answer seems to be that nothing would break, but that some filesystems might suffer performance impacts. The only way to find out for sure, though, is to try it. As it happens, this is a relatively easy experiment to run, so filesystem developers will probably start playing with it sometime soon.

That was the end of the first day of the summit; reports from the second day will be posted as soon as they are ready.


The 2010 Linux Storage and Filesystem Summit, day 2

By Jonathan Corbet
August 10, 2010
The second day of the 2010 Linux Storage and Filesystem Summit was held on August 9 in Boston. Those who have not yet read the coverage from day 1 may want to start there. This day's topics were, in general, more detailed and technical and less amenable to summarization here. Nonetheless, your editor will try his best.

Writeback

The first session of the day was dedicated to the writeback issue. Writeback, of course, is the process of writing modified pages of files back to persistent store. There have been numerous complaints over recent years that writeback performance in Linux has regressed; the curious reader can refer to this article for some details, or this bugzilla entry for many, many details. The discussion was less focused on this specific problem, though; instead, the developers considered the problems with writeback as a whole.

Sorin Faibish started with a discussion of some research that he has done in this area. The challenges for writeback are familiar to those who have been watching the industry; the size of our systems - in terms of both memory and storage - has increased, but the speed of those systems has not increased proportionally. As a result, writing back a given percentage of a system's pages takes longer than it once did. It has thus become ever easier for the writeback system to fall behind processes which are dirtying pages, leading to poor performance.

His assertion is that the use of watermarks to control writeback is no longer appropriate for contemporary systems. Writeback should not wait until a certain percentage of memory is dirty; it should start sooner, and, crucially, be tied to the rate with which processes are dirtying pages. The system, he says, should work much more aggressively to ensure that the writeback rate matches the dirty rate.

From there, the discussion wandered through a number of specific issues. Linux writeback now works by flushing out pages belonging to a specific file (inode) at a time, with the hope that those pages will be located nearby on the disk. The memory management code will normally ask the filesystem to flush out up to 4MB of data for each inode. One poorly-kept secret of Linux memory management is that filesystems routinely ignore that request - they typically flush far more data than requested if there are that many dirty pages. It's only by generating much larger I/O requests that they can get the best performance.

Ted Ts'o wondered if blindly increasing writeback size is the best thing to do. 4MB is clearly too small for most drives, but it may well be too large for a filesystem located on a slow USB drive. Flushing large amounts of data to such a filesystem can stall any other I/O to that device for quite some time. From this discussion came the idea that writeback should not be based on specific amounts of data, but, instead, should be time-based. Essentially, the backing device should be time-shared between competing interests in a way similar to how the CPU is shared.

James Bottomley asked if this idea made sense - is it ever right to cut off I/O to an inode which still has contiguous, dirty pages to write? The answer seems to be "yes." Consider a process which is copying a large file - a DVD image or something even larger. Writeback might not catch up with such a process until the copy is done, which may not be for a long time into the future; meanwhile, all other users of that device will be starved. That is bad for interactivity, and it can cause long delays before other files are flushed to disk. Also, the incremental performance benefit of extending large I/O operations tends to drop off over time. So, in the end, it's necessary to switch to another inode at some point, and making the change based on wall-clock time seems to be the most promising approach.
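The effect of switching inodes on a time basis can be seen in a small simulation. This is a sketch under stated assumptions: the device has a fixed, known bandwidth, each inode gets an equal time slice in round-robin order, and the function and parameter names are invented here. No actual design was settled on at the summit.

```python
from collections import deque

def time_sliced_writeback(dirty, bandwidth, slice_seconds):
    """Simulate round-robin, time-based writeback.

    dirty maps an inode name to its number of dirty bytes; bandwidth is
    the (assumed constant) device speed in bytes per second.  Returns
    the list of (inode, bytes_flushed) steps in the order performed.
    """
    per_slice = int(bandwidth * slice_seconds)
    if per_slice <= 0:
        raise ValueError("time slice too short to write anything")
    queue = deque(dirty.items())
    schedule = []
    while queue:
        inode, remaining = queue.popleft()
        if remaining <= 0:
            continue
        chunk = min(remaining, per_slice)
        schedule.append((inode, chunk))
        if remaining > chunk:
            # Still dirty: go to the back of the line so others get a turn.
            queue.append((inode, remaining - chunk))
    return schedule
```

With a large file and a small one both dirty, the small file is flushed after the large file's first slice rather than waiting for the entire copy to finish.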

Boaz Harrosh raised the idea of moving the I/O scheduler's intelligence up to the virtual memory management level. Then, perhaps, application priorities could be used to give interactive processes privileged access to I/O bandwidth. Ted, instead, suggested that there may be value in allowing the assignment of priorities to individual file descriptors. It's fairly common for an application to have files it really cares about, and others (log files, say) which matter less. The problem with all of these ideas, according to Christoph Hellwig, is that the kernel has far too many I/O submission paths. The block layer is the only place where all of those I/O operations come together into a single place, so it's the only place where any sort of reasonable I/O control can be applied. A lot of fancy schemes are hard to implement at that level, so, even if descriptor-based priorities are a good idea (not everybody was convinced), it's not something that can readily be done now. Unifying the I/O submission paths was seen as a good idea, but it's not something for the near future.

Jan Kara asked how results can be measured, and against what requirements they will be judged. Without that information, it is hard to know if any changes have had good effects or not. There are trivial cases, of course - changes which slow down kernel compiles tend to be caught early on. But, in general, we have no way to measure how well we are doing with writeback. So, in the end, the first action item is likely to be an attempt to set down the requirements and to develop some good test cases. Once it's possible to decide whether patches make sense, there will probably be an implementation of some sort of time-based writeback mechanism.

Solid-state storage devices

There were two sessions on solid-state storage devices (SSDs) at the summit; your editor was able to attend only the first. The situation which was described there is one we have been hearing about for a couple of years at least. These devices are getting faster: they are heading toward a point where they can perform one million I/O operations per second. That said, they still exhibit significant latency on operations (though much less than rotating drives do), so the only way to get that kind of operation count is to run a lot of operations in parallel. "A lot" in this case means having something like 100 operations in flight at any given time.

Current SSDs work reasonably well with Linux, but there are certainly some problems. There is far too much overhead in the ATA and SCSI layers; at that kind of operation rate, microseconds hurt. The block layer's request queues are becoming a bottleneck; it's currently only possible to have about 32 concurrent operations outstanding on a device. The system needs to be able to distribute I/O completion work across multiple CPUs, preferably using smart controllers which can direct each completion interrupt to the CPU which initiated a specific operation in the first place.
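The relationship between these numbers follows from Little's law: the average number of operations in flight equals throughput multiplied by latency. The sketch below applies it to the figures above. The 100-microsecond latency is inferred from the one million IOPS and 100-operation figures rather than stated by anyone at the summit, and the model assumes that latency does not change with queue depth.

```python
US_PER_SECOND = 1_000_000

def ops_in_flight(iops, latency_us):
    """Little's law: operations outstanding = throughput x latency."""
    return iops * latency_us / US_PER_SECOND

def implied_latency_us(iops, in_flight):
    """Per-operation latency implied by a throughput and a queue depth."""
    return in_flight * US_PER_SECOND / iops

def iops_ceiling(queue_depth, latency_us):
    """Best-case IOPS when at most queue_depth operations are outstanding."""
    return queue_depth * US_PER_SECOND / latency_us
```

Under those assumptions, a limit of 32 outstanding operations would cap such a device at 320,000 operations per second, roughly a third of what it could deliver.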

For "storage-attached" SSDs (those which look like traditional disks), there are not a lot of problems at the filesystem level; things work pretty well. Once one gets into bus-attached devices which do not look like disks, though, the situation changes. One participant asserted that, on such devices, the ext4 filesystem could not be expected to get reasonable performance without a significant redesign. There is just too much to do in parallel.

Ric Wheeler questioned the claim that SSDs are bringing a new challenge for the storage subsystem. Very high-end enterprise storage arrays have achieved this kind of I/O rate for some years now. One thing those arrays do is present multiple devices to the system, naturally helping with parallelism; perhaps SSDs could be logically partitioned in the same way.

Resizing guest memory

A change of pace was had in the memory management track, where Rik van Riel talked about the challenges involved in resizing the memory available to virtualized guests. There are four different techniques in use currently:

  • Memory hotplug by way of simulated hardware hotplug events. This mechanism works well for adding memory to guests, but it cannot really be used to take memory back. Hot remove simply does not work well, because there's always some sort of non-movable allocation which ends up in the space which would be removed.

  • Ballooning, wherein a special driver in the guest allocates pages and retires them from use, essentially handing them back to the host. Memory can be fed back into the guest by having the balloon driver free the pages it has allocated. This mechanism is simple, if somewhat slow, but simple management policies are scarce.

  • Transcendent memory techniques like cleancache and frontswap, which can be used to adjust memory availability between virtual guests.

  • Page hinting, whereby guests mark pages which can be discarded by the host. These pages may be on the guest's free list, or they may simply be clean pages. Should the guest try to access such a page after the host has thrown it away, that guest will receive a special page fault telling it that it needs to allocate the page anew. Hinting techniques tend to bring a lot of complexity with them.

The real question of interest in this session seemed to be the "rightsizing" of guests - giving each guest just enough memory to optimize the performance of the system as a whole. Google is also interested in this problem, though it is using cgroup-based containers instead of full virtualization. It comes down to figuring out what a process's minimal working set size is - a problem which has resisted attempts at solution for decades.

Mel Gorman proposed one approach to determine a guest's working set size. Place that guest under memory pressure, slowly shrinking its available memory over time. There will come a point where the kernel starts scanning for reclaimable pages, and, as the pressure grows, a point where the process starts paging in pages which it had previously used. That latter point could be deemed to be the place where the available memory had fallen below the working set size. It was also suggested that page reactivations - when pages are recovered from the inactive list and placed back into active use - could be the metric by which the optimal size is determined.

Nick Piggin was skeptical of such schemes, though. He gave the example of two processes, one of which is repeatedly working through a 1GB file, while the other is working through a 1TB file. If both processes currently have 512MB of memory available, they will both be doing significant amounts of paging. Adjusting the memory size will not change that behavior, leading to the conclusion that there's not much to be done - until the process with the smaller file gets 1GB of memory to work with. At that point, its paging will stop. The process working with the larger file will never reach that point, though, at least on contemporary systems. So, even though both processes are paging at the same rate, the initial 512MB memory size is too small for one process, but is just fine for the other.
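Nick's example can be reproduced with a tiny simulation. The sketch below assumes a strict least-recently-used replacement policy, which is a simplification of what the kernel actually does, and uses scaled-down page counts in place of the 1GB, 1TB, and 512MB figures.

```python
from collections import OrderedDict

def lru_misses(file_pages, memory_pages, passes):
    """Count page-ins for a process that reads a file front to back,
    repeatedly, with memory_pages of strict-LRU cache available."""
    cache = OrderedDict()
    misses = 0
    for _ in range(passes):
        for page in range(file_pages):
            if page in cache:
                cache.move_to_end(page)        # mark as most recently used
            else:
                misses += 1                    # page must be read in
                cache[page] = True
                if len(cache) > memory_pages:
                    cache.popitem(last=False)  # evict least recently used
    return misses
```

With a file of 8 pages read three times, any memory size from 1 to 7 pages yields 24 misses; at 8 pages the count drops to 8, the initial reads only. A 1000-page file shows no such cliff at any of those sizes, so the paging rate alone does not reveal which process would benefit from more memory.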

The fact that the problem is hard has not stopped developers from trying to improve the situation, though, so we are likely to see attempts made at dynamically resizing guests in an attempt to work out their optimal sizes.

I/O bandwidth controllers

Vivek Goyal led a brief session on the I/O bandwidth controller problem. Part of that problem has been solved - there is now a proportional-weight bandwidth controller in the mainline kernel. This controller works well for single-spindle drives, perhaps a bit less so with large arrays. With larger systems, the single dispatch queue in the CFQ scheduler becomes a bottleneck. Vivek has been working on a set of patches to improve that situation for a little while now.

The real challenge, though, is the desired maximum bandwidth controller. The proportional controller which is there now will happily let a process consume massive amounts of bandwidth in the absence of contention. In most cases, that's the right result, but there are hosting providers out there who want to be able to keep their customers within the bandwidth limits they have paid for. The problem here is figuring out where to implement this feature. Doing it at the I/O scheduler level doesn't work well when there are devices stacked higher in the chain.

One suggestion is to create a special device mapper target which would do maximum bandwidth throttling. There was some resistance to that idea, partly because some people would rather avoid the device mapper altogether, but also due to practical problems like the inability of current Linux kernels to insert a DM-based controller into the stack for an already-mounted disk. So we may see an attempt to add this feature at the request queue level, or we may see a new hook allowing a block I/O stream to be rerouted through a new module on the fly.
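Wherever the throttle ends up living, the core of a maximum-bandwidth limit can be sketched as a token bucket. This is a generic rate-limiting technique chosen here for illustration; the article does not say which algorithm any proposed patch uses, and the class and parameter names are invented. Time is passed in explicitly so that the behavior is deterministic.

```python
class TokenBucket:
    """Allow I/O up to rate_bytes_per_sec, with bursts up to burst_bytes."""

    def __init__(self, rate_bytes_per_sec, burst_bytes, start_time=0.0):
        self.rate = rate_bytes_per_sec
        self.burst = burst_bytes
        self.tokens = burst_bytes      # start with a full bucket
        self.last = start_time

    def allow(self, nbytes, now):
        """Return True if an I/O of nbytes may be dispatched at time now."""
        elapsed = max(0.0, now - self.last)
        self.last = now
        # Refill in proportion to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False                   # caller must delay or queue the I/O
```

Unlike the proportional-weight controller, a limiter of this kind holds a process to its cap even when the device is otherwise idle, which is the behavior the hosting providers are asking for.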

The other feature which is high on the list is support for controlling buffered I/O bandwidth. Buffered I/O is hard; by the time an I/O request has made it to the block subsystem, it has been effectively detached from the originating process. Getting around that requires adding some new page-level accounting, which is not a lightweight solution.

Reclaim topics

Back in the memory management track, a number of reclaim-oriented topics were covered briefly. The first of these is per-cgroup reclaim. Control groups can be used now to limit total memory use, so reclaim of anonymous and page-cache pages works just fine. What is missing, though, is the sort of lower-level reclaim used by the kernel to recover memory: shrinking of slab caches, trimming the inode cache, etc. A cgroup can consume considerable resources with this kind of structure, and there is currently no mechanism for putting a lid on such usage.

Zone-based reclaim would also be nice; that is evidently covered in the VFS scalability patch set, and may be pushed toward the mainline as a standalone patch.

Reclaim of smaller structures is a problem which came up a few times this afternoon. These structures are reclaimed individually, but the virtual memory subsystem is really only concerned with the reclaim of full pages. So reclaiming individual inodes (or dentries, or whatever) may just serve to lose useful cached information and increase fragmentation without actually freeing any memory for the rest of the system. So it might be nice to change the reclaim of structures like dentries to be more page-focused, so that useful chunks of memory can be returned to the system.

The ability to move these structures around in memory, freeing pages through defragmentation, would also be useful. That is a hard problem, though, which will not be amenable to a quick solution.

There is an interesting problem with inode reclaim: cleaning up an inode also clears all related page cache pages out of the system. There can be times when that's not what's really called for. It can free vast amounts of memory when only small amounts are needed, and it can deprive the system of cached data which will just need to be read in again in the near future. So there may be an attempt to change how inode reclaim works sometime soon.

There are some difficulties with how the page allocator works on larger systems; free memory can go well below the low watermark before the system notices. That is the result of how the per-CPU queues work; as the number of processors grows, the accounting of the size of those queues gets fuzzier. So there was talk of sending inter-processor interrupts on occasion to get a better count, but that is a very expensive solution. Better, perhaps, is just to iterate over the per-CPU data structures and take the locking overhead.

Slab allocators

Christoph Lameter ran a discussion on slab allocators, talking about the three allocators which are currently in the kernel and the attempts which are being made to unify them. This is a contentious topic, but there was a relative lack of contentious people in the room, so the discussion was subdued. What happens will really depend on what patches Christoph posts in the future.

O_DIRECT

A brief session touched on a few problems associated with direct I/O. The first of these is an obscure race between get_user_pages() (which pins user-space pages in memory so they can be used for I/O) and the fork() system call. In some cases, a fork() while the pages are mapped can corrupt the system. A number of fixes have been posted, but they have not gotten past Linus. The real fix will involve fixing all get_user_pages() callers and (the real point of contention) slowing down fork(). The race is a real problem, so some sort of solution will need to find its way into the mainline.

Why, it was asked, do applications use direct I/O instead of just mapping file pages into their address space? The answer is that these applications know what they want to do with the hardware and do not want the virtual memory system getting in the way. This is generally seen as a valid requirement.

There is some desire for the ability to do direct I/O from the virtual memory subsystem itself. This feature could be used to support, for example, swapping over NFS in a safe way. Expect patches in the near future.

Finally, there is a problem with direct I/O to transparent hugepages. The kernel will go through and call get_user_pages_fast() for each 4K subpage, but that is unnecessary. So 512 mapping calls are being made when one would do. Some kind of fix will eventually need to be made so that this kind of I/O can be done more efficiently.

Lightning talks

Once again, the day ended with lightning talk topics. Matthew Wilcox started by asking developers to work at changing more uninterruptible waits into "killable" waits. The difference is that uninterruptible waits can, if they wait for a long time, create unkillable processes. System administrators don't like such processes; "kill -9" should really work at all times.

The problem is that making this change is often not straightforward; it turns a function call which cannot fail into one which can be interrupted. That means that, for each change, a new error path must be added which properly unwinds any work which had been done so far. That is typically not a simple change, especially for somebody who does not intimately understand the code in question, so it's not the kind of job that one person can just take care of.

It was suggested that iSCSI drives - which can cause long delays if they fall off the net - are a good way of testing this kind of code. From there, the discussion wandered into the right way of dealing with the problems which result when network-attached drives disappear. They can often hang the system for long periods of time, which is unfortunate. Even worse, they can sometimes reappear as the same drive after buffers have been dropped, leading to data corruption. The solution to all of this is faster and better recovery when devices disappear, especially once it becomes clear that they will not be coming back anytime soon. Additionally, should one of those devices reappear after the system has given up on it, the storage layer should take care that it shows up as a totally new device. Work will be done to this end in the near future.

Mike Rubin talked a bit about how things are done at Google. There are currently about 25 kernel engineers working there, but few of them are senior-level developers. That, it was suggested, explains some of the things that Google has tried to do in the kernel.

There are two fundamental types of workload at Google. "Shared" workloads work like classic mainframe batch jobs, contending for resources while the system tries to isolate them from each other. "Dedicated workloads" are the ones which actually make money for Google - indexing, searching, and such - and are very sensitive to performance degradation. In general, any new kernel which shows a 1% or higher performance regression is deemed to not be good enough.
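
As a rough illustration of that acceptance rule, the check below rejects any kernel whose measured throughput drops by 1% or more relative to a baseline. Only the 1% threshold comes from the talk; the function name, the use of throughput as the metric, and the sample numbers are assumptions.

```python
REGRESSION_LIMIT = 0.01  # the 1% threshold mentioned in the talk


def is_acceptable(baseline_throughput: float, new_throughput: float) -> bool:
    """Return True if the new kernel regresses by less than the limit."""
    regression = (baseline_throughput - new_throughput) / baseline_throughput
    return regression < REGRESSION_LIMIT


# Hypothetical measurements: a 0.5% drop passes, a 1% drop does not.
print(is_acceptable(1000.0, 995.0))
print(is_acceptable(1000.0, 990.0))
```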

The workloads exhibit a lot of big, sequential writes and smaller, random reads. Disk I/O latencies matter a lot for dedicated workloads; 15ms latencies can cause phone calls to the development group. The systems are typically doing direct I/O on not-too-huge files, with logging happening on the side. The disk is shared between jobs, with the I/O bandwidth controller used to arbitrate between them.

Why is direct I/O used? It's a decision which dates back to the 2.2 days, when buffered I/O worked less well than it does now. Things have gotten better, but, meanwhile, Google has moved much of its buffer cache management into user space. It works much like enterprise database systems do, and, chances are, that will not change in the near future.

Google uses the "fake NUMA" feature to partition system memory into 128MB chunks. These chunks are assigned to jobs, which are managed by control groups. The intent is to firmly isolate all of these jobs, but writeback still can cause interference between them.
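
To give a sense of the granularity involved, the sketch below counts how many 128MB chunks a machine would be split into and how many a job of a given size would need. The 128MB chunk size is from the talk; the 16GB machine, the 300MB job, and the function names are hypothetical.

```python
CHUNK_MB = 128  # fake-NUMA chunk size given in the talk


def total_chunks(system_memory_mb: int) -> int:
    """Number of whole 128MB chunks the system memory is split into."""
    return system_memory_mb // CHUNK_MB


def chunks_for_job(job_memory_mb: int) -> int:
    """Whole chunks needed to hold a job's memory, rounded up."""
    return -(-job_memory_mb // CHUNK_MB)


# Hypothetical sizes: a 16GB machine and a job needing 300MB.
print(total_chunks(16 * 1024), chunks_for_job(300))
```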

Why, it was asked, does Google not use xfs? Currently, Mike said, they are using ext2 everywhere, and "it sucks." On the other hand, ext4 has turned out to be everything they had hoped for. It's simple to use, and the migration from ext2 is straightforward. Given that, they feel no need to go to a more exotic filesystem.

Mark Fasheh talked briefly about "cluster convergence," which really means sharing of code between the two cluster filesystems (GFS2 and OCFS2) in the mainline kernel. It turns out that there is a surprising amount of sharing happening at this point, with the lock manager, management tools, and more being common to both. The biggest difference between the two, at this point, is the on-disk format.

The cluster filesystems are in a bit of a tough place. Neither has a huge group dedicated to its development, and, as Ric Wheeler pointed out, there just isn't much of a hobbyist community equipped with enterprise-level storage arrays out there. So these two projects have struggled to keep up with the proprietary alternatives. Combining them into a single cluster filesystem looks like a good alternative to everybody involved. Practical and political difficulties could keep that from happening for some years, though.

There was a brief discussion about the DMAPI specification, which describes an API to be used to control hierarchical storage managers. What little support exists in the kernel for this API is going away, leaving companies with HSM offerings out in the cold. There are a number of problems with DMAPI, starting with the fact that it fails badly in the presence of namespaces. The API can't be fixed without breaking a range of proprietary applications. So it's not clear what the way forward will be.

Closing

[Group photo] The summit was widely seen as a successful event, and the participation of the memory management community was welcomed. So there will be a joint summit again for storage, filesystem, and memory management developers next year. It could happen as soon as early 2011; the participants would like to move the event back to the (northern) spring, and waiting for 18 months for the next gathering seemed like too long.

Page editor: Jonathan Corbet

Distributions

Illumos: new hope for the OpenSolaris community?

August 11, 2010

This article was contributed by Koen Vervloesem

On August 3, storage vendor Nexenta Systems hosted a conference call [MP3] announcing Illumos, a project to free the OpenSolaris distribution. Garrett D'Amore, a former Sun and Oracle Solaris developer and now senior director of engineering at Nexenta, disclosed the goal of the project: Illumos will replace all proprietary code that is still in OpenSolaris with open source code. More information can be found in the announcement slides [PDF].

Although its name suggests something else, OpenSolaris has never been completely open source. Some critical components are only available as binaries because of copyright issues. The most critical part is the internationalization (i18n) framework of libc, but the NFS lock manager, portions of the cryptographic framework, and numerous critical drivers are proprietary as well. This presents a lot of challenges to downstream OpenSolaris distributions, such as Nexenta and Belenix. We only have to look at Darwin, Apple's open source part of Mac OS X, to see what the effect of a few closed critical components can be: Darwin never really took off as an independent operating system.

So the first short-term goal of Illumos is open sourcing all of OpenSolaris, or more correctly: all of OS/Net (ON), which is the OpenSolaris kernel, C library, drivers, and some basic commands. Illumos will track OpenSolaris ON closely, but will replace all closed bits with open source alternatives, while still staying 100% ABI compatible. Right now, Illumos libc is completely open: Garrett developed libc_i18n by borrowing code from FreeBSD, picking up a little from NetBSD, and finishing it with his own code.

At the moment, Garrett's code is the only code that has been integrated into Illumos, but a number of other projects are close to integration. For example, Anil Gulecha developed GRUB and boot graphics, Steve Stallion and David Gwynne have been working on some open source replacements for drivers, Roland Mainz, Olga Kryzhanovska and Rich Lowe have replaced some user-space utilities, and Jörg Schilling has been working on archivers.

But even when all these projects get integrated, there is still the issue that only Sun's proprietary compiler Sun Studio can currently build a bootable OpenSolaris kernel. The intention is to enable a fully open toolset, probably using GCC. So a fully open source self-hosting OpenSolaris derivative is not yet possible, but it should become a reality by the end of 2010. A preliminary version of Illumos using the closed bits and compiled with Sun Studio already boots.

A fork or not a fork?

So Illumos is not really a fork; it's more of a downstream project that is happy to contribute to upstream ON. Oracle employee Joerg Moellenkamp explains the situation like this:

When you really want to call it a fork, then it's a fork of the OS/NET consolidation, not of the kernel. I think of it as a set of changes to the vanilla Solaris kernels like they exist in the Linux world (All vendors have their own kernels with modules that aren't in the mainline kernel and nobody talks about a forked Linux).

However, the ability to fork is important for OpenSolaris, as Bryan Cantrill explains:

OpenSolaris didn't go far enough: even though the right to fork was understood, there was not enough attention paid to the power to fork. As a result, the operating system never quite got to being 100% open: there remained some annoying (but essential) little bits that could not be opened for one historical (i.e., legal) reason or another. When coupled with the fact that Sun historically had a monopoly or near-monopoly on Solaris engineering talent, the community was entirely deprived of the oxygen that it would have needed to exercise its right to fork.

So although Illumos is not aimed at creating an OpenSolaris fork, according to Bryan the power to fork is essential for the vitality of the OpenSolaris community. Jörg Schilling looks at this in a historical context:

After some years, the community still had problems to contribute into the OpenSolaris code base from Sun. As Sun did publish source and binaries for OpenSolaris on a regular [basis], nobody was interested in a fork. Then Oracle bought Sun and stopped publishing binaries for recent OpenSolaris releases and stopped talking to the community at the same time. Now more and more people started to talk about forking, but they felt helpless.

A lack of communication

Jörg is right in his description. When Oracle acquired Sun, they talked a lot about MySQL and Java, but said virtually nothing about (Open)Solaris. In February 2010, Peter Tribble complained that Oracle hadn't even mentioned the distribution in a five-hour webcast after the acquisition of Sun. And when they did mention OpenSolaris, it was in vague terms. For example, at the OpenSolaris Annual Meeting on IRC in February 2010, Oracle's Dan Roberts said:

Oracle will continue to make OpenSolaris available as open source, and Oracle will continue to actively support and participate in the community. [...] Oracle will also continue to deliver OpenSolaris releases, including the upcoming OpenSolaris 2010.03 release.

Almost half a year later, there hasn't been an OpenSolaris 2010.03 (which was due in March 2010); there still isn't a newer release than OpenSolaris 2009.06. Oracle didn't respond to an open letter by Ben Rockwood, a respected OpenSolaris community member. In April 2010, Oracle stopped shipping free OpenSolaris CDs. In general, Oracle's silence continued, and a lot of community members were getting fed up with the uncertainty. This led the OpenSolaris Governing Board to issue an ultimatum to Oracle: if Oracle doesn't appoint a liaison who has the authority to talk about the future of OpenSolaris by August 16, the OGB will disband itself. OGB chair John Plocher explained the issues:

Without the Oracle part of the partnership at the table, there is effectively nothing for the OGB - or development community - to do. The flagship OpenSolaris distro is absent, the IPS [Image Packaging System] repositories are stagnant, the build instructions no longer work for the sources that exist, even the architectural reviews of community-developed components are being held behind Oracle's closed doors. It is as if the spirit of open, collaborative development centered around the Solaris operating system has died.

Oracle's communication blackout, combined with its disengagement from and disenfranchisement of the community has made it extremely difficult for the OGB to continue in its role of being an advocate for the collective improvement of OpenSolaris.

Oracle didn't respond to the ultimatum, and the long radio silence about the future of OpenSolaris has already claimed some victims. PulsarOS, a file server operating system, has switched from OpenSolaris to Linux. The high-integrity operating system AuroraUX was first an OpenSolaris distribution but has quietly shifted to become a DragonFlyBSD derivative. And the other OpenSolaris derivatives are stagnant or only reach a small niche, with the exception of Nexenta.

The community

The Illumos project (which Garrett started about two months ago, even before the OGB's ultimatum) wants to tackle these issues, not only by writing code but also by building an independent community. In this first phase, Garrett will serve as the benevolent dictator of Illumos, but this dictatorship is a temporary measure. He will appoint some initial members for an Administrative Council, which will work on non-technical matters such as defining a code of conduct. Garrett is not only head of this Administrative Council, but also the technical lead: head of the Developer Council. This will be made up of developers with commit rights, and it is a meritocracy: anyone can join in and will be judged by their code. Development will be consensus driven, but the technical lead is the final arbiter if consensus is not possible. A process to replace the technical lead will be discussed in the future.

Nexenta is currently the main sponsor of Illumos (Garrett's efforts have been funded by Nexenta), but the goal is to have a code base that is free from the control of any company. Corporate entities are welcome, though, to contribute. About one week before the launch of Illumos, Garrett invited Oracle to participate in Illumos. He contacted Bonnie Corwin, but she told him that she simply did not know who was responsible for decisions relating to OpenSolaris. She also said that she would endeavor to find out, but Garrett hasn't heard anything back. Right now Illumos has about a dozen developers, many of them well-known people in the OpenSolaris community, working for Nexenta Systems, hosting company Joyent, Greenviolet, Belenix, Schillix, and Everycity.

Some have argued that Illumos is purely driven by emotional feelings about OpenSolaris, by people who can't stand thinking about the death of the operating system, but this is far from true. For example, Nexenta and Joyent really depend on OpenSolaris for their business. They obviously don't like that they ultimately depend on what Oracle decides to do with OpenSolaris. So from a business point of view, it makes perfect sense that they want to free OpenSolaris and create an active developer community, independent from Oracle's good will. If they succeed, these companies can be much less concerned about their business continuity.

Of course, they have much less development manpower than Oracle (the vast majority of OpenSolaris development has always come from paid Sun/Oracle engineers), but they have some really good ex-Sun developers, like Garrett D'Amore (Nexenta's senior director of engineering), Richard Elling (Nexenta's senior director of solutions engineering), Bryan Cantrill (Joyent's vice president of engineering), and so on. Other notable backers of Illumos are Dennis Clarke, who is the founder of the Blastwave project, and Jörg Schilling, who built the first OpenSolaris distribution (Schillix) even before Sun did. By uniting manpower in an open source community, they can tackle their shared concerns together.

From source to distribution

This isn't the first time that an external code repository was created by the OpenSolaris community. For example, the Genunix web site that was created in 2006 at the request of the OpenSolaris Community Advisory Board (the precursor of the OpenSolaris Governing Board) maintained an independent SVN repository for some time, and they also have some Mercurial repositories, for instance for OpenSolaris OS/Net. The latter is an active clone that tracks Oracle's repository.

However, the repository of Illumos is the first one that will be backed by substantial development by non-Oracle people. This allows OpenSolaris distributions that adopt Illumos as their base to depend less on Oracle. Illumos is also insurance against Oracle pulling the plug on OpenSolaris: if Oracle stops supplying the source code to OpenSolaris, distributions can still build upon Illumos. Of course, the project would then become an OpenSolaris fork by necessity, and it's uncertain whether it has the critical mass to continue without Oracle.

For now, it's just a project of open sourcing the whole OS/Net code, but in time Illumos also wants to become a repository for experimental innovations and patches that are not accepted by Oracle. For instance, integration of other architectures like S390, PowerPC, or ARM could be possible in Illumos, because OpenSolaris has no interest in architectures other than x86/x86_64 and SPARC.

All source code that will be integrated into Illumos will first be restricted to BSD/MIT licenses and CDDL (Common Development and Distribution License) with a signed SCA (Sun Contributor Agreement), which gives Sun and the contributor joint copyrights on the code. Other possibilities for licenses are still under investigation, and a critical requirement is of course that the license must be acceptable for merging back into the upstream OpenSolaris code. Illumos is looking specifically at the Apache 2.0 license, because, in Garrett's words, "it offers the freedoms of BSD/MIT without the baggage of copyleft, while still providing a protection against submarine patenting." Moreover, Garrett pinpoints one property that he prefers in any license used in Illumos:

Ideally I'd like it also affords the same privileges to everyone else as to Oracle. This makes the CDDL with SCA somewhat problematic, so I am encouraging folks to look at alternatives.

Illumos is focused only on ON, so it's not an OpenSolaris distribution. However, the project may expand in the future and host some affiliated projects, such as Illumos distributions. For example, there could be a binary ISO with a minimal Illumos system to bootstrap a real Illumos distribution. Although Garrett is a Nexenta employee, Illumos is not focused on Nexenta: for example it imposes no specific package manager, although it has a preference for IPS (Image Packaging System), the OpenSolaris package management system. IPS manifests are the canonical data format in Illumos, but there's a tool to generate .deb files from IPS, which Nexenta uses for its OpenSolaris/Ubuntu hybrid. This system can be extended to generate RPM, SVR4, or any other type of packages from the IPS packages. This is a choice that each derivative distribution can make.

Too little, too late?

In the Linux world, Illumos wasn't received with much enthusiasm. For example, according to Joe "Zonker" Brockmeier, OpenSolaris should really die, and he calls Illumos "a misguided attempt to keep the Solaris legacy OS alive for another generation". While Zonker rightly questions Sun's strategy with respect to OpenSolaris, your author finds this death wish to be a bridge too far. But no one can deny that Illumos will face a lot of issues.

A project to liberate OpenSolaris probably should have happened five years ago when Sun released the bulk of the Solaris system code. If Illumos had started then, right from the start of OpenSolaris, the closed bits would have been replaced by now, resulting in fewer hurdles for OpenSolaris derivatives and a more open community. So one could ask the question: is Illumos coming too late? The OpenSolaris community has already lost a lot of users and derivatives because of the uncertain future of the distribution, and only time will tell whether Illumos can still change this downward trend.

New Releases

MeeGo 1.0 Update for Netbooks

MeeGo has released a second update to MeeGo 1.0 for Netbooks. "This update has over 70 bug fixes and is recommended for all users running MeeGo 1.0 for Netbooks. Please notice that this update is an incremental to the first update."

Ubuntu Maverick Alpha-3 released

The third alpha of Ubuntu's Maverick Meerkat is available for testing. This alpha is available in the following flavors: Ubuntu Desktop, Server, and Netbook, Ubuntu Server for UEC and EC2, Kubuntu Desktop and Netbook, Xubuntu, Edubuntu DVD, Ubuntu Studio, and Ubuntu ARM.

Distribution News

Quote of the week

So, DebConf time is over once again. The two weeks worth of fifty weeks waiting are left behind once again, and it's back to get back to normal. DebConf was great - Yes, it always is, and that's what we are all saying, but hey - Seriously! Being in the same building than 300 crazed developers is always fun, and it's always better than last year's fun.
-- Gunnar Wolf

Debian GNU/Linux

Bits from the (chilly) Debian release team

The big news in these bits is that "Squeeze" (aka Debian 6.0) has been frozen. "As mentioned in the previous mail, we would freeze when various transitions are completed or being handled. We now feel that this stage has been reached. This means that we have stopped the automatic migration of packages from unstable to testing. In other words, Squeeze has frozen. Thanks are due to everyone who has helped get us to this point."

(seasoned) bits from the DPL: comm, derivatives, delegations, money

Debian Project Leader Stefano Zacchiroli reports on his whereabouts up to July 2010. Topics include communication, derivatives, delegations, money, and forthcoming events.

Fedora

Fedora Board Meeting Recap 2010-08-06

Click below for a recap of the August 6, 2010 meeting of the Fedora Advisory Board. Topics include MeeGo spin, Upcoming FUDCons, fedoracommunity.org domains, and Vision for Fedora?. See also Máirín Duffy's summary of the meeting.

New Distributions

Uberstudent: The students' Linux (ghacks)

ghacks introduces a new distribution called UberStudent. "Uberstudent is a Linux distribution, built upon Ubuntu, that targets students in higher-education settings. It's goal is to become a perfect platform to aid in the process of education. It is, essentially, a learning platform and to this end it succeeds with aplomb, elegance, and power."

Newsletters and articles of interest

Ubuntu: "We have no plans to fork GNOME" (derStandard.at)

The Austrian newspaper 'Der Standard' has an interview with Jono Bacon, the Ubuntu community manager. "derStandard.at: Lot's of the work Ubuntu has been doing recently - app indicators, Messaging Menu, MeMenu - hasn't really been done upstream and is only used in Ubuntu. Are you going separate ways? Jono Bacon: Actually that's all upstream work, it's just that Ayatana is the upstream. So the code's completely open source, the bug tracking is open, people can hack on it. For the app indicators we also had a lot of community involvement, it was based on a Freedesktop.org spec, worked on with consultancy from KDE, we invited GNOME developers to participate in the Freedesktop discussion and proposed them to the GNOME community for inclusion, but it's not up to us, if they take it or not. It's kind of similar how other distros have done it in the past, like Novell when they developed their own main menu with Slab, they felt it added value to their distro." (Thanks to Michael Kofler)

Interview with New Fedora Project Leader Jared Smith (Linux.com)

Henry Kingman interviews newly appointed Fedora Project Leader Jared Smith for Linux.com. "Smith: Most of my participation in the Fedora community has been on my own personal time. I've been using Fedora ever since it was created, and was using Red Hat Linux before that in both professional and personal capacities. But it wasn't until a few years ago that I really started becoming an active contributor and not just an end user."

Spotlight on Linux: openSUSE 11.3 (Linux Journal)

Linux Journal shines a spotlight on openSUSE 11.3. "The distribution itself is rock stable, features plenty of software and includes one of the best control centers in existence. New releases, seen about every 6 to 12 months, usually come in 32- and 64-bit versions with a choice of exhaustive install DVD or desktop-specific installable live CDs. It uses the RPM package management format and maintains large repositories of software. Community repos include proprietary drivers, codecs and extra software."

Hands-on: Jolicloud 1.0 makes Web apps equal desktop citizens (ars technica)

Ars technica has a review of Jolicloud 1.0. "There are a lot of good ideas on display in Jolicloud 1.0, but the nascent product still feels incomplete. If the company behind Jolicloud can expand on the current implementation and fill in some of the gaps, it has the potential to be a real winner. I like where they are taking the user experience and I think that there are a lot of great things that they can do to make the launcher richer if they take full advantage of HTML's inherent strengths."

Spin Your Own Debian with Live Studio (Linux Journal)

Susan Linton takes a look at Debian Live Studio. "Debian developer, Chris Lamb, has created a web-based service to allow users to build their own customized live operating systems. After selecting your preferred options, the server builds and readies your image. Users can select from CD, DVD, USB, or Netboot images. Debian Live Studio requires registration, but is free of cost to use and consists of 100% free software."

Page editor: Rebecca Sobol

Development

GUADEC: Banshee project reaches out for contributors

By Jake Edge
August 11, 2010

Lowering the barriers for contributors is something that a lot of projects are trying to do, but the Banshee media player project has gone further than many, as Gabriel Burt reported in his talk at GUADEC. Essentially, Banshee has tried to make it as easy as possible for users (not necessarily programmers) to quickly fix a small bug or add an extension. The project is clearly making an outreach effort to grow its community and some of the techniques being used might be helpful for other projects looking to do the same.

[Gabriel Burt]

Burt is one of the four maintainers of Banshee and he started his talk by remembering back to when he was struggling to figure out how to get involved. At that time, he watched the postings on various Planet aggregators and eventually got his start by looking at the GNOME Human Interface Guidelines (HIG), which led him to his first patch. He noticed that the string "Eject When Finished" didn't follow the HIG, so he grepped for that string and changed it to "Eject when finished".

That was a simple fix, but he also had to get the code and the dependencies, so that he could build Banshee. One of the things that has been addressed since then is that there is extensive help on the web site that describes how to get started, install the dependencies, and get the code. "If you run [Banshee], and find a bug, you can get started easily" to fix it, he said.

There are still things that need to be done to the Banshee interface for HIG compliance, and "you don't need to be a C# programmer" to fix them. But he also demonstrated how quickly one can add a new feature to Banshee with a little programming knowledge.

Live on the GUADEC stage, Burt modified the "deduplicate" feature—which detects artist or album names that are textually different, but refer to the same entity—to add genre deduplication. By making a few changes to the AlbumDuplicateSolver.cs file, mostly consisting of changing "Album" to "Genre", along with some minor Makefile modifications, he was able to add the new functionality. Some queries needed changing, which he said "might take ten minutes" to come up with, but the rest of the changes took just a few minutes. Adding a derivative feature like that is a "good way to get started contributing to Banshee".

Banshee also has an extension framework that allows easy addition of new functionality, but the project has taken it a step further. Executing ./create-extension Foo will set up everything needed for the extension, including doing a git add for the skeleton code and enabling the extension in Banshee. You can have a "working extension in two minutes", Burt said. Once that is done, editing the code in MonoDevelop gives access to all of the Banshee and .NET classes and methods via tab-completion, simplifying the development process as well.

Over the last two-and-a-half years of development, Banshee has averaged 32 commits, ten bugs fixed, and one new contributor every week. There are active IRC channels, forums, and mailing lists for the project. Burt noted that the timezone coverage of the four maintainers is quite good since they live in Sydney, Luxembourg, and both coasts of the US. In short, there are lots of opportunities for those who want to get involved to hook up with the team.

Because extensions are available via a Gitorious repository and there are a lot of folks running Banshee development versions straight from the git repositories, contributions will be quickly picked up by others: "within days, lots of people will be using your work". Banshee has a one-month release cycle, so "within a month, thousands will be using it". He estimated that some 55,000 users picked up the monthly releases, and that within six months, "millions" would be using the code because Banshee is installed by default on several distributions and is available in the package repositories for many more.

At the end of the talk, Burt pointed out some of the more recent Banshee extensions that integrate with various services, including the Amazon MP3 store. It can be browsed in Banshee, as it uses WebKit for the browser functionality, and songs can be downloaded directly into your music library. Through the affiliate program, 10% of any purchases go to the project, which donates all of that money to the GNOME Foundation. He also mentioned Miro Guide and Internet Archive extensions as other useful ways to get audio and video content.

Obviously, Burt's talk was another part of the outreach effort for the project. "I hope you'll join in and help us having fun hacking on Banshee", he said. Though Mono and C# leave a bad taste in the mouths of some, Banshee is clearly trying to overcome that by making the project as accessible as possible for new users. But beyond that, there is a clear sense that Banshee is not about making any kind of political or social statement; it's about the enjoyment of hacking on a cutting-edge multimedia application. Other projects could certainly follow its lead and potentially grow their communities as well.

Brief items

Development quotes of the week

I can only say that thousands of free software developers are alive today, and I don't think free software development is fatal for many of them.
-- Richard Stallman

Please, help me bring my boyfriend back from "Linux Land." His name is Zach. If you find him, you may have to shut off the computer you find him in front of to get him to speak in anything other than "C." Sometimes he will speak to you in French, but thats only because he has his phone in French. I don't speak French so this too has become a wedge in our relationship. This is a severe issue. Please fix this.
-- "Ilana" files a bug report

BackupPC 3.2.0

BackupPC 3.2.0 was released on July 31. "BackupPC is a high-performance, enterprise-grade system for backing up Linux, WinXX and MacOSX PCs and laptops to a server's disk. BackupPC is highly configurable and easy to install and maintain." The new release has several new features, including an FTP xfer method, more options for the server backup command, better error reporting in the web interface, and more. There are a "significant number" of bug fixes as well.

CouchDB 1.0.0 data loss bug

If you are running CouchDB 1.0.0, you'll want to have a look at this notice in the near future. "Over the weekend of August 7th-8th, 2010 we discovered and fixed a nasty bug in CouchDB 1.0.0. The problem was subtle (cancelling a timer, without deleting the reference to it) but the ramifications are not: once the bad code path is triggered, subsequent writes to the database are never committed. This means there is potential data-loss for users of 1.0.0." There is an update available, but the right sequence of steps should be followed to minimize the chances of lost data in the update process.
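
The class of bug described in the notice, cancelling a timer without deleting the reference to it, can be illustrated with a small simulation. This is not CouchDB's code (CouchDB is written in Erlang); FakeTimer, DelayedCommitter, and all of their methods are hypothetical names invented for this sketch, and the timer is simulated rather than real so that the behavior is deterministic.

```python
class FakeTimer:
    """Minimal simulated timer: fire() runs the callback unless cancelled."""

    def __init__(self, callback):
        self.callback = callback
        self.cancelled = False

    def cancel(self):
        self.cancelled = True

    def fire(self):
        if not self.cancelled:
            self.callback()


class DelayedCommitter:
    """Batches writes and commits them when a (simulated) timer expires."""

    def __init__(self, clear_reference_on_cancel: bool):
        self.clear_reference_on_cancel = clear_reference_on_cancel
        self.timer = None
        self.pending = []
        self.committed = []

    def write(self, doc):
        self.pending.append(doc)
        if self.timer is None:  # schedule a commit only if none is recorded
            self.timer = FakeTimer(self._commit)

    def cancel_commit(self):
        if self.timer is not None:
            self.timer.cancel()
            if self.clear_reference_on_cancel:
                self.timer = None  # the step that the buggy variant omits

    def tick(self):
        """Simulate the timer expiring."""
        if self.timer is not None:
            self.timer.fire()

    def _commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.timer = None


def run(clear_reference_on_cancel: bool) -> list:
    committer = DelayedCommitter(clear_reference_on_cancel)
    committer.write("a")
    committer.cancel_commit()
    committer.write("b")  # buggy variant: stale timer blocks rescheduling
    committer.tick()
    return committer.committed
```

In the variant that leaves the stale reference in place, run(False) returns an empty list because no new commit is ever scheduled after the cancellation; run(True) returns both documents.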

GNOME 2.31.6 released

GNOME 2.31.6 development release is available for testing. "Ahah! This release is one day late, and I can already hear some people saying we're not good at respecting our schedule! I must admit that straight after GUADEC is not the best timing for a GNOME release ;-)"

KDE Releases Development Platform, Applications and Plasma Workspaces 4.5.0

KDE has announced the release of the Plasma Desktop and Netbook workspaces, the KDE Development Platform and a large number of applications available in their 4.5.0 versions. "In this release, the KDE team focused on stability and completeness of the Desktop experience. More than 16,000 bugs have been fixed, and many feature requests have been filled. The result for the user is a system that feels faster, takes less time to "think", and works more reliably."

Full Story (comments: 15)

Multitouch protocol specification v1

The first public draft of the multitouch protocol specification - the description of how the X Window System will work between multitouch input devices and applications - is available. It contains lots of low-level detail that will be of interest to developers who want to write multitouch-aware applications. "If you can poke holes into the spec, come up with use-cases that are not covered or have general questions that aren't answered, please point them out. The sooner, the better."

Full Story (comments: none)

OpenOffice.org 3.3 Beta Release Available

The OpenOffice.org Community has announced the availability of a beta release of its upcoming 3.3 version. "This first preview is for everyone interested in the new features and enhancements of the final 3.3 release, expected later this year."

Full Story (comments: 41)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

Van de Ven: On the changing role of PowerTOP

On his blog, Arjan van de Ven reflects on the changes in how the PowerTOP utility is being used, with an eye toward restructuring it. "So it's now time to rethink some of the code and make things much more scalable for adding new checks and features. In addition, the output also needs to improve to be more useful as a diagnostics tool. I'm thinking about adding a 'generate a report' option, that basically gives a complete report card of the system. [...] This doesn't mean I want to leave the end user behind; not at all. But in terms of new features, with all the low hanging fruit taken care of, some of the things PowerTOP needs to do are just a lot more technical than what PowerTOP 1.0 offered."

Comments (22 posted)

Genode 10.08 Comes with Gallium3D, MadWifi, Qt4.6.3 (OSNews)

OSNews covers the release of the Genode OS Framework version 10.08. "Today, the Genode OS Framework has seen another feature-rich release, introducing support for hardware-accelerated graphics by the means of Gallium3D, wireless networking via the MadWifi communication stack, a new block-device infrastructure, and Qt4 version 4.6.3. Genode is a modular framework for building special-purpose operating systems, currently supporting 6 different kernels. With the new release, its device-driver coverage reaches a new level and brings the project one step closer towards the goal of shaping Genode to a general-purpose OS."

Comments (4 posted)

Page editor: Jonathan Corbet

Announcements

Non-Commercial announcements

Financial Support for United States PostgreSQL User Groups

The United States PostgreSQL Association has announced that funds are available for US-based PostgreSQL User Groups. "Specifically the United States PostgreSQL Association is providing a lump sum pool of 1000.00 USD per month to PostgreSQL User Groups in the United States. This is a trial and will be available from September 2010 through December 2010 after which point it will be reviewed for further applicability."

Full Story (comments: none)

Commercial announcements

Qualcomm Innovation Center joins the Linux Foundation as a Platinum member

At LinuxCon on August 10, The Linux Foundation announced that Qualcomm Innovation Center Inc. (QuIC) has joined the organization as a Platinum member. "QuIC, a wholly owned subsidiary of Qualcomm Incorporated, is focused on developing software for mobile open source platforms, optimizing open source software for mobile technologies, and enabling developers to build applications easily for the millions of devices powered by Qualcomm's chipsets. [...] The use of Linux in mobile and electronic devices has been soaring. Linux is now the underpinning operating system (OS) for Android, MeeGo, and WebOS, among others, and today accounts for one of the highest growth rates of any OS in the device market, according to recent research conducted by ABI Research ('Linux for Mobile Devices', July 2010)."

Comments (1 posted)

Articles of interest

Linux Foundation Launches Open Compliance Program (Linux.com)

Over at Linux.com, Joe 'Zonker' Brockmeier reports on the newly formed Open Compliance Program. "More than 30 companies have joined with The Linux Foundation today to launch the Open Compliance Program (OCP), an initiative to help companies ensure that their products comply with the requirements of FOSS licenses. The program comprises a set of open source tools to enable compliance efforts, a self-assessment checklist, training and consulting services, and a directory of compliance officers at participating companies." Also of note is GPL enforcer Bradley M. Kuhn's reaction entitled "May They Make Me Superfluous": "If this Linux Foundation (LF) program is successful, I may get something I've wished for since the first enforcement I ever worked on back in late 1998: I'd like to never do GPL enforcement again. I admit I talk a lot about GPL enforcement. It's indeed been a major center of my work for twelve years, but I can't say I've ever really liked doing it."

Comments (4 posted)

Copyright assignment - Once bitten, twice shy (The H)

The H has put up a lengthy look at copyright assignment policies. There may not be much new here for LWN readers, but it's a reasonable summary of the anti-assignment position. "But it isn't just the paperwork that causes friction. The code may be released under a copyleft licence, but once the copyright has been re-assigned, you have surrendered your rights, and the new owner is able to apply all the toxic conditions and ramifications that apply to any closed source licence in a simultaneous release of your code. The developer has the option of forking the code, but this is not what you are aiming for when you contribute to a free software project."

Comments (28 posted)

Eben Moglen's LibrePlanet 2010 Keynote (Groklaw)

Groklaw has a transcript of Eben Moglen's keynote at LibrePlanet. "Freedom will from here out be endangered. Freedom will be attacked, freedom will be undermined, freedom will be evaded in various ways -- some of them clever some of them stupid -- but from here on out, the relationship between technological sophistication, agility, reliability, adaptability and low cost, means that freedom has acquired an extraordinary set of unintentional allies. They may not care about freedom at all, but they no longer have a choice but to further freedom's interests." Video is also available.

Comments (29 posted)

How the Hold Up Problem Explains the Flash Wars (GigaOM)

GigaOM looks at one aspect of software freedom while never using that term; instead, the article is, on its face, about economics and Flash. "The hold up problem is particularly severe in the IT sector. Building an Internet company on a foundation consisting of proprietary software owned by others is akin to building a house without owning the land under it. When software is sold in binary form, the buyer is subject to hold up by the vendor; if the software needs to be changed in the future, such changes can only be done with the cooperation of the original vendor at the price that the original vendor demands. By relying on open source, a company can invest in developing its product without fear of being held up down the road. Therefore, open source is an economically powerful solution to the hold up problem."

Comments (17 posted)

New Books

Autotools--New from No Starch Press

No Starch Press has released "Autotools - Creating Portable Software Just Got Easier" by John Calcote.

Full Story (comments: none)

Build Your Own Wicked WordPress Themes--New from SitePoint

SitePoint has released "Build Your Own Wicked WordPress Themes" by Raena Jackson Armitage, Alan Cole, Brandon R. Jones, and Jeffrey Way.

Full Story (comments: none)

Resources

FSFE Newsletter August 2010

The Free Software Foundation Europe newsletter for August 2010 is out. "The focus of this edition is Free Software in the public sector: on a national level within the United Kingdom, in the Italian region of Bozen, and in the Austrian city of Linz. We introduce a new definition and mnemonic of Open Standards, and invite you to participate in upcoming local Free Software events."

Full Story (comments: none)

Contests and Awards

Open Source for America Celebrates its First Anniversary with Awards

Open Source for America has announced an awards program to recognize those who have been most influential in advancing its goals of educating decision makers in the U.S. Federal government about the advantages of using free and open source software. "[I]f you know someone whose contributions should be recognized, why don't you nominate them for one of the awards? It will only take a few moments of your time. You can find the awards categories and rules here."

Comments (none posted)

Calls for Presentations

MeeGo Conference 2010 Call for Session Proposals

The first MeeGo Conference will take place in Dublin, Ireland, November 15-17, 2010. The call for session proposals closes August 23, 2010. "Do you want to speak at the first MeeGo Conference in Dublin, Ireland on November 15 - 17? Now is your chance! The call for session proposals has started, and anyone who wants to speak at the conference must submit a proposal. Proposals from community members, Intel, Nokia, the Linux Foundation and others will all be given equal consideration."

Comments (none posted)

Upcoming Events

DjangoCon US 2010

DjangoCon US 2010 takes place September 7-9, 2010 in Portland, Oregon. The conference will be followed by a three-day sprint.

Full Story (comments: none)

Events: August 19, 2010 to October 18, 2010

The following event listing is taken from the LWN.net Calendar.

Date(s) | Event | Location
August 21-22 | Free and Open Source Software Conference | St. Augustin, Germany
August 23-27 | European DrupalCon | Copenhagen, Denmark
August 28 | PyTexas 2010 | Waco, TX, USA
August 31 - September 1 | LinuxCon Brazil 2010 | São Paulo, Brazil
August 31 - September 3 | OOoCon 2010 | Budapest, Hungary
September 6-9 | Free and Open Source Software for Geospatial Conference | Barcelona, Spain
September 7-9 | DjangoCon US 2010 | Portland, OR, USA
September 8-10 | CouchCamp: CouchDB summer camp | Petaluma, CA, United States
September 10-12 | Ohio Linux Fest | Columbus, Ohio, USA
September 11 | Open Tech 2010 | London, UK
September 13-15 | Open Source Singapore Pacific-Asia Conference | Sydney, Australia
September 16-17 | Magnolia-CMS | Basel, Switzerland
September 16-17 | 3rd International Conference FOSS Sea 2010 | Odessa, Ukraine
September 16-18 | X Developers' Summit | Toulouse, France
September 17-18 | FrOSCamp | Zürich, Switzerland
September 17-19 | Italian Debian/Ubuntu Community Conference 2010 | Perugia, Italy
September 18 | Software Freedom Day 2010 | Everywhere, Everywhere
September 18-19 | WordCamp Portland | Portland, OR, USA
September 21-24 | Linux-Kongress | Nürnberg, Germany
September 23 | Open Hardware Summit | New York, NY, USA
September 24-25 | BruCON Security Conference 2010 | Brussels, Belgium
September 25-26 | PyCon India 2010 | Bangalore, India
September 27-28 | Workshop on Self-sustaining Systems | Tokyo, Japan
September 27-29 | Japan Linux Symposium | Tokyo, Japan
September 29 | 3rd Firebird Conference - Moscow | Moscow, Russia
September 30 - October 1 | Open World Forum | Paris, France
October 1 | Firebird Day Paris - La Cinémathèque Française | Paris, France
October 1-2 | Open Video Conference | New York, NY, USA
October 3-4 | Foundations of Open Media Software 2010 | New York, NY, USA
October 4-5 | IRILL days - where FOSS developers, researchers, and communities meet | Paris, France
October 7-9 | Utah Open Source Conference | Salt Lake City, UT, USA
October 8-9 | Free Culture Research Conference | Berlin, Germany
October 11-15 | 17th Annual Tcl/Tk Conference | Chicago/Oakbrook Terrace, IL, USA
October 12 | Eclipse Government Day | Reston, VA, USA
October 12-13 | Linux Foundation End User Summit | Jersey City, NJ, USA
October 16 | FLOSS UK Unconference Autumn 2010 | Birmingham, UK
October 16 | Central PA Open Source Conference | Harrisburg, PA, USA

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds