
LWN.net Weekly Edition for January 27, 2011

Unhosted web applications: a new approach to freeing SaaS

January 26, 2011

This article was contributed by Nathan Willis

Free software advocates have been pushing hard against the growing trend of commercial Software-as-a-Service (SaaS) — and the resulting loss of autonomy and software freedom — for several years now. A new project named Unhosted takes a different approach to the issue than that used by better-known examples like Diaspora and StatusNet. Unhosted is building a framework in which all of a web application's code is run on the client-side, and users have the freedom to choose any remote data storage location they like. The storage nodes use strong encryption, and because they are decoupled from the application provider, users always have the freedom to switch between them or to shut off their accounts entirely.

The Unhosted approach

An outline of the service model envisioned by Unhosted can be found on the project's Manifesto page, written by founder Michiel de Jong. "A hosted website provides two things: processing and storage. An unhosted website only hosts its source code (or even just a bootloader for it). Processing is done in the browser, with ajax against encrypted cloud storage."

In other words, the manifesto continues, despite the availability of the Affero GPL (AGPL), which requires making source code available to network end-users, licensing alone is not enough to preserve user freedom because proprietary SaaS sites require users to upload their data to "walled silos" run by the service provider. An Unhosted application is a JavaScript program that runs in the browser, but accesses online storage on a compliant storage node. It does not matter to the application whether the storage node is run by the application provider, the user, or a third party.

Storage nodes are essentially commodity infrastructure, but in order to preserve user freedom, Unhosted requires that applications encrypt and sign the data they store. The project defines an application-layer protocol called Unhosted JSON Juggling Protocol (UJJP, sometimes referred to as UJ) for applications to communicate with storage nodes, for requesting and exchanging objects in JavaScript Object Notation (JSON) format.

As the FAQ explains, this constitutes a distinctly different model than most other free software SaaS projects. Most (like StatusNet and Diaspora) focus on federation, which allows each user to run his or her own node and requires no centralized authority linking all of the user accounts. The downside of federated systems is that they may still require users to entrust their data to a remote server.

Eben Moglen's FreedomBox, on the other hand, focuses on putting the storage under the direct control of the user (specifically, stored at home on a self-managed box). This is a greater degree of freedom, but home-hosting is less accessible from the Internet at large than most web services because it often depends on Dynamic DNS. Home-hosting is also vulnerable to limited upstream bandwidth and common ISP restrictions on running servers.

Unhosted, therefore, attempts to preserve the "accessible anywhere" nicety of popular Web 2.0 services while de-linking the application from the siloed data.

Connecting applications to storage

Obviously, writing front-end applications entirely in HTML5 and JavaScript is not a new idea. The secret sauce of Unhosted is the connection method that links the application to the remote storage node — or, more precisely, that links the application to any user-defined storage node. The system relies on Cross-Origin Resource Sharing (CORS), a W3C Working Draft mechanism by which a server can opt in to making its resources available to requests originating from other servers.
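
The opt-in takes the form of a few response headers that the storage node attaches to its replies. The Python fragment below is only a sketch of that server side of the exchange (the project's demo node is actually written in PHP, and the allow list, port, and response body here are invented for illustration), but the CORS header names themselves are the standard ones:

    # Sketch of the CORS side of a storage node. Only the CORS header names
    # are standard; the allow list, port, and response body are made up.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    ALLOWED_ORIGINS = {"https://wappmail.example.com"}   # apps this node will serve

    class StorageNode(BaseHTTPRequestHandler):
        def do_OPTIONS(self):                 # the browser's CORS preflight request
            origin = self.headers.get("Origin", "")
            if origin in ALLOWED_ORIGINS:
                self.send_response(204)
                self.send_header("Access-Control-Allow-Origin", origin)
                self.send_header("Access-Control-Allow-Methods", "GET, POST")
                self.send_header("Access-Control-Allow-Headers", "Content-Type")
            else:
                self.send_response(403)
            self.end_headers()

        def do_POST(self):                    # the application's actual request
            origin = self.headers.get("Origin", "")
            if origin not in ALLOWED_ORIGINS:
                self.send_response(403)
                self.end_headers()
                return
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)    # credential checks and storage omitted
            self.send_response(200)
            self.send_header("Access-Control-Allow-Origin", origin)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(b'{"status": "ok"}')

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), StorageNode).serve_forever()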

In the canonical "web mail" example, the Unhosted storage node sees a cross-origin request from the webmail application, checks the source, user credentials, and request type against its access control list, and returns the requested data only if the request is deemed valid. UJJP defines the operations an application can perform on the storage node, including creating a new data store, setting and retrieving key-value pairs, importing and exporting data sets, and completely deleting a data store.
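
The UJJP wire format is still evolving, so the following Python fragment should be read only as a sketch of the general shape of such a request; the endpoint path, command name, and field names are invented for illustration and are not the actual protocol syntax:

    # Hypothetical UJJP-style "set a key" request; the command and field
    # names are illustrative, not the real UJJP vocabulary.
    import json
    import urllib.request

    def store_value(storage_node, user, app_origin, key, value):
        """POST one key-value pair to an Unhosted-style storage node."""
        payload = json.dumps({
            "command": "SET",        # hypothetical operation name
            "user": user,            # e.g. "alice@example.org"
            "app": app_origin,       # origin of the unhosted application
            "key": key,
            "value": value,          # already encrypted and signed client-side
        }).encode("utf-8")
        req = urllib.request.Request(
            storage_node,            # e.g. "https://storage.example.net/ujjp"
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))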

Security-wise, each application only has access to its own data store, not the user's entire storage space, and CORS does allow each storage node to determine a policy about which origins it will respond to. But beyond that, the system also relies on the fact that the user has access to all of the application source code, because it runs in the browser. Thus it is up to the user to notice if the application does something sinister like relay user credentials to an untrusted third party. Dealing with potentially obfuscated JavaScript may be problematic for users, but it is still an improvement over server-side processing, which happens entirely out of sight.

Finally, each application needs a way to discover which storage node a user account is associated with, preferably without prompting the user for the information every time. The current Unhosted project demo code relies on Webfinger-based service discovery, which uniquely associates a user account with an email address. The user would log in to the application with an email address, the application would query the address's Webfinger identity to retrieve a JSON-formatted array of Unhosted resource identifiers, and connect to the appropriate one to find the account's data store.
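
A rough sketch of that lookup is shown below. For brevity it uses the later JSON /.well-known/webfinger endpoint rather than the host-meta/XRD flow that was current in 2011, and the link relation it searches for is an assumption, not an identifier the project actually registered:

    # Rough sketch of Webfinger-based storage discovery; the link relation
    # string is hypothetical.
    import json
    import urllib.parse
    import urllib.request

    def find_storage_node(email):
        """Return the storage node advertised for an email-style identifier."""
        domain = email.split("@", 1)[1]
        url = ("https://%s/.well-known/webfinger?resource=%s"
               % (domain, urllib.parse.quote("acct:" + email)))
        with urllib.request.urlopen(url) as resp:
            jrd = json.loads(resp.read().decode("utf-8"))
        for link in jrd.get("links", []):
            if link.get("rel") == "http://unhosted.org/spec/storage":  # assumed rel
                return link.get("href")
        return None

    # find_storage_node("alice@example.org") might return something like
    # "https://storage.example.net/ujjp"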

This is not a perfect solution, however, because it depends on the email service provider supporting Webfinger. Other proposed mechanisms exist, including using Jabber IDs and Freedentity.

The tricky bits

Currently, one of the biggest sticking points in the system is protecting the stored data without making the system arduous for end users. The present model relies on RSA encryption and signing for all data stores. Although the project claims this is virtually transparent for users, it gets more difficult when one Unhosted application user wishes to send a message to another user. Because the other user is on a different storage node, that user's public key needs to be retrieved in order to encrypt the message. But the system cannot blindly trust any remote storage node to authoritatively verify the other user's identity — that would be trivial to hijack. In response, the Unhosted developers are working on a "fabric-based public key infrastructure" that enables users to deterministically traverse a web of trust from one user ID to another. Details on that part of the system are still forthcoming.
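
The per-object cryptography itself is the easy half of the problem. The sketch below shows the kind of signing and verification involved, using Python's cryptography package rather than the project's own unhosted.js; key generation, storage, and distribution (the part the fabric-based PKI is meant to solve) are deliberately left out:

    # Signing and verifying a stored object with RSA, using the Python
    # "cryptography" package; key distribution is not addressed here.
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    # The object as the application would hand it to the storage node.
    message = b'{"key": "inbox/42", "value": "<encrypted blob>"}'

    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)

    # The sender signs before uploading...
    signature = private_key.sign(message, pss, hashes.SHA256())

    # ...and the recipient verifies with the sender's public key;
    # verify() raises InvalidSignature if the object was tampered with.
    public_key.verify(signature, message, pss, hashes.SHA256())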

It is also an open question as to what sort of storage engine makes a suitable base for an Unhosted storage node. The demo code includes servers written in PHP, Perl, and Python that all run on top of standard HTTP web servers. On the mailing list, others have discussed a simple way to implement Unhosted storage on top of WebDAV, but there is no reason that a storage node could not be implemented on top of a distributed filesystem like Tahoe, or a decentralized network like BitTorrent.

Perhaps the most fundamental obstacle facing Unhosted is that it eschews server-side processing altogether. Consequently, no processing can take place while the user is logged out of the application. "Logged out" could simply mean that the page or tab is closed, or an application could provide a logout mechanism that disconnects from the storage node but continues to perform other functions. This is fine for interactive or message-based applications like instant messaging, but it limits the types of applications that fit into the Unhosted mold. Judging by the mailing list, the project members have been exploring queuing up operations on the storage node side, which could enable more asynchronous functionality, but Unhosted is still not a replacement for every type of SaaS.

Actual code and holiday bake-offs

The project has a GitHub repository, which is home to some demonstration code showing off both parts of the Unhosted platform — although it loudly warns users that it is not meant for production use. The "cloudside" directory includes an example Unhosted storage node implementation, while the "wappside" directory includes three example applications designed to communicate with the storage node.

The storage node module speaks CORS and is written in PHP with a MySQL back-end. It does not contain any server-side user authentication, so it should not be deployed outside the local area network, but it works as a sample back-end for the example applications.

The example application set includes a JavaScript library named unhosted.js that incorporates RSA data signing and signature verification, encryption and decryption, and AJAX communication with the CORS storage node. There is a separate RSA key generation Web utility provided as a convenience, but it is not integrated into the example applications.

The example named "wappblog" is a simple blog-updating application. It creates a client-side post editor that updates the contents of an HTML file on a storage node, which is then retrieved for reading by a separate page. The "wappmail" application is a simple web mail application, which requires you to set up multiple user accounts, but shows off the ability to queue operations — incoming messages are stored and processed when each user logs in.

The third example is an address book, which demonstrates the fabric-based PKI system (although the documentation warns "it's so new that even I don't really understand how it works, and it's mainly there for people who are interested in the geeky details").

A more practical set of example applications are the third-party projects written for Unhosted's "Hacky Holidays" competition in December. The winning entry was Nathan Rugg's Scrapbook, which allows users to manipulate text and images on an HTML canvas, and shows how an Unhosted storage node can be used to store more than just plain text. Second place was shared between the instant messenger NezAIM and the note-taking application Notes.

The fourth entry, vCards, was deemed an honorable mention, although it used some client-side security techniques that would not work in a distributed environment in the real world (such as creating access control lists on the client side). The author of vCards was commended by the team for pushing the envelope of the protocol, though — he was one of the first to experiment with queuing operations so that one Unhosted application could pass messages to another.

Hackers wanted

At this stage, Unhosted is still primarily a proof-of-concept. The storage node code is very young, and has not been subjected to much real-world stress testing or security review. The developers are seeking input for the next (0.3) revision of UJJP, in which they hope to define better access control mechanisms for storage nodes (in part to enable inter-application communication) as well as a REST API.

On a bad day, I see "unresponsive script" warnings in Firefox and think rich client-side JavaScript applications sound like a terrible idea, but perhaps that is missing the bigger picture. StatusNet, Diaspora, and the other federated web services all do a good job of freeing users from reliance on one proprietary application vendor — but none of them are designed to make the storage underneath a flexible, replaceable commodity. One of the Unhosted project's liveliest metaphors for its storage de-coupling design is that it provides "a grease layer" between the hosted software and the servers that host it. That is an original idea, whether the top layer is written in JavaScript, or not.

Comments (10 posted)

The first stable release of LibreOffice

By Jake Edge
January 26, 2011

Just four months after splitting away from the OpenOffice.org project, LibreOffice has made its first stable release. While LibreOffice 3.3 and the (also just released) OpenOffice.org 3.3 share most of the same code, LibreOffice has started to differentiate itself from its progenitor. It has also built an impressive community in that time, and will be included in the next releases of the major community distributions. From almost any angle, it looks like LibreOffice is on a roll.

There are quite a few new features, as well as bug fixes, in the new release. Some of them may not seem all that new, at least to those who have been using the OpenOffice.org 3.3 release candidates. For some time, Linux users have generally been getting a much-enhanced version of OpenOffice.org based on the builds maintained by the Go-oo project. Since Go-oo has essentially morphed into the LibreOffice project, much of the new functionality will be found in both LibreOffice 3.3 and OpenOffice.org 3.3 on many distributions.

For example, the SVG import feature for Writer and Draw that is listed as a LibreOffice-only feature also appears in the latest OpenOffice.org for Fedora 14 (which is based on OpenOffice.org 3.3-rc9 plus the Go-oo patches). It may be that Windows and Mac OS X users are the most likely to notice big differences, depending on where they were getting their version of OpenOffice.org (from Sun/Oracle or Go-oo). It should also be noted that the SVG import feature still has some bugs to be excised. On an import of the SVG of the LWN penguin, both LibreOffice and OpenOffice.org took many minutes (on the order of ten) to render the SVG, and the rendering was incorrect. Both GIMP and Inkscape render it in seconds (or less) and are both in agreement that it should look much the way it does in the upper left of this page.

I gave LibreOffice 3.3 a try on Fedora 14. Not finding any experimental LibreOffice yum repository in a quick search, I decided to go ahead and download the tarball, which provided a handful of (unsigned) RPMs. After installing those, there is an additional "desktop-integration" RPM to install, which conflicted with the various OpenOffice.org packages that were still installed. After a moment's thought, I went ahead and removed OpenOffice.org, which proved uneventful as LibreOffice is a drop-in replacement.

Working with various documents and spreadsheets in LibreOffice was also uneventful, which is no real surprise. It's not clear what differences there are between Fedora's OpenOffice.org 3.3 and LibreOffice 3.3, but they were not particularly evident in the (fairly simple) documents that I worked with. For power users, perhaps there are more obvious differences. But there is also no reason to go back to OpenOffice.org that I can see. Apart from the LibreOffice start-up splash screen, it really isn't apparent that you aren't running OpenOffice.org.

Lots of Linux users will likely be using LibreOffice soon anyway, as Ubuntu, openSUSE, and Fedora all plan to ship it in their next release. openSUSE 11.4 is currently scheduled for March, so it may well be the first of those to switch over to LibreOffice. But Ubuntu 11.04 ("Natty Narwhal") and Fedora 15 won't be far behind, with the former scheduled for April and the latter for May. Debian "Squeeze" (6.0) will still be shipping OpenOffice.org (3.2.x), which is not surprising given the stability that Debian likes to bake into its releases.

Looking beyond the 3.3 release, LibreOffice has a fairly aggressive release schedule, with plans for 3.3.1 in February and 3.3.2 in March, both of which will consist mostly of bug fixes. There are also plans for a more major 3.4 release in May. Over time, the plan is to do major releases every six months and to try to align them with distribution release cycles by making those releases in March and September.

The biggest new feature in LibreOffice 3.3 is probably the SVG import mentioned earlier. Another is the ability to have spreadsheets with up to a million rows (the previous limit was 64K). Many of the rest seem like they will be popular with a much smaller subset of LibreOffice users, though the improved support for importing MS Works, Lotus Word Pro, and WordPerfect formats will likely strike a chord with folks who have to deal with documents in those formats.

Many of the new features listed seem like longstanding bugs (or misfeatures) of OpenOffice.org that are finally being addressed. An easier to use title page dialog box, a tree view for headings, better slide layout handling for Impress, radio button widgets in the menus, auto-correction that correctly matches the case of the word replaced, and so on, all seem like things that have been bugging users for some time but weren't getting addressed in the OpenOffice.org releases.

The ability to address some of these warts is part of why LibreOffice exists. The large number of patches that were carried along by the Go-oo project was not going to be a sustainable model, and the development style of the OpenOffice.org project made it unable, or unwilling, to incorporate many of those kinds of changes, at least quickly. The LibreOffice developers have clearly learned from that experience and are trying to fix these kinds of things as quickly as reasonable patches are submitted.

One of the goals of the LibreOffice project is to be welcoming to new developers and their patches. That's part of the reason that there is no contributor agreement required for LibreOffice patches. But the welcoming approach goes beyond that. The now semi-famous list of "easy hacks", which serves as an easy introduction for developers (and others), is a perfect example. Many projects would probably find it easier to get people involved by maintaining a similar list.

There is also an active development mailing list, with discussions about all kinds of patches, bugs, and features. There are other mailing lists for users, design, marketing, documentation, and so on, along with an active #documentfoundation IRC channel on Freenode.

Some friction is to be expected in the formative stages of a new project and LibreOffice is not immune to that. The OOXML debate was one such incident. In addition, steering committee member Florian Effenberger alludes to some unhappiness in the community about the role of that committee. Project governance is by no means a solved problem, and community members will often disagree about the direction of the project and its leadership. That certainly isn't just a problem for new projects as the current turmoil in the FFmpeg project will attest.

OpenOffice.org is still chugging along, but a look at its mailing lists suggests that there is far less enthusiasm in that community than in LibreOffice's. That may not be a good way to measure, or even estimate, a community's fervor, but it definitely seems like the wind has gone out of OpenOffice.org's sails. Oracle has an interest in continuing Oracle Open Office (formerly StarOffice)—the commercial offshoot of OpenOffice.org—development, but one has to wonder how long it will be willing to maintain an open source edition.

Because of the "corporate" development style and the contributor agreement requirements for OpenOffice.org—two of the major reasons that LibreOffice forked—it seems likely that external contributions, such as they were, will be on the decline. The two projects have the same LGPLv3 license, so code can theoretically migrate between them, but new features that go into LibreOffice may not make their way into OpenOffice.org because of the contributor agreement. That means that LibreOffice can cherry-pick features from OpenOffice.org, at least as long as the code bases don't diverge too much, while OpenOffice.org has to either forgo or reimplement them. Should LibreOffice be successful, it will provide a pretty clear object lesson on the perils of requiring contributor agreements.

Overall, the progress made by LibreOffice has been very impressive. Obviously the Go-oo project (and OpenOffice.org itself) gave the LibreOffice founders a good starting point—and a lot of lessons and experience—but that doesn't diminish what has been accomplished at all. One can only imagine the strides that will be made over the next year or two. It will no doubt be interesting to see where it goes from here.

Comments (1 posted)

LCA: Vint Cerf on re-engineering the Internet

By Jonathan Corbet
January 25, 2011
Vint Cerf is widely credited as one of the creators of the Internet. So, when he stood up at linux.conf.au in Brisbane to say that the net is currently in need of some "serious evolution," the attendees were more than prepared to listen. According to Vint, it is not too late to create a better Internet, despite the fact that we have missed a number of opportunities to update the net's infrastructure. Quite a few problems have been discovered over the years, but the solutions are within our reach.

His talk started back in 1969, when he was hacking a SIGMA 7 to make it talk to the ARPAnet's first Interface Message Processor (IMP). The net has grown a little since then; current numbers suggest that there are around 768 million connected machines - and that doesn't count the vast numbers of systems with transient connections or which are hiding behind corporate firewalls. Nearly 2 billion users have access to the net. But, Vint said, that just means that the majority of the world's population is still waiting to connect to the net.

From the beginning, the net was designed around the open architecture ideas laid down by Bob Kahn. Military requirements were at the top of the list then, so the designers of the net created a system of independent networks connected via routers with no global control. Crucially, the designers had no particular application in mind, so there are relatively few assumptions built into the net's protocols. IP packets have no understanding of what they carry; they are just hauling loads of bits around. Also important was the lack of any country-based addressing scheme. That just would not make sense in the military environment, Vint said, where it can be very difficult to get an address space allocation from a country which one is currently attacking.

The openness of the Internet was important: open source, open access, and open standards. But Vint was also convinced from an early date that commercialization of the Internet had to happen. There was no way that governments were going to pay for Internet access for all their citizens, so a commercial ecosystem had to be established to build that infrastructure.

The architecture of the network has seen some recent changes. At the top of the list is IPv6. Vint was, he said, embarrassed to be the one who decided, in 1977, that 32 bits would be more than enough for the net's addressing system. Those 32 bits are set to run out any day now, so, Vint said, if you're not doing IPv6, you should be. We're seeing the slow adoption of non-Latin domain names and the DNSSEC protocol. And, of course, there is the increasing prominence of mobile devices on the net.

One of the biggest problems which has emerged from the current Internet is security. He was "most disturbed" that many of the problems are not technical; they are a matter of suboptimal user behavior - bad passwords, for example. He'd like to see the widespread use of two-factor authentication on the net; Google is doing that internally now, and may try to support it more widely for use with Google services. The worst problems, he said, come from "dumb mistakes" like configuration errors.

So where are security problems coming from? Weak operating systems are clearly a part of the problem; Vint hoped that open-source systems would help to fix that. The biggest problem at the moment, though, is browsers. Once upon a time, browsers were simple rendering engines which posed little threat; now, though, they contain interpreters and run programs from the net. The browser, he said, has too much privilege in the system; we need a better framework in which to securely run web-based applications. Botnets are a problem, but they are really just a symptom of easily-penetrated systems. We all need to work on the search for better solutions.

Another big issue is privacy. User choices are a part of the problem here; people put information into public places without realizing that it could come back to haunt them later. Weak protection of information by third parties is also to blame, though. But, again, technology isn't the problem; it's more of a policy issue within businesses. Companies like Google and others have come into possession of a great deal of privacy-sensitive information; they need to protect it accordingly.

Beyond that, there's the increasing prevalence of "invasive devices," including cameras, devices with location sensors, and more. It is going to be increasingly difficult to protect our privacy in the future; he expressed worries that it may simply not be possible.

There was some talk about clouds. Cloud computing, he said, has a lot of appeal. But each cloud is currently isolated; we need to work on how clouds can talk to each other. Just as the Internet was created through the connection of independent networks, perhaps we need an "intercloud" (your editor's term - he did not use it) to facilitate collaboration between clouds.

Vint had a long list of other research problems which have not been solved; there was not time to talk about them all. But, he says, we have "unfinished work" to deal with. This work can be done on the existing network - we do not need to dump it and start over.

So what is this unfinished work? "Security at all levels" was at the top of the list; if we can't solve the security problem, it's hard to see that the net will be sustainable in the long run. We currently have no equivalent to the Erlang distribution to describe usage at the edges of the network, making provisioning and scaling difficult. The quality of service (and network neutrality) debate, he said, will be going on for a very long time. We need better distributed algorithms to take advantage of mixed cloud environments.

There were, he said, some architectural mistakes made which are now making things harder. When the division was made between the TCP and IP layers, it was decided that TCP would use the same addressing scheme as IP. That was seen as a clever design at the time; it eliminated the need to implement another layer of addressing at the TCP level. But it was a mistake, because it binds higher-level communications to whatever IP address was in use when the connection was initiated. There is no way to move a device to a new address without breaking all of those connections. In the designers' defense, he noted, the machines at the time, being approximately room-sized, were not particularly mobile. But he wishes they had seen mobile computing coming.

Higher-level addressing could still be fixed by separating the address used by TCP from that used by IP. Phone numbers, he said, once were tied to a specific location; now they are a high-level identifier which can be rebound as a phone moves. The same could be done for network-attached devices. Of course, there are problems to be solved - for example, it must be possible to rebind a TCP address to a new IP address in a way which does not expose users to session hijacking. This sort of high-level binding would also solve the multi-homing and multipath problems; it would be possible to route a single connection transparently through multiple ISPs.

Vint would also like to see us making better use of the net's broadcast capabilities. Broadcast makes sense for real-time video, but it could be applied in any situation where multiple users are interested in the same content - for software updates, for example. He described the use of satellites to "rain packets" to receivers; it is, he said, something which could be done today.

Authentication remains an open issue; we need better standards and some sort of internationally-recognized indicators of identity. Internet governance was on the list; he cited the debate over network censorship in Australia as an example. That sort of approach, he said, is "not very effective." He said there may be times when we (for some value of "we") decide that certain things should not be found on the net; in such situations, it is best to simply remove such materials when they are found. There is no hope in any attempt to stop the posting of undesirable material in the first place. Governance, he said, will only become more important in the future; we need to find a way to run the net which preserves its fundamental openness and freedom.

Performance: That just gets harder as the net gets bigger; it can be incredibly difficult to figure out where things are going wrong. He said that he would like a button marked "WTF" on his devices; that button could be pressed when the net isn't working to obtain a diagnosis of what the problem is. But, to do that, we need better ways of identifying performance problems on the net.

Addressing: what, he asked, should be addressable on the Internet? Currently we assign addresses to machines, but, perhaps, we should assign addresses to digital objects as well? A spreadsheet could have its own address, perhaps. One could argue that a URL is such an address, but URLs are dependent on the domain name system and can break at any time. Important objects should have locators which can last over the long term.

Along those lines, we need to think about the long-term future of complex digital objects which can only be rendered with computers. If the software which can interpret such an object goes away, the objects themselves essentially evaporate. He asked: will Windows 3000 be able to interpret a 1997 PowerPoint file? We should be thinking about how these files will remain readable over the course of thousands of years. Open source can help in this regard, but proprietary applications matter too. He suggested that there should be some way to "absorb" the intellectual property of companies which fail, making it available so that files created by that company's software remain readable. Again, Linux and open source have helped to avoid that problem, but they are not a complete solution. We need to think harder about how we will preserve our "digital stuff"; he is not sure what the solution will look like.

Wandering into more fun stuff, Vint talked a bit about the next generation of devices; a network-attached surfboard featured prominently. He talked some about the sensor network in his house, including the all-important temperature sensor which sends him a message if the temperature in his wine cellar exceeds a threshold. But he'd like more information; he knows about temperature events, or whether somebody entered the room, but there's no information about what may have happened in the cellar. So maybe it is time to put RFIDs on the bottles themselves. But that won't help him to know if a specific bottle has gotten too warm; maybe it's time to put sensors into the corks to track the state of the wine. Then he could unerringly pick out a ruined bottle whenever he had to give a bottle of wine to somebody who is unable to tell the difference.

The talk concluded with some discussion of the interplanetary network. There was some amusing talk of alien porn and oversized ovipositors, but the real problem is one of arranging for network communications within the solar system. The speed of light is too slow, meaning that the one-way latency to Mars is, at a minimum, about three minutes (and usually quite a bit more). Planetary rotation can interrupt communications to specific nodes; rotation, he says, is a problem we have not yet figured out how to solve. So we need to build tolerance of delay and disruption deep into our protocols. Some thoughts on this topic have been set down in RFC 4838, but there is more to be done.

We should also, Vint said, build network nodes into every device we send out into space. Even after a device ceases to perform its primary function, it can serve as a relay for communications. Over time, we could deploy a fair amount of network infrastructure in space with little added cost. That is a future he does not expect to see in its full form, but he would be content to see its beginning.

There was a question from the audience about bufferbloat. It is, Vint said, a "huge problem" that can only be resolved by getting device manufacturers to fix their products. Ted Ts'o pointed out that LCA attendees had been advised (via a leaflet in the conference goodie bag) to increase buffering on their systems as a way of getting better network performance in Australia; Vint responded that much harm is done by people who are trying to help.

Comments (107 posted)

LCA: IP address exhaustion and the end of the open net

By Jonathan Corbet
January 26, 2011
Geoff Huston is the Chief Scientist at the Asia Pacific Network Information Centre. His frank linux.conf.au 2011 keynote took a rather different tack than Vint Cerf's talk did the day before. According to Geoff, Vint is "a professional optimist." Geoff was not even slightly optimistic; he sees a difficult period coming for the net; unless things happen impossibly quickly, the open net that we often take for granted may be gone forevermore.

The net, Geoff said, is based on two "accidental technologies": Unix and packet switching. Both were new at their time, and both benefited from open-source reference implementations. That openness created a network which was accessible, neutral, extensible, and commercially exploitable. As a result, proprietary protocols and systems died, and we now have a "networking monoculture" where TCP/IP dominates everything. Openness was the key: IPv4 was as mediocre as any other networking technology at that time. It won not through technical superiority, but because it was open.

But staying open can be a real problem. According to Geoff, we're about to see "another fight of titans" over the future of the net; it's not at all clear that we'll still have an open net five years from now. Useful technologies are not static, they change in response to the world around them. Uses of technologies change: nobody expected all the mobile networking users that we now have; otherwise we wouldn't be in the situation we're in where, among other things, "TCP over wireless is crap."

There are many challenges coming. Network neutrality will be a big fight, especially in the US. We're seeing more next-generation networks based around proprietary technologies. Mobile services tend to be based on patent-encumbered, closed applications. Attempts to bundle multiple types of services - phone, television, Internet, etc. - are pushing providers toward closed models.

The real problem

But the biggest single challenge by far is the simple fact that we are out of IP addresses. There were 190 million IP addresses allocated in 2009, and 249 million allocated in 2010. There are very few addresses left at this time: IANA will run out of IPv4 addresses in early February, and the regional authorities will start running out in July. The game is over. Without open addressing, we don't have an open network that anybody can join. That, he said, is "a bit of a bummer."

This problem was foreseen back in 1990; in response, a nice plan - IPv6 - was developed to ensure that we would never run out of network addresses. That plan assumed that the transition to IPv6 would be well underway by the time that IPv4 addresses were exhausted. Now that we're at that point, how is that plan going? Badly: currently 0.3% of the systems on the net are running IPv6. So, Geoff said, we're now in a position where we have to do a full transition to IPv6 in about seven months - is that feasible?

To make that transition, we'll have to do more than assign IPv6 addresses to systems. This technology will have to be deployed across something like 1.8 billion people, hundreds of millions of routers, and more. There's lots of fun system administration work to be done; think about all of the firewall configuration scripts which need to be rewritten. Geoff's question to the audience was clear: "you've got 200 days to get this done - what are you doing here??"

Even if the transition can be done in time, there's another little problem: the user experience of IPv6 is poor. It's slow, and often unreliable. Are we really going to go through 200 days of panic to get to a situation which is, from a user point of view, worse than what we have now? Geoff concludes that IPv6 is simply not the answer in that time frame - that transition is not going to happen. So what will we do instead?

One commonly-suggested approach is to make much heavier use of network address translation (NAT) in routers. A network sitting behind a NAT router does not have globally-visible addresses; hiding large parts of the net behind multiple layers of NAT can thus reduce the pressure on the address space. But it's not quite that simple.

Currently, NAT routers are an externalized cost for Internet service providers; they are run by customers and ISPs need not worry about them. Adding more layers of NAT will force ISPs to install those routers. And, Geoff said, we're not talking about little NAT routers - they have to be really big NAT routers which cannot fail. They will not be cheap. Even then, there are problems: multiple levels of NAT will break applications which have been carefully crafted to work around a single NAT router. How NAT routers will play together is unclear - the IETF refused to standardize NAT, so every NAT implementation is creative in its own way.

It gets worse: adding more layers of NAT will break the net in fundamental ways. Every connection through a NAT router requires a port on that router; a single web browser can open several connections in an attempt to speed page loading. A large NAT router will have to handle large numbers of connections simultaneously, to the point that it will run out of port numbers. Port numbers are only 16 bits, after all. So ISPs are going to have to think about how many ports they will make available to each customer; that number will converge toward "one" as the pressure grows. Our aperture to the net, Geoff said, is shrinking.
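
The arithmetic behind that shrinking aperture is easy to work out. The figures in the following back-of-the-envelope sketch (how many ports are held back, how many each customer gets) are assumptions chosen purely for illustration:

    # Back-of-the-envelope arithmetic for carrier-grade NAT port sharing;
    # the per-customer allocations are illustrative guesses.
    TOTAL_PORTS = 65536          # 16-bit port space
    RESERVED = 1024              # well-known ports, not handed out
    usable = TOTAL_PORTS - RESERVED

    for ports_per_customer in (2000, 500, 100):
        customers = usable // ports_per_customer
        print("%5d ports each -> %4d customers per public IPv4 address"
              % (ports_per_customer, customers))

    # A busy browser can open dozens of connections for a single page, so
    # even a few hundred ports per customer leaves little headroom.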

So perhaps we're back to IPv6, somehow. But there is no compatibility between IPv4 and IPv6, so systems will have to run both protocols during the transition. The transition plan, after all, assumed that it would be completed before IPv4 addresses ran out. But that plan did not work; it was, Geoff said, "economic rubbish." But we're going to have to live with the consequences, which include running dual stacks for a transition period that, he thinks, could easily take ten years.
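
What running dual stacks means in practice is that every service has to be reachable over both protocols for the duration. Below is a minimal sketch of a dual-stack listener in Python; whether a single IPv6 socket will also accept IPv4 connections depends on the platform's IPV6_V6ONLY default, which is why the option is cleared explicitly here:

    # Minimal dual-stack listener: one IPv6 socket that also accepts IPv4
    # clients as v4-mapped addresses (platform-dependent, hence the
    # explicit IPV6_V6ONLY setting).
    import socket

    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", 8080))      # all IPv6 (and mapped IPv4) addresses
    sock.listen(5)

    conn, addr = sock.accept()   # an IPv4 client shows up as ::ffff:192.0.2.1
    print("connection from", addr)
    conn.close()
    sock.close()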

During that time, we're going to have to figure out how to make our existing IPv4 addresses last longer. Those addresses, he said, are going to become quite a bit more expensive. There will be much more use of NAT, and, perhaps, better use of current private addresses. Rationing policies will be put into place, and governmental regulation may well come into play. And, meanwhile, we know very little about the future we're heading into. TCP/IP is a monoculture, there is nothing to replace it. We don't know how long the transition will take, we don't know who the winners and losers will be, and we don't know the cost. We live in interesting times.

An end to openness

Geoff worried that, in the end, we may never get to the point where we have a new, IPv6 network with the same degree of openness we have now. Instead, we may be heading toward a world where we have privatized large parts of the address space. The problem is this: the companies which have lost the most as the result of the explosion of the Internet - the carriers - are now the companies which are expected to fund and implement the transition to IPv6. They are the ones who have to make the investment to bring this future around; will they really spend their money to make their future worse? These companies have no motivation to create a new, open network.

So what about the companies which have benefited from the open net: companies like Google, Amazon, and eBay? They are not going to rescue us either for one simple reason: they are now incumbents. They have no incentive to spend money which will serve mainly to enable more competitors. They are in a position where they can pay whatever it takes to get the address space they need; a high cost to be on the net is, for them, a welcome barrier to entry that will keep competition down. We should not expect help from that direction.

So perhaps it is the consumers who have to pay for this transition. But Geoff did not see that as being realistic either. Who is going to pay $20/month more for a dual-stack network which works worse than the one they have now? If one ISP attempts to impose such a charge, customers will flee to a competitor which does not. Consumers will not fund the change.

So the future looks dark. The longer we toy with disaster, Geoff said, the more likely it is that the real loser will be openness. It is not at all obvious that we'll continue to have an open net in the future. He doesn't like that scenario; it is, he said, the worst possible outcome. We all have to get out there, get moving, and fix this problem.

One of the questions asked was: what can we do? His answer was that we really need to make better software: "IPv6 sucks" currently. Whenever IPv6 is used, performance goes down the drain. Nobody has yet done the work to make the implementations truly robust and fast. As a result, even systems which are capable of speaking both protocols will try to use IPv4 first; otherwise, the user experience is terrible. Until we fix that problem, it's hard to see how the transition can go ahead.

Comments (198 posted)

Page editor: Jonathan Corbet

Security

Web tracking and "Do Not Track"

By Jake Edge
January 26, 2011

Web site visits are increasingly being tracked by advertisers and others ostensibly to better target advertising. But recording which sites we visit as we click our way around the web is not only an invasion of privacy, but one that has multiple avenues for abuse. Both Mozilla and Google have recently announced browser features that could reduce or eliminate tracking—at least for advertisers that comply.

Using a wide variety of techniques (browser or Flash cookies, web "bugs", JavaScript trickery, browser fingerprinting, and so forth), advertising and tracking companies are getting a detailed look at the web sites we visit. Most web advertising also provides a means to track web site visitors on a wide variety of sites, not just the single site where that particular ad appears. It is somewhere between difficult and impossible for users to stop this behavior, if they even know it is taking place. The information is then stored by these third parties for their use—or to sell to others.

What privacy advocates would like is a way for users to opt out of tracking. It would be better still if users had to opt in to tracking, but an initiative like that is vanishingly unlikely because of opposition from advertising/tracking companies. A subset of advertising companies have come together in a group called the Network Advertising Initiative (NAI), which provides an opt-out service to disable tracking by member companies. That web page gives an eye-opening list of advertisers and the status of their cookies in your browser. One can then choose which to opt out from (with a helpful "Select All" button if one is willing to turn on JavaScript for that site).

There are a number of problems with the NAI approach, as outlined in a recent Electronic Frontier Foundation (EFF) blog posting. The biggest problem from a privacy perspective is that some members interpret opting out differently than others:

Some tracking companies recognize that an "opt out" should be an opt out from being tracked, others insist on interpreting the opt out as being an opt out for receiving targeted advertising. In other words, the NAI allows its members to tell people that they've opted out, when in fact their web browsing is still being observed and recorded indefinitely.

Another problem is that the opt-out choice is recorded in a cookie for each different advertising or tracking company, so one must visit that page frequently as additional companies join the NAI. Privacy conscious users will also periodically delete their cookies, which also necessitates revisiting that page. Overall, it is a fairly fragile solution.

Google's idea is to provide a Chrome extension ("Keep My Opt-Outs") that blocks the deletion of the opt-out cookies (both browser and Flash cookies) so that users can still delete the rest of their cookies without having to re-up at the NAI web site. It is fundamentally just a list of cookies that shouldn't be deleted, and that list will need to be updated periodically, presumably through the extension update mechanism. It is similar to the Beef TACO (Targeted Advertising Cookie Opt-Out) Firefox extension, though TACO handles more than just the NAI-listed companies' cookies.

Keep My Opt-Outs and TACO are useful today, though they can't address the problem of differing interpretations of the opt-out. Mozilla has gone a step further and implemented a more sweeping change with its "Do Not Track" HTTP header. Do Not Track is going to require buy-in from other browsers and the tracking companies before it can even work, but it "solves" the problem in a much simpler way.

The basic idea is straightforward: a user can indicate that they do not wish to be tracked and Firefox will send a Do Not Track HTTP header with every request. That header could be interpreted by the tracking companies as the equivalent of their opt-out cookies. It would be even better if they interpreted it to mean what it clearly says and would turn off all tracking, rather than just turning off targeted (i.e. behavioral) advertising. The latter will undoubtedly take some major convincing—or regulatory pressure.
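
The server side of honoring such a request is, at least technically, trivial. The sketch below checks for the header and skips logging when it is present; the header names reflect the "DNT: 1" form Mozilla shipped and the X-Do-Not-Track variant mentioned below, while everything else (the port, the response) is invented for the example:

    # Sketch of a server honoring a do-not-track request; only the header
    # names are real, the rest is illustrative.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    def tracking_allowed(headers):
        return headers.get("DNT") != "1" and headers.get("X-Do-Not-Track") != "1"

    class AdServer(BaseHTTPRequestHandler):
        def do_GET(self):
            if tracking_allowed(self.headers):
                self.log_message("recording visit from %s", self.client_address[0])
            # else: serve the content without recording the clickstream
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(b"ad content\n")

    if __name__ == "__main__":
        HTTPServer(("localhost", 8080), AdServer).serve_forever()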

Using an HTTP header for this purpose is a far superior technical solution in that users (or their browsers) don't have to keep track of lists of advertisers and their cookies, while clearly indicating to the web sites that the user has requested that tracking be disabled. No new cookies need to be installed or preserved and violators will be fairly easily spotted. While the EFF has made it clear that it is backing the Do Not Track header approach, there are still several groups that will need to be convinced: advertising networks, tracking companies, and browser makers (some of which run their own ad networks: Google and Apple).

There are already Firefox extensions that implement the X-Do-Not-Track header (and the related X-Behavioral-Ad-Opt-Out header), such as Universal Behavioral Advertising Opt-out and NoScript, but, for now at least, they are just "feel good" extensions. It remains to be seen if the NAI and other advertisers/trackers start to handle these headers. One might guess they would be resistant—probably will be—but there's no real reason to believe that users would opt out in droves. There are also reasonable arguments that Do Not Track will have a minimal impact on online advertising.

Of course, even if there were, miraculously, full adoption by advertisers or, rather less miraculously, regulations from the US Federal Trade Commission (FTC) and other, similar, agencies that require advertisers to adopt it, there would still be some amount of tracking. Whether those violators are outside of the FTC's jurisdiction or just flying below the radar, clickstream information has value and there will always be those trying to extract that value. Unfortunately, there doesn't seem to be any possible technical—or regulatory—solution to that particular problem.

Comments (7 posted)

Brief items

Security quotes of the week

Advocates for data retention typically focus narrowly on the benefits afforded to law enforcement without accounting for the massive costs and extreme security risks that come with storing significant quantities of data about every Internet user — databanks that will prove to be irresistible not only to government investigators but also civil litigants (read: ex-spouses, insurance companies, disgruntled neighbors) and malicious hackers of every stripe. A legal obligation to log users' Internet use, paired with weak federal privacy laws that allow the government to easily obtain those records, would dangerously expand the government's ability to surveil its citizens, damage privacy, and chill freedom of expression.
-- Electronic Frontier Foundation in its Deeplinks blog

We first jumped on the OpenID bandwagon back in 2007 when it was seen as a promising way to make logging into websites simpler. What we've learned over the past three years is that it didn't actually make anything any simpler for the vast majority of our customers. Instead it just made things harder. Especially when people were having problems with the often flaky OpenID providers and couldn't log into their account. OpenID has been a burden on support since the day it was launched.
-- 37signals drops OpenID support

Comments (12 posted)

EFF: Don't Sacrifice Security on Mobile Devices

The Electronic Frontier Foundation has sent out a release on mobile device security, noting that open devices can be made more secure even if the original vendor is not interested. "By contrast, mobile systems lag far behind the established industry standard for open disclosure about problems and regular patch distribution. For example, Google has never made an announcement to its android-security-announce mailing list, although of course they have released many patches to resolve many security problems, just like any OS vendor. But Android open source releases are made only occasionally and contain security fixes unmarked, in among many other fixes and enhancements."

Comments (22 posted)

New vulnerabilities

awstats: arbitrary code injection

Package(s):awstats CVE #(s):CVE-2010-4369
Created:January 24, 2011 Updated:February 21, 2011
Description: From the Ubuntu advisory:

It was discovered that AWStats did not correctly filter the LoadPlugin configuration option. A local attacker on a shared system could use this to inject arbitrary code into AWStats.

Alerts:
Mandriva MDVSA-2011:033 awstats 2011-02-21
Ubuntu USN-1047-1 awstats 2011-01-24

Comments (none posted)

dpkg: symlink attack

Package(s):dpkg CVE #(s):CVE-2011-0402
Created:January 24, 2011 Updated:January 26, 2011
Description: From the CVE entry:

dpkg-source in dpkg before 1.14.31 and 1.15.x allows user-assisted remote attackers to modify arbitrary files via a symlink attack on unspecified files in the .pc directory.

Alerts:
Fedora FEDORA-2011-0362 dpkg 2011-01-13
Fedora FEDORA-2011-0345 dpkg 2011-01-13

Comments (none posted)

fuse: denial of service

Package(s):fuse CVE #(s):CVE-2010-3879
Created:January 20, 2011 Updated:April 29, 2013
Description:

From the Ubuntu advisory:

It was discovered that FUSE could be tricked into incorrectly updating the mtab file when mounting filesystems. A local attacker, with access to use FUSE, could unmount arbitrary locations, leading to a denial of service.

Alerts:
Mandriva MDVSA-2013:155 fuse 2013-04-29
Mageia MGASA-2012-0339 fuse 2012-11-23
Scientific Linux SL-fuse-20110720 fuse 2011-07-20
Red Hat RHSA-2011:1083-01 fuse 2011-07-20
SUSE SUSE-SR:2011:005 hplip, perl, subversion, t1lib, bind, tomcat5, tomcat6, avahi, gimp, aaa_base, build, libtiff, krb5, nbd, clamav, aaa_base, flash-player, pango, openssl, subversion, postgresql, logwatch, libxml2, quagga, fuse, util-linux 2011-04-01
openSUSE openSUSE-SU-2011:0264-1 fuse 2011-03-31
openSUSE openSUSE-SU-2011:0265-1 fuse 2011-03-31
Fedora FEDORA-2011-0854 util-linux-ng 2011-01-28
Ubuntu USN-1045-2 util-linux 2011-01-19
Ubuntu USN-1045-1 fuse 2011-01-19

Comments (none posted)

libuser: default user password

Package(s):libuser CVE #(s):CVE-2011-0002
Created:January 20, 2011 Updated:April 21, 2011
Description:

From the Red Hat advisory:

It was discovered that libuser did not set the password entry correctly when creating LDAP (Lightweight Directory Access Protocol) users. If an administrator did not assign a password to an LDAP based user account, either at account creation with luseradd, or with lpasswd after account creation, an attacker could use this flaw to log into that account with a default password string that should have been rejected. (CVE-2011-0002)

Alerts:
CentOS CESA-2011:0170 libuser 2011-02-04
Mandriva MDVSA-2011:019 libuser 2011-01-26
Fedora FEDORA-2011-0320 libuser 2011-01-12
Fedora FEDORA-2011-0316 libuser 2011-01-12
Red Hat RHSA-2011:0170-01 libuser 2011-01-20
CentOS CESA-2011:0170 libuser 2011-04-20

Comments (none posted)

openoffice.org: multiple vulnerabilities

Package(s):openoffice.org CVE #(s):CVE-2010-3450 CVE-2010-3451 CVE-2010-3452 CVE-2010-3453 CVE-2010-3454 CVE-2010-3689 CVE-2010-4253 CVE-2010-4643
Created:January 26, 2011 Updated:May 9, 2011
Description: From the Debian advisory:

During an internal security audit within Red Hat, a directory traversal vulnerability has been discovered in the way OpenOffice.org 3.1.1 through 3.2.1 processes XML filter files. If a local user is tricked into opening a specially-crafted OOo XML filters package file, this problem could allow remote attackers to create or overwrite arbitrary files belonging to local user or, potentially, execute arbitrary code. (CVE-2010-3450)

During his work as a consultant at Virtual Security Research (VSR), Dan Rosenberg discovered a vulnerability in OpenOffice.org's RTF parsing functionality. Opening a maliciously crafted RTF document can cause an out-of-bounds memory read into previously allocated heap memory, which may lead to the execution of arbitrary code. (CVE-2010-3451)

Dan Rosenberg discovered a vulnerability in the RTF file parser which can be leveraged by attackers to achieve arbitrary code execution by convincing a victim to open a maliciously crafted RTF file. (CVE-2010-3452)

As part of his work with Virtual Security Research, Dan Rosenberg discovered a vulnerability in the WW8ListManager::WW8ListManager() function of OpenOffice.org that allows a maliciously crafted file to cause the execution of arbitrary code. (CVE-2010-3453)

As part of his work with Virtual Security Research, Dan Rosenberg discovered a vulnerability in the WW8DopTypography::ReadFromMem() function in OpenOffice.org that may be exploited by a maliciously crafted file, allowing an attacker to control program flow and potentially execute arbitrary code. (CVE-2010-3454)

Dmitri Gribenko discovered that the soffice script does not treat an empty LD_LIBRARY_PATH variable like an unset one, which may lead to the execution of arbitrary code. (CVE-2010-3689)

A heap based buffer overflow has been discovered with unknown impact. (CVE-2010-4253)

A vulnerability has been discovered in the way OpenOffice.org handles TGA graphics; a specially crafted TGA file could cause the program to crash due to a heap-based buffer overflow, with unknown impact. (CVE-2010-4643)

Alerts:
Gentoo 201408-19 openoffice-bin 2014-08-31
SUSE SUSE-SR:2011:007 NetworkManager, OpenOffice_org, apache2-slms, dbus-1-glib, dhcp/dhcpcd/dhcp6, freetype2, kbd, krb5, libcgroup, libmodplug, libvirt, mailman, moonlight-plugin, nbd, openldap2, pure-ftpd, python-feedparser, rsyslog, telepathy-gabble, wireshark 2011-04-19
openSUSE openSUSE-SU-2011:0337-1 libreoffice 2011-04-18
openSUSE openSUSE-SU-2011:0336-1 libreoffice 2011-04-18
CentOS CESA-2011:0182 openoffice.org 2011-05-07
Fedora FEDORA-2011-0837 openoffice.org 2011-01-27
Mandriva MDVSA-2011:027 openoffice.org 2011-02-14
Pardus 2011-34 openoffice 2011-02-12
CentOS CESA-2011:0181 openoffice.org 2011-02-04
Ubuntu USN-1056-1 openoffice.org 2011-02-02
Red Hat RHSA-2011:0183-01 openoffice.org 2011-01-28
Red Hat RHSA-2011:0182-01 openoffice.org 2011-01-28
Red Hat RHSA-2011:0181-01 openoffice.org 2011-01-28
Debian DSA-2151-1 openoffice.org 2011-01-26

Comments (none posted)

perl-Convert-UUlib: buffer overflow

Package(s):perl-Convert-UUlib CVE #(s):
Created:January 20, 2011 Updated:January 26, 2011
Description:

From the Fedora advisory:

Fix a one-byte-past-end-write buffer overflow in UURepairData (reported, analysed and testcase provided by Marco Walther)

Alerts:
Fedora FEDORA-2011-0052 perl-Convert-UUlib 2011-01-03
Fedora FEDORA-2011-0062 perl-Convert-UUlib 2011-01-03

Comments (none posted)

request-tracker: unsalted password hashing

Package(s):request-tracker3.6 CVE #(s):CVE-2011-0009
Created:January 24, 2011 Updated:May 25, 2012
Description: From the Debian advisory:

It was discovered that Request Tracker, an issue tracking system, stored passwords in its database using an insufficiently strong hashing method. An attacker with access to the password database could decode the passwords stored in it.

Alerts:
Debian DSA-2150-1 request-tracker3.6 2011-01-22

Comments (none posted)

tomcat: cross-site scripting

Package(s):tomcat6 CVE #(s):CVE-2010-4172
Created:January 24, 2011 Updated:May 19, 2011
Description: From the Ubuntu advisory:

It was discovered that Tomcat did not properly escape certain parameters in the Manager application which could result in browsers becoming vulnerable to cross-site scripting attacks when processing the output. With cross-site scripting vulnerabilities, if a user were tricked into viewing server output during a crafted server request, a remote attacker could exploit this to modify the contents, or steal confidential data (such as passwords), within the same domain.

Alerts:
Gentoo 201206-24 tomcat 2012-06-24
Red Hat RHSA-2011:0791-01 tomcat6 2011-05-19
SUSE SUSE-SR:2011:003 gnutls, tomcat6, perl-CGI-Simple, pcsc-lite, obs-server, dhcp, java-1_6_0-openjdk, opera 2011-02-08
openSUSE openSUSE-SU-2011:0082-2 tomcat6 2011-02-03
openSUSE openSUSE-SU-2011:0082-1 tomcat6 2011-01-28
Ubuntu USN-1048-1 tomcat6 2011-01-24

Comments (none posted)

wordpress: cross-site scripting

Package(s):wordpress CVE #(s):CVE-2010-4536
Created:January 20, 2011 Updated:January 26, 2011
Description:

From the Red Hat bugzilla entry:

A cross-site scripting (XSS) flaw was found in KSES, which is the WordPress HTML sanitization library.

Alerts:
Fedora FEDORA-2011-0315 wordpress 2011-01-12
Fedora FEDORA-2011-0306 wordpress 2011-01-12

Comments (none posted)

wordpress-mu: multiple cross-site scripting vulnerabilities

Package(s):wordpress-mu CVE #(s):CVE-2010-4536
Created:January 24, 2011 Updated:January 26, 2011
Description: From the CVE entry:

Multiple cross-site scripting (XSS) vulnerabilities in KSES, as used in WordPress before 3.0.4, allow remote attackers to inject arbitrary web script or HTML via vectors related to (1) the & (ampersand) character, (2) the case of an attribute name, (3) a padded entity, and (4) an entity that is not in normalized form.

Alerts:
Fedora FEDORA-2011-0352 wordpress-mu 2011-01-13
Fedora FEDORA-2011-0335 wordpress-mu 2011-01-13

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 2.6.38-rc2, released on January 21. "Anyway. -rc2 is out there, and the only reason it's reasonably sized is that it was a short -rc2 - I think a few of the pull requests I got were a bit larger than I would have been happy with. And I might as well warn people that because the laptop I'm bringing with me is pitifully slow, I'm also planning on going into 'anal' mode, and not even bother pulling from trees unless they are clearly -rc material. IOW, don't try to push large pushes on me. I won't take them, and they can wait for 39." See the announcement for the short changelog, or the full changelog for all the details.

Stable updates: there have been no stable or longterm updates released in the last week.

Comments (none posted)

Quotes of the week

We care about everything. If the objective was to make life easier for ourselves, we'd all be on the golf course.
-- Andrew Morton

No way in hell do I want the situation of "the system is screwed, so let's overwrite the disk" to be something the kernel I release might do. It's crazy.
-- Linus Torvalds

Nice to see it gone - it seemed such a good idea in Linux 1.3
-- Alan Cox won't miss the BKL

Comments (3 posted)

The real BKL end game

By Jonathan Corbet
January 26, 2011
The removal of the big kernel lock (BKL) has been on the kernel community's "to do" list almost since that lock was first added to make the kernel work on multiprocessor systems. Over time, the significance of the lock has diminished as finer-grained locking was added to various kernel subsystems, but the BKL itself has endured. Getting rid of it for good remained desirable because the BKL can still cause unwanted latencies at times. There's also a certain amount of pride involved in completing the job. That completion has been long in coming, though; once the worst performance issues associated with the BKL were resolved, interest in doing the low-level work needed to finish the job declined.

Two years or so ago, though, developers started working on BKL removal again. Some of this work was motivated by the realtime tree, where patience with latency sources is rather more limited. Still, it seemed like completion remained a distant goal; hundreds of BKL call sites remained in the kernel.

Then Arnd Bergmann took on the task of eliminating the BKL entirely. His cleanup work has been going on for some time; if he has his way, this patch set (or something derived from it) will remove the BKL entirely in 2.6.39. To get there, about a dozen modules need to be addressed. Some of them (i830, autofs3, and smbfs) are simply to be removed. Others (appletalk and hpfs) are to be moved to the staging tree for near-term removal, though there is some resistance to that idea. The remaining modules are to be fixed in some way. Once that's taken care of, the final patch in the series removes the lock itself. It will not be missed.

Comments (24 posted)

Kernel development news

LCA: Server power management

By Jonathan Corbet
January 26, 2011
Power management is often seen as a concern mostly for embedded and mobile systems. We worry about power management on those systems because we want our phones to run for longer between recharges and our laptops to not inflict burns on our thighs. But power management is equally important for data centers, which are currently responsible for about 3% of the total power consumption in the US. Keeping the net going in the US requires about 15GW of power - the dedicated output of about 15 nuclear power plants. Clearly there would be some real value to saving some of that power. Matthew Garrett's talk during the Southern Plumbers Miniconf at linux.conf.au 2011 covered the work that is being done in that area and where Linux stands relative to other operating systems.

Much of the power consumed by data centers is not directly controllable by Linux - it is overhead which is consumed outside of the computers themselves. About one watt of power is consumed by overhead for each watt consumed by computation. This overhead includes network infrastructure and power supply loss, but the biggest component is air conditioning. So the obvious thing to do here is to create more efficient cooling and power infrastructure. Running at higher ambient temperatures, while uncomfortable for humans, can also help. The best contemporary data centers have been able to reduce their overhead to about 20% - a big improvement. Cogeneration techniques - using heat from data centers to warm buildings, for example - can reduce that overhead even further.

But we still have trouble. A 48-core system, Matthew says, will draw about 350W when it is idle; a rack full of such systems will still pull a lot of power. What can be done? Most power management attention has been focused on the CPU, which is where a lot of the power goes. As a result, an idle Intel CPU now draws approximately zero watts of power - it is "terrifying" how well it works. When the CPU is working, though, the situation is a bit different; the power consumption is about 20W per core, or about 960W for a busy 48-core system.

The clear implication is that we should keep the CPUs idle whenever possible. That can be tricky, though; it is hard to write software which does nothing. Or - as Matthew corrected himself - it's hard to write useful software which does nothing.

There are some trends which can be pointed to in this area. CPU power management is essentially solved; Linux is quite good at it. In fact, Linux is better than any other operating system with regard to CPU power; we have more time in deep idle states and fewer wakeups than others. So interest is shifting toward memory power management. If all of the CPUs in a package within the system can be idled, the associated memory controller will go idle as well. It's also possible to put memory into "self-refresh" mode if it is idle, reducing power use while preserving the contents. In other situations, running memory at a lower clock rate can reduce power usage. There will be a lot of work in this area because, at this point, memory looks like the biggest, lowest-hanging fruit.

Even more power can be saved by simply turning a system off; that is where virtualization comes into play. If applications are run on virtualized servers, those servers can be consolidated onto a small number of machines during times of low load, allowing the other machines to be powered down. There is a great deal of customer interest in this capability, but there is still work to be done; in particular, we need fast guest migration, which is a hard problem to solve.

The other hard problem is the fact that optimal power behavior may make tradeoffs which enterprise customers may be unwilling to make. Performance matters for these people, and, if that means expending more energy, they are willing to pay that cost. As an example, consider the gettimeofday() system call which, while having been ruthlessly optimized, can still be slower than some people would like. Simply reading the processor's time stamp counter (TSC) can be faster. The problem is that the TSC can become unreliable in the presence of power management. Once upon a time, changing the CPU frequency would change the rate of the TSC, but that problem has been solved by the CPU vendors for a few years now. So TSC problems are no longer an excuse to avoid lowering the clock frequency.

Unfortunately, that is not too useful, because it rarely makes sense to run a CPU at a lower frequency; best results usually come from running at full speed and spending more time in a sleep state ("C state"). But C states can stop the TSC altogether, once again creating problems for performance-sensitive users. In response, manufacturers have caused the TSC to run even when the CPU is sleeping. So, while virtualization remains a hassle, systems running on bare metal can expect the TSC to work properly in all power management states.

But that still doesn't satisfy some performance-sensitive users because deep C states create latency. It can take a millisecond to wake a CPU out of a deep sleep - that is a very long time in some applications. We have the pm_qos mechanism which can let the kernel know whether deep sleeps are acceptable at any given time, allowing power management to happen when latency is not an immediate concern. Not a perfect solution, but that may be as good as it gets for now.
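
As an aside, here is a minimal userspace sketch (not from the talk) of how an application can express that latency concern through the pm_qos-backed /dev/cpu_dma_latency interface; the ten-microsecond bound and the sleep() stand-in for real work are arbitrary choices:

    /* Ask pm_qos to keep CPU wakeup latency at or below 10 microseconds
     * for as long as this file descriptor stays open. */
    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        int32_t max_latency_us = 10;    /* arbitrary example bound */
        int fd = open("/dev/cpu_dma_latency", O_WRONLY);

        if (fd < 0) {
            perror("open /dev/cpu_dma_latency");
            return 1;
        }
        if (write(fd, &max_latency_us, sizeof(max_latency_us)) < 0) {
            perror("write");
            close(fd);
            return 1;
        }

        sleep(60);      /* stand-in for the latency-sensitive work */

        close(fd);      /* deep C states are allowed again */
        return 0;
    }

The constraint lasts only as long as the descriptor is open, so a well-behaved application holds it only while it actually needs low wakeup latency.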

Another interesting feature of contemporary CPUs is the "turbo" mode, which can allow a CPU to run in an overclocked mode for a period of time. Using this mode can get work done faster, allowing longer sleeps and better power behavior, but it depends on good power management if it is to work at all. If a core is to run in turbo mode, all other cores on the same die must be in a sleep state. The end result is that turbo mode can give good results for single-threaded workloads.

Some effort is going into powering down unused hardware components - I/O controllers, for example - even though the gains to be had in this area are relatively small. Many systems have quite a few USB ports, many of which are entirely unused. Versions 1 and 2 of the USB specification make powering down those ports hard; even worse, those ports will repeatedly wake the CPU even if nothing is plugged in. USB 3 is better in this regard.

Unfortunately, even in this case, it's hard to power down the ports because it is a feature which is poorly specified, poorly documented, and poorly implemented. The reliability of the hardware varies; Windows tends not to use the PCI power management event infrastructure, so it often simply does not work. This problem has been solved by polling the hardware once every second; that is "the least bad thing" they could come up with. The result is better power behavior, but also up to one second of latency before the system responds to the plugging-in of a new USB device. Since, as Matthew noted, that one second is probably less than the user already lost while trying to insert the plug upside-down, it shouldn't be a problem.

Similar things can be done with other types of hardware - firewire ports, audio devices, SD ports, etc. It's just a matter of figuring out how to make it work. There is also some interest in reducing the power consumption of graphics processors (GPUs), even though enterprise systems tend not to have fancy GPUs. The level of support varies from one GPU to the next, but work is being done to improve power consumption for most of them.

Work for the future includes better CPU frequency governor development; we need to do better at ramping up the processor's frequency when there is work to be done. The scheduler needs tweaks to do a better job of consolidating jobs onto one package, allowing others to be powered down. And there is the continued exploitation of other power management features in hardware; there are a lot of them that we are not using. On the other hand, others are not using those features either, so they probably do not work.

In summary: Linux is doing pretty well with regard to enterprise-level power management; the GPU is the only place where we perform worse than Windows does. But we can always do better, so work will continue in that direction.

Comments (8 posted)

Concurrent code and expensive instructions

January 26, 2011

This article was contributed by Paul McKenney

Introduction

Symmetric multiprocessing (SMP) code often requires expensive instructions, including atomic operations and memory barriers, and often causes expensive cache misses. Yet some SMP code can be extremely cheap and fast, using no expensive instructions at all. Examples of cheap SMP code include per-CPU counters and RCU read-side critical sections. So why can't all SMP code be cheap? Is it just that we aren't smart enough to spot clever ways of implementing other algorithms, for example, concurrent stacks and queues? Is it that we might be able to implement concurrent stacks and queues without expensive instructions, but only at the cost of mind-bending complexity? Or is it simply impossible to implement concurrent stacks and queues without using expensive instructions?

My traditional approach has been to place my faith in two observations: (1) if you beat your head against a wall long enough, one of two things is bound to happen, and (2) I have a hard head. Although this approach has worked well, something less painful would be quite welcome. And so it was with great interest that I read a paper entitled "Laws of Order: Expensive Synchronization in Concurrent Algorithms Cannot be Eliminated" by Attiya et al., with the “et al.” including Maged Michael, whom I have had the privilege of working with for quite some time.

It is important to note that the title overstates the paper's case somewhat. Yes, the paper does present some laws requiring that many concurrent algorithms use expensive instructions; however, all laws have their loopholes, including the Laws of Order. So while we do need to understand the Laws of Order, we most especially need to understand how to fully exploit their loopholes.

To arrive at the Laws of Order, this paper first expands the definition of commutativity to include sequential composition, which in the C language can best be thought of as the “;” operator. In this case, commutativity depends not just on the operator, but on the operands, which for our purposes can be thought of as calls to arbitrary C functions. For example, the statements:

    atomic_inc(&x); 
    atomic_inc(&y);

are commutative: the values of x and y are the same regardless of the order of execution. In contrast:

    atomic_set(&x, 1); 
    atomic_set(&x, 2);

are non-commutative: the value of x will be either 1 or 2, depending on which executes first.

These examples execute sequentially, but the paper considers concurrent execution. To see this, consider a concurrent set that has these operations:

  • a set-member-addition function (call it set_add()) that returns an indication of whether the element to be added was already in the set,

  • a set-member-test function (call it set_member()), and

  • a set-member-removal function (call it set_remove()) that returns an indication of whether anything had actually been removed.

Then concurrently testing two distinct members is commutative: the order in which set_member(&s, 1) and set_member(&s, 2) execute will not affect the return values from either, and the final value of set s will be the same in either case. Therefore, it is not necessary for the two invocations to coordinate with each other. The fact that coordination is not required means that there is some hope that expensive instructions are not needed to implement set_member().

In contrast, concurrent invocation of set_add(&s, 1) and set_member(&s, 1) would not be commutative if the set s initially did not contain the value 1. The set_member() invocation would return true only if it executed after the set_add(). Some coordination between the two functions is clearly required.

The most important results of the paper rely on a strong form of non-commutativity, which is strangely enough termed “strong non-commutativity”, which we can abbreviate to “SNC”. The example in the previous paragraph is not SNC because, while set_add() can affect set_member(), the reverse is not the case. In contrast, an SNC pair of functions would each affect the other's result. For example, consider set_add(&s, 1) and set_remove(&s, 1), where the set s is initially empty. If set_add(&s, 1) executes first, then both functions will indicate success, and set s will be empty. On the other hand, if set_remove(&s, 1) executes first, then only the set_add(&s, 1) will indicate success and the set s will contain 1. In this case, the return value of set_remove() is affected by the order of execution. On the other hand, if the set s initially contains 1, it will be set_add(&s, 1) whose return value is affected. Therefore, the order of execution can affect the return value of both functions, and these functions are therefore SNC.
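
The paper and the examples above do not spell out concrete signatures, so here is a hypothetical sketch of the set API being assumed (the names and types are my own, not the paper's); the return values are what the SNC analysis turns on:

    #include <stdbool.h>

    struct set;     /* representation left unspecified */

    /* Adds the element; returns whether it was already in the set. */
    bool set_add(struct set *s, int element);

    /* Returns whether the element is currently in the set. */
    bool set_member(struct set *s, int element);

    /* Removes the element; returns whether anything was actually removed. */
    bool set_remove(struct set *s, int element);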

Quick Quiz 1: Is atomic_add_return() SNC? In other words, are multiple concurrent calls to this function SNC?
Answer

The key result of the paper is that under certain conditions, the implementation of a pair of SNC functions must contain a heavyweight instruction, where a “heavyweight instruction” can either be an atomic read-modify-write instruction or a heavyweight memory barrier. In the Linux kernel, only smp_mb() qualifies as a heavyweight memory barrier.

The “certain conditions” are:

  1. Both functions in the pair must be deterministic, in other words, the final state (including return values) must be a strict function of the initial state and order of execution.

  2. The functions must be linearizable.

Interestingly enough, although the paper requires that the implementation of an SNC, deterministic, and linearizable pair of functions each contain at least one heavyweight instruction, it does not require that this instruction be executed on each invocation.

Quick Quiz 2: Imagine an increment function that is not permitted to lose counts even when multiple invocations execute concurrently, and that does not return the value of the counter. Must the implementation of such a function contain an atomic read-modify-write instruction or a heavyweight memory barrier?
Answer

So if we want our code to run fast, we have four ways to avoid heavyweight instructions:

  1. Formulate the API to be non-SNC.

  2. Design the implementation so that any required heavyweight instructions almost never need to actually be executed.

  3. Accept non-determinism.

  4. Accept non-linearizability. The paper ignores this possibility, possibly due to the common academic view that non-linearizable algorithms are by definition faulty.

Interestingly enough, relativistic programming has long suggested use of several of these approaches to attain good performance and scalability. The “Laws of Order” therefore provides a good theoretical basis for understanding why relativistic programming is both desirable and necessary.

Let's take a look at some examples, starting with a memory allocator. Given that concurrent calls to kmalloc() are not supposed to return a pointer to the same block of memory, we have to conclude that kmalloc() is SNC and thus might need heavyweight instructions in its implementation.

Quick Quiz 3: How can we avoid the use of heavyweight instructions in the implementation of kmalloc? If it turns out to be impossible to completely avoid their use, how can we reduce the frequency of their execution?
Answer

The second example is of course RCU. Let's focus on the rcu_read_lock(), rcu_read_unlock(), synchronize_rcu(), rcu_dereference() and rcu_assign_pointer() API members. The rcu_read_lock() function is unaffected by any of the other members, so any pair that includes rcu_read_lock() is non-SNC, which is why this function need not include any heavyweight instructions. The same is true of rcu_read_unlock().

Interestingly enough, synchronize_rcu() is affected by both rcu_read_lock() and rcu_read_unlock(), in that the former can prevent synchronize_rcu() from returning and the latter can enable it to return. However, neither rcu_read_lock() nor rcu_read_unlock() is affected by synchronize_rcu(). This means that synchronize_rcu() is non-SNC and might therefore have an implementation that does not use heavyweight instructions. However, such an implementation seems quite implausible if you include the actions of the updater both before and after the call to synchronize_rcu() in conjunction with the RCU read-side critical section. The paper, though, considers only data flowing via those functions' arguments and return values. It would be interesting to see a generalization of this work that includes side effects.
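
For reference, here is a conventional usage sketch of those API members (it is not taken from the paper; gp, gp_lock, struct foo, and the function names are hypothetical). The updater's actions before and after synchronize_rcu() are exactly the side effects mentioned above:

    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>

    struct foo {
        int a;
    };
    static struct foo *gp;              /* RCU-protected pointer */
    static DEFINE_SPINLOCK(gp_lock);

    /* Reader: no heavyweight instructions on common architectures. */
    static int reader(void)
    {
        struct foo *p;
        int ret = -1;

        rcu_read_lock();
        p = rcu_dereference(gp);
        if (p)
            ret = p->a;
        rcu_read_unlock();
        return ret;
    }

    /* Updater: publish the new version, wait for pre-existing readers,
     * then free the old version. */
    static void updater(struct foo *newp)
    {
        struct foo *old;

        spin_lock(&gp_lock);
        old = gp;
        rcu_assign_pointer(gp, newp);
        spin_unlock(&gp_lock);

        synchronize_rcu();              /* wait for readers to finish */
        kfree(old);                     /* no reader can still see it */
    }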

My guess is that for a given code fragment to be non-SNC, any conceivable API would need to be non-SNC. If my guess is correct, then the full RCU update is non-SNC with respect to any RCU read-side critical section containing rcu_dereference(). The reasoning is that the return value from rcu_dereference() can be affected by the RCU update, and the duration for which synchronize_rcu() blocks can be affected by rcu_read_lock() and rcu_read_unlock().

Quick Quiz 4: Are there any conditions in which rcu_read_unlock() will be SNC with respect to synchronize_rcu()?
Answer

Finally, let us look at the set implementation that includes set_add(), set_member(), and set_remove(). We saw that set_add() and set_remove() were SNC.

Quick Quiz 5: Is there any way to implement set_add() and set_remove() without using heavyweight instructions?
Answer

Of course, this paper does have a few shortcomings, many of which fall under the rubric of “future work”:

  1. The paper describes the theoretical limitations at great length, but does not describe many ways of avoiding them. However, I am quite confident that the Linux kernel community will be more than able to produce good software engineering solutions that work around these limitations. In fact, there is a lot to be said for letting the theoreticians worry about limitations and letting us hackers worry about solving problems in spite of those limitations.

  2. The paper focuses almost exclusively on reordering carried out by the CPU. It turns out that reordering due to compiler optimizations can be at least as “interesting” as CPU reordering. These sorts of compiler optimizations are allowed by the current C-language standard, which permits the compiler to assume that there is only one thread in the address space. Within the Linux kernel, the barrier() directive restricts the compiler's ability to move code, and this directive (or its open-coded equivalent) is used in locking primitives, atomic operations, and memory barriers. (A sketch of barrier()'s one-line definition appears after this list.)

  3. There is some uncertainty about exactly what properties of code must be SNC for this paper's results to hold. The paper focuses almost exclusively on function arguments and return values, but my guess is that the list of properties is quite general. For example, an unconditional lock-acquisition primitive certainly seems like it should be covered by this paper's result, but such primitives do not return a value. Can the fact that the second of two concurrent acquisitions simply fails to return be considered to be evidence of the SNC nature of lock acquisition? If not, exactly why not? If so, exactly what is the set of effects that must be taken into account when judging whether or not this code fragment is SNC?

    This seems to be a future-work topic.

  4. A bit of thought about the results of this paper gives clear reasons why it is often so hard to parallelize existing sequential code. Sequential code inflicts no penalties for the use of SNC APIs, so SNC APIs can be expected to appear in sequential code even when a non-SNC API might have served just as well. After all, what programmer could resist the temptation to make set_add() return an indication of whether the element was already in the set? The paper would have done well to state this point clearly.

  5. The paper fails to call out non-linearizability as a valid loophole to its laws of order.

  6. An interesting open question: What are the consequences of using one of the loopholes of the laws of order? In my limited personal experience, leveraging non-linearizability and privatization permits full generality (for example, parallel memory allocators), while leveraging non-SNC and non-determinism results in specialized algorithms (for example, RCU). It would be quite interesting to better understand any theoretical and software-engineering limitations imposed by these loopholes.

  7. The paper overreaches a bit when it states that:

    For synthesis and verification of concurrent algorithms, our result is potentially useful in the sense that a synthesizer or a verifier need not generate or attempt to verify algorithms that do not use RAW [smp_mb()] and AWAR [atomic read-modify-write operations] for they are certainly incorrect.

    As we have seen, it is perfectly legal for a concurrent algorithm to avoid use of these operations as long as that algorithm is either: (1) non-SNC, (2) non-deterministic, or (3) non-linearizable. There are a few other places where the limitations on the main result are not stated as carefully as they should be. Given that the rest of the paper seems quite accurate and on-point, I would guess that this sentence is simply an honest error that slipped through the peer-review process. We all make mistakes.
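
As promised in point 2 above, the barrier() directive is, in the kernel's GCC case, essentially nothing more than an empty asm statement with a "memory" clobber; it constrains the compiler but emits no instructions and imposes no ordering on the CPU:

    /* Essentially the definition in include/linux/compiler-gcc.h: the
     * compiler may not move memory accesses across this point, but the
     * CPU still can. */
    #define barrier() __asm__ __volatile__("" : : : "memory")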

Although I hope that these shortcomings will be addressed, I hasten to add that they are insignificant compared to the huge step forward that this paper represents.

In summary, the “Laws of Order” paper shines some much-needed light on the question of whether heavyweight instructions are needed to implement a given concurrent algorithm. Although I am not going to say that this paper fully captures my parallel-programming intuition, I am quite happy that it does land within a timezone or two, which represents a great improvement over previous academic papers. But the really good news is that the limitations called out in this paper have some interesting loopholes that can be exploited in many cases. If the Linux kernel community pays careful attention to both the limitations and the loopholes called out in this paper, I am confident that the community's already-impressive parallel-programming capabilities will become even more formidable.

Acknowledgments

I owe thanks to Maged Michael, Josh Triplett, and Jon Walpole for illuminating discussions and for their review of this paper, and to Jim Wasko for his support of this effort.

Legal Statement

This work represents the view of the author and does not necessarily represent the view of IBM.

Linux is a registered trademark of Linus Torvalds.

Other company, product, and service names may be trademarks or service marks of others.

Answers to Quick Quizzes

Quick Quiz 1: Is atomic_add_return() SNC? In other words, are multiple concurrent calls to this function SNC?

Answer: Yes. Suppose that an atomic_t variable named a is initially zero and that a pair of concurrent atomic_add_return(1, &a) calls execute. Since atomic_add_return() returns the new value of the variable, the first call to execute will return one, and the second will return two. Each instance's return value is therefore affected by the order of execution, which indicates strong non-commutativity.

This may seem strange, given that addition is commutative. And in fact the final value of a will be two regardless of order of execution.

To see the reasoning behind the definition of SNC, consider atomic_inc(&a), which also adds one to a but does not return the initial value. In this case, because there are no return values, the invocations of atomic_inc(&a) cannot possibly affect each others' return values.

Therefore, atomic_inc(&a) is non-SNC.

It is interesting to note that the designers of the Linux kernel's suite of atomic operations had an intuitive understanding of the results of this paper. The atomic operations that return a value (and thus are more likely to be SNC) are the ones that are required to provide full memory ordering.

Back to Quick Quiz 1.

Quick Quiz 2: Imagine an increment function that is not permitted to lose counts even when multiple invocations execute concurrently, and that does not return the value of the counter. Must the implementation of such a function contain an atomic read-modify-write instruction or a heavyweight memory barrier?

Answer: No. Although this function is deterministic and linearizable, it is non-SNC. And in fact such a function could be implemented via a “split counter” that uses per-CPU non-atomic variables. Because each CPU increments only its own variable, counts are never lost. To get the aggregate value of the counter, simply sum up the individual per-CPU variables.

Of course, it might be necessary to disable preemption and/or interrupts across the increments, but such disabling requires neither atomic read-modify-write instructions nor heavyweight memory barriers.
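
A kernel-flavored sketch of such a split counter (using the standard per-CPU primitives; the names my_count, counter_inc(), and counter_read() are otherwise made up) could look like this:

    #include <linux/percpu.h>

    static DEFINE_PER_CPU(unsigned long, my_count);

    /* Never loses counts: each CPU increments only its own variable.
     * this_cpu_inc() is safe against preemption and interrupts, and on
     * common architectures uses neither a locked read-modify-write
     * instruction nor a heavyweight memory barrier. */
    static void counter_inc(void)
    {
        this_cpu_inc(my_count);
    }

    /* Readers sum the per-CPU variables to obtain the aggregate value. */
    static unsigned long counter_read(void)
    {
        unsigned long sum = 0;
        int cpu;

        for_each_possible_cpu(cpu)
            sum += per_cpu(my_count, cpu);
        return sum;
    }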

However, the linearizability of this function depends on the counter always being incremented by the value 1. To see this, imagine a counter with an initial value of zero to which three CPUs are concurrently adding the values 3, 5, and 7, and that meanwhile three other CPUs are reading out the counter's value. Because there are no ordering guarantees, these three other CPUs might see the additions in any order. One of these CPUs might add the per-CPU variables and obtain a sum of 3, another might obtain a sum of 5, and the third might obtain a sum of 7. These three results are not consistent with any possible ordering of the additions, so this counter is not linearizable.

However, for a great many uses, this lack of linearizability is not a problem.

Back to Quick Quiz 2.

Quick Quiz 3: How can we avoid the use of heavyweight instructions in the implementation of kmalloc()? If it turns out to be impossible to completely avoid their use, how can we reduce the frequency of their execution?

Answer: The usual approach is to observe that a given pair of kmalloc() invocations will be SNC only if it is possible for them to be satisfied by the same block of memory. The usual way to greatly reduce the probability of a pair of kmalloc() invocations fighting over the same block of memory is to maintain per-CPU pools of memory blocks, which is what the Linux kernel's implementation of kmalloc() actually does. Heavyweight instructions are executed only if a given CPU's pool either becomes exhausted or overflows. This approach is related to the paper's suggestion of using “single-owner” algorithms.
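
A greatly simplified, hypothetical sketch of that structure (it is in no way the kernel's actual kmalloc() implementation) might look like the following; only the refill path pays for an atomic operation, via the spinlock:

    #include <linux/percpu.h>
    #include <linux/spinlock.h>

    struct block {
        struct block *next;
    };

    struct cpu_pool {
        struct block *free;     /* freelist touched only by its own CPU */
    };

    static DEFINE_PER_CPU(struct cpu_pool, pool);
    static DEFINE_SPINLOCK(global_lock);
    static struct block *global_free;

    static void *alloc_block(void)
    {
        struct cpu_pool *p = &get_cpu_var(pool);    /* disables preemption */
        struct block *b = p->free;

        if (b) {
            /* Fast path: no atomic instructions, no memory barriers. */
            p->free = b->next;
        } else {
            /* Slow path: the local pool is exhausted, so take a block
             * from the global pool under a lock (an atomic operation). */
            spin_lock(&global_lock);
            b = global_free;
            if (b)
                global_free = b->next;
            spin_unlock(&global_lock);
        }
        put_cpu_var(pool);
        return b;       /* may be NULL if everything is exhausted */
    }

Freeing would similarly push blocks onto the local list, spilling excess blocks back to the global pool past some watermark - the "overflow" case mentioned above.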

It might be possible to avoid heavyweight instructions by introducing non-determinism, for example, by making kmalloc() randomly fail. This can certainly be accomplished by making kmalloc() unconditionally return NULL if the CPU's pool was exhausted, but such an implementation might not prove to be fully satisfactory to its users. Coming up with a reasonable implementation that uses non-determinism to avoid heavyweight instructions is left as an exercise for the adventurous reader.

Similarly, eliminating heavyweight instructions by introducing non-linearizability is left as an exercise for the adventurous reader.

Back to Quick Quiz 3.

Quick Quiz 4: Are there any conditions in which rcu_read_unlock() will be SNC with respect to synchronize_rcu()?

Answer: In some implementations of RCU, synchronize_rcu() can interact directly with rcu_read_unlock() when the grace period has extended too long, either via force_quiescent_state() machinations or via RCU priority boosting. In these implementations, rcu_read_unlock() will be SNC with respect to synchronize_rcu(). The Linux kernel's Preemptible Tree RCU is an example of such an implementation, as can be seen by examining the rcu_read_unlock_special() function in kernel/rcutree_plugin.h. This code executes rarely, thus using the second loophole called out above (“Design the implementation so that any required heavyweight instructions almost never need to actually be executed”).

Back to Quick Quiz 4.

Quick Quiz 5: Is there any way to implement set_add() and set_remove() without using heavyweight instructions?

Answer: This can be done easily for sets containing small integers if there is no linearizability requirement. The set is represented as a dense array of bytes so that each potential member of the set maps to a specific byte. The set_add() function would set the corresponding byte to one, the set_remove() function would clear the corresponding byte to zero, and the set_member() function would test the corresponding byte for non-zero.
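
A minimal sketch of that byte-array representation, using the hypothetical API signatures from earlier in the article (the array size is an arbitrary choice), could look like this:

    #include <stdbool.h>

    #define SET_MAX 256     /* small integers only: 0 .. SET_MAX-1 */

    struct set {
        unsigned char present[SET_MAX];     /* one byte per potential member */
    };

    /* Plain byte loads and stores only: no atomic read-modify-write
     * instructions and no memory barriers anywhere.  The "was it there
     * before?" return values are therefore only best-effort when calls
     * race on the same element, and different CPUs may disagree about the
     * order of additions and removals - the accepted non-linearizability. */
    bool set_add(struct set *s, int element)
    {
        bool was_member = s->present[element] != 0;

        s->present[element] = 1;
        return was_member;
    }

    bool set_remove(struct set *s, int element)
    {
        bool was_member = s->present[element] != 0;

        s->present[element] = 0;
        return was_member;
    }

    bool set_member(struct set *s, int element)
    {
        return s->present[element] != 0;
    }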

This implementation is non-linearizable because different CPUs might well disagree on the order that members were added to and removed from the set.

Back to Quick Quiz 5.

Comments (11 posted)

A tale of two SCSI targets

January 22, 2011

This article was contributed by Goldwyn Rodrigues

At the end of 2010, the LIO project was chosen to replace STGT as the in-kernel SCSI target implementation. There were two main contenders (LIO and SCST) which tried to get their code into the Linux kernel tree. This article will compare the two projects and try to describe what these implementations have to offer.

What are SCSI targets?

The SCSI subsystem uses a sort of client-server model. Typically a computer is the client or "initiator," requesting that blocks be written to or read from a "target," which is usually a data storage device. The SCSI target subsystem enables a computer node to behave as a SCSI storage device, responding to storage requests by other SCSI initiator nodes. This opens up the possibility of creating custom SCSI devices and putting intelligence behind the storage.

An example of an intelligent SCSI target is Data Domain's online backup appliance, which supports de-duplication (thus saving space). The appliance, functioning as a SCSI target, is a computer node which intelligently writes only those blocks which are not already stored, and increases the reference counts of the blocks which are already present, thus writing only the blocks which have changed since the last backup. On the other side of the SCSI link, the initiator sees the appliance as a normal, shared SCSI storage device and uses its regular backup application to write to the target.

The most common implementation of the SCSI target subsystem is an iSCSI server, which uses a standard TCP/IP encapsulation of SCSI to export a SCSI device over the network. Most SCSI target projects started with the idea of supporting iSCSI targets before supporting other protocols. Since only a network interface is needed to act as both an iSCSI initiator and an iSCSI target, supporting iSCSI doesn't require any special hardware beyond a network port, which almost every computer has these days. However, most SCSI targets can be supported with existing initiator cards, so if you have a Fibre, SAS, or Parallel SCSI card, it should be possible to use one of the SCSI target projects to make your computer into a SCSI target for the particular SCSI bus supported by the card.

Current Status

The Linux kernel SCSI subsystem currently uses STGT to implement the SCSI target functionality; STGT was introduced into the Linux kernel at the end of 2006 by Fujita Tomonori. It has a library in the kernel which assists the in-kernel target drivers. All target processing happens in user space, which may lead to performance bottlenecks.

Two out-of-tree kernel SCSI target solutions were contenders to replace STGT: LIO and SCST. SCST has been pushing to be included in the Linux kernel since at least 2008. It was decided then that the STGT project could serve the kernel for a little longer. As time passed, the design limitations of STGT were encountered and a replacement sought. The main criteria for a replacement SCSI target subsystem defined by James Bottomley, the SCSI maintainer, were:

  1. That it would be a drop in replacement for STGT (our current in-kernel target mode driver), since there is room for only one SCSI target infrastructure.
  2. That it used a modern sysfs-based control and configuration plane.
  3. That the code was reviewed as clean enough for inclusion.

The first condition proved to be too restrictive; it was not possible to avoid breaking the ABI entirely. So the current goal, instead, is to find a way to gracefully transition STGT users to the new interface.

Hints of LIO replacing the STGT project came in the 2010 Linux Storage and Filesystem Summit. Christoph Hellwig volunteered to review and clean up the code; he managed to reduce the code-base by around 10,000 lines to make it ready to merge into the kernel.

Comparison

Both projects have drawn comparison charts of their feature lists which are available on their respective web sites: LIO and SCST. However, before exploring the differences, let's compare the similarities. Both projects implement an in-kernel SCSI target core. They provide local SCSI targets similar to loop devices, which come in handy for using targets in virtualized environments. Both projects support iSCSI, which was one of the initial and main motivations for both projects.

Back-storage handlers are available in both projects, in kernel space as well as user space. Back-storage handlers allow target administrators to control how devices are exported to the initiators. For example, a pass-through handler allows exporting the SCSI hardware as it is, instead of masking the details of that hardware, while a virtual-disk handler allows exporting files as virtual disks to the initiator.

Both projects support Persistent Reservations (PR); a feature for I/O fencing and failover/retakeover of storage devices in high-availability clusters. Using the PR commands, an initiator can establish, preempt, query, or reset a reservation policy with a specified target. During a failover takeover, the new virtual resource can reset the reservation policy of the old virtual resource, making device takeover easier and faster.

SCST

The main users of the SCSI target subsystem are storage companies providing storage solutions to the industry. Most of these storage solutions are plug-and-play appliances which can be attached to the storage network and used with little or no configuration. SCST boasts of a wider user base, which probably comes from the fact that they have wider range of transport support.

SCST supports both QLogic and Emulex fibre channel cards whereas LIO supports only QLogic target drivers for now, and that support is still in its beta stages of development. SCST supports the SCSI RDMA Protocol (SRP), and claims to be ahead in terms of development with respect to Fibre Channel over Ethernet (FCoE), LSI's Parallel/Wide SCSI Fibre Channel, and Serial Attached SCSI (SAS). It already has support for IBM's pSeries Virtual SCSI. Companies such as Scalable Informatics, Storewize, and Open-e have developed PnP appliance products which rely on these target transports based on SCST.

SCST supports notifications of session changes using asynchronous event notification (AEN). AEN is a protocol feature that may be used by SCSI targets to notify a SCSI initiator of events that occur in the target, when the target is not serving a request. This enables initiators to be notified of changes at the target end, such as devices added, removed, resized, or media changes. This way the initiators can see any target changes in a plug-and-play manner.

The SCST developers claim that their design conforms to more SCSI standards in terms of robustness and safety. The SCSI protocol requires that if an initiator clears a reservation held by another initiator, the reservation holder must be notified about the reservation clearance or else several initiators could change reservation data, ultimately corrupting it. SCST is capable of implementing safe RESERVE/RELEASE operations on devices to avoid such corruption.

According to the SCSI protocol, the initiator and target can communicate with each other to decide on the transfer size. An incorrect transfer size communicated by the initiator can lead to target device lockups or a crash. SCST safeguards against miscommunication of transfer sizes or transfer directions to avoid such a situation. The code claims to have a good memory management policy to avoid out-of-memory (OOM) situations. It can also limit the number of initiators that can connect to the target to avoid resource usage by too many connections. It also offers per-portal visibility control, which means that it can be configured in such a way that a target is visible to a particular subset of initiators only.

LIO

The LIO project began with the iSCSI design as its core objective, and created a generic SCSI target subsystem to support iSCSI. Simplicity has been a major design goal and hence LIO is easier to understand. Beyond that, the LIO developers have shown more willingness to work with the kernel developers as James pointed out to SCST maintainer Vladislav Bolkhovitin:

Look, let me try to make it simple: It's not about the community you bring to the table, it's about the community you have to join when you become part of the linux kernel. The interactions in the wider community are critical to the success of an open source project. You've had the opportunity to interact with a couple of them: sysfs we've covered elsewhere, but in the STGT case you basically said, here's our interface, use it. LIO actually asked what they wanted and constructed something to fit. Why are you amazed then when the STGT people seem to prefer LIO?

The LIO project also boasts of features which are either not present in SCST or are in early development phases. For example, LIO supports asymmetric logical unit access (ALUA). ALUA allows a target administrator to manage the access states and path attributes of the targets. This allows the multipath routing method to select the best possible path to optimize usage of available bandwidth, depending on the current access states of the targets. In other words, the path taken by the initiator in a multipath environment can be manipulated by the target administrator by changing the access states.

LIO supports the SCSI Management Information Base (MIB), which makes management of SCSI devices simpler. The SCSI target devices export management information values described in the SCSI MIB RFC (RFC 4455), which are picked up by an SNMP agent. This feature extends to iSCSI devices and is beneficial in managing a storage network with multiple SCSI devices.

An error in the iSCSI connection can happen at three different levels: the session, digest, or connection level. Error recovery can be initiated at each of these levels, which makes sure that the recovery is made at the current level, and the error does not pass through to the next one. Error recovery starts with detecting a broken connection. In response, the iSCSI initiator driver establishes another TCP connection to the target, then it informs the target that the SCSI command path is being changed to the new TCP connection. The target can then continue processing SCSI commands on the new TCP connection. The upper level SCSI driver remains unaware that a new TCP connection has been established and that control has been transferred to the new connection. The iSCSI session remains active during this period and does not have to be reinstated. LIO supports a maximum Error Recovery Level (ERL) of 2, which means that it can recover errors at the session, digest, or connection levels. SCST supports an ERL of 0, which means it can recover from session-level errors only and that all connection-oriented errors are communicated to the SCSI driver.

LIO also supports "multiple connections per session" (MC/S). MC/S allows the initiator to open multiple connections between the initiator and target, either on the same or a different physical link. Hence, in case of a failure of one path, the established session can use another path without terminating the session. MC/S can also be used for load balancing across all established connections. Architectural session command ordering is preserved across those communication paths.

The LIO project also claims that its code is used in a number of appliance products and deployments though the user base does not seem to be as varied as that of SCST.

No comparison can be complete without a performance comparison. SCST developers have released their performance numbers from time to time, but all of their numbers were compared against STGT. The SCST comparison page speaks of SCST performing better than LIO, but those conclusions were drawn from a study of the source code rather than from real-world tests. SCST blames LIO for not releasing performance numbers, and, to my knowledge, no performance data exist that would compare apples to apples.

The decision has finally been made, though, with quite a bit of opposition. Now comes the task of getting all the niche features which LIO lacks to be ported from SCST to LIO. While the decision was contentious, it is yet another example of the difficulty of getting something merged without being able to cooperate with the kernel development community.

Comments (8 posted)

Patches and updates

Kernel trees

Linus Torvalds Linux 2.6.38-rc2 ?

Architecture-specific

Core kernel code

Development tools

Device drivers

Filesystems and block I/O

Janitorial

Memory management

Security-related

Virtualization and containers

Miscellaneous

Borislav Petkov RAS daemon v4 ?

Page editor: Jonathan Corbet

Distributions

Fedora goals coming into focus

January 26, 2011

This article was contributed by Joe 'Zonker' Brockmeier.

Fedora is getting closer to defining its long term goals. Board member Máirín Duffy posted a summary of the first draft of goals on January 11, with an invitation to comment on the goals on the blog or on the advisory board mailing list.

One might think that the Fedora Project has sufficiently defined what it is and what it's doing. The project has defined objectives, a mission statement, vision statement, core values, and has identified its target audience.

Fedora's vision statement gives a broad idea of what the project is about, and the core values (a.k.a. "four foundations") help briefly state what informs the vision. In this case those are "freedom, friends, features, first," which (with the vision) help provide a fairly decent "elevator pitch" to describe Fedora and give teams within the project something to consider when making decisions.

Say what you will about Fedora, but you can't fault the project for being overly vague. But being precise is the point. All of this is part of Fedora's strategic planning effort, something that is lacking from many open source projects. During a discussion of Fedora's mission statement in October of 2009, Mike McGrath expressed the problem that many were seeing with Fedora at the time:

Right now Fedora is a place for everyone to just come and do whatever they want which is harming us in the long term. There's plenty of room for everyone in the Linux universe. I understand that by narrowing our focus we might lose some contributors who disagree with our values and mission. But that's better than not having one and having volunteers work against each other because they joined The Fedora Project thinking it was one thing only to find it's something else.

Having a clear mission statement and values also enables the project to move forward without being distracted with activities that aren't part of its scope — like worrying about using Fedora for infrastructure services when long term supported releases are not an objective. Rather than being drawn into an (overly) long debate about what Fedora "should be" it's possible to point to the project's objectives — which do not in any way encourage a long term support release. It also enables Fedora to prioritize its resources. As former Fedora Project Leader Paul Frields wrote during the target audience discussion, "having an audience in mind, we as a community can prioritize resources, and at the same time make it possible for people who want to concentrate on other audiences to build community around those efforts."

Fedora has been wrestling with these issues for some time, and there has been some unease expressed by some members of the Fedora community that the board is espousing its view of what Fedora should be rather than what the community wishes Fedora to be. Greg DeKoenigsberg addressed this by saying "the Fedora leadership should stake out positions that they believe to be correct, and should work to mobilize resources that move us in those directions... [while guaranteeing] the freedom for dissenting community members to move in their own directions."

A long list of goals

With all of the other strategic items in place, it is now up to the board to define goals for the next few releases; they now have a working list to consider. The initial list includes 15 goals that have been culled by the board from proposals out of the larger Fedora community. Much of the discussion took place back in November on the advisory-board list.

Initially the call for goals was for the "next 3-4 releases", but that seems to have been cut down to the next two releases over the intervening months. The goals are to move Fedora closer to the vision statement for Fedora, which is:

The Fedora Project creates a world where free culture is welcoming and widespread, collaboration is commonplace, and people control their content and devices.

The final list includes improving and simplifying collaboration in the Fedora community, improving communication within the project, recruiting uncommon skillsets into Fedora, and improving the developer experience within Fedora. Some of the goals seem to describe things Fedora already does well. Goal 14, "Evaluate late-breaking technologies for inclusion/interaction with Fedora," for instance, seems to be well underway already. Others, like the goals around communication, could be combined into a single goal. It does seem that, like many FOSS projects, Fedora finds communication within and without the project to be a continual source of difficulty.

A set of 15 goals, of course, is far too many to be practical, so the board is trying to reduce the list to five goals for the next two releases. It has settled on these five:

  • Goal #1: Improve and simplify collaboration in the Fedora Community.
  • Goal #2: Improve and encourage high-quality communication in the Fedora Community.
  • Goal #4: It is extraordinarily easy to join the Fedora community and quickly find a project to work on.
  • Goal #11: Expand global presence of Fedora among users & contributors.
  • Goal #12: Improve education & skill sharing in community.

So far, there's been very little discussion on the advisory-board mailing list, but there has been discussion among some of the subgroups in Fedora. For example, the Fedora Ambassadors Steering Committee (FAMSCo) brainstormed ahead of meeting with the board to offer their suggestions on which goals should be chosen. The board and FAMSCo seem to have mind-melded, as they share the exact same list of five goals.

Fedora's Engineering Steering Committee (FESCo) has also met and discussed the goals and agreed on three goals. The first two (Improve and simplify collaboration in the Fedora Community, Improve and encourage high-quality communication in the Fedora Community) mirror the board and FAMSCo. The third, unsurprisingly for the engineering committee, is to improve the developer experience in Fedora.

All of the goals that have been put forward so far seem perfectly reasonable. Goal #1, for example, would put emphasis on improving Fedora's governance structure and carries a suggestion that the Fedora board meet in person at least once per year. Goal #2 overlaps with #1, and both carry a suggestion about creating a calendaring solution for Fedora. (Also, perhaps, highlighting the absence of a decent FOSS calendaring solution.)

More feedback will trickle in before the final goals are set, but it looks likely that improving communication and collaboration will be the primary goals for the next two releases. It's important to note that the goals are only suggestions and, as the wiki states, are meant "to help people who want to work on several things to prioritise their time."

Coming up with a mission and goals for a large distribution is not easy. This is particularly true of a project with a corporate sponsor shifting from a closed to an open development model, and with a mixture of paid and unpaid contributors. Consider the efforts of the openSUSE Project to define its strategy. The effort has been in process now since 2009, and is still being worked on, with no target date for completion.

The goals will be a hot topic at the upcoming FUDCon in Tempe, Arizona from January 29 through 31. There will be a session on the goals led by Duffy and the board members present, and a governance hackfest where the goals will likely be discussed as well. Fedora, thankfully, is at the tail end of the process. The question now becomes how well the various subgroups in Fedora will adhere to the goals — and whether they'll actually lead to success.

Comments (none posted)

Brief items

Distribution quotes of the week

The timing of it is very important, as most major distros would like to adopt some of the features that just became popular in the various new app markets and stores, such as screenshots, user comments and ratings. It looks like a lot of new code is about to be written, or a lot of existing code is about to gain quite a bit of popularity.
-- Enrico Zini

The fact is that it's still much easier to work on things in a corner. The funny thing is that people are generally not opposed to work together, far from it. When discussing low-level bits related to packaging systems, many would expect dpkg/apt developers and rpm/zypp developers to have some heated discussion just because we always hear confrontational stories here and there. The truth is that those stories are generally from users, and developers are generally happy to accept differences.
-- Vincent Untz

Comments (none posted)

Debian derivatives census

The Debian Project has invited representatives of Debian-derived distributions to participate in a census of Debian derivatives. "By participating in the census you will increase the visibility of your derivative within Debian, provide Debian contributors with a contact point and a set of information that will make it easier for them to interact with your distribution. Representatives of distributions derived from Ubuntu are encouraged to get their distribution added to the Ubuntu Derivative Team wiki page."

Full Story (comments: none)

Debian Installer 6.0 Release Candidate 2 release

The Debian Installer team has announced the second release candidate of the installer for Debian Squeeze. "We need your help to find bugs and further improve the installer, so please try it."

Full Story (comments: none)

Announcing EPEL 6

The Extra Packages for Enterprise Linux (EPEL) project has announced the release of EPEL 6. "EPEL 6 is a collection of add-on packages available for Red Hat Enterprise Linux (RHEL) 6 and other compatible systems, maintained by the community under the umbrella of the Fedora Project. EPEL 6 is designed to supplement RHEL 6 by providing additional functionality and does not replace any RHEL 6 packages. As a community project, EPEL is maintained and supported by volunteers via Bugzilla and mailing lists. EPEL is not commercially supported by Red Hat, Inc."

Full Story (comments: none)

Fedora 14 for IBM System z 64bit official release

The Fedora IBM System z (s390x) Secondary Arch team has announced the official release of Fedora 14 for IBM System z 64bit.

Full Story (comments: none)

Announcing the release of Foresight Linux 2.5.0 ALPHA 1 GNOME Edition

Foresight Linux 2.5.0 ALPHA 1 GNOME Edition has been released. "Well known for being a desktop operating system featuring an intuitive user interface and a showcase of the latest desktop software, this new release brings you the latest GNOME 2.32 release, a newer Linux kernel 2.6.35.10, Xorg-Server 1.8, Conary 2.2 and a ton of updated applications!"

Full Story (comments: none)

Distribution News

Debian GNU/Linux

Bits from the Security Team (for those that care about bits)

The Debian Security Team had a productive meeting earlier this month. Topics discussed include improvements to the team workflow, hardening compiler flags, longer security support for Debian stable, "beta testing" of security updates, README.test, backports security support, and issues in specific packages. There is also a call for volunteers.

Full Story (comments: none)

Join the DebConf team

DebConf team is looking for new people to help with DebConf11. "DebConf is a huge process, and there are many things we could use help on. People come and go, and are usually overworked after a year or two -- so we would love new people to get involved. If you have new ideas, we'd love to hear about them and we can discuss if they'd work and how to make them happen. And by the way, if you are looking for a good way to get involved with Debian and don't know where to start, this might be among the best options!"

Full Story (comments: none)

Debian Project at several conferences and trade fairs

The Debian Project will be present at several upcoming events. "The Debian Project invites all interested persons to said events, ask questions, take a look at Debian 6.0 "Squeeze", exchange GPG-Fingerprints to boost the Web of trust and get to know the members and the community behind the Debian Project."

Full Story (comments: none)

Fedora

A security incident on Fedora infrastructure

Fedora project leader Jared Smith has announced that a Fedora account was compromised. It would appear that the account credentials were compromised externally, rather than through any exploit of the Fedora infrastructure itself. "While the user in question had the ability to commit to Fedora SCM, the Infrastructure Team does not believe that the compromised account was used to do this, or cause any builds or updates in the Fedora build system. The Infrastructure Team believes that Fedora users are in no way threatened by this security breach and we have found no evidence that the compromise extended beyond this single account."

Full Story (comments: none)

Fedora Board Meeting January 24

The minutes of the January 24 meeting of the Fedora Board cover a Fedora strategic goals discussion with FESCo.

Comments (none posted)

Ubuntu family

Ubuntu Technical Board meeting minutes

During the January 11th meeting the board discussed a version template for -extras and reorganizing drivers/owners/release managers permissions in Launchpad.

The January 25th meeting covered the default ntpd configuration and a Seamonkey microrelease SRU exception.

Comments (none posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Untz: Results of the App Installer meeting, and some thoughts on cross-distro collaboration

On his blog, Vincent Untz reflects on the recently completed cross-distribution App Installer meeting, which by his and others' accounts was definitely a success. In the posting, he also spends some time talking about the need for more cross-distribution collaboration. "To be honest, since I started working on openSUSE, I've kept wondering why all distributions duplicate so much work. Sometimes, there is a good reason, like a radically different technical approach. But sometimes, it looks like we're going different ways just for the sake of doing something ourselves. We should fix this. Cross-distro collaboration is not the way we usually do things, and I believe we're wrong most of the time. Cross-distro collaboration is a cultural shift for us. But it's very well needed."

Comments (72 posted)

Shuttleworth: Qt apps on Ubuntu

Mark Shuttleworth has plans for more Qt applications in Ubuntu. "System settings and prefs, however, have long been a cause of friction between Qt and Gtk. Integration with system settings and preferences is critical to the sense of an application "belonging" on the system. It affects the ability to manage that application using the same tools one uses to manage all the other applications, and the sorts of settings-and-preference experience that users can have with the app. This has traditionally been a problem with Qt / KDE applications on Ubuntu, because Gtk apps all use a centrally-manageable preferences store, and KDE apps do things differently. To address this, Canonical is driving the development of dconf bindings for Qt, so that it is possible to write a Qt app that uses the same settings framework as everything else in Ubuntu. We've contracted with Ryan Lortie, who obviously knows dconf very well, and he'll work with some folks at Canonical who have been using Qt for custom development work for customers. We're confident the result will be natural for Qt developers, and a complete expression of dconf's semantics and style."

Comments (none posted)

New Linux Distribution Brings Goodies to Debian (OStatic)

Susan Linton takes a look at Saline OS 1.0, a new distribution based on Debian Squeeze. "Saline OS is delivered as an installable live CD and features Linux 2.6.36, Xorg X Server 1.7.7, and GCC 4.4.5. Chromium Web browser, IceDove mail client, Rhythmbox, Fotoxx photo manager, Parole video player, Osmo organizer, OpenOffice.org, Pidgin, and Xfburn media creator are part of the software stack. Synaptic setup with Debian Squeeze repositories is available to install other software if desired. An icon on the upper panel launches automatic updates, which are pulled in from Debian Squeeze. The lower panel with lots of application launchers hides until hover."

Comments (none posted)

Maciel: Because your distro should be cool

Og Maciel writes about why he likes Foresight Linux. "Reason 2 - Roll backs: Because the entire system is kept under a complete version control down to the file level, it is possible to perform something that other distributions can only dream of: system roll backs! Don't like the application you've just installed? Remove it and it will be as if your system never had it installed! Want to go back to the update you ran 3 weeks or even months ago? Not a problem! Your system is like a giant Git/Mercurial repository and you control what to clone and what branch to checkout."

Comments (17 posted)

Mepis Goes to 11 (Linux Magazine)

Joe "Zonker" Brockmeier reviews a beta of Mepis 11. "Mepis is not one of the best-known Linux distributions, but it does have a loyal following. Though it's never been my distro of choice, it was a favored distribution with some of my colleagues at Linux.com circa 2005 and 2006. In fact, it was favored by a lot of users then - coming in 5th in the DistroWatch listings in 2005, and 4th in 2006. What happened? The Ubuntu/Kubuntu juggernaut, that's what. But user base is not a clear indication of the quality of a distribution. Let's see what Mepis 11 has to offer."

Comments (none posted)

Domsch: Consistent Network Device Naming coming to Fedora 15

Matt Domsch has been working on Consistent Network Device Naming for Fedora 15 (and beyond). "Systems running Linux have long had ethernet network devices named ethX. Your desktop likely has one ethernet port, named eth0. This works fine if you have only one network port, but what if, like on Dell PowerEdge servers, you have four ethernet ports? They are named eth0, eth1, eth2, eth3, corresponding to the labels on the back of the chassis, 1, 2, 3, 4, respectively. Sometimes. Aside from the obvious confusion of names starting at 0 versus starting at 1, other race conditions can happen such that each port may not get the same name on every boot, and they may get named in an arbitrary order. If you add in a network card to a PCI slot, it gets even worse, as the ports on the motherboard and the ports on the add-in card may have their names intermixed."

Comments (63 posted)

Page editor: Rebecca Sobol

Development

Correlating log messages with syslog-ng

January 26, 2011

This article was contributed by Robert Fekete

Correlating log messages to get a deeper insight about the actual events happening on a network or server is an important element of IT security. Being able to do so is mandated by several security compliance standards, best practices, and also common sense. However, many common log analyzing and correlation engines cannot handle high message rates in real time, requiring administrators to filter the input of the analyzing engine. Proprietary solutions are often licensed based on the number of processed messages, which limits their usefulness. The syslog-ng project aims to provide a flexible, real-time correlation solution that scales well even to extreme performance requirements.

Syslog-ng is an advanced system logging tool that can replace the standard syslogd and rsyslog daemons. The syslog-ng pattern database, introduced almost two years ago, allows for real-time message identification and classification by comparing incoming log messages to a set of message patterns. The classification engine is much faster and more scalable than using regular expressions to identify messages, and it also permits the administrator to extract relevant information from the message body or to add custom metadata (for example, tags) to log messages. We looked at message classification in syslog-ng just over a year ago.

The new message correlation feature extends the syslog-ng pattern database to make it possible to associate related log messages, and to treat the information from those messages as if they were a single event.

Message correlation is one of the foundations of log analysis and reporting, because log messages tend to be fragmentary and often scatter important information about a single event across several messages. For example, the Postfix e-mail server logs the sender and recipient addresses in separate log messages. With OpenSSH, an unsuccessful login attempt produces a log message about the authentication failure, with the reason for the failure arriving in the next message. In the end it is the event and its exact details that are interesting, not necessarily the individual log messages, so being able to collect information as events rather than messages can be a boon for every system administrator.

How correlation works in syslog-ng

Message correlation in syslog-ng operates on log messages that have been successfully identified by syslog-ng's pattern database: the rules describing message patterns can be extended with instructions on how to correlate the matching messages.

Correlating log messages involves collecting the messages into message groups called contexts. A context consists of a series of log messages that are related to each other in some way, for example, the log messages of an SSH session can belong to the same context. Messages may be added to a context as they are processed. The context of a log message can be specified using simple static strings or with macros and dynamic values. For example, you can group messages received from the same host ($HOST), application ($HOST$PROGRAM), or process ($HOST$PROGRAM$PID).

Messages belonging to the same context are correlated, and can be processed in a number of ways. It is possible to include the information contained in an earlier message of the context in messages that are added later. For example, if a mail server application sends separate log messages about every recipient of an e-mail (like Postfix), you can merge the recipient addresses to the previous log message. Another option is to generate a completely new log message that contains all the important information that was stored previously in the context, for example, the login and logout (or timeout) times of an authenticated session (like SSH or telnet), and so on.

To ensure that a context handles only log messages of related events, a timeout value can be assigned to a context, which determines how long the context accepts related messages. If the timeout expires, the context is closed.
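
Putting these pieces together, the correlation settings are carried as attributes on the rule itself. The following is only a minimal sketch, assuming the version 4 patterndb XML schema used by syslog-ng OSE 3.2; the ruleset name, the rule and ruleset ids (normally UUIDs), and the 600-second timeout are placeholders rather than parts of any official rule set:

    <patterndb version="4">
      <ruleset name="sshd" id="ruleset-ssh-example">
        <!-- the ruleset-level pattern selects the program ($PROGRAM) the rules apply to -->
        <pattern>sshd</pattern>
        <rules>
          <!-- matching messages are grouped into one context per host and user name;
               the context is closed when the 600-second timeout expires -->
          <rule id="rule-ssh-example" class="system"
                context-id="ssh-session-$HOST-$SSH_USERNAME"
                context-timeout="600">
            <patterns>
              <pattern>Accepted @QSTRING:SSH.AUTH_METHOD: @ for@QSTRING:SSH_USERNAME: @from @QSTRING:SSH_CLIENT_ADDRESS: @port @NUMBER:SSH_PORT_NUMBER:@ ssh2</pattern>
            </patterns>
          </rule>
        </rules>
      </ruleset>
    </patterndb>

With something like this in place, every successful sshd login from a given host and user name lands in the same context until the timeout closes it.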

Triggering new messages and external actions

In syslog-ng Open Source Edition (OSE) 3.2, you can automatically generate new messages when a particular message is recognized, or when the correlation timeout of a context expires. The generated messages are configured within the pattern database rules, meaning that, if needed, a new message can be generated for every incoming log message. Obviously this is not necessary, unless you take log normalization really seriously.

When used together with message correlation, you can also refer to fields and values of earlier messages of the context. For example, the patterns:

    <pattern>
        Accepted @QSTRING:SSH.AUTH_METHOD: @ for@QSTRING:SSH_USERNAME: \
        @from @QSTRING:SSH_CLIENT_ADDRESS: @port @NUMBER:SSH_PORT_NUMBER:@ ssh2
    </pattern>

    <pattern>
        pam_unix(sshd:session): session closed for user @ESTRING:SSH_USERNAME: @
    </pattern>
could be used to match OpenSSH's log messages. Then the action:
    <value name="MESSAGE">
        An SSH session for $SSH_USERNAME from ${SSH_CLIENT_ADDRESS}@1 \
        closed. Session lasted from ${DATE}@1 to $DATE.
    </value>
would emit a correlated message that includes information from both log messages. The above is just a snippet; consult the full XML rules for all the gory details.
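
For orientation, the <value> fragment shown above does not stand on its own: in the patterndb XML it would sit inside an action attached to the rule that matches the "session closed" message. Below is a hedged sketch of that wrapper, assuming the patterndb v4 action syntax; the PROGRAM value is a made-up name, added only so the generated message is easy to filter on later:

    <actions>
      <!-- trigger="match" fires when this rule's pattern matches;
           trigger="timeout" would fire when the context expires instead -->
      <action trigger="match">
        <message>
          <values>
            <value name="MESSAGE">An SSH session for $SSH_USERNAME from ${SSH_CLIENT_ADDRESS}@1 closed. Session lasted from ${DATE}@1 to $DATE.</value>
            <!-- placeholder program name, not part of the article's example -->
            <value name="PROGRAM">ssh-session-summary</value>
          </values>
        </message>
      </action>
    </actions>

The generated message then flows through syslog-ng like any other, so it can be filtered (on the program name or a tag) and routed wherever the correlated events should end up.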

Sending alerts directly from syslog-ng is currently not supported, but it would be a welcome addition in future versions. However, it is reasonably simple to pass the selected messages to an external script that sends out alerts via e-mail or SNMP. Since completely new messages can be created from the information extracted from the correlated messages, all the script has to do is send out the alerts, for example using sendmail or snmptrap.

Syslog-ng can also correlate messages read from log files, in order to process logs that have already been collected. To make that work, the time elapsed between two log messages is calculated from the messages' own timestamps instead of from the system time.

Beyond syslog-ng 3.2

Work on syslog-ng OSE 3.3 has already started; it focuses on improving support for multicore and multithreaded operation to increase the performance of syslog-ng and make it even more suitable for high-message-rate environments. Transforming the internal representation of log messages into other, non-syslog formats like JSON or WELF is also on the roadmap.

As correlating log messages becomes increasingly important for companies and organizations, it is good to see open source tools focusing on this problem as well. Although the syslog-ng project has had a sometimes rocky relationship with the open source community in the past, its OSE is under active development. In fact, the message correlation feature, among others, is currently available only in the OSE.

[ The author is a technical writer for BalaBit, which developed syslog-ng. ]

Comments (4 posted)

Brief items

Quotes of the week

As more groups warm to the beauty that is embodied in Qt, I hope that the message of working together (rather than dictating, for life or otherwise) also spreads. That mode of operation is what got Qt and KDE Platform, as high quality developer tools, to where they are today. It is what motivates us to look at the development platforms we build for application developers and ask ourselves, "How can we make this as painless as possible for the developer while giving them access to as many platforms as seamlessly as possible?" It's a way of thinking that helps create a superior result, and we're always looking for new ways to expand the benefits it brings.
-- Aaron Seigo

...but this caught my eye.
($=[$=[]][(__=!$+$)[_=-~-~-~$]+({}+$)[_/_]+
($$=($_=!''+$)[_/_]+$_[+$])])()[__[_/_]+__
[_+~$]+$_[_]+$$](_/_)
Care to guess what that does?
-- "adamcecc"

Comments (2 posted)

KDE 4.6 released

KDE.News announces the release of KDE 4.6, including KDE Plasma Workspaces, updated KDE applications, and the mobile platform.

Comments (26 posted)

LibreOffice 3.3 released

The Document Foundation has announced the release of LibreOffice 3.3, which is the first stable release of the OpenOffice.org fork. "LibreOffice 3.3 brings several unique new features. The 10 most-popular among community members are, in no particular order: the ability to import and work with SVG files; an easy way to format title pages and their numbering in Writer; a more-helpful Navigator Tool for Writer; improved ergonomics in Calc for sheet and cell management; and Microsoft Works and Lotus Word Pro document import filters. In addition, many great extensions are now bundled, providing PDF import, a slide-show presenter console, a much improved report builder, and more besides. A more-complete and detailed list of all the new features offered by LibreOffice 3.3 is viewable on the following web page: http://www.libreoffice.org/download/new-features-and-fixes/".

Comments (11 posted)

OpenOffice.org 3.3.0 final released (The H)

The H looks at the OpenOffice.org 3.3 release. "OpenOffice.org 3.3.0 features an updated, easier to use, Extension Manager user interface (UI) and several improvements to Calc spreadsheets, such as an increase in the number of rows supported from 65,536 to 1,048,576. The print system has been restructured, the thesaurus dialogue has been redesigned for better usability and slide layout handling has been improved in the presentation application, Impress." More information can be found in the OOo New Features page and the release notes.

Comments (16 posted)

OpenSSH 5.7 released

OpenSSH 5.7 has been released. Some new features in this release include Elliptic Curve Cryptography modes for key exchange (ECDH) and host/user keys (ECDSA), an sftp protocol extension supporting a hard-link operation, new options for scp and ssh, and more. The announcement (click below) contains additional information.

Full Story (comments: 15)

Sala 1.0 released

Sala is a command-line tool for managing an encrypted password database. Each password is stored in its own file, which makes it possible to use tab-completion for lookups. The 1.0 release is available now.

Full Story (comments: 1)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

Will it Blend? A Look at Blender's New User Interface (Linux.com)

Nathan Willis looks at Blender's new UI over at Linux.com. "And as with every new Blender release, there are indeed new tools in 2.56a. For example, the Solidify tool allows you to select a thin, planar object and automatically extrude thickness into it. There is a new paintbrush system, which lets you modify any brush's size, strength, texture, and low-level behavior curves. Sculpt Mode, in which you modify objects by whittling or squishing them around, was also rewritten, making it easier to do multi-resolution sculpting (for example, sculpting at a rough resolution to define a character's body, but working with much finer detail on its face)."

Comments (none posted)

Barnes: Debugging display problems

On his blog, Jesse Barnes has a nice description of how computer displays work in terms of the memory organization and timings, along with some tips on debugging display problems (with photos and links to videos). "There are several variables that apply: bits per pixel, indexed or not, tiling format, and color format (in the Intel case, RGB or YUV), and stride or pitch. Bits per pixel is as simple as it sounds, it simply defines how large each pixel is in bits. Indexed planes, rather than encoding the color directly in the bits for the pixel, use the value as an index into a palette table which contains a value for the color to be displayed. The tiling mode indicates the surface organization of the plane. Tiled surfaces allow for much more efficient rendering, and allowing planes to use them directly can save copies from tiled rendering targets to an un-tiled display plane. Finally, the color format defines what values the pixels represent."

Comments (1 posted)

Page editor: Jonathan Corbet

Announcements

Brief items

Google Summer of Code 2011

Google Summer of Code 2011 was announced at linux.conf.au. "This will be the 7th year for Google Summer of Code, an innovative program dedicated to introducing students from colleges and universities around the world to open source software development. The program offers student developers stipends to write code for various open source projects with the help of mentoring organizations from all around the globe. Over the past 6 years Google Summer of Code has had 4,500 students from over 85 countries complete the program. We are excited to announce that we will extend the scope of the program this year by targeting a 25% increase in accepted student applications as well as accepting a larger number of mentoring organizations. Our goal is to help these students pursue academic challenges over the summer break while they create and release open source code for the benefit of all."

Comments (1 posted)

OSU OSL Introduces Supercell

The Open Source Lab at Oregon State University has announced Supercell. "Supercell is a new on demand virtualization and continuous integration resource, made possible by a generous grant from Facebook's Open Source Team. We have created this cluster for use by open source projects who need to run software tests regularly but may not have access to the appropriate hardware or the funds to pay for outsourcing this service. Supercell will also allow projects to do manual testing to verify that a submitted patch has actually fixed the intended bug or to determine that their software package runs correctly on a particular operating system or distribution. The service will also allow projects to test their software in a large cluster using several VMs concurrently. Supercell will also provide temporary space for projects who would like to test drive new features in their code base or on their website."

Comments (none posted)

Trademark Policy of the Document Foundation

The Document Foundation has released "the more or less final draft" of its trademark policy, pending a legal review.

Full Story (comments: none)

London's Design Museum Recognizes Ubuntu Fonts

The Ubuntu Project has announced the opening of a new exhibition at London's Design Museum dedicated to the Ubuntu Font, in collaboration with international typeface designers Dalton Maag. "Entitled "Shape My Language," the exhibition will run from January 28 to February 28, 2011. The exhibition marks a significant milestone for the Ubuntu Project's advance in design and aims to enhance the consumer experience of using open computing platforms, such as Ubuntu."

Full Story (comments: none)

Articles of interest

Second batch of FOSDEM 2011 speaker interviews

A second batch of FOSDEM speaker interviews is available. Martijn Dashorst (Wicket), David Fetter (PL/Parrot), Andrew Godwin (Django), Soren Hansen (OpenStack), Lennart Poettering (systemd), Spike Morelli (devops), and Kenneth Rohde Christiansen (Qt WebKit) are interviewed in this round.

Comments (none posted)

Fellowship interview with Anne Østergaard

The Fellowship of the Free Software Foundation Europe has an interview with Anne Østergaard. "Anne Østergaard is a veteran of the Free Software community, and attended the first Open Source Days, back in 1998. She holds a Law Degree from The University of Copenhagen, Denmark, and after a decade in government service, international organisations, and private enterprise, she has become a devoted Free Software advocate. Her interests lie in the long-term strategic issues of Free Software; in the social, legal, research, and economic areas of our global society. A former Vice Chairman at GNOME, she’s heavily involved in political lobbying, and has been fighting for changes in software patents and copyright for a number of years."

Comments (none posted)

New alleged evidence of Android infringement isn't a smoking gun (ars technica)

Ars technica examines some evidence in the Oracle vs. Google lawsuit. "Patent reform activist Florian Mueller has published what he believes to be new evidence of copyright infringement in Google's Android software platform. He has found files in the Android code repository that have Sun copyright headers identifying them as proprietary and confidential. A close look at the actual files and accompanying documentation, however, suggests that it's not a simple case of copy and paste."

Comments (86 posted)

EFF: Sony v. Hotz: Sony Sends A Dangerous Message to Researchers -- and Its Customers

On its Deeplinks blog, the EFF has a strongly worded look at the actions taken by Sony against George Hotz for finding and publicizing security holes in its PlayStation 3 console. "Not content with the DMCA hammer, Sony is also bringing a slew of outrageous Computer Fraud and Abuse Act claims. The basic gist of Sony's argument is that the researchers accessed their own PlayStation 3 consoles in a way that violated the agreement that Sony imposes on users of its network (and supposedly enabled others to do the same). But the researchers don't seem to have used Sony's network in their research — they just used the consoles they bought with their own money. Simply put, Sony claims that it's illegal for users to access their own computers in a way that Sony doesn't like. Moreover, because the CFAA has criminal as well as civil penalties, Sony is actually saying that it's a crime for users to access their own computers in a way that Sony doesn't like."

Comments (33 posted)

Phipps: OSI And FSF In Unprecedented Collaboration To Protect Software Freedom

On his Computerworld UK blog, Simon Phipps writes about the OSI and FSF teaming up to file a request [PDF] to the US Department of Justice (DOJ) to investigate the CPTN patent purchase (i.e. the 882, now 861, Novell patents). "Whatever the outcome of the matter, its importance has done a great service providing the OSI and the FSF with a first public opportunity to continue the positive relationship that has resulted in earlier private collaborations, such as when both organisations endorsed the formation of the Document Foundation. I strongly hope that both organisations will continue to explore ways to act collaboratively from their different perspectives of software freedom in the interests of the overlapping communities."

Comments (1 posted)

HTML is the new HTML5 (The WHATWG Blog)

Ian Hickson announces the end of version numbers for the HTML specification. "The WHATWG HTML spec can now be considered a "living standard". It's more mature than any version of the HTML specification to date, so it made no sense for us to keep referring to it as merely a draft. We will no longer be following the "snapshot" model of spec development, with the occasional "call for comments", "call for implementations", and so forth." (Thanks to Paul Wise)

Comments (45 posted)

New Books

Apache Reference Manual - printed edition now available

The Apache Reference Manual is available as a printed book from Network Theory Ltd. For each copy of this manual sold, $1 will be donated to the Apache Software Foundation.

Full Story (comments: none)

HTML5 and CSS3--New from Pragmatic Bookshelf

Pragmatic Bookshelf has released "HTML5 and CSS3", by Brian Hogan.

Full Story (comments: none)

MAKE Magazine Leads the "Arduino Revolution"

MAKE Magazine Volume 25 ("Arduino Revolution") features DIY projects using Arduino microcontrollers.

Full Story (comments: none)

Calls for Presentations

EuroScipy 2011 - Call for papers

EuroScipy 2011 will be held in Paris, France, August 25-28, 2011. The call for papers is open until May 8, 2011.

Full Story (comments: none)

Grace Hopper Celebration of Women in Computing - CfP

The Grace Hopper Celebration of Women in Computing will take place November 8-12, 2011 in Portland, Oregon. This year's theme is "What if...?" The call for participation is open until March 15, 2011.

Comments (none posted)

Upcoming Events

Red Hat Summit and JBoss World 2011 Registration Open

Red Hat has announced that registration is open for the 2011 Red Hat Summit and JBoss World. "This marks the seventh year that Red Hat has gathered customers, partners, visionary thinkers, technologists and open source enthusiasts to learn, network and explore open source. The event will be held in Boston at the Seaport World Trade Center, May 3-6, 2011. A full list of sessions is now posted with talks covering a wide range of topics from general overviews and roadmaps of Red Hat's cloud, virtualization, platform and middleware offerings to the more developer-focused sessions that include tips, tricks and demonstrations."

Comments (none posted)

SCALE 9X: Jane Silber to give one of two keynotes

The Southern California Linux Expo (SCALE) has announced the list of speakers for this year's conference, which will take place in Los Angeles, California, February 25-27, 2011.

Full Story (comments: none)

Events: February 3, 2011 to April 4, 2011

The following event listing is taken from the LWN.net Calendar.

February 2-3: Cloud Expo Europe (London, UK)
February 5-6: FOSDEM 2011 (Brussels, Belgium)
February 5: Open Source Conference Kagawa 2011 (Takamatsu, Japan)
February 7-11: Global Ignite Week 2011 (several, worldwide)
February 11-12: Red Hat Developer Conference 2011 (Brno, Czech Republic)
February 15: 2012 Embedded Linux Conference (Redwood Shores, CA, USA)
February 25: Build an Open Source Cloud (Los Angeles, CA, USA)
February 25-27: Southern California Linux Expo (Los Angeles, CA, USA)
February 25: Ubucon (Los Angeles, CA, USA)
February 26: Open Source Software in Education (Los Angeles, CA, USA)
March 1-2: Linux Foundation End User Summit 2011 (Jersey City, NJ, USA)
March 5: Open Source Days 2011 Community Edition (Copenhagen, Denmark)
March 7-10: Drupalcon Chicago (Chicago, IL, USA)
March 9-11: ConFoo Conference (Montreal, Canada)
March 9-11: conf.kde.in 2011 (Bangalore, India)
March 11-13: PyCon 2011 (Atlanta, Georgia, USA)
March 19: Open Source Conference Oita 2011 (Oita, Japan)
March 19-20: Chemnitzer Linux-Tage (Chemnitz, Germany)
March 19: OpenStreetMap Foundation Japan Mappers Symposium (Tokyo, Japan)
March 21-22: Embedded Technology Conference 2011 (San Jose, Costa Rica)
March 22-24: OMG Workshop on Real-time, Embedded and Enterprise-Scale Time-Critical Systems (Washington, DC, USA)
March 22-25: Frühjahrsfachgespräch (Weimar, Germany)
March 22-24: UKUUG Spring 2011 Conference (Leeds, UK)
March 22-25: PgEast PostgreSQL Conference (New York City, NY, USA)
March 23-25: Palmetto Open Source Software Conference (Columbia, SC, USA)
March 26: 10. Augsburger Linux-Infotag 2011 (Augsburg, Germany)
March 28-April 1: GNOME 3.0 Bangalore Hackfest | GNOME.ASIA SUMMIT 2011 (Bangalore, India)
March 28: Perth Linux User Group Quiz Night (Perth, Australia)
March 29-30: NASA Open Source Summit (Mountain View, CA, USA)
April 1-3: Flourish Conference 2011! (Chicago, IL, USA)
April 2-3: Workshop on GCC Research Opportunities (Chamonix, France)
April 2: Texas Linux Fest 2011 (Austin, Texas, USA)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds