
LWN.net Weekly Edition for January 27, 2011

Unhosted web applications: a new approach to freeing SaaS

January 26, 2011

This article was contributed by Nathan Willis

Free software advocates have been pushing hard against the growing trend of commercial Software-as-a-Service (SaaS) — and the resulting loss of autonomy and software freedom — for several years now. A new project named Unhosted takes a different approach to the issue than that used by better-known examples like Diaspora and StatusNet. Unhosted is building a framework in which all of a web application's code is run on the client-side, and users have the freedom to choose any remote data storage location they like. The storage nodes use strong encryption, and because they are decoupled from the application provider, users always have the freedom to switch between them or to shut off their accounts entirely.

The Unhosted approach

An outline of the service model envisioned by Unhosted can be found on the project's Manifesto page, written by founder Michiel de Jong. "A hosted website provides two things: processing and storage. An unhosted website only hosts its source code (or even just a bootloader for it). Processing is done in the browser, with ajax against encrypted cloud storage."

In other words, the manifesto continues, despite the availability of the Affero GPL (AGPL), which requires making source code available to network end-users, licensing alone is not enough to preserve user freedom because proprietary SaaS sites require users to upload their data to "walled silos" run by the service provider. An Unhosted application is a JavaScript program that runs in the browser, but accesses online storage on a compliant storage node. It does not matter to the application whether the storage node is run by the application provider, the user, or a third party.

Storage nodes are essentially commodity infrastructure, but in order to preserve user freedom, Unhosted requires that applications encrypt and sign the data they store. The project defines an application-layer protocol called the Unhosted JSON Juggling Protocol (UJJP, sometimes referred to as UJ) that applications use to communicate with storage nodes, requesting and exchanging objects in JavaScript Object Notation (JSON) format.

As the FAQ explains, this constitutes a distinctly different model than most other free software SaaS projects. Most (like StatusNet and Diaspora) focus on federation, which allows each user to run his or her own node, and requires no centralized authority linking all of the user accounts. The downside of the federated systems is that they may still require users to entrust their data to a remote server.

Eben Moglen's FreedomBox, on the other hand, focuses on putting the storage under the direct control of the user (specifically, stored at home on a self-managed box). This is a greater degree of freedom, but home-hosting is less accessible from the Internet at large than most web services because it often depends on Dynamic DNS. Home-hosting is also vulnerable to limited upstream bandwidth and common ISP restrictions on running servers.

Unhosted, therefore, attempts to preserve the "accessible anywhere" nicety of popular Web 2.0 services while de-linking the application from the siloed data.

Connecting applications to storage

Obviously, writing front-end applications entirely in HTML5 and JavaScript is not a new idea. The secret sauce of Unhosted is the connection method that links the application to the remote storage node — or, more precisely, that links the application to any user-defined storage node. The system relies on Cross-Origin Resource Sharing (CORS), a W3C Working Draft mechanism by which a server can opt in to making its resources available to requests originating from other servers.

In the canonical "web mail" example, the Unhosted storage node sees a cross-origin request from the webmail application, checks the source, user credentials, and request type against its access control list, and returns the requested data only if the request is deemed valid. UJJP defines the operations an application can perform on the storage node, including creating a new data store, setting and retrieving key-value pairs, importing and exporting data sets, and completely deleting a data store.
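
To make that flow concrete, here is a minimal sketch of a storage node in Python (the demo servers are written in PHP, Perl, and Python). The operation names and JSON fields are invented for illustration and are not the actual UJJP wire format; a real node would also check user credentials and keep a separate data store per application:

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    ALLOWED_ORIGINS = {"https://webmail.example.org"}   # hypothetical application origin
    STORE = {}                                          # one flat key-value store, for brevity

    class StorageNode(BaseHTTPRequestHandler):
        def do_OPTIONS(self):
            # CORS preflight: opt in only for origins on the access control list.
            origin = self.headers.get("Origin", "")
            self.send_response(204 if origin in ALLOWED_ORIGINS else 403)
            if origin in ALLOWED_ORIGINS:
                self.send_header("Access-Control-Allow-Origin", origin)
                self.send_header("Access-Control-Allow-Methods", "POST, OPTIONS")
                self.send_header("Access-Control-Allow-Headers", "Content-Type")
            self.end_headers()

        def do_POST(self):
            origin = self.headers.get("Origin", "")
            length = int(self.headers.get("Content-Length", 0))
            request = json.loads(self.rfile.read(length) or b"{}")

            if origin not in ALLOWED_ORIGINS:
                # No CORS opt-in header, so the browser withholds the
                # response from the requesting application.
                self.send_response(403)
                self.end_headers()
                return

            # Invented operation names loosely mirroring the list above.
            if request.get("op") == "set":
                STORE[request["key"]] = request["value"]
                reply = {"ok": True}
            elif request.get("op") == "get":
                reply = {"ok": True, "value": STORE.get(request["key"])}
            else:
                reply = {"ok": False, "error": "unknown operation"}

            payload = json.dumps(reply).encode()
            self.send_response(200)
            self.send_header("Access-Control-Allow-Origin", origin)  # the CORS opt-in
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

    if __name__ == "__main__":
        HTTPServer(("localhost", 8000), StorageNode).serve_forever()

The key detail is the Access-Control-Allow-Origin response header: without it, the browser never hands the response to the requesting application, no matter what the node sends back.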

Security-wise, each application only has access to its own data store, not the user's entire storage space, and CORS does allow each storage node to determine a policy about which origins it will respond to. But beyond that, the system also relies on the fact that the user has access to all of the application source code, because it runs in the browser. Thus it is up to the user to notice if the application does something sinister like relay user credentials to an untrusted third party. Dealing with potentially obfuscated JavaScript may be problematic for users, but it is still an improvement over server-side processing, which happens entirely out of sight.

Finally, each application needs a way to discover which storage node a user account is associated with, preferably without prompting the user for the information every time. The current Unhosted project demo code relies on Webfinger-based service discovery, which uniquely associates a user account with an email address. The user would log in to the application with an email address; the application would then query the address's Webfinger identity to retrieve a JSON-formatted array of Unhosted resource identifiers and connect to the appropriate one to find the account's data store.
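
A rough sketch of that discovery step might look like the following, assuming the JSON-based /.well-known/webfinger endpoint and a made-up link relation name; the actual demo code may use a different lookup sequence:

    import json
    import urllib.parse
    import urllib.request

    # Hypothetical link relation; the real identifier (if any) would be
    # defined by the Unhosted spec, not here.
    STORAGE_REL = "http://unhosted.org/spec/storage"

    def discover_storage_node(email):
        """Return the storage-node URL advertised for an email-style user ID."""
        host = email.split("@", 1)[1]
        query = urllib.parse.urlencode({"resource": "acct:" + email})
        url = f"https://{host}/.well-known/webfinger?{query}"
        with urllib.request.urlopen(url) as response:
            descriptor = json.load(response)
        # Pick the first advertised link that claims to be a storage node.
        for link in descriptor.get("links", []):
            if link.get("rel") == STORAGE_REL:
                return link.get("href")
        return None

    # e.g. discover_storage_node("alice@example.com")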

This is not a perfect solution, however, because it depends on the email service provider supporting Webfinger. Other proposed mechanisms exist, including using Jabber IDs and Freedentity.

The tricky bits

Currently, one of the biggest sticking points in the system is protecting the stored data without making the system arduous for end users. The present model relies on RSA encryption and signing for all data stores. Although the project claims this is virtually transparent for users, it gets more difficult when one Unhosted application user wishes to send a message to another user. Because the other user is on a different storage node, that user's public key needs to be retrieved in order to encrypt the message. But the system cannot blindly trust any remote storage node to authoritatively verify the other user's identity — that would be trivial to hijack. In response, the Unhosted developers are working on a "fabric-based public key infrastructure" that enables users to deterministically traverse a web of trust from one user ID to another. Details on that part of the system are still forthcoming.
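
Conceptually, the per-message cryptography amounts to something like the sketch below, which uses the third-party Python "cryptography" package rather than the project's unhosted.js, and generates throwaway keys in place of ones discovered through the web of trust:

    import json
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding, rsa

    OAEP = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)

    # Stand-ins for keys that would really be generated once per user and
    # discovered through the trust fabric described above.
    sender_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    recipient_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

    message = json.dumps({"to": "bob@example.com", "body": "hello"}).encode()

    # Sign with the sender's private key, encrypt with the recipient's public
    # key. (Raw RSA-OAEP only fits short payloads; real applications wrap a
    # symmetric key instead, but the principle is the same.)
    signature = sender_key.sign(message, PSS, hashes.SHA256())
    ciphertext = recipient_key.public_key().encrypt(message, OAEP)

    # The recipient decrypts, then verifies the signature against the sender's
    # public key; verify() raises InvalidSignature on tampering.
    plaintext = recipient_key.decrypt(ciphertext, OAEP)
    sender_key.public_key().verify(signature, plaintext, PSS, hashes.SHA256())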

It is also an open question as to what sort of storage engine makes a suitable base for an Unhosted storage node. The demo code includes servers written in PHP, Perl, and Python that all run on top of standard HTTP web servers. On the mailing list, others have discussed a simple way to implement Unhosted storage on top of WebDAV, but there is no reason that a storage node could not be implemented on top of a distributed filesystem like Tahoe, or a decentralized network like BitTorrent.

Perhaps the most fundamental obstacle facing Unhosted is that it eschews server-side processing altogether. Consequently, no processing can take place while the user is logged out of the application. "Logged out" could simply mean that the page or tab is closed, or an application could provide a logout mechanism that disconnects from the storage node, but continues to perform other functions. This is fine for interactive or message-based applications like instant messaging, but it limits the type of application that can fit into the Unhosted mold. Judging by the mailing list, the project members have been exploring queuing up operations on the storage node side, which could enable more asynchronous functionality, but Unhosted is still not a replacement for every type of SaaS.
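
The queuing idea can be pictured as a per-user mailbox held by the storage node; the following sketch is purely illustrative and not drawn from the project's code:

    from collections import defaultdict

    class StorageNodeQueue:
        """Per-user mailbox of opaque (already encrypted) blobs."""

        def __init__(self):
            self.pending = defaultdict(list)   # user ID -> queued blobs

        def enqueue(self, recipient, blob):
            # Called when some other user's application pushes a message
            # while the recipient is logged out.
            self.pending[recipient].append(blob)

        def drain(self, recipient):
            # Called by the recipient's application at its next login;
            # returns everything queued so far and clears the mailbox.
            messages, self.pending[recipient] = self.pending[recipient], []
            return messages

    # node = StorageNodeQueue()
    # node.enqueue("bob@example.com", ciphertext)
    # ... later, at Bob's login:
    # for blob in node.drain("bob@example.com"):
    #     decrypt_and_display(blob)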

Actual code and holiday bake-offs

The project has a GitHub repository, which is home to some demonstration code showing off both parts of the Unhosted platform — although it loudly warns users that it is not meant for production use. The "cloudside" directory includes an example Unhosted storage node implementation, while the "wappside" directory includes three example applications designed to communicate with the storage node.

The storage node module speaks CORS and is written in PHP with a MySQL back-end. It does not contain any server-side user authentication, so it should not be deployed outside the local area network, but it works as a sample back-end for the example applications.

The example application set includes a JavaScript library named unhosted.js that incorporates RSA data signing and signature verification, encryption and decryption, and AJAX communication with the CORS storage node. There is a separate RSA key generation Web utility provided as a convenience, but it is not integrated into the example applications.

The example named "wappblog" is a simple blog-updating application. It creates a client-side post editor that updates the contents of an HTML file on a storage node, which is then retrieved for reading by a separate page. The "wappmail" application is a simple web mail application, which requires you to set up multiple user accounts, but shows off the ability to queue operations — incoming messages are stored and processed when each user logs in.

The third example is an address book, which demonstrates the fabric-based PKI system (although the documentation warns "it's so new that even I don't really understand how it works, and it's mainly there for people who are interested in the geeky details").

A more practical set of example applications comes from the third-party projects written for Unhosted's "Hacky Holidays" competition in December. The winning entry was Nathan Rugg's Scrapbook, which allows users to manipulate text and images on an HTML canvas, and shows how an Unhosted storage node can be used to store more than just plain text. Second place was shared between the instant messenger NezAIM and the note-taking application Notes.

The fourth entry, vCards, was deemed an honorable mention, although it used some client-side security techniques that would not work in a distributed environment in the real world (such as creating access control lists on the client side). The author of vCards was commended by the team for pushing the envelope of the protocol, though — he was one of the first to experiment with queuing operations so that one Unhosted application could pass messages to another.

Hackers wanted

At this stage, Unhosted is still primarily a proof-of-concept. The storage node code is very young, and has not been subjected to much real-world stress testing or security review. The developers are seeking input for the next (0.3) revision of UJJP, in which they hope to define better access control mechanisms for storage nodes (in part to enable inter-application communication) as well as a REST API.

On a bad day, I see "unresponsive script" warnings in Firefox and think rich client-side JavaScript applications sound like a terrible idea, but perhaps that is missing the bigger picture. StatusNet, Diaspora, and the other federated web services all do a good job of freeing users from reliance on one proprietary application vendor — but none of them are designed to make the storage underneath a flexible, replaceable commodity. One of the Unhosted project's liveliest metaphors for its storage de-coupling design is that it provides "a grease layer" between the hosted software and the servers that host it. That is an original idea, whether the top layer is written in JavaScript, or not.


The first stable release of LibreOffice

By Jake Edge
January 26, 2011

Just four months after splitting away from the OpenOffice.org project, LibreOffice has made its first stable release. While LibreOffice 3.3 and the (also just released) OpenOffice.org 3.3 share most of the same code, LibreOffice has started to differentiate itself from its progenitor. It has also built an impressive community in that time, and will be included in the next releases of the major community distributions. From almost any angle, it looks like LibreOffice is on a roll.

There are quite a few new features, as well as bug fixes, in the new release. Some of them may not seem all that new, at least to those who have been using the OpenOffice.org 3.3 release candidates. For some time, Linux users have generally been getting a much-enhanced version of OpenOffice.org based on the builds maintained by the Go-oo project. Since Go-oo has essentially morphed into the LibreOffice project, much of the new functionality will be found in both LibreOffice 3.3 and OpenOffice.org 3.3 on many distributions.

For example, the SVG import feature for Writer and Draw that is listed as a LibreOffice-only feature also appears in the latest OpenOffice.org for Fedora 14 (which is based on OpenOffice.org 3.3-rc9 plus the Go-oo patches). It may be that Windows and Mac OS X users are the most likely to notice big differences, depending on where they were getting their version of OpenOffice.org (from Sun/Oracle or Go-oo). It should also be noted that the SVG import feature still has some bugs to be excised. On an import of the SVG of the LWN penguin, both LibreOffice and OpenOffice.org took many minutes (on the order of ten) to render the SVG, and the rendering was incorrect. Both GIMP and Inkscape render it in seconds (or less) and are both in agreement that it should look much the way it does in the upper left of this page.

I gave LibreOffice 3.3 a try on Fedora 14. Not finding any experimental LibreOffice yum repository in a quick search, I decided to go ahead and download the tarball, which provided a handful of (unsigned) RPMs. After installing those, I found an additional "desktop-integration" RPM to install, which conflicted with the various OpenOffice.org packages that were still installed. After a moment's thought, I went ahead and removed OpenOffice.org, which proved uneventful as LibreOffice is a drop-in replacement.

Working with various documents and spreadsheets in LibreOffice was also uneventful, which is no real surprise. It's not clear what differences there are between Fedora's OpenOffice.org 3.3 and LibreOffice 3.3, but they were not particularly evident in the (fairly simple) documents that I worked with. For power users, perhaps there are more obvious differences. But there is also no reason to go back to OpenOffice.org that I can see. Apart from the LibreOffice start-up splash screen, it really isn't apparent that you aren't running OpenOffice.org.

Lots of Linux users will likely be using LibreOffice soon anyway, as Ubuntu, openSUSE, and Fedora all plan to ship it in their next release. openSUSE 11.4 is currently scheduled for March, so it may well be the first of those to switch over to LibreOffice. But Ubuntu 11.04 ("Natty Narwhal") and Fedora 15 won't be far behind, with the former scheduled for April and the latter for May. Debian "Squeeze" (6.0) will still be shipping OpenOffice.org (3.2.x), which is not surprising given the stability that Debian likes to bake into its releases.

Looking beyond the 3.3 release, LibreOffice has a fairly aggressive release schedule, with plans for 3.3.1 in February and 3.3.2 in March, both of which will consist mostly of bug fixes. There are also plans for a more major 3.4 release in May. Over time, the plan is to do major releases every six months and to try to align them with distribution release cycles by making those releases in March and September.

The biggest new feature in LibreOffice 3.3 is probably the SVG import mentioned earlier. Another is the ability to have spreadsheets with up to a million rows (the previous limit was 64K). Many of the rest seem like they will be popular with a much smaller subset of LibreOffice users, though the improved support for importing MS Works, Lotus Word Pro, and WordPerfect formats will likely strike a chord with folks who have to deal with documents in those formats.

Many of the new features listed seem like longstanding bugs (or misfeatures) of OpenOffice.org that are finally being addressed. An easier-to-use title page dialog box, a tree view for headings, better slide layout handling for Impress, radio button widgets in the menus, auto-correction that correctly matches the case of the word replaced, and so on, all seem like things that have been bugging users for some time but weren't getting addressed in the OpenOffice.org releases.

The ability to address some of these warts is part of why LibreOffice exists. The large number of patches that were carried along by the Go-oo project was not going to be a sustainable model, and the development style of the OpenOffice.org project made it unable, or unwilling, to incorporate many of those kinds of changes, at least quickly. The LibreOffice developers have clearly learned from that experience and are trying to fix these kinds of things as quickly as reasonable patches are submitted.

One of the goals of the LibreOffice project is to be welcoming to new developers and their patches. That's part of the reason that there is no contributor agreement required for LibreOffice patches. But the welcoming approach goes beyond that. The now semi-famous list of "easy hacks", which serves as an easy introduction for developers (and others), is a perfect example. Many projects would probably find it easier to get people involved by maintaining a similar list.

There is also an active development mailing list, with discussions about all kinds of patches, bugs, and features. There are other mailing lists for users, design, marketing, documentation, and so on, along with an active #documentfoundation IRC channel on Freenode.

Some friction is to be expected in the formative stages of a new project and LibreOffice is not immune to that. The OOXML debate was one such incident. In addition, steering committee member Florian Effenberger alludes to some unhappiness in the community about the role of that committee. Project governance is by no means a solved problem, and community members will often disagree about the direction of the project and its leadership. That certainly isn't just a problem for new projects, as the current turmoil in the FFmpeg project will attest.

OpenOffice.org is still chugging along, but a look at its mailing lists suggests that there is far less enthusiasm in that community than in LibreOffice's. That may not be a good way to measure, or even estimate, a community's fervor, but it definitely seems like the wind has gone out of OpenOffice.org's sails. Oracle has an interest in continuing Oracle Open Office (formerly StarOffice)—the commercial offshoot of OpenOffice.org—development, but one has to wonder how long it will be willing to maintain an open source edition.

Because of the "corporate" development style and the contributor agreement requirements for OpenOffice.org—two of the major reasons that LibreOffice forked—it seems likely that external contributions, such as they were, will be on the decline. The two projects have the same LGPLv3 license, so code can theoretically migrate between them, but new features that go into LibreOffice may not make their way into OpenOffice.org because of the contributor agreement. That means that LibreOffice can cherry-pick features from OpenOffice.org, at least as long as the code bases don't diverge too much, while OpenOffice.org has to either forgo or reimplement them. Should LibreOffice be successful, it will provide a pretty clear object lesson on the perils of requiring contributor agreements.

Overall, the progress made by LibreOffice has been very impressive. Obviously the Go-oo project (and OpenOffice.org itself) gave the LibreOffice founders a good starting point—and a lot of lessons and experience—but that doesn't diminish what has been accomplished at all. One can only imagine the strides that will be made over the next year or two. It will no doubt be interesting to see where it goes from here.


LCA: Vint Cerf on re-engineering the Internet

By Jonathan Corbet
January 25, 2011
Vint Cerf is widely credited as one of the creators of the Internet. So, when he stood up at linux.conf.au in Brisbane to say that the net is currently in need of some "serious evolution," the attendees were more than prepared to listen. According to Vint, it is not too late to create a better Internet, despite the fact that we have missed a number of opportunities to update the net's infrastructure. Quite a few problems have been discovered over the years, but the solutions are within our reach.

His talk started back in 1969, when he was hacking on the SIGMA 7 to make it talk to the ARPAnet's first interface message processor (IMP). The net has grown a little since then; current numbers suggest that there are around 768 million connected machines - and that doesn't count the vast numbers of systems with transient connections or which are hiding behind corporate firewalls. Nearly 2 billion users have access to the net. But, Vint said, that just means that the majority of the world's population is still waiting to connect to the net.

From the beginning, the net was designed around the open architecture ideas laid down by Bob Kahn. Military requirements were at the top of the list then, so the designers of the net created a system of independent networks connected via routers with no global control. Crucially, the designers had no particular application in mind, so there are relatively few assumptions built into the net's protocols. IP packets have no understanding of what they carry; they are just hauling loads of bits around. Also important was the lack of any country-based addressing scheme. That just would not make sense in the military environment, Vint said, where it can be very difficult to get an address space allocation from a country which one is currently attacking.

The openness of the Internet was important: open source, open access, and open standards. But Vint was also convinced from an early date that commercialization of the Internet had to happen. There was no way that governments were going to pay for Internet access for all their citizens, so a commercial ecosystem had to be established to build that infrastructure.

The architecture of the network has seen some recent changes. At the top of the list is IPv6. Vint was, he said, embarrassed to be the one who decided, in 1977, that 32 bits would be more than enough for the net's addressing system. Those 32 bits are set to run out any day now, so, Vint said, if you're not doing IPv6, you should be. We're seeing the slow adoption of non-Latin domain names and the DNSSEC protocol. And, of course, there is the increasing prominence of mobile devices on the net.

One of the biggest problems which has emerged from the current Internet is security. He was "most disturbed" that many of the problems are not technical; they are a matter of suboptimal user behavior - bad passwords, for example. He'd like to see the widespread use of two-factor authentication on the net; Google is doing that internally now, and may try to support it more widely for use with Google services. The worst problems, he said, come from "dumb mistakes" like configuration errors.

So where are security problems coming from? Weak operating systems are clearly a part of the problem; Vint hoped that open-source systems would help to fix that. The biggest problem at the moment, though, is browsers. Once upon a time, browsers were simple rendering engines which posed little threat; now, though, they contain interpreters and run programs from the net. The browser, he said, has too much privilege in the system; we need a better framework in which to securely run web-based applications. Botnets are a problem, but they are really just a symptom of easily-penetrated systems. We all need to work on the search for better solutions.

Another big issue is privacy. User choices are a part of the problem here; people put information into public places without realizing that it could come back to haunt them later. Weak protection of information by third parties is also to blame, though. But, again, technology isn't the problem; it's more of a policy issue within businesses. Companies like Google and others have come into possession of a great deal of privacy-sensitive information; they need to protect it accordingly.

Beyond that, there's the increasing prevalence of "invasive devices," including cameras, devices with location sensors, and more. It is going to be increasingly difficult to protect our privacy in the future; he expressed worries that it may simply not be possible.

There was some talk about clouds. Cloud computing, he said, has a lot of appeal. But each cloud is currently isolated; we need to work on how clouds can talk to each other. Just as the Internet was created through the connection of independent networks, perhaps we need an "intercloud" (your editor's term - he did not use it) to facilitate collaboration between clouds.

Vint had a long list of other research problems which have not been solved; there was not time to talk about them all. But, he said, we have "unfinished work" to deal with. This work can be done on the existing network - we do not need to dump it and start over.

So what is this unfinished work? "Security at all levels" was at the top of the list; if we can't solve the security problem, it's hard to see that the net will be sustainable in the long run. We currently have no equivalent to the Erlang distribution to describe usage at the edges of the network, making provisioning and scaling difficult. The quality of service (and network neutrality) debate, he said, will be going on for a very long time. We need better distributed algorithms to take advantage of mixed cloud environments.

There were, he said, some architectural mistakes made which are now making things harder. When the division was made between the TCP and IP layers, it was decided that TCP would use the same addressing scheme as IP. That was seen as a clever design at the time; it eliminated the need to implement another layer of addressing at the TCP level. But it was a mistake, because it binds higher-level communications to whatever IP address was in use when the connection was initiated. There is no way to move a device to a new address without breaking all of those connections. In the designers' defense, he noted, the machines at the time, being approximately room-sized, were not particularly mobile. But he wishes they had seen mobile computing coming.

Higher-level addressing could still be fixed by separating the address used by TCP from that used by IP. Phone numbers, he said, once were tied to a specific location; now they are a high-level identifier which can be rebound as a phone moves. The same could be done for network-attached devices. Of course, there are problems to be solved - for example, it must be possible to rebind a TCP address to a new IP address in a way which does not expose users to session hijacking. This sort of high-level binding would also solve the multi-homing and multipath problems; it would be possible to route a single connection transparently through multiple ISPs.

Vint would also like to see us making better use of the net's broadcast capabilities. Broadcast makes sense for real-time video, but it could be applied in any situation where multiple users are interested in the same content - for software updates, for example. He described the use of satellites to "rain packets" to receivers; it is, he said, something which could be done today.

Authentication remains an open issue; we need better standards and some sort of internationally-recognized indicators of identity. Internet governance was on the list; he cited the debate over network censorship in Australia as an example. That sort of approach, he said, is "not very effective." He said there may be times when we (for some value of "we") decide that certain things should not be found on the net; in such situations, it is best to simply remove such materials when they are found. There is no hope in any attempt to stop the posting of undesirable material in the first place. Governance, he said, will only become more important in the future; we need to find a way to run the net which preserves its fundamental openness and freedom.

Performance: That just gets harder as the net gets bigger; it can be incredibly difficult to figure out where things are going wrong. He said that he would like a button marked "WTF" on his devices; that button could be pressed when the net isn't working to obtain a diagnosis of what the problem is. But, to do that, we need better ways of identifying performance problems on the net.

Addressing: what, he asked, should be addressable on the Internet? Currently we assign addresses to machines, but, perhaps, we should assign addresses to digital objects as well? A spreadsheet could have its own address, perhaps. One could argue that a URL is such an address, but URLs are dependent on the domain name system and can break at any time. Important objects should have locators which can last over the long term.

Along those lines, we need to think about the long-term future of complex digital objects which can only be rendered with computers. If the software which can interpret such an object goes away, the objects themselves essentially evaporate. He asked: will Windows 3000 be able to interpret a 1997 PowerPoint file? We should be thinking about how these files will remain readable over the course of thousands of years. Open source can help in this regard, but proprietary applications matter too. He suggested that there should be some way to "absorb" the intellectual property of companies which fail, making it available so that files created by that company's software remain readable. Again, Linux and open source have helped to avoid that problem, but they are not a complete solution. We need to think harder about how we will preserve our "digital stuff"; he is not sure what the solution will look like.

Wandering into more fun stuff, Vint talked a bit about the next generation of devices; a network-attached surfboard featured prominently. He talked some about the sensor network in his house, including the all-important temperature sensor which sends him a message if the temperature in his wine cellar exceeds a threshold. But he'd like more information; he knows about temperature events, or whether somebody entered the room, but there's no information about what may have happened in the cellar. So maybe it is time to put RFIDs on the bottles themselves. But that won't help him to know if a specific bottle has gotten too warm; maybe it's time to put sensors into the corks to track the state of the wine. Then he could unerringly pick out a ruined bottle whenever he had to give a bottle of wine to somebody who is unable to tell the difference.

The talk concluded with some discussion of the interplanetary network. There was some amusing talk of alien porn and oversized ovipositors, but the real problem is one of arranging for network communications within the solar system. The speed of light is too slow, meaning that the one-way latency to Mars is, at a minimum, about three minutes (and usually quite a bit more). Planetary rotation can interrupt communications to specific nodes; rotation, he said, is a problem we have not yet figured out how to solve. So we need to build tolerance of delay and disruption deep into our protocols. Some thoughts on this topic have been set down in RFC 4838, but there is more to be done.

We should also, Vint said, build network nodes into every device we send out into space. Even after a device ceases to perform its primary function, it can serve as a relay for communications. Over time, we could deploy a fair amount of network infrastructure in space with little added cost. That is a future he does not expect to see in its full form, but he would be content to see its beginning.

There was a question from the audience about bufferbloat. It is, Vint said, a "huge problem" that can only be resolved by getting device manufacturers to fix their products. Ted Ts'o pointed out that LCA attendees had been advised (via a leaflet in the conference goodie bag) to increase buffering on their systems as a way of getting better network performance in Australia; Vint responded that much harm is done by people who are trying to help.


LCA: IP address exhaustion and the end of the open net

By Jonathan Corbet
January 26, 2011
Geoff Huston is the Chief Scientist at the Asia Pacific Network Information Centre. His frank linux.conf.au 2011 keynote took a rather different tack than Vint Cerf's talk did the day before. According to Geoff, Vint is "a professional optimist." Geoff was not even slightly optimistic; he sees a difficult period coming for the net; unless things happen impossibly quickly, the open net that we often take for granted may be gone forevermore.

The net, Geoff said, is based on two "accidental technologies": Unix and packet switching. Both were new at the time, and both benefited from open-source reference implementations. That openness created a network which was accessible, neutral, extensible, and commercially exploitable. As a result, proprietary protocols and systems died, and we now have a "networking monoculture" where TCP/IP dominates everything. Openness was the key: IPv4 was as mediocre as any other networking technology at that time. It won not through technical superiority, but because it was open.

But staying open can be a real problem. According to Geoff, we're about to see "another fight of titans" over the future of the net; it's not at all clear that we'll still have an open net five years from now. Useful technologies are not static, they change in response to the world around them. Uses of technologies change: nobody expected all the mobile networking users that we now have; otherwise we wouldn't be in the situation we're in where, among other things, "TCP over wireless is crap."

There are many challenges coming. Network neutrality will be a big fight, especially in the US. We're seeing more next-generation networks based around proprietary technologies. Mobile services tend to be based on patent-encumbered, closed applications. Attempts to bundle multiple types of services - phone, television, Internet, etc. - are pushing providers toward closed models.

The real problem

But the biggest single challenge by far is the simple fact that we are out of IP addresses. There were 190 million IP addresses allocated in 2009, and 249 million allocated in 2010. There are very few addresses left at this time: IANA will run out of IPv4 addresses in early February, and the regional authorities will start running out in July. The game is over. Without open addressing, we don't have an open network that anybody can join. That, he said, is "a bit of a bummer."

This problem was foreseen back in 1990; in response, a nice plan - IPv6 - was developed to ensure that we would never run out of network addresses. That plan assumed that the transition to IPv6 would be well underway by the time that IPv4 addresses were exhausted. Now that we're at that point, how is that plan going? Badly: currently 0.3% of the systems on the net are running IPv6. So, Geoff said, we're now in a position where we have to do a full transition to IPv6 in about seven months - is that feasible?

To make that transition, we'll have to do more than assign IPv6 addresses to systems. This technology will have to be deployed across something like 1.8 billion people, hundreds of millions of routers, and more. There's lots of fun system administration work to be done; think about all of the firewall configuration scripts which need to be rewritten. Geoff's question to the audience was clear: "you've got 200 days to get this done - what are you doing here??"

Even if the transition can be done in time, there's another little problem: the user experience of IPv6 is poor. It's slow, and often unreliable. Are we really going to go through 200 days of panic to get to a situation which is, from a user point of view, worse than what we have now? Geoff concludes that IPv6 is simply not the answer in that time frame - that transition is not going to happen. So what will we do instead?

One commonly-suggested approach is to make much heavier use of network address translation (NAT) in routers. A network sitting behind a NAT router does not have globally-visible addresses; hiding large parts of the net behind multiple layers of NAT can thus reduce the pressure on the address space. But it's not quite that simple.

Currently, NAT routers are an externalized cost for Internet service providers; they are run by customers and ISPs need not worry about them. Adding more layers of NAT will force ISPs to install those routers. And, Geoff said, we're not talking about little NAT routers - they have to be really big NAT routers which cannot fail. They will not be cheap. Even then, there are problems: multiple levels of NAT will break applications which have been carefully crafted to work around a single NAT router. How NAT routers will play together is unclear - the IETF refused to standardize NAT, so every NAT implementation is creative in its own way.

It gets worse: adding more layers of NAT will break the net in fundamental ways. Every connection through a NAT router requires a port on that router; a single web browser can open several connections in an attempt to speed page loading. A large NAT router will have to handle large numbers of connections simultaneously, to the point that it will run out of port numbers. Port numbers are only 16 bits, after all. So ISPs are going to have to think about how many ports they will make available to each customer; that number will converge toward "one" as the pressure grows. Our aperture to the net, Geoff said, is shrinking.
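
A quick back-of-the-envelope calculation, with invented sharing ratios, shows how quickly a shared address's ports disappear:

    TOTAL_PORTS = 2 ** 16            # the 16-bit port field
    RESERVED = 1024                  # well-known ports a carrier NAT would avoid
    usable = TOTAL_PORTS - RESERVED

    customers_per_address = 512      # assumption: how widely one IPv4 address is shared
    ports_per_customer = usable // customers_per_address    # -> 126

    connections_per_page = 30        # assumption: parallel fetches by one busy page
    pages_at_once = ports_per_customer // connections_per_page
    print(ports_per_customer, pages_at_once)   # 126 ports, roughly 4 busy pages at a time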

So perhaps we're back to IPv6, somehow. But there is no compatibility between IPv4 and IPv6, so systems will have to run both protocols during the transition. The transition plan, after all, assumed that it would be completed before IPv4 addresses ran out. But that plan did not work; it was, Geoff said, "economic rubbish." But we're going to have to live with the consequences, which include running dual stacks for a transition period that, he thinks, could easily take ten years.

During that time, we're going to have to figure out how to make our existing IPv4 addresses last longer. Those addresses, he said, are going to become quite a bit more expensive. There will be much more use of NAT, and, perhaps, better use of current private addresses. Rationing policies will be put into place, and governmental regulation may well come into play. And, meanwhile, we know very little about the future we're heading into. TCP/IP is a monoculture, there is nothing to replace it. We don't know how long the transition will take, we don't know who the winners and losers will be, and we don't know the cost. We live in interesting times.

An end to openness

Geoff worried that, in the end, we may never get to the point where we have a new, IPv6 network with the same degree of openness we have now. Instead, we may be heading toward a world where we have privatized large parts of the address space. The problem is this: the companies which have lost the most as the result of the explosion of the Internet - the carriers - are now the companies which are expected to fund and implement the transition to IPv6. They are the ones who have to make the investment to bring this future around; will they really spend their money to make their future worse? These companies have no motivation to create a new, open network.

So what about the companies which have benefited from the open net: companies like Google, Amazon, and eBay? They are not going to rescue us either for one simple reason: they are now incumbents. They have no incentive to spend money which will serve mainly to enable more competitors. They are in a position where they can pay whatever it takes to get the address space they need; a high cost to be on the net is, for them, a welcome barrier to entry that will keep competition down. We should not expect help from that direction.

So perhaps it is the consumers who have to pay for this transition. But Geoff did not see that as being realistic either. Who is going to pay $20/month more for a dual-stack network which works worse than the one they have now? If one ISP attempts to impose such a charge, customers will flee to a competitor which does not. Consumers will not fund the change.

So the future looks dark. The longer we toy with disaster, Geoff said, the more likely it is that the real loser will be openness. It is not at all obvious that we'll continue to have an open net in the future. He doesn't like that scenario; it is, he said, the worst possible outcome. We all have to get out there, get moving, and fix this problem.

One of the questions asked was: what can we do? His answer was that we really need to make better software: "IPv6 sucks" currently. Whenever IPv6 is used, performance goes down the drain. Nobody has yet done the work to make the implementations truly robust and fast. As a result, even systems which are capable of speaking both protocols will try to use IPv4 first; otherwise, the user experience is terrible. Until we fix that problem, it's hard to see how the transition can go ahead.


Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Security: Do Not Track; New vulnerabilities in awstats, OpenOffice.org, tomcat, wordpress,...
  • Kernel: BKL; LCA: Server power management; Concurrent code and expensive instructions; SCSI targets
  • Distributions: Fedora Goals Coming into Focus; Debian derivatives census, EPEL 6, Foresight, cross-distro collaboration, ...
  • Development: Correlating log messages with syslog-ng; KDE, LibreOffice, OpenOffice.org, OpenSSH,...
  • Announcements: GSoC 2011, OSU OSL Supercell, interviews, Android vs. Oracle, Sony v. Hotz, ...

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds