Leading items

Ruminations on the "cloud problem"

By Jake Edge
January 12, 2011

While the term "cloud" is fairly amorphous, the central idea of ubiquitous access to one's data and applications from anywhere—on all kinds of devices—is clearly attractive. But, at least as currently implemented, cloud computing is worrisome to free software advocates and anyone even a little concerned about their privacy. There are, of course, various free software efforts to offer alternatives to some of the more popular cloud services but, by and large, they are just getting started and have seen little to no adoption. The beginning of a new year is a good time to consider what cloud services might look like if—when—they become more freedom and privacy oriented.

What kind of services are we talking about here? There are the obvious candidates like social networking and email (largely dominated by Facebook, Twitter, Gmail, and the like), but there are other possibilities as well. Easily accessible, and affordable, cloud data storage would allow users to access documents, music, ebooks, movies, pictures, and so on from any internet-connected device. Sharing and collaborating using that information with friends, relatives, and colleagues should be straightforward as well.

Much of this infrastructure already exists in various "walled gardens" that seek to lock users into their service to the exclusion of others. Facebook doesn't make it easy (or even possible in some cases) to remove one's data from their clutches, nor to collect information from "your" social network. That has good and bad points, of course, as most would probably prefer that their email address not get collected by a spammer posing as a "friend". Google has made some efforts to make it easier for users to get their data out, in particular the Data Liberation Front, but most cloud application providers are trying their best to keep users locked in.

Not storing unencrypted personal data on the servers of the cloud application providers is the only really foolproof method of retaining control over that data. The model envisioned by the Diaspora project is interesting because the data stays on the user's server (or one under his control). The Diaspora application then facilitates sharing that data with various subsets of the user's "friends". If Grandpa posts a link to embarrassing photos stored in Diaspora to another service, the user can easily remove access because that data stays under his control. Nothing can really protect against Grandpa (or someone else with access) actually posting the photos elsewhere, rather than just a link; one must be able to trust the people that they give access to.

But it is more than just photos and snarky status updates. Email is another obvious candidate for cloud storage. Many folks use Gmail or other services, but there are privacy implications even if Google makes it relatively easy to pull email out of its system. Governments have seemed able to access information in email accounts easily, sometimes without even the nicety of a subpoena or other legal document. Employees of those services are likely to be able to access the messages stored in email accounts as well.

Beyond that, how about text documents, spreadsheets, favorite applications, desktop settings, browser bookmarks, Gnucash or Quicken data files, and so on? For the most part, those currently live in users' desktop home directories, with semi-synchronized versions living on laptops, GoogleDocs (and the like), and on smartphones, if they are available elsewhere at all. Firefox and other browsers have ways to sync browser data (bookmarks and settings) between multiple browser instances, but why should those settings be treated any differently than Thunderbird preferences, or GNOME/KDE settings? Will there need to be a distinct mechanism to sync each and every different application?

It would be nice to believe that some day there will be ways to securely store this kind of data "in the cloud", such that only the owner of the data has the keys to decrypt it. There are already existing services, like Dropbox or SparkleShare, where users can store data, encrypted or not, but they lack an access layer that handles the encryption cleanly. Users must be able to access and share the encrypted data without turning over the keys to the storage provider. The technical challenges of that aren't massively difficult on the cryptographic side, as the Tahoe secure filesystem shows, but there are still a number of other hurdles to overcome.
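To make the "only the owner has the keys" idea concrete, here is a minimal sketch of encrypting data on the user's machine before it is handed to a storage provider. It is not how Tahoe or any of the services mentioned above work; the function names (`derive_keys`, `seal`, `open_sealed`) are hypothetical, and the cipher is a teaching construction built from the Python standard library (HMAC-SHA256 used as a keystream generator, with a separate HMAC tag for integrity). A real tool would use a vetted encryption library instead.

```python
import hashlib
import hmac
import os


def derive_keys(passphrase: str, salt: bytes) -> tuple[bytes, bytes]:
    """Stretch a passphrase into separate encryption and MAC keys."""
    material = hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode(), salt, 200_000, dklen=64
    )
    return material[:32], material[32:]


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Illustrative keystream: HMAC-SHA256 over nonce plus a counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def seal(passphrase: str, plaintext: bytes) -> bytes:
    """Return an opaque blob; only this blob would leave the user's machine."""
    salt, nonce = os.urandom(16), os.urandom(16)
    enc_key, mac_key = derive_keys(passphrase, salt)
    stream = _keystream(enc_key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(mac_key, salt + nonce + body, hashlib.sha256).digest()
    return salt + nonce + body + tag


def open_sealed(passphrase: str, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; fails on a wrong passphrase or tampering."""
    salt, nonce, body, tag = blob[:16], blob[16:32], blob[32:-32], blob[-32:]
    enc_key, mac_key = derive_keys(passphrase, salt)
    expected = hmac.new(mac_key, salt + nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrong passphrase or tampered data")
    stream = _keystream(enc_key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

In this arrangement the provider stores only what `seal()` returns and never sees the passphrase. Sharing with other people is the harder part, since it requires some way to get keys to them without involving the provider, and this sketch does not attempt it.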

In order for there to be any reasonable level of adoption by the general public, any kind of cloud server solution will have to be easy to use. Coming up with a way to tie together the storage for disparate objects like email, settings/preferences, documents, and so forth will be challenging enough without making wholesale changes to the applications themselves. But any sensible solution also needs to account for the possibility that users will want/need to access that data when the internet is not available. Changes would then need to be synced at some later point.
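The offline case can be sketched in a few lines. The example below assumes the simplest possible policy: each device queues its edits while disconnected, and at sync time the edit with the newer timestamp wins for each key. The `Replica` class and `sync()` function are hypothetical names invented for this illustration, and the server is just a dictionary standing in for whatever storage service is used.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One device's copy of the data, plus edits not yet pushed."""
    data: dict = field(default_factory=dict)     # key -> (timestamp, value)
    pending: dict = field(default_factory=dict)  # edits made since last sync

    def set(self, key, value, now):
        """Record an edit locally; works with or without a connection."""
        self.data[key] = (now, value)
        self.pending[key] = (now, value)

    def get(self, key):
        return self.data[key][1]


def sync(replica: Replica, server: dict) -> None:
    """Push queued edits, then pull; the newer timestamp wins per key."""
    for key, entry in replica.pending.items():
        if key not in server or server[key][0] < entry[0]:
            server[key] = entry
    replica.pending.clear()
    replica.data.update(server)
```

Last-writer-wins is easy to implement but silently discards the older of two conflicting edits, and it depends on device clocks agreeing with each other. That may be acceptable for a browser preference and much less so for a Gnucash file, which is part of why tying all of these kinds of data together is challenging.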

While free software applications would be relatively easy to change to support some kind of new protocol for retrieving and updating settings and the like, it might be easier to avoid that for existing applications. Instead, some kind of "wizard" could be created that understands the local storage used by various applications (both free and proprietary) and could manage the transfer and synchronization as needed. Newer applications or major updates to existing programs could, of course, take this cloud storage mechanism into account.

Another hurdle is that internet-connected servers cost money. Most users, especially those who are not particularly technically savvy, won't want to run their own server. Instead, some kind of low-cost, easy-to-use service would need to be available to provide those users a landing place for their encrypted data. Given the prevalence (and popularity) of gratis web services, it may well be that getting the general public to pay for that kind of service is difficult or impossible. If so, it will be their loss, as the current situation turns users into the product to be sold to advertisers and others, as has been noted elsewhere.

For the rest of us, perhaps, the addition of an income stream for storage providers will turn that relationship on its head, making the users into customers. Given that a system that respects privacy really won't have much in the way of useful data to sell to advertisers, since encryption will be the norm, there needs to be another way to generate income. While it certainly won't generate the enormous market valuations that companies like Facebook do, there will hopefully be enough business to support some cloud storage providers. Even users that want to run their own server may have use for a backup elsewhere, and if the service is cheap enough for a nice chunk of storage (on the order of $5/month for example), it will likely be easy to justify.

Maybe these ideas are overambitious and/or too pie-in-the-sky. Privacy is not very highly valued by most these days, so it may well be that storing one's data in the cloud will really mean that it gets stored with Google, Facebook, Apple, or others. Other than a lot of work, there are no huge technical barriers to overcome. Some kind of protocol needs to be established or adopted, some encryption key management issues need to be considered, and so on, but they aren't terribly difficult. Instead, the difficult barriers are largely social and political.

On the other hand, though, it sure would be nice to be on the road some day, open my laptop (or tablet or phone or ...), and pick up right where I left off at home, with access to the same information, settings, applications, and so on. Hopefully I won't have to wait as long for that as I've been waiting for my personal robot and flying car ...

OpenStreetMap's point of no return

By Jonathan Corbet
January 12, 2011
Back in 2008, LWN reported on the OpenStreetMap project and its plan to change the licensing of its map database. This change was controversial, to say the least, but the project as a whole appeared to be determined to press forward with it. At the beginning of 2011, the license change has not yet happened. But it has now been determined that April 1 of this year will be an important milestone date in this process. This could be interesting to watch, as the project is still not entirely sure of what it is changing to.

The new license - the Open Database License (ODBL) - is well understood. The ODBL is an attempt to stretch European-style database rights to the point where they cover the database worldwide. To that end, the ODBL is explicitly written as a contract - a crucial difference from most free licenses, which try to avoid contract law entirely. The ODBL must take this approach because the OpenStreetMap database, being primarily factual in nature, is not easily covered by copyright. A license which relied strictly upon copyright law would risk being unenforceable in much of the world.

Of course, relying on contract law has its own difficulties - contracts are only binding if everybody involved has agreed to them. Direct downloads of the database from OpenStreetMap will require a click-through agreement, but further redistribution (which is naturally allowed by the license) need not involve any such formalities. If there is ever a case in a part of the world which does not recognize database copyrights, where the defendant denies having ever agreed to the contract, the outcome could be interesting to say the least.

Be that as it may, the OpenStreetMap Foundation voted to change over to the ODBL. But a vote does not give the Foundation the right to change the license on previously-contributed data. So, before the database as a whole can move to the ODBL, the project must (1) convince all contributors to agree to a relicensing of "their" data, or (2) remove data contributed by people who are unwilling to agree. To that end, OpenStreetMap has been trying to get contributors to agree to a contributor agreement which gives the Foundation some wide-ranging rights:

Subject to Section 3 below, You hereby grant to OSMF a worldwide, royalty-free, non-exclusive, perpetual, irrevocable license to do any act that is restricted by copyright over anything within the Contents, whether in the original medium or any other. These rights explicitly include commercial use, and do not exclude any field of endeavour. These rights include, without limitation, the right to sublicense the work through multiple tiers of sublicensees.

The "Section 3" mentioned above restricts the Foundation to the use of ODBL, CC-BY-SA, or "another free and open license." This agreement has been somewhat controversial within the project for a number of reasons, starting with the mechanism by which the license could be changed (again) in the future: a vote of "active contributors" would be held. Some people seem to fear that a future, dark-side Foundation could restrict contribution for a bit, then hold a rigged election to obtain the results it wants.

The contributor agreement also restricts contributors to adding data to which they, personally, hold the copyright (if any). Much of the data going into OpenStreetMap, though, comes from governmental sources and may have its own license terms applied to it. Forgoing that data seems undesirable, so some contributors understandably complained. In response, there is now a draft update to the agreement which softens that requirement. This draft has not yet been adopted, though, and there are suggestions that further changes are in the works.

Despite the lack of an updated agreement, the OpenStreetMap board recently mandated that, after March 31, only contributors who have accepted the agreement will be allowed to make changes to the database. That clears the way for the final step: the removal of all data for which permission to relicense has not been obtained. Some contributors fear that quite a bit of data could be lost at that time.

How much data is entirely unclear. There appears to be no publicly-available information on how many contributors have accepted the agreement, or how much data can be relicensed. There seems to be confusion about what will happen to data contributed by one person (who may not have accepted the agreement) which was subsequently edited by another (who did agree) - or vice versa. People within the Foundation may have a good idea of what the consequences of the license change will be, but they don't seem to be talking much; requests for information (example) have gone unanswered. The board did say, in its December 2010 meeting minutes, that:

The board discussed the issue of data loss and expects, considering what has been seen, it will not be a showstopper at the time of the final switch.
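For readers wondering how such an estimate might be made at all, here is a small sketch. It assumes the most conservative possible rule: an object can be kept only if every contributor who ever touched it has accepted the agreement. The project has not said that this is the rule it will apply; the function name, the input layout, and the sample identifiers in the usage note are all invented for illustration.

```python
def split_by_agreement(history, accepted):
    """Partition object ids into (keepable, at_risk).

    history:  maps an object id to the list of contributors who edited it.
    accepted: set of contributors who have accepted the new agreement.

    Conservative assumption: a single non-accepting editor anywhere in an
    object's history puts the whole object at risk.
    """
    keep, at_risk = [], []
    for obj_id, editors in history.items():
        if set(editors) <= accepted:
            keep.append(obj_id)
        else:
            at_risk.append(obj_id)
    return keep, at_risk
```

Given a history of `{"way/1": ["alice"], "way/2": ["alice", "bob"]}` where only alice has accepted, `way/1` is kept and `way/2` is at risk. A more lenient rule (for example, keeping an object if its original creator agreed) would change one line and could change the totals considerably, which is why the question of mixed edit histories matters.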

In any case, that "final switch" may still be some time in the future. The April 1 deadline ensures that new data is ODBL-compatible, but it does not, itself, force a relicensing of the database. The OpenStreetMap license change page contains a lot of information about the new license and the motivation for the change, but it contains no dates for an actual changeover. So this transition, which has dragged on for some years, could continue to drag for a while yet, especially if it looks like a lot of data could be lost. The prospect of a significantly reduced map database could give strength to the loud contingent of contributors who would rather see the project just put the data into the public domain and be done with it.

The motivations which are driving the move to the ODBL are similar to those behind the use of the GPL for code; contributors do not want to see others distribute enhanced versions of their work without giving back their changes. But trying to extend the reach of copyright to data it does not naturally cover, in a project involving many thousands of contributors, is never going to be an easy thing to do. There is no clear map showing a way out of this situation.

Supporting OOXML in LibreOffice

By Jonathan Corbet
January 11, 2011
The formation of the Document Foundation and the launch of the LibreOffice project have created a user and developer community which has few parallels elsewhere in the free software world. This community is huge given the newness of the project, and it appears to include many people who have not engaged with free software development in the past. As a result, the Foundation's mailing lists sometimes host conversations that wouldn't be found in other projects. An extensive and sometimes bitter debate on whether LibreOffice should write files in the OOXML format is a good example of differing views of how this project (and free software in general) should work.

OOXML, of course, is the Microsoft-driven "standard" alternative to the ODF format. Given its sponsor and the dubious means by which it attained "standard" status, OOXML was always going to be controversial. The simple fact of the matter, though, is that, if Office writes OOXML files, then those files will proliferate. Whether we like the format or not has little influence on the final result.

Just before the end of the year, Larry Gusaas called on the LibreOffice community to refuse to support the writing of OOXML files. Standard OpenOffice.org is able to read such files, but will not write them; that is, according to Larry, how things should be. But LibreOffice is based on the Go-oo project, which is the version of OpenOffice.org which has actually been shipped by most Linux distributions. This version does have the ability to write OOXML files; thus, LibreOffice does as well.

Quite a few people supported Larry's desire for read-only OOXML support in LibreOffice; one could easily peruse the thread and come to the conclusion that the LibreOffice community is overwhelmingly opposed to the idea of writing in that format. Even so, a number of LibreOffice developers have made it clear (repeatedly) that they have no intention of removing the ability to write OOXML files. There is, thus, no need to worry that we might have to go on using Go-oo after all.

There are many reasons for LibreOffice to support this format, even if the community has to collectively hold its nose in the process. The reality of the situation is that many LibreOffice users will need to work with people who send them OOXML files and will expect to get a response in the same format. Telling collaborators that their choice of document format is unacceptable works in some situations, but a corporate employee who talks that way to a customer may soon end up with a great deal of unexpected free time. A LibreOffice which cannot write OOXML files would be unsuited to many environments, and adoption would suffer accordingly.

Beyond that, as has been pointed out in the discussion, Microsoft will, someday, phase out support for its (equally proprietary) DOC format, leaving OOXML as the only real option for document interchange. There appears to be little hope that Microsoft's ODF support will be sufficient to make ODF a viable alternative. So any office productivity suite which aspires to millions of users, and which does not support OOXML, will find itself scrambling to add that support when DOC is no longer an option. It seems better to maintain (and improve) that support now than to be rushing to merge a substandard implementation in the future.

A number of people in the LibreOffice community seem to be under the impression that free software is about fighting Microsoft. But free software is really about giving freedom to its users and developers; one of the key ways in which that freedom has been expressed since the beginning is through a high level of interoperability. Linux systems speak most protocols, and they handle most file formats of interest. That makes it possible to plug in a Linux system almost anywhere and to work with almost everybody. We should think long and hard before we walk away from that sort of freedom.

We should also think about (1) whether a project like LibreOffice really has the weight to affect document format use by withholding support for those it doesn't like, and (2) whether we as a community would want to use that power in that way if we did have it.

There is a separate message from Larry which brought out another interesting aspect of the debate:

It should be a community decision, not one made by the developers. Or based on LibreOffice being based on Go-OO code which already had OOXML write support because of the Novell agreement with Microsoft.

It is a rare free software project indeed which allows a decision of this kind to be made by anybody but the developers involved. Most such projects are those controlled by corporations which have no qualms about vetoing features which do not align with The Official Product Roadmap. Debian does allow the community to shape its distribution through its general resolution mechanism, but those who are allowed to vote on resolutions are almost exclusively developers, and they are all contributors. Few other communities even have a way by which the community as a whole could attempt to make such a decision, much less enforce it. The Document Foundation's proposed bylaws do envision a board of directors and an engineering steering committee which could address such issues, but such institutions will only override the developers in extreme situations; otherwise, they tend not to have many developers to override.

In the free software world, the people who do the work make the decisions about that work, and there are few who would seek to change that state of affairs. In this discussion, developers have not been calling for the removal of full OOXML support, and no patches to that effect have been posted. LibreOffice is shipping with that support, and that situation seems unlikely to change.

So, it seems, the LibreOffice developers have made the decision to continue to support writing the OOXML format. They are well aware that OOXML is not an ideal document format, that its attainment of "standard" status was shadowy at best, that it is another proprietary moving-target format, and that there is still some patent uncertainty surrounding it. They are aware that the much more open (though still imperfect) ODF format is preferable, and that ODF should be the default format used by LibreOffice. But they have also concluded that supporting OOXML gives more freedom and capability to their users and is good for LibreOffice in the long term.

Page editor: Jonathan Corbet

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds