Ruminations on the "cloud problem"

By Jake Edge
January 12, 2011

While the term "cloud" is fairly amorphous, the central idea of ubiquitous access to one's data and applications from anywhere, on all kinds of devices, is clearly attractive. But, at least as currently implemented, cloud computing is worrisome to free software advocates and anyone even a little concerned about their privacy. There are, of course, various free software efforts to offer alternatives to some of the more popular cloud services but, by and large, they are just getting started and have seen little to no adoption. The beginning of a new year is a good time to consider what cloud services might look like if (or when) they become more freedom- and privacy-oriented.

What kind of services are we talking about here? There are the obvious candidates like social networking and email (largely dominated by Facebook, Twitter, Gmail, and the like), but there are other possibilities as well. Easily accessible, and affordable, cloud data storage would allow users to access documents, music, ebooks, movies, pictures, and so on from any internet-connected device. Sharing and collaborating using that information with friends, relatives, and colleagues should be straightforward as well.

Much of this infrastructure already exists in various "walled gardens" that seek to lock users into their service to the exclusion of others. Facebook doesn't make it easy (or even possible in some cases) to remove one's data from their clutches, nor to collect information from "your" social network. That has good and bad points, of course, as most would probably prefer that their email address not get collected by a spammer posing as a "friend". Google has made some efforts to make it easier for users to get their data out, in particular the Data Liberation Front, but most cloud application providers are trying their best to keep users locked in.

Not storing unencrypted personal data on the servers of the cloud application providers is the only really foolproof method of retaining control over that data. The model envisioned by the Diaspora project is interesting because the data stays on the user's server (or one under his control). The Diaspora application then facilitates sharing that data with various subsets of the user's "friends". If Grandpa posts a link to embarrassing photos stored in Diaspora to another service, the user can easily remove access because that data stays under his control. Nothing can really protect against Grandpa (or someone else with access) actually posting the photos elsewhere, rather than just a link; users must be able to trust the people they give access to.

But it is more than just photos and snarky status updates. Email is another obvious candidate for cloud storage. Many folks use Gmail or other services, but there are privacy implications even if Google makes it relatively easy to pull email out of its system. Governments seem to be able to access information in email accounts easily, sometimes without even the nicety of a subpoena or other legal document. Employees of those services are likely to be able to access the messages stored in email accounts as well.

Beyond that, how about text documents, spreadsheets, favorite applications, desktop settings, browser bookmarks, Gnucash or Quicken data files, and so on? For the most part, those currently live in users' desktop home directories, with semi-synchronized versions living on laptops, Google Docs (and the like), and on smartphones, if they are available elsewhere at all. Firefox and other browsers have ways to sync browser data (bookmarks and settings) between multiple browser instances, but why should those settings be treated any differently than Thunderbird preferences, or GNOME/KDE settings? Will there need to be a distinct mechanism to sync each and every different application?

It would be nice to believe that some day there will be ways to securely store this kind of data "in the cloud", such that only the owner of the data has the keys to decrypt it. There are already existing services, like Dropbox or SparkleShare, where users can store data, encrypted or not, but they lack an access layer that handles the encryption cleanly. Users must be able to access and share the encrypted data without turning over the keys to the storage provider. The technical challenges of that aren't massively difficult on the cryptographic side, as the Tahoe secure filesystem shows, but there are still a number of other hurdles to overcome.
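As a rough illustration of keeping the keys away from the storage provider, the sketch below derives an encryption key from a passphrase on the user's own device, using PBKDF2 from the Python standard library. The function names, the iteration count, and the idea of storing only the salt next to the ciphertext are assumptions made for this example; they are not taken from Dropbox, SparkleShare, or Tahoe, and the encryption step itself is left out.

```python
import hashlib
import os


def new_salt() -> bytes:
    """Random salt; it is not secret and can be stored next to the ciphertext."""
    return os.urandom(16)


def derive_key(passphrase: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte key on the user's device (illustrative parameters).

    Only the salt and the encrypted data would ever be uploaded; the
    passphrase and the derived key never leave the client.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )


if __name__ == "__main__":
    salt = new_salt()
    key_on_laptop = derive_key("correct horse battery staple", salt)
    # A second device that knows the passphrase and fetches the salt
    # arrives at the same key without the provider ever seeing it.
    key_on_phone = derive_key("correct horse battery staple", salt)
    print(key_on_laptop == key_on_phone)
```

Sharing with other people is the harder part: it would need some way to hand a key to each recipient, which this sketch does not attempt.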

In order for there to be any reasonable level of adoption by the general public, any kind of cloud server solution will have to be easy to use. Coming up with a way to tie together the storage for disparate objects like email, settings/preferences, documents, and so forth will be challenging enough without making wholesale changes to the applications themselves. But any sensible solution also needs to account for the possibility that users will want/need to access that data when the internet is not available. Changes would then need to be synced at some later point.
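One way to picture the offline case is a local store that remembers which items changed while disconnected and pushes them later, with the remote side refusing an update that was based on a stale version. The sketch below is only that picture in code: `Remote`, `OfflineStore`, and the per-item version counter are hypothetical names and design choices for illustration, not part of any existing protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Remote:
    """Stand-in for a storage service: maps key -> (version, value)."""
    items: dict = field(default_factory=dict)

    def put(self, key, value, base_version):
        current_version, _ = self.items.get(key, (0, None))
        if current_version != base_version:
            return False  # changed elsewhere since this client last synced
        self.items[key] = (current_version + 1, value)
        return True


@dataclass
class OfflineStore:
    values: dict = field(default_factory=dict)    # key -> local value
    versions: dict = field(default_factory=dict)  # key -> last remote version seen
    pending: list = field(default_factory=list)   # keys changed while offline

    def set(self, key, value):
        """Record a change locally; works with no network at all."""
        self.values[key] = value
        if key not in self.pending:
            self.pending.append(key)

    def sync(self, remote):
        """Push pending changes; return the keys that hit a conflict."""
        conflicts = []
        for key in self.pending:
            base = self.versions.get(key, 0)
            if remote.put(key, self.values[key], base):
                self.versions[key] = base + 1
            else:
                conflicts.append(key)
        self.pending = conflicts
        return conflicts
```

Conflicting keys simply stay pending here; a real system would have to decide whether to merge, ask the user, or keep both copies.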

While free software applications would be relatively easy to change to support some kind of new protocol for retrieving and updating settings and the like, it might be easier to avoid that for existing applications. Instead, some kind of "wizard" could be created that understands the local storage used by various applications (both free and proprietary) and could manage the transfer and synchronization as needed. Newer applications or major updates to existing programs could, of course, take this cloud storage mechanism into account.
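A very small sketch of what such a "wizard" might do internally: keep a manifest of where each application stores its data, hash those files, and compare snapshots to see what needs transferring. The manifest entries below are made-up placeholder paths, and `snapshot` and `changed` are hypothetical helper names chosen for this example.

```python
import hashlib
from pathlib import Path

# Hypothetical manifest: application name -> paths relative to the home directory.
MANIFEST = {
    "browser": [".config/example-browser/bookmarks.json"],
    "finance": ["Documents/accounts.dat"],
}


def snapshot(home: Path, manifest: dict) -> dict:
    """Map each existing manifest path to the SHA-256 of its contents."""
    result = {}
    for rel_paths in manifest.values():
        for rel in rel_paths:
            path = home / rel
            if path.is_file():
                result[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return result


def changed(old: dict, new: dict) -> list:
    """Paths that are new or whose contents differ since the old snapshot."""
    return sorted(p for p in new if old.get(p) != new[p])
```

Comparing two snapshots taken at different times gives the list of files to encrypt and upload; applications that keep their state in a database or a running process would need more care than a plain file copy.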

Another hurdle is that internet-connected servers cost money. Most users, especially those who are not particularly technically savvy, won't want to run their own server. Instead, some kind of low-cost, easy-to-use service would need to be available to provide those users a landing place for their encrypted data. Given the prevalence (and popularity) of gratis web services, it may well be that getting the general public to pay for that kind of service is difficult or impossible. If so, it will be their loss, as the current situation turns users into the product to be sold to advertisers and others, as has been noted elsewhere.

For the rest of us, perhaps, the addition of an income stream for storage providers will turn that relationship on its head, making the users into customers. Given that a system that respects privacy really won't have much in the way of useful data to sell to advertisers, since encryption will be the norm, there needs to be another way to generate income. While it certainly won't generate the enormous market valuations that companies like Facebook do, there will hopefully be enough business to support some cloud storage providers. Even users that want to run their own server may have use for a backup elsewhere, and if the service is cheap enough for a nice chunk of storage (on the order of $5/month for example), it will likely be easy to justify.

Maybe these ideas are overambitious and/or too pie-in-the-sky. Privacy is not very highly valued by most these days, so it may well be that storing one's data in the cloud will really mean that it gets stored with Google, Facebook, Apple, or others. Other than a lot of work, there are no huge technical barriers to overcome. Some kind of protocol needs to be established or adopted, some encryption key management issues need to be considered, and so on, but they aren't terribly difficult. Instead, the difficult barriers are largely social and political.

On the other hand, though, it sure would be nice to be on the road some day, open my laptop (or tablet or phone or ...), and pick up right where I left off at home, with access to the same information, settings, applications, and so on. Hopefully I won't have to wait as long for that as I've been waiting for my personal robot and flying car ...


Ruminations on the "cloud problem"

Posted Jan 13, 2011 6:58 UTC (Thu) by grahame (subscriber, #5823) [Link]

Really simple hack I knocked up; you encrypt private data into an RSS feed, then serve that RSS feed unprotected and subscribe to it in Google Reader. You can then decrypt it in the browser using a greasemonkey script.
http://code.google.com/p/aesulate/

Seems to work quite well and google aren't privy to the data, even though I get to use their great feed reader.

The data is stored AES encrypted in counter mode, in hunks annotated like this {AES}[base64 AES encoded data].
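A minimal sketch of how annotated hunks like these might be located in feed text, assuming the `{AES}` marker is followed directly by a run of base64 characters (the exact layout used by aesulate is not confirmed here). The decryption itself is passed in as a caller-supplied function, since the Python standard library has no AES implementation; `find_hunks` and `replace_hunks` are hypothetical names.

```python
import base64
import re

# Assumed layout: the literal marker, then base64 text up to the next
# character that is not part of the base64 alphabet.
HUNK = re.compile(r"\{AES\}([A-Za-z0-9+/]+={0,2})")


def find_hunks(text: str) -> list:
    """Return the raw (still encrypted) bytes of every annotated hunk."""
    return [base64.b64decode(m.group(1), validate=True) for m in HUNK.finditer(text)]


def replace_hunks(text: str, decrypt) -> str:
    """Replace each hunk with decrypt(raw_bytes), which must return a str."""
    return HUNK.sub(
        lambda m: decrypt(base64.b64decode(m.group(1), validate=True)), text
    )
```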

Ruminations on the "cloud problem"

Posted Jan 13, 2011 16:13 UTC (Thu) by karim (subscriber, #114) [Link]

Simply brilliant!

Ruminations on the "cloud problem"

Posted Jan 13, 2011 7:34 UTC (Thu) by Oddscurity (guest, #46851) [Link]

What might work is alternative firmware on a router. That router might have a USB port for an external harddisk to serve up content from your personal cloud, or it could take it from a PC on the internal network.

Then some client software to install on your PCs and smartphones, point it to your fixed IP or dyndns address, select what you want synced/exported, and voila.

Later on some savvy router manufacturer might want to add this out of the box as a selling point. "Your stuff, everywhere; secure, private."

Ruminations on the "cloud problem"

Posted Jan 14, 2011 0:40 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

What might work is alternative firmware on a router. That router might have a USB port for an external harddisk to serve up content from your personal cloud, or it could take it from a PC on the internal network.

And I take it this router is a piece of hardware installed in the user's home. Why? Wouldn't it be more efficient and more reliable to make it a virtual entity managed by and physically present at Google or AT&T?

Ruminations on the "cloud problem"

Posted Jan 14, 2011 13:52 UTC (Fri) by Oddscurity (guest, #46851) [Link]

That might be easier for some people, but that road leads to vendor lock-in, loss of privacy and so on. In other words the same situation we're in already.

So yes, the whole point is to make it part of people's DSL/cable/fibre modems so that they have full control over their stuff. It's additionally very convenient as well, you get to publish things at the speed of your internal network.

If sufficiently polished, the interface to this could be a whole lot more usable and consistent than all the various disparate services now used for the same purpose. So in time the 'easier' would no longer be a bonus for Facebook, Google, et al.

I could be mistaken, but I thought the EFF was working on something along those lines under the umbrella 'Freedom Box'. I like the idea of an internet that's many to many instead of many to few, along with the additional resilience this brings to censorship (as well as earlier stated benefits).

Those who need better uptime than that which their residential service provides can always opt for a colocated or 3rd party solution, with the associated impact that might have on their privacy.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 3:49 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

I'm not sure there's really a difference in lock-in and level of control between having a file server in your house and having it in some big company's data center (and in the latter case, between having it be a chunk of hardware in that data center or just a service of a larger computer).

For engineers, for sure a machine in your house is going to give you that control. You can configure it how you like, replace hardware and software components at will and even write code. But for the masses, you're just as locked in to a vendor either way. You get a router from Linksys with this feature in it and you have to do things Linksys's way until you can suffer the investment to jump to Netgear. If Linksys abandons you, you'll probably be high and dry before long.

We can easily get privacy out of the picture. We're talking about inventing a new protocol and implementing both sides of it either way, so we can just go ahead and make the data encrypted at rest and on the wire and then it's approximately as hard for someone to get your data from the Google data center as from your house.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 4:49 UTC (Sat) by Oddscurity (guest, #46851) [Link]

I agree with you on this in principle. However, and herein lies the rub, where does this data at rest get decrypted, processed and served up encrypted again as part of the service you want to have run for yourself?

Consider all data at Google being encrypted with my private key, my public key publicly available, as well as all 'control blocks'. These would be blocks that point at individual byte ranges and encrypted with a BluRay-like tree of keys so that multiple intended recipients can decrypt it. I'm making up this terminology as I go.

Any time you want to add a piece of information, add a recipient or revoke a piece of information or access, you'll have to interact with the data in the clear for a bit before encrypting it and adding or replacing bits to that data at rest.

How would you do this on a web service without giving the provider the keys to your kingdom? As far as I'm aware there is *some* work on computing with encrypted values where the outcome would be the same as if you first decrypted the values, did the computation, then reencrypted it; last I saw this was in its infancy, in a Google Tech talk about encryption and voting systems.

So while I agree with you that many would still be beholden to a Linksys or Netgear, I don't see how you'd get to do what you'd want to do with all data at Google - even if encrypted - without having to act on it using a piece of client software that runs on a computer (or smartphone, or tablet) that you trust (before publishing the altered data again at Google).

Doesn't that partially defeat the benefit of having it hosted at a 3rd party?

If you have to run a piece of trusted software on a machine under your control anyway, there's little difference if that is your router or your PC. For those not-skilled-in-the-arts, they're going to have to rely on others to tell them whether or not said hardware box or said piece of software really has their interests at heart.

Maybe I'm overthinking this or missing something. It's entirely possible. If you have any insights on how to maintain control over your data during the window you're adding to it or editing it, I'd be happy to hear it; you've raised excellent points so far.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 17:56 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

If you have to run a piece of trusted software on a machine under your control anyway, there's little difference if that is your router or your PC. For those not-skilled-in-the-arts, they're going to have to rely on others to tell them whether or not said hardware box or said piece of software really has their interests at heart.

Somehow, you've turned the argument completely around and are explaining that having your data at Google is not more private than having it at home. What I'm saying is that it's not less private and that the reason for keeping the data at Google is that it costs less and has higher quality.

In both cases, you trust machines that are physically under your control, you trust the community of experts that tell you you're safe, and others are about equally able to get your data. Assuming you're not an engineer.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 19:26 UTC (Sat) by Oddscurity (guest, #46851) [Link]

I think we're arguing the same point.

The biggest technical hurdle, I think, is going to be the editing/updating of your information in a secure manner.

You're right that hosting the encrypted info at a 3rd party is no less secure/private, as long as you get to do the updating on a machine under your immediate control.

That was the bit I had trouble wrapping my head around -- why host it elsewhere. So as far as turning the problem upside down, that's the bit I was struggling with. If I have to have a machine I trust, I'd personally use it to host the data as well. I see what you're saying now about outsourcing the hosting.

It's unusual to think of Google as just a data host, not a provider of a webservice that consumes that data as well. Therein lay the source of my confusion: how would a service (by which I mean the data entry part) run by them still be secure?

So indeed, it pays to think of it as two disparate problems: 1) where and how to host the encrypted data - 2) where and how to consume that data and publish it.

In fact then we could publish our data anywhere at all without access controls as far as downloads go. The real access controls would be buried somewhere inside of those encrypted chunks and knowing the correct key gets you access to a control chunk that might tell you how to consume another set of chunks?

This reminds me of the way www.tarsnap.com works.

Ruminations on the "cloud problem"

Posted Jan 14, 2011 13:59 UTC (Fri) by Oddscurity (guest, #46851) [Link]

If instead you meant that the software runs at Google or AT&T, but it serves the content from your local network over a VPN solution or the like, I see no benefit either. The weakest link would still be your local internet connection and its uptime.

Hosting everything at Google or AT&T on the other hand, apart from giving you an internet connection that's more reliable, means no improvement over the current situation of giving control of your data to others. In fact you'd end up concentrating all of it at 1 vendor.

Encrypted storage

Posted Jan 13, 2011 9:00 UTC (Thu) by CSSX (subscriber, #56720) [Link]

There's wuala.com, which is a DropBox replacement with AES encryption. I haven't used it yet though so I don't know how well it actually works.

Encrypted storage

Posted Jan 13, 2011 9:40 UTC (Thu) by burki99 (subscriber, #17149) [Link]

I've been using wuala for about two years and I didn't experience any technical problems (it is a java-application smoothly integrated into the file system). But it is much harder to get other people to share files through that service than dropbox. Probably a typical network effect (Metcalfe's law). But it does its job backing up my personal files to the cloud.

Encrypted storage

Posted Jan 13, 2011 12:15 UTC (Thu) by ekj (guest, #1524) [Link]

I second this. Wuala was my first thought too. It's in the cloud, and even enables you to gain extra storage-space by "trading" space, i.e. you store some data of others, and the others will store some of yours.

Everything is AES-encrypted, and the key is not shared with anyone. (well, unless you set a folder as "public")

Ruminations on the "cloud problem"

Posted Jan 13, 2011 9:10 UTC (Thu) by michaeljt (subscriber, #39183) [Link]

I suspect that at least for the average person (the sort who doesn't deliberately use FLOSS software) this will end up as a question of which companies you trust with your data. After all, the chances that a reasonably trustworthy company does something very bad are much lower than the chances that they make a big mistake in protecting their data themselves. (Think viruses, or accidentally clicking on the wrong file or the wrong button, or whatever). And that trust is likely to be defined somewhat differently than the average lwn reader would do, probably based on stories of companies doing things with the data that they really saw as a threat to themselves. What people might actually consider a threat, beyond things that can affect their finances, I don't know.

Ruminations on the "cloud problem"

Posted Jan 13, 2011 11:38 UTC (Thu) by ortalo (subscriber, #4654) [Link]

Personally, I really don't like this "cloud" idea at all because, as you say, it goes against at least 2 different things I would really like to be able to do. And that cloud computing idea rules them out without adequate motivation from my point of view.
I would like to run a server at home. Of course, I do not want these energy hungry, noisy and administration-needing servers you find in data centers in my living room (or even in the basement). But well, something like a WiFi access point, something like an internet box, something like *that* would fit me well. (And I am playing with dd-wrt and things like that.) Even nowadays, these would be a nice place to store some of the crypto keys needed to keep control of data possibly sent elsewhere on the Internet. In fact, even with current cheap technology, it should probably be possible to have the authentication & authorization server at home.
I also would love applications to be changed: to better protect my data, but also to allow new ways of using it. We have been waiting for much too long in my opinion to actually deliver collaborative editing in office applications. We need to improve applications to do that, so we need to change them.

Nowadays of course I cannot do such things with convenient mainstream solutions. But, well, things like AFS, Coda, or even NFS, NIS, Kerberos, and original application development solutions (usually based on new languages like Mozart/Oz, Erlang, etc.) in the research community were already exploring that space a decade ago.
Maybe I am simply an aging boy, unconsciously lingering for the good old things of the past. Or maybe it is time to realize that we keep on getting the confirmation that HTTP and Web browsers/servers - even if they brought a social revolution with the emergence of the usage of computer networks by everyone - were not such a technical revolution after all.

Time to clean the webs and (re)invent something better? Dunno, but not simply a new version number involved.

PS: Such topics always remind me of the good old controversy between distributed and centralized computing btw.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 3:28 UTC (Sat) by pabs (subscriber, #43278) [Link]

You might be interested in Eben Moglen's "Freedom Box" idea, the silver lining in the cloud:

http://wiki.debian.org/FreedomBox

I can't believe our esteemed editor failed to mention it in the article.

Ruminations on the "cloud problem"

Posted Jan 18, 2011 18:02 UTC (Tue) by ortalo (subscriber, #4654) [Link]

Once again, thank you very much for the link.

The philosophy sounds very close to the one I tried to promote.

However, looking at the FreedomBox in more detail, I have to say that I would have much different specs. I did not envision a server with full services like email or tor or http, but only basic core infrastructure services:
- hold my crypto keys
- hold the identity directory (me, my children, my wife) for regular computers to use (think NIS+, kerberos)
- hold links to resources, configuration and if possible be the authorization server (think of that useful part of ActiveDirectory - but without the weight)
- monitors itself and other parts of the "home network" (ala Nagios feature-wise)
- provide some control maybe (home automation?)
- your idea here

And maybe some of these ideas would necessitate new simple protocols for exchanging data and avoiding the centralized model of linkedin, facebook, etc.

Ruminations on the "cloud problem"

Posted Jan 18, 2011 18:08 UTC (Tue) by ortalo (subscriber, #4654) [Link]

Last minute additions to the "spec":
- storage for my finance files
- some core inventory (like for the children, but for physical things owned, usually more entries... ;-)
- ...

Ruminations on the "cloud problem"

Posted Jan 13, 2011 16:23 UTC (Thu) by karim (subscriber, #114) [Link]

The problem is that free and 0.01$ are two entirely different value-propositions.

Here are some thoughts I had on this in a comment to a previous LWN article, I still think they're relevant (i.e. convenience/accessibility wins over principle/ideas any time):
http://lwn.net/Articles/420180/

I'm still unsure what other role FLOSS can play in that world.

Ruminations on the "cloud problem"

Posted Jan 14, 2011 11:12 UTC (Fri) by ortalo (subscriber, #4654) [Link]

Good idea to bring economics into the picture.

Maybe cloud computing brings in different parameters in the equation (with respect to your referenced comment).
However, and that's what you pointed out, the real value is in the data stored behind the applications.

What I find most astonishing is the fact that most people can be convinced to do the actual work of typing and giving "their" data (esp. personal) just for using these applications. Within a few years from now, maybe most people will also get bored by the task of managing (especially maintaining) all this data and will want to use solutions that allow them to have that administrative work done differently.

That could change the equation again. Another asset of free software is that it leaves/gives users a lot of control over their data and the processing done with it. (Database applications for the masses is the next big thing to do in free software?)

Ruminations on the "cloud problem"

Posted Jan 13, 2011 16:51 UTC (Thu) by gartim (subscriber, #10123) [Link]

http://www.pogoplug.com/home-en-whats-pogoplug-mypogoplug...

I've not tried this device (pogoplug), but it seems data control would be in the individual's hands. I currently run a small server and it was fun for a while but may look for a simpler solution like the pogo.

Ruminations on the "cloud problem"

Posted Jan 13, 2011 17:54 UTC (Thu) by theICEBear (subscriber, #23193) [Link]

It does look like the KDE folks have also started to think on this matter. I haven't used this, nor am I sure it is more than a bit past the prototype/idea stage: http://owncloud.org/index.php/Main_Page

Ruminations on the "cloud problem"

Posted Jan 13, 2011 19:36 UTC (Thu) by wingo (subscriber, #26929) [Link]

I still think GNU should take this on. You can't write wizards for all the experiences you don't even know you want yet -- we need a new "OS" for writing secure, distributed, autonomy- and privacy-preserving applications.

I wrote more about this here: Towards a GNU autonomous cloud.

Waiting for my flying car

Posted Jan 14, 2011 3:58 UTC (Fri) by ghane (subscriber, #1805) [Link]

Hopefully I won't have to wait as long for that as I've been waiting for my personal robot and flying car ...

I am over 40 now, and in my lifetime, I have seen us drop from the moon, and lose commercial supersonic flight, to mention just two examples.

I cite the "flying cars" as my biggest regret. I was promised, when I was a kid, that if I worked hard, stayed clean, gave up dreams of a colour TV (not available when I was a kid where I grew up), I would have flying cars.

I am still waiting. "Society" has broken its contract with me, and I am not amused.

Waiting for my flying car

Posted Jan 14, 2011 14:11 UTC (Fri) by Oddscurity (guest, #46851) [Link]

I'm still waiting for Eon's Thistledown library type direct downloading of knowledge, later blatantly ripped off by The Matrix. *

* I acknowledge the idea may indeed predate Greg Bear's Eon, but that's the book that first showed me the idea so vividly.

For that matter, where's the space elevator promised in so many words? The colony on Mars?

Waiting for my flying car

Posted Jan 14, 2011 19:27 UTC (Fri) by njs (guest, #40338) [Link]

I'm still waiting for the world peace.

Though, to be fair, I'm also still waiting for the nuclear annihilation.

Waiting for my flying car

Posted Jan 19, 2011 16:30 UTC (Wed) by nye (guest, #51576) [Link]

>I am still waiting. "Society" has broken its contract with me, and I am not amused.

On the other hand, I'm 26, and the future I was promised is pretty much here, with the last couple of miles looking pretty imminent.

It hit me a couple of years ago that I actually feel like I'm living in the future, which is a feeling I'd never really thought I'd experience.

Ruminations on the "cloud problem"

Posted Jan 15, 2011 14:51 UTC (Sat) by asherringham (subscriber, #33251) [Link]

These concerns are all valid and do weigh on my mind as well. Not just a question of "trusting" the cloud provider itself.

As a commercial entity usually, you might have to trust whoever might purchase the business down the line. Google's doing pretty well just now, but if it starts losing traction in 5-10 years and gets bought? Or what about the business that just dies and is closed down?

Unhosted is where it is at

Posted Jan 20, 2011 0:13 UTC (Thu) by frabcus (guest, #25169) [Link]

The most interesting new solution to this problem is called Unhosted.

It's a protocol that separates web application code provision from web application data storage. Encryption prevents either provider (code or data) from being able to access your data. And the whole lot helps open source compete on the web.

http://www.unhosted.org/

I would love to see lots of people who are worried about the proprietary nature of the top layer of the Internet stack (Google, Facebook...) making unhosted apps and storage nodes.

Ruminations on the "cloud problem"

Posted Jan 20, 2011 17:15 UTC (Thu) by landley (guest, #6789) [Link]

> since encryption will be the norm

Your optimism is overwhelming. The above statement's been wishfully applied to everything from email to http transactions to people's hard drives for a couple decades now.

But the simple fact is people only start caring about fire prevention after a big enough fire. They start exercising after their first heart attack (or diabetes diagnosis). In the few cases where a clear looming disaster gets averted by great effort at the last minute (Y2K comes to mind), people immediately start equating "nothing happened" with "obviously it couldn't have been such a big deal in the first place, must have been exaggerated". You only get to be a hero if you let everything fall apart and then clean up afterwards.

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds