Unhosted web applications: a new approach to freeing SaaS

January 26, 2011

This article was contributed by Nathan Willis

Free software advocates have been pushing hard against the growing trend of commercial Software-as-a-Service (SaaS) — and the resulting loss of autonomy and software freedom — for several years now. A new project named Unhosted takes a different approach to the issue than that used by better-known examples like Diaspora and StatusNet. Unhosted is building a framework in which all of a web application's code is run on the client-side, and users have the freedom to choose any remote data storage location they like. The storage nodes use strong encryption, and because they are decoupled from the application provider, users always have the freedom to switch between them or to shut off their accounts entirely.

The Unhosted approach

An outline of the service model envisioned by Unhosted can be found on the project's Manifesto page, written by founder Michiel de Jong. "A hosted website provides two things: processing and storage. An unhosted website only hosts its source code (or even just a bootloader for it). Processing is done in the browser, with ajax against encrypted cloud storage."

In other words, the manifesto continues, despite the availability of the Affero GPL (AGPL), which requires making source code available to network end-users, licensing alone is not enough to preserve user freedom because proprietary SaaS sites require users to upload their data to "walled silos" run by the service provider. An Unhosted application is a JavaScript program that runs in the browser, but accesses online storage on a compliant storage node. It does not matter to the application whether the storage node is run by the application provider, the user, or a third party.

Storage nodes are essentially commodity infrastructure, but in order to preserve user freedom, Unhosted requires that applications encrypt and sign the data they store. The project defines an application-layer protocol called Unhosted JSON Juggling Protocol (UJJP, sometimes referred to as UJ) for applications to communicate with storage nodes, for requesting and exchanging objects in JavaScript Object Notation (JSON) format.

As the FAQ explains, this constitutes a distinctly different model than most other free software SaaS projects. Most (like StatusNet and Diaspora) focus on federation, which allows each user to run his or her own node, and requires no centralized authority linking all of the user accounts. The down side of the federated systems is that they may still require the users to entrust their data to a remote server.

Eben Moglen's FreedomBox, on the other hand, focuses on putting the storage under the direct control of the user (specifically, stored at home on a self-managed box). This is a greater degree of freedom, but home-hosting is less accessible from the Internet at large than most web services because it often depends on Dynamic DNS. Home-hosting is also vulnerable to limited upstream bandwidth and common ISP restrictions on running servers.

Unhosted, therefore, attempts to preserve the "accessible anywhere" nicety of popular Web 2.0 services, but de-link the application from the siloed data.

Connecting applications to storage

Obviously, writing front-end applications entirely in HTML5 and JavaScript is not a new idea. The secret sauce of Unhosted is the connection method that links the application to the remote storage node or, more precisely, to any user-defined storage node. The system relies on Cross-Origin Resource Sharing (CORS), a W3C Working Draft mechanism by which a server can opt in to making its resources available to requests originating from other origins, such as pages served by a different site.
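The server-side opt-in can be sketched in a few lines of Python. This is only an illustration, not code from the Unhosted demo: the allowlist, the example hosts, and the function name are all hypothetical. Only the Origin request header and the Access-Control-Allow-Origin response header come from CORS itself.

```python
# Hypothetical allowlist: origins of the web applications this node trusts.
ALLOWED_ORIGINS = {"https://mail.example.org", "https://blog.example.org"}


def cors_headers(request_headers: dict) -> dict:
    """Return the CORS response headers for a request, or {} to refuse."""
    origin = request_headers.get("Origin")
    if origin not in ALLOWED_ORIGINS:
        # Without Access-Control-Allow-Origin, the browser will not let
        # the calling script read the response.
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        # Tell caches that the response depends on the requesting origin.
        "Vary": "Origin",
    }
```

Echoing back only an allowlisted origin, rather than answering every request with a wildcard, is what lets a node keep its own policy about which applications it will serve.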

In the canonical "web mail" example, the Unhosted storage node sees a cross-origin request from the webmail application, checks the source, user credentials, and request type against its access control list, and returns the requested data only if the request is deemed valid. UJJP defines the operations an application can perform on the storage node, including creating a new data store, setting and retrieving key-value pairs, importing and exporting data sets, and completely deleting a data store.
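As a rough illustration of that operation set, the following Python sketch models a storage node in memory. It does not follow the UJJP wire format: the class, its method names, and the (user, origin) access control list are assumptions made for the example, and credential checking is left out entirely.

```python
import json


class StorageNode:
    """Toy in-memory storage node; all names here are illustrative only."""

    def __init__(self):
        self._stores = {}  # (user, app_origin) -> {key: value}
        self._acl = set()  # (user, app_origin) pairs allowed to connect

    def create_store(self, user, origin):
        self._acl.add((user, origin))
        self._stores[(user, origin)] = {}

    def _store(self, user, origin):
        # Every operation passes through the access control check.
        if (user, origin) not in self._acl:
            raise PermissionError(f"{origin} may not access data for {user}")
        return self._stores[(user, origin)]

    def set(self, user, origin, key, value):
        self._store(user, origin)[key] = value

    def get(self, user, origin, key):
        return self._store(user, origin).get(key)

    def export_store(self, user, origin):
        return json.dumps(self._store(user, origin), sort_keys=True)

    def import_store(self, user, origin, blob):
        self._store(user, origin).update(json.loads(blob))

    def delete_store(self, user, origin):
        self._store(user, origin)  # permission check only
        del self._stores[(user, origin)]
        self._acl.discard((user, origin))
```

Keying each store on both the user and the application origin mirrors the idea that an application sees only its own data store, not the user's entire storage space.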

Security-wise, each application only has access to its own data store, not the user's entire storage space, and CORS does allow each storage node to determine a policy about which origins it will respond to. But beyond that, the system also relies on the fact that the user has access to all of the application source code, because it runs in the browser. Thus it is up to the user to notice if the application does something sinister like relay user credentials to an untrusted third party. Dealing with potentially obfuscated JavaScript may be problematic for users, but it is still an improvement over server-side processing, which happens entirely out of sight.

Finally, each application needs a way to discover which storage node a user account is associated with, preferably without prompting the user for the information every time. The current Unhosted project demo code relies on Webfinger-based service discovery, which uniquely associates a user account with an email address. The user would log in to the application with an email address, the application would query the address's Webfinger identity to retrieve a JSON-formatted array of Unhosted resource identifiers, and connect to the appropriate one to find the account's data store.
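The lookup described above might look roughly like the following Python sketch. It rests on several assumptions: it uses the query URL and JSON descriptor layout of the later WebFinger standard (RFC 7033), which postdates the demo code, and the link relation used to mark an Unhosted storage node is a made-up placeholder.

```python
import json
from urllib.parse import urlencode

# Hypothetical link relation; the identifier Unhosted really uses may differ.
UNHOSTED_REL = "http://example.org/rel/unhosted-storage"


def webfinger_url(email: str) -> str:
    """Build an RFC 7033 style query URL for the account behind an address."""
    host = email.rsplit("@", 1)[1]
    query = urlencode({"resource": f"acct:{email}"})
    return f"https://{host}/.well-known/webfinger?{query}"


def storage_nodes(descriptor_json: str) -> list:
    """Extract storage node URLs from a JSON resource descriptor."""
    descriptor = json.loads(descriptor_json)
    return [
        link["href"]
        for link in descriptor.get("links", [])
        if link.get("rel") == UNHOSTED_REL and "href" in link
    ]
```

An application would fetch the URL returned by webfinger_url(), pass the response body to storage_nodes(), and connect to one of the results.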

This is not a perfect solution, however, because it depends on the email service provider supporting Webfinger. Other proposed mechanisms exist, including using Jabber IDs and Freedentity.

The tricky bits

Currently, one of the biggest sticking points in the system is protecting the stored data without making the system arduous for end users. The present model relies on RSA encryption and signing for all data stores. Although the project claims this is virtually transparent for users, it gets more difficult when one Unhosted application user wishes to send a message to another user. Because the other user is on a different storage node, that user's public key needs to be retrieved in order to encrypt the message. But the system cannot blindly trust any remote storage node to authoritatively verify the other user's identity — that would be trivial to hijack. In response, the Unhosted developers are working on a "fabric-based public key infrastructure" that enables users to deterministically traverse through a web-of-trust from one user ID to another. Details on that part of the system are still forthcoming.
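To make the key handling concrete, here is a toy Python sketch of encrypting for a recipient and signing as the sender. It uses textbook RSA with tiny primes and no padding, so it is insecure and bears no relation to the actual unhosted.js code. The function names are invented, signing the plaintext rather than the ciphertext is just one possible ordering, and the sketch assumes the recipient's public key has already been obtained from a trusted source, which is exactly the hard part described above.

```python
import hashlib


def make_keypair(p: int, q: int, e: int):
    """Textbook RSA from two small primes. Toy sizes only, NOT secure."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse (Python 3.8+)
    return (n, e), (n, d)  # (public key, private key)


def digest(data: bytes, n: int) -> int:
    """Reduce a SHA-256 hash into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def seal(message: bytes, sender_private, recipient_public):
    """Encrypt byte by byte for the recipient, then sign as the sender."""
    n_r, e_r = recipient_public
    n_s, d_s = sender_private
    ciphertext = [pow(b, e_r, n_r) for b in message]
    signature = pow(digest(message, n_s), d_s, n_s)
    return ciphertext, signature


def open_sealed(ciphertext, signature, recipient_private, sender_public):
    """Decrypt with the recipient's key and check the sender's signature."""
    n_r, d_r = recipient_private
    n_s, e_s = sender_public
    message = bytes(pow(c, d_r, n_r) for c in ciphertext)
    if pow(signature, e_s, n_s) != digest(message, n_s):
        raise ValueError("signature does not match sender's public key")
    return message
```

The point of the sketch is the direction of the keys: the sender needs the recipient's public key to encrypt, and the recipient needs the sender's public key to verify, so both lookups have to be trustworthy.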

It is also an open question as to what sort of storage engine makes a suitable base for an Unhosted storage node. The demo code includes servers written in PHP, Perl, and Python that all run on top of standard HTTP web servers. On the mailing list, others have discussed a simple way to implement Unhosted storage on top of WebDAV, but there is no reason that a storage node could not be implemented on top of a distributed filesystem like Tahoe, or a decentralized network like BitTorrent.

Perhaps the most fundamental obstacle facing Unhosted is that it eschews server-side processing altogether. Consequently, no processing can take place while the user is logged out of the application. Logged out could simply mean that the page or tab is closed, or an application could provide a logout mechanism that disconnects from the storage node, but continues to perform other functions. This is fine for interactive or message-based applications like instant messaging, but it limits the type of application that can be fit into the Unhosted mold. Judging by the mailing list, the project members have been exploring queuing up operations on the storage node side, which could enable more asynchronous functionality, but Unhosted is still not a replacement for every type of SaaS.
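The queuing idea can be illustrated with a small Python sketch. It is a guess at the general shape rather than anything taken from the project: the class and method names are hypothetical, and a real node would persist the queue and authenticate both parties.

```python
from collections import defaultdict, deque


class QueueingNode:
    """Toy storage node that holds messages until the recipient logs in."""

    def __init__(self):
        self._pending = defaultdict(deque)  # recipient -> queued messages

    def deliver(self, recipient: str, message: dict) -> None:
        # Senders can enqueue at any time; no recipient code runs here.
        self._pending[recipient].append(message)

    def drain(self, recipient: str) -> list:
        # Called by the recipient's application on login; the actual
        # processing then happens client-side, in the browser.
        queue = self._pending.pop(recipient, deque())
        return list(queue)
```

The node only stores and hands over messages in arrival order. Anything resembling processing still waits until the recipient's browser is running the application.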

Actual code and holiday bake-offs

The project has a GitHub repository, which is home to some demonstration code showing off both parts of the Unhosted platform, although it loudly warns users that it is not meant for production use. The "cloudside" directory includes an example Unhosted storage node implementation, while the "wappside" directory includes three example applications designed to communicate with the storage node.

The storage node module speaks CORS and is written in PHP with a MySQL back-end. It does not contain any server-side user authentication, so it should not be deployed outside the local area network, but it works as a sample back-end for the example applications.

The example application set includes a JavaScript library named unhosted.js that incorporates RSA data signing and signature verification, encryption and decryption, and AJAX communication with the CORS storage node. There is a separate RSA key generation Web utility provided as a convenience, but it is not integrated into the example applications.

The example named "wappblog" is a simple blog-updating application. It creates a client-side post editor that updates the contents of an HTML file on a storage node, which is then retrieved for reading by a separate page. The "wappmail" application is a simple web mail application, which requires you to set up multiple user accounts, but shows off the ability to queue operations — incoming messages are stored and processed when each user logs in.

The third example is an address book, which demonstrates the fabric-based PKI system (although the documentation warns "it's so new that even I don't really understand how it works, and it's mainly there for people who are interested in the geeky details").

A more practical set of example applications comes from the third-party projects written for Unhosted's "Hacky Holidays" competition in December. The winning entry was Nathan Rugg's Scrapbook, which allows users to manipulate text and images on an HTML canvas, and shows how an Unhosted storage node can be used to store more than just plain text. Second place was shared between the instant messenger NezAIM and the note-taking application Notes.

The fourth entry, vCards, was deemed an honorable mention, although it used some client-side security techniques that would not work in a distributed environment in the real world (such as creating access control lists on the client side). The author of vCards was commended by the team for pushing the envelope of the protocol, though — he was one of the first to experiment with queuing operations so that one Unhosted application could pass messages to another.

Hackers wanted

At this stage, Unhosted is still primarily a proof-of-concept. The storage node code is very young, and has not been subjected to much real-world stress testing or security review. The developers are seeking input for the next (0.3) revision of UJJP, in which they hope to define better access control mechanisms for storage nodes (in part to enable inter-application communication) as well as a REST API.

On a bad day, I see "unresponsive script" warnings in Firefox and think rich client-side JavaScript applications sound like a terrible idea, but perhaps that is missing the bigger picture. StatusNet, Diaspora, and the other federated web services all do a good job of freeing users from reliance on one proprietary application vendor — but none of them are designed to make the storage underneath a flexible, replaceable commodity. One of the Unhosted project's liveliest metaphors for its storage de-coupling design is that it provides "a grease layer" between the hosted software and the servers that host it. That is an original idea, whether the top layer is written in JavaScript, or not.


Unhosted web applications: a new approach to freeing SaaS

Posted Jan 27, 2011 3:11 UTC (Thu) by karim (subscriber, #114) [Link] (3 responses)

This might be useful for web apps that try to replace desktop equivalents. A FLOSS version of Zoho perhaps. But the moment my data acquires value by intermingling with others' data and/or when federation is indeed required, this scheme is of little to no value. And most of the popular "Web 2.0" apps are of that kind. Even apps like Zoho tend to have a collaboration angle ... which requires federation.

Unhosted web applications: a new approach to freeing SaaS

Posted Jan 27, 2011 6:20 UTC (Thu) by lambda (subscriber, #40735) [Link]

These apps should be able to use both models. They can run entirely on the client side, talking only to encrypted data stores on the server; or they can talk to both traditional web apps/web services, as well as your own encrypted storage. Just like with email; you rely on your service provider to be online for you, accepting and queueing mail, but once you download it, you control the data, the program that works with it, and what you do with it.

What I'd really like to see is a good permission system for giving third-party services access to some of your data, in a reasonably fine-grained and revocable fashion. That way, if you wanted to use a service that acquires value by being shared with others, you could do so (say, a spam filtering service that learns to distinguish spam from ham on a large corpus), but limit it to read-only access of some of your mail folders, plus the ability to move mail to a spam folder. If you decide later on that it's no longer helpful, or you no longer trust it, you can revoke its access token, and it can no longer see any of your data.

Unhosted web applications: a new approach to freeing SaaS

Posted Jan 27, 2011 9:44 UTC (Thu) by mjthayer (guest, #39183) [Link]

> But the moment my data acquires value by intermingling with others' data and/or when federation is indeed required, this scheme is of little to no value.

Just a thought - what about access permissions on the storage?

Unhosted web applications: a new approach to freeing SaaS

Posted Feb 2, 2011 21:38 UTC (Wed) by mich-unhosted (guest, #72652) [Link]

Intermingling with other people's data is fully possible. For instance, I may store a list of friends, with URLs for their unhosted storage and public keys to check their signatures, and I would compose my news feed by connecting to each friend's unhosted storage node. The web2.0 aspect is perfectly possible. What is hard is search. Instead of centralized search, you could implement some form of social search in apps. Or make all users together carry a DHT.

The goal is bringing free software to the web. Free software seems to be stuck in installed software (be it installed on your PC or installed on a server), and that's a shame, because a lot of software that people use nowadays is hosted. Everything we have achieved is becoming almost meaningless if you use a free kernel and free libraries and free drivers and a free browser, to access a non-free facebook, or a non-free google, to find content on a non-free youtube, or a non-free google docs.

Unhosted web applications: a new approach to freeing SaaS

Posted Jan 27, 2011 9:46 UTC (Thu) by mjthayer (guest, #39183) [Link]

I don't know whether something like this will take off on a large scale, but I am beginning to see the quest for free webservice-style applications producing all sorts of interesting and useful ideas.

Unhosted web applications: a new approach to freeing SaaS

Posted Jan 27, 2011 11:05 UTC (Thu) by nelljerram (subscriber, #12005) [Link]

Nice and useful article, thanks.

Unhosted web applications: a new approach to freeing SaaS

Posted Jan 30, 2011 16:29 UTC (Sun) by djc (subscriber, #56880) [Link]

See also couchapp, a framework for writing apps served from CouchDB. While tied to CouchDB (like) storage, writing all of the app in JavaScript is becoming more interesting as the browsers grow more capable.

Unhosted web applications: a new approach to freeing SaaS

Posted Feb 4, 2011 8:58 UTC (Fri) by oldtomas (guest, #72579) [Link] (2 responses)

I find many ideas in this proposal intriguing and good.

Still, I'm very concerned about the recent trend to funnel everything through HTTP, JSON/XML and the browser[1]. I think it's the wrong direction.

There are quite successful models of signed, content-addressable storage around (peer-to-peer networks, Git). There are many other protocols on top of TCP/IP worth considering. And there is a working key distribution model around (PGP, web of trust). Why re-invent everything?

Personally, I am not thrilled by the prospect of having the fat, bloated Web browser as my "second-layer" operating system.

--------
[1] Of course, Google *likes* this trend :-)

Unhosted web applications: a new approach to freeing SaaS

Posted Feb 7, 2011 23:54 UTC (Mon) by mich-unhosted (guest, #72652) [Link] (1 responses)

You have quite a valid point there. Because of the restrictions of current in-browser programming, we should use installable software whenever possible. An unhosted calendar app should never be an alternative to an installed calendar app; only to a /hosted/ calendar app.

The (sad?) truth is that end users seem to like web apps. The functional restrictions on in-browser apps make them more distributable ("installation" without root access), and that is apparently what end users value. See also http://rww.to/idBbxI about this trade-off.

Programming for the browser limits us to using JavaScript, doing communication over http, having a window (or tab) open for each app that runs, etc. This is far from ideal, you're absolutely right.

But I'm not saying this undesirable situation cannot be fixed in the long run. We hope the unhosted web will be a temporary hack, a stepping stone on the road back to more software freedom. We try to overhaul the concepts of "website", "web app" and "SaaS". If we succeed, then the next step will be overhauling the concept of "browser".

Unhosted web applications: a new approach to freeing SaaS

Posted Feb 8, 2011 11:00 UTC (Tue) by quotemstr (subscriber, #45331) [Link]

> See also http://rww.to/idBbxI about this trade-off.

Short URLs are a scourge. Please don't use them when a normal URL would work fine, such as on LWN.