Why should developers use CouchDB? It all comes down to
replication, says Benjamin Young. Data is "lonely," trapped in proprietary
formats and cloud silos, or remote applications that require dependable
Internet connections. When the power goes out, when you want to work on the
plane, or when you just want control over your data — Young says you
want CouchDB.
Young was speaking at the Strange Loop
2011 conference on Tuesday, September 20 in St. Louis. The mission? To
convince some of the 900-plus developers at the conference to consider
CouchDB as the database to power their applications.
It wouldn't be very convincing if Young weren't a fan himself, of
course. Young's history with CouchDB goes back to work with BlueInk, a
content management system for building simple sites for small companies.
BlueInk used CouchDB, which caught the attention of CouchOne (now
Couchbase, following a merger of CouchOne and Membase). BlueInk was
released as open source a year ago, when Young joined CouchOne.
What is CouchDB?
After giving a bit of his history, Young moved on to talking about
CouchDB and what it actually is. CouchDB is a document storage database
written in Erlang. It is open source, and part of the Apache project
— though several different implementations are offered by a handful
of companies like Couchbase. It's one of the current crop of NoSQL, or
coSQL if you prefer, databases.
Documents in CouchDB are stored in JavaScript Object Notation (JSON)
format, plus attachments if you have binary data (photos, for example) to
put into the database. Queries are written using JavaScript. It's a
distributed database that replicates statelessly. That is, as Young
describes, "it can be interrupted and doesn't care." CouchDB
counts the number of changes that have been made since it last connected to
another instance, and picks up where it left off.
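The resume-after-interruption behavior can be illustrated with a toy model. This is only a sketch of the idea, not CouchDB's code: the ToyStore class, the replicate function, and the integer checkpoint are all hypothetical names invented for illustration, and the stores live in memory rather than behind HTTP.

```python
class ToyStore:
    """In-memory stand-in for a database with an ordered change log."""

    def __init__(self):
        self.docs = {}
        self.changes = []  # (sequence number, document id), numbered from 1

    def put(self, doc_id, body):
        self.docs[doc_id] = body
        self.changes.append((len(self.changes) + 1, doc_id))

    def changes_since(self, seq):
        return [change for change in self.changes if change[0] > seq]


def replicate(source, target, checkpoint, limit=None):
    """Copy changes made after `checkpoint`; return the new checkpoint.

    `limit` simulates a connection that drops after a few changes.
    """
    pending = source.changes_since(checkpoint)
    if limit is not None:
        pending = pending[:limit]
    for seq, doc_id in pending:
        target.put(doc_id, source.docs[doc_id])
        checkpoint = seq  # remember how far we got
    return checkpoint


laptop, server = ToyStore(), ToyStore()
for i in range(5):
    laptop.put(f"order-{i}", {"n": i})

# First attempt is cut off after two changes; the second one resumes.
checkpoint = replicate(laptop, server, 0, limit=2)
checkpoint = replicate(laptop, server, checkpoint)
```

Because the only thing carried between attempts is the checkpoint, an interrupted run costs nothing more than a later retry from the last recorded position.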
Young described a scenario of deploying an application using CouchDB in
rural Africa, where users might only have Internet access a couple of
hours per day. If they were using a cloud-based application that depends
on a constant connection, "they'd be completely screwed." Using
CouchDB, the data can be easily synced during the window of connectivity
without any hassle. The database doesn't have to keep a constant state;
CouchDB operates on the principle that it will be "eventually
consistent."
As the scenario Young described implies, CouchDB is not limited to
running on a server. It will run on a desktop, server, or even on a mobile
device. Any device running CouchDB can communicate and send data to another
instance of CouchDB, so replication can be peer-based or follow a
server-client model.
Replication can also be filtered, so that an instance receives only the
information relevant to its local application. For instance, Young
mentioned you might use CouchDB for distributing shipping information:
the primary instance of CouchDB gets all of the order
information and filters the appropriate orders to the users who are
filling them.
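As a rough sketch of how the shipping example might be set up: CouchDB filter functions are JavaScript stored in a design document, and a replication request can name one along with query parameters. Everything specific here is an assumption made for illustration, not something from the talk: the "shipping" design document, the "by_packer" filter, the "assigned_to" field, and the database names. The Python mirror of the filter exists only to show which documents would pass.

```python
import json

# JavaScript filter function as it might be stored in a design document.
FILTER_JS = """function(doc, req) {
  return doc.assigned_to === req.query.user;
}"""

# Hypothetical design document holding the filter.
design_doc = {"_id": "_design/shipping", "filters": {"by_packer": FILTER_JS}}


def replication_request(source, target, user):
    """Body for a POST to /_replicate that asks for one user's orders only."""
    return {
        "source": source,
        "target": target,
        "filter": "shipping/by_packer",
        "query_params": {"user": user},
    }


def by_packer(doc, user):
    """Python mirror of FILTER_JS, used here only for demonstration."""
    return doc.get("assigned_to") == user


orders = [
    {"_id": "order-1", "assigned_to": "alice"},
    {"_id": "order-2", "assigned_to": "bob"},
    {"_id": "order-3", "assigned_to": "alice"},
]
for_alice = [doc["_id"] for doc in orders if by_packer(doc, "alice")]
body = replication_request("orders", "http://packer.example:5984/orders", "alice")
print(json.dumps(body, indent=2))
```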
Another argument for CouchDB? Young says "you already know the API,"
because it's basically HTTP. CouchDB provides a RESTful API with GET, PUT,
POST, and DELETE, as well as a non-standard extension to HTTP, COPY. The
COPY API does what one would expect: it duplicates the contents and
attachments of a document to a new document under a new name, without
requiring the original to be retrieved first.
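To make the "it's basically HTTP" point concrete, here is a minimal sketch using only Python's standard library. It assumes a CouchDB instance listening on localhost port 5984 and a database called "orders"; those names, the document IDs, and the build and send helpers are illustrative assumptions. COPY names its target document in a Destination header. The requests are only constructed below; send() would need a running server.

```python
import http.client
import json


def build(method, path, doc=None, destination=None):
    """Describe a request as (method, path, body, headers)."""
    headers = {}
    body = None
    if doc is not None:
        body = json.dumps(doc)
        headers["Content-Type"] = "application/json"
    if destination is not None:
        headers["Destination"] = destination  # target document ID for COPY
    return method, path, body, headers


def send(request, host="localhost", port=5984):
    """Send a built request to a running server; return (status, JSON)."""
    method, path, body, headers = request
    conn = http.client.HTTPConnection(host, port)
    try:
        conn.request(method, path, body=body, headers=headers)
        response = conn.getresponse()
        return response.status, json.loads(response.read() or b"null")
    finally:
        conn.close()


create = build("PUT", "/orders/order-1", doc={"item": "lamp"})
read = build("GET", "/orders/order-1")
copy = build("COPY", "/orders/order-1", destination="order-1-backup")
```

With a server running, send(create) followed by send(copy) would store the document and then duplicate it server-side, without the client downloading it in between.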
Web 2.5: Work Locally, Sync Globally
The big reason for choosing CouchDB, according to Young?
Replication. Like many in the open source community, Young doesn't express
a lot of love for cloud services — which he describes as
"managed and terms-of-serviced and fascist."
That's not to say Young doesn't want to collaborate with his friends and
co-workers. Far from it, but Young argues for a return to a
"decentralized and very organic" Web. He calls it "Web 2.5"
(because "Web 3.0 is already taken").
Web 2.0 gave us "cloud powers" but took away ownership of
data. "For cloud powers we traded ownership, privacy, security,
safety, stability... we need to get all those things back, to get that we
need replication. Not only to move data, but applications."
Web 2.5, says Young, is a situation where you have the browser and
server "and then the .5 lives in the cloud, and that's a service that
watches for changes in your CouchDB." Data a user chooses to share
can still be synced up to cloud services, but it lives on the user's
machine as well. "I as a person shouldn't have to keep everything
that is me in a little package in the cloud... I can work locally on my
laptop and then if I need to share I can go somewhere people
are."
Get Started
Young suggested Iris Couch as the fastest way for developers interested
in CouchDB to get started. He also suggested checking out the Apache
project and its mailing lists. For more immediate questions and feedback, Young noted that
there was a decent community of CouchDB developers on Freenode on the
#couchdb channel as well as in the #couchbase channel.