September 15, 2010
This article was contributed by Josh Berkus
It's a rare developer summit which ends with a bonfire,
s'mores and a
sing-along. But somehow it seems appropriate for
CouchDB, a new database whose motto is
"relax". Last week I went to
CouchCamp at Walker Creek Ranch in
California, and enjoyed a different kind of conference for a different kind
of database.
The CouchDB Document Database
CouchDB is a "document database", a classic type of database which debuted
in the 1960s and has periodically reappeared each decade, most recently
through the various XML databases. Today, many of the new "NoSQL" databases
are document databases, including Amazon SimpleDB, MongoDB, and MonetDB.
Document databases make use of the intuitive concept of each item in the
database as a document with a document ID and a large variety of data
inside the document, including everything from blocks of text to complex
nested structures. Each database is then a collection of documents,
instead of tables and columns. Access to data is via either the document
ID, or by various secondary indexes. As a rule, document databases do not
use SQL, but rely on other interfaces like XQuery, HTTP and proprietary
APIs.
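The document model described above can be sketched in a few lines of Python. This is only an in-memory illustration, not CouchDB's API: the document IDs, field names, and the helpers get_by_id() and build_index() are all made up for the example. It shows documents of different shapes living in one database, lookup by document ID, and a simple secondary index.

```python
import json

# A "database" here is just a mapping from document ID to document body.
# All IDs and field names are hypothetical, chosen for illustration only.
database = {
    "talk-001": {
        "type": "talk",
        "title": "Opening keynote",
        "tags": ["community", "history"],
        "speaker": {"name": "A. Speaker", "employer": "Example Corp"},
    },
    "talk-002": {
        "type": "talk",
        "title": "Replication in practice",
        "tags": ["replication"],
        "speaker": {"name": "B. Speaker", "employer": "Example Corp"},
    },
    # A document with a completely different shape in the same database.
    "note-001": {"type": "note", "text": "Bring marshmallows."},
}


def get_by_id(db, doc_id):
    """Primary access path: fetch one document by its ID."""
    return db[doc_id]


def build_index(db, key_func):
    """Build a secondary index mapping key -> sorted list of document IDs.

    key_func returns a list of keys for a document (possibly empty), so
    documents that lack the indexed field are simply left out.
    """
    index = {}
    for doc_id, doc in db.items():
        for key in key_func(doc):
            index.setdefault(key, []).append(doc_id)
    return {key: sorted(ids) for key, ids in index.items()}


# Secondary index on the optional "tags" field.
by_tag = build_index(database, lambda doc: doc.get("tags", []))

print(json.dumps(get_by_id(database, "talk-001"), indent=2))
print(by_tag["replication"])
```

Note that there is no schema: the note document has no tags or speaker, and the index simply skips it.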
Unlike its predecessors, CouchDB may have a combination of features which
will ensure lasting popularity. First, the database is designed to be 100%
accessible and familiar to web developers, thanks to data access via
RESTful HTTP interfaces, data storage using the popular JSON serialization
format, and database programming in JavaScript. Secondly, CouchDB has
extremely user-friendly asynchronous multi-master replication, designed to
support multiple disconnected copies of your data. And finally, it
supports the use of map/reduce functions to do multi-database queries,
sharding and scaling, and even to build calculated indexes or "views".
Unlike many other new non-relational databases, CouchDB supports ACID
(Atomic, Consistent, Isolated and Durable) transactions like many SQL
databases do, making it more appropriate as a primary data store. CouchDB
is written in Erlang.
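To make the HTTP, JSON, and map/reduce ideas concrete, here is a rough sketch in Python. It rests on several assumptions: the server address, the database name "couchcamp", the attendee documents, and every function name are hypothetical, and the request is only constructed, never sent. In CouchDB itself, map and reduce functions are written in JavaScript and run by the server; the Python versions below merely imitate the idea of emitting key/value pairs per document and combining them per key.

```python
import json
import urllib.request
from itertools import groupby

BASE_URL = "http://localhost:5984"  # assumed address of a local server
DB_NAME = "couchcamp"               # hypothetical database name


def put_document_request(doc_id, doc):
    """Build (but do not send) an HTTP PUT that would store one JSON document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{DB_NAME}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def map_attendee(doc):
    """Map step: emit (key, value) pairs for documents that qualify."""
    if doc.get("type") == "attendee":
        yield doc["country"], 1


def reduce_count(values):
    """Reduce step: combine all values emitted under one key."""
    return sum(values)


def run_view(docs, map_func, reduce_func):
    """Apply map to every document, sort by key, then reduce per key."""
    emitted = sorted(pair for doc in docs for pair in map_func(doc))
    return {
        key: reduce_func([value for _, value in group])
        for key, group in groupby(emitted, key=lambda pair: pair[0])
    }


# Made-up sample documents keyed by document ID.
docs = {
    "a1": {"type": "attendee", "name": "Ann", "country": "US"},
    "a2": {"type": "attendee", "name": "Bo", "country": "DE"},
    "a3": {"type": "attendee", "name": "Cy", "country": "US"},
    "t1": {"type": "talk", "title": "Relax"},
}

request = put_document_request("a1", docs["a1"])
print(request.get_method(), request.full_url)

view = run_view(docs.values(), map_attendee, reduce_count)
print(view)
```

The result of run_view() plays the role of a calculated index: it is derived entirely from the documents and could be rebuilt from them at any time.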
The CouchDB project is a 2-year-old Apache project and recently released
version 1.0 (immediately followed by a patch, version 1.0.1). While several
of its developers are now working for a Berkeley startup called CouchOne, the project remains
completely community-based. Nowhere was this more evident than at CouchCamp.
CouchCamp
The Walker Creek
Ranch is an hour's drive north of San Francisco through the winding
roads of rural Marin County. There is no cell phone reception, Internet
access is
intermittent, and there are more foxes and dairy cows than people. This
was a different sort of "unconference"; for one thing, the poor
connectivity and isolation meant that we had to pay attention to the talks
and discussion sessions. It also meant no "tweeting" about the conference
sessions.
Eighty or ninety people attended CouchCamp, about evenly split between
database developers, contributors to CouchDB, and users of the database.
In general, they came from all over the United States, with only a few
coming in from Europe. CouchCamp was a community summit rather than a user
conference, with much of the public conference activity set to take place at
JSConf.eu in Berlin later this month.
The conference began with a campfire-lit keynote by Damien Katz, inventor
of CouchDB. A former Lotus Notes developer, he wanted to create a database
which encouraged sharing and collaboration. Originally he wrote CouchDB in
C++, but, once he started grappling with concurrency issues, he rewrote it in
Erlang. He's as surprised as anyone that it's now a full-time job and
that he finds himself living in the San Francisco Bay Area -- but he was even more
surprised when a raccoon interrupted his talk.
The next two days of the conference alternated between general talks and
breakout discussion sessions. The talks, given to all attendees, included
Stuart Langridge of Canonical on the use of CouchDB in Ubuntu One, Ted Leung of Apache on the
importance of community, and Dion Almaer of Palm on CouchDB on smart
phones. Interestingly, the CouchCamp organizers also chose to include
talks by Selena Deckelmann and myself about features of PostgreSQL which
the CouchDB project could learn from or even "steal".
In breakout sessions, I learned about GeoCouch,
a new competitor to PostGIS and SpatiaLite among open source geographic
databases. We also discussed more technical topics, including CouchDB's
security model, database "compaction" (garbage collection), the current
full-text indexing work in progress, and how to create a CouchApp.
Database hackers spent a lot of time in both sessions and at meals hashing
out longstanding technical issues; I would not be surprised to see a rapid
release of version 1.1.
There were also a few announcements at or around the conference. Cloudant, a cloud host for CouchDB
applications, launched BigCouch, a cloud-scaled
version of CouchDB with some form of automated sharding. Couch.io
announced its change of name to CouchOne, and then announced a bunch of new
hires, including the former documentation lead for MySQL AB. Damien warned
people to use version 1.0.1 instead of 1.0.0, because 1.0.0 contains
a data corruption bug. CouchDB also now comes built in to newer versions of
Ubuntu.
Of course, being in California, there were quite a few "gourmet" touches to
roughing it: microbrew beers, oysters from nearby Tomales Bay and cheese
from nearby Cowgirl Creamery. The only thing missing was a wine tasting;
maybe I'll arrange one next time.
If you develop web applications, you owe it to yourself to check out
CouchDB and see if it's appropriate for your current project. Maybe you'll
come out to Walker Creek Ranch next year.