November 2, 2010
This article was contributed by Josh Berkus
What do you get when you put together 80 to 100 hard-core database geeks
from ten different open source databases for a weekend?
OpenSQLCamp, which
was held most recently at MIT.
Begun three years ago, OpenSQLCamp is a semi-annual unconference for
open source database hackers to meet and collaborate on ideas and theories in
the industry. It's held at various locations alternately in Europe and the
United States, and organized and run by volunteers. This year's conference
was organized by Sheeri Cabral, a MySQL community leader who works for
PalominoDB.
This year's event included database hackers who work on MySQL, MariaDB,
PostgreSQL, VoltDB, Tokutek, and Drizzle. In contrast to the popular
perception that the various database systems are in a no-holds-barred
competition for industry supremacy, most people who develop these systems
are more interested in collaborating with their peers than arguing with
them. And although it's OpenSQLCamp, programmers from "NoSQL" databases
were welcome and present, including MongoDB, Membase, Cassandra, and
BerkeleyDB.
While the conference was mainly database engine developers, several
high-end users were present, including staff from Rackspace, GoDaddy,
VMware, and WidgetBox. The conference's location also drew a few MIT
faculty, including conference co-chair Bradley Kuszmaul.
While few of the students who registered actually turned up, attendees were
able to learn informally about the software technologies which are now hot
in universities (lots of work on multi-processor scaling, apparently).
Friday
The conference started with a reception at the WorkBar, a shared
office space in downtown Boston. After a little drinking and socializing, participants slid immediately into discussing database and database
industry topics, including speculation on what Oracle is going to do with
all of its open source databases (answer: nobody knows, including the
people who work there), recent releases of PostgreSQL and MySQL, and how
VoltDB works. Whiteboard markers came out, and several people moved on to
technical discussions that continued until 11pm.
Jignesh Shah of VMware brought up some interesting SSD testing results. In
high-transaction environments, it seems that batching database writes
actually reduces throughput and increases response times, completely
contrary to performance on spinning disks. For example, Jignesh had
experimented with asynchronous commit with large buffers, which means that
the database returns a success message to the client and fsyncs the data in
batches afterward. This reduced database write throughput, whereas on a
standard spinning-disk RAID it would have increased it by up to 30%. There
was a great deal of speculation as to why that was.
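For readers unfamiliar with the technique, here is a minimal sketch of asynchronous commit with batched fsync. It is not code from any of the databases discussed; the class `BatchedLog`, the `batch_size` parameter, and the helper `run` are hypothetical names chosen for illustration. It only shows the mechanism (acknowledge first, sync later) and counts fsync calls; it makes no claim about throughput on any particular hardware.

```python
import os
import tempfile


class BatchedLog:
    """Append-only log that acknowledges writes before they are durable."""

    def __init__(self, path, batch_size):
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND)
        self.batch_size = batch_size
        self.pending = 0        # records written but not yet fsynced
        self.fsync_calls = 0    # how many times we forced data to disk

    def commit(self, record):
        os.write(self.fd, record + b"\n")
        self.pending += 1
        if self.pending >= self.batch_size:
            self.flush()
        # Success is reported even if this record has not been fsynced yet;
        # a crash before the next flush could lose it.
        return True

    def flush(self):
        if self.pending:
            os.fsync(self.fd)
            self.fsync_calls += 1
            self.pending = 0

    def close(self):
        self.flush()
        os.close(self.fd)


def run(batch_size, n_records=100):
    """Write n_records and return the number of fsync calls that were needed."""
    with tempfile.TemporaryDirectory() as d:
        log = BatchedLog(os.path.join(d, "wal.log"), batch_size)
        for i in range(n_records):
            log.commit(b"txn %d" % i)
        log.close()
        return log.fsync_calls
```

With `batch_size=1` every commit is synced individually (the synchronous case); larger batches trade a window of possible data loss for fewer sync calls, which is the tradeoff whose payoff reportedly changes on SSDs.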
A second topic of discussion, which shifted to a whiteboard for
comprehensibility, was how to put the "consistency" in "eventual
consistency" without increasing response time. This became a session on
Sunday. This problem, which is basic to distributed databases, is the
question of how you can ensure that any write conflict is resolved in
exactly the same way on all database nodes for a transactional database
which is replicated or partitioned across multiple servers. Historical
solutions have included attempting to synchronize timestamps (which is
impossible), using centralized transaction counter servers (which become
bottlenecks), and using vector clocks (which are insufficiently
determinative on a large number of nodes). VoltDB addresses this with a
two-phase commit approach in which the node accepting the writes checks
modification timestamps on all nodes which could conflict. As with many
approaches, this solution maintains consistency and throughput at a
substantial sacrifice in response times.
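To make the vector clock limitation concrete, here is a small sketch; the function name `compare` and the node names are made up for the example. A vector clock can say that one write happened before another, but when two nodes write independently it can only report that the writes are concurrent. Picking a winner then needs some additional rule that every node applies identically.

```python
def compare(a, b):
    """Compare two vector clocks given as dicts of node name -> counter.

    Returns "before", "after", "equal", or "concurrent".
    """
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    # Neither clock dominates: the writes happened independently, and the
    # clocks alone do not say which one should win.
    return "concurrent"


# Both writes start from the same state, then two nodes update independently.
base = {"n1": 1}
write_on_n1 = {"n1": 2}
write_on_n2 = {"n1": 1, "n2": 1}
```

Here `compare(base, write_on_n1)` is ordered, but `compare(write_on_n1, write_on_n2)` is not, and that second case is where a deterministic tie-breaker is required.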
Saturday
The conference days were held at MIT, rather ironically in the William
H. Gates building. For those who haven't seen Frank Gehry's sculptural
architecture feat, it's as confusing on the inside as it is on the
outside, so the first day started late. As usual with unconferences, the
first task was to organize a schedule; participants proposed sessions
and spent a long time rearranging them in an effort to avoid
double-scheduling, which led to some "concurrency issues" with different
versions of the schedule. Eventually we had four tracks for the four
rooms, nicknamed "SELECT, INSERT, UPDATE and DELETE".
As much as I wanted to attend everything, it wasn't possible, so I'll just
write up a few of the talks here. Some of the talks and discussions will
also be available as videos from the conference web site later. I attended
and ran mostly discussion sessions, which I find to be the most useful
events of an unconference.
Monty Taylor of Drizzle talked about their current efforts to add
multi-tenancy support, and discussed implementations and tradeoffs with
other database developers. Multi-tenancy is another hot topic now that
several companies are going into "database as a service" (DaaS); it is the
concept that multiple businesses can share the same physical database while
having complete logical separation of data and being unaware of each other.
The primary implementation difficulty is that there is a harsh tradeoff
between security and performance, since the more isolated users are from
each other, the fewer physical resources they share. As a result, no single
multi-tenancy implementation can be perfect.
Since Multi-Version Concurrency Control (MVCC) was first described in the
early 1980s, many databases have implemented it. MVCC is a set of
methods which allow multiple users to read and modify the same data
concurrently while minimizing conflicts and locks, supporting the
"Atomicity", "Consistency", and "Isolation" in ACID transactions. While
the concept is conventional wisdom at this point, implementations are
fairly variable. So, on request, I moderated a panel on MVCC in
PostgreSQL, InnoDB, Cassandra, CouchDB and BerkeleyDB. The discussion
covered the basic differences in approach as well as the issues with data
garbage collection.
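As a rough illustration of the idea, and of why garbage collection comes up, here is a toy multi-version store. It does not reflect how any of the databases on the panel implement MVCC; the names `Version`, `Store`, and `vacuum` are invented for this sketch, and it makes the simplifying assumption that every write is committed instantly and that transaction ids double as snapshots.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: str
    created_by: int                    # transaction id that wrote this version
    deleted_by: Optional[int] = None   # transaction id that superseded it


class Store:
    def __init__(self):
        self.versions = {}    # key -> list of Version, oldest first
        self.next_txid = 1

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        return txid

    def write(self, txid, key, value):
        chain = self.versions.setdefault(key, [])
        if chain and chain[-1].deleted_by is None:
            chain[-1].deleted_by = txid    # old version stays for older readers
        chain.append(Version(value, created_by=txid))

    def read(self, txid, key):
        # A reader sees the newest version created at or before its own id
        # that had not been superseded by then. No locks are taken.
        for v in reversed(self.versions.get(key, [])):
            created_visible = v.created_by <= txid
            still_live = v.deleted_by is None or v.deleted_by > txid
            if created_visible and still_live:
                return v.value
        return None

    def vacuum(self, oldest_active_txid):
        # Drop versions that no transaction still running could ever see.
        for key, chain in list(self.versions.items()):
            self.versions[key] = [
                v for v in chain
                if v.deleted_by is None or v.deleted_by > oldest_active_txid
            ]
```

A reader that began before a later write keeps seeing the old value, which is why old versions pile up until something like `vacuum` removes them.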
Jignesh Shah of VMware and Tim Callaghan of VoltDB presented on current
issues in database performance in virtualized environments. The first,
mostly solved, issue was figuring out degrees of overcommit for virtualized
databases sharing the same physical machine. Jignesh had tested with
PostgreSQL and found the optimal level in benchmark tests to be around 20%
overcommit, meaning five virtual machines (VMs) each entitled to 25% of the
server's CPU and RAM.
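The arithmetic here depends on how "overcommit" is defined, and the talk summary does not say which definition was meant. The sketch below simply works through the numbers above under two common readings; the function name `overcommit` and the dictionary keys are assumptions made for this example.

```python
def overcommit(vm_count, share_per_vm):
    """Return total entitlement and two ways of expressing the excess.

    share_per_vm is each VM's entitlement as a fraction of the physical host.
    """
    allocated = vm_count * share_per_vm    # e.g. 5 * 0.25 = 1.25 hosts' worth
    excess = allocated - 1.0               # amount promised beyond the hardware
    return {
        "allocated": allocated,
        "excess_vs_physical": excess / 1.0,         # relative to real hardware
        "excess_vs_allocated": excess / allocated,  # relative to what was promised
    }


figures = overcommit(vm_count=5, share_per_vm=0.25)
```

Five VMs at 25% each add up to 125% of the machine. Measured against the physical hardware that is 25% extra; measured against the total entitlement, 20% of what was promised is not backed by hardware, which matches the figure quoted.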
One work in progress is I/O scheduling. While VMware engineers have
optimized sharing CPU and RAM among multiple VMs running databases on
the same machine, sharing I/O without conflicts or severe overallocation
still needs work.
The other major unsolved issue is multi-socket scaling. As it turns out,
attempting to scale a single VM across multiple sockets is extremely
inefficient with current software, resulting in tremendous drops in
throughput as soon as the first thread migrates to a second socket. The
current workaround is to give the VMs socket affinity and to run one VM per
socket, but nobody is satisfied with this.
After lunch, Bradley ran a Q&A panel on indexing with developers from
VoltDB, Tokutek, Cassandra, PostgreSQL, and Percona. Panelists answered
questions about types of indexes, databases without indexes, performance
optimizations, and whether server hardware advances would cause major
changes in indexing technology in the near future. The short answer to
that one is "no".
As is often the case with "camp" events, the day ended with a hacking
session. However, only the Drizzle team really took advantage of it; for
most attendees, it was a networking session.
Sunday
Elena Zannoni joined the conference in order to talk about the state of
tracing on Linux. Several database geeks were surprised to find out that
SystemTap was not going to be included in the Linux kernel, and that there
was no expected schedule for release of utrace/uprobes. Many database
engineers have been waiting for Linux to provide an alternative to DTrace,
and it seems that we still have longer to wait.
The VoltDB folks, who are local to Boston, showed up in force and did a
thorough presentation on their architecture, use case, and goals. VoltDB
is a transactional, SQL-compliant distributed database with strong
consistency. It's aimed at large companies building new in-house
applications for which they need extremely high transaction processing
rates and very high availability. VoltDB does this by requiring users to
write their applications specifically for the database, including putting all
transactions into stored procedures which are then precompiled and executed
in batches on each node. It's an approach which sacrifices response times
and general application portability in return for tremendous throughput,
into the hundreds of thousands of transactions per second.
Some of the SQL geeks at the conference discussed how to make developers
more comfortable with SQL. Currently many application developers not only
don't understand SQL, but actively hate and fear it. The round-table
discussed why this is and some ideas for improvement, including: teaching
university classes, contributing to object-relational mappers (ORMs),
explaining SQL in relation to functional languages, doing fun "SQL tricks"
demos, and working on improving DBA attitudes towards developers.
In the last track of the day, I moderated a freewheeling discussion on "The
Future of Databases", in which participants tried to answer "What databases
will we be using and developing in 2020?" While nobody there had a crystal
ball, embedded databases with offline synchronization, analytical databases
which support real-time calculations, and database-as-a-service featured
heavily in the discussion.
Wrap-up
While small, OpenSQLCamp was fascinating due to the caliber of attendee; I
learned more about several new databases over lunch than I had in the
previous year of blog reading. If you work on open-source database
technology, are a high-end user, or are just very interested in databases,
you should consider attending next year. Watch the OpenSQLCamp web site
for videos to be posted, and for the date and location of next year's
conferences in the US and Europe.