
VoltDB launches

The VoltDB in-memory database management system has announced its existence. "Under the leadership of Postgres and Ingres co-founder, Mike Stonebraker, VoltDB has been developed as a next-generation, open-source DBMS that has been shown to process millions of transactions per second on inexpensive clusters of off-the-shelf servers. It has outperformed traditional OLTP database systems by a factor of 45 on a single server, and unlike NoSQL key-value stores, VoltDB can be accessed using SQL and ensures transactional data integrity (ACID)." The code is licensed under GPLv3; annual support subscriptions start at a mere $15,000.

VoltDB launches

Posted May 25, 2010 20:15 UTC (Tue) by lwinkenb (guest, #60737) [Link]

Do in-memory databases like this have any persistence at all? What happens if a power outage causes all nodes in the cluster to go offline? I wouldn't have enough faith in a UPS backup if that was the case.

VoltDB launches

Posted May 25, 2010 20:39 UTC (Tue) by rfunk (subscriber, #4054) [Link]

I don't know about this one, but in my experience in-memory databases are designed either to use replication (including across multiple data centers) or to store data that can be replaced from slower stores (i.e. caching applications).

VoltDB launches

Posted May 25, 2010 20:42 UTC (Tue) by seanyoung (subscriber, #28711) [Link]

In-memory databases can do persistence. For example, solidDB writes to disk. The difference from "traditional" databases is that it never needs to read column data from disk, except on start-up/recovery.

Considering VoltDB claims ACID, I presume it writes to disk too. I have not checked.

VoltDB launches

Posted May 25, 2010 20:51 UTC (Tue) by flammon (guest, #807) [Link]

From http://www.dbms2.com/2010/05/25/voltdb-finally-launches/

Instead, VoltDB lets you snapshot data to disk at tunable intervals. “Continuous” is one of the options, wherein a new snapshot starts being made as soon as the last one completes.

VoltDB launches

Posted May 25, 2010 21:06 UTC (Tue) by aweisberg (guest, #58563) [Link]

VoltDB uses k-safety (replication) for durability. Snapshots are for disaster recovery when there is no WAN replica, backups, etc. Replication within a local cluster works now, and WAN replication is coming.
http://community.voltdb.com/
http://community.voltdb.com/roadmap

VoltDB launches

Posted May 25, 2010 22:03 UTC (Tue) by ms (guest, #41272) [Link]

Oh christ, I hope this isn't another "let's write out the whole world every N seconds". The number of key-value stores that do that rather than writing only changed data is depressing.

VoltDB launches

Posted May 26, 2010 7:47 UTC (Wed) by intgr (subscriber, #39733) [Link]

The only such database I know of is Redis, but it also does transaction logging these days, so recovery isn't limited to loading the last snapshot.

Seems like a fairly good persistence model for in-memory databases because sequential disk writes are cheap, fast and simple. You don't have to worry about disk layout because you only need it for recovery. So what's wrong with it?
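
The snapshot-plus-transaction-log model described above can be illustrated with a minimal sketch. This is a toy under stated assumptions, not how Redis or VoltDB is implemented: the `TinyStore` class and its method names are hypothetical, and in-memory Python objects stand in for the snapshot file and the sequential log on disk.

```python
import json


class TinyStore:
    """Toy in-memory key-value store with a snapshot and an append-only log."""

    def __init__(self):
        self.data = {}
        self.log = []          # stands in for a sequential log file on disk
        self.snapshot = "{}"   # stands in for the most recent snapshot file

    def put(self, key, value):
        # Append the change to the log first, then apply it in memory.
        self.log.append(json.dumps({"k": key, "v": value}))
        self.data[key] = value

    def take_snapshot(self):
        # Write out the whole dataset; earlier log entries are now redundant.
        self.snapshot = json.dumps(self.data)
        self.log.clear()

    def recover(self):
        # Rebuild state: load the last snapshot, then replay the log in order.
        restored = json.loads(self.snapshot)
        for line in self.log:
            entry = json.loads(line)
            restored[entry["k"]] = entry["v"]
        return restored


store = TinyStore()
store.put("a", 1)
store.take_snapshot()
store.put("b", 2)   # these two changes exist only in the log
store.put("a", 3)
recovered = store.recover()
```

Recovery here is not limited to the last snapshot: the two writes made after `take_snapshot()` are restored from the log.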

VoltDB launches

Posted May 26, 2010 9:34 UTC (Wed) by ms (guest, #41272) [Link]

Err, well, if you have 20GB of RAM in your machine and a similarly large dataset, and a few bytes of that dataset change, then once the timeout passes a new snapshot gets written. That's an awful lot of data you're rewriting for no reason at all.
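
To make the objection concrete, here is a rough sketch that compares the bytes written by a full snapshot with the bytes written when only changed pages are flushed. Everything in it is an assumption for illustration: the `PagedStore` class, the 4096-byte page size, and the dirty-page tracking scheme are hypothetical and not taken from any of the databases discussed.

```python
PAGE_SIZE = 4096  # assumed page size, chosen only for illustration


class PagedStore:
    """Toy dataset split into fixed-size pages, with dirty-page tracking."""

    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.dirty = set()

    def write(self, offset, payload):
        # Assumes the payload fits inside a single page.
        page_no, start = divmod(offset, PAGE_SIZE)
        self.pages[page_no][start:start + len(payload)] = payload
        self.dirty.add(page_no)

    def full_snapshot_bytes(self):
        # A full snapshot rewrites every page, changed or not.
        return len(self.pages) * PAGE_SIZE

    def incremental_bytes(self):
        # An incremental flush writes only the pages touched since the last one.
        written = len(self.dirty) * PAGE_SIZE
        self.dirty.clear()
        return written


store = PagedStore(num_pages=1024)   # a 4 MiB toy dataset
store.write(10, b"abc")              # touches page 0
store.write(8200, b"xyz")            # touches page 2
full = store.full_snapshot_bytes()
incremental = store.incremental_bytes()
```

With two small writes, the full snapshot rewrites all 1024 pages while the incremental flush writes two. The gap scales with the ratio of dataset size to changed data.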

VoltDB launches

Posted May 26, 2010 10:34 UTC (Wed) by intgr (subscriber, #39733) [Link]

I think you have missed the point of in-memory databases. If your application changes a few bytes in a 20GB dataset every so often, then by all means, do use disk databases.

In-memory databases are used in workloads where you need to have deterministic latency, or where write I/O throughput is so high that incurring a disk seek for every change is no longer practical. There is a long way you can go with write buffering, battery-backed RAID caches and high-end storage devices, but for many applications the tradeoffs of in-memory databases are favorable. Losing the last 5 minutes' worth of changes in the extremely unlikely event of a power failure can be acceptable.

Also consider that writing down 20GB of data even on the *cheapest* 7200RPM SATA disks takes 5 minutes at most (usually sequential throughput exceeds 100 MB/s). But if you have 20 GB of RAM in your servers then you can probably afford much better storage.
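
The arithmetic behind that estimate can be checked with a short calculation. It uses only the figures quoted above (20 GB, 100 MB/s) and assumes decimal units (1 GB = 1000 MB) and a purely sequential write; the function name is made up for this sketch.

```python
def snapshot_seconds(dataset_gb, throughput_mb_per_s):
    """Seconds needed to write the dataset sequentially (1 GB = 1000 MB)."""
    return dataset_gb * 1000 / throughput_mb_per_s


seconds = snapshot_seconds(20, 100)   # 20 GB at 100 MB/s
minutes = seconds / 60
```

At 100 MB/s this comes to 200 seconds, a little over three minutes, which is consistent with the "5 minutes at most" figure.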

VoltDB launches

Posted May 26, 2010 19:35 UTC (Wed) by ms (guest, #41272) [Link]

Queries are, in general, not uniformly distributed. You may well have a 20GB dataset, but let's suppose the queries you're doing on it vary with which part of the world is awake at the time. Let's also say that you've decided that an RDBMS isn't going to cut it for you, for whatever reason.

Clearly, arbitrarily rewriting the entire dataset over and over again is a waste of time. And please don't forget: a) in the cloud, it's very easy to get lots of RAM. It's quite hard to get fast HDD access; and b) it's very likely that cloud providers are not only going to charge per CPU time and network transfer, but also by storage transfer. Now yes, *I've* just brought in the added complication of the cloud, but that's where I see things going - the flexibility and ease of billing are very attractive to sysadmins and COOs alike.

VoltDB launches

Posted May 25, 2010 20:52 UTC (Tue) by flewellyn (subscriber, #5047) [Link]

So that's what Stonebraker's been up to lately. Interesting.

I wonder how VoltDB would handle the case where the database is potentially larger than available memory?

VoltDB launches

Posted May 25, 2010 21:28 UTC (Tue) by fuhchee (subscriber, #40059) [Link]

Judging from their website:

"Removing complex logging, locking, latching and buffer management clears the way for VoltDB’s 50x speedup over traditional systems. Since VoltDB has no disk waits and no user waits inside a transaction, OLTP SQL operations complete serially in microseconds."

... this sounds like a strictly RAM limited database.

VoltDB launches

Posted May 25, 2010 21:36 UTC (Tue) by flewellyn (subscriber, #5047) [Link]

That would certainly limit its applications. Unless, of course, they rely on virtual memory, but then they lose the speed advantage...

VoltDB launches

Posted May 26, 2010 4:50 UTC (Wed) by njs (subscriber, #40338) [Link]

I think you get EBUYMORECOMPUTERS.

VoltDB launches

Posted May 26, 2010 15:01 UTC (Wed) by rilder (guest, #59804) [Link]

Maybe that is why it mentions database clusters. You partition the data among several machines such that each host's share of the database fits in that host's memory.
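
A minimal sketch of that idea, assuming simple hash partitioning on a key and an even spread of data across hosts. This is not VoltDB's actual partitioning scheme; `host_for` and `fits_in_cluster` are hypothetical helpers, and the sizes are invented for the example.

```python
import zlib


def host_for(key, num_hosts):
    """Pick a host for a row by hashing its partitioning key."""
    # crc32 is used because it is stable across runs, unlike Python's hash().
    return zlib.crc32(key.encode("utf-8")) % num_hosts


def fits_in_cluster(dataset_gb, num_hosts, ram_per_host_gb):
    """True if each host's share fits in its RAM, assuming an even spread."""
    return dataset_gb / num_hosts <= ram_per_host_gb


target = host_for("customer-42", 4)
enough = fits_in_cluster(60, 4, 16)       # 15 GB per host on 16 GB hosts
too_small = fits_in_cluster(60, 3, 16)    # 20 GB per host on 16 GB hosts
```

When the per-host share no longer fits, the remedy in this model is to add hosts, which matches the EBUYMORECOMPUTERS joke above.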

VoltDB launches

Posted May 25, 2010 21:10 UTC (Tue) by jwb (guest, #15467) [Link]

Stonebraker is impressively prolific. I use Vertica in my work and every time someone says "Vertica" in an email message, Gmail sees fit to give me an advertisement for VoltDB. Obviously I'm looking forward to trying out VoltDB.

VoltDB launches

Posted May 26, 2010 9:27 UTC (Wed) by dgm (subscriber, #49227) [Link]

For those interested, it does support SQL indeed, but not as one would expect.

You cannot issue arbitrary SQL statements against a live database. Instead, you have to compile each SQL statement into a stored procedure written in Java, and then upload that compiled code to the database for execution. JDBC (or ODBC) cannot be used to access the data.

Also, you cannot alter the database schema without shutting it down. On the positive side, though, they claim that data repartitioning and migration after schema changes is handled by the database automatically.

VoltDB launches

Posted May 27, 2010 19:56 UTC (Thu) by man_ls (subscriber, #15091) [Link]

Is this practical? Maybe it's my established thinking, but use cases in the real world suddenly seem very limited. Specifically: deployment in traditional enterprise settings becomes very hard. Only in very mature and static environments (where queries and schemas hardly change, if at all) can this work.

VoltDB launches

Posted May 26, 2010 10:43 UTC (Wed) by robert_s (subscriber, #42402) [Link]

I was very interested in this until I saw the amount of java involved.

VoltDB launches

Posted May 26, 2010 16:04 UTC (Wed) by lwinkenb (guest, #60737) [Link]

I think there are better metrics to judge an application on than the language it is written in.

VoltDB launches

Posted May 26, 2010 19:49 UTC (Wed) by NightMonkey (subscriber, #23051) [Link]

There are worse, too. Java has such administrative and resource overhead that this aspect brings my interest down quite a few notches.

VoltDB launches

Posted May 27, 2010 7:07 UTC (Thu) by rilder (guest, #59804) [Link]

Cannot agree more on this. Whenever I see a project with Java, all of a sudden all the overhead comes to my mind.

VoltDB launches

Posted May 28, 2010 0:31 UTC (Fri) by robert_s (subscriber, #42402) [Link]

I say this because I would never want to trust Java to deal with large amounts of memory.

You just have to look at the JVM funny and it will swallow half a gig of RAM.

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds