LWN: Comments on "Alternatives to SQL Databases" https://lwn.net/Articles/328487/ This is a special feed containing comments posted to the individual LWN article titled "Alternatives to SQL Databases". en-us Sat, 18 Oct 2025 02:08:47 +0000 Sat, 18 Oct 2025 02:08:47 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Alternatives to SQL Databases https://lwn.net/Articles/331175/ https://lwn.net/Articles/331175/ gdamjan <div class="FormattedComment"> Well, at least CouchDB is very robust.<br> <p> CouchDB will only append to the storage file on disk, and will add a token at the end of the append. So in the worst case you only lose the last "transaction".<br> <p> CouchDB does not need any special recovery tools, and instantly restarts<br> </div> Fri, 01 May 2009 18:00:22 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329866/ https://lwn.net/Articles/329866/ jbellis <div class="FormattedComment"> djc et al, <br> <p> Cassandra is doing well in its reincarnation as an Apache Incubator project. I was added as a committer outside Facebook, and I'm working full time on Cassandra now for Rackspace. Patches are being reviewed and applied and we're making good progress towards an official release.<br> <p> I wrote about why Cassandra was the best fit for our needs here: <a rel="nofollow" href="http://spyced.blogspot.com/2009/03/why-i-like-cassandra.html">http://spyced.blogspot.com/2009/03/why-i-like-cassandra.html</a><br> </div> Thu, 23 Apr 2009 19:00:42 +0000 PostRelational databases https://lwn.net/Articles/329822/ https://lwn.net/Articles/329822/ Wol <div class="FormattedComment"> If anybody notices a response to a story that's gone a bit stale...<br> <p> At the end of the day, the problem with relational databases is C&amp;D's 12 rules. You CANNOT have an efficient relational database because the inefficiency is mandated by the rules.<br> <p> Who says "data comes in rows and columns"? C&amp;D. But what happens if it comes in three, or four, dimensions? 
Sure, you can MODEL that in two dimensions, but the modelling can get very expensive ...<br> <p> The relational maths is great, but any relational engine is crippled by its adherence to the underlying (flawed) rules.<br> <p> For example, I know I got into a spat on a previous story, but many cases with relational engines that require ACID, *don't* require ACID on post-relational because they're inherently atomic.<br> <p> Cheers,<br> Wol<br> </div> Thu, 23 Apr 2009 16:46:05 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329774/ https://lwn.net/Articles/329774/ joib <div class="FormattedComment"> At least on RHEL, you need to use slapd_db_recover rather than dbxx_recover, and similarly for the other db_* commands; this will use the correct version of the BDB libraries that openldap was built against.<br> <p> But yeah, needing manual recovery after a crash is incredibly annoying. Though we have replicated openldap servers that reduce the likelihood of service disruptions due to this.<br> </div> Thu, 23 Apr 2009 14:44:33 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329665/ https://lwn.net/Articles/329665/ nix <div class="FormattedComment"> I've seen one site that I know for sure is 'powered by Oracle': Oracle's <br> own Metalink site. Its performance is utterly appalling: half-minute <br> delays between doing anything and the response... Oracle's stunning (lack <br> of) useful full-text search capabilities shine through in the completely <br> hopeless search page as well. Of course I don't know what systems back <br> this site but I doubt it's exactly underpowered.<br> <p> For a long time they had their advertising slogan 'Oracle Software Powers <br> The Internet' on there. This led to despairing laughter and the <br> occasional 'thank god it doesn't' from everyone who saw it, including <br> various Oracle employees.<br> <p> Oracle is quite good at massive thumping bank systems, but I wouldn't back <br> a website with it if I were you. 
Even Oracle can't make that work.<br> <p> </div> Wed, 22 Apr 2009 23:03:04 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329522/ https://lwn.net/Articles/329522/ dmag <div class="FormattedComment"> <font class="QuotedText">&gt; the real common thread between these datastores is less the fact that they sacrifice ACID than in the fact that they ignore SQL.</font><br> <p> Yes and no. The reason they don't have SQL is that they are young and focused on being different than RDBMSes. <br> <p> It's actually not that hard to add some SQL support. Amazon's SimpleDB recently added "SQL-like" querying (nothing fancy, just "Select * from Table Where Field=Value"). There are a lot of SQL parsers out there, so it wouldn't be too hard for the others to add a large dose of SQL. Mind you, I don't think any of these will be 100% fully SQL-compliant. But then again, just about every RDBMS ignores some of the dark corners of the SQL standard anyway.<br> <p> The reason for this new generation is that they scale better on one box, and scale better on multiple boxes. There's a reason that Amazon, Google, Yahoo, etc. aren't "powered by Oracle" at their heart.<br> <p> Each makes completely different assumptions about data. For example, if you are OK with "eventually consistent", you can have better availability during a network partition event.<br> <p> I think their biggest win will be performance. All of these projects are too young to be fully tuned, but "Real" databases have a lot of overhead logic (query parser, query optimizer, transaction subsystem) that could be tossed out if you want 'bare metal' performance. For example, storing your Order + all its LineItems together means less I/O. 
Even if you tell your RDBMS to write to memory, I'll bet it's doing all kinds of layout tricks to optimize the "disk".<br> <p> </div> Wed, 22 Apr 2009 12:49:05 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329494/ https://lwn.net/Articles/329494/ dlang <div class="FormattedComment"> I agree that this is a huge gap in this article.<br> <p> after the first two paragraphs it ignores the issue it raises to become just a list of random datastores<br> <p> to make a reasonable decision we need to know what the trade-offs are of each option<br> <p> for example<br> <p> memcached fails ACID because it stores everything in ram, so it loses the D (durability)<br> <p> note that many 'regular' databases can also be configured to sacrifice durability in the name of performance. <br> <p> <p> the real common thread between these datastores is less the fact that they sacrifice ACID than in the fact that they ignore SQL.<br> </div> Wed, 22 Apr 2009 02:31:16 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329222/ https://lwn.net/Articles/329222/ rfunk I may not have been clear. I'm not looking for a general explanation of ACID; I know what it is. I'm looking for <i>specifically</i> how each of these doesn't fit ACID. Mon, 20 Apr 2009 18:22:09 +0000 Alternatives to SQL Databases https://lwn.net/Articles/329044/ https://lwn.net/Articles/329044/ dmag <div class="FormattedComment"> <a href="http://en.wikipedia.org/wiki/ACID">http://en.wikipedia.org/wiki/ACID</a><br> <p> You're running a bank and want to debit $10 from one account and credit $10 to another account. You want it all to happen or none to happen. (Atomicity)<br> <p> You don't want your "end-of-month" summary report to be off by $10 when that money just happened to be "in transit" when the report was run. Nobody should see the in-between "bad" states. 
(Consistency, Isolation)<br> <p> Even more important, you don't want a server failure (crash, power off) to *ever* leave things in that intermediate state permanently. (Durability)<br> <p> ACID databases can't do much in parallel because they must always think about the strict ordering of transactions.<br> <p> On the other hand, if you're running a web forum, maybe you're willing to live with the possibility of losing a few messages (on server failure) or allowing new posts in a deleted forum (for a few seconds) in exchange for scaling 100x better.<br> <p> (I predict that non-relational/non-ACID will become the dominant form of databases -- because very few things actually need all properties of ACID.)<br> </div> Sun, 19 Apr 2009 17:00:08 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328842/ https://lwn.net/Articles/328842/ nlucas <div class="FormattedComment"> SQLite could also be mentioned here because its flexibility allows it.<br> If you don't mind losing data after a crash, disable that feature with "PRAGMA synchronous=OFF".<br> If you want a memory database, just open the ":memory:" database.<br> You can attach several databases and proceed as if it's just one.<br> If you want even more control, use the "virtual tables" feature, where you can treat, for example, CSV files as tables, or even create a virtual table that connects to another database.<br> By implementing a new "VFS" layer you can change the low-level interface with the "disk", like add encryption, compression, make I/O using mmapped memory, whatever.<br> It's not a database for dummies. The "lite" in SQLite means you don't have a full blown SQL optimizer, so you need to do the work of actually optimizing the SQL queries beforehand. 
That is your job as a programmer, not SQLite's.<br> <p> </div> Fri, 17 Apr 2009 15:13:02 +0000 CDB https://lwn.net/Articles/328778/ https://lwn.net/Articles/328778/ phiggins <div class="FormattedComment"> I just heard about CDB a couple of months ago and became enamoured with its simplicity. For applications where updates are infrequent and the size of the database is relatively small (such that the entire database can be rewritten whenever an update is made), CDB probably cannot be beat for reliability and performance. At first, I assumed that almost no domains would be suitable for this, but I was surprised when I started looking at what I've written that actually is suitable for CDB.<br> <p> The reliability of CDB depends on renaming a new file over the old database file, and I'm not sure if DJB's CDB calls fsync() or not. tinycdb does not. Make sure your filesystem can't lose your data with a rename before choosing CDB for reliability.<br> <p> <a href="http://cr.yp.to/cdb.html">http://cr.yp.to/cdb.html</a><br> <a href="http://www.corpit.ru/mjt/tinycdb.html">http://www.corpit.ru/mjt/tinycdb.html</a><br> </div> Thu, 16 Apr 2009 22:43:41 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328750/ https://lwn.net/Articles/328750/ jordanb <div class="FormattedComment"> Yeah in this case we tried making the LDAP init script do that, but it seemed to make the problem even worse. We began to suspect that the problem was that there was a version incompatibility between the bdb tools we had and the library being used, which was likely Fedora's fault, but we weren't able to verify that.<br> <p> But anyway, 'need to explicitly run recovery before attempting to use the database again' is a design failure, imho. 
The database should be able to recognize that it is not completely consistent and recover itself on startup, and there should always be enough data to reach a consistent state that's not too far from the state when the system crashed.<br> </div> Thu, 16 Apr 2009 19:43:37 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328743/ https://lwn.net/Articles/328743/ rfunk <div class="FormattedComment"> While none of these is relational, in most cases I don't see where they fail<br> ACID. Can anyone point to the specific non-ACIDity of each of these?<br> </div> Thu, 16 Apr 2009 18:04:25 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328734/ https://lwn.net/Articles/328734/ intgr <div class="FormattedComment"> I think the problem you were having with Sleepycat BDB wasn't "corruption" per se. BDB requires an explicit recovery command after an unclean shutdown and many applications using it do not automatically run this.<br> <p> This annoying feature has probably driven off many of Sleepycat's potential customers.<br> <p> </div> Thu, 16 Apr 2009 17:00:15 +0000 Stonebraker et al research supports traditional https://lwn.net/Articles/328689/ https://lwn.net/Articles/328689/ mrjk <div class="FormattedComment"> There has been a recent study by Michael Stonebraker and company comparing a "traditional" <br> parallel SQL database with Map-Reduce over a 100 server farm and finding the SQL version was <br> much more efficient. Now M.S. is not exactly unbiased here, but still it was interesting. You have to <br> be an ACM member right now I think, but look at ACM Transactions shortly I believe.<br> </div> Thu, 16 Apr 2009 14:12:24 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328653/ https://lwn.net/Articles/328653/ viiru <div class="FormattedComment"> Perhaps Memcachedb should've been mentioned here, also. It uses BerkeleyDB for storage and the memcached protocol for communication. 
It's available at <a href="http://memcachedb.org/">http://memcachedb.org/</a> and also packaged for Debian (by yours truly..)<br> </div> Thu, 16 Apr 2009 08:58:39 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328652/ https://lwn.net/Articles/328652/ djc <div class="FormattedComment"> Hmm, I've heard that Cassandra is still early stages, with a lot of big code drops from Facebook and not much development in the open.<br> <p> Might also be interesting to note that both Cassandra and CouchDB are now Apache projects (the former is still in the incubator; the latter recently graduated to top-level).<br> </div> Thu, 16 Apr 2009 08:54:17 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328638/ https://lwn.net/Articles/328638/ tstover <div class="FormattedComment"> Tokyo Cabinet also has a server component called Tokyo Tyrant that not only serializes access for multiple processes, but also supports the memcached protocol for a persistent memcached solution. It's also the successor to the noteworthy qdbm. It has much to offer those exploring data storage black magic.<br> <p> <p> </div> Thu, 16 Apr 2009 04:37:30 +0000 Alternatives to SQL Databases https://lwn.net/Articles/328634/ https://lwn.net/Articles/328634/ jordanb <div class="FormattedComment"> One thing to consider when looking at storing persistent data is how much effort the data store developers have put into maintaining data integrity. This was driven home to me when I was administering a computer running OpenLDAP and Fedora. 
The computer was in a situation where it would occasionally be hard-booted, and that combined with some (apparent) version incompatibilities with the Fedora repos would cause the sleepycat backend for the LDAP server to corrupt itself.<br> <p> Our eventual (horrible, rube-goldbergian) workaround was to store the actual information in MySQL/innodb and have a script nuke the LDAP database and re-inject the data whenever it fell over.<br> <p> I decided after that experience that if the data has even a chance of being important, then the most important property of any datastore is that the information should always be there and never be corrupt, regardless of whether the computer is hard-booted or the disk drive lies or Ted Tso decides that his ideology is more important than your data. <br> <p> A bit later I heard an interview with Richard Hipp in which he discussed how much effort they put into making sure data in sqlite is "in the oxide" -- going so far as simulating hard boots during writes in their testing procedures. I've since made sqlite my default data store whenever the data might be important and it's not in an RDBMS and would hesitate to go to something else without some assurance that they take a similar amount of interest in data integrity.<br> <p> That said, memcached is awesome for caching, and some of these things do sound interesting for storing unimportant data like search indices. <br> <p> <p> </div> Thu, 16 Apr 2009 04:16:21 +0000