Alternatives to SQL Databases
Posted Apr 19, 2009 17:00 UTC (Sun) by dmag (subscriber, #17775)
You're running a bank and want to debit $10 from one account and credit $10 to another account. You want it all to happen or none to happen. (Atomicity)
You don't want your "end-of-month" summary report to be off by $10 when that money just happened to be "in transit" when the report was run. Nobody should see the in-between "bad" states. (Consistency, Isolation)
Even more important, you don't want a server failure (crash, power off) to *ever* leave things in that intermediate state permanently. (Durability)
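The debit/credit case above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the `accounts` table, the `transfer` helper, and the account names are hypothetical, and SQLite stands in for whatever database a bank would really use. Both updates run inside one transaction, so a failure part-way through rolls back the debit as well.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst: both updates commit, or neither does."""
    # `with conn` commits on success and rolls back if an exception escapes.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            # Raising here undoes the debit that was just applied.
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )

# Hypothetical schema and data, kept in memory for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 15), ("bob", 0)])
conn.commit()

transfer(conn, "alice", "bob", 10)      # succeeds
try:
    transfer(conn, "alice", "bob", 10)  # would overdraw alice, so it rolls back
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed second transfer the total across both accounts is unchanged, which is the "all or none" property described above.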
ACID databases can't do much in parallel because they must always respect the strict ordering of transactions.
On the other hand, if you're running a web forum, maybe you're willing to live with the possibility of losing a few messages (on server failure) or allowing new posts in a deleted forum (for a few seconds) in exchange for scaling 100x better.
(I predict that non-relational/non-ACID will become the dominant form of databases -- because very few things actually need all properties of ACID.)
Posted Apr 20, 2009 18:22 UTC (Mon) by rfunk (subscriber, #4054)
Posted Apr 22, 2009 2:31 UTC (Wed) by dlang (✭ supporter ✭, #313)
after the first two paragraphs it ignores the issue it raises to become just a list of random datastores
to make a reasonable decision we need to know what the trade-offs are of each option
memcached fails ACID because it stores everything in ram, so it loses the D (durability)
note that many 'regular' databases can also be configured to sacrifice durability in the name of performance.
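As one example of that kind of configuration (SQLite is an assumption here, the comment does not name a specific database), SQLite's `synchronous` pragma can be turned off so that commits no longer wait for data to reach the disk. Writes get faster, but an OS crash or power loss can lose or corrupt recent transactions. The `posts` table and file path below are hypothetical.

```python
import os
import sqlite3
import tempfile

# Hypothetical on-disk database in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)

# Give up durability for speed: do not wait for writes to be flushed to disk.
conn.execute("PRAGMA synchronous = OFF")
mode = conn.execute("PRAGMA synchronous").fetchone()[0]  # 0 means OFF

conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)")
with conn:  # still atomic, just no longer guaranteed to survive a power cut
    conn.execute("INSERT INTO posts (body) VALUES (?)", ("hello",))

count = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
conn.close()
```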
the real common thread between these datastores is less the fact that they sacrifice ACID than the fact that they ignore SQL.
Posted Apr 22, 2009 12:49 UTC (Wed) by dmag (subscriber, #17775)
Yes and no. The reason they don't have SQL is that they are young and focused on being different than RDBMSes.
It's actually not that hard to add some SQL support. Amazon's SimpleDB recently added "SQL-like" querying (nothing fancy, just "Select * from Table Where Field=Value"). There are a lot of SQL parsers out there, so it wouldn't be too hard for the others to add a large dose of SQL. Mind you, I don't think any of these will be 100% fully SQL-compliant. But then again, just about every RDBMS ignores some of the dark corners of the SQL standard anyway.
The reason for this new generation is that they scale better on one box, and scale better on multiple boxes. There's a reason that Amazon, Google, Yahoo, etc aren't "powered by Oracle" at their heart.
Each makes completely different assumptions about data. For example, if you are OK with "eventually consistent", you can have better availability during a network partition event.
I think their biggest win will be performance. All of these projects are too young to be fully tuned, but "Real" databases have a lot of overhead logic (query parser, query optimizer, transaction subsystem) that could be tossed out if you want 'bare metal' performance. For example, storing your Order + all its LineItems together means less I/O. Even if you tell your RDBMS to write to memory, I'll bet it's doing all kinds of layout tricks to optimize the "disk".
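The "Order + all its LineItems together" idea can be sketched as follows. Everything here is hypothetical: a plain dict stands in for a key-value datastore, and `put_order`/`get_order` are made-up helpers, not any real product's API. The point is only that the whole order comes back from a single key lookup instead of one read for the order row plus a separate query for its line items.

```python
import json

# A plain dict stands in for the key-value datastore (an assumption).
store = {}

def put_order(order_id, customer, line_items):
    """Store the order and all of its line items as one serialized value."""
    store[f"order:{order_id}"] = json.dumps(
        {"customer": customer, "line_items": line_items}
    )

def get_order(order_id):
    """Fetch the whole order, line items included, with one lookup."""
    return json.loads(store[f"order:{order_id}"])

put_order(
    1001,
    "alice",
    [
        {"sku": "A-1", "qty": 2, "price_cents": 500},
        {"sku": "B-7", "qty": 1, "price_cents": 1250},
    ],
)

order = get_order(1001)
total_cents = sum(item["qty"] * item["price_cents"] for item in order["line_items"])
```

The trade-off is that queries cutting across orders (say, every order containing a given SKU) are no longer a simple join and have to be handled some other way.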
Posted Apr 22, 2009 23:03 UTC (Wed) by nix (subscriber, #2304)
For a long time they had their advertising slogan 'Oracle Software Powers
The Internet' on there. This led to despairing laughter and the
occasional 'thank god it doesn't' from everyone who saw it, including
various Oracle employees.
Oracle is quite good at massive thumping bank systems, but I wouldn't back
a website with it if I were you. Even Oracle can't make that work.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds