The entire noSQL family of servers is based on relaxing the reliability constraints of the classic ACID protections that SQL databases provided.
Posted Feb 9, 2012 0:17 UTC (Thu) by Wol (guest, #4433)
Because if it's at the OS/disk interface, what the heck is ACID doing in the database? It can't provide ANY guarantees, because it's too remote from the action.
And if it's at the db/OS interface, well as far as Pick is concerned, most transactions are near-enough atomic that the overhead isn't worth the cost (that was my comment about "90% of the time").
Your relational bias is clouding your thinking (although Pick might be clouding mine :-) But just because relational cannot do atomic transactions to disk doesn't mean Pick can't. As far as Pick is concerned, that transaction is atomic right up to the point that the OS code actually puts the data onto the disk. And if the OS screws that up, ACID isn't going to save you ...
Think of a "begin transaction" / "end transaction" pair. It's almost impossible for that transaction to truly be atomic in a relational database - you will invariably need to update multiple rows. In Pick, it's more than possible for that transaction to be truly atomic at the point where the db hands it over to the OS. ACID enforces atomicity between the OS and the db. Pick doesn't need it.
What guarantees does ACID provide over and above data consistency? Because a well-designed Pick app guarantees "if it's there it's consistent". And if the OS screws up and corrupts it, neither Pick nor ACID will save you.
Posted Feb 9, 2012 0:54 UTC (Thu) by dlang (subscriber, #313)
ACID is a feature that SQL databases have had, but you don't need to abandon SQL to abandon ACID, and you don't need to have SQL to have ACID.
Berkeley DB is ACID but not SQL; MySQL was SQL but not ACID with the default table types for many years.
ACID involves the database application doing a lot of stuff to provide the ACID guarantees to users by using the features of the OS and hardware. If the OS/hardware lies to the database application about when something is actually completed then the database cannot provide ACID guarantees.
It appears that you have an odd interpretation of what ACID means, so let's review:
Atomicity: A transaction is either completely implemented or not implemented at all. For changes to a single record this is relatively easy to do, but if a transaction involves changing multiple records (subtract $10 from account A and add $10 to account B) it's not as simple as atomically writing one record. Remember that even a single write() call in C is not guaranteed to be atomic (it's not even guaranteed to succeed fully; you may be able to write part of it and not other parts).
Consistency: This says that at any point in time the database will be consistent, by whatever rules the database chooses to enforce. Berkeley DB has very trivial consistency checks: the records must all be complete. Many SQL databases have far more complex consistency requirements (foreign keys, triggers, etc.).
Isolation: This says that one transaction cannot affect another transaction happening at the same time.
Durability: This says that once a transaction is reported to succeed, nothing, including a system crash at that instant (but excluding something writing over the file on disk), will cause the transaction to be lost.
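To make the point about partial writes concrete, here is a minimal sketch in Python, whose os.write() wraps the same system call and likewise returns how many bytes were actually accepted. The helper name write_all is an illustrative assumption, not something from any particular database.

```python
import os


def write_all(fd: int, data: bytes) -> None:
    """Keep calling os.write() until every byte has been handed to the OS.

    A single os.write() may accept fewer bytes than offered, so the
    caller has to loop. Note that this only gets the data into the OS
    buffers; it says nothing about the data being on disk.
    """
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may be a short write
        view = view[written:]         # retry with whatever is left
```

Even with the loop, a crash between two iterations leaves a half-written record, which is why a record-sized write cannot be treated as atomic on its own.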
What you are describing about Pick makes me think that it has very loose consistency and isolation requirements, but to get Atomicity and Durability the database needs to be very careful about how it writes changes.
It cannot overwrite an existing record (because the write may not complete), and it must issue appropriate system calls (fsync and similar) to the OS, and watch for the appropriate results, to know when the data has actually been written to disk and will not change.
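One common way to follow both rules (never overwrite in place, and wait for fsync before trusting the data) is to write a new copy, flush it, and then rename it over the old one. The sketch below assumes a POSIX-style filesystem where rename within one directory is atomic; the function name atomic_replace is hypothetical, and real database engines typically use a write-ahead log rather than whole-file replacement.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Replace the file at `path` so readers see the old or new content, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version next to the old one, never on top of it.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to push its buffers to disk
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Assumption: the platform allows opening a directory so that the
    # rename itself can be flushed (true on Linux, not on Windows).
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The two fsync calls are where the cost comes from: each one blocks until the storage stack reports the data as written, and that report is only as honest as the OS and hardware underneath.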
It's getting this last part done that really differentiates similar database engines from each other. There are many approaches to doing this and they all have their performance trade-offs. If you are willing to risk your data by relaxing these requirements a database becomes trivial to implement and is faster by several orders of magnitude.
note how the only SQL concept that is involved here is the concept of a transaction in changing the data.
Posted Feb 9, 2012 20:22 UTC (Thu) by Wol (guest, #4433)
Atomic: as I said, a transaction in a relational database will pretty much inevitably be split across multiple, often many, tables. In Pick, all dependent attributes (excluding foreign-key links) will be updated as a single transaction right down to the file-system layer. So, as an example, if I have separate FILEs for people and buildings, it's possible I'll corrupt "where someone lives" if I update the person and fail to create the building, but I won't have inconsistent person or building data.
Consistency: IF designed properly, a Pick database should be consistent within entities, i.e. across all the data associated with an individual "real world primary key". Relations between entities could get corrupted, but that *should* be solved with good programming practice - in my example above, "lives at" is an attribute of person, so you update building then person.
Isolation: I don't quite understand that, so I won't comment.
Durability: Well, when I tried to write a Pick engine, my first reaction to actually writing FILEs to disk was "copy on write seems pretty easy...". And there comes a point where you have to take the OS on trust.
So I think my premise still stands - a LOT of the requirement for ACID is actually *caused* by the rigorous separation demanded by relational between the application and the database. By allowing the application to know about (and work with) the underlying database structure you can get all the advantages of relational's rigorous analysis, all the advantages of a strong ACID setup, and all the advantages of noSQL's speed. But it depends on having decent programmers (cue my previous comment about Pick and C giving you all the rope you need ...)
And one of the reasons I wanted to write that Pick db engine was so I could put in - as *optional* components - loads of stuff that enforced relational constraints to try and rein in the less-competent programmers! I want a Modula-2 sort of Pick, that by default protects you from yourself, but where the protections can be turned off.
Posted Feb 9, 2012 20:36 UTC (Thu) by dlang (subscriber, #313)
consistency, what if part of your updates get to disk and other parts don't? what if the OS (or drive) re-orders your updates so that the write to the record for person happens before the write to building?
As far as durability goes, if you don't tell the OS to flush its buffers (which is what fsync does), then in a crash you have no idea what may have made it to disk and what didn't.
Posted Feb 10, 2012 16:17 UTC (Fri) by Wol (guest, #4433)
Well, if you define the transaction as an entity, then it gets written to its own FILE. If the system crashes then you get a discrepancy that will show up in an audit. It makes sense to define it as an entity - it has its own "primary key" ie "time X at teller Y". Okay, you'll argue that I have to run an integrity check after a crash (true) while you don't, but I can probably integrity-check the entire database in the time it takes you to scan one big table :-)
Consistency? Journalling a transaction? Easily done.
And yes, your point about flushing buffers is good, but that really should be the OS's problem, not the app (database) sitting on top. Yes I know, I used the word *should* ...
Look at it from an economic standpoint :-) If my database (on equivalent hardware) is ten times faster than yours, and I can run an integrity check after a crash without impinging on my users, and I can guarantee to repair my database in hours, which is the economic choice?
Marketing 101 - proudly announce your weaknesses as a strength. The chances of a crash occurring at the "wrong moment" and corrupting your database are much higher with SQL, because any given task will typically require tens to hundreds of times more transactions between the db and OS than Pick. So SQL needs ACID. With Pick, the chances of a crash happening at the wrong moment and corrupting data are much, much lower. So expensive strong ACID actually has a prohibitive cost. Especially if you can get 90% of the benefits for 10% of the effort.
I'm not saying ACID isn't a good thing. It's just that the cost/benefit equation for Pick says strong ACID isn't worth it - because the benefits are just SO much less. (Like query optimisers. Pick doesn't have an optimiser because it's pretty much a dead cert the optimiser will save less than it costs!)
Posted Feb 10, 2012 18:43 UTC (Fri) by dlang (subscriber, #313)
that doesn't sound like a performance win to me.
Posted Feb 11, 2012 2:30 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
Posted Feb 11, 2012 5:48 UTC (Sat) by dlang (subscriber, #313)
besides, git tends to keep the most recent version of a file uncompressed, it's only when the files are combined into packs that things need to be reconstructed, and even there git only lets the chains get so long.
Posted Feb 11, 2012 13:44 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
NoSQL systems work in a similar way - they can store the 'tip' of the data, so that they don't have to reapply all the patches all the time. However, the latest data view can be rebuilt if required.
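A toy illustration of the "tip plus patches" idea, assuming each patch is just a dictionary of changed fields. The class name PatchLog and its methods are hypothetical and not taken from any particular NoSQL system.

```python
from dataclasses import dataclass, field


@dataclass
class PatchLog:
    """Append-only history of patches plus a cached 'tip' (latest view)."""
    patches: list = field(default_factory=list)  # full history, never rewritten
    tip: dict = field(default_factory=dict)      # cached result of applying it all

    def apply(self, patch: dict) -> None:
        """Record the patch and bring the cached tip up to date."""
        self.patches.append(dict(patch))
        self.tip.update(patch)

    def rebuild(self) -> dict:
        """Recompute the latest view from scratch by replaying every patch."""
        view = {}
        for patch in self.patches:
            view.update(patch)
        return view
```

Reads normally go to tip; rebuild() is the slow path for when the cached view is lost or suspect.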
Posted Feb 12, 2012 15:57 UTC (Sun) by nix (subscriber, #2304)
Posted Feb 12, 2012 18:29 UTC (Sun) by dlang (subscriber, #313)
and it's frequently faster to read a compressed file and uncompress it than it is to read the uncompressed equivalent (especially for highly compressible text like code or logs), I've done benchmarks on this within the last year or so
Posted Feb 12, 2012 13:38 UTC (Sun) by Wol (guest, #4433)
Each month, when you run end-of-month statements, you save that info. When you update an account you keep a running total.
If the system crashes you then do "set corruptaccount = true where last-month plus transactions-this-month does not equal running balance". At which point you can do a brute force integrity check on those accounts.
(If I've got a 3rd state of that flag, undefined, I can even bring my database back on line immediately I've run a "set corruptaccount to undefined" command!)
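The post-crash check described above can be sketched in a few lines of Python. The record layout (Account, with balances held as integer cents) and the function name flag_corrupt_accounts are assumptions made for illustration; this is not Pick syntax.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Account:
    last_month_balance: int   # balance saved at the end-of-month run, in cents
    running_balance: int      # running total maintained on every update
    transactions: list = field(default_factory=list)  # this month's amounts
    corrupt: Optional[bool] = None  # None plays the role of "undefined"


def flag_corrupt_accounts(accounts: dict) -> list:
    """Set the corrupt flag on every account and return the suspect ids."""
    suspects = []
    for account_id, acct in accounts.items():
        expected = acct.last_month_balance + sum(acct.transactions)
        acct.corrupt = expected != acct.running_balance
        if acct.corrupt:
            suspects.append(account_id)
    return suspects
```

Only the accounts returned here need the brute-force check; everything else can go back into service. The check can only detect a mismatch, though. It cannot tell whether the running total or the transaction list is the side that was lost.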
And in Pick, that query will FLY! If I've got a massive terabyte database that's crashed, it's quite likely going to take a couple of hours to reboot the OS (I just rebooted our server at work - 15-20 mins to come up including disk checks etc). What's another hour running an integrity check on the data? And I can bring my database back on line immediately that query (and others like it) have completed. Tough luck on the customer whose account has been locked ... but 99% of my customers can have normal service resume quickly.
Thing is, I now *know* after a crash that my data is safe, I'm not trusting the database company and the hardware. And if my system is so much faster than yours, once the system is back I can clear the backlog faster than you can. Plus, even if ACID saves your data, I've got so much less data in flight and at risk.
But this seems to be mirroring the other debate :-) the moan about "fsync and rename" was that fsync was guaranteeing (at major cost) far more than necessary. The programmer wanted consistency, but the only way he could get it was to use fsync, which charged a high price for durability. If I really need ACID I can use BEGIN/END TRANSACTION in Pick. But 99% of the time I don't need it, and can get 90% of its benefits with 10% of its cost, just by being careful about how I program. At the end of the day, Pick gives me moderate ACID pretty much by default. Why should I have to pay the (high) price for strong ACID when 90% of the time, it is of no benefit whatsoever? (And how many SQL programmers actually use BEGIN/END TRANSACTION, even when they should?)
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds