Is this SQL databases or No-SQL?
Is this SQL databases or No-SQL?
Posted Mar 25, 2014 20:20 UTC (Tue) by Wol (subscriber, #4433)In reply to: Is this SQL databases or No-SQL? by farnz
Parent article: A discussion between database and kernel developers
Actually, alter that slightly. Change "SQL" to "Relational/First Normal Form". You can use SQL to access Pick databases, and if you're familiar with that I'd say that's not a bad idea.
Relational forces you to put everything into flat one-dimensional arrays. That has a bunch of nasty side-effects which I'll come to later, but basically it seriously flouts Einstein's dictum of "make everything as simple as possible BUT NO SIMPLER"! If I use relational maths to define my Pick database, I take each object of interest, analyse its attributes, and put them in a "table". That magic 254 character you saw simply tells Pick "this is a column separator", so I can do one-dimensional arrays like relational. But the magic 253 character means I can do a two-dimensional array! Where a single object has multiple instances of an attribute, I can store a list of them! For example, the table "person" can store a list of children!
This has several *massive* advantages. Let's say I'm writing a genealogy database. In both relational and Pick, I obviously need a "person" table". But in Pick I do NOT need a "is child of" table. I do not need a "is spouse of" table. (Obviously, I can have a "my father is" and "my mother is" column in both.) Oops - I'm now realising this might not be the best example, in that parent->child is a one->many that can be done pretty well in relational, but Pick can do a many->many this way, which requires an extra table in First Normal Form.
The next big advantage is, should you for some reason want to store the parent->child relationship in the parent record, you get order information "for free". To do this in relational, you would need an extra table, with three columns, ie parent, child, sort-order. AND SORT-ORDER IS NOT DATA - IT'S METADATA! The *content* of that field is meaningless, what matters is its collation sequence. So effectively, First Normal Form forces us to store garbage data.
The third big advantage is that we can store all strongly-related data in a single "row" of a "table". There are actually more magic characters - 252 and 251 (even 250 and 249) that let us create an array of 3 or more dimensions. Coupled with the fact we need a primary key, that we can store an array of foreign keys in a row, and that disk access has been optimised so much that the database generally retrieves a row with - on average - just over *one* seek request, that makes data access blindingly fast.
So basically, the main point of Pick is that the API between the datastore and the application passes *objects* back and forth. This then makes life conceptually simple for both the database guy and the application guy. If the database guy has any sense (Pick doesn't enforce it) he will do a relational analysis on the object so each row in the table exactly corresponds to a relational *view* of the object. So the application guy can use SQL to query the database if he wants :-)
Basically, from the engineering viewpoint, Pick mimics the real world - an object in the datastore matches an object in the real world - it's easy to understand. Also, because it's engineering-based, I can run engineering proofs! I can PROVE that Pick is fast - blinding fast, in fact.
Because Relational explicitly forbids the application layer to know anything about the database internals, such engineering proofs are impossible. In fact, I would go so far as to say that it is easy to prove that relational is crippled in its very design! Because an RDBMS is *not* a *data* store. It is a MODEL store. It contains oodles of crap "data" that has been created as a side effect of converting to first normal form - that sort column for instance! Even worse, it then passes all that crap back to the application layer so that the application has to recreate the object of interest.
A challenge I've repeatedly thrown out is "show me how to *STORE* a list in a first normal form database". So far, *NOBODY* has even responded to the challenge (a couple of people did say they'd do it, but they never did). I did say that you're not allowed to model the list, and you're not allowed to mix data and metadata in the same table. That pretty much defines the challenge as being mathematically impossible :-)
The other thing to note is that Pick seems to be faster and easier to develop in. Forgive me for giving you yet more reading matter - this is a manager's blog in which she wonders why her Pick staff always seemed to be on time and under budget, while the relational guys always overspent and ran out of time. She became a big Pick fan as a result.
http://www.tincat-group.com/mewsings/2006/01/is-codd-dead...
And one more small bit of reading ... Cache not Pick but never mind ...
http://www.linkedin.com/groups/PR-InterSystems-Cache-capt...
Note how they said Oracle could only achieve the required 100K rows/sec by cheating and inserting blobs. But cache did it no problem, and even achieved 250K rows/sec in production. Did I say Pick was blindingly fast (and yes I know Cache isn't Pick but it's engineeringly similar, but you've probably seen me compare Oracle on a twin Xeon 800 versus Pick on a P90 - and the Oracle system had difficulty keeping up ... !!!)
The only place, imho, where relational scores over Pick is that it doesn't have to answer the question "what is an object". This is, I freely admit, a very hard question. But, in dodging it, imho relational makes matters much worse. The real world comes as nouns, adjectives, and relations (we'll forget about verbs ...). Relational tries to model the real world solely in terms of adjectives and relations - and inevitably makes a pig's ear of it. You just can't describe the world in words of one syllable - it's too complicated for that.
Cheers,
Wol
