|
|
Log in / Subscribe / Register

Is this SQL databases or No-SQL?

Is this SQL databases or No-SQL?

Posted Mar 28, 2014 23:52 UTC (Fri) by Wol (subscriber, #4433)
In reply to: Is this SQL databases or No-SQL? by intgr
Parent article: A discussion between database and kernel developers

> You make all these sweeping claims about relational databases. But how much experience do you actually have with them?

I've used them. I wasn't impressed. Where I worked, management decided to migrate from Pick to SQL-Server. Okay, my knowledge of SQL might not be good, but when it takes me maybe seconds to run a dump from my NF2 table, and it then takes SQL-Server hours to run the script to create the matching SQL tables ... when I do a "dir" (it was on Windows) to find out how much space my Pick tables were taking up, and finding out that the SQL data store was nigh on an order of magnitude larger ... and we had consultants who could easily have helped me speed things up but I got the impression they were happy with that sort of speed - WTF!

And I'm interested in the theory of relational systems. You know I've repeated said "you're not supposed to know how the RDBMS stores data"? And it's been repeatedly pointed out that we know how PostgreSQL does it? But that requirement is spec'd in the definition of a generic RDBMS. That's what really annoys me - the spec says "users aren't allowed to know how the RDBMS does it, to permit the RDBMS developers find a faster way". As an *engineer*, I can prove that Pick does it the fastest way possible so why on earth would I want to allow space for the discovery of a faster way!?!?!?

Likewise, First Normal Form requires that everything is *forced* into sets. The poor business analyst is forced to use a sledgehammer to smash square lists and rhomboid bags into round sets! As an *engineer* I can prove this overcomplicates things, and means that an RDBMS is irreparably inefficient.

> I tried, but failed. No, all the fields in a single row are stored together. The difference from Pick is that there are no field terminators. Instead, variable-length fields are prefixed with their length. If it's not the column you're looking for, the database reads the length value of the field and skips that number of bytes to find the next column. Fixed-length fields like integer/float/date/etc, do not need to be prefixed with their length since their offset is known just by their data type.

Ah. But in practice, Pick is just as fast. When looking for column N

searchcol = N;
while (searchcol != 0 & not(end of string)) {
if currentchar == columnmarker) searchcol--;
get next char;
}

Okay, that's pseudocode but in practice it's just as fast as

currentpos = 0;
for (column = 0; column == N, column++) {
switch (columntype) {
case integer:
currentpos += 4;
case date:
currentpos += ?
...
...
}

(or maybe even faster! How long does that switch take to run for each column?)

Again, Pick gives you plenty of rope to hang yourself with - if you're daft enough to declare column 1000 (and I've known people do it) you're going to shoot yourself in the foot :-)

> Adding a column that defaults to NULL (i.e. no value) is quick PostgreSQL and many other databases too.

:-) Which is why I restricted my comment to older RDBMSs. I know newer ones have cracked it. Chances are, however, adding a field that has a default value that needs updating every single row is pretty quick in Pick. If you're sensible about it (and it's easy to force in Pick), it would probably be a case of "read block 1, write block 1, read block 2, write block 2, ..., read block n, write block n". How can you do it faster than that?

> You can call any piece of data "just a string". That doesn't give you any insight about how it's accessed.

But as far as the Pick data store (the DBMS itself) is concerned, why should the DBMS access the data at all? As far as the DBMS itself is concerned, the row is just a blob.

Part of the trouble, I think, is that an RDBMS internally contains several layers. There's the data store. Then there's the indexing and metadata management layer. Then there's the table management layer. Then there's the view management layer. And only at this point do we have the First Normal Form API through which all access to the database is supposed to occur. With Pick, so much more of this is exposed to the application (and, imho, the RDBMS FNF API is like a lady's corset that forces data into an unatural shape to get through it :-)

The only reason a Pick mvDBMS has to concern itself with the layout of the data is in the index and metadata management layer.

> SQLite uses length-prefixed fields too, not terminator bytes.

Ah. But apparently (so I've heard) they're all strings. Not that I've ever seen it, but I have heard that one of its failure modes is putting text data into numeric fields ... That was what I meant - that apparently all data in SQLite is stored as text.

Cheers,
Wol


to post comments

Is this SQL databases or No-SQL?

Posted Apr 9, 2014 22:05 UTC (Wed) by kleptog (subscriber, #1183) [Link]

Your comments about Pick are interesting and I'm trying to follow you, but comments like this:

> I know newer ones have cracked it. Chances are, however, adding a field that has a default value that needs updating every single row is pretty quick in Pick. If you're sensible about it (and it's easy to force in Pick), it would probably be a case of "read block 1, write block 1, read block 2, write block 2, ..., read block n, write block n". How can you do it faster than that?

Make me just go "wow!". Obviously adding a column doesn't require rewriting the table, the database simply records that a column has been added and whenever a row is read that is missing that column it fills in the default value.

You seem to be quick to tar all RDBMSes with a single brush. No modern database system implements strict SQL. Indexes are not part of SQL, think about that. Yet every RDBMS has them. You keep talking about C&D's 1st rule, but no database actually requires that. PostgreSQL has supported arrays since the beginning and key/value fields for a long time. Recent releases allow you to store and index JSON documents. Which seems to me to give you all the performance of Pick with the flexibility of an RDBMS.

RDBMS builders are engineers too.

But finally, the bit that worries me most about Pick is that the entire pickwiki doesn't mention transactions at all. One of the fundamental features of databases and it's not even mentioned. You mention Pick uses hashes, but transaction safe hashes have historically been very difficult so the combination makes me think Pick don't do it at all.

And don't say "but when you store complete documents you don't need transactions". In the real world you need to be able to read a document, make changes to it and write it back knowing no-one else changed it in the meantime. You need to be able to perform complex queries without worrying you're getting an inconsistant state. Systems that don't need transactions tend to be trivial systems.

Hopefully I'm wrong though...


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds