Is this SQL databases or No-SQL?
Is this SQL databases or No-SQL?
Posted Mar 28, 2014 1:03 UTC (Fri) by Wol (subscriber, #4433)In reply to: Is this SQL databases or No-SQL? by intgr
Parent article: A discussion between database and kernel developers
Bang on - there are various safeguards in place that help prevent misuse, but at the end of the day it's down to the programmer to get it right.
> I haven't used Oracle much in particular, but I think query optimizers are a godsend based on my experience with PostgreSQL.
Let's look at what a relational optimiser has to do. Assuming crude unoptimised data access, EVERY row you look for has to search half the table!
Okay, if I'm only searching for a single row in Pick, it's quite likely I'll have to do the same. BUT!!!
Let's take the example of an invoice. Have you keyed it on the invoice number? I quite likely have. If I have, my mvDBMS can tell the OS where on the disk the data I am looking for is, with a 97% chance of being right. "Open file X, seek to position Y, read that block". If your invoice table is big (let's say 1Mb) you've got to read 1/2Mb to find it!
I've now got ALL the invoice data. Chances are (in fact it's pretty certain) your invoice data is plastered over several tables, so you are going to get further clobbered. Then what happens if you want to access the related customer details? Yet again, you probably have the data plastered over several tables and have to read half of each table to retrieve the data. My invoice record will contain the company key, and once again, I will be able to tell the OS exactly where to find it with a 97% chance of being right.
In other words, a raw unoptimised RDBMS data access is incredibly inefficient, and it gets worse as the database gets bigger. An efficiency of 1% is probably very good! So an optimiser doesn't have to be very good in order to earn its keep.
But for Pick? Let's take my worst case - assume I haven't keyed on invoice no. If I don't have an index I'll have to scan the entire table - so my worst case is equal to yours (if I'm being stupid!) But all I've got to do is declare an index on the invoice no ... and it takes one disk read (97% probability) to convert from invoice no to invoice key, and one more to retrieve the invoice.
(Okay, you can index your primary invoice table on invoice no too, but then you have to index all your subtables too, with all the overhead that involves.)
So. Your SQL optimiser can achieve a doubling of performance by increasing efficiency by a fraction of a percent.
My Pick optimiser (if I had one) can achieve a maximum of 1/3 of one percent increase in performance before it hits the unbreachable barrier of perfect efficiency. It's just not worth spending any effort searching for those sort of gains.
Oh - and there's another thing to take home from this. When making my claims about RDBMSs, I'm mostly speculating. You see, I'm *not* *allowed* to know what's going on under the covers, so I'm forced to guess. Because I know pretty much exactly what's going on under the covers of Pick, I can use logic to reason about it, and PROVE what's going on. I've got figures to work with. (Okay, the figures I have are the inverse of 97%. In a properly balanced Pick file, it takes on average 1.05 "get this block from disk" commands to score a direct hit on the target data. Oh, and rebalancing the file after a write (if that's necessary) incurs about 3 more writes. That's not that much overhead.)
And because every Pick record MUST have a key, most of the time I'm not searching for data, but targeting known records for which I have the key. (Plus I have far fewer tables to search to retrieve the same data.)
Cheers,
Wol
