Is this SQL databases or No-SQL?
Posted Mar 11, 2014 12:41 UTC (Tue) by Wol (subscriber, #4433)
Parent article: A discussion between database and kernel developers
I'm trying (for the umpteenth time) to write my own Free MV implementation, so here's my take on things ...
From the Multi-Value/Pick POV, temporary ram isn't important. The original design was very ram-starved ($1000s per 4KB) and it didn't have the concept of disk - the hard disk was used as permanent virtual ram. That said, there are probably a lot of "nice to have" features there.
Two of them stand out. One is the ability to hint to the kernel that certain files (or parts thereof) are best kept in memory if possible - lookup tables, for example. The other is the ability to memory-map files - if I design it so an account is a single file at the OS level, I can then stick to the original design and treat disk space as if it were ram - a persistent backing store. Someone described MV as a "directed graph", which is close to the mark, so I don't need ram to store large amounts of temporary data - a typical query will retrieve a list of primary keys off the principal table, then simply retrieve each row from disk as required.
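A minimal sketch of the two features mentioned above (memory-mapping a file and hinting that it should stay cached), assuming a Unix-like system and Python 3.8 or later. The function name `map_with_hint` is hypothetical, and both hints are advisory only: the kernel is free to ignore them, and nothing here pins pages in memory.

```python
import mmap
import os
import tempfile


def map_with_hint(path):
    """Map a non-empty file read/write and hint that its pages will be needed."""
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.fstat(fd).st_size
        # Advisory readahead hint on the file itself (not available everywhere).
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, size, os.POSIX_FADV_WILLNEED)
        # On Unix the default is a shared, read/write mapping, so stores
        # into the mapping end up in the file.
        mapping = mmap.mmap(fd, size)
        # Same hint, expressed on the mapping (Python 3.8+, platform dependent).
        if hasattr(mapping, "madvise") and hasattr(mmap, "MADV_WILLNEED"):
            mapping.madvise(mmap.MADV_WILLNEED)
        return mapping
    finally:
        # The mapping holds its own duplicate of the descriptor.
        os.close(fd)


if __name__ == "__main__":
    # Hypothetical "account" file: one 4096-byte block of zeros.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * 4096)
        account_path = f.name
    account = map_with_hint(account_path)
    account[0:5] = b"hello"   # treat the file as if it were memory
    account.flush()           # push dirty pages back to the file
    account.close()
    os.unlink(account_path)
```

The point of the sketch is only that the file is addressed like a byte array while the kernel decides what actually stays in ram.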
Looking at 8KB write support - yes, this would be great for MV. Tables are "blocky" and you can guarantee that they will be written in fixed-size chunks. The block size was initially 512 bytes or 2KB (to match the disk block sizes of the day). Most modern implementations allow you to specify the size in multiples of those two. From my point of view, it would be nice to be able to query the fs and ask what the optimal size is, but note that more than a few KB is NOT optimal for the data loads! So large block sizes (and 8KB is tending that way) are not a good idea. Basically, if I can store more than 4 or 5 rows per block then the block is too large, and a row is typically measured in bytes, not KB.
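As a rough illustration of querying the filesystem for its preferred size and applying the rows-per-block rule of thumb, here is a small sketch assuming a Unix-like system. The helper names and the default limit of 5 rows are assumptions made for the example; `st_blksize` and `f_bsize` only report what the filesystem prefers for I/O, which is not necessarily the best size for an MV table.

```python
import os


def preferred_block_size(path):
    """Return the filesystem's preferred I/O block size for path, in bytes."""
    st = os.stat(path)
    size = getattr(st, "st_blksize", 0)   # not present on every platform
    if not size:
        size = os.statvfs(path).f_bsize   # fall back to the fs block size
    return size


def rows_per_block(block_size, avg_row_bytes):
    """How many whole rows of the given average size fit in one block."""
    return block_size // avg_row_bytes


def block_too_large(block_size, avg_row_bytes, max_rows=5):
    """Rule of thumb: more than max_rows rows per block means the block is too big."""
    return rows_per_block(block_size, avg_row_bytes) > max_rows


if __name__ == "__main__":
    fs_size = preferred_block_size(".")
    print("filesystem prefers", fs_size, "byte blocks")
    # Hypothetical table whose rows average 200 bytes.
    print("too large for 200-byte rows?", block_too_large(fs_size, 200))
```

With rows of a few hundred bytes, anything much above 1KB or 2KB already trips the rule, which is why an 8KB unit looks large from the MV side.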
Disk writeback stuff ... fsync and friends are well-known performance killers. I don't know much about the other things, but one thing that would be great is the ability to insert a write barrier. I'm planning to make all my files COW so I want to be able to guarantee write order. I want to be able to call the OS at the end of a transaction and say "flush all this stuff in this order, but I don't want to wait until it's done". The problem is I can't guarantee over what extent this will be required! If an account is a single file, then *typically* the barrier will extend only to this file. But my preferred approach is one file per table, so I'll need a barrier on a per-process basis, not a per-file basis. And of course, once I do that, I can't guarantee that all the files are on the same fs, or even the same computer! There, though, I would simply have to warn the user and say that the guarantees don't apply.
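For contrast, this is roughly what ordering a COW commit looks like with the blocking calls that exist today, assuming a Unix-like system. It is a sketch, not the non-blocking barrier asked for above: each `os.fsync` waits for the disk, which is exactly the cost being complained about. The file layout (an 8-byte root pointer at offset 0) and the names `cow_commit` and `read_root` are hypothetical.

```python
import os
import struct
import tempfile

# Hypothetical layout: the first 8 bytes hold the offset of the live copy.
HEADER = struct.Struct("<Q")


def cow_commit(fd, payload):
    """Append a new copy, make it durable, then repoint the header at it."""
    end = os.lseek(fd, 0, os.SEEK_END)
    offset = max(end, HEADER.size)          # never overwrite the header
    os.pwrite(fd, payload, offset)          # old copy is left untouched
    os.fsync(fd)                            # ordering point: data before pointer
    os.pwrite(fd, HEADER.pack(offset), 0)   # switch the root to the new copy
    os.fsync(fd)                            # make the switch durable as well
    return offset


def read_root(fd, length):
    """Read length bytes from wherever the header currently points."""
    (offset,) = HEADER.unpack(os.pread(fd, HEADER.size, 0))
    return os.pread(fd, length, offset)


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        cow_commit(fd, b"v1")
        cow_commit(fd, b"v2!")
        print(read_root(fd, 3))
    finally:
        os.close(fd)
        os.unlink(path)
```

The first `fsync` stands in for the barrier: it guarantees order only by waiting, and only within this one file, so it does nothing for the one-file-per-table case.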
So, speaking from an engineering pov, and from an MV pov too, I'd say that any enhancements to disk i/o would probably help MV as much as SQL, but because MV just doesn't use anywhere near as much ram, improvements there are probably not going to make much difference to us.
Cheers,
Wol
