Status.net service to phase out, replaced by pump.io
Posted Dec 18, 2012 22:51 UTC (Tue) by job (guest, #670)
Cool new Free software
Posted Dec 19, 2012 8:41 UTC (Wed) by man_ls (subscriber, #15091)
Give it a try sometime; there is a reason node.js is taking the world by storm. We are not talking about a Java or .NET corporate campaign. Plus, the new stack is heavily based on Free software!
Posted Dec 19, 2012 10:21 UTC (Wed) by juliank (subscriber, #45896)
Posted Dec 19, 2012 12:20 UTC (Wed) by man_ls (subscriber, #15091)
Our friends in the functional community should appreciate the return of functional programming, with a vengeance: in node, everything is a callback. I remember reading the tortured syntax that Paul Graham had to write to enable some kind of pseudo-multiprocessing in Lisp; in node, it is all there and nicely managed for you.
Posted Dec 19, 2012 14:40 UTC (Wed) by ms (subscriber, #41272)
Callbacks lead to horrible staircases of code which tend to look fugly, and infect all APIs everywhere. Things like promises make for slightly nicer looking code, and give you more choices from an API POV, but are still not brilliant. IMHO, the "fact" is that for nice looking code, which is intuitive and readable, you want to be able to write blocking code.
I just don't understand why people think that callback-based programming is necessary for good performance. Most sane languages with a good RTS and green threads will be able to match the performance of NodeJS for anything involving IO, and almost certainly lead to better-looking code. I find the popularity of NodeJS most perplexing. Maybe most other languages do not have a good RTS or green threads.
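As a minimal sketch of the "staircase" shape described above - the helper and file names are hypothetical, and Node-style error-first callbacks are assumed:

    import { readFile, writeFile } from "fs";

    // Each async step nests one level deeper, and the error has to be
    // checked at every rung of the staircase.
    function copyUppercased(src: string, dst: string, done: (err?: Error) => void) {
        readFile(src, "utf8", (err, text) => {
            if (err) return done(err);
            writeFile(dst, text.toUpperCase(), (err2) => {
                if (err2) return done(err2);
                done(); // three levels deep for just two I/O operations
            });
        });
    }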
Posted Dec 19, 2012 15:53 UTC (Wed) by man_ls (subscriber, #15091)
Anyway it is possible to write code that hides some of this complexity, avoiding anonymous functions for most of the work. The functional programming model and the object oriented stuff can be made to play together nicely with some effort.
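One way to hide some of that complexity, as suggested above, is simply to name the callbacks instead of nesting anonymous functions - a sketch using the same hypothetical helper:

    import { readFile, writeFile } from "fs";

    // The same two steps, but each callback is a named function, so the
    // code reads top to bottom instead of staircasing to the right.
    function copyUppercased(src: string, dst: string, done: (err?: Error) => void) {
        readFile(src, "utf8", onRead);

        function onRead(err: NodeJS.ErrnoException | null, text: string) {
            if (err) return done(err);
            writeFile(dst, text.toUpperCase(), onWrite);
        }

        function onWrite(err: NodeJS.ErrnoException | null) {
            if (err) return done(err);
            done();
        }
    }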
For a few years now I have wondered why the kernel doesn't do something similar with an application's event loop: instead of the app having to call and wait for some time (or, god forgive us, poll all the time), the kernel should drop its privileges and give control to the app's process directly. Similar to Java event handlers, I guess, but without the JVM overhead. Async everywhere!
Posted Dec 19, 2012 16:13 UTC (Wed) by nix (subscriber, #2304)
the kernel should drop its privileges and give control to the app's process directly
Posted Dec 19, 2012 16:29 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 19, 2012 16:52 UTC (Wed) by nix (subscriber, #2304)
Creating a new process for every single event seems sure to be far more expensive than just a privilege transition out of kernel mode. My system here can manage about five or ten thousand forks per second, but it can manage well over a million context switches into and out of kernel mode per second.
(As for a process being 'based on' a kernel process, the level of your confusion regarding how processes work in Unix is seemingly so great that I can't even imagine what you intend this to mean. Your last line suggests that you expect it to have a whole new address space of its own, which makes the thing a process-creation-level of difficulty rather than a thread-creation-level, and requires among other things a TLB flush. Expensive, expensive...)
Posted Dec 19, 2012 17:27 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 19, 2012 19:42 UTC (Wed) by nix (subscriber, #2304)
Posted Dec 19, 2012 19:47 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 19, 2012 16:53 UTC (Wed) by mpr22 (subscriber, #60784)
Posted Dec 19, 2012 17:31 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 19, 2012 19:48 UTC (Wed) by nix (subscriber, #2304)
The softer version of fork() you refer to is called thread creation. It is faster -- I can generate a few tens of thousands of threads a second on this system -- but still not anywhere near as fast as context switching. You are still throwing performance away for, as far as I can tell, no benefit.
Worse yet -- if you're receiving a stream of events from some input device, they are almost certainly related in some way, and you almost always want to process them in the same process, or at least in related processes. Kicking off a new process for each event seems likely to lead to cursing from developers as they have to marshal the things back together again before they can process them. End result: a slower system, annoyed developers, more complex and buggy code, and... uh... no benefit, as far as I can see. What benefit is all this supposed to bring? Is it just 'more processes -> more isolation -> always necessarily better'? 'cos I used to think like that, and it just isn't so.
Posted Dec 19, 2012 20:46 UTC (Wed) by man_ls (subscriber, #15091)
The thing is that an event should only do what it is supposed to do, and nothing else. If the result of a mouse movement is that a counter is updated, then let the kernel do it, and not bother my own process; when my process needs to read the counter it will do that. If the mouse pointer is moved and I want to paint a cool shadow, then ideally I would let the kernel know and the kernel will paint the cool shadow. Many things work like that now; it would be great if it was the default mechanism to answer user events. Look at it as dtrace on steroids, if you want.
For a long task (such as a menu selection), then by all means create a process and let it deal with the event. Otherwise, store the new value and let me query it if and when I need it. Taken to the extreme I don't even need to have running processes except for background tasks.
Posted Dec 19, 2012 22:50 UTC (Wed) by dlang (✭ supporter ✭, #313)
Taking your mouse example, some graphics cards do support hardware mouse cursors, where the card handles moving the mouse pointer.
however, such support really doesn't help much and is fading from use today, because nowadays people want to change the shape of the cursor depending on what you are doing, etc.
you also have to think about how the program should know that it should look for an update to the mouse position; you don't want it spinning in a loop when nothing happens.
you also don't just want the current position of the mouse: you want to know its path, and how fast it's moving (along with which buttons, if any, are pressed during any part of this time)
There are pretty good reasons that systems do things the way they do. People have tried a LOT of things over the years, and what's in place may not be the perfect way to do things, but it has survived lots of competing ideas along the way.
Posted Dec 19, 2012 23:03 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 20, 2012 12:53 UTC (Thu) by njs (guest, #40338)
When a user process blocks, it doesn't actually sit there as an entity taking up space. It gets converted into a callback attached to whatever event caused it to block.
Posted Dec 20, 2012 18:37 UTC (Thu) by jch (guest, #51929)
We've done some benchmarks comparing threads with (machine-generated) event-driven code. Unsurprisingly enough, the results indicate that well-implemented user-space cooperative ("green") threads are just as fast as event-driven code, but use vastly more memory. Surprisingly enough, Linux's NPTL is only some 50% slower, which I find very impressive.
See Figures 4 and 6 on pages 10 and 12 of http://www.pps.univ-paris-diderot.fr/~jch/research/cpc-20...
Posted Dec 20, 2012 20:21 UTC (Thu) by nix (subscriber, #2304)
Posted Dec 20, 2012 20:57 UTC (Thu) by job (guest, #670)
But I would say that's probably what people like about it. It's not interesting; it works and it's easy to understand. Perhaps that makes for a large ecosystem one day. V8 certainly gives great performance. That's a great combination.
(Personally I think the callback spaghetti makes for hard-to-read code. Let's just say I wouldn't want to debug someone else's code. And this is from someone who actually likes Perl.)
Posted Dec 19, 2012 13:04 UTC (Wed) by macson_g (subscriber, #12717)
Node.js is nice indeed, but there is no way MongoDB could beat PostgreSQL.
I've been doing some benchmarks recently, and Postgres is no slower than Mongo, and in case of querying it's usually much faster. Even when querying for content of JSON fields.
One can also store BSON objects in Postgres, in binary ("bytea") fields. I wrote a simple C++ extension allowing for reading the fields of BSON objects. It's ~20 lines of code.
And all the other things: reliability, configurability, quality of the driver and documentation, extensibility... MongoDB has a long way to go. But then again: Mongo is hip and cool and 2.0 and whatnot.
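For concreteness, querying a JSON field from node looks roughly like this - a sketch assuming the node-postgres (pg) driver, a hypothetical events table with a json column named payload, and PostgreSQL's ->> JSON field-extraction operator (added in a release newer than the ones discussed here):

    import { Client } from "pg";

    // Hypothetical schema: CREATE TABLE events (id serial, payload json);
    // payload->>'user' extracts the "user" field of the JSON document as text.
    async function eventsForUser(userId: string) {
        const client = new Client(); // connection details taken from PG* env vars
        await client.connect();
        const res = await client.query(
            "SELECT id, payload FROM events WHERE payload->>'user' = $1",
            [userId]);
        await client.end();
        return res.rows;
    }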
Posted Dec 19, 2012 15:58 UTC (Wed) by man_ls (subscriber, #15091)
There are other NoSQL stores like Redis or Riak which are much faster and more scalable by nature. But their key-value nature makes the programming model much more time-consuming: you have to think about how you are going to exploit the data, not how you want to store it.
Posted Dec 19, 2012 16:03 UTC (Wed) by andresfreund (subscriber, #69562)
You can optionally say that you don't need that (SET synchronous_commit = off in the session or configuration), which will then only sync commits in a background process. That gives you a data-loss window of < 1s with default settings. Note that *only* the last transactions will be lost; *no* old data can be corrupted by that.
If you need even weaker guarantees you can set fsync=off.
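A sketch of toggling this per session from a client - the node-postgres (pg) driver and the items table are assumptions, but the SET statement itself is standard PostgreSQL:

    import { Client } from "pg";

    async function loadRows(names: string[]) {
        const client = new Client();
        await client.connect();
        // Trade a sub-second data-loss window for much faster commits.
        // Affects only this session; already-committed old data stays safe.
        await client.query("SET synchronous_commit = off");
        for (const name of names) {
            await client.query("INSERT INTO items (name) VALUES ($1)", [name]);
        }
        await client.end();
    }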
Posted Dec 19, 2012 16:07 UTC (Wed) by macson_g (subscriber, #12717)
But there are a few tricks one can do: 1) use COPY instead of INSERT, 2) loosen the reliability guarantees (WAL, sync commits) in the configuration, 3) define your table as UNLOGGED.
This will bring your insertion speed close to MongoDB's. And reliability will probably still be better than Mongo's.
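A sketch of tricks 1 and 3 (trick 2 is the synchronous_commit setting shown above); the staging table and the CSV path are hypothetical, and server-side COPY FROM a file needs superuser rights - COPY FROM STDIN through a streaming helper avoids that requirement:

    import { Client } from "pg";

    async function fastLoad() {
        const client = new Client();
        await client.connect();
        // Trick 3: an UNLOGGED table skips the WAL entirely. It is truncated
        // after a crash, so only use it for data you can reload.
        await client.query("CREATE UNLOGGED TABLE staging (id int, doc text)");
        // Trick 1: one COPY is far cheaper than thousands of single-row INSERTs.
        await client.query("COPY staging FROM '/tmp/staging.csv' WITH (FORMAT csv)");
        await client.end();
    }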
Posted Dec 20, 2012 16:41 UTC (Thu) by pboddie (subscriber, #50784)
Posted Dec 19, 2012 17:53 UTC (Wed) by Wol (guest, #4433)
Speaking as an engineer: it is IMPOSSIBLE to design an efficient, purely FNF database. The maths is great; the engineering efficiency is awful.
Design an NFNF database properly (non-first-normal-form) and it will blow a relational database out of the water. Basically, I can store a *view* in the equivalent of a table, and guarantee that accessing any attribute of an object will retrieve ALL the associated attributes of that object into RAM as a side effect. Given that disk I/O is the biggest performance killer, I've just built intelligent read-ahead into the database engine ...
There's a reason Pick databases used to ABSOLUTELY FLY on old hardware. The difference isn't so marked now that many databases seem to fit in RAM ... but if your database is large relative to RAM and has a random working set of data, then a NoSQL database should be able to clobber a relational database for speed, simply due to reduced disk I/O.
(Absolutely fly == 30 users on a 386, or a Pentium 90 running a query faster than the twin-Xeon-800 intended to replace it, etc etc.)
Posted Dec 19, 2012 19:25 UTC (Wed) by hummassa (subscriber, #307)
Posted Dec 19, 2012 19:35 UTC (Wed) by man_ls (subscriber, #15091)
Posted Dec 19, 2012 23:26 UTC (Wed) by Wol (guest, #4433)
A lot of the ACID and data-integrity machinery in FNF databases is needed to make sure a SERIES of updates doesn't get interrupted and corrupt the data. In an NFNF database, that is ONE update.
Let's say I'm writing a new invoice to my accounting system. There's, say, ten line items, and a new delivery address.
Firstly, an application written to a FNF database is likely to treat the whole thing as a single transaction, with a lot of ACID overhead as it creates a new address in the address table, ten new items in the line-item table, and a new entry in the invoice table, plus updating all the associated ledgers, customer record, etc.
An application written to a Pick database, on the other hand, will probably create the new delivery address as a single, unprotected database update. What's the point of protecting it? So long as a *single* database write is guaranteed to complete or fail, you don't need fancy-shmancy ACID, and a delivery address is a stand-alone entry. Creating an invoice is a bit trickier, not because of integrity in the invoice, but because of the need to update multiple entities in one step.
But again, so long as a *single* write is guaranteed to complete or fail, I don't need fancy-shmancy ACID to create the invoice. It's a single write to the invoice FILE. However, I do need to wrap it in a transaction because I need to update the associated ledgers.
So, as I said before, the point is I need FAR FEWER DISK ACCESSES. Which makes a MAJOR difference to speed, whether reading OR writing. Plus, statistically, I'm far more likely to be reading than writing, and as I said my "intelligent read-ahead" will massively reduce the need for reads. The fact that I'm storing the equivalent of a relational view in a single database "atom" or RECORD (the equivalent of a relational row) massively reduces the need for writes.
FNF does exist for a reason, but it's not the reason you think. It makes the maths solvable. Once you've solved it for FNF, you can then go to NFNF knowing that the MATHS is just the same, but the ENGINEERING is MUCH SIMPLER.
If you want to work in FNF, my NFNF database can give you an FNF view with minimal effort. It probably takes less time to convert (in RAM) an NFNF record into an FNF view than it did to retrieve that record from disk. I can present you with an entire view, in the same time that you take to retrieve a single row from your relational database. (And update that view, in the same time it takes you to write a single row.)
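To make that conversion concrete, a sketch of what the claim amounts to - the invoice shape is hypothetical, not Pick's actual on-disk format:

    // An NFNF record: the whole invoice, lists included, is one atom.
    interface Invoice {
        id: string;
        deliveryAddress: string;
        lines: { sku: string; qty: number }[];
    }

    // Producing FNF rows from it is a cheap in-memory walk: one row per
    // (invoice, line) pair, with the invoice attributes repeated.
    function toFnfRows(inv: Invoice) {
        return inv.lines.map((line, i) => ({
            invoiceId: inv.id,
            deliveryAddress: inv.deliveryAddress,
            lineNo: i + 1, // ordering is implicit in the stored list
            sku: line.sku,
            qty: line.qty,
        }));
    }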
Two points to think about ...
1) Relational works with 2-dimensional tables; NFNF (done properly) works with n-dimensional tables. In maths, the generic always trumps the specific: n-dimensional trumps 2-dimensional.
2) FNF is maths. As Einstein said, "As far as the laws of mathematics refer to reality, they are not certain; and as far as they are certain, they do not refer to reality."
I've taken your maths and applied engineering to it. If you want speed, I can guarantee you WON'T get it with a relational engine; my NFNF engine is a superset (i.e. it can give you *everything* you get from relational, and more besides, e.g. speed).
Posted Dec 19, 2012 23:37 UTC (Wed) by Wol (guest, #4433)
Oh - one of the requirements for an easily maintainable application is that you MUST NOT MIX data and metadata in the same table, or store the same data multiple times across several tables.
Oh no! I've just specified an impossible requirement!
The problem with mixing data and metadata is that the database has no way of knowing which is which. It's all crucial information that exists only in the programmer's head. BAD BAD BAD!
But I can do that easily in Pick. Precisely because I can do things in n dimensions, not just 2. The only data I need to store across multiple tables is foreign keys. And even then, I can get away with far less duplication than relational.
Posted Dec 20, 2012 22:26 UTC (Thu) by Wol (guest, #4433)
So as soon as your data contains lists, an RDBMS is inefficient. Much data consists of lists, does it not?
Posted Dec 20, 2012 9:51 UTC (Thu) by paulj (subscriber, #341)
It seems to me that the fundamental performance/integrity trade-off and the attendant complexity are there in both cases, but in the FNF ACID DB the DB handles this for me, while with the NFNF DB I have to handle it myself. So the ACID DB way may result in all transactions suffering some performance overhead from ensuring integrity, but it will stay consistent (if I've done the schema right) and handle those issues for me, while the NFNF DB gives me fast, no-overhead writes but leaves me to be aware of and handle the complex, tricky multi-update integrity issues myself. Is that a fair view?
Would it be fair to say that the NFNF way would be more suited to DB gurus, while anyone who hasn't spent a good bit of time working with and thinking about multi-update integrity issues might be better off using the ACID DB approach (at least until update performance is actually proven to be an issue)?
Posted Dec 20, 2012 10:02 UTC (Thu) by man_ls (subscriber, #15091)
Also, modern databases force everything to be in a transaction, which is horrible for performance. Why force developers to lock tables when sometimes there is no need to?
I almost prefer that transactions have to be done explicitly so that developers don't have the lazy option of letting the framework do them. On the other hand, NoSQL databases make transactions too hard to do properly. I miss some kind of system where transactions are not mandatory, have to be done explicitly, and can be activated easily.
Posted Dec 20, 2012 12:20 UTC (Thu) by andresfreund (subscriber, #69562)
Transactions in the usual database system only persist across session end if they are started as two-phase commit transactions, which you only use if you *have* to, because you need to coordinate critical changes between distributed systems where divergence is not an option (NB: 2PC != 2PL).
> Also, modern databases force everything to be in a transaction, which is horrible for performance. Why force developers to lock tables when sometimes there is no need to?
Uhm. a) Most RDBMSs allow you to have transactions that automatically only span a single statement. b) Why do transactions, unless used together with synchronous commits, imply being horribly slow? c) What does that have to do with locking tables? No sensible database forces you to exclusively lock a table to make modifications to the contents of said table. Most require some lock that prevents it from being ALTERed or DROPped, but that's something different.
Just some trivial numbers towards b): on my workstation from 2009, with other stuff going on, in an absolutely unoptimized configuration, PG does 320k readonly transactions per sec and 140k trivial writing ones. With a data-loss window of 1/3s. Now, I'll freely admit that the latter number is only that high because I didn't create an index, but even after that it's still 90k.
Now, I know from experience that there are scenarios where RDBMSs don't work very well, but - sorry to be harsh here - I don't think many of the people making those 'RDBMS/ACID/whatever sucks' arguments here have actual experience with that kind of workload. Also note that most shops using fancy KV stores and such *also* use relational databases for parts of their data. It's simply a question of the right tool for the right job.
Disclaimer: I do work with/on/around PG, so I am certainly biased. But some other RDBMSs are certainly able to produce similar numbers.
Disclaimer-2: Yes, those numbers are not realistic for any real-world workload.
Posted Dec 20, 2012 15:15 UTC (Thu) by nix (subscriber, #2304)
on my workstation from 2009, with other stuff going on, in an absolutely unoptimized configuration, PG does 320k readonly transactions per sec and 140k trivial writing ones.
I am obviously missing something. (Again. It's not my week...)
Posted Dec 20, 2012 15:52 UTC (Thu) by andresfreund (subscriber, #69562)
unoptimized != default. It was just the system-wide installation I had lying around; it surely has some adjustments, just not optimized for anything. I don't know off the top of my head which ones exactly; I can look it up though if you want.
> There's no way you can do more than a few hundred such transactions per second unless you have a battery-backed RAID array so you don't need to wait for on-the-average half a disk rotation to bring the data under the head
Yes, I had synchronous_commit turned off (I do that on all my development systems); that's why I wrote that I had a 1/3 second window of data loss. That's OK enough in many scenarios.
BTW, you can get into the lower thousands on a single spindle these days (since 9.2) with synchronous commit as well, due to the way WAL flushes are batched/grouped across different sessions. It requires quite a bit of concurrency though.
> (and then possibly another avg half rotation for associated metadata)
PG writes the WAL sequentially into pre-allocated files to avoid that...
> Doing 320,000 transactions per second with PostgreSQL is hard to imagine.
Those were the read-only ones. 320k writing transactions to logged (i.e. crash-safe) tables is probably impossible right now, independent of the hardware, due to the locking around the WAL.
Sorry, should have given a bit more context, but it really just was 5 minutes of benchmarking without any seriousness to it.
Posted Dec 20, 2012 23:23 UTC (Thu) by man_ls (subscriber, #15091)
Posted Dec 21, 2012 0:15 UTC (Fri) by andresfreund (subscriber, #69562)
1) synchronous_commit = off. Is it OK to lose the newest (0.2 - 0.6s) transactions in a crash? Older transactions are still guaranteed to be safe. That's a choice *YOU* have to make; it really depends on the data you're storing (and it can be toggled per session & transaction). Obviously things are going to be slower if you require synchronous commits.
2) shared_buffers = 512MB. How much memory can be used to buffer reads/writes. Had I optimized, I would probably have set it to 4-6GB.
3) checkpoint_segments = 40. How much disk we can use for the transaction log. Had I optimized for benchmarking or write-heavy loads, it would have been set up to 300.
I don't know how those could be determined automatically. They depend on what the machine is used for.
And I used plain INSERT & SELECT, no COPY.
Posted Dec 20, 2012 19:21 UTC (Thu) by raven667 (subscriber, #5198)
One thing I see from people who do a lot of work with Postgres and understand it very well is that there is a lot of flexibility in the data structures used, and in the methods to access those data structures, which have been optimized over the course of decades by developers competent in the math. So whatever algorithms are used to drive these NoSQL databases and make them fast are probably already implemented as well or better in Postgres, but it does take a lot of familiarity with how the data is stored to take advantage of this. You also have the ability to write plugins that add new data types and storage methods for whatever use case you need.
To distill the point: many of the techniques which make NoSQL databases fast can also be used with SQL databases if you know how, and you get the benefit of the decades of research and development that have gone into the traditional SQL engines.
a lot of familiarity with how the data is stored
Posted Dec 20, 2012 22:23 UTC (Thu) by Wol (guest, #4433)
Which is EXPLICITLY FORBIDDEN by the relational definition!
Exactly my point about Postgres taking a lot of ideas from the NoSQL crowd.
In Pick, I know *exactly* how my data is stored, because I told the database how to do it. Which is also why Pick databases can be a mess when they're not properly normalised :-( The programmer needs to normalise his FILEs properly or you can lose the benefits. I've seen too many badly normalised FILEs that are a pain to work around.
Posted Dec 20, 2012 13:43 UTC (Thu) by paulj (subscriber, #341)
Unless of course the argument is that often the performance-impacting consistency guarantees are not required. But then, aren't ACID DBs tweakable? You can make, e.g., Postgres play a lot more loose with data than it does by default, I gather, and gain a lot of performance.
Posted Dec 24, 2012 9:58 UTC (Mon) by man_ls (subscriber, #15091)
NoSQL stores usually allow you to use different schemas on the same table. Isn't it better to enforce a single schema? It depends; after all you will have to make sure that your data have the correct format before writing to the store (or after reading from it), so having the database reject your data is not a substitute for thorough testing. Also, having a single schema goes directly against reversible DevOps, as it entails offline data migration. Not everyone can afford downtime to migrate data between schemas.
As to consistency, the advantage of NoSQL stores is that they allow you to choose the degree of consistency that you need. You can either read everything from many different "tables" or you can just store every piece of data multiple times. With relational databases you can also denormalize data, but they are usually less flexible as to how it is stored. (It is harder, e.g., to store an array inside a table; you can just store the first n items.) If you need total consistency, then by all means go to a relational store, since it will give you better guarantees. But consistency, again, is not magic pixie dust you can sprinkle on your data; it has to be there from the start.
Same with transactions: if you need them, go to a transactional store. But first think about whether you need them, and if you do, design them properly. Do not just trust your store to do the right thing, because there are 100 ways to mess it up.
You can try to use PostgreSQL as a NoSQL store, but you will be swimming upstream for the rest of your career. How do you share the load, or replicate between nodes? How do you deal with consistency if you need it in a non-relational table? How do you optimize a single database for both consistency and lack of it?
Posted Dec 25, 2012 5:26 UTC (Tue) by raven667 (subscriber, #5198)
There are certainly specific use cases that the various NoSQL databases were designed and optimized for; if you have one of those cases then by all means use the right tool for the job. But there seem to be a lot of inexperienced developers who think that traditional SQL databases are slow, creaking dinosaurs, unsuitable for any purpose, when in fact they contain highly optimized data stores which have been developed over decades by people competent in the relevant math and computer science. Different does not always mean better.
Posted Dec 25, 2012 17:38 UTC (Tue) by man_ls (subscriber, #15091)
Other NoSQL families, like key-value (Dynamo-like) or graph databases, are more specialized and need more effort to keep going. But I would argue that none are near the level of sophistication of a normalized database. Most of you probably don't feel it because you have worked with SQL for many years, but it is a contorted language that uses a highly unnatural data model. Yes, relational databases are highly optimized, but you pay the price every time you read or write anything to them.
Posted Dec 25, 2012 17:56 UTC (Tue) by Wol (guest, #4433)
Your "contorted language" is spot on. Basically, in relational you cannot STORE that most natural of data concepts, the list. You have to MODEL it. At which point your database becomes unnecessarily complex and complicated.
Referring back to the comment in your earlier post about "You can either read everything from many different "tables" or you can just store every piece of data multiple times. With relational databases you can also denormalize data, but they are usually less flexible as to how it is stored."
But that's exactly what Pick does! NFNF is what relational purists would call "denormalised". The fact that it can be mechanically normalised by the DB seems to have passed them by. So in Pick, I don't "read from many tables OR store the data many times", I just store the data ONCE in ONE table. imho it is *relational* that needs to "store the data many times in many tables" - just try to store a list, then ask yourself how many times you have to store the name of the list (or its id, the same thing...) As a system that requires (as part of its definition) that you don't store duplicate data, the relational system is very poor at living up to its own definition!
Posted Jan 4, 2013 16:56 UTC (Fri) by nix (subscriber, #2304)
Posted Dec 20, 2012 16:20 UTC (Thu) by Wol (guest, #4433)
First of all, what's wrong with an ACID NFNF database? ACID has ABSOLUTELY NOTHING to do with relational. Relational (both FNF and NFNF) is mathematical theory. ACID is engineering reality. If I execute my relational database program, my database will reflect that EXACTLY; any errors are down to a buggy program. If I *run* my database program on a computer, however, reality may intervene and things will go wrong. ACID is meant to address the vagaries of the real world. One only has to follow what's going on with Linux filesystems to know that integrity and ACID are very important there!!!
The thing is, because an atomic update in a relational database is a row (indeed, it may even be smaller than that), ACID is extremely important to a relational engine because there are so many points at which failure could corrupt the database. In a Pick database, an atomic update is an object - a complete view - so Pick can *safely* rely on the file system to a far greater extent.
As for NFNF being for database gurus, there's a reason most Pick applications were written by *USERS* - it's actually a lot easier for users than relational FNF, while people who've been indoctrinated in FNF can't get their heads round it :-) Plus, for FNF people, it lacks large chunks of what they consider necessary, BECAUSE FNF needs those things to get around engineering weaknesses.
Things like integrity, for example. I divide integrity into "strong" and "weak". Attributes that refer to a single instance of an object are strongly related; by storing them in a single RECORD, Pick doesn't need integrity checking. But relational needs to store them across multiple tables and rows, so integrity checking is paramount. Weak integrity (the relationship between two objects) is important to both FNF and NFNF (but, because NFNF doesn't need strong integrity checking, it doesn't tend to have weak checking either).
This is where transactions come in, and MODERN Pick databases have it. But many old applications don't. The nice thing with Pick is I can lock all the *objects* I want to update, then update them in a transaction. So (unless I overflow the lock table) I haven't had to lock an entire table to do a complicated transaction. I've offloaded integrity to the db engine, as you wanted, but because each *object* is an atom, the complexity at the db level is far less, and the chance of a transaction failure is far less, because I'm updating far fewer rows in far fewer tables, for the same change to the db.
Picking up on andresfreund here - Pick does NOT have single-statement transactions. Why? It doesn't need them! A single statement is always atomic, and ACID will be enforced by the OS. The db engine can ignore the problem. That's my point above, about Pick lacking a lot of "necessities" because the Pick model makes them unnecessary.
As for "no experience of workloads", you did pick up on my comment about a P90? That took 5 mins to run its query (db == UniVerse). The replacement system - the twin Xeon - managed to run the query in 4.5 mins after six months hard work tweaking and optimising (db == Oracle). I guessed (and gather I'm right) that the company in question was William Hill, one of the biggest bookies in the UK.
Posted Dec 20, 2012 19:05 UTC (Thu) by dlang (✭ supporter ✭, #313)
In addition, the OS and disk may re-order your writes so that some parts of your file may be written to disk, but earlier parts of your file are not (i.e. gaps in the file on disk)
since these two assumptions you are making are false, the rest of your discussion falls apart.
Posted Dec 20, 2012 22:16 UTC (Thu) by Wol (guest, #4433)
EXACTLY THE SAME is true (or not true) of a relational database. If my Pick engine can't assume that a write has succeeded, nor can your relational engine. If your relational engine can force the issue, so can my Pick engine.
ACID is nothing to do with relational. A relational app has to deal with the database returning a status of "unable to save data" - so does a Pick app.
A relational database has to make sure that when it tells the app that the data has been saved to disk, it really has. A Pick database, likewise.
And you completely miss my major point about speed. Let's say I am writing that invoice. First of all, when I read the client info, chances are it is a *single* disk access. In a relational database it is probably several reads scattered across multiple rows and tables. Then when I write the invoice, it is a SINGLE write to the invoice table. Done! Then I need to update the sales ledger and the client outstanding. I've already read the client info - that's cached - ONE write to the client table and the client outstanding is done. One read and write to the sales ledger and that is done.
Obviously those three writes need to be wrapped in a transaction to enforce ACID, but that would be true of relational too. The difference between Pick and relational is that in relational that could easily be twenty or thirty writes. That's a LOT of I/O. And if your ACID is doing a sync, to ensure it's all flushed to disk, that's a LOT of overhead my Pick database doesn't have. Now add in all your integrity checks - the ones that make sure your invoice has a matching delivery address, invoice address, line items, etc. Those aren't needed in Pick because the invoice is one atom, not plastered across multiple tables ...
Where Pick scores massively, as I keep trying to hammer home, is that at the (*expensive*!!!) interface between the database engine and the disk there is FAR LESS TRAFFIC. So Pick is a lot faster. As the Pick FAQ put it: "other databases optimise the easy task of finding data in RAM. Pick optimises the difficult task of getting it into RAM in the first place". Where relational typically passes *many* writes to the OS, Pick passes a single ATOMIC write.
To drift the topic a bit, do you know why Linux tends to panic rather than fix things? Because, like a true engineer, Linus assumes that - when things go wrong - the code to handle any problems is likely to be buggy and untested, so it's far better to fail noisily and let the user handle the mess, than try to fix it automatically and fail. Yes, that's a judgement call, but a good one. Relational tends to try to enforce perfection. Getting that last ten percent can be VERY expensive. More than expensive enough to be totally counter-productive.
If I were writing a new Pick app, I would be more concerned about the integrity of the objects - such as the invoice and the information provided by the customer - and about making sure they were saved. And that's a single write to disk.
Stitching the customer order into the accounts and shipping can come later. And if it screws up I can use transactions to roll back, or run an integrity check to complete the failed commit. (In actual fact, I used exactly this technique - for a different reason - in an accounting app I wrote many moons ago.)
At the end of the day, Murphy WILL strike. And if you rely on ACID in your database to save you it won't. If, instead, you rely on phased, replayable commits, you're laughing. My database will be vulnerable to a crash while the rep is on the phone to the customer. Once the rep has hit "save order" and that has succeeded, all I'd be afraid of is losing the disk - the order can be committed to fulfilment and accounts as and when.
WHEN Murphy strikes and ACID fails, how easily could you run an integrity check on your database and fix it? In Pick, it's both EASY and FAST. Pretty much all the systems I've worked with do it as a matter of course as an overnight job.
Posted Dec 20, 2012 22:39 UTC (Thu) by Wol (guest, #4433)
ACID sits at the interface between the database and the OS.
A transaction that is seen by the app as atomic is very UNlikely to be passed from a relational database to the OS as an atomic write.
That same transaction is far more likely to be passed to the OS as an atomic write by Pick.
Far less complicated. Far less overhead. Far easier for the user or programmer to understand.
Posted Dec 21, 2012 1:43 UTC (Fri) by dlang (✭ supporter ✭, #313)
you are claiming that you are so much better than the relational databases because you don't do all the hard work that they do to be safe in the face of multiple writes.
you can't then turn around and say "if they do it, I can do it too"
you could, but then you lose what you are claiming is such a great advantage.
Posted Dec 21, 2012 12:09 UTC (Fri) by Wol (guest, #4433)
What I am claiming is that where a relational database HAS to do MULTIPLE writes, a Pick database usually only has to do ONE!
Who cares if the overhead PER WRITE is the same? If I'm doing half the writes, and that overhead is expensive, I'm going to trounce you for speed! Chances are I'm doing a lot LESS than half the writes. That's the whole point of NFNF. (And as I keep saying, ACID has nothing to do with relational, and everything to do with reality, so Pick doesn't have to do it the same way as relational. It can if it wants, no reason why not.)
(I also forgot to mention: because Pick is primary-key-driven, data retrieval usually involves a direct key access, not a search via an index for a row - more savings on disk access!)
And chances are I'm doing a heck of a lot less I/O, because I have far less duplicate data all over the place, and I'm storing it much more compactly too. I was involved in porting an app from Pick to SQL Server, so I've got direct comparisons to hand - dumping the data from the Pick datastore and loading it into SQL Server, the resulting SQL Server database was MANY times the size of the Pick one. Four, five times? Maybe more. Oh, and I'm including in the Pick datastore all the data overhead we didn't transfer. And the Pick datastore by default (we didn't change it) runs at 80% full. I can't give you figures for SQL Server overhead because I don't know it.
Posted Dec 21, 2012 12:51 UTC (Fri) by pboddie (subscriber, #50784)
I think it's also pertinent to mention that PostgreSQL has been able to deal with things like multivalued columns for a long time and in an arguably more sane fashion than, say, UniData in various respects, such as in the storage representation which, as I recall with UniData, involved various "top-bit-set" characters as field boundaries that probably made internationalisation a pain.
Certainly, this class of system works well for certain kinds of systems and there's undoubtedly a lot of people still sticking with them, as well as a lot who tried to migrate from them in a bad way, either causing lots of teething troubles and organisational stress with the new system or reinforcing prejudices about the capabilities of "those new-fangled client-server RDBMSs". That the migration exercises probably involved Oracle or some other proprietary solution, where only a single vendor can ease the pain, probably won't have helped.
It's telling that UniData and UniVerse ended up ultimately with IBM after an acquisitions cascade that involved lots of slightly bigger fish eating other fish. I think it was Ardent acquiring the Uni products, then being acquired by Informix, then being acquired by IBM. Unlike HP, who would have killed the product line in order to add a few cents to the dividend for that quarter, IBM probably sees the long-term value in those remaining customers.
Posted Dec 21, 2012 13:23 UTC (Fri) by Wol (guest, #4433)
Yes, I suspect internationalisation may be a bit of a pain, but it has been done. I haven't used it, but I haven't used internationalisation on Linux either (I guess it's there, but I'm not conscious of it).
Limited depth hierarchies? In reality, how often do you blow off the end of Pick's ability? The tools aren't necessarily that good, but it handles between five and seven levels "out of the box". How many entities have attributes nested that deep?
You're right about Ardent acquiring the Uni products, but in reality, Ardent took over Informix. Yes, I know Informix the company bought out Ardent, but six months later the Informix board was gone, replaced entirely by Ardent people. The rumour is that IBM bought Informix for the Informix database, only to discover that the company's primary product by then was U2.
And as you say about Postgres - I don't know anything about it, but I understood it could handle multivalued columns, that sort of thing. If you're going to be strictly relational, however, that's not allowed :-) Postgres is moving away from a pure relational db to a more NFNF model. Pick was there first ... :-)
Informix and IBM
Posted Dec 21, 2012 18:21 UTC (Fri) by markhb (guest, #1003)
You're right about Ardent acquiring the Uni products, but in reality, Ardent took over Informix. Yes, I know Informix the company bought out Ardent, but six months later the Informix board was gone, replaced entirely by Ardent people. The rumour is that IBM bought Informix for the Informix database, only to discover that the company's primary product by then was U2.
Posted Dec 21, 2012 15:51 UTC (Fri) by Wol (guest, #4433)
Let's start with the invoice and delivery addresses. Are they attributes of the invoice, stored in the invoice record, or relations to a WORM table of locations? As far as Pick is concerned, it doesn't care, it can store a foreign key or the location itself. Okay, the same is true of relational, but depending on how relational physically stores the data, it may have an impact later on.
Now the line items. Are they an attribute of the invoice, an attribute of some ledger, or an entity in their own right? I'm inclined to make them entities in their own right, not knowing enough about accounting off the top of my head to make the best call. I *could* make them an attribute of the invoice.
Now to save it all. Assuming the addresses all exist on file, that's one write for the invoice record and ten writes for the ten line items (if the line items were invoice attributes, I could have written the lot in just ONE write). Eleven atomic writes, wrapped in a transaction.
In relational, however, I have to add a row to the invoice table. Ten rows to the line item table. And update the line-item index on invoice. That's over and above the fact that I have to muddle data and metadata in the line item table - creating some random field I can sort on to return the line items in the correct order (in Pick, I simply store a *list* of line-item keys in the invoice record). So relational has the extra overhead of more "data" to store, and (unless it's willing to incur a massive hit on reading) the overhead of updating a whole bunch of indexes. The same eleven writes of data (with no option to reduce it to one) plus a bunch of indexes.
Now, let's assume we come back a week later and want to print off the invoice. I'll ignore how we obtain the invoice number. In Pick, we have ONE read for the invoice record, TWO reads for the addresses, and TEN reads for the line items. By the way, a read is defined as a SINGLE disk seek instruction. Statistics tell me the engine is going to make one mistake, so I need to make 14 seeks.
In relational, however, I guess I need to read the invoice table index to find out where to find the invoice. That's two seeks minimum. Then I need to read the two addresses. Another four seeks. Then the index on the line item table followed by the line items. That's eleven seeks, assuming the location is stored in that index or twenty-one if it isn't. I make that 17 *minimum*, probably a lot more.
Remember I said Pick optimises retrieving data from disk?
What if I made a mistake and stored line items as an invoice attribute when I shouldn't? I end up with the equivalent of the relational line item table, clustered by invoice number. Given that relational has to guess how best to cluster data, chances are my arrangement is just as good :-)
At the end of the day, as soon as we start arguing performance, I have a MASSIVE advantage over you. The relational model explicitly forbids you knowing the internal structure of the database, so that the engine can optimise it as best it sees fit. As an application programmer, I know *exactly* how Pick is storing its data at the disk level. There's a reason why Pick doesn't have a query optimiser - it's a fairly trivial exercise in logic to prove that disk access is so efficient (approx 97%) that any attempt to optimise it will cost more than it saves. Pick enforces primary keys. The primary key enables Pick to calculate the location of any item on disk. The Pick data structure pretty much enforces logically tightly coupled attributes to be physically tightly coupled on disk. The ability to store a LIST of foreign keys in a single atomic record eliminates the need for many indices (and because it's a LIST eliminates the need to muddle data and metadata).
In Pick's worst-case scenario (provided the data has been normalised), it degrades to a weakly optimised relational scenario. (The enforcement of primary keys provides some indexing.) In Pick's typical scenario, any half-way complex query is going to leave relational in the dust. That P90 query I mentioned? I bet those Oracle consultants were adding indexes up the wazoo to try and improve performance. The Pick query was probably thrown together in five minutes, and because it was pretty much solely hunting down pre-known primary keys, could go straight to the data it wanted without needing to search for it on disk.
If you want to know how Pick finds data so fast - http://en.wikipedia.org/wiki/Linear_hashing - given a known primary key, it takes on average 1.05 requests to disk to find what you're looking for!
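The lookup rule behind that figure is simple enough to sketch - a toy in-memory version; the real thing works on disk buckets with overflow chains:

    // Linear hashing with n initial buckets, at level `level`, with the
    // split pointer `split` marking how many buckets have been split so far.
    function bucketFor(hash: number, n: number, level: number, split: number): number {
        let b = hash % (n * 2 ** level);
        if (b < split) {
            // This bucket has already been split: rehash with the doubled modulus.
            b = hash % (n * 2 ** (level + 1));
        }
        return b;
    }

Because buckets are split one at a time as the file grows, a lookup is still a single computed seek almost all of the time, which is where the 1.05 average comes from.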