Cool new Free software
Posted Dec 24, 2012 9:58 UTC (Mon) by man_ls
In reply to: Cool new Free software
Parent article: Status.net service to phase out, replaced by pump.io
There is a subtle difference between premature optimization and sensible design. Premature optimization tends to be low-level, while sensible design is more of a big-picture thing. You cannot commit all your data and your code to live in PostgreSQL and then one day magically migrate to Riak, or go back and forth as you need. Even migrating between similar databases is an error-prone task, and JDBC and similar libraries that try to abstract away the underlying database are just an excuse for managers. So changing databases is not an "optimization", and therefore it cannot be premature.
NoSQL stores usually allow you to use different schemas on the same table. Isn't it better to enforce a single schema? It depends; after all you will have to make sure that your data have the correct format before writing to the store (or after reading from it), so having the database reject your data is not a substitute for thorough testing. Also, having a single schema goes directly against reversible DevOps, as it entails offline data migration. Not everyone can afford downtime to migrate data between schemas.
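To make the point about checking the format after reading concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `schema_version` field, the record layouts and the function names are my own illustration, not the API of any particular store. The idea is that old and new records live side by side in the same "table", and each record is upgraded and validated when it is read, so no offline migration is required.

```python
REQUIRED_V2_FIELDS = {"first_name", "last_name"}  # hypothetical current schema


def upgrade(record: dict) -> dict:
    """Return the record in the newest (version 2) shape.

    Assumption: version 1 records hold a single "name" field and carry
    no "schema_version" key at all.
    """
    version = record.get("schema_version", 1)
    if version == 1:
        # Split "First Last" into two fields; a missing last name becomes "".
        first, _, last = record["name"].partition(" ")
        return {"schema_version": 2, "first_name": first, "last_name": last}
    if version == 2:
        return dict(record)
    raise ValueError(f"unknown schema version: {version}")


def validate(record: dict) -> dict:
    """Reject records that lack the fields the application relies on."""
    missing = REQUIRED_V2_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return record


def read_record(raw: dict) -> dict:
    """Upgrade, then validate, whatever shape came back from the store."""
    return validate(upgrade(raw))
```

Old records can be rewritten in the new shape whenever they are next saved, which is what keeps the migration online. The price is that the upgrade code has to be tested as thoroughly as everything else.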
As to consistency, the advantage of NoSQL stores is that they allow you to choose the degree of consistency that you need. You can either read everything from many different "tables" or you can just store every piece of data multiple times. With relational databases you can also denormalize data, but they are usually less flexible as to how it is stored. (It is harder, for example, to store an array inside a table; you can only store the first n items.) If you need total consistency, then by all means go to a relational store, since it will give you better guarantees. But consistency is again not magic pixie dust that you can sprinkle on your data; it has to be there from the start.
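The two options can be sketched with plain Python dictionaries standing in for the store. The data and the function names are made up for illustration. The normalized read looks the author up in a second "table" every time; the denormalized feed keeps its own copy of the author name, which is cheaper to read but is only as consistent as the code that maintains the copies.

```python
# Hypothetical in-memory stand-ins for three "tables".
users = {1: {"name": "alice"}}
posts = [{"id": 10, "author_id": 1, "text": "hello"}]
feed = [{"id": 10, "author_id": 1, "author_name": "alice", "text": "hello"}]


def read_post_normalized(users: dict, post: dict) -> dict:
    """Join at read time: always sees the current author name."""
    author = users[post["author_id"]]
    return {**post, "author_name": author["name"]}


def rename_user(users: dict, feed: list, user_id: int, new_name: str,
                update_copies: bool = True) -> None:
    """Rename a user and, optionally, fix every denormalized copy.

    With update_copies=False the feed keeps the old name: that is the
    inconsistency you accept (or forget about) when you denormalize.
    """
    users[user_id]["name"] = new_name
    if update_copies:
        for item in feed:
            if item["author_id"] == user_id:
                item["author_name"] = new_name
```

Calling `rename_user(users, feed, 1, "alicia", update_copies=False)` leaves the feed showing the old name while the normalized read already shows the new one. Whether that window is acceptable is a design decision, not something the store decides for you.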
Same with transactions: if you need them, go to a transactional store. But first think about whether you need them, and if you do, then design them properly. Do not just trust your store to do the right thing, because there are 100 ways to mess it up.
You can try to use PostgreSQL as a NoSQL store, but you will be swimming upstream for the rest of your career. How do you share the load, or replicate between nodes? How do you deal with consistency if you need it in a non-relational table? How do you optimize a single database for both consistency and lack of it?