API ? Data format is not the API...
Posted Feb 8, 2008 8:32 UTC (Fri) by khim (subscriber, #9252)
In reply to: Shorter release cycle ? Ha! by malefic
Parent article: PostgreSQL 8.4 development plan

> Well, dump/restore database upgrade is basically the same kind of a trade-off that the kernel does when breaking the internal APIs and ABIs in every release.

Nope. API/ABI are within the PostgreSQL/kernel. I don't care about that. If you want an analogy with the kernel then sysfs is a better one: the thing is closely tied to the internal state of the kernel, and that state is in constant flux, yet kernel developers make sure that half-year-old userspace applications "just work". Sure, at some point compatibility is lost (for example kernel 2.6.24 is only compatible with udev 081+), but you can safely go back and forth among the last 5-6 kernel versions without any changes in userspace. Not so with PostgreSQL.

> You can't possibly maintain the old on-disk format compatibility and innovate with the pace PostgreSQL does.

Take a look at the kernel. Take a look at MySQL. They all innovate at pretty damn impressive speed. Yet they don't have this attitude of "our application is more important than your data".

> For example, the 8.3 release brings major improvements to the on-disk format, the result of which is that databases are 10-20% smaller now.

Oh, for heaven's sake. MySQL 5.0 has stored procedures/triggers while 4.x does not. A pretty major step if you ask me. MySQL 6.0 has the Falcon engine, and it's radically different from InnoDB. Also not a trivial change. Yet in both cases you can keep old data for a while! Sure, you won't get the new features in this case, but the fact is: you are not forced to update the whole world when you update MySQL. You can install the new version of MySQL today, change the database format next month and finally use the new features after a year or two. And if you ask any sane SRE you'll learn that it's the only way to upgrade without accidents. With PostgreSQL the only way to go is to upgrade everything in one huge step. PostgreSQL is the only DB claiming to be an "enterprise class DB" that requires a dump/restore cycle on each upgrade.

API ? Data format is not the API...
Posted Feb 8, 2008 13:36 UTC (Fri) by kleptog (subscriber, #1183)

Minor versions don't require a dump/restore, and you can replicate between major versions, so if you don't want to have a switchover time it can be arranged. Other than that, patches gratefully accepted.

API ? Data format is not the API...
Posted Feb 9, 2008 4:09 UTC (Sat) by malefic (subscriber, #37306)

> Nope. API/ABI are within the PostgreSQL/kernel. I don't care about that. If you want the analogy with kernel then sysfs will be a better one

Well, sysfs is a bad analogy too. Sysfs is an explicit *API*. PostgreSQL binary data was never meant to be used by anything other than the core server itself.

I was not saying that the dump/restore cycle is a good thing. I was just saying that keeping backwards compatibility with the old binary format places a huge burden on the developers and threatens the overall quality of the product. I don't know what the official position of the PostgreSQL developers is, but I suspect that it's in line with stable_api_nonsense.txt.

I agree, however, that there could be some better way to migrate than dump/restore. Maybe something like running the old server in backup mode until the new one finishes the restore, and then replaying the logs on the new server. And, again, you can use replication to migrate your data with near zero downtime.

API ? Data format is not the API...
Posted Feb 9, 2008 10:27 UTC (Sat) by jd (guest, #26381)

> I agree, however, that there could be some better
> way to migrate, than dump/restore. Maybe
> something like running the old server in backup mode
> until the new one finishes the restore,
> and then replaying the logs on the new server. And,
> again, you can use replication to migrate
> your data with near zero downtime.

Let's see. Here is a short list of options I can think of:

1. Provide PostgreSQL with a platform-independent read and a platform-independent write, such that you could essentially perform the backup and restore as a chain of pipes.

2. Decouple implementation-specific details of the database file format, so that the file structure is outside of PostgreSQL itself. The engine would merely have stubs that run on separate threads. You could then plug in as many formats as you like, up to one per thread, provided the plugin supported the stub's API. The transfer would not "just happen"; it would presumably be up to third parties as to what file routines go with what APIs, but it's a realistic way to do things.

3. Write a PostgreSQL "kernel" on which all threads are run, akin to an OS kernel. When upgrading PostgreSQL, you add new upgraded threads, and older threads do not get replenished when finished. Upgrades can now be in-situ. However, this still requires the first or second idea to allow the migration to take place.

4. Build a working TARDIS, obtain the database file format as used in the year 3,000 AD and bring it back here. No further upgrades will then be needed for close to a millennium. As an added incentive, I'm offering my entire Slashdot karma to the first person who demonstrates bi-directional time travel as an open source development model.
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.