When you copy the whole filesystem, you're copying across changes to the WAL (journal) as well as changes to the data/ files it is journalling. As the WAL is frequently (f)(data)sync()ed, this overhead is even higher. I am also sceptical as to its safety: it seems easy to me to synch across a change to the WAL on commit or WAL rollover but lose power before the corresponding change to the datafile is synched: on restart, the remote PostgreSQL will think that no WAL replay is necessary, when in fact one is needed.
PostgreSQL's native replication simply streams the WAL across, and replays it into the datafiles on the remote node. This is guaranteed safe.
Posted Jun 1, 2012 22:33 UTC (Fri) by jberkus (subscriber, #55561)
[Link]
Nix,
We've had reason to destruction-test DRBD+Postgres. From a data safety perspective it works as well as one could hope, provided that you take steps to prevent split-brain. It's just performance which leaves something to be desired.
Also, consider DRBD
Posted Jun 4, 2012 12:36 UTC (Mon) by nix (subscriber, #2304)
[Link]
Oh, I'm willing to believe it works most of the time. I just can't see how it's safe 100% of the time (when WAL rollover happens). However, it might be that the failure cases require long periods of downtime and very unluckily-timed powerdowns, in which case you might never see it.
But quite possibly I'm missing something.
Also, consider DRBD
Posted Jun 4, 2012 15:07 UTC (Mon) by andresfreund (subscriber, #69562)
[Link]
As long as the snapshot is atomic it *has* to work. Otherwise the original purpose of the wal - crash recovery - wouldn't be met.
Checkpoints are crash safe. Whats the problem youre seeing there?
The checkpoint record is only written to the wal *after* everything but the checkpoint information has been written out. Only after the checkpoint has been fsynced to disk resources - like the wal - are reused.
Also, consider DRBD
Posted Jun 4, 2012 18:18 UTC (Mon) by nix (subscriber, #2304)
[Link]
Agreed, this is perfectly fine. I now suspect that my memory is lying to me: it tells me faintly that DRBD may transmit data in arbitrary order and does not do a complete transmit on fsync(), but I now suspect I'm thinking of some other distributed block device and just mixed it up with DRBD. If DRBD respects fsync(), then everything works.
Also, consider DRBD
Posted Jun 4, 2012 18:25 UTC (Mon) by andresfreund (subscriber, #69562)
[Link]
I think you can configure it in a way not all required guarantees are met. They are not generally recommended as far as I remember though.
...
Yep: http://www.drbd.org/users-guide/re-drbdconf.html check the docs for disk-barrier.