Scale Fail (part 2)

Posted May 21, 2011 3:30 UTC (Sat) by kjp (subscriber, #39639)
Parent article: Scale Fail (part 2)

Ironically, our single point of failure is Postgres. I really don't want to have to move to Cassandra, but I also don't want to be paged in the middle of the night if an EC2 datacenter goes down.


Scale Fail (part 2)

Posted May 21, 2011 14:38 UTC (Sat) by jberkus (guest, #55561)

Scale Fail (part 2)

Posted May 23, 2011 21:51 UTC (Mon) by kjp (subscriber, #39639)

Nothing there is as attractive as a simple, self-healing quorum system that works over encrypted WAN links. No quorum = down. Otherwise, it works. We have no STONITH and no redundant network links. So the typical stuff that seems to be used (DRBD), and even the future Postgres stuff (Postgres-R and Postgres-XC), do not look like strong candidates.
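
To make the "no quorum = down" rule concrete, here is a minimal Python sketch. It assumes a simple strict-majority rule; the names `has_quorum` and `service_state` are hypothetical and do not come from any particular clustering tool.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True when a strict majority of the cluster is reachable.

    reachable_nodes counts this node itself plus every peer it can
    currently reach (for example over the encrypted WAN links).
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    # Strict majority: an even split (e.g. 2 of 4) is NOT a quorum,
    # so two halves of a partitioned cluster can never both stay up.
    return reachable_nodes > cluster_size // 2


def service_state(reachable_nodes: int, cluster_size: int) -> str:
    """No quorum = down. Otherwise, the service runs."""
    return "up" if has_quorum(reachable_nodes, cluster_size) else "down"
```

With three sites, `service_state(2, 3)` gives `"up"` and `service_state(1, 3)` gives `"down"`, which is why an odd number of voters is the usual choice.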

Scale Fail (part 2)

Posted May 23, 2011 21:56 UTC (Mon) by dlang (✭ supporter ✭, #313)

you would use the quorum system to control the DRBD or postgres stuff.

this is available today with the Linux-HA project.
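
As a rough illustration of a quorum result "controlling" the database layer, the sketch below shows how a node might pick its role. This is not Linux-HA's actual interface; `Action`, `decide_action`, and all parameters are hypothetical names, and the policy shown (a primary that loses quorum steps down by itself) is just one assumption for a setup without STONITH.

```python
from enum import Enum


class Action(Enum):
    KEEP_PRIMARY = "keep_primary"  # already primary and still quorate
    DEMOTE = "demote"              # primary lost quorum: stop serving writes
    PROMOTE = "promote"            # quorate standby takes over
    STAY_STANDBY = "stay_standby"  # nothing to do


def decide_action(is_primary: bool, primary_reachable: bool,
                  votes_seen: int, cluster_size: int) -> Action:
    """Pick a role for this node from its view of the cluster.

    votes_seen counts this node plus every voter it can reach.
    """
    quorate = votes_seen > cluster_size // 2  # strict majority
    if is_primary:
        # Without fencing, the primary must stop on its own when it
        # ends up on the minority side of a partition.
        return Action.KEEP_PRIMARY if quorate else Action.DEMOTE
    if quorate and not primary_reachable:
        # Only the majority side may take over the database.
        return Action.PROMOTE
    return Action.STAY_STANDBY
```

The replication layer underneath (DRBD or a Postgres standby) would then only be switched when this decision says so.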

Scale Fail (part 2)

Posted May 24, 2011 13:58 UTC (Tue) by kjp (subscriber, #39639)

Thanks. The quorum server looks promising.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds