Why Uber dropped PostgreSQL

Posted Aug 5, 2016 0:45 UTC (Fri) by brong (guest, #87268)
Parent article: Why Uber dropped PostgreSQL

> However, the blog post seems to imply that this kind of problem is somehow PostgreSQL-specific and does not really acknowledge that bugs will occur in all database systems (really, all software, of course), including MySQL.

I've seen this claim made a few times by people who clearly didn't read the Uber article closely enough to understand that a whole class of corruptions is possible with the Postgres method of replication (raw binary log shipping) that is simply not possible in the same way with either row-based or statement-based structured replication.

Sure, if the bug is deterministic then replaying the same transactions on the replica will cause the same corruption. But if a bug depends on particular server state and corrupts an underlying data structure, it's very likely that the replicas won't have that same on-disk corruption when they play a statement-based replication stream, so you can fail over to a replica and keep going. With Postgres shipping the raw data structures, if they corrupt on the master, that corruption goes straight to all the replicas without an additional sanity check.
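A toy sketch of that distinction, and not how either database works internally: `Server`, `trigger_bug`, and the two `replicate_*` helpers are hypothetical names, and reversing a string stands in for on-disk corruption that only happens when the server is in a particular state.

```python
# Toy model only: neither PostgreSQL nor MySQL is implemented like this.

class Server:
    def __init__(self, trigger_bug=False):
        self.rows = {}                  # "on-disk" state: key -> stored value
        self.trigger_bug = trigger_bug  # stands in for "particular server state"

    def apply_statement(self, key, value):
        """Logical write: the server builds its own on-disk representation."""
        stored = value
        if self.trigger_bug:
            stored = value[::-1]        # hypothetical state-dependent corruption
        self.rows[key] = stored
        return key, stored              # the raw change that reached the disk


def replicate_physical(primary_change, replica):
    """Ship what the primary wrote; the replica copies it verbatim."""
    key, stored = primary_change
    replica.rows[key] = stored


def replicate_logical(statement, replica):
    """Ship the statement; the replica re-executes it via its own write path."""
    key, value = statement
    replica.apply_statement(key, value)


primary = Server(trigger_bug=True)      # only the primary is in the bad state
physical_replica = Server()
logical_replica = Server()

statement = ("rider:1", "alice")
change = primary.apply_statement(*statement)
replicate_physical(change, physical_replica)
replicate_logical(statement, logical_replica)

print("primary:         ", primary.rows)
print("physical replica:", physical_replica.rows)
print("logical replica: ", logical_replica.rows)
```

In this model the physical replica inherits the primary's damaged value, while the logical replica, which never entered the bad state, stores the intended one. A deterministic bug (set `trigger_bug=True` on every server) would corrupt all three.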

Why Uber dropped PostgreSQL

Posted Aug 5, 2016 11:21 UTC (Fri) by niner (subscriber, #26151)

At the same time, logical replication like MySQL's brings a whole class of corruptions that are simply not possible in the same way with Postgres' WAL-based replication. So where does that leave us?

Why Uber dropped PostgreSQL

Posted Aug 5, 2016 12:10 UTC (Fri) by brong (guest, #87268)

[citation needed]

Can you please enumerate the sorts of corruption that occur with statement-based replication?

The only ones I can think of are cases where, due to concurrency, the transactions get re-ordered in the statement log compared to the order in which they were actually applied on the master, and hence the replica falls out of sync.
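A minimal sketch of that re-ordering case, assuming two hypothetical statements that do not commute (`add_ten` and `double`); it models only the ordering problem, not any real binlog format.

```python
# Two statements whose effects depend on the order in which they run.
def add_ten(balance):
    return balance + 10   # e.g. UPDATE ... SET balance = balance + 10


def double(balance):
    return balance * 2    # e.g. UPDATE ... SET balance = balance * 2


def apply_all(statements, balance):
    """Apply statements one after another and return the final value."""
    for statement in statements:
        balance = statement(balance)
    return balance


order_on_master = [add_ten, double]   # order in which the master applied them
order_in_log = [double, add_ten]      # order in which they landed in the log

master_result = apply_all(order_on_master, 100)
replica_result = apply_all(order_in_log, 100)

print("master: ", master_result)
print("replica:", replica_result)
print("in sync:", master_result == replica_result)
```

Both nodes ran exactly the same statements without error, yet they end up with different values, which is why this kind of divergence can go unnoticed.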

Or cases where you flat out allow the two ends to be out of sync by manually fiddling with the replication log position so that you skip transactions. You can't really call that a bug in statement-based replication though.

Why Uber dropped PostgreSQL

Posted Aug 5, 2016 15:11 UTC (Fri) by paulj (subscriber, #341)

Well, we're talking about bugs. Anything is possible, right?

With the low-level binary log replication, bugs that lead to corruption can replicate.

With the logical level replication, bugs that lead to logical level corruption can also cause inconsistent state. E.g., an update doesn't get applied to slaves because it isn't accepted, which could affect application consistency. Bugs at the binary log level may not replicate of themselves, but could cause a logical level replication to fail to replicate and cause inconsistent state.

Isn't it the case that the logical layer replication system has _two_ layers at which bugs can strike and cause significant problems? You now have two layers that need to be robust? And bugs in the lower layer can still take down the upper layer?

Why Uber dropped PostgreSQL

Posted Aug 5, 2016 21:58 UTC (Fri) by brong (guest, #87268)

If the response to a logical update failing to apply is rejecting the update, then you know that your replication is broken, and you haven't lost anything except the most recent changes - and you can skip that update and apply something manually to fix it while you fail over to a replica and bring it as close to up-to-date as possible.

If your low-level data structures are corrupted - better have a good fsck and/or good backups, because you have no replica with consistent state any more.
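The first case above, where a logical update fails to apply and is rejected, can be sketched roughly as follows. This is an illustration under assumptions, not MySQL's actual apply logic: `apply_update`, `ReplicationBroken`, and the (key, expected old value, new value) log entries are all hypothetical.

```python
class ReplicationBroken(Exception):
    """Raised when a logged change does not match the replica's state."""


def apply_update(rows, key, expected_old, new):
    """Apply one logical change, refusing it if the old value disagrees."""
    current = rows.get(key)
    if current != expected_old:
        raise ReplicationBroken(
            f"{key!r}: expected {expected_old!r}, found {current!r}"
        )
    rows[key] = new


replica_rows = {"trip:7": "requested"}
log = [
    ("trip:7", "requested", "accepted"),
    ("trip:7", "started", "finished"),   # disagrees with the replica's state
    ("trip:8", None, "requested"),
]

applied, skipped = [], []
for entry in log:
    try:
        apply_update(replica_rows, *entry)
        applied.append(entry)
    except ReplicationBroken as exc:
        # The operator learns exactly which change needs manual repair.
        print("replication broken:", exc)
        skipped.append(entry)

print(replica_rows)
```

The point of the sketch is that the failure is loud and local: one entry is flagged for manual repair and everything else on the replica stays usable.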

Why Uber dropped PostgreSQL

Posted Aug 7, 2016 16:43 UTC (Sun) by krakensden (subscriber, #72039)

MySQL has a nice list in their documentation: statement-based replication causes corruption if you use nondeterministic functions like now() or random(). If everyone in your org is aware of this, things can work, but I have definitely seen it not work.

Why Uber dropped PostgreSQL

Posted Aug 11, 2016 7:50 UTC (Thu) by ringerc (subscriber, #3071)

Simple statement-based replication is overwhelmingly flawed. Most importantly, it's completely broken with respect to "volatile" functions, sequence generation, etc. It's utterly hopeless. It can produce different results to what it did on the master in concurrent execution. AUTO_INCREMENT, NOW() and SYSDATE() etc would be very broken.

MySQL works around this somewhat by special-casing some functions, like now(). It evaluates them on the master and stores the results in the binlog, then ensures the invocations on the replica(s) return the same results as the master.

PgPool-II for PostgreSQL does something similar in statement based replication mode.

Clever, but solves only narrow cases. For example, in MySQL SYSDATE() still doesn't work safely. So you have to code very carefully to avoid breakage. (See https://dev.mysql.com/doc/refman/5.7/en/replication-featu...).
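The special-casing described above can be sketched as a toy model. The `Node` class, its integer clock, and the tuple-based `binlog` are assumptions made for illustration and do not reflect MySQL's real binlog format; the only behavior modeled is that the master's value for now() travels with the statement while sysdate() always reads the local clock.

```python
import itertools


class Node:
    def __init__(self, clock_start):
        self.clock = itertools.count(clock_start)  # this node's own clock
        self.pinned = None                         # value recorded by the master
        self.rows = []

    def now(self):
        # Special-cased: reuse the master's value when one was shipped.
        return self.pinned if self.pinned is not None else next(self.clock)

    def sysdate(self):
        # Not special-cased: always reads the local clock.
        return next(self.clock)

    def execute(self, func_name, pinned=None):
        self.pinned = pinned
        value = self.now() if func_name == "now" else self.sysdate()
        self.rows.append((func_name, value))
        self.pinned = None
        return value


master = Node(clock_start=1000)
replica = Node(clock_start=1007)   # replays the log a little later

binlog = []
for func_name in ("now", "sysdate"):
    value = master.execute(func_name)
    # Only now() gets its master-side result stored next to the statement.
    binlog.append((func_name, value if func_name == "now" else None))

for func_name, pinned in binlog:
    replica.execute(func_name, pinned=pinned)

print("master: ", master.rows)
print("replica:", replica.rows)
```

The now() row matches on both nodes because the replica never consults its own clock for it; the sysdate() row differs, which is the narrow-cases problem in miniature.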

By contrast, PostgreSQL's block-level replication leaves the replica an identical copy.

That's why, in practice, the most workable MySQL replication option is row-based replication or hybrid row/statement-based replication. Many people who are talking about "statement based" replication here are really thinking of row-based replication, or the MIXED replication mode that MySQL can use to hybridize the two. Rather cleverly, I must say. ( https://dev.mysql.com/doc/refman/5.7/en/replication-forma..., https://dev.mysql.com/doc/refman/5.7/en/binary-log-mixed.... ).

That's what I'm involved in working on for PostgreSQL too, at 2ndQuadrant, in the form of BDR and pglogical. There's ongoing work to get this into PostgreSQL core. Though we're not planning on any sort of mixed replication mode at this point.

Why Uber dropped PostgreSQL

Posted Aug 7, 2016 3:54 UTC (Sun) by giraffedata (guest, #1954)

At the same time, logical replication like MySQL does bring a whole class of corruptions that are simply not possible in the same way with Postgres' WAL based replication.

But are corruptions of that class as dangerous?

I take the complaint to be that with the WAL-based replication, a single trigger of a bug can cost you the whole cluster. But with logical replication, for all its opportunities to fail, the most you will lose is one replica, and at worst you'll have to blow away that replica and replace it.

Is there a class of bug specific to MySQL that corrupts the entire cluster at once?

Why Uber dropped PostgreSQL

Posted Aug 12, 2016 10:04 UTC (Fri) by moltonel (subscriber, #45207)

> Sure if the bug is deterministic then replaying the same transactions on the replica will cause the same corruption - but if there's a bug that's dependent on particular server state that corrupts an underlying data structure - it's very likely that the replicas won't have that same on-disk corruption when they play a statement-based replication stream - so you can fail over to a replica and keep going. With Postgres shipping the raw data structures - if they corrupt on the master, that corruption goes straight to all the replicas without an additional sanity check.

Have a re-read of the article: the bug that affected Uber was not trickling from the master to all the replicas. Each replica had corruption on different rows. The mailing list thread also mentions that misconception.

While each replication strategy brings its own class of potential bugs (with statement-based replication generally seen as the most fragile kind), this particular bug was apparently not made more likely by Uber/PG's choice of replication architecture, and MySQL isn't shielded from that kind of bug either.