LWN: Comments on "Klitzke: Why Uber Engineering Switched from Postgres to MySQL"

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

simonat2ndQ — Mon, 08 Aug 2016 13:20:22 +0000

"Regrettably, a number of important technical points are either not correct or not wholly correct because they overlook many optimizations in PostgreSQL that were added specifically to address the cases discussed.", quoted from this detailed discussion of the actual way internals operate with PostgreSQL:
http://blog.2ndquadrant.com/thoughts-on-ubers-list-of-pos...

Feedback has been taken well by the PostgreSQL community and we are actively discussing some improvements in the areas of concern.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Thu, 04 Aug 2016 21:56:15 +0000

It's partially the ratio of writes to reads, I think. It's also the sheer volume of data collected, which triggers a cascade of events.

You tend to think of a database as a small amount of data entry, and a lot of reporting, as in an accounts system, or a lot of querying and a small amount of entry as in an order/warehouse system.

Uber, it looks like, is very much a case of "data pouring in" triggering reports and billing. So it's a relatively write-heavy application.

The other point is, most databases work on the assumption that if something goes badly wrong, losing a day's input is acceptable (yes I know, having to do a day's data re-entry can be a nightmare, but it is often possible). In Uber's case, if incoming data gets lost, it's lost for good. Hence schemaless - it's meant to ensure that the data is recorded, and if there's a major database failure, the incoming data is captured and can rebuild the database later on. One of their complaints was that in the event of a failure and rollover, incoming data could get lost in the transition.
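To make the "capture first, rebuild later" idea concrete, here is a minimal sketch in Python. It is only an illustration of the general append-only-log pattern, not Uber's actual Schemaless design; the file name, the event shape (`key`/`value`) and the function names are all assumptions made up for the example.

```python
import json
import os
import tempfile


def append_event(log_path, event):
    """Durably record an incoming event before anything else touches it."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(event) + "\n")
        log.flush()
        os.fsync(log.fileno())  # ask the OS to push the write to disk


def rebuild(log_path):
    """Rebuild the current state by replaying the log from the start."""
    state = {}
    with open(log_path, encoding="utf-8") as log:
        for line in log:
            event = json.loads(line)
            state[event["key"]] = event["value"]  # last write wins
    return state


# Hypothetical events; the keys and values are placeholders.
log_path = os.path.join(tempfile.mkdtemp(), "events.log")
append_event(log_path, {"key": "trip-1", "value": "requested"})
append_event(log_path, {"key": "trip-1", "value": "completed"})
append_event(log_path, {"key": "trip-2", "value": "requested"})
print(rebuild(log_path))
```

As long as the log survives, the derived state can be thrown away and recomputed, which is the property being described above.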

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

HenrikH — Thu, 04 Aug 2016 21:02:21 +0000

What counts as "extreme amounts of updates"? I'm absolutely curious, and the blog post didn't contain any numbers that I could find.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

k8to — Wed, 03 Aug 2016 09:11:43 +0000

Brand allegiance can be surprisingly strong at times.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

moltonel — Tue, 02 Aug 2016 21:57:05 +0000

> The point is, updating a record in Postgres requires updating every index. THAT is the killer! Yes, removing little-used indices would reduce the load. But migrating to MySQL has removed the load entirely.

It removed the load "entirely" (MySQL indexes still need to be updated when the column changes) but traded update-time work against select-time work (MySQL indexes being indirect make them slower to use). It's a good tradeoff if you do a lot more updates than you do selects, which must be Uber's case. But if you don't do many selects on your table, you probably don't need all those indexes (indexing everything just in case is a classic DB antipattern). Finding the best set of indexes is not easy, and it looks like Uber would have saved a lot of time by hiring a competent PG DBA (who would also have told them about hot_standby_feedback=on and pg_upgrade -k).
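A rough way to see the tradeoff is to count index entries written per update and index traversals per lookup. The Python sketch below is a toy model under stated assumptions, not a description of either engine's implementation: it ignores PostgreSQL's heap-only-tuple (HOT) optimization, and the column names and function names are hypothetical.

```python
# Toy model only: counts index entries written per UPDATE and index
# traversals per secondary-key SELECT.

def direct_index_update_writes(indexed_columns, changed_columns):
    """Indexes point at the physical row version (PostgreSQL-like, ignoring HOT).

    The new row version lands at a new location, so every index gets a new
    entry, whether or not its column changed.
    """
    return len(indexed_columns)


def indirect_index_update_writes(indexed_columns, changed_columns):
    """Secondary indexes point at the primary key (InnoDB-like).

    Only indexes whose column actually changed need a new entry.
    """
    return len(set(indexed_columns) & set(changed_columns))


def select_traversals(indirect):
    """Index traversals needed for one lookup through a secondary index."""
    # Indirect: secondary index -> primary key -> primary index -> row.
    return 2 if indirect else 1


indexed = ["email", "city", "status", "created_at"]  # hypothetical columns
changed = ["status"]
print(direct_index_update_writes(indexed, changed))    # 4
print(indirect_index_update_writes(indexed, changed))  # 1
print(select_traversals(indirect=False), select_traversals(indirect=True))  # 1 2
```

With many indexes and few changed columns the indirect layout writes far less per update, and pays for it with an extra traversal on every secondary-index read.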

> Combined with the fact that 9.2 replication is extremely "chatty", Postgres had a runaway update load.

WAL/streaming replication indeed uses a lot of bandwidth, but it makes the replication much more efficient to apply on the slave (win some, lose some).

> upgrading from 9.2 required shutting down the master database and upgrading off-line - a LONG process

Again, reading their blog it seems like they went with the most naive, straightforward method, and could have shortened things a lot even for their 9.1 -> 9.2 migration. There are also ways to make no-downtime migrations (even before logical decoding arrived with PG 9.4); I've managed that with older PG releases on nearly-overloaded hardware, so why couldn't they? PG 9.3 was released in 2013-09, just 6 months after they originally migrated from MySQL to PG (they seem to like changing dbs :p) and they never managed to upgrade past 9.2? Come on...

> The article really does make fascinating reading as to the engineering problems they faced and the problems that inappropriate design decisions make. Not saying that the Postgres design decisions are bad, they just don't work here.

Yes, lots of interesting technical discussions here. And yes, whenever a compromise needs to be made, it's going to be the wrong choice for somebody. But that somebody apparently missed a few low-hanging fruits that would have eased the pain... And went to implement a big NoSQL layer on top of a different RDBMS instead :/

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

blitzkrieg3 — Tue, 02 Aug 2016 21:18:26 +0000

> One last note about how the article describes indexing: it uses the word “rebalancing” in context of B-tree indexes. It even links to a
> Wikipedia article on “Rebalancing after deletion.” Unfortunately, the Wikipedia article doesn’t generally apply to database indexes
> because the algorithm described on Wikipedia maintains the requirement that each node has to be at least half-full. To improve
> concurrency, PostgreSQL uses the Lehman, Yao variation of B-trees, which lifts this requirement and thus allows sparse indexes. As a
> side note, PostgreSQL still removes empty pages from the index (see slide 15 of “Indexing Internals”).

That's a pretty serious error if Evan wants to be taken seriously. I know relatively little about Postgres design and even I assumed that bit was wrong.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Tue, 02 Aug 2016 20:50:04 +0000

> While this extreme case is indeed not handled well by PG, it's hard not to wonder whether they could have improved things by removing less useful indexes (they don't seem to mind MySQL's indirect index, so they must have very few selects, mostly updates), de-normalising the table, using inheritance, or some other schema trick.

The point is, updating a record in Postgres requires updating every index. THAT is the killer! Yes, removing little-used indices would reduce the load. But migrating to MySQL has removed the load entirely.

Combined with the fact that 9.2 replication is extremely "chatty", Postgres had a runaway update load. If they could have backported the newer replication techniques to 9.2, they could have upgraded Postgres live, but that was the other big problem - upgrading from 9.2 required shutting down the master database and upgrading off-line - a LONG process, and then all the satellites needed upgrading - a very expensive (timewise) process over a network. All downtime that they couldn't afford.

The article really does make fascinating reading as to the engineering problems they faced and the problems that inappropriate design decisions make. Not saying that the Postgres design decisions are bad, they just don't work here.

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

moltonel — Tue, 02 Aug 2016 18:43:07 +0000

While this extreme case is indeed not handled well by PG, it's hard not to wonder whether they could have improved things by removing less useful indexes (they don't seem to mind MySQL's indirect index, so they must have very few selects, mostly updates), de-normalising the table, using inheritance, or some other schema trick.

They complain about MVCC killing queries on the slave, but didn't use the parameter that's there to solve exactly that. They get (understandably) scared by a data corruption bug (which was quickly fixed, and the area is now much better unit-tested), but move to a replication setup that is arguably more fragile. They disqualify some tools (pg_logical, pgpool) for not being part of core PG, but are happy to implement a whole NoSQL layer on top of MySQL. Etc.

Yes, they encountered some PG weak points. But it seems they didn't really give PG a chance, opting to migrate to a very different setup before trying to fix the existing setup. On the other hand, the PG devs are remarkably level-headed in taking the constructive criticism and seeing what can still be improved.

Sorry for the speculation, but I guess Uber's decision was more political/personal than technical: the new CTO wanting to undo the previous MySQL->PG migration to show that he does things better, some engineers wanting to play with a fancy project, an Oracle sales rep lobbying the right person... :/

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

SEJeff — Tue, 02 Aug 2016 17:25:18 +0000

Hammering a PG database is fine. Their use case is an extreme amount of updates. This causes write amplification and is indeed a very poor use case for PG.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

cjcox — Tue, 02 Aug 2016 15:02:56 +0000

Well in my case our PostgreSQL db's are many TB's in size. Our largest MySQL is only in the hundreds of GB's.

Again, YMMV... it just doesn't match up with our own experience with the two database systems. To the point of saying "you're doing it wrong".... but hey, you know, whatever works for you (you know?).

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Tue, 02 Aug 2016 13:22:06 +0000

Did you read the article? How large are your Postgres databases?

Uber's problem is that their databases (a) are large, (b) are hammered intensely, and (c) require downtime they can't afford in order to upgrade. Plus, thanks to an architecture decision by Postgres, they are a nightmare to garbage-collect.

It's unfortunate, but Uber's usage pattern, and Postgres's design, do NOT interact nicely. It happens ...

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

moltonel — Tue, 02 Aug 2016 12:35:26 +0000

> I really hope that's just a fancy name, and not actually a description of what they're doing ...

They've described what they're doing pretty extensively: https://eng.uber.com/schemaless-part-one/ (the link is in the first paragraph of the article).

While their solution obviously works for them, I doubt I'd reach the same conclusion if faced by the same requirements. The problems they are facing with PG are real and they seem to understand them well, but their choice of workarounds is strange.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

akkornel — Tue, 02 Aug 2016 05:55:49 +0000

I agree SO MUCH with that. I've got two environments:

* A MySQL environment with MANY databases (thousands), basically a shared DB service for end-user groups, that doesn't take much disk space (<1 TB).
* A PostgreSQL environment with three databases, that is critical to the entire infrastructure.

The MySQL environment used to have replication fall out of sync occasionally, and re-syncs were always annoying. The PostgreSQL environment never fell out of sync except when I was messing with stuff, and then a simple rsync (plus service stop/start) would get things back.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

cjcox — Tue, 02 Aug 2016 04:51:04 +0000

Obviously you need to pick the right tool for the right job, etc... PostgreSQL 9.2 is very very old now though. We run both PostgreSQL (clear up to 9.5) and MySQL (yes... MySQL) and we have more corruptions with MySQL. Just saying. YMMV.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

drag — Tue, 02 Aug 2016 02:59:11 +0000

Since MariaDB is a fork of MySQL... how much actual variation does there exist between the two?

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Tue, 02 Aug 2016 00:02:25 +0000

Just got your sarcasm ... :-)

And as someone who thinks relational technology is rubbish, I really think going schema-less is an exercise in idiocy! I really hope that's just a fancy name, and not actually a description of what they're doing ...

Pick gives you loads of rope to hang yourself with. And a common way of hanging yourself is to dispense with a documented schema :-(

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Mon, 01 Aug 2016 23:14:56 +0000

> To our knowledge, the internal architecture that we discuss in this article has not changed significantly in newer Postgres releases, and the basic design of the on-disk representation in 9.2 hasn’t changed significantly since at least the Postgres 8.3 release (now nearly 10 years old).

To the best of my knowledge, the internal architecture of dynamic files in Pick has not changed much, if at all, since Pr1me INFORMATION v5, released in the mid 80s. IT HASN'T NEEDED TO!

OUCH!

Like WordPerfect, which I believe still uses the v6 format, released in 1994. If it's designed right at the start, why do things need to change? Why do people still fall for that "new, improved" flannel?

Sometimes, those old greybeards really did get it right.

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

Wol — Mon, 01 Aug 2016 23:07:18 +0000

Unfortunately, there isn't a Free Software version available (OpenQM and ScarletDME I sadly wouldn't touch without an army of lawyers present), but if you want a key/value store with full SQL capabilities, try a variant of Pick/MultiValue :-)

If you want fast and efficient, then 1NF just doesn't cut it.

Cheers,
Wol

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

riking — Mon, 01 Aug 2016 22:51:21 +0000

Likely because their new VP of Engineering has a lot of experience with MySQL.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

flewellyn — Mon, 01 Aug 2016 22:13:10 +0000

This quote, in particular, is quite good:

"By now, their use-case seems to be a better fit for a key/value store. And guess what: InnoDB is a pretty solid and popular key/value store. There are even packages that bundle InnoDB with some (very limited) SQL front-ends: MySQL and MariaDB are the most popular ones, I think. Excuse the sarcasm. But seriously: if you basically need a key/value store and occasionally want to run a simple SQL query, MySQL (or MariaDB) is a reasonable choice. I guess it is at least a better choice than any random NoSQL key/value store that just started offering an even more limited SQL-ish query language. Uber, on the other hand just builds their own thing (“Schemaless”) on top of InnoDB and MySQL."

Ouch.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

xorbe — Mon, 01 Aug 2016 21:41:58 +0000

Also, can we confirm if they prefer emacs or vim?

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

smoogen — Mon, 01 Aug 2016 21:09:27 +0000

My main wonder is not that they moved from PostgreSQL to another SQL for those table sizes... my main question is why MySQL and not MariaDB?

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

chirlu — Mon, 01 Aug 2016 20:50:49 +0000

I found Markus Winand’s take on the Uber blog post interesting, too, in particular since he is someone who actually has experience with performance issues in various different database systems:

> In this post I’ll explain why I think Uber’s article must not be taken as general advice about the choice of databases, why MySQL might still be a good fit for Uber, and why success might cause more problems than just scaling the data store.

Klitzke: Why Uber Engineering Switched from Postgres to MySQL

corbet — Mon, 01 Aug 2016 20:32:42 +0000

For the PostgreSQL developers' point of view, see this mailing list thread, and this note from Josh Berkus in particular.