LWN.net Logo

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

PostgreSQL developer Robert Haas has begun a series of articles comparing the architecture of PostgreSQL and MySQL. "So, all that having been said, what I'd like to talk about in this post is the way that MySQL and PostgreSQL store tables and indexes on disk. In PostgreSQL, table data and index data are stored in completely separate structures.... Under MySQL's InnoDB, the table data and the primary key index are stored in the same data structure. As I understand it, this is what Oracle calls an index-organized table. Any additional ('secondary') indexes refer to the primary key value of the tuple to which they point, not the physical position, which can change as leaf pages in the primary key index are split."
(Log in to post comments)

PostgreSQL committer

Posted Nov 30, 2010 14:29 UTC (Tue) by marcH (subscriber, #57642) [Link]

Robert is a "PostgreSQL committer" : looks like the git transition is not complete yet ;-)

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Nov 30, 2010 16:08 UTC (Tue) by Rubberman (guest, #70320) [Link]

According to this, MySQL's InnoDB stores the data and indexes in the same file. Not a good practice in my professional opinion. To me, the raw data is sacrosanct, and index errors should NEVER compromise the actual data. You can regenerate indexes easily enough (just CPU and clock time is required), but you may not be able to restore the raw data. That's why I much prefer to use PostgreSQL for critical systems where the data has to be kept inviolate. FWIW, I have been using, developing, and teaching relational database theory and SQL programming for almost 30 years. According to some experts in the field (members of the ANSI/ISO SQL standards committees), I am supposed to know what I am about wrt. databases... :-)

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Nov 30, 2010 16:58 UTC (Tue) by iabervon (subscriber, #722) [Link]

There's no a priori reason to think that separate filesystem files are relevant to fault isolation, since the processes that manipulate the indices probably have the file of raw data open as well. Chances are that both are mmapped and a stray pointer could corrupt either equally easily.

Also, it's very easy for index errors to propagate into corruption of raw data; all it takes is a row being matched incorrectly by the conditions on an update because the index was incorrect. If the index on your primary key column is inaccurate and returning the wrong rows, you'll quickly trash your database.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 3, 2010 0:40 UTC (Fri) by zlynx (subscriber, #2285) [Link]

Not sure if Postgres does it, but that would be a trivial bug to solve: just double check the condition on the data row before using it. That way even if the index provided the wrong row, the double check will catch it.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 3, 2010 3:52 UTC (Fri) by iabervon (subscriber, #722) [Link]

I know at least some versions of postgres didn't check, because I've seen it (note, however, that the corruption was our fault, not postgres's). There's also the issue that, if the indices are corrupted, it won't necessarily find violations of uniqueness constraints, and may generally act like some of your rows aren't there for long enough that the application corrupts the data portion of the database as well. It's actually more likely (although less obviously damaging) for a corrupted index to make it not return rows which it should have than to return rows which it shouldn't have, so it is probably not actually worth rechecking rows before returning them; you still need to rely on the index being correct.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Nov 30, 2010 17:41 UTC (Tue) by Simetrical (guest, #53439) [Link]

By default, InnoDB stores basically everything in one file, data and indexes for all tables in all databases. (Or really, it stores everything in one set of files, since it supports using multiples for filesystems with file length limits, but it could easily be one actual file.) This can be changed so that instead it stores each table's data+index in an .ibd file, with the innodb_file_per_table option:

http://dev.mysql.com/doc/refman/5.1/en/innodb-multiple-ta...

In practice, there's little difference between the two options. As I understand it, the transaction log is global, and the data in the per-table files cannot usually be recovered with the transaction log, so the data file for an individual table is mostly (although not entirely) useless.

InnoDB does always store data and indexes for a single table in the same file, AFAICT. However, I don't see why this makes a difference. In practice, the data and indexes are typically going to be stored on the same filesystem anyway, and will certainly be accessed by the same threads. Putting them in separate files will not decrease the chance of data corruption in any way I can think of. Could you elaborate? Of course, InnoDB can rebuild indexes from data if the indexes are damaged but the data is not. Whether they're in the same file seems irrelevant to me.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 2:24 UTC (Wed) by bk (guest, #25617) [Link]

I have a hard time believing that a 30 year veteran of RDBMSs would go anywhere near MySQL...

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 6:57 UTC (Wed) by aristedes (guest, #35729) [Link]

>> I have a hard time believing that a 30 year veteran of RDBMSs would go anywhere near MySQL...

Did a slashdot troll somehow find their way over to LWN?

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 14:44 UTC (Wed) by ballombe (subscriber, #9523) [Link]

Well, meta-trolling as the activity of nominating random harmless posts to "troll" status is an LWN exclusive. No need to smear slashdot.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 22:00 UTC (Wed) by jmalcolm (guest, #8876) [Link]

While I agree that calling "troll" is a bit akin to Godwin's Law, it would be nice if there was some more effective response to this kind of comment.

One thing that I appreciate about sites like LWN is that the participants debate and trade facts rather than simply conclusions and prejudice.

Comments like the one above invite content-free flame wars. It calls into question the competency or even honesty of the original poster without establishing any basis. It is human nature to support such comments when we agree with the basic premise. We create the credibility in our minds. The underlying hostility mutes considered opposition. Both of these promote ignorance.

Such comments also invite similarly emotionally charged and content-free responses--flame wars. No thank you.

What if I had responded with "I have a hard time believing you know anybody with 30 years experience in RDBMS" or even "I have a hard time believing that you know much about anything at all"? These are essentially the same comment as above content-wise.

I am not sure what the solution is. Calling 'troll' is not much better. I would just like to see us all keep the level of dialog above the groundless name calling above.

If you do not like MySQL, then say what you consider inadequate, inferior, or dangerous about it. Experts with opposing views will then have a chance to respond. We will all be better informed, and we will have a better basis for making our own choices in the future. That is what is happening in the rest of this thread.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 22:32 UTC (Wed) by jmalcolm (guest, #8876) [Link]

The funny part of this comment is that Slashdot itself runs on MySQL.

http://www.mysql.com/news-and-events/generate-article.php...

Slash dot has been using MySQL for well over a decade. If we were getting comments from Slashdot RDBMS admins (but maybe not users) I am sure we would have some pretty pro MySQL commentary.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 5:37 UTC (Wed) by nlucas (subscriber, #33793) [Link]

For professional use it's not the data that is sacrosanct [1], it's database *speed*.

Most of the speed comes from disk cache locality and such things. If putting everything on the same file (according to some order and logic found by performance testing) is faster, then that is what is used.

Philosophic laws about software engineering don't make sense when speed is paramount. Good, non subjective, performance testing yes.

Remember we are talking about ACID databases, so if we remove bugs from the equation the data *must* be correct at the end of a transaction. If not then they are not ACID compliant.

----
[1] There are always backup solutions, and the only way to be sure after a corrupted index was used.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 1, 2010 22:18 UTC (Wed) by jmalcolm (guest, #8876) [Link]

I suspect that many MySQL detractors do not even realize that MySQL is ACID, supports transactions, has corrected date nonsense stuff, and has addressed other commonly raised objections.

A lot of the MySQL prejudice seems to stem from historical limitations of the MyISAM engine. InnoDB is the default engine now and there are always options like TokuDB and others for those doing something that really needs more.

There are times that the features missing from MySQL matter. It is not the best tool for every job. MySQL often hits the sweet spot for me though and performance is a big part of that.

That said, your statement that speed trumps all invites the following video (language potentially NSFW):

http://nosql.mypopescu.com/post/1016320617/mongodb-is-web...

Imagine how upset a PostgreSQL, Oracle, or DB2 fan would have been in that conversation. :-)

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 3, 2010 10:41 UTC (Fri) by intgr (subscriber, #39733) [Link]

> and has addressed other commonly raised objections.

Nope, MySQL still hasn't addressed many of their gotchas -- particularly silently ignoring things that the user explicitly told it to do. For example, silently throwing away inline foreign key constraints:

Or after a configuration error, falling back to MyISAM, even when user creates tables with ENGINE=InnoDB. For example, set innodb_flush_method=O_DRIECT and then run "create table foo(id integer) ENGINE=InnoDB; show create table foo;" -- gives you a MyISAM table!

Or the implicit commit that happens when running, for example, LOCK TABLES

Bottom line: No system is perfect, but I prefer to be TOLD when I'm trying to do something that shouldn't or cannot be done. MySQL has a long history of wiping problems under the rug, you only find out about problems too late when they have started to stink badly.

> InnoDB is the default engine now

Again wrong, MySQL 5.1 still defaults to MyISAM and 5.5 isn't stable yet.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 4, 2010 23:53 UTC (Sat) by efexis (guest, #26355) [Link]

"Again wrong, MySQL 5.1 still defaults to MyISAM and 5.5 isn't stable yet"

He's not wrong; 5.5 may not be called 'stable' yet but it definitely exists in the present and it defaults to innodb, so the statement "InnoDB is the default engine now" is factually accurate, even if it is not the full picture. My experience with 5.5 has been without problem; I personally consider it as stable as 5.1 at doing mostly 5.1 type things, but consider its new features (such as semi-synchronous replication etc) to be unstable as they are new. Obviously YMMV, I wouldn't suggest anybody else do anything else risky, but my own judgement is that 5.1 isn't bug free, some bugs that people would be running on a 5.1 system won't exist on a 5.5 system, and vice versa, so it's kind of luck of the draw whether you hit problems on either. For anybody who shares my assessment, even as a minority, the statement "InnoDB is the default engine now" is completely true.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 6, 2010 15:07 UTC (Mon) by intgr (subscriber, #39733) [Link]

I don't want to take this argument further, but I did check my facts prior to posting -- I went to mysql.com/downloads/ and it says: "MySQL Community Server (Current Generally Available Release: 5.1.53)". Also neither Ubuntu 10.10 nor Fedora 14 have MySQL 5.5 in their repos. I think it's fair to say that MySQL 5.5 isn't "out" yet.

Haas: MySQL vs. PostgreSQL, Part 1: Table Organization

Posted Dec 8, 2010 17:58 UTC (Wed) by jeremiah (subscriber, #1221) [Link]

I think that should read "stores data and PRIMARY index in same file." The secondary indexes were stored someplace else I believe.

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds