|
|
Log in / Subscribe / Register

So how does PostGreSQL compare to MySQL?

So how does PostGreSQL compare to MySQL?

Posted Jan 1, 2010 11:47 UTC (Fri) by rvfh (guest, #31018)
In reply to: The ongoing MySQL campaign by djzort
Parent article: The ongoing MySQL campaign

All this keeps me wondering how hard and risky it would be for a company (for example the one I work for) to move from MySQL (NDB cluster and MyISAM tables only) to PostGreSQL...

If I had tangible information, I might be able to convince my management to make an experiment on one of our test machines, then post more info on what went right and wrong.

My main question so far is: is the syntax 100% compatible? Is is just some corner cases that are not (those we probably do not use anyway)? Is it comparable performance-wise?

Sorry if this has already been answered many time, but I think now is the time to review the state of things for people/companies thinking of steering away from MySQL...


to post comments

So how does PostGreSQL compare to MySQL?

Posted Jan 1, 2010 12:17 UTC (Fri) by ringerc (subscriber, #3071) [Link] (6 responses)

"My main question so far is: is the syntax 100% compatible?"

No, it isn't. It's probably not even 90% compatible. However, if you've stuck to the SQL standard you won't have many things - if any - to change. However, there are some very commonly used MySQL extensions (like AUTO_INCREMENT) which you must do a different way in PostgreSQL, so you must be prepared for a few changes even to fairly simple code.

(Note that there are good reasons why PostgreSQL uses sequences and the SERIAL pseudo-type instead of using AUTO_INCREMENT. It's one of those things that MySQL has but PostgreSQL doesn't implement because it's actually pretty broken in transactional systems.)

The PostgreSQL documentation is extremely complete and discusses the relationship of its syntax with the standard in the documentation for each command/function. So you can learn a lot from there. There's also a lot of information about MySQL to PostgreSQL migrations all around the 'net.

Is is just some corner cases that are not (those we probably do not use anyway)?

If you're using MyISAM tables, then you're not asking much from your database. You don't need it to be reliable or to support basic transactional features, for example. So it's fairly likely you won't have too many issues, though you'd need to handle AUTO_INCREMENT and probably move to sql-standard datatype names for a few things.

Is it comparable performance-wise?

Usually I'd just say "yes"... because MySQL and PostgreSQL have comparable performance for "serious database" uses. For some workloads and queries one is faster, for others another, but there's no clear and obvious winner. However, that's assuming you're using InnoDB tables on MySQL and are relying on transactional behaviour.

You're using MySQL in quick-and-dirty MyISAM mode, where it's not even crash-safe let alone transactional. If you don't need reasonable safety you could get comparable performance from PostgreSQL by setting fsync = off in postgresql.conf but this is extremely unsafe and will damage your database if you have an unsafe server shutdown. Just like with MyISAM there's no guarantee the database can be repaired in such a case, so you may have to restore from backups.

That's really not the right approach, though. If you want that sort of thing, stick with MySQL. It's designed for it, whereas PostgreSQL is "safety first". If you actually want your data to be safe, then you'll want to (a) Move to InnoDB tables and adapt your app so that it does all database-modifying work in transactions, then (b) port over to PostgreSQL if you want that as an option.

You'll never get the raw queries-per-second performance out of PostgreSQL that you do from MyISAM tables in MySQL. Of course, your data will never be that unsafe, either ;-) . If you're prepared to do the work to make your app smarter - let the database intelligently do work you're currently doing by reading stuff into your app and processing it app-side, etc - then you might find that you get better performance out of Pg, because it's very good at more complex and difficult queries and can often do things much faster than you can do with the data in your app.

So how does PostGreSQL compare to MySQL?

Posted Jan 1, 2010 20:28 UTC (Fri) by nevyn (guest, #33129) [Link] (1 responses)

However, there are some very commonly used MySQL extensions (like AUTO_INCREMENT) which you must do a different way in PostgreSQL

I've only ever used PostgreSQL, so I'm wondering how AUTO_INCREMENT is different? Looking at the docs. the only "interesting" property I could see was it's use with more than one primary key for the table (but I'd assume/hope this wasn't common).

So how does PostGreSQL compare to MySQL?

Posted Jan 2, 2010 2:13 UTC (Sat) by ringerc (subscriber, #3071) [Link]

Transactional behaviour.

Postgresql's sequences are lock-free and don't block other transactions. They allow high INSERT concurrency, but don't guarantee gapless sequences (ie: on ROLLBACK or statement failure, you might have a "missing" number, which you shouldn't care about for a primary key).

Looking at some MySQL documentation, though, it looks like for recent versions of InnoDB AUTO_INCREMENT works much like PostgreSQL's sequences. I don't know enough about the MySQL side to go into depth on any differences. It looks like InnoDB's AUTO_INCREMENT these days may behave very like a sequence if InnoDB has the configuration param "innodb_autoinc_lock_mode = 2" set, but that's about all I can say.

http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increm...

( PostgreSQL doesn't support statement level replication, so the statement-replication concerns about auto-increment don't apply ).

So how does PostGreSQL compare to MySQL?

Posted Jan 3, 2010 7:44 UTC (Sun) by nicku (subscriber, #777) [Link] (3 responses)

The other issue with PostgreSQL is the need for applications to avoid the need for full VACUUMs, which require an outage. All our PostgreSQL databases seem to eventually require this. Avoiding the need for full VACUUMs is a matter of application design, but we don't seem to have the hang of that yet :-)

So how does PostGreSQL compare to MySQL?

Posted Jan 3, 2010 9:34 UTC (Sun) by ringerc (subscriber, #3071) [Link] (2 responses)

VACUUM or something like it is pretty much required in any MVCC design, and should occur in the background without interrupting anything. It's a cost of operation that should be amortized as smoothly as possible across time.

If you're having to use VACUUM FULL rather than plain VACUUM to control table bloat, though, then something is wrong with your configuration. Are you using an ancient version of Pg without built-in autovacuum? Is your autovacuum set to be so conservative that it's useless? Are you running out of free_space_map room so VACUUM is losing track of where it needs to do work?

Some people configure autovacuum to run less, thinking that by doing so they'll somehow reduce the amount of work to be done. If you've done so, there's your problem. In fact, autovacuum should run MORE if you want your database faster, so the work is done promptly (avoiding table and index bloat) and spread evenly rather than occurring in big user-annoying chunks. It's like garbage collection in that regard.

See VACUUM FULL on the Pg wiki.

Note that vacuum and autovacuum improve with every version. 8.4 for example eliminates the free_space_map parameter, making it automatically (and better) managed by the server. If you're on an old version, upgrade. Unfortunately Pg isn't yet clever enough to fully automatically manage the autovacuum parameters, so you may still need to tweak autovacuum params on particularly UPDATE-heavy tables and/or tell autovacuum to run more frequently in general.

MVCC implementation

Posted Jan 3, 2010 20:21 UTC (Sun) by butlerm (subscriber, #13312) [Link] (1 responses)

<em>VACUUM or something like it is pretty much required in any MVCC
design</em>

Not quite. Oracle's MVCC, for example, works something like the following
(I believe InnoDB uses much the same scheme):

1. The latest version of any row is stored in the primary data block, along
with some transaction related meta data.
2. The information necessary to recover prior versions of any row is stored
in separate rollback segments
3. Changes to both are written to the redo log
4. When a query is processed, if the version of the row currently stored in
a data block isn't the one needed, a prior version is recovered by
referring to the appropriate rollback segment.
5. When a transaction commits, or soon thereafter, the rollback data for
that transaction is discarded. No disk I/O required.
6. Alternatively, if a transaction is rolled back, prior versions of
modified rows are recovered from the rollback segment and restored to their
original positions in the primary data blocks.

The main advantage of this scheme is that data blocks generally only get
modified once per transaction, rather than once initially and once some
time later, during the vacuum process. However, redo log overhead is
probably twice as high, and if a transaction is large or long enough, the
rollback entries will physically hit the the disk before they are discarded
as well.

MVCC implementation

Posted Jan 4, 2010 4:12 UTC (Mon) by ringerc (subscriber, #3071) [Link]

Good point.

A redo log does have a large disk I/O cost if you have a high update/delete volume and/or any long-running SERIALIZABLE transactions, but it doesn't require the same sort of work as Pg's in-place storage of superceded rows. On the other hand, it has a HUGE cost to queries - in the form of random disk I/O - if they have to hit the redo log on disk.

I don't know enough to comment on which is "better". I suspect that like most things, it's probably a trade-off where some workloads benefit from one and others from the other.

So how does PostGreSQL compare to MySQL?

Posted Jan 3, 2010 7:36 UTC (Sun) by nicku (subscriber, #777) [Link] (1 responses)

My main problem with deploying PostgreSQL instead of MySQL is to replace the replication of MySQL. The announcement in the Linux Conference PostgreSQL talk program that 8.5 "will support native replication" brought eager, enthusiastic support from members of the team I am in. In the meantime, members of that team will migrate at least one application from PostgreSQL to MySQL to scale read operations.

So how does PostGreSQL compare to MySQL?

Posted Jan 3, 2010 10:42 UTC (Sun) by dlang (guest, #313) [Link]

have you looked at the exitinsg replication options for postgres?

the big change in 8.5 is that there will be a synchronous replication via log shipping built into the core. There are already options (pgpool or slony to name a couple) that will do replication and let you read from all boxes at the same time.

I'm not a MySQL expert, but my understanding of the guarantees that it's replication provides make me think that these existing replication tools let you get equivalent safety to what MySQL replication provides.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds