Another "is anyone really still using MySQL?" post
Posted Aug 20, 2012 12:56 UTC (Mon) by man_ls (subscriber, #15091)
Now, killing Oracle's product might be a good idea if a better fork were to prevail... but that is also unlikely given the brand recognition MySQL has. I wonder how a package under the GPL can have such a high degree of fragmentation, I don't recall other prominent examples.
Posted Aug 20, 2012 14:44 UTC (Mon) by mpr22 (subscriber, #60784)
I wonder how a package under the GPL can have such a high degree of fragmentation
Very easily: The upstream is a corporation that is widely distrusted by the user community and which only accepts contributions accompanied by an agreement allowing them to issue closed-license versions for profit.
Posted Aug 20, 2012 19:08 UTC (Mon) by robert_s (subscriber, #42402)
Yes, because postgresql doesn't exist.
People being forced onto postgresql would probably be a boon for FOSS databases, as users realize that FOSS databases are far more capable and robust than they thought, and perfectly capable of taking over the jobs of the majority of installed proprietary DBMSs.
PostgreSQL, I wish
Posted Aug 20, 2012 20:19 UTC (Mon) by man_ls (subscriber, #15091)
Sad but true: PostgreSQL is the Sybase of the Free software world, and I would love you to prove me wrong. Perhaps the shadow of MySQL is what is keeping PostgreSQL from greater acceptance, who knows; I agree that a more widely known PostgreSQL would be a great outcome.
Posted Aug 20, 2012 23:15 UTC (Mon) by khim (subscriber, #9252)
Perhaps the shadow of MySQL is what is keeping PostgreSQL from greater acceptance
Nope. It's the same thing that keeps PHP alive: availability on cheap hosting. Now, just why is MySQL available on cheap hosting while PostgreSQL is premium? The first reason is, of course, speed: on tiny databases with trivial requests MySQL is still faster. Another important capability is upgradeability. MySQL can be upgraded and downgraded in seconds. The latter capability is important for cheap hosters: they can plan for upgrades, but they cannot test these upgrades. They really need the ability to migrate clients to a new version and back in seconds. You can easily install two versions of MySQL or PostgreSQL side-by-side, but with MySQL you can easily go back and forth (depending on the reaction of clients), while PostgreSQL requires a dump/restore cycle. Not acceptable.
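The asymmetry the comment describes can be sketched roughly as follows; the paths, port numbers, and version layout are illustrative assumptions, not from the original post:

```shell
# MySQL: point the new server binaries at the existing data directory
# and fix up the system tables in place; reverting is usually just
# switching the binaries back.
mysqld_safe --datadir=/var/lib/mysql &
mysql_upgrade

# PostgreSQL (the traditional workflow of that era): a full
# dump/restore cycle, which is what makes quick back-and-forth
# migration between versions impractical for a hoster.
pg_dumpall -p 5432 > all.sql        # dump everything from the old cluster
psql -p 5433 -f all.sql postgres    # restore into the new cluster
```

Both sequences assume a running server on the relevant port; they are a sketch of the workflow difference, not a complete migration procedure.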
Now, you may say that for "serious users" this is not an important capability. And I agree. But "serious users" are not born that way. They start from some position, and usually they start with MySQL. A few years down the road, PostgreSQL is fighting for users who know MySQL, who live with MySQL, and for whom PostgreSQL is technology from Mars. Why would they switch without really serious reasons?
It's much easier to attract a newbie than to attract someone who has invested time and money in a competitor's technology - and PostgreSQL developers have for years refused to admit that.
Posted Aug 21, 2012 1:27 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
Posted Aug 21, 2012 9:46 UTC (Tue) by andresfreund (subscriber, #69562)
Getting really seamless migration is still very, very hard.
Posted Aug 21, 2012 14:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
Posted Aug 21, 2012 15:27 UTC (Tue) by andresfreund (subscriber, #69562)
SR/HS work by replaying the WAL from the primary on the standby. Unfortunately, its format has changed in at least each of the last five releases. Even 9.3 - with 9.2 being in beta at the moment - already has significant changes to the format.
> upgrade it (in-place upgrades are fast), let it catch up with the master
And unfortunately, while pg_upgrade upgrades are much faster than the traditional pg_dump/pg_restore, they aren't that fast, because they require the planner statistics to be rebuilt. A whole-database ANALYZE can take quite a while on a larger database.
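A rough sketch of that in-place workflow, with illustrative directory layouts (the version numbers and paths are assumptions matching the Debian layout of the time):

```shell
# In-place upgrade of a cluster with pg_upgrade; both clusters must be
# shut down, and the directories below are examples only.
pg_upgrade \
  --old-datadir /var/lib/postgresql/9.1/main \
  --new-datadir /var/lib/postgresql/9.2/main \
  --old-bindir  /usr/lib/postgresql/9.1/bin \
  --new-bindir  /usr/lib/postgresql/9.2/bin

# pg_upgrade does not carry planner statistics over, so every database
# needs a fresh ANALYZE before query plans are sensible again -- this
# is the slow step the comment is pointing at.
vacuumdb --all --analyze-only
```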
Don't get me wrong: I *really* like postgres. I really like SR/HS. I even like that pg_upgrade exists although the way it works isn't that elegant from my POV.
Unfortunately that doesn't make those features appear :(.
Posted Aug 22, 2012 11:12 UTC (Wed) by robert_s (subscriber, #42402)
Yes, that's the conventional wisdom. But you never see any real proof of this against a modern and properly tuned postgres.
Posted Aug 22, 2012 18:29 UTC (Wed) by dlang (✭ supporter ✭, #313)
What Postgres really needs is a "benchmark this box and set the parameters accordingly" utility.
The Postgres defaults are suitable for a many-years-old system that is going to be running Postgres alongside many other things; they try to make sure that Postgres won't interfere with everything else running on the system.
If Postgres is the reason the system exists, the default values are grossly off from anything sane, and there really aren't many good resources to help a newbie who isn't an experienced DBA figure out what to set them to (even within an order of magnitude).
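To illustrate how far the shipped defaults sat from a dedicated-server configuration, here is a hedged postgresql.conf sketch; the values are examples for a hypothetical box with ~8GB of RAM dedicated to Postgres, not recommendations:

```
# postgresql.conf -- illustrative values only, for a dedicated machine.
shared_buffers = 2GB          # defaults of the era were 32MB or lower
effective_cache_size = 6GB    # planner hint; default was 128MB
work_mem = 32MB               # per-sort/per-hash memory; default 1MB
checkpoint_segments = 32      # default was 3; larger values trade
                              # write throughput against recovery time
```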
Posted Aug 22, 2012 18:56 UTC (Wed) by raven667 (subscriber, #5198)
Posted Aug 22, 2012 19:31 UTC (Wed) by andresfreund (subscriber, #69562)
There are other variables where nobody fought enough to get them changed. Other changes are beneficial in a wide range of systems but hurt others badly...
Posted Aug 22, 2012 19:40 UTC (Wed) by andresfreund (subscriber, #69562)
Unfortunately the "benchmark & set" idea doesn't really work in real life. Many of the parameters really, really depend on the workload you want to run:
* a high shared_buffers hurts in write intensive workloads if the dataset is much bigger than the available ram
* a high shared_buffers greatly improves read intensive workloads with a large hot set if it fits into s_b entirely
* a high shared_buffers hurts predictable response times in write intensive workloads pretty badly on certain linux kernel versions
* a high shared_buffers setting hurts on high connection counts because of the large page table (can be alleviated with hugepages, probably coming in 9.3)
* a high max_connections hurts performance in high throughput oltp'ish workloads but is needed in beginner setups
* a high default_statistics_target hurts high throughput oltp workloads noticeably but greatly improves olap-ish workloads
* a high checkpoint_segments *greatly* improves write performance
* a high checkpoint_segments setting considerably increases recovery time after a crash/immediate restart
* a low checkpoint_timeout setting + small checkpoint_completion_target decreases response time jitter
* a low checkpoint_timeout setting + small checkpoint_completion_target considerably increases the amount of overall writes (due to checkpoints + full page writes), especially if the workload is update heavy
I could go on without a problem for quite some time.
For some of those, ideas exist to make the setting more generally acceptable; for others, none do.
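One way to see where the knobs discussed above currently stand on a given installation (this assumes a running server and psql access; output depends entirely on the local configuration):

```shell
# Inspect the parameters from the list above via the pg_settings view.
psql -c "SELECT name, setting, unit, source
         FROM pg_settings
         WHERE name IN ('shared_buffers', 'max_connections',
                        'default_statistics_target',
                        'checkpoint_segments',
                        'checkpoint_timeout',
                        'checkpoint_completion_target');"
```

The `source` column is handy here: it shows whether a value comes from the compiled-in default, the configuration file, or a per-session override.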
Posted Aug 22, 2012 19:56 UTC (Wed) by dlang (✭ supporter ✭, #313)
I am the type of person who tends to appreciate manual knobs for tuning things. However, when the normal answer to any performance question is "well, you must have things tuned wrong", without there being any way for a newbie to know how to tune it, there is a problem.
The fact that MySQL has historically done better for simple queries with the default configuration leads people to be scared of Postgres.
In many ways, this is the same way that Microsoft made its way into the corporate datacenter. They pitched it as "no expertise needed to run it, just install and go", and for small, simple setups they were close enough to right for people to believe it. The fact that running a large setup on Microsoft products frequently requires more resources, and more expertise, than running the same userbase on a *nix solution is missed by many people, because they got started cheaply and so they assume that the ramp-up is going to be equally hard for competing products.
Posted Aug 21, 2012 9:49 UTC (Tue) by andresfreund (subscriber, #69562)
While these are still not-too-nice results, I think it's more meaningful to compare the actual server packages installed, because the -common packages are installed as dependencies of *loads* of things.
That comparison is only valid if people install the version-independent meta package and not a specific version, but I have no idea how to do a more meaningful comparison ;)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds