LWN.net Logo

PostgreSQL, I wish

PostgreSQL, I wish

Posted Aug 22, 2012 11:12 UTC (Wed) by robert_s (subscriber, #42402)
In reply to: PostgreSQL, I wish by khim
Parent article: MariaDB: Disappearing test cases or did another part of MySQL just become closed source?

"on tiny databases and trivial requests MySQL is still faster"

Yes, that the conventional wisdom. But you never see any real proof of this against a modern and properly tuned postgres.


(Log in to post comments)

PostgreSQL, I wish

Posted Aug 22, 2012 18:29 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

The problem is that it is non-trivial to properly tune postgres.

What Postgres really needs is a "benchmark this box and set the parameters accordingly" utility.

The Postgres defaults are suitable for a many-years-old system that is going to be running postgres as well as many other things and try to make sure that postgres isn't going to interfere with the other things running on the system.

If Postgres is the reason for the system, the default values are grossly off from anything sane, and there really aren't many good resources to help a newby who's not an experienced DBA figure out what to set them to (even within an order of magnatude)

PostgreSQL, I wish

Posted Aug 22, 2012 18:56 UTC (Wed) by raven667 (subscriber, #5198) [Link]

What's your opinion on pgtune fitting this bill? There seems to be only a handful of truly critical postgresql performance tuning knobs with quickly diminishing returns that will be highly workload-specific and not hardware-specific. I wonder if there is any idea in the postgresql community to just have it detect cpu/memory at startup time and automatically adjust the internal buffers unless overridden by the config file, making pgtune unnecessary.

PostgreSQL, I wish

Posted Aug 22, 2012 19:31 UTC (Wed) by andresfreund (subscriber, #69562) [Link]

Some of that is happening (wal_buffers is autotuned since 9.1, max_fsm_* doesn't exist anymore since 8.4, default_statistics_target has a sane default since 8.4, 9.3 won't need adjustment of sysv /proc settings for most things), in most of the other situations its either hard to do or it very much depends on the environment. Many pg instances don't run on a dedicated machine. Many database servers serve multiple pg clusters at the same time. Many database servers run in virtualized environments where you get better average performance if you use less memory.

There are other variables where nobody fought enough to get them changed. Other changes are beneficial in a wide range of systems but hurt others badly...

PostgreSQL, I wish

Posted Aug 22, 2012 19:40 UTC (Wed) by andresfreund (subscriber, #69562) [Link]

I agree that its way too hard to tune it and that lots of folkloristic knowledge is needed to do so.

Unfortunately the "benchmark & set" idea doesn't really work in real life. Many of the parameters really, really depend on the workload you want to run:
* a high shared_buffers hurts in write intensive workloads if the dataset is much bigger than the available ram
* a high shared_buffers greatly improves read intensive workloads with a large hot set if it fits into s_b entirely
* a high shared_buffers hurts predictive answer times in write intensive workloads pretty badly on certain linux kernel versions
* a high shared_buffers setting hurts on high connection counts because of the large page table (can be alleviated with hugepages, probably coming in 9.3)
* a high max_connections hurts performance in high throughput oltp'ish workloads but is needed in beginner setups
* a high default_statistics_target hurts high throughput oltp workloads noticeably but greatly improves olap-ish workloads
* a high checkpoint_segments *greatly* improves write performance
* a high checkpoint_segments setting considerably increases recovery time after a crash/immediate restart
* a low checkpoint_timeout setting + small checkpoint_completion_target decreases response time jitter
* a low checkpoint_timeout setting + small checkpoint_completion_target considerably increases the amount of overall writes (due to checkpoints + full page writes), especially if the workload is update heavy

I could go on without a problem for quite some time.

For some of those idea exists to make a setting more generally acceptable, for others not.

PostgreSQL, I wish

Posted Aug 22, 2012 19:56 UTC (Wed) by dlang (✭ supporter ✭, #313) [Link]

then possibly the answer isn't a 'benchmark ahead of time" approach, but is instead a "run it for a little while and then ask it for recommendations of things to try"

I am the type of person who tends to appriciate manual knobs to tune things, however when the normal answer to any performance question is "well, you must have things tuned wrong" without there being any way for a newby to know how to tune it, there is a problem.

The fact that MySQL has historically done better for simple queries with the default configuration leads people to be scared of Postgres.

In many ways, this is the same way the Microsoft made it's way into the corporate datacenter. They pitched it as "no expertise needed to run it, just install and go", and for small, simple setups they were close enough to right for people to believe it. The fact that running a large setup on Microsoft products frequently requires more resources, and more experise than running the same userbase on a *nix solution is missed by many people because they got started cheaply and so they assume that the ramp-up is going to be equally hard for competing products.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds