> My thought when I read about this problem is that PostgreSQL is causing it with their user-space locking and so PostgreSQL needs to fix it.
So there are a couple of major problems with the idea that this issue should be fixed in the PostgreSQL code:
1. PostgreSQL is not the only application with its own spinlock implementation. There is, for example, Oracle.
2. While we might be able to change locking in future versions of PostgreSQL, we can't change locking in past ones.
Even if the next version of PostgreSQL (9.3) has a modified locking implementation which doesn't hit the issues in the 3.6 kernel, the number of people running older versions of PostgreSQL will far outnumber the folks running the latest version for quite some time. What you'd be saying to all of those folks is, effectively, "don't upgrade Linux".
So any solution which hinges on "make these modifications to PostgreSQL" will instead result in PostgreSQL users deciding to stay on old versions of the kernel. If this problem is equally bad for Oracle, you might even see Red Hat refusing to deploy a version based on 3.6.
There's also the fact that PostgreSQL's locking implementation is complex and quite highly tuned. So the idea that the PostgreSQL project could, within a few months, make changes which wouldn't result in a worse regression is optimistic. Implementing things like futex support could take literally years.
Posted Oct 3, 2012 19:00 UTC (Wed) by corbet (editor, #1)
FWIW, the discussion in kernelland was based on the assumption that this regression was the kernel's problem. Nobody there suggested telling the PostgreSQL developers to come up with a new locking scheme.