May 14, 2012
This article was contributed by Josh Berkus
The PostgreSQL project has just released a beta of its next major version,
9.2. As usual with its annual release, this version includes many new
features, most of which are targeted at improving database performance. The
developers have been hard at work improving response times, increasing multicore
scalability, and providing for more efficient queries on large data. They also found time to include some other major features, so let's explore a few of the things 9.2 beta has to offer.
JSON support
If the new non-relational databases (or "NoSQL") have proved anything, it's that many application developers want to store JSON objects in a database. With version 9.2, that database can be PostgreSQL.
The JSON support in PostgreSQL 9.2 isn't complete JSON database functionality, but it goes a long way toward that. First, there's a validating JSON data type, so that you can create tables with a specific JSON field:
Table "public.users"
Column | Type | Modifiers
--------------+---------+------------
user_id | integer | not null
user_name | text |
user_profile | json |
Better is that there are multiple JSON conversion functions, including row_to_json() and array_to_json(), which allow you to get the results of a query in JSON format.
select row_to_json(names)
from ( select schemaname, relname from pg_stat_user_tables )
as names;
row_to_json
-------------------------------------------
{"schemaname":"public","relname":"users"}
This means that applications can now send queries to PostgreSQL, get back
results in JSON format, and immediately act on those results without
further conversion. Unfortunately, it is not yet possible to send your
query as a JSON object or JavaScript code, but that's likely to come in some future version of PostgreSQL.
To make the JSON support really useful, though, you need two optional
components, or "extensions" to PostgreSQL: hstore and
PL/v8. Hstore is a data
type that stores indexed
key-value data and ships with PostgreSQL. PL/v8 is a stored procedure language based on Google's v8 javascript engine, sponsored by Heroku and NTT.
Hstore allows you to store flattened JSON objects as a fully indexed dictionary or hash. PL/v8 lets you write fast-executing JavaScript code which can run inside the database to do all kinds of things with your JSON data. One such is to create expression indexes on specific JSON elements and save them, giving you stored search indexes much like CouchDB's "views".
The PostgreSQL project chose to implement its own JSON formatting rather than utilizing any external library, reducing external dependencies and improving portability.
Range types
Anyone who's ever written a calendaring application can tell you that there's no such thing as a "point in time". Time comes in blocks, and pretty much everything you want to do with times and dates involves spanning minutes, hours, or days of time. For a long time, the only way relational databases had to represent spans of time was as two endpoints in different fields, an unsatisfactory and error-prone method.
In 9.2, contributor Jeff Davis introduces "range types" which allow the representation of any one-dimensional linear range, including time, real numbers, alphabetical indexes, or even points on a line. Such ranges can be compared, checked for overlap, and even included in unique indexes to prevent conflicts. PostgreSQL is the first major relational database system to have this concept.
To give you a concrete example, imagine you're writing a conference scheduling application. You want to make sure that no room can be scheduled for two speakers at the same time. Your table might look something like this:
CREATE TABLE room_reservations (
room_no TEXT NOT NULL,
speaker TEXT NOT NULL,
talk TEXT NOT NULL,
booking_period TSTZRANGE,
EXCLUDE USING gist (room_no WITH =, booking_time WITH &&)
);
That odd "EXCLUDE USING gist" clause says not to let anyone insert a record
for the same room at overlapping times. It utilizes two existing
PostgreSQL features, GiST
indexes and exclusion constraints. This substitutes for dozens of lines of application code in order to enforce the same constraint. Then you can insert records like this:
INSERT INTO room_reservations
VALUES ( 'F104', 'Jeff Davis', 'Range Types Revisited',
'[ 2012-09-16 10:00:00, 2012-09-16 11:00:00 )' );
As you can see, PostgreSQL's range types support mathematical closed and open bracket notation, helping you define ranges which do or don't overlap with adjacent ranges.
Scalability to 64 cores
Thanks to its Multiversion Concurrency Control (MVCC) architecture, PostgreSQL does not need to hold any locks for reading data, just for writing data. This should, in theory, allow for near-infinite multicore scalability of a read-only workload. But in reality, PostgreSQL 9.1 only scaled to around 24 cores before throughput per core fell off sharply. This really irritated PostgreSQL contributors Noah Misch and Robert Haas, so they decided to do something about it.
The main reason was that PostgreSQL was actually holding locks for each read. The biggest of these was a unitary lock on the table to make sure that it didn't get dropped while the read query was still running. When you have a lot of very short queries doing reads against the same table, contention on this table lock becomes a major bottleneck. Through a combination of repartitioning the lock memory space, and reducing the time required to get a lock, they largely eliminated that bottleneck.
Other contributors, such as lead developer Tom Lane, made other
optimizations to the read-only workload, by, for example, reducing memory copying for cached query plans. The University of California at Berkeley donated the use of a high-memory 64-core server for testing. The results of all of these optimizations are gratifying and dramatic.
PostgreSQL now scales smoothly to 64 cores and over 350,000 queries per second, compared to topping out at 24 cores and 75,000 queries per second on PostgreSQL 9.1. Throughput is better even at low numbers of cores. Note that this is on an extreme workload: all primary-key lookups on a few large tables which fit in memory. While it may seem obscure, that workload describes many common web applications (such as Rails applications), as well as the kind of workload many of the new key-value databases are designed to handle.
Index-only access
One of the features other database systems have, which PostgreSQL has lacked, is the ability to look up data only in an index without checking the underlying table. In the databases which support it (such as MySQL and Oracle) this is a tremendously useful performance optimization, sometimes called "covering indexes".
The reason why it's useful is that for very large tables an index on one or
two columns could be 1/100th the size of the table, and is often cached in
memory even when the table is not. Thus if you can touch only the index, you can avoid IO, making your query return twenty times faster. It's even more useful if the table in question is only going to be used to join two other tables on their primary keys.
However, the same MVCC which makes read queries scale so well on PostgreSQL made index-only access difficult. Even if the data you wanted was in the index, you had to check the base table for concurrency information. But then contributor Heikki Linnakangas created a highly cacheable bitmap called the Visibility Map, so that the query executor only has to check the base table for data pages which have been recently updated.
This means that, in 9.2, you'll be able to get your query answered just by the index in many or most cases, speeding up queries against large tables. Yes, this also means an end to most "COUNT(*) is slow on PostgreSQL" issues.
The caveat with this feature is that the table or tables in question need to be fairly up-to-date in database maintenance ("VACUUM"), which limits the ability to use the optimization on tables with a lot of "churn". Regardless, for many common use cases and for data warehouses, index-only access will be an order-of-magnitude performance improvement.
Cascading replication
Of course, these days horizontal scalability is a lot more popular than vertical scalability. The PostgreSQL developers, particularly Jun Ishiduka, Fujii Masao, and Simon Riggs, have continued to improve the binary replication introduced in PostgreSQL 9.0. Version 9.2 now contains support for cascading replication, which I will explain by example:
Imagine that you have three load-balancing PostgreSQL servers in Amazon US East, and another cluster of three replicated PostgreSQL servers in Amazon US West for failover in case of another AWS region-wide outage. If you want to use streaming replication, each server in US West needs to replicate directly from the master in US East, driving your transfer costs through the roof.
What you really want is the ability to replicate to database server 1 in US West, and have the two other servers in US West replicate from that server. With PostgreSQL 9.2, you can.
Configuration is fairly simple if you already have PostgreSQL 9.1 replication set up. Simply tell the cascading replica to accept replication connections by setting the wal_senders parameter. Then connect to it from the downstream replicas.
Other features
This isn't everything in the PostgreSQL 9.2 beta. There's performance
enhancements for writes such as group commit, a new class of index called
SP-GiST, reductions in CPU power consumption, multiple enhancements to
modifying database schema at runtime, and even several new database
monitoring commands. You can read about the new features in the
PostgreSQL 9.2 beta release
notes and the beta documentation.
Some features, planned for 9.2, didn't make it into this release due to issues found during development and testing. These include checksums on data pages to detect faulty storage hardware, federated databases, regular expression indexing, and "command triggers" which can launch an action based on arbitrary database events. Hopefully we'll see all of these in PostgreSQL 9.3 next year.
The PostgreSQL project hopes you'll download and test the beta to help
identify and fix bugs in version 9.2. If the project holds true to
its timeline for the last couple of years, the final release of version 9.2
should be some time in September. In the meantime, you can download and test the beta version.
(
Log in to post comments)