LWN.net Logo

Development

PostgreSQL 9.0 arrives with many new features

September 21, 2010

This article was contributed by Josh Berkus

Version 9.0 of the PostgreSQL database management system was released on September 20, with considerably more "buzz" than a PostgreSQL release has had in a while. The PostgreSQL Project does a major release every year, but this one is special. It has more than a dozen major features and nearly 200 minor improvements, so the release has more new goodies in it for database geeks than any release before it, hence the "9.0" version number.

It includes some things which users have been requesting for years, such as built-in replication. Given this large number of new features, I'm not even going to try to cover them all; instead, I'll pick five very different features as examples and you can read about the rest on PostgreSQL.org.

Binary replication

PostgreSQL 9.0 includes a new mechanism for replicating entire databases which uses binary logs to replicate between servers. Since this is very similar to Oracle's Hot Standby Databases, it is often also called "hot standby". While PostgreSQL already has several other replication tools, including Slony-I, pgPool2, and Bucardo, binary replication supports greater scalability and availability for the majority of PostgreSQL users. It is also simpler to configure and initialize for the most basic configuration: two identical databases with failover.

This feature is also expected to help push PostgreSQL in cloud hosting and software as a service (SaaS) environments. Overhead for each standby database is very low, allowing a single master to support tens or even hundreds of standby databases. Binary replication also works quite well together with virtual machine and filesystem cloning tools.

Binary replication works by copying the binary change logs, called "transaction logs", from one server to another. This builds on the mechanism used for "warm standby" in PostgreSQL for several years, with two important differences: you can now run queries on the standby server(s), and logs can now be shipped continuously over a database port connection. The latter also decreases latency between servers, allowing the standby server to be only a fraction of a second behind.

The new replication was a large, community-wide effort which took two years. Most of the work was done by developers working for 2nd Quadrant and NTT Open Source, but also includes contributions by developers for EnterpriseDB and Red Hat. Plans for 9.1 include administration tools, standby promotion, and synchronous replication.

New triggers

PostgreSQL applications frequently put a lot of programming into the database itself. This includes triggers, which are scripts that execute whenever data changes. PostgreSQL 9.0 expands this capability with two new kinds of triggers: column triggers and conditional triggers.

Column triggers are defined in the ANSI SQL Standard. These triggers execute only when data in that specific column changes. Imagine a database where you want to log a message every time a user changes their password, but not for other kinds of changes. You could define a trigger like so:

    CREATE TRIGGER password_changed
	AFTER UPDATE OF password ON users
	FOR EACH ROW
	EXECUTE PROCEDURE password_changed_log();

Conditional triggers are a PostgreSQL-only feature. A conditional trigger executes whenever a specific condition is met, but not otherwise. For example, imagine that you have a database of e-mail users and you want to freeze accounts whenever total storage goes over 5MB:

    CREATE TRIGGER freeze_account
      AFTER UPDATE OF mail_accounts
      WHEN ( total_storage > 5000000 )
      FOR EACH ROW
      EXECUTE PROCEDURE freeze_account_full();

New security features

Given the widespread concerns about database security in the industry, it's no surprise that PostgreSQL is adding new features to control access to data. First, PostgreSQL now supports RADIUS authentication, in addition to password, SSL, LDAP, PAM, and GSSAPI authentication. Second, large objects in PostgreSQL now have settable access permissions.

But the biggest new security feature is new commands to set blanket access permissions across an entire database or schema. This will make it much easier for web developers to make effective use of database permissions and prevent SQL injection attacks and other exploits. The first of these commands is GRANT ON ALL, which allows you to set permissions for a specific type of object over the whole database:

    GRANT SELECT,INSERT,UPDATE ON ALL TABLES IN cmsdata TO webcms;

The second command is SET DEFAULT PRIVILEGES, which allows you to define a set of permissions for each new database object created by users:

    ALTER DEFAULT PRIVILEGES FOR ROLE webcms IN SCHEMA cmsdata 
    GRANT SELECT,INSERT,UPDATE ON TABLES;

Either of these tasks would previously required writing a script to iterate over each object in the database.

HStore: key-value PostgreSQL

The renaissance of non-relational databases, or "NoSQL movement", has had an influence on PostgreSQL as it has on everyone else. One such influence is the HStore optional module included in version 9.0, which implements a key-value store inside PostgreSQL.

HStore has been available in previous versions of PostgreSQL, but built-in limitations made it only useful for a trivial set of tasks. Those limitations are now gone and there are new features of HStore, like key manipulation functions, as well. Application developers can now use HStore for most key-value tasks instead of adding an additional key-value database to their application infrastructure.

For example, say you wanted to store user profile data as a set of key-value data. First, you'd load HStore, since it's an optional module. Next, you'd create a table:

    CREATE TABLE user_profile (
	    user_id INT NOT NULL PRIMARY KEY REFERENCES users(user_id),
	    profile HSTORE
    );

Then you can store key-value data in that table:

    INSERT INTO user_profile 
    VALUES ( 5, hstore('"Home City"=>"San Francisco","Occupation"=>"Sculptor"');

Notice that the format for the HStore strings is a lot like hashes in Perl, but you can also use an array format, and simple JSON objects will probably be supported in 9.1. You probably want to index the keys in the HStore for fast lookup:

    CREATE INDEX user_profile_hstore ON user_profile USING GIN (profile);

Now you can see what keys you have:

    SELECT akeys(profile) FROM user_profile WHERE user_id = 5;

Look up individual keys:

    SELECT profile -> 'Occupation' FROM user_profile;

Or even delete specific keys:

    SELECT profile - 'Occupation' FROM user_profile WHERE user_id = 5;
    UPDATE user_profile SET profile = profile - 'Occupation' 
    WHERE user_id = 5;

JSON & XML explain plans

Like other enterprise SQL databases, PostgreSQL allows users to examine how each query is to be executed in order to optimize database access. The query plan is called an "EXPLAIN plan", and previously was available only as an idiosyncratic text format. For example, a simple single-table sort query might be displayed like so:

     Sort  (cost=1.05..1.06 rows=305 width=11)
       Sort Key: forum_name
       ->  Seq Scan on forums  (cost=0.00..1.03 rows=305 width=11)  

While the above text is fine for a database administrator who is troubleshooting specific queries, it is terrible for any form of automated processing, especially since the text changes slightly for each version of PostgreSQL. So developers Robert Haas and Greg Sabino-Mullaine added some new formats for EXPLAIN plans, including XML:

    <explain xmlns="http://www.postgresql.org/2009/explain">
      <Query>
	<Plan>
	  <Node-Type>Sort</Node-Type>
	  <Startup-Cost>1.05</Startup-Cost>
	  <Total-Cost>1.06</Total-Cost>
	  <Plan-Rows>3</Plan-Rows>
	  <Plan-Width>11</Plan-Width>
	  <Sort-Key>
	    <Item>forum_name</Item>
	  </Sort-Key>
    ...

and JSON:

    QUERY PLAN
    [
      {
	"Plan": {
	  "Node Type": "Sort",
	  "Startup Cost": 1.05,
	  "Total Cost": 1.06,
	  "Plan Rows": 3,
	  "Plan Width": 11,
	  "Sort Key": ["forum_name"],
	  "Plans": [
    ...

This feature is designed to encourage the development of new tools to analyze explain plans and suggest database improvements.

Changes in the PostgreSQL project

It's not just the PostgreSQL database which is changing rapidly; there have been changes in the project and community as well.

During September, the PostgreSQL project is finally moving to the git version control system from CVS. This move was planned before the 9.0 release, but technical difficulties have prevented it from being completed yet; this may be the topic of another article. The PostgreSQL home page is also migrating from an ah-hoc web framework written in PHP to the popular Django web framework.

Most of all, though, what has changed is project processes and attitudes, which have allowed the PostgreSQL project to add dozens of new contributors over the last two years. Selena Deckelmann, PostgreSQL major contributor, described the changes this way:

We started the process toward 9.0 last year when we added new committers and invited many new people into the CommitFest process (our way of getting lots of patches reviewed, approved and committed every two months). What we've found is that we can engage new developers by providing a clear way for them to help in small, well-defined ways.

As a group, we work really hard to recruit and maintain long-term relationships with developers, and that investment in people has paid off really well in 9.0. We have long term commitments from volunteers and independent businesses to implement features that take multiple years to see through to completion. The binary replication is a clear example of that, and we have many other projects underway that are only possible because developers trust our core development team to see them through.

The future of PostgreSQL

The PostgreSQL project's developers are already on to version 9.1; in fact, the second CommitFest for 9.1 has already started. Features currently under development include synchronous replication, SQL/MED federated tables, security label support, per-column collations, predicate locking, benchmarking tools, and new XML functions. Additionally, Google Summer of Code students created draft patches for JSON support and the MERGE operation.

"We have seen incredible increase in the volume of activity within the PostgreSQL community, in user base, development, and infrastructure." commented core team member Bruce Momjian.

The future certainly seems bright for what was once the "other open source database".

Comments (39 posted)

Brief items

Quote of the week

I recognize that my contributions to GNOME aren't necessarily obvious - I don't show up with patches to GNOME components. But my goals in life are very compatible with GNOME's, as they explicitly include the idea of getting GNOME onto as many devices as possible. Perhaps a 10x15 follow-on to Jdub's 10x10. Work that we do, while it may not begin as thoughts on a GNOME mailing list, is always aimed at being part of the broader community. I've always thought of the Ubuntu Netbook interface, now Unity, as a GNOME interface, and we've gone to lots of trouble to be true to that ethos when there were often more expedient options.
-- Mark Shuttleworth talks his way onto planet.gnome.org

Comments (none posted)

Codec2 low-bandwidth voice codec released

Codec2 is a codec intended for use with voice data where bandwidth is at a premium. "Currently it can encode 3.75 seconds of clear speech in 1050 bytes, and there are opportunities to code in additional compression that will further reduce its bandwidth." The 1.0 release is currently available.

Full Story (comments: none)

conf2py released

The conf2py (seemingly with no version number) has been released. It is an application for conference management; its features include single sign-on support, configurable billing policies, paper submission and review, and more.

Full Story (comments: none)

Diaspora source released

The Diaspora project, working on privacy-aware social networking, has made its first source release. "Much of our focus this summer was centered around publishing content to groups of your friends, wherever their seed may live. It is by no means bug free or feature complete, but it an important step for putting us, the users, in control. Developers, our code is on github, our tracker is public, we have a developer mailing list, and we are happily accepting patches." (Thanks to Tom Arnold).

Comments (33 posted)

Firefox, Thunderbird and SeaMonkey updates released

Firefox 3.6.20 3.6.10 and 3.5.13, Thunderbird 3.1.4 and 3.0.8, and SeaMonkey 2.0.8 have been released. This is a single-fix update fixing a startup crash experienced by some users.

Comments (1 posted)

Qt 4.7.0 released

The Qt 4.7.0 release is out. "Although it's a little more than nine months since Qt's last feature release (4.6.0 on December 1st, 2009), the seeds of some of the new stuff in 4.7 were sown much earlier. Indeed, many of the ideas behind the biggest new feature in Qt 4.7.0, QtQuick, were born more than two years ago, not long after Qt 4.4 was released. We hope you'll benefit from the effort and care that went into bringing the implementation of those ideas to maturity." See the "What's new in Qt 4.7" page for more information on this release.

Comments (none posted)

Newsletters and articles

Development newsletters from the last week

Comments (none posted)

An Ecology Of Ardour (Linux Journal)

Dave Phillips has some news about Ardour, an open-source digital audio workstation. "Ardour3 is a significant evolution from previous versions of the program. Perhaps the most anticipated enhancement is the addition of MIDI track support. Standard MIDI files can be loaded, edited, and saved as integral parts of your projects, or you can record MIDI data directly. If you're the meticulous type you can choose to enter MIDI events by hand, one by one. Another welcome feature is the expanded MIDI synchronization support. Ardour3 now supports MIDI Clock, a method that is especially useful for synchronizing devices such as external arpeggiators and synthesizers that use its timing information to control certain aspects of their sounds."

Comments (1 posted)

Page editor: Jonathan Corbet
Next page: Announcements>>

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds