A preview of PostgreSQL 9.5
The PostgreSQL developers voted at the 2015 developer meeting to release an alpha version of PostgreSQL 9.5 as soon as possible. As it turns out, that meant July 2nd, so the 9.5 alpha is available now. While this is only a preview release of 9.5, it's full of cool features for database geeks, including "UPSERT", more JSONB utilities, row-level security, and better multicore scaling. However, long-time PostgreSQL users will notice that 9.5 is a bit behind schedule at this point.
Why an alpha?
One question users have is why the PostgreSQL project isn't issuing a beta release at this point, as it normally would in June or July. The answer has to do with the reliability bugs the database has suffered over the last six months. Because of those problems, the PostgreSQL committers have become very cautious about new features that might cause unexpected reliability or security issues, and want to reserve the right to modify or cancel features before the final release. Historically, the project has tried to freeze all APIs by the first beta; the alpha label signals that it is not doing so this time.
The primary reliability issues for the database centered around special transaction counters called "multixacts" that help the database keep track of data visibility when multiple concurrent sessions touch the same rows. Changes made to the multixact mechanism in PostgreSQL 9.3 in order to make foreign keys more efficient had a number of unexpected side effects, including some data-loss bugs that led to multiple update releases within a relatively short period. One of the things that may delay PostgreSQL 9.5 is that developers are still working on the last known issues with multixacts, and plan to fix them before the final release.
One of PostgreSQL's most competitive advantages is its reputation for never losing data. As such, the developers also want to make certain that none of the new features that change data storage or the database transaction log will cause data loss as a side effect. That means more pre-release testing than prior releases received. One such at-risk feature is automated transaction log compression to reduce I/O; the other is UPSERT.
UPSERT
One feature that application developers switching from MySQL to PostgreSQL have missed is "INSERT ON CONFLICT UPDATE", otherwise known as UPSERT. This SQL feature frees developers from worrying about whether the row they are adding is new or not, simplifying application programming. It can also eliminate multiple round-trips to the database to check whether another user has concurrently added the same information.
PostgreSQL finally has UPSERT, thanks to Heroku engineer Peter Geoghegan. The version of the syntax in the alpha looks like this:
INSERT INTO users (user_id, email, login)
VALUES ( 1447, 'josh@postgresql.org', 'jberkus' )
ON CONFLICT (user_id)
DO UPDATE SET email = EXCLUDED.email,
login = EXCLUDED.login;
What the above says is: insert this user if it is not already present, but if the user ID is already in the table, then update the email and login fields instead. The special EXCLUDED pseudo-table refers to the values of the row that was proposed for insertion but conflicted, which allows UPSERT to work with batch imports of data as well as single rows.
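As a sketch of the batch case (assuming the users table from above, plus a hypothetical second row), the same statement handles a mix of new and conflicting rows in one pass:

```sql
-- EXCLUDED carries the proposed values for whichever row conflicted,
-- so a multi-row insert can mix fresh inserts and updates.
INSERT INTO users (user_id, email, login)
VALUES ( 1447, 'josh@postgresql.org', 'jberkus' ),
       ( 1448, 'alice@example.com', 'alice' )   -- hypothetical second row
ON CONFLICT (user_id)
DO UPDATE SET email = EXCLUDED.email,
              login = EXCLUDED.login;
```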
For other situations, you can also choose to DO NOTHING. For example, imagine that you're collecting "likes" for a social web application; if a user tries to insert the same "like" twice, you simply want to ignore it, like so:
INSERT INTO likes ( user_id, video_id )
VALUES ( 1447, 20135 )
ON CONFLICT DO NOTHING;
Adding UPSERT to PostgreSQL has taken Geoghegan more than two years of work, including a lot of "back to the drawing board" redesigns. One of the reasons for the long development period, as well as the long wait for a feature that has been in MySQL for years, is the project's need to accommodate power users. In addition to the above examples, the new UPSERT supports designating specific table constraints as the conflict target, and creating arbitrarily complex rules for handling the rejected data, including nesting conflict checks through subqueries and WITH clauses. The new feature was also required to work seamlessly with replication and the new row-level security. Also, since UPSERT is not part of the SQL standard, the project spent a lot of time arguing about the desired syntax for the feature.
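For instance, a power user can name the conflicting constraint explicitly and attach a condition to the update. This sketch assumes the users table from above has a primary-key constraint named users_pkey and a hypothetical updated_at column:

```sql
-- Name the unique constraint directly, and only overwrite rows
-- when the incoming data is newer than what is already stored.
INSERT INTO users (user_id, email, login, updated_at)
VALUES ( 1447, 'josh@postgresql.org', 'jberkus', now() )
ON CONFLICT ON CONSTRAINT users_pkey
DO UPDATE SET email = EXCLUDED.email,
              login = EXCLUDED.login,
              updated_at = EXCLUDED.updated_at
WHERE users.updated_at < EXCLUDED.updated_at;
```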
More than anything, though, the difficult part of developing the feature was making it work correctly in high-concurrency environments. UPSERT needed to yield the correct result and not corrupt data even if 50 users were trying to UPSERT the same row at the same time. This need for "bulletproof concurrency" has been the biggest thing delaying the feature, as well as the biggest reason for concern by committers about it causing future unanticipated issues. Regardless, the long wait has resulted in a much more powerful UPSERT feature than Geoghegan originally specified, so it was probably worth the wait.
PostgreSQL as a document database
While it is adding new SQL features, the project also seems to be hard at work re-implementing itself as a "NoSQL" database competitor. While the project has some grand plans for document-database support in future years, version 9.5 includes a batch of new built-in functions and operators that make PostgreSQL a better JSON document database right away. Various developers have also been creating external tools to make it friendlier to non-relational applications. JSON is a standard serialization format for object data, and JSONB is PostgreSQL's binary storage format and data type for that kind of data.
The central new built-in function is jsonb_set(), a function that allows users to update any arbitrary key within a nested JSONB document, for example:
SELECT profile FROM profiles WHERE user_id = 1447;
'{ "type" : "i", "clubs" : { "chess club" : { "role" : "member" } } }'
UPDATE profiles
SET profile = jsonb_set(profile, ARRAY['clubs','chess club'],
'{ "role" : "chair" }', TRUE )
WHERE user_id = 1447;
SELECT profile FROM profiles WHERE user_id = 1447;
'{ "type" : "i", "clubs" : { "chess club" : { "role" : "chair" } } }'
The jsonb_set() statement above would add the '"chess club" : { "role" : "chair" }' document to the user's list of clubs nested inside their profile (which is a JSONB column in the table), or update their chess club membership to "chair" if they were already a member. Since it allows users to modify nested keys "in place" without parsing the entire JSONB document in the application, or installing the PostgreSQL PL/v8 extension to run JavaScript inside the database, this feature allows users to run much more meaningful document database workloads on PostgreSQL. In addition to jsonb_set(), 9.5 includes new functions and operators that support JSONB concatenation, key deletion, and aggregating data in tables into complex JSONB objects.
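For instance, the new concatenation (||) and key-deletion (-) operators work directly on JSONB values:

```sql
-- "||" merges two JSONB values, overwriting duplicate keys:
SELECT '{ "type" : "i" }'::jsonb || '{ "verified" : true }'::jsonb;
-- result: {"type": "i", "verified": true}

-- "-" removes a top-level key:
SELECT '{ "type" : "i", "temp" : 1 }'::jsonb - 'temp';
-- result: {"type": "i"}
```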
While the built-in functions allow users to do a lot, they don't support data searches of arbitrary complexity for applications involving large populations of JSONB documents. JsQuery, released this year by PostgreSQL's contributor team from Moscow, adds a special JSONB search language to PostgreSQL, and new indexes to support it. This new language supports wildcards, range searches, and boolean logic. For example, if you wanted to search for the chair of the chess club, you could run this JsQuery:
SELECT user_id FROM profiles
WHERE profile @@ 'clubs."chess club".role = chair';
PostgreSQL 9.5 with JsQuery can therefore be used by developers who want to abandon the relational model entirely and just store a collection of documents in the database. Several projects have been created over the last couple of years to take advantage of this and wrap PostgreSQL in a NoSQL API, both to ease migration from MongoDB and other databases, and to allow the creation of "hybrid database applications" with both non-SQL and SQL-based access. One of the most recent of these projects is BedquiltDB by Shane Kilkelly, which supports users who want to use the MongoDB syntax to modify and search data in PostgreSQL.
For users who prefer a fully relational database while supporting a document-oriented API, ToroDB, by Spanish consulting firm 8KData, is also new this year. ToroDB accepts data requests and updates using the MongoDB protocol. Data is automatically decomposed into relational tables and transformed into JSON documents for client requests. At the Big Data Spain conference, developer Álvaro Hernández Tortosa claimed that this kind of hybrid database is more flexible and scales better for very large data sets than pure non-relational approaches.
Regardless of which tools end up being the most successful, it seems that the PostgreSQL community plans to take on a lot of current and future document database workloads. The next couple of years of competition with non-relational databases should be interesting.
Row-level security
For the last three major PostgreSQL releases, the project has been adding features to allow increasingly specific data security controls. These have included column-level permissions, security "labels", integration with SELinux, and, in 9.5, row-level security (RLS). RLS allows administrators to specify rule-based permissions for each individual row in a table; the feature is also known as "virtual private database" and "fine-grained security". RLS has been in demand by users with strong security needs around their data, such as credit-card processors and healthcare companies.
RLS is disabled by default on PostgreSQL tables. However, it's easy to enable and the syntax is straightforward. For example, say you wanted to allow users to read their own profiles, but not other people's. You could take advantage of the special database variable current_user and check that the current database user matched the login column of the table profiles, like so:
ALTER TABLE profiles ENABLE ROW LEVEL SECURITY;
CREATE POLICY read_own_data ON profiles
FOR SELECT USING (current_user = login);
Much more sophisticated policies are possible, including arbitrary constraints and setting specific policies for specific database login roles. RLS can also be combined with column permissions to effectively give "cell-level" permissions control.
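As a sketch of that combination (assuming the profiles table from above, plus a hypothetical hr role), a second policy can give a privileged role broader row access while column permissions restrict which columns ordinary users may read:

```sql
-- Let the hr role read every row, not just its own:
CREATE POLICY hr_read_all ON profiles
    FOR SELECT TO hr
    USING (true);

-- Column permissions then limit everyone else to two columns:
REVOKE SELECT ON profiles FROM PUBLIC;
GRANT SELECT (user_id, login) ON profiles TO PUBLIC;
GRANT SELECT ON profiles TO hr;
```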
For the past three years, a lot of the work to bring about RLS has been driven by NEC engineer KaiGai Kohei. In the 9.5 development cycle, that work was taken up by Dean Rasheed and Stephen Frost. While Frost's involvement in security features for PostgreSQL is longstanding, another reason for his interest in RLS became apparent on May 22. On that day, the US National Reconnaissance Office (NRO) announced that it was rolling out a relational, geospatial database solution that would support "multilevel security" — the first of its kind. The partnership to deliver this database consists of Lockheed Martin, Red Hat, Seagate, and Frost's employer, Crunchy Data Solutions.
Multilevel security (MLS) is a design for data control that centers around the idea that different personnel should be able to see different data based on their clearance level. Lower-level staff should not even be aware that high-level data exists, and in some cases should be given misleading data in its place. MLS is popular with intelligence agencies, some of whom have been looking to add it to PostgreSQL as far back as 2006. The NRO, which manages US spy satellites, is an obvious user of such a system.
According to various press releases, Red Hat SELinux security policies combined with PostgreSQL RLS deliver effective MLS for the agency's Centralized Supercomputing Facility. Exact details on the implementation are not yet available, but the NRO seems prepared to put all of the code for it into GitHub projects. This appears to be part of a trend in the US Department of Defense toward releasing various components as open source, showing that at least one part of the US government believes that open is also more secure.
Even if you don't work for an intelligence agency, though, there are uses for RLS for securing more mundane data like password tables.
Multicore scalability
An area that the PostgreSQL project works on constantly is multicore scalability. While developers are working on scale-out to multiple servers, users also want to run PostgreSQL on bigger and bigger machines. Version 9.5 will bring substantial improvements in read-request throughput on high-core-count machines, such as the IBM POWER8 machines on which PostgreSQL 9.5 was tested. As these machines offer 24 cores and 192 hardware threads, they make a good target for multicore scalability work. IBM's Stewart Smith has been using the same kind of system to push MySQL up to one million queries per second.
PostgreSQL 9.4 peaked at around 32 concurrent requests, with overall throughput dropping beyond that even if there were idle cores available. According to EnterpriseDB engineer Amit Kapila, multicore scalability is a matter of improving lock handling: eliminating as many locks as possible, and reducing the cost of the others. To this end, Andres Freund rewrote PostgreSQL's "lightweight locks" (LWLocks) mechanism to use atomic operations on processors where they are supported, instead of spinlocks. This reduced CPU contention caused by waiting for locks as well as speeding up the process of acquiring a lock.
To further improve throughput, Robert Haas reduced the amount of time that the database holds locks in order to evict buffers from memory, and increased the number of mapping partitions for buffers from 64 to 128. That work, combined with the LWLock improvement, means that PostgreSQL 9.5 now scales smoothly to 64 concurrent requests and delivers up to double the read-only throughput of 9.4, increasing from 300,000 transactions per second to over 500,000 in Kapila's tests [PDF]. Note that Kapila is using a different benchmark than Smith is, so the PostgreSQL and MySQL numbers are not directly comparable.
In version 9.5, the developers have also decreased memory requirements per backend and added transaction log compression to improve memory and I/O performance. Work in PostgreSQL 9.6 is now focusing on other areas of less-than-optimal performance on large servers, such as those with large amounts of memory. Memory management on servers with over 256GB of RAM is inefficient, sometimes causing large amounts of RAM to have little or no benefit for users. Ideas to fix this are under discussion.
Conclusion
There are, of course, more features than the above. The foreign data wrappers facility now supports importing remote database schemas, partitioning across multiple foreign tables, and using index scans on the remote database. The SKIP LOCKED query qualifier makes PostgreSQL a better database for storing application queues. Replication and failover have become more reliable with the pg_rewind tool and other changes.
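For example, SKIP LOCKED lets multiple queue workers each claim a different pending job without blocking on one another's locks (the jobs table here is hypothetical):

```sql
BEGIN;
-- Claim one unprocessed job; rows already locked by other
-- workers are skipped rather than waited on:
SELECT job_id, payload
FROM jobs
WHERE status = 'pending'
ORDER BY job_id
LIMIT 1
FOR UPDATE SKIP LOCKED;
-- ... process the job and mark it done, then ...
COMMIT;
```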
Version 9.5 also includes a bundle of features targeting "big data" use cases. These include: Block Range Indexes (BRIN) for indexing very large tables, faster data sorting, and data analytics features GROUPING SETS and CUBE. These will be covered in an upcoming article when 9.5 beta 1 is released.
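As a one-line taste of the new analytics syntax (using a hypothetical sales table), CUBE produces subtotals for every combination of the grouping columns in a single query:

```sql
-- Totals per (region, product), per region, per product, and overall:
SELECT region, product, sum(amount)
FROM sales
GROUP BY CUBE (region, product);
```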
The PostgreSQL project expects to release a beta every month starting in August, until 9.5 is ready for release sometime in late 2015. Historically, the project has released new versions in September, but due to falling behind schedule, mid-to-late October is considered more likely at this point. In the meantime, the alpha release is available for download, including as a Docker container. The PostgreSQL project would like you to try it out for yourself, see if the features are what they're promoted to be, and report a few bugs while you're at it.
| Index entries for this article | |
|---|---|
| GuestArticles | Berkus, Josh |
Posted Jul 8, 2015 18:47 UTC (Wed)
by petergeoghegan (guest, #84275)
[Link] (7 responses)
"""
This need for "bulletproof concurrency" has been the biggest thing delaying the feature, as well as the biggest reason for concern by committers about it causing future unanticipated issues. Regardless, the long wait has resulted in a much more powerful UPSERT feature than Geoghegan originally specified, so it was probably worth the wait.
"""
To be clear, the requirements for UPSERT -- that it not exhibit various types of anomalies [1] -- were requirements that I insisted on from an early stage. I was successful in convincing Heikki Linnakangas that certain things were necessary fairly early on (in particular, the avoidance of what I called "unprincipled deadlocks", but also other anomalies that SQL MERGE seemingly allows). Acceptance of these requirements from other community members (those initially less involved) came much later.
While it's certainly true that much of the difficulty in implementing the feature came down to the tangential issue of how the core "value locking" mechanism should work, as well as the syntax and interactions with other, existing features (e.g. updatable views, triggers, exclusion constraints), I was more concerned about the avoidance of concurrency anomalies than anyone else. I was the one who insisted on the fundamental user-visible requirements for the feature loudest and earliest. The issues started to become clearer when I began maintaining that enormous Wiki page at the suggestion of Simon Riggs. That, more than any other factor, is why it took so long -- the fundamental requirements made discussion of implementation issues hard, and vice-versa. There were some pretty counter-intuitive implications of these fundamental requirements [2].
Hacker News reached out to Andres [my co-author], requesting that we do a blog post on UPSERT development due to general interest from their community. I will get around to doing this at some point. For now, perhaps it suffices to say that the feature is complicated enough that a thrifty approach involving iterative prototyping was the only approach that I thought had *any* hope of being successful, precisely because the requirements were so unclear.
[1] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_impleme...
[2] https://wiki.postgresql.org/wiki/UPSERT#Miscellaneous_odd...
Posted Jul 8, 2015 20:50 UTC (Wed)
by jberkus (guest, #55561)
[Link] (1 responses)
Seriously, I didn't realize half of what was implemented for UPSERT until I wrote this article. It satisfies most of my personal use-cases for MERGE as well. It's really impressive.
Posted Jul 8, 2015 22:05 UTC (Wed)
by petergeoghegan (guest, #84275)
[Link]
The way I might now explain SQL MERGE is by making a comparison to GROUPING SETS in PostgreSQL.
GROUPING SETS are very useful, but do not allow you to do anything that was not already fundamentally possible with a single SQL statement before 9.5 (you could accomplish something equivalent with a bunch of UNION ALLs in earlier versions). GROUPING SETS are very useful because the syntax is a more direct and natural way of expressing a common requirement, and because internally, the output from one grouping set can be reused for another, making performance far superior. But the GROUPING SETS feature plays by the same rules as conventional queries, and comes from the SQL standard, and so is not a controversial idea.
MERGE is similar. It's nice to be able to express a common requirement (reconciling two tables) more directly, and it's really nice to be able to have that reconciliation driven by a conventional join, with all of the attendant performance benefits. But that does not (and, for all practical purposes, cannot) have anything to do with *guarantees* about outcome. You can accomplish the same thing with data-modifying WITH clauses, even if that is ugly and potentially slow.
How does this relate to UPSERT? MERGE (and the data-modifying WITH approach) have certain problems that are just inherent to how they need to work. "Is some row version in the target not visible to your MVCC snapshot? Guess that means that you'll get a duplicate violation when the WHEN NOT MATCHED THEN INSERT handler is taken." (and so on)
I wanted to give PostgreSQL users an easy way to express a "simple" requirement, giving them a guarantee about an *outcome*, which is what makes UPSERT easy to use correctly. That necessarily implies that if you pay close attention to what I came up with, you can see that to a limited extent it takes liberties with MVCC semantics (in READ COMMITTED mode). We don't think that's a problem now, but issues like that certainly made for a difficult and protracted discussion. SQL MERGE definitely has some upsides for bulk loading and so on, but I think it would have been a far less worthwhile (and far easier) project.
Posted Jul 16, 2015 10:26 UTC (Thu)
by chojrak11 (guest, #52056)
[Link] (4 responses)
MERGE is not only about UPSERT; it's much *much* more than that. It allows you to INSERT, UPDATE, and DELETE at the same time. At the SERIALIZABLE isolation level it's very safe. It allows for infinite flexibility:
- 2 WHEN MATCHED clauses: one with a condition, the second one unconditional: to update target (THEN UPDATE), delete matching rows (THEN DELETE) or ignore some changes (by using the condition)
- 2 WHEN NOT MATCHED BY TARGET clauses: with/without condition: to INSERT interesting data, or ignore some source rows (by using the condition)
- 2 WHEN NOT MATCHED BY SOURCE clauses: with/without condition: to mark target rows as deleted (THEN UPDATE), or really delete (THEN DELETE)
- OUTPUT clause to retrieve updated/inserted rows
- using CTEs
- it can have complex source structure, using JOINs, lateral joins, not only simple SELECT.
It's so powerful I'm unable to comprehend how the dev team decided to not implement it. Combined with OUTPUT clause it's the ultimate UPSERT/DELETE machine with auditing. I can't imagine not having MERGE in my daily work.
With MERGE it's even possible to do full Slowly Changing Dimension type 2 processing (including inferred-member handling) during a data warehouse load using just one statement! And SCD2 is sometimes UPDATEs, sometimes INSERTs. It's important, because you're doing it using one source object scan (which may be a heavy view, for example). There's even a tool, 'SCD MERGE Wizard', to assist with creating this MERGE, because its syntax is not easy, but the task is doable.
I recommend reading this: https://msdn.microsoft.com/en-us/library/bb510625.aspx which is the documentation for the best MERGE implementation I'm aware of.
And MERGE is the standard anyway.
If PostgreSQL wants to keep its reputation as the most advanced relational database management system, it needs MERGE. It needs it badly.
Posted Jul 16, 2015 20:01 UTC (Thu)
by dlang (guest, #313)
[Link] (3 responses)
Unlike MySQL, PostgreSQL chooses to not implement something that it can't implement reliably.
Posted Jul 20, 2015 22:18 UTC (Mon)
by chojrak11 (guest, #52056)
[Link] (2 responses)
Posted Jul 20, 2015 23:00 UTC (Mon)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
Nobody argued against also implementing MERGE. Nobody said it was impossible. It was just not what was most desired by the involved developers (including me). If you, or somebody else, is really interested in having MERGE in postgres, please help out with time or other resources.
Posted Jul 21, 2015 10:07 UTC (Tue)
by chojrak11 (guest, #52056)
[Link]
MERGE also works with lower isolation levels, but it can cause locking problems (no different than individual INSERT, UPDATE, DELETE), so to be 100% sure that nothing interferes with it, SERIALIZABLE is required.
I'd love to help on MERGE with PostgreSQL, I just think I'm not skilled enough. However, after your comment I'll try to look at what I can do and whether I can jump over that barrier :-) The thing is that if you've already implemented UPSERT, it seems very close to a fully working MERGE. Simplifying things, all the infrastructure is already in place; as I said, it's no different than the three individual INSERT, UPDATE, DELETE statements (with some conditions), just combined into one statement and possibly interleaved. However, I might be wrong, as I haven't done any real RDBMS development.
Posted Jul 9, 2015 9:34 UTC (Thu)
by bradh (guest, #2274)
[Link] (1 responses)
https://groups.google.com/forum/#!topic/mil-oss/9ElRIg_V9_U provides the story. It isn't clear to me how much this affects the PSQL Row Level Security / MLS stuff.
Posted Jul 9, 2015 16:33 UTC (Thu)
by jberkus (guest, #55561)
[Link]