It's been in the works for quite a while, but PostgreSQL 8.0 is finally
out the door. To get the full
scoop on 8.0, we spoke to Josh
Berkus, one of the members of PostgreSQL's steering committee, to learn
about PostgreSQL's new features and plans for future development.
The PostgreSQL press release highlights a number of new features and
improvements, including improved memory usage and I/O handling. We
asked Berkus if the PostgreSQL team had any benchmarks to share with
regard to these improvements. Berkus said that the project did not have
benchmarks yet, and that the team had been tinkering with performance
"right up to the release candidate." Berkus did elaborate on
the nature of improvements, however.
The basic idea was to make PostgreSQL a little bit smarter about managing
its own cache and its own memory usage. A lot of that effort was
spearheaded by Jan Wieck, who works for
Afilias... their big interest in
improving memory usage was really to flatten out spikes. One of the tests
at the Open Source Development Labs of online transaction processing where
you see that your peak rates of transaction processing is like 4,000 or
4,800 transactions per minute, but then you have these checkpoint spikes
while the system is doing memory synchronization and the like, suddenly
your throughput rate drops by like 1,000 transactions per minute.... from
the perspective of people supporting interactive Web applications, this is
particularly bad because the customer suddenly sees a 30-second lag where
nothing's happening. A lot of the changes were designed to alleviate that
condition.
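To make the effect of those spikes concrete, here is a toy calculation in Python. The latency figures are made up for illustration and are not PostgreSQL or OSDL measurements; the function name `summarize` is likewise hypothetical. The sketch assumes that spreading checkpoint work out makes ordinary transactions a little slower while making the occasional stall far shorter.

```python
from statistics import mean

# Hypothetical per-transaction latencies in milliseconds (invented numbers,
# not benchmark results). One transaction in 100 stalls behind a checkpoint.
with_spikes = [20] * 99 + [3000]

# Assumed effect of smoothing: the background work is spread out, so ordinary
# transactions take somewhat longer but the stall is much shorter.
smoothed = [45] * 99 + [300]


def summarize(latencies_ms):
    """Return the mean and the worst-case latency for a list of samples."""
    return {"mean_ms": mean(latencies_ms), "max_ms": max(latencies_ms)}


if __name__ == "__main__":
    print("with spikes:", summarize(with_spikes))
    print("smoothed:   ", summarize(smoothed))
```

With these invented numbers the mean barely moves, while the worst case drops by a factor of ten, which is the kind of change an interactive user would notice.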
Berkus noted that the average transaction time for Web applications may not
go down a great deal, but that the maximum transaction time did go down. He
also said that several developers working on performance tweaks were
pushing for a short development cycle for PostgreSQL 8.1 because they're
"not necessarily satisfied that they're done." Berkus also
pointed out that they would probably never be done improving performance.
Other performance improvements include changes to maintenance routines to
avoid saturating disk I/O. Berkus said that some maintenance routines may
take longer, but would have less of an impact on system performance while
running.
The savepoint feature has changed as well, according to Berkus. Savepoints
allow parts of a transaction to be rolled back without failing an entire
transaction if part of the procedure fails. Berkus said that savepoints
were initially "implemented as nested transactions" but that
the syntax for savepoints is now SQL-compliant.
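For readers who have not used savepoints, the following is a minimal sketch of the idea. It uses Python's built-in `sqlite3` module and an in-memory SQLite database as a stand-in so that it runs anywhere; it is not PostgreSQL, and error handling inside a transaction differs between the two systems. The table and the function name `run_order_batch` are hypothetical. The `SAVEPOINT`, `ROLLBACK TO SAVEPOINT`, and `RELEASE SAVEPOINT` statements are the standard SQL forms, which PostgreSQL 8.0 also accepts.

```python
import sqlite3


def run_order_batch():
    """Insert three rows in one transaction; the second fails and is undone."""
    # isolation_level=None keeps the sqlite3 module from issuing its own
    # BEGIN/COMMIT, so the statements below are sent exactly as written.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)"
    )

    conn.execute("BEGIN")
    conn.execute("INSERT INTO orders (id, item) VALUES (1, 'widget')")

    # Mark a point we can return to if the next step fails.
    conn.execute("SAVEPOINT risky_step")
    try:
        # Duplicate primary key, so this insert is rejected.
        conn.execute("INSERT INTO orders (id, item) VALUES (1, 'gadget')")
    except sqlite3.IntegrityError:
        # Undo only the work done since the savepoint; the first insert
        # and the surrounding transaction both survive.
        conn.execute("ROLLBACK TO SAVEPOINT risky_step")
    conn.execute("RELEASE SAVEPOINT risky_step")

    conn.execute("INSERT INTO orders (id, item) VALUES (2, 'gizmo')")
    conn.execute("COMMIT")

    rows = conn.execute("SELECT id, item FROM orders ORDER BY id").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(run_order_batch())
```

The failed insert is discarded, but the rows added before and after it are committed together.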
Inevitably, PostgreSQL will be compared to "enterprise" databases like DB2
and Oracle. We asked Berkus how PostgreSQL would compare to products like
Oracle and DB2 given the features that were introduced in 8.0. He said that
there were "still plenty of high-end features that they have that we
don't have yet," though each new release of PostgreSQL adds features
that make it "adequate or even superior" for new users. One
feature that PostgreSQL still needs, said Berkus, is multi-master
replication. Right now, there are three separate teams working on two
different forms of multi-master replication, which should be ready within
"a year and a half, if not sooner."
Berkus said that the PostgreSQL project planned to keep replication
facilities, such as Mammoth and Slony,
as add-ons rather than part of PostgreSQL. The reason, according to Berkus,
is that replication "is not a single problem... it's a set of related
problems not all of which should be solved by the same software."
Another feature in 8.0, which may be of little interest to LWN readers, is
the native version of PostgreSQL for Windows. Berkus said that the Windows
release looked to be very popular, judging by early downloads of the
release. We did ask how the performance of PostgreSQL on Windows compared
to performance on Linux or other UNIX-type systems. Berkus said that they
didn't know, since most of the PostgreSQL testing is done through the Open
Source Development Labs, which means that testing is limited to Linux
systems. He did say that he expected that performance on Windows would lag
behind Linux, since PostgreSQL is primarily developed on Unix and POSIX
systems.
What will we see in 8.1? It's too early to tell, but Berkus did mention a
few projects that he's aware of that might be in the works. One issue that
he mentioned is the idea of per-user quotas for PostgreSQL.
Somebody's revived the issue of per-user quotas. People are interested in
it, but the people who are interested don't seem to have the coding talent
to implement it... you don't know how much space something is taking up
without calling a maintenance procedure, so it's a very hard problem to
solve. It's much harder than implementing user quotas on the filesystem.
If the 8.1 release cycle is a short cycle, Berkus says that "a lot
will be deferred to 8.2 because of the requirement for catalog changes in
initdb." Berkus told LWN that the changes were necessary to allow
PostgreSQL to do in-place upgrades rather than requiring users to migrate
data from an older PostgreSQL installation to the new installation.
Currently, the way you upgrade a major version [of PostgreSQL] is to
install the binaries to a new location, prepare the new location and then
you do a backup of the old database and restore onto the new
platform. There are other ways of making this easier, like using
replication to move the data, but it still amounts to running two
PostgreSQLs at once and moving between those two instances. If you happen
to be running a data warehouse with 300 GB of data, it's quite time
consuming... it's one of the things we have on our plate that nobody wants
to work on.
We asked Berkus why PostgreSQL didn't use a timed release cycle, as the
GNOME Project does, rather than a feature-based release cycle. Berkus said
that "nobody's really raised that as an idea" and said that it
would be difficult to do since other projects could release
half-implemented features or features that were still a little buggy, but
PostgreSQL could not. "For us as an enterprise database system, we
can't release anything that could corrupt your data, even a little."
Even if PostgreSQL were to move to a timed release cycle, Berkus said it
would probably be a yearly release cycle rather than a six-month cycle like
GNOME.
Current users of PostgreSQL can count on security and data integrity
patches for the prior two releases (7.3 and 7.4) until the 8.1 release of
PostgreSQL. Berkus added that patches may be released for 7.2 "if the
patch can be released to 7.2 without extra effort." He also said
that support for older versions of PostgreSQL, including backporting new
features, was a role for commercial providers of PostgreSQL and could
be a value-add that vendors offer to their customers, without
making it a "headache for developers."
While PostgreSQL may not have all the features of DB2 or Oracle, the
database is closing the gap between itself and proprietary "enterprise"
database systems. With the 8.0 release, PostgreSQL should be able to find
many more adopters in small and large organizations that are looking to
replace expensive proprietary systems with an open source solution.