PostgreSQL version 9.2
was released on September 10th, with many enhancements that web application developers—and companies who host web applications—are excited about. One of the most excited is Ruby on Rails and PHP application host Engine Yard, who recently switched its default database option from MySQL to PostgreSQL. Combined with recent data on database migrations, Engine Yard's switch of databases signifies a sea change in the long rivalry between PostgreSQL and MySQL, as well as in Ruby on Rails development.
For coverage of the PostgreSQL 9.2 features, see the LWN article on PostgreSQL 9.2 Beta.
For information about the switch, I interviewed Engine Yard Lead Data
Engineer Ines Sombra. But first, some background. Readers who are already
familiar with Rails, MySQL, and PostgreSQL can skip down to the interview.
Ruby on Rails and MySQL
Ruby on Rails is a full-stack web
framework, based on automated code generation. Rails is implemented in Ruby, a programming language which
existed for ten years before Rails was introduced in 2004. It's "full
stack" because it handles all parts of a web application above the database
and operating system level, including querying the database, generating HTML pages, and responding to HTTP requests.
Rails has become one of the top five web development platforms because of
its rapid and beginner-friendly web application building. The
centerpiece of this is a code-generation engine, which creates the initial
"scaffolding" for the web application developer, using the
Model-View-Controller (MVC) design pattern, saving the developer a lot of
time by handling many of the repetitive tasks that are normally required. The simplest Rails applications perform what is known as CRUD, for Create, Read, Update, and Destroy records in a database.
Early Rails supported only the MySQL database system, since it was regarded
as the simplest, most developer-friendly SQL database available. That
allowed developers to focus on keeping all of the application logic inside Rails. While there were early attempts to introduce PostgreSQL support, none of them really caught on until Rails 3.0 was released, so the vast majority of Rails developers used only MySQL through 2010.
Rails hosting, Engine Yard, and Heroku
While Rails made creating and developing a web application much easier than
before, it did nothing to reduce the difficulty of hosting a web
application. If anything, the highly dynamic nature of Rails makes it one
of the most resource-consumptive web frameworks, requiring skill and
experience in deployment, scaling, and uptime maintenance. Recognizing
this dichotomy in 2006, Tom Mornini, Lance Walley, and Ezra Zygmuntowicz
founded Engine Yard, a "fully managed web host" or "cloud host" for Rails projects. Engine Yard allowed the developer to just write code, and leave installation, scaling, and uptime to others — for a monthly hosting fee.
Initially, Engine Yard supported only MySQL for data storage, and
hired a large, expert MySQL team to manage thousands of
MySQL databases. Engine Yard built a sophisticated "database as a service" infrastructure on MySQL to support its customers. As the number one Rails hosting option, this meant that the majority of hosted Rails applications were not just on MySQL, but on Engine Yard's MySQL.
However, a year after Engine Yard launched, another team of
a competing Rails hosting service: Heroku.com. Heroku introduced a Git-centric
model of application deployment that appealed to startups practicing agile
development, and by 2010 it had grown into a strong competitor to
Engine Yard for Rails users. Unlike Engine Yard, Heroku used PostgreSQL as
its default—originally its only—database for customers. Heroku began
promoting PostgreSQL among Rails developers and its user base,
culminating with the introduction of a
PostgreSQL cloud hosting service this year.
Today, both Engine Yard and Heroku support additional platforms, such
as PHP, Python, and Java, in addition to Rails.
Migrations from MySQL to PostgreSQL
In April of this year, Engine
Yard introduced PostgreSQL 9.1 as an option for its users. In August,
Engine Yard announced that, with the release of PostgreSQL 9.2, it would become the default database option for new applications.
Engine Yard's users are not alone in migrating from MySQL to PostgreSQL.
451 Research recently released a subscription-only study, called "MYSQL
VS. NOSQL AND NEWSQL: 2011-2015", that looked at
tech sector MySQL users and their plans two years after the Oracle
acquisition. For the one out of six MySQL users planning to migrate away
from MySQL, the most popular option is PostgreSQL. In the Rails community, Planet Argon found that 60% of Rails developers now preferred PostgreSQL, up from 14% just three years ago.
Interview with Ines Sombra
To fill us in on the details of Engine Yard's move from MySQL to
PostgreSQL, I interviewed Ines Sombra. Ines, who was born in Argentina,
became a Rails developer and enthusiast while working at Engine Yard.
Josh: What's your role at Engine Yard?
As the Lead Data Engineer, I'm fortunate to work with a
fantastic cross-functional team of engineers and DBAs [database administrators] to formulate Engine Yard's data storage strategy. We help customers meet their data needs and develop products to bring emerging data storage technologies into our platform. Our work also includes maintenance of all supported databases.
Josh: What was the primary motivation for Engine Yard to offer PostgreSQL support?
Our primary motivation is to provide an extensive and robust stack and give our customers a wide array of excellently supported choices. MySQL had traditionally been our only supported database on Engine Yard Cloud but we decided to expand our stack with PostgreSQL to provide:
- Feature parity with Engine Yard Managed where PostgreSQL has always been supported.
- Access to flexible and reliable replication modes: Streaming replication and hot standby in major version 9.0 were huge. Hot standby, in particular, was one of the final points of oft-asked-for, and needed, feature parity with MySQL.
- Rails 3.1 came out with significant performance enhancements using PostgreSQL's prepared statements.
- Versatility of natively supported features like full-text search, geolocation, and foreign data wrappers reducing the need for third party tools in our clients' deployments.
- Outstanding third party support options for our customers, both commercially and from the PostgreSQL community.
PostgreSQL also was internally loved, since the majority of our internal applications are already running on Postgres. We truly believe that PostgreSQL is the future of relational open-source databases and we are happy to provide our customers with great support for it.
Josh: Do you expect Engine Yard customers currently using MySQL to migrate to the new PostgreSQL databases?
We do, a few of our largest customers (with established MySQL applications) have already inquired about the feasibility of migrating to PostgreSQL. We are working with them to address their needs and find ways to better perform these types of migrations.
While we will always support customers using MySQL, we expect the number
of new PostgreSQL applications to grow as the new default makes it
easier than ever to get started.
Josh: What was the timeline of this change? How long did it take?
It took us a little over a year to roll out our new PostgreSQL infrastructure. We started in Q2 2011 and our release schedule included both 9.0 and 9.1 versions ...
Our early PostgreSQL 9.0 release was amongst the most popular we've ever had. Customers started using it in production applications immediately, so we accelerated our engineering processes to better serve their requirements. Within 4 months after 9.1 became available in our platform we were able to make it our new default database.
Josh: What were the biggest technical obstacles you encountered in deploying a PostgreSQL infrastructure, and how did you solve them?
We didn't encounter major technical obstacles per se, but rolling out PostgreSQL support helped us restructure and enhance some of our existing architecture and processes. Here are a few of the notable changes:
Redefined Assumptions: We have traditionally been a MySQL shop and our product made assumptions based on the existence of a MySQL database. Our tests and continuous integration suite assumed that every new environment would have a MySQL database associated with it. Defaulting to PostgreSQL in our codebase allowed us to introduce the concept of a new default. We were able to break away from dependencies by refactoring tests and redefining what we expect from customer environments.
Allowed Extensions: PostgreSQL has a rich extension ecosystem and we want to encourage our customers to explore it. Engine Yard Cloud customers have dedicated instances that can be further customized by applying custom Chef recipes. We provide a way to enable over 30 available extensions on PostgreSQL environments. We curate this repository for all supported versions of PostgreSQL and continually add new extensions based on customer requests. PostGIS and hstore have been the most popular extensions installed.
Standardized Architecture: Engine Yard Cloud sits on top of AWS [Amazon Web
Services] and we rely
on EBS [Elastic Block Storage] volumes for database persistence. Unfortunately, snapshots taken on
32-bit [PostgreSQL] instances cannot be used on volumes mounted in 64-bit architectures. Customers with PostgreSQL databases had to dump and restore their data in order to vertically scale to bigger instance sizes. We solved this problem by rolling out 64-bit instance types for small and medium sizes and defaulting all databases to use a 64-bit architecture.
Ease of Upgrades and DR [disaster recovery]: At the moment we are working to make the process of transitioning between database versions easier and more automated. We are looking at tools (like repmgr) that would allow us to replicate across versions and environments. One of our high priority items is to roll out PostgreSQL 9.2 support based on best practices we've learned along the way and allow customers to upgrade with minimal impact.
Josh: Which 9.2 features are Engine Yard users most interested in, and why?
We are very excited about the features that are included in the PostgreSQL 9.2 release. We think our customers will be particularly interested in native JSON support, covering indexes, replication and performance enhancements.
Over the last year, we have seen an increase in the use of document-oriented databases like MongoDB. With native JSON support, developers have access to the schema flexibility of NoSQL databases while continuing to enjoy the ACID guarantees, operational simplicity, and transactional efficiency of PostgreSQL.
JSON validation in the database helps simplify application logic and ensures that any client that connects to the database will have a consistent way to manipulate and save this type of data. We think this feature alone will be a great upgrade motivator and are looking forward to seeing it live.
[Josh Berkus is a member of the PostgreSQL Core Team.]
to post comments)