February 16, 2011
This article was contributed by Koen Vervloesem
The 2011 FOSDEM conference had a Configuration
and Systems Management developer room on its second day. This first
meeting about configuration management and automation with open source
tools was organized by the people from Puppet Labs and had a focus on
Puppet, but other tools like Chef and Cfengine were also discussed.
Configuration management is about establishing and maintaining
consistency of a system throughout its life. For software, this means that
the system has to track and control all configuration changes, which can be
the contents of files in /etc, the installation of specific
packages, file permissions, users, and so on. Having a configuration
management tool for your systems is useful in a lot of ways: you can
automatically repair a system's configuration after a failure, you can
easily reproduce a specific configuration on another system, you can audit
changes, and, if you pair the configuration management system with a
version control system like Git, you can always return to a known-good configuration if things go wrong. Where configuration management systems really shine is when you have a large number of systems networked together: by automating the configuration, you save the system administrator's time and you're sure that all systems are configured consistently.
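The "automatically repair" idea can be illustrated with a minimal Python sketch. It is not how Puppet, Chef, or Cfengine are implemented; the function name `ensure_file` and the example file contents are hypothetical. The point is only that a resource is described by its desired state, and applying the description a second time changes nothing.

```python
import os
import tempfile


def ensure_file(path, content, mode=0o644):
    """Bring one file to its desired state and return the list of changes made.

    Hypothetical helper for illustration only: real tools manage many
    resource types (packages, users, services), not just files.
    """
    changes = []
    current = None
    if os.path.exists(path):
        with open(path) as f:
            current = f.read()
    # Only rewrite the file when its content has drifted.
    if current != content:
        with open(path, "w") as f:
            f.write(content)
        changes.append("content")
    # Only touch the permissions when they differ from the desired mode.
    if (os.stat(path).st_mode & 0o777) != mode:
        os.chmod(path, mode)
        changes.append("mode")
    return changes


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "ntp.conf")
        # First run creates the file; second run finds nothing to repair.
        print(ensure_file(target, "server pool.example.org\n"))
        print(ensure_file(target, "server pool.example.org\n"))
```

Because every run reports what it changed, the same mechanism that repairs a system also yields an audit trail of configuration drift.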
The big three configuration management systems for Linux are Puppet (used by Red Hat,
Citrix, and the Los Alamos National Laboratory), Chef (used by Engine Yard,
37signals, and Scribd), and Cfengine 3 (used by Facebook, AMD,
and the Joint Australia Tsunami Warning Centre). Puppet and Chef are
broadly similar in architecture, but Puppet has a language designed
specifically for the task of describing resources, while Chef uses the
general-purpose programming language Ruby to configure resources. Also,
Chef seems to be aimed more at developers who want to deploy their web
applications, and it doesn't support as many platforms as Puppet
does. Cfengine is the grandfather of these configuration management systems
(with Cfengine 3 as a total rewrite); its advantages over Puppet and Chef
are a lower memory footprint and higher performance, but in
recent years its popularity has declined. Other configuration management
systems that were present in the developer room were FusionInventory, GLPI, and OPSI.
A meta-distribution
In his case study about Linux system engineering in air traffic control,
Stefan Schimanski showed how scalable Puppet really is and how it can
guarantee reliable mass deployment of the Linux-based, mission-critical
applications needed in air traffic control centers. Air traffic is growing
yearly, so the number of computer systems that have to handle these flights
is also growing, as is the workload for the system
administrators. Moreover, the systems need high availability around the
clock, every day of the year: if they go down for 30 minutes, air traffic control has a really big problem. For example, if a computer in a control center freezes, the operator is essentially blind.
These strong requirements, coupled with the growing number of servers, mean that air traffic control centers need automatic installations of every system with minimal downtime and fast rollbacks. Moreover, all informal requirements documents, written by non-technical people, have to be converted into formal specifications of the system configuration, so that the systems can be standardized and their configuration made reproducible. Therefore, Schimanski rethought his system engineering approach in 2010 and turned to Puppet.
One thing that Puppet makes easy is distinguishing between the abstract
requirements and the concrete implementation. For each node, the system
administrator can define how the node has to be configured in an abstract
way, e.g. by including classes for a desktop node, a server node, a
webserver node, and so on. By reading these node definitions, you can
easily see what the node is supposed to be doing, without having to bother
with the concrete implementation, which is written in separate files for
these classes. For example, the webserver class installs and configures
Apache and also includes the configuration of the server class. Moreover,
according to Schimanski a good Puppet configuration introduces
traceability, which is essential in that kind of environment: "If someone asks where requirement #91 of the requirements document is implemented, it's easy to point out the Puppet code that implements this."
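The split between abstract node definitions and concrete class implementations can be modeled in a few lines of Python. This is only a sketch of the idea, not Puppet's language or Schimanski's code: the node names, resource names, and the mapping of requirement numbers to classes (including #91 landing in the webserver class) are all hypothetical.

```python
# Hypothetical class catalogue: each class lists the classes it includes,
# the resources it manages, and the requirement numbers it implements.
CLASSES = {
    "server": {
        "includes": [],
        "resources": ["package:ntp", "service:sshd"],
        "requirements": [12],
    },
    "webserver": {
        "includes": ["server"],
        "resources": ["package:apache", "service:apache"],
        "requirements": [91],
    },
    "desktop": {
        "includes": [],
        "resources": ["package:xorg"],
        "requirements": [40],
    },
}

# Abstract node definitions: what each node is, not how it is built.
NODES = {
    "web01.example.org": ["webserver"],
    "ws17.example.org": ["desktop"],
}


def expand(class_names, seen=None):
    """Return class names with included classes first, each listed once."""
    seen = set() if seen is None else seen
    ordered = []
    for name in class_names:
        if name in seen:
            continue
        seen.add(name)
        ordered.extend(expand(CLASSES[name]["includes"], seen))
        ordered.append(name)
    return ordered


def catalog(node):
    """Flatten a node definition into the concrete resources to manage."""
    resources = []
    for name in expand(NODES[node]):
        resources.extend(CLASSES[name]["resources"])
    return resources


def where_is(requirement):
    """Traceability: which classes implement a given requirement number?"""
    return [name for name, spec in CLASSES.items()
            if requirement in spec["requirements"]]
```

Reading `NODES` tells you what a machine is for, `catalog("web01.example.org")` shows what that means in practice, and `where_is(91)` answers the traceability question from the quote.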
Another interesting idea that Schimanski introduced in his talk was the
concept of a meta-distribution: the air traffic control systems are
implemented as SUSE Linux Enterprise and Red Hat Enterprise Linux servers,
but the Linux distribution itself is completely interchangeable. The
AutoYaST or Kickstart files of the installation are minimal, and almost all
configuration is done in the form of Puppet modules, e.g. for NTP and other
services. The result is a heavily customized enterprise Linux distribution,
but all these customizations are documented in a completely formal
way. Schimanski explains the rationale behind this approach:
We don't want to depend on one operating system, so if,
hypothetically, Novell stops the development of SUSE Linux
Enterprise, we could migrate our systems to Red Hat Enterprise
Linux or even Ubuntu Server in only four days without redoing all
the configuration work.
To a certain degree, Puppet modules can be written in an operating-system-independent way. There are always some minor differences, such as where the distribution puts its configuration files, but these can be abstracted away with variables that get their value (e.g. the file path) depending on the operating system. Of course, you have to check these little things before migrating to another operating system, so it's not effortless, but according to Schimanski, Puppet makes migrating a lot easier.
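The variable-based abstraction might look like the following Python sketch, using Apache as the example. It stands in for what a Puppet module would do with per-operating-system variables. The family names and the package, service, and path values are illustrative assumptions that would need checking against each distribution release.

```python
# Illustrative per-family values (assumptions to verify for each release).
APACHE_PARAMS = {
    "RedHat": {
        "package": "httpd",
        "service": "httpd",
        "config": "/etc/httpd/conf/httpd.conf",
    },
    "Suse": {
        "package": "apache2",
        "service": "apache2",
        "config": "/etc/apache2/httpd.conf",
    },
    "Debian": {
        "package": "apache2",
        "service": "apache2",
        "config": "/etc/apache2/apache2.conf",
    },
}


def apache_params(os_family):
    """Look up the distribution-specific names for one OS family."""
    try:
        return APACHE_PARAMS[os_family]
    except KeyError:
        raise ValueError("unsupported operating system family: %s" % os_family)


def webserver_resources(os_family):
    """The distribution-independent part: the same three resources everywhere."""
    p = apache_params(os_family)
    return [
        ("package", p["package"]),
        ("file", p["config"]),
        ("service", p["service"]),
    ]
```

Migrating to another distribution then means adding one entry to the table and verifying its values, while the logic that uses them stays untouched.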
The Puppet ecosystem
The talks also showed that there is a nice ecosystem of tools developing
around Puppet. For example, Henrik Lindberg gave a demo of Geppetto, a new Eclipse-based project developing tools
to simplify the process of authoring and using Puppet manifests and
modules. The near-term objectives of the project are flattening the
learning curve for new Puppet users, supporting best practices, and
encouraging the sharing of Puppet modules. Under the hood, Geppetto has a
grammar for the Puppet DSL (Domain Specific Language), written with Xtext. Thanks to Xtext, this also
automatically results in an Eclipse editor that knows the Puppet language
and offers syntax coloring, code completion, code folding, and syntax
errors and warnings. Moreover, when creating a Puppet module you can enter
metadata and choose dependencies, and at the end you can export the module
to a zip file which can be uploaded to the Puppet Forge. The Geppetto
integrated development environment can be downloaded as a stand-alone
product for Linux, Windows or Mac OS X, or as a separate plug-in for
Eclipse.
Another rising star in the Puppet ecosystem is Foreman, presented by its creator Ohad
Levy, who joined the ranks of Red Hat in August 2010 as a principal
software engineer in its cloud team. This project is now a year and a half
old and has 20 contributors; according to Levy, Foreman will at some
point be part of Red Hat's cloud portfolio. Foreman integrates with Puppet
and acts as a web-based dashboard for it, providing real-time information
about the status of hosts based on Puppet reports, statistics, and so
on. Moreover, Foreman takes care of the low-level details of setting up
machines and installing the Puppet client on them, until Puppet is able to
take care of the configuration defined in your Puppet modules. It even
supports creating virtual machines using the libvirt API, with RHEV-M and Amazon EC2
support in the works. The largest installation managed by Foreman that Levy
knows about is running 4000 active hosts. This is clearly a project to
watch, as it is backed by Red Hat and it has the potential to make managing
an environment with Puppet a lot easier.
Configuration management is not only useful for system administrators
installing servers, but also for developers setting up their development
environment. Gareth Rushgrove talked about using configuration management
tools to get new employees up and running quickly with a development
virtual machine. Especially interesting was his coverage of Vagrant, a tool for automated virtual
machine creation for Oracle's VirtualBox. With the virtual environments
provisioned automatically by Puppet or Chef, developers
can get a complete development environment up and running in no time. Users
can configure Vagrant to forward ports to the host machine, to set up
shared folders, and so on. It's also possible to package an environment in
a distributable box, and rebuilding a complete environment from scratch or
tearing down the environment when you're done is possible with a single
command. Normally users start by downloading a base box to use with Vagrant
(the default one is Ubuntu Lucid Lynx), but they can also build their own
base box with a tool like VeeWee.
Lessons for disaster recovery
While Puppet clearly was the most visible configuration management
system at FOSDEM, it was not the only one. Joshua Timberman, Sr. Technical
Evangelist at Opscode (the creators
of Chef), gave a short "Chef 101" talk, followed by an overview of how to
use Chef
to deploy applications with nothing but the source code repository and data
about the application configuration. Traditionally, one deploys
applications with tools like tar, rsync, and (in the Ruby world) cap
deploy, but what do you do then with the server configuration, like
that needed for web servers, load balancers, and database servers? Timberman showed how you can easily deploy web applications with their corresponding servers using various server roles configured in Chef cookbooks. The Chef server itself is a lightweight Ruby on Rails application, and the largest Chef deployment that Timberman knows about has 5000 nodes checking in to the Chef server every 30 minutes.
The first talk of the day was by Nicolas Charles and Jonathan Clarke, who presented their use of Cfengine at their company Normation and focused on their experiences with disaster recovery. All their services (web, email, Git repository, Redmine, ...) were running on one hosted server. The server used a three-disk RAID5 array with daily backups, each service ran in a separate virtual machine, and all services were automatically installed and configured using Cfengine 3.
When two hard drives failed simultaneously, they first thought this
would be easy to repair, as they had backups and used a configuration
management system. However, it turned out they had forgotten some things. For
example, they had neither automated nor backed up the configuration of
the virtual machines, so these had to be re-created manually. Moreover,
after watching all the services coming back online with the right
configuration thanks to Cfengine 3, they saw that they had to manually
restore the backups, after which they saw that a couple of files were
missing. The three big lessons here are: don't forget to describe your
virtualization setup in your configuration management system, tie in your
configuration management system to your backup tool, and always test your backups.
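The "always test your backups" lesson can be turned into a routine check. The Python sketch below is an assumption about how such a check could look, not what Normation did: it records a checksum manifest of the live data and compares a test restore against it, reporting missing and corrupted files. The function names and the choice of SHA-256 are hypothetical.

```python
import hashlib
import os


def file_digest(path):
    """SHA-256 of a file, read in chunks so large files are fine."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    manifest = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def verify_restore(manifest, restored_root):
    """Compare a test restore with the manifest.

    Returns two sorted lists of relative paths: files that are missing
    from the restore, and files whose content differs.
    """
    missing, corrupted = [], []
    for rel, digest in sorted(manifest.items()):
        full = os.path.join(restored_root, rel)
        if not os.path.isfile(full):
            missing.append(rel)
        elif file_digest(full) != digest:
            corrupted.append(rel)
    return missing, corrupted
```

Run against a scratch restore on a regular schedule, a check like this would surface "a couple of files were missing" before the disks fail rather than after.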
The system administrator as glue
The quote that best summarized the "don't reinvent the wheel" approach of
configuration management came from Levy's talk: "Automate as many
processes as possible, using best practices where available, and act as the
glue between the gaps." In this regard, it is interesting to know
that everyone can share their Chef "cookbooks" (packages of "recipes") on
cookbooks.opscode.com, and
Puppet users can share their Puppet modules on the Puppet Forge. This is great for new users, who can study the modules of other users and reuse them in their own infrastructure. Your author had already automated some of the services on his home network with Puppet, and this configuration management track at FOSDEM was inspiring enough to continue this approach and decrease the amount of glue in his network.