An overview of Project Atomic
Terms like "cloud-native" and "web scale" are often dismissed as pointless buzzwords. Under the layers of marketing, though, cloud systems really do work best with a new and different way of thinking about system administration. Much of the tool set used for cloud operations is free software, and Linux is the platform of choice for almost all cloud applications. While just about any distribution can be made to work, several projects are working to create a ground-up system specifically for cloud hosts. One of the best known of these is Project Atomic from Red Hat and the Fedora Project.
The basic change in thinking from conventional to cloud-computing operations is often summed up as "pets versus cattle". Previously, we looked at our individual computers as "pets": individual entities that need to be protected. If a server went down, you would carefully fix that server or, in the worst case, replace it with a new host restored from a backup. In a cloud environment, hosts can be created and destroyed in seconds, so we take advantage of this by treating them as largely disposable "cattle". If a host encounters a problem, we simply destroy it and create a new host to take over its function.
Closely coupled with this paradigm shift is a move to containerization. Container systems like Moby (formerly known as Docker) or rkt allow you to deploy software by packaging it into an image, similar to a very lightweight virtual machine, complete with all dependencies and system configuration required. Containerized, cloud-based deployments are quickly becoming the most common arrangement for web applications and other types of internet-connected software that are amenable to horizontal scaling.
With disposable servers running software that comes packaged with all its requirements, there's no longer as much need to manage the underlying servers at all. Ideally, they are set up once and never changed—even for updates. Instead of being "administered" in the conventional sense, an out-of-date cloud host is simply destroyed and replaced with a new one. This pattern is referred to as "immutable infrastructure", and it's perhaps the largest technical shift involved in modern cloud computing.
In a typical container-based cloud deployment, there are actually two levels at which a Linux distribution is involved. First, there is the operating system on the cloud hosts. Second, there is the Linux environment in the application containers running on those hosts—these share the kernel but have their own user space. The flexibility to use different distributions for different applications is one of the advantages of containerization, but in production systems the distribution installed in the container will often be a lightweight one like Alpine or a stripped down Ubuntu variant. The Linux distribution on the cloud host itself should be capable of running the containers and should provide whatever services are needed for maintenance and diagnostics.
Project Atomic produces such a distribution and, more generally, builds tools for Linux cloud hosts and containers. Project Atomic is the upstream project for many components of OpenShift Origin, which is the basis for Red Hat's commercial container-deployment platform OpenShift. The main Atomic product is Atomic Host, a Linux distribution for immutable cloud hosts running containers under Moby and Kubernetes.
Atomic Host comes in two flavors depending on the user's risk appetite and desired update cycle: one derived from CentOS and one derived from Fedora. On top of the base distribution, Atomic Host makes a number of modifications. The most significant is the rpm-ostree update system. Rpm-ostree is based on OSTree, which is often described as "Git for operating systems"; it integrates with the familiar Red Hat package-management ecosystem but operates quite differently. Conceptually, rpm-ostree manages the system software much like a container image. The entire installation is one atomic version, and updates replace it completely with a new version while keeping the previous version available for rollback. These versions are stored as commits in a Git-like OSTree repository.
So, with Atomic Host, instead of installing an operating system on a server and then installing and configuring a variety of system software, you create a virtual machine or cloud host from an Atomic Host image and then, in one step, install a version-controlled system configuration. The system is ready to serve an application in just two steps that can be easily automated by a cloud orchestrator. When the image becomes outdated, whether because of updates to the kernel or just to some configuration files, you build a new image and completely replace the filesystem on the running hosts with it.
Atomic Host's Fedora variant updates on a strict two-week cadence, providing new images with the patches and updates that would normally come as updates to individual packages. The CentOS option currently releases irregularly, about once per month, although one of the goals of the CentOS Atomic SIG is to establish a regular release cycle in the future. In a notable difference from a more conventional update strategy, users must apply these complete system updates in order to receive security patches and bug fixes. Since per-package updates may arrive faster than every two weeks, there may be a somewhat larger window of vulnerability for Atomic Host when compared to Fedora or CentOS.
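On a running host, this update model maps onto a handful of rpm-ostree commands. The sketch below is illustrative; exact output and deployment names vary by release:

```shell
# Show the currently booted deployment and any pending one
rpm-ostree status

# Download and deploy the newest tree; it takes effect on the next boot
rpm-ostree upgrade
systemctl reboot

# If the new tree misbehaves, boot back into the previous deployment
rpm-ostree rollback
systemctl reboot
```

Because the previous deployment is kept intact on disk, the rollback is a bootloader change rather than a reinstall.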
Composing with rpm-ostree
Let's take a closer look at how an rpm-ostree configuration is created. The Git repository controlling rpm-ostree contains a number of files that describe the cloud host in terms of installed package versions and their configurations. The rpm-ostree tool then "composes" the system, assembling the packages and configuration into a filesystem tree and committing the result to an OSTree repository. The underlying OSTree tooling records the state of the filesystem after the compose and then installs the resulting version by replicating the entire filesystem onto a host. This is conceptually similar to a disk image but, since OSTree operates at the filesystem level, it is both more efficient and able to use filesystem features to retain older versions for easy rollback of changes.
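As a rough illustration, the compose input is a "treefile" naming the target ref, the RPM repositories to draw from, and the package set. The ref, repository names, and package list below are hypothetical, and the named repositories would need matching .repo files alongside the treefile:

```shell
# A minimal, hypothetical treefile
cat > atomic-host.json <<'EOF'
{
    "ref": "example/7/x86_64/atomic-host",
    "repos": ["base", "updates"],
    "packages": ["kernel", "systemd", "docker", "kubernetes"]
}
EOF

# Create an OSTree repository, then compose the tree into it as a commit
ostree --repo=/srv/repo init --mode=archive
rpm-ostree compose tree --repo=/srv/repo atomic-host.json
```

Hosts can then pull and deploy the resulting commit from that repository rather than installing packages individually.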
This is a different way of installing updates from the one we've all become accustomed to. There are several major advantages, though. First, the filesystem replication process guarantees that the installed software and operating system configuration on one host will be identical to those installed on the others. The rpm-ostree mechanism also allows thorough testing (including automated integration testing) of a complete host configuration, from the operating system up, before making changes to the production environment. Second, it prevents a huge number of potential problems with software installation and update consistency, making it much easier to automatically update hosts—every update is an installation from scratch with no concerns about the starting state. Finally, it makes it far easier to replace hosts. Since all of the software on the host came from a known, version-controlled configuration, there's no need to worry that there is special configuration on a host that will be lost. In short, there's no need for backups as everything can be exactly recreated from the start.
Rpm-ostree extends the basic OSTree image approach by allowing "layering" of RPM packages on top of an OSTree configuration. If you have worked with Docker/Moby, this concept of image layering will be familiar. If an existing OSTree image meets your basic needs but you require some additional software, the RPM packages for the additional software can be added on top of the OSTree image without building an entirely new image. This eases the transition to immutable updates by allowing administrators to stick to one of their most familiar tools: making software available by installing a package.
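Layering a package onto the booted image looks much like ordinary package management; the package chosen here is only an example:

```shell
# Layer an RPM on top of the current OSTree deployment;
# the addition becomes part of a new deployment used after reboot
rpm-ostree install strace
systemctl reboot

# Remove the layered package again
rpm-ostree uninstall strace
```

The base image itself is never modified; the layered package rides along as a separate layer that survives upgrades of the underlying tree.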
Also included in Atomic Host is the atomic command, which wraps Moby to simplify installing and running container images that have specific expectations about the way they are run. atomic is particularly well suited for Moby containers used for system functions; an example use case is automatically placing systemd unit files to manage the Moby container as a service. Instead of pulling a Moby image and going through several steps to set up a systemd unit to run it with appropriate settings, containers with the right metadata can just be installed with atomic install.
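For example, an image whose Dockerfile carries INSTALL and RUN labels declares how it should be set up, and atomic reads those labels and executes the embedded commands. The image name and label contents below are illustrative, not from any real image:

```shell
# Metadata embedded in the image's Dockerfile, for example:
#   LABEL INSTALL="docker run --rm --privileged -v /:/host IMAGE /usr/bin/install.sh"
#   LABEL RUN="docker run -d --name NAME IMAGE"

# On the host, atomic reads the labels and performs the setup and launch
atomic install example/monitoring-agent
atomic run example/monitoring-agent
```

The INSTALL command typically mounts the host filesystem so the container can drop a systemd unit file into place, after which the service can be managed like any other.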
It should be clear that Atomic Host is a significant change from traditional server-oriented Linux distributions. Atomic Host might sound difficult to use for small environments and one-off servers, but those are simply not among its intended use cases. Immutable-infrastructure systems work best in at-scale environments where hosts—servers, virtual machines, or cloud instances—can be obtained and abandoned easily. Philosophically, Atomic Host focuses on providing a testable, consistent environment overall, rather than maintaining individual hosts for maximum reliability. As a result, it is not a good choice for "pet" servers that are individually important and cannot easily be replaced. To truly take advantage of Atomic Host, the applications you run should tolerate individual hosts disappearing and make use of new hosts as they become available.
The benefit of this change is simpler and more automated maintenance of multiple servers. When your application runs on hosts that are nearly stateless and can be rebuilt in minutes, there's far less need for traditional system administration. Instead, your time can be spent on thorough design and testing of your system configuration.
Atomic Host is not the only product of Project Atomic. There is also Cockpit, which is a web application that allows for easy remote management and monitoring of hosts running Atomic Host. Cockpit is a step between manually installing and managing Atomic Host nodes and a complete configuration management system such as Puppet; Cockpit is still a tool to monitor and manage individual hosts, but allows you to do so from a central interface instead of via SSH. Because of Atomic Host's simple, low-maintenance design, this may be all that's needed for cloud environments with tens of hosts.
While it's perfectly possible to make changes to an Atomic Host installation on the fly via SSH or even Cockpit, doing so negates many of the advantages of immutable deployments by introducing inconsistent and likely poorly tested changes. As with many technologies, immutable deployments also require an element of discipline by the operators. The temptation to make a "quick fix" must be overcome in favor of a tested, controlled change to the entire deployment.
Finally, Project Atomic has also put quite a bit of work into developing the basic nuts and bolts for a well-run Linux cloud environment. A major example is their significant work on integrating Moby with SELinux. This effort, like Atomic Host and Cockpit, will help to keep Linux solidly at the front of cloud computing.
| Index entries for this article | |
|---|---|
| GuestArticles | Crawford, J. B. |
