New projects from day two of CoreOS Fest

May 20, 2015

This article was contributed by Josh Berkus

While day one of CoreOS Fest 2015 introduced CoreOS architecture, plans, and specifications, day two introduced multiple open-source projects and tools. Presentations showed systemd-nspawn, Project Calico, Sysdig, and others. Most of these projects have been in development for a year or more, but the talks at the conference were the first look for most attendees.

While the talks themselves were interesting, the most remarkable thing was the sheer number of new tools that have been developed in the last year or so. Building up the software scaffolding for Linux containers seems to have happened faster than many other major changes introduced in Linux. One of the most fundamental pieces of this new infrastructure is systemd — the new init system for Linux — with its support for containers "out of the box".

Systemd and CoreOS

Lennart Poettering of Red Hat gave a presentation on systemd and CoreOS, describing the systemd tools that integrate with container management. Building containers using systemd really displays its benefits compared with other init systems, according to Poettering; systemd supplies all of the tools required to manage diverse containers on a single machine.

He noted that his talk was not an official Red Hat presentation, but then spent a fair amount of time speaking for the systemd team at Red Hat. The team isn't a product team, he explained: "we consider ourselves more of a research department than people who work on products."

This attitude explains some of the design decisions, such as choosing Btrfs as the primary filesystem for systemd-nspawn templates and containers. "Btrfs has a reputation for instability, but [the Btrfs project] is trying to solve fundamental filesystem issues," he said. Also, he explained that it is acceptable for containers to run on an unstable filesystem because "they're not where the data is." Important user data should be stored in external volumes, not in the container.

Systemd has multiple daemons that support containers, including systemd-machined, systemd-networkd, and systemd-resolved. In general, all of systemd is container-compatible because, according to Poettering, "systemd is tested on containers more often than on bare metal". Using containers allowed him to test init without rebooting his laptop frequently. He sees this deep integration with containers as a vital feature for Linux; "containers should be part of the OS itself, like Solaris Zones are."

It is also the goal of his team to be container-agnostic, supporting not just rkt, but also Docker, libvirt-lxc, OpenVZ, and others. The idea is that while systemd supplies a lot of container utility, it should be a low-level building block and not provide a sophisticated user interface. Projects like CoreOS and Kubernetes can then use systemd's functionality for basic operations.

Systemd-machined and its command-line tool, machinectl, are the most obvious piece of container management in systemd. With machinectl, users can list, start, stop, and even log in to containers interactively. Systemd-machined is "really just a registry of containers" with which any container can register. Further, it can be used together with systemd to run any command inside a container using "systemd-run -M". Systemd-machined also allows running containers to appear in ps command listings and in GNOME's system monitor.
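As a rough illustration of the commands just described, the following Python sketch only assembles argument lists for machinectl and systemd-run; it does not execute anything. The container name "web1" and the two helper functions are invented for this example, and "poweroff" is used here as the machinectl verb for stopping a container.

```python
import shlex

def machinectl(verb, machine=None):
    """Build an argv list for a machinectl call (nothing is executed)."""
    argv = ["machinectl", verb]
    if machine is not None:
        argv.append(machine)
    return argv

def run_in_machine(machine, command):
    """Build an argv list that would run `command` inside a registered container."""
    return ["systemd-run", "-M", machine] + list(command)

if __name__ == "__main__":
    # "web1" is a hypothetical container name.
    for argv in (
        machinectl("list"),
        machinectl("login", "web1"),
        machinectl("poweroff", "web1"),
        run_in_machine("web1", ["/bin/ls", "/etc"]),
    ):
        print(shlex.join(argv))
```

Passing one of these lists to subprocess.run() on a host running systemd-machined would be the obvious next step, though that typically requires root privileges.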

Systemd-nspawn is a lightweight container executor that provides a Docker-like tool that can start and run containers. It can be used to start a container using any filesystem or block device containing an MBR or GUID partition table. For users who want a limited-feature container manager that requires no configuration, systemd-nspawn will be an attractive option. Rkt uses systemd-nspawn under the hood to run container instances.
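To make the "no configuration" point concrete, here is a small Python sketch that builds a systemd-nspawn command line from either a directory tree or a disk image. It was not shown in the talk; the helper name and the paths in the usage note are assumptions made for illustration, and the code only constructs the argument list rather than running it.

```python
def nspawn_argv(root=None, image=None, boot=False, name=None):
    """Build an argv list for systemd-nspawn (nothing is executed)."""
    if (root is None) == (image is None):
        raise ValueError("give exactly one of root= or image=")
    argv = ["systemd-nspawn"]
    # -D takes a directory tree, -i takes a disk image or block device.
    argv += ["-D", root] if root is not None else ["-i", image]
    if name is not None:
        argv += ["-M", name]  # machine name to register with systemd-machined
    if boot:
        argv.append("-b")     # boot the container's own init
    return argv
```

For example, nspawn_argv(image="/tmp/demo.raw", boot=True, name="demo") yields the list for "systemd-nspawn -i /tmp/demo.raw -M demo -b", where /tmp/demo.raw is a hypothetical image file.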

Systemd-networkd and systemd-resolved, the network and host-name-resolution daemons of systemd, also support containers. Systemd-networkd will automatically start a container's networking and do internal DHCP address assignment. Systemd-resolved provides host names for containers, using "link-local multicast name resolution", or LLMNR, an automatic name-discovery system invented by Microsoft. While LLMNR was designed for client applications and mobile devices, it can be used by containers to find each other on the network.
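For readers unfamiliar with LLMNR, the Python sketch below builds the bytes of a single-question query. It is not taken from systemd-resolved; it only illustrates the wire format, which reuses the DNS message layout and is sent to the multicast group and UDP port defined in RFC 4795. The host name "web1" in the usage note and the fixed transaction ID are arbitrary choices for the example.

```python
import struct

LLMNR_GROUP = "224.0.0.252"  # IPv4 multicast address from RFC 4795
LLMNR_PORT = 5355            # UDP port from RFC 4795

def build_llmnr_query(hostname, txid=0x1234):
    """Return the bytes of a single-question LLMNR query for an A record."""
    # Header: ID, flags, QDCOUNT, ANCOUNT, NSCOUNT, ARCOUNT (16 bits each).
    header = struct.pack("!HHHHHH", txid, 0, 1, 0, 0, 0)
    # Name: length-prefixed labels terminated by a zero byte.
    qname = b""
    for label in hostname.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) < 64:
            raise ValueError("each label must be 1-63 bytes")
        qname += bytes([len(raw)]) + raw
    qname += b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)
```

Calling build_llmnr_query("web1") produces a packet that a client could send over a UDP socket to (LLMNR_GROUP, LLMNR_PORT); the sketch deliberately stops short of doing any network I/O.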

Based on Poettering's presentation, it seems like systemd will offer a strong alternative to Docker's libcontainer and other container initialization and management tools. Since the systemd tools will be built into most versions of Linux, they will eventually be widely available by default in many user environments. Perhaps that's why so many of the companies in the container business are focused on orchestration, which is one area where systemd doesn't concern itself.

Go and containers

One of the things that CoreOS, Inc. CEO Alex Polvi announced during his keynote was the company's sponsorship of the second Gophercon, the conference for Go language programmers. In fact, if you look at the list of sponsors for Gophercon, you'll see six of the major container-promoting companies listed there, which is around a quarter of the overall sponsors. This is not a coincidence; both CoreOS, Inc. and Docker, Inc. use Go almost exclusively. "Etcd could only have been built with Go," said Polvi.

In almost every talk and meeting room at the container conferences I've been to, people are talking about, and coding in, Go. Docker is written mostly in Go. Etcd, fleet, Swarm, Kubernetes, Kurma, and many other utilities and daemons for containers were built with it. The rise of Linux containers as a platform is likely to also be the rise of Go as a language.

Go started at Google in 2007 as an internal project with three developers, and today has over 500 contributors both inside and outside Google. The project is open source under the BSD license, but it is still run by Google staff and contributing requires signing a Contributor License Agreement (CLA) with Google. Increasingly, Go is used as an "automation language" for scalable server infrastructure; prior to Linux containers, it was popular for implementing network proxies, cloud server management tools, distributed search engines, and redundant data stores. So it's perhaps unsurprising that container utility programmers should also have chosen the language.

Because of CoreOS's close ties with Go, Brad Fitzpatrick gave a general session on Go's continuous build infrastructure. Fitzpatrick, known for LiveJournal, memcached, and OpenID, is now on the Google Go team. The automated build infrastructure is used to test the language on many possible platforms. It started out as a Google App Engine application, plus a chain of mobile devices on Fitzpatrick's desk, and grew. His talk covered some of the history and mechanics of how it works.

Since Go is a compiled, rather than interpreted, language, it's critically important for users to know that binaries will execute on different platforms. Every check-in of Go gets built on hundreds of platform variations in a large machine lab at Google. Containers play a minor part in this because so many of the platforms to be tested don't support, or work with, containers. Linux variants are tested using Docker, but operating systems like Mac OS X and Android need special-purpose hardware to test them. You can see the current build test status and which builds are broken for various platforms on the Go Dashboard.

Project Calico

While Project Calico has been open source for almost a year, it was new to most of the audience when core developer Spike Curtis presented it. Calico is multi-host network routing software that includes a distributed, per-service firewall. It is designed for containers and virtual machines, especially Docker and OpenStack environments. The project is written in Python and developed by Metaswitch Networks, which is currently Calico's only commercial support vendor. Calico looks like a potential solution for users who want to deploy containers in production, but have stringent security requirements.

"Remember three-tier architectures?" complained Curtis. "That's still how admins secure networks. You have your external network, your DMZ with web resources, and your data layer, which needs to be the most secure."

"Microservices" running in containers on an orchestration network break down this three-tier model. First, microservices are defined by what service they provide, rather than their security characteristics. Second, orchestration frameworks expect an undifferentiated data center network and aren't designed with the concept of security tiers. Most of all, microservices require defining security policies and zones for literally hundreds of entities, instead of the few dozen network administrators expect. As he described, "it's a zoo and you've torn down the walls."

However, microservices offer a security opportunity as well. Because each one only does one thing, you can characterize its security requirements in simpler terms. This means that services can be compartmentalized in a more sophisticated way without added complexity, and that's what Project Calico is designed to do.

With each microservice or container mapped to a single IP address, Calico implements a simple iptables-based firewall running on each physical host for each of those IP addresses. Each service is defined by tags stored in etcd, and a JSON-formatted configuration file defines which other services are allowed to connect to it — or if it's available to the Internet.
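The tag-and-policy idea can be sketched in a few lines of Python. This is not Calico's code or its configuration schema: the service inventory, the "allow_from" key, and the "internet" pseudo-tag are all invented for illustration, and the sketch only prints iptables commands rather than applying them.

```python
import json

# Hypothetical inventory: service name -> IP address and tags.
SERVICES = {
    "frontend": {"ip": "10.0.0.2", "tags": ["web"]},
    "api":      {"ip": "10.0.0.3", "tags": ["app"]},
    "db":       {"ip": "10.0.0.4", "tags": ["data"]},
}

# Hypothetical policy: for each destination tag, the source tags allowed in.
POLICY_JSON = """
{
  "web":  {"allow_from": ["internet"]},
  "app":  {"allow_from": ["web"]},
  "data": {"allow_from": ["app"]}
}
"""

def rules_for(service, services, policy):
    """Return iptables commands guarding one service's IP address."""
    dst = services[service]
    allowed = set()
    for tag in dst["tags"]:
        allowed.update(policy.get(tag, {}).get("allow_from", []))
    if "internet" in allowed:
        # Publicly reachable: accept traffic from any source.
        return [f"iptables -A FORWARD -d {dst['ip']} -j ACCEPT"]
    rules = []
    for name, src in sorted(services.items()):
        if name != service and allowed.intersection(src["tags"]):
            rules.append(
                f"iptables -A FORWARD -s {src['ip']} -d {dst['ip']} -j ACCEPT"
            )
    # Everything not explicitly allowed is dropped.
    rules.append(f"iptables -A FORWARD -d {dst['ip']} -j DROP")
    return rules

if __name__ == "__main__":
    policy = json.loads(POLICY_JSON)
    for service in sorted(SERVICES):
        print("\n".join(rules_for(service, SERVICES, policy)))
```

With this made-up policy, the database accepts connections only from the service tagged "app" and drops everything else, which is the kind of per-service compartment the talk described.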

Project Calico is designed to integrate with any orchestration framework that supplies an IP address for each service. Curtis demonstrated using Calico with Kubernetes, including using an extended Kubernetes pod definition to define security settings for each container. Apache Mesos is currently working on the IP-per-service feature, so it doesn't work with Calico yet.

Sysdig

The final "new" project described at CoreOS Fest was Sysdig. Like Project Calico, it was released about a year ago but most attendees saw it there for the first time. Also, like Project Calico, Sysdig is backed by a single company, Sysdig Cloud, which offers commercial support for the tool. Loris Degioanni, CEO of Sysdig Cloud, presented the tool at CoreOS Fest.

Sysdig is a traffic-monitoring system that is partially implemented as a Linux kernel module. The module captures all network traffic on the system, especially traffic between containers. The Sysdig tool supports writing filters in Lua (called "chisels") for this information, which allows users to aggregate it for statistical analysis. It can be thought of as a more advanced version of Wireshark and tcpdump, combined with container awareness.

Degioanni said that Sysdig is an improvement on the Google cAdvisor project — frequently used with Docker containers — because cAdvisor only tells you about overall CPU, memory, and network usage of containers. Sysdig also gives you the ability to distinguish the endpoints and content of traffic. This means that you can, for example, filter for certain database queries, or troubleshoot unusual lag between two specific IP addresses.
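The kind of filter-then-aggregate analysis described here can be sketched as follows. Real Sysdig filters are Lua chisels running against captured events; this Python version uses a hard-coded list of made-up records and does not call any Sysdig API, so it should be read only as an outline of the idea.

```python
from collections import defaultdict

# Hypothetical captured events: (source IP, destination IP, latency in ms, payload).
EVENTS = [
    ("10.0.0.2", "10.0.0.4", 3.0, "SELECT * FROM users"),
    ("10.0.0.3", "10.0.0.4", 41.0, "SELECT * FROM orders"),
    ("10.0.0.2", "10.0.0.4", 5.0, "UPDATE users SET name = 'x'"),
    ("10.0.0.3", "10.0.0.4", 39.0, "SELECT * FROM orders"),
]

def mean_latency_by_pair(events, payload_prefix=""):
    """Average latency per (source, destination) pair for matching payloads."""
    totals = defaultdict(lambda: [0.0, 0])  # pair -> [sum of latencies, count]
    for src, dst, latency_ms, payload in events:
        if payload.startswith(payload_prefix):
            entry = totals[(src, dst)]
            entry[0] += latency_ms
            entry[1] += 1
    return {pair: total / count for pair, (total, count) in totals.items()}
```

Filtering on the prefix "SELECT" and grouping by endpoint pair would make the slower path between 10.0.0.3 and 10.0.0.4 in this invented data stand out, which mirrors the "unusual lag between two specific IP addresses" case.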

One of the things Degioanni demonstrated was the soon-to-be-released open-source curses-based user interface for Sysdig, which is intended to allow system administrators to do interactive monitoring over SSH. He showed how to dig into traffic between containers and summarize it, as well as how to look into network delays. At the Sysdig Cloud booth, its staff showed off a much fancier, proprietary graphical user interface that supports clicking through to nested layers of servers, pods, and containers.

Day two wrap-up

The new projects, tools, draft standards, and architectures I learned about at CoreOS Fest showed the rapid pace of development in the Linux container world. A year ago, when I reported on the first DockerCon, most of the techniques and tools covered at CoreOS Fest had just been launched or didn't even exist. Next year, we will see if development is still so high-velocity.

Of course, there's one major topic we haven't yet covered: the ongoing issue of storing persistent data in containers. As mentioned above, there is currently an expectation that containers are stateless and do not keep data. Removing that expectation raises a number of problems for container management and orchestration that are only beginning to be addressed, such as management of external volumes, container migration, and load-balancing of stateful services. Join us next week for coverage of multiple topics related to persistent data and containers from both CoreOS Fest and Container Camp.


Go as a faster Python

Posted May 22, 2015 16:36 UTC (Fri) by ncm (guest, #165) [Link] (2 responses)

I guess we should see this use of Go as an example of its displacing Python in places where performance sort of matters.

Go as a faster Python

Posted May 22, 2015 17:07 UTC (Fri) by jberkus (guest, #55561) [Link] (1 responses)

Yes, I think that's the case a lot of places. At least, I know of more than a few Python/Go shops, although I know about even more Ruby/Go shops, for which the same argument would hold.

Go as a faster Python

Posted May 22, 2015 20:55 UTC (Fri) by sjj (guest, #2020) [Link]

As someone more on the Ops side, static binaries FTW! Not having to worry about different versions of interpreters and supporting libraries on different machines is a bigger reason for me to really learn Go. Plus simple cross-compilation.

New projects from day two of CoreOS Fest

Posted May 22, 2015 17:59 UTC (Fri) by sciurus (guest, #58832) [Link]

Sysdig gives you insight into more system activity than just network traffic; see http://www.sysdig.org/wiki/sysdig-examples/ for examples.

New projects from day two of CoreOS Fest

Posted May 31, 2015 12:51 UTC (Sun) by kleptog (subscriber, #1183) [Link] (4 responses)

Project Calico looks interesting, but it says it uses etcd and last time I checked they have no kind of security against a rogue agent. So an attacker in a container would just need to fiddle the settings in etcd and then wait for the firewall to open up.

Seems a strange thing to overlook, but the whole containerisation craze recently has shown little interest in security.

New projects from day two of CoreOS Fest

Posted Jun 2, 2015 19:36 UTC (Tue) by philipsbd (subscriber, #33789) [Link] (3 responses)

etcd supports TLS client certificates for authentication to the service and has supported this from very early on in the project. And in the upcoming version of etcd you can pair transport security with authz using users and roles.

New projects from day two of CoreOS Fest

Posted Jun 2, 2015 20:11 UTC (Tue) by kleptog (subscriber, #1183) [Link]

> And in the upcoming version of etcd you can pair transport security with authz using users and roles.

Now that is excellent news, thanks for pointing it out. That definitely puts etcd back in the running for me (not that there were many competitors...). The API described in that page looks easily sufficient for the most important requirements, namely making configuration read-only for most users and preventing key enumeration.

New projects from day two of CoreOS Fest

Posted Jun 2, 2015 23:35 UTC (Tue) by jberkus (guest, #55561) [Link] (1 responses)

Am I reading this correctly that users and roles are stored in etcd itself? There's a bit of a security catch-22 with that; how does etcd prevent attackers from exploiting the authentication method?

New projects from day two of CoreOS Fest

Posted Jun 3, 2015 18:10 UTC (Wed) by philipsbd (subscriber, #33789) [Link]

I don't follow your concern; storing users/roles in etcd keys makes it very similar to how SQL databases work. The keys that hold the users/roles aren't exposed via the normal keys API if that is your concern?