LWN.net Weekly Edition for February 25, 2016
Trouble at Linux Mint — and beyond
When the Linux Mint project announced that, for a while on February 20, its web site had been changed to point to a backdoored version of its distribution, the open-source community took notice. Everything we have done is based on the ability to obtain and install software from the net; this incident was a reminder that this act is not necessarily as safe as we would like to think. We would be well advised to think for a bit on the implications of this attack and how we might prevent similar attacks in the future.
It would appear that the attackers were able to take advantage of a WordPress vulnerability on the Linux Mint site to obtain a shell; from there, they were able to change the web site's download page to point to their special version of the Linux Mint 17.3 Cinnamon edition. It also appears that the Linux Mint site was put back on the net without being fully secured; the attackers managed to compromise the site again on the 21st, restoring the link to the corrupted download. Anybody who downloaded this distribution anywhere near those two days will want to have a close look at what they got.
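One basic way to take that close look is to compare the digest of the downloaded image against a known-good value. The following is a minimal sketch, not a procedure published by the Linux Mint project; the function names are illustrative. It only helps if the expected digest comes from a channel the attacker did not control, since a digest posted on the compromised site itself proves nothing.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in fixed-size pieces so a multi-gigabyte ISO need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex):
    """Compare the file's digest with one obtained from a trusted channel."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A matching digest shows only that the file is the one the digest's publisher had in mind; a signed digest checked against a key obtained earlier gives stronger assurance.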
The Linux Mint developers have taken a certain amount of grief for this episode, and for their approach to security in general. They do not bother with security advisories, so their users have no way to know if they are affected by any specific vulnerability or whether Linux Mint has made a fixed package available. Putting the web site back online without having fully secured it mirrors a less-than-thorough approach to security in general. These are charges that anybody considering using Linux Mint should think hard about. Putting somebody's software onto your system places the source in a position of great trust; one has to hope that they are able to live up to that trust.
It could be argued that we are approaching the end of the era of amateur distributions. Taking an existing distribution, replacing the artwork, adding some special new packages, and creating a web site is a fair amount of work. Making a truly cohesive product out of that distribution and keeping the whole thing secure is quite a bit more work. It's not that hard to believe that only the largest and best-funded projects will be able to sustain that effort over time, especially when faced with an increasingly hostile and criminal net.
There is just one little problem with that view: it's not entirely clear that the larger, better-funded distributions are truly doing a better job with security. It probably is true that they are better able to defend their infrastructure against attacks, have hardware security modules to sign their packages, etc. But a distribution is a large collection of software, and few distributors can be said to be doing a good job of keeping all of that software secure.
So, for example, we have recently seen this article on insecure WordPress packages in Debian, and this posting on WebKit security in almost all distributions. Both are heavily used pieces of software that are directly exposed to attackers on the net; one would hope that distributors would be focused on keeping them secure. But that has not happened; the projects and companies involved simply have not found the resources to stay on top of the security of these packages.
It is not hard to see how this could be a widespread problem. When users evaluate distributions, the range of available packages tends to be an important criterion. Distributors thus have an incentive to include as many packages as they can — far more than they can support at a high level. The one exception might be the enterprise distributions which, one would hope, would be more conservative in the packages they choose to provide for their customers. But such distributions tend to ship old software (which can have problems of its own) and are often accompanied by add-on repositories filling in the gaps — and possibly introducing security problems of their own.
The situation is seemingly getting murkier rather than better. Some projects try to get users to install their software directly rather than use the distribution's packages. That might lead to better support for that one package, but it adds another moving part to the mix and shorts out all of the mechanisms put in place to get security updates to users. Language-specific packages are often more easily installed from a repository like CPAN or PyPI, but these organizations, too, do not issue security advisories and almost certainly do not have the resources in place to ensure that they are not distributing packages with known vulnerabilities. Many complex applications support some form of plugins and host repositories for them; the attention to security there is mixed at best. Projects like Docker host repositories of images for download. Public hosting sites deliver a lot of software, but are in no position to guarantee the security of that software. And so on.
Combine all this with a net full of bad actors who are intent on installing malware onto users' systems, and the stage is set for a lot of unhappiness. Indeed, it is surprising that there have not been more incidents than we have seen so far. There can be no doubt that we will see more of them in the future.
As a community, we take a certain amount of pride in the security of our software. But, regardless of whether that pride is truly justified, we are all too quick to grab software off some server on the net and run it on our systems — and to encourage others to do the same. As a community, we are going to have to learn to do a better job of keeping our infrastructure secure, of not serving insecure software to others, and of critically judging the security of providers before accepting software from them. It is not fun to feel the need to distrust software given to the community by our peers, but the alternatives may prove to be even less fun than that.
New video and music features in Kodi 16
Version 16.0 of the Kodi free-software media center was released on February 21. Unlike some previous releases that debuted major new features (such as Android support in 13.0 or H.265 video support in 14.0), this new release appears comparatively quiet on the surface. Under the hood, though, there are several changes that will make life easier for users, particularly where user-interface issues and system maintenance are concerned.
The installation procedure is a rather cut-and-dried affair at this point. Official builds are provided by the project for several Ubuntu-based distributions, as are community-maintained builds for Fedora, Debian, and OpenELEC. There are also packages for download for Android (both ARM and x86), a variety of Raspberry-Pi–specific distributions, Windows, iOS, OS X, BSD, and for an ever-growing list of consumer hardware devices like Amazon's Fire TV. Apart from the iOS and consumer-hardware cases (which may involve jailbreaking steps), little is required to get up and running.
The Kodi user interface, likewise, has stabilized over the past few release cycles, and navigation is more streamlined and logical than it was in years past. There are still UI inconsistencies to be found, such as whether the "close" button appears on the right or the left of a pop-over window. And in some instances, multi-page blocks of text (such as the "Description" block in the Add-ons Manager) can only be scrolled by using the mouse, not with arrow or page up/down keys. But, on the whole, media sources and configuration options are reachable in only a few clicks, it is difficult if not impossible to get lost, and the interface is virtually devoid of arcane internal terminology. The latter is a considerable accomplishment indeed, given that it includes explaining video calibration and debug logging.
Speaking of logging, one of the most-highlighted new features in this release is Kodi's new event-logging framework. This is a mechanism that provides an in-application, browseable view of a wide variety of events: adding new media to the library, altering settings, installing add-ons, and so forth. The logged events include errors and warnings, which the release notes highlight as a feature that users have missed in past releases—leaving them unable to troubleshoot problems when newly-added media fails to show up in the library, for example.
An example of the subtle improvements in 16.0 is the revamped Music Library feature. Kodi has developed a reputation for admirable handling of video content (both local and remote), but has let its support for serving as a music manager languish. The 16.0 release marks the start of a renewed effort to rectify the situation. Adding new audio content is now much simpler, and Kodi automatically scans and adds the relevant metadata from the files.
There is also a new framework in place to support advanced audio processing. Though it is not active yet, in future releases it will pave the way for a number of audio features, like multi-channel equalizers, "fake surround sound," and a variety of other effects.
Deeper under the hood, the new release brings two changes to the way skins and other add-ons are stored. The first is that the file layout used within skin add-ons has been changed to match that of other add-on types; this was primarily done to make it easier to migrate settings from one skin to another. The second and perhaps more interesting change is that add-ons can now share image resources. It may take some time for add-on authors to begin taking advantage of the feature, but it will enable skins to, for example, provide a customized look to other add-ons (such as theming the icons of media sources to match the UI).
On the video front, perhaps the most obvious new addition is support for non-linear stretching of 4:3 content to fit on 16:9 displays. The technique employed tries to retain the center of the screen without visible distortion and progressively stretches out the image closer to the sides of the display. Of course, purists still might scoff at anyone daring to watch Citizen Kane in anything other than the original aspect ratio, but there are surely instances when such elongation is necessary. Users who employ Kodi as a digital video recorder (DVR) will be pleased to note that Kodi's PVR module (from "personal" video recorder) now supports "series" recording rules, which are a staple of most other DVR applications.
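The idea behind the non-linear stretch can be illustrated with a simple mapping from output columns to source columns. This sketch is not Kodi's implementation; the cubic curve and the function names are assumptions chosen only to show the principle. Coordinates are normalized so that -1 and 1 are the left and right edges of both the display and the source picture.

```python
def source_x(x, center_slope=4.0 / 3.0):
    """Map an output column x in [-1, 1] to a source column in [-1, 1].

    Uses the odd cubic u = a*x + (1 - a)*x**3, where a is the ratio of
    display aspect to source aspect: (16/9) / (4/3) = 4/3. The curve passes
    through (-1, -1), (0, 0), and (1, 1), and stays monotonic for a < 1.5.
    """
    a = center_slope
    return a * x + (1.0 - a) * x ** 3


def local_stretch(x, center_slope=4.0 / 3.0):
    """Horizontal stretch applied to the picture at output column x.

    1.0 means the picture is undistorted at that column.
    """
    a = center_slope
    du_dx = a + 3.0 * (1.0 - a) * x * x  # derivative of source_x
    return a / du_dx
```

With these assumed parameters the stretch factor is 1.0 at the center and rises to about 4 at the very edge; a real implementation would presumably pick a gentler curve.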
Those users with 3D displays (either 3D-capable TVs or virtual-reality headsets) will get to sample a unique new UI feature: Kodi UI "skins" can now employ 3D depth effects, with the default skin ("Confluence") providing an example. The much larger group of users without 3D displays also get some UI improvements, however. Most notably, the "long press" action is now supported in Kodi's remote-control command mapping. That makes it possible to use Kodi with a number of modern, simple remotes—where the recent trend is toward directional arrow keys, a "Select" or "OK" button, and little else.
This style of remote is particularly popular with consumer hardware like the Fire TV; users who control Kodi through other means (such as a wireless keyboard) are unaffected. The long press is bound to Kodi's context-menu action by default, so it pops up a menu of additional commands. Those using Kodi on a Linux box with touchscreen support have yet another UI option, as Kodi now supports multi-touch gestures. Gesture support has been available in the Android and iOS releases for some time; there is a small set of gestures recognized by default, though it is configurable.
Finally, the Android rendering stack has been reworked to cope with 4K displays. In earlier releases, both the Kodi UI and any video content being displayed were rendered to the same surface, using libstagefright. But that made it impossible to render the UI and the video at different resolutions. Rendering the 4K version of the Kodi UI brought interactivity to a crawl on most Android devices, while limiting video playback to 720p or 1080p resolution would defeat the purpose of 4K support. Starting with the 16.0 release, the video stream and the UI are rendered to separate MediaCodec surfaces (rather than libstagefright), thus enabling 4K hardware-accelerated video while keeping the UI at its native, non-4K resolution.
As a project, Kodi relies heavily on the community of add-on and skin developers for implementing new user-facing features. So as the core application matures, there may be fewer big developments in every release cycle. Nevertheless, as the 16.0 release shows, there will always be room for improvements. Some of the new under-the-hood functionality will take time to trickle out as developers update add-ons and skins, but there is certainly enough in the new release for users to be happy with the upgrade.
Systemd vs. Docker
There were many different presentations at DevConf.cz, the developer conference sponsored by Red Hat in Brno, Czech Republic this year, but containers were the biggest theme of the conference. Most of the presentations were practical, either tutorials showing how to use various container technologies like Kubernetes and Atomic.app, or guided tours of new products like Cockpit.
However, the presentation about containers that was unquestionably the most entertaining was given by Dan Walsh, Red Hat's head of container engineering. He presented on one of the core conflicts in the Linux container world: systemd versus the Docker daemon. This is far from a new issue; it has been brewing since Ubuntu adopted systemd, and CoreOS introduced Rocket, a container system built around systemd.
Systemd vs. Docker
"This is Lennart Poettering," said Walsh, showing a picture. "This is Solomon Hykes", showing another. "Neither one of them is willing to compromise much. And I get to be in the middle between them."
Since Walsh was tasked with getting systemd to work with Docker, he detailed a history of code, personal, and operational conflicts between the two systems. In many ways, it was also a history of patch conflicts between Red Hat and Docker, Inc. Poettering is the primary author of systemd and works for Red Hat, while Hykes is a founder and CTO of Docker, Inc.
According to Walsh's presentation, the root cause of the conflict is that the Docker daemon is designed to take over a lot of the functions that systemd also performs for Linux. These include initialization, service activation, security, and logging. "In a lot of ways Docker wants to be systemd," he claimed. "It dreams of being systemd."
The first conflict he detailed was about service initialization and restart. In the systemd model, all of this is controlled by systemd; in the Docker world, it is all controlled by the Docker daemon. For example, services can be defined in systemd unit files as "docker run" statements to run them as containers, or they can be defined as "autorestart" containers in the Docker daemon. Either approach can work, but mixing them doesn't. The Docker documentation recommends Docker autorestart, except when mixing containerized services with services not in a container; there it recommends systemd or Upstart.
Where this breaks down, however, is when services running as containers depend on other containerized services. For regular services, systemd has a feature called sd_notify that passes messages about when services are ready, so that services that depend on them can then be started. However, Docker has a client-server architecture. docker run and other commands are called in the client for each user session, but the containers are started and managed in the Docker daemon (the "server" in this relationship). The client can't send sd_notify status messages because it doesn't actually manage the container service and doesn't know when the services are up, and the daemon can't send them because it wasn't called by the systemd unit file. This resulted in Walsh's team attempting an elaborate workaround to enable sd_notify:
- systemd requests sd_notify from the Docker client
- That client sends an sd_notify message to the Docker daemon
- The daemon sets up a container to do sd_notify
- The daemon gets an sd_notify from the container
- The daemon sends an sd_notify message to the client
- The client sends an sd_notify message to tell systemd that the Docker container is ready
Walsh was unsurprised when the patches to enable this byzantine system were not accepted by the Docker project. sd_notify does work for the Docker daemon itself, so systemd services can depend on the daemon running. But there is still no way to do sd_notify for individual containerized services, so the Docker project still has no reliable way to manage containerized service dependency startup order.
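For context, the readiness message itself is very small: a service sends a short datagram to the Unix socket named in the NOTIFY_SOCKET environment variable that systemd sets for it. The following is a minimal sketch of that protocol in Python, rather than code from Docker or systemd; the function name mirrors the C library call but is otherwise illustrative.

```python
import os
import socket


def sd_notify(state="READY=1"):
    """Send a notification datagram to the service manager, if one is listening.

    Returns True if a message was sent, False if NOTIFY_SOCKET is unset.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a manager that wants notifications
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket.
        addr = b"\0" + path[1:].encode("utf-8")
    else:
        addr = path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode("utf-8"), addr)
    return True
```

The difficulty described above is not the message format. The process that systemd launched (the Docker client) is not the process that knows when the containerized service is ready.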
Systemd has a feature called "socket activation", where services start automatically upon receiving a request to a particular network socket. This lets servers support "occasionally needed" services without running them all the time. There used to be support for socket activation of the Docker daemon itself, but the feature was disabled because it interfered with Docker autorestart.
Walsh's team was more interested in socket activation of individual containers. This would have the benefit of eliminating the overhead of "always on" containers. However, the developers realized that they'd have to do something similar to the sd_notify workaround, only they'd be passing around a socket instead of just a message. They didn't even try to implement it.
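To see what would have to be passed around, consider how a socket-activated service normally receives its socket. Systemd leaves the listening descriptors open in the new process, starting at descriptor 3, and describes them with the LISTEN_PID and LISTEN_FDS environment variables. The sketch below shows the receiving side in Python; it is an illustration of that convention, not code from Walsh's team, and the function names are chosen here.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per the systemd convention


def listen_fds():
    """Return the file descriptors passed in by the service manager, if any."""
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        return []  # the variables were not meant for this process
    try:
        count = int(os.environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def activated_sockets():
    """Wrap each inherited descriptor in a socket object."""
    return [socket.socket(fileno=fd) for fd in listen_fds()]
```

Because the descriptors are inherited by the process systemd starts, a Docker client launched from a unit file would hold them, while the container that needs them is a child of the daemon.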
Linux control groups, or cgroups, let you define system resource allocations per service, such as CPU, memory, and I/O limits. Systemd allows defining cgroup limits in the initialization files, so that you can define resource profiles for services when they start. With Docker, though, this runs afoul of the client-server model again. The systemd cgroup settings affect only the client; they do not affect the daemon process, where the container is actually running. Instead, each container inherits the cgroup settings of the Docker daemon. Users can pass cgroup limits by passing flags to the docker run statement instead, which works but does not integrate with the overall administrative policies for the system.
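One way to check which cgroup a given process actually landed in is to read /proc/<pid>/cgroup, where each line has the form "hierarchy-id:controller-list:path". The following sketch parses that file; the function names are illustrative and the sample paths in the usage note are made up.

```python
def parse_proc_cgroup(text):
    """Parse the content of /proc/<pid>/cgroup into {controller: path}."""
    membership = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        # Split at most twice, since the path itself may contain colons.
        _hierarchy, controllers, path = line.split(":", 2)
        for name in controllers.split(","):
            # The name is "" for the unified (cgroup v2) hierarchy.
            membership[name] = path
    return membership


def cgroups_of(pid="self"):
    """Return the cgroup membership of a process, by PID or for the caller."""
    with open("/proc/%s/cgroup" % pid) as f:
        return parse_proc_cgroup(f.read())
```

For example, parse_proc_cgroup("3:cpu,cpuacct:/example.slice\n") yields the same hypothetical path for both the cpu and cpuacct controllers. Comparing the result for the Docker client with the result for a process inside the container shows whether the limits in a unit file reached the workload.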
The only success story Walsh had to relate was regarding logging. Originally, Docker logs did not work with systemd's journald either. Logging of container output was local to each container, so all logs were automatically erased whenever a container was deleted. This was a major failing in the eyes of security auditors. Docker 1.9 now supports the --log-driver=journald switch, which logs to journald instead. However, using journald is not the default for Docker containers, so the switch needs to be passed each time.
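Since the switch has to be given on every invocation, a site might wrap the command so that it is never forgotten. The helper below is a hypothetical convenience, not part of Docker; it only assembles the argument list, using the journald switch named above.

```python
def docker_run_argv(image, *command, log_driver="journald"):
    """Build a `docker run` argument list that always names a log driver."""
    argv = ["docker", "run", "--log-driver=" + log_driver, image]
    argv.extend(command)
    return argv
```

The resulting list can be handed to subprocess.run() on a host where Docker is installed.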
Systemd inside containers
Walsh also wanted to get systemd working in Fedora, Red Hat Enterprise Linux (RHEL), and CentOS container base images, partly because many packages require the systemctl utility in order to install correctly. His first effort was something called "fakesystemd" that replaced systemctl with a service that satisfied the systemctl requirement for packages and did nothing else. This turned out to cause problems for users and he soon abandoned it, but not soon enough to prevent it from being released in RHEL 7.0.
In RHEL 7.1, the team added something called "systemd-container", that was a substantially reduced version of systemd. This still caused problems for users who needed full systemd for their software, and Poettering pressured the container team to change it. As of RHEL 7.2, containers have real systemd with decreased dependencies installed so that it can be a little smaller. Walsh's team is working on reducing these dependencies further.
The biggest problem with not having systemd in the container, according to Walsh, is that it goes "back to the days before init scripts." Each image author creates his or her own crazy startup script for the application inside the container, instead of using the startup scripts crafted by the packagers. He showed how easily service initialization is done inside a container that has systemd available, by showing the three-line Dockerfile that is all that is required to create a container running the Apache httpd server:
FROM fedora
RUN yum -y install httpd; yum clean all; systemctl enable httpd;
CMD [ "/sbin/init" ]
There is a major roadblock to making systemd inside Docker work, though: running a container with systemd inside requires running it with the --privileged flag, which makes it insecure. This is because the Docker daemon requires the "service" application run by the container to always be PID 1. In a container that runs systemd, systemd is PID 1 and the application has some other PID, which causes Docker to think the container has failed and shut it down.
Poettering says that PID 1 has special requirements. One of these is reaping "zombie" processes: processes that have exited but whose exit status has not been collected, typically because they were abandoned by their calling session. This is a real problem for Docker since the application runs as PID 1 and does not handle the zombie processes. For example, containers running the Oracle database can end up with thousands of zombie processes. Another requirement is writing to syslog, which goes to /dev/null unless you've configured the container to log to journald.
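The zombie-handling duty amounts to calling wait on any child that has exited, including orphans that the kernel has re-parented to PID 1. A minimal sketch of that loop in Python follows; it is an illustration of the technique, not code taken from systemd, and the function name is chosen here.

```python
import os


def reap_zombies():
    """Collect the exit status of every child that has already exited.

    Returns a list of (pid, status) pairs; never blocks.
    """
    reaped = []
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append((pid, status))
    return reaped
```

An init process would typically run such a loop whenever it receives SIGCHLD; an application that was never written to be PID 1 usually does not.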
Walsh tried several approaches to make systemd work in non-privileged containers, submitting four different pull requests (7685, 10994, 13525, and 13526) to the Docker project. Each of these pull requests (PRs) was rejected by the Docker maintainers. Arguments around these changes peaked when Jessie Frazelle, a Docker committer, came to DockerCon.EU 2015 with the phrase "I say no to systemd specific PRs" printed on her badge.
The future of systemd and containers
The Red Hat container team has also been heavily involved in developing the runC tool of the Open Container Project. That project is the practical output of the Open Container Initiative (OCI), the non-profit council established through the Linux Foundation in 2015 in order to set industry standards for container APIs. The OCI also maintains libcontainer, the library that Docker uses to launch containers. According to Walsh, Docker will eventually need to adopt runC as part of its stack in order to be able to operate on other platforms, particularly Windows.
Using work from runC, Red Hat staff have created a patch set called "oci-hooks" that adds a lot of the systemd-supporting functionality to Docker. It makes use of a "hook" that can activate any executables found in a specific directory between the time the container starts up and when the application is running. Among the things executed by this method is the RegisterMachine hook, which notifies systemd's machinectl on the host that the container is running. This lets users see all Docker containers, as well as runC containers, using the machinectl command:
# machinectl
MACHINE                          CLASS     SERVICE
9a65036e4a6dc769d0e40fa80871f95a container docker
fd493b71a79c2b7913be54a1c9c77f1c container runc
2 machines listed.
The hooks also allow running systemd in non-privileged containers. This PR (17021) was also rejected by the Docker project. Nevertheless, it is being included in the Docker packages that are shipped by Red Hat. So part of the future of Docker and systemd may involve forking Docker.
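The general shape of a directory-based hook mechanism can be sketched as follows. This is not the oci-hooks patch set itself: the function names are hypothetical, and the ordering and error handling are assumptions made for illustration. Passing the container state to each hook as JSON on standard input follows the convention in the OCI runtime specification.

```python
import os
import subprocess


def find_hooks(directory):
    """Return the executable regular files in `directory`, sorted by name."""
    hooks = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            hooks.append(path)
    return hooks


def run_hooks(directory, state_json):
    """Run each hook with `state_json` on stdin; stop at the first failure."""
    for path in find_hooks(directory):
        # check=True raises CalledProcessError if a hook exits non-zero.
        subprocess.run([path], input=state_json.encode("utf-8"), check=True)
```

A hook like RegisterMachine would, under this scheme, be one more executable in the directory that reads the container's ID from the state and registers it with the host.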
Walsh also pointed out that cgroups, sd_notify, and socket activation all work out-of-the-box with runC. This is because runC does not use Docker's client-server model; it is just an executable. He does not see the breach between Docker Inc. and Red Hat over systemd healing over in the future. Walsh predicted that Red Hat would probably be moving more toward runC and away from the Docker daemon. According to him, Docker is working on "containerd", its new alternative to systemd, which will take over the functions of the init system.
Given the rapid changes in the Linux container ecosystem in the short time since the Docker project was launched, though, it is almost impossible to predict what the relationship between systemd, Docker, and runC will look like a year from now. Undoubtedly there will be plenty more changes and conflicts to report.
[ Josh Berkus works for Red Hat. ]
Page editor: Jonathan Corbet
