Leading items

A different approach to application containers with Limba

By Nathan Willis
April 8, 2015

Over the past two years, application packaging has become a topic of much debate within the GNOME project. The packaging effort that has garnered perhaps the most attention is the sandboxed applications work done by Alexander Larsson, but it is not the only one. Matthias Klumpp, for example, has been pursuing an application-packaging strategy of his own that takes a different approach. Klumpp's project is called Limba, and a new release was recently unveiled.

Klumpp started working on Limba in November 2014, citing as his inspiration the same proposal that eventually led to Larsson's work. That proposal was Lennart Poettering's blog post about the difficulty of packaging third-party applications for Linux desktops. But Poettering proposed packaging applications into bundles that were quite isolated from the rest of the distribution, an approach Klumpp found problematic. It duplicates shared libraries and other components, makes the running system harder to profile, and, quite simply, bypasses the distribution rather than working with it.

Beaucoup bundles

Larsson and others pursued many of the ideas set out in Poettering's post, with some changes, of course, and the result is the xdg-app tool set, which we looked at in January. Limba bears some resemblance to xdg-app in the broad strokes. Both models bundle the application into a self-contained package that is isolated from the rest of the operating system using various technologies common to other containerization efforts, like SELinux, control groups, and namespaces (specifically, filesystem, process, and network namespaces).

The xdg-app sandbox design, however, relies on the host system providing a static "runtime" environment that will provide each app bundle with the libraries and system resources it requires. The GNOME work in this area has been focused on defining a "GNOME runtime" tailored to each GNOME release; the 3.16 runtime, for example, would receive critical security updates, but app bundles using it can count on it remaining unchanged even after the GNOME 3.18 release and its accompanying runtime make their debut. Other vendors (including, perhaps, Linux distributions as well as other application platforms like KDE) could produce their own runtimes that could also serve as targets for app bundles.

In contrast, Limba's approach lets each application distributor define its own runtime tailored to its app bundle. The runtime definition is dependency-based, much like existing package dependencies are specified today. Thus, the pieces within the runtime can be updated if a new version of some dependency is released.

Practically speaking, these dependencies are specified with XML according to the AppStream metadata format. When the Limba bundle is executed, the controller process is responsible for merging the contents of the bundle, the local libraries, and potentially other Limba bundles into a single OverlayFS filesystem that satisfies the bundle's dependencies. This OverlayFS filesystem is then set up in a mount namespace, inside of which the app bundle is launched.
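As a rough illustration of that merge step, the toy model below treats each layer as a mapping from paths to a label and resolves a path by searching the layers from top to bottom, which is how a union filesystem such as OverlayFS presents stacked directories. The layer names, the paths, the stacking order, and the resolve() helper are all invented for illustration; Limba leaves the real work to the kernel's OverlayFS rather than doing anything like this in user space.

```python
from typing import Optional

# Hypothetical layers, topmost first. Each entry pairs a layer name with
# the set of paths that layer provides. The order shown is an assumption.
LAYERS = [
    ("app bundle", {"/app/bin/game"}),
    ("dependency bundle", {"/usr/lib/libfoo.so.2"}),
    ("host system", {"/usr/lib/libfoo.so.2", "/usr/lib/libc.so.6"}),
]


def resolve(path: str) -> Optional[str]:
    """Return the name of the topmost layer that provides path, or None."""
    for name, paths in LAYERS:
        if path in paths:
            return name
    return None


if __name__ == "__main__":
    for path in ("/usr/lib/libfoo.so.2", "/usr/lib/libc.so.6", "/usr/lib/libbar.so"):
        print(path, "->", resolve(path))
```

In this model, a library shipped in a bundle shadows the host's copy at the same path, while anything the bundles do not provide falls through to the host system.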

A Limba bundle's dependencies can be specified as "version X or greater," in addition to matching a specific package version. The bundle can also include its own libraries (a feature that is also supported by xdg-app) for those packages not provided by a distribution repository.
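The two kinds of version constraint can be sketched in a few lines. The example below is only a model of the idea, not Limba's resolver: the "==" and ">=" relation strings, the package names, and the helper functions are assumptions made for illustration, and versions are assumed to be simple dotted numbers.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.10.2' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed: str, relation: str, wanted: str) -> bool:
    """Check one constraint; relation is '==' (exact) or '>=' (X or greater)."""
    have, want = parse_version(installed), parse_version(wanted)
    # Pad with zeros so that '1.2' and '1.2.0' compare as equal.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    if relation == "==":
        return have == want
    if relation == ">=":
        return have >= want
    raise ValueError(f"unknown relation: {relation}")


def unsatisfied(deps: Dict[str, Tuple[str, str]],
                available: Dict[str, str]) -> List[str]:
    """List the dependencies that are missing or fail their constraint."""
    return [name for name, (relation, wanted) in deps.items()
            if name not in available
            or not satisfies(available[name], relation, wanted)]


if __name__ == "__main__":
    deps = {"libfoo": (">=", "1.2"), "libbar": ("==", "3.0.1")}
    print(unsatisfied(deps, {"libfoo": "1.10", "libbar": "3.0.2"}))
```

Comparing integer tuples rather than strings matters here: as strings, "1.10" would sort before "1.2" and the "or greater" check would give the wrong answer.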

Thus, the Limba bundle represents something of a middle ground between the locally installed package as provided by most distributions today and the containerized bundle of xdg-app. In a March 30 blog post, Klumpp highlighted some trade-offs he sees between the two designs:

While the static runtime of XdgApp projects makes testing simple, it is also harder to extend and more difficult to update. If something you need is not provided by the mega-runtime, you will have to provide it by yourself (e.g. we will have some applications ship smaller shared libraries with their binaries, as they are not part of the big runtime).

Limba does not have this issue, but instead, with its dynamic runtimes, relies on upstreams behaving nice and not breaking ABIs in security updates, so existing applications continue to be working even with newer software components.

He also points out that a Limba bundle could be made to mimic the behavior of an xdg-app bundle by specifying an exact dependency set that happens to, say, match the contents of the GNOME 3.16 xdg-app runtime.

Current status

Neither Limba nor xdg-app is production-ready yet, of course; both are still undergoing plenty of changes in development. But both are testable. Klumpp released version 0.4.0 in February, which has subsequently been followed by two minor updates. Included at present are the lipkgen tool, which both generates a Limba bundle template and builds the final package; the lipa tool to install bundles; and the runapp command to launch an installed bundle. Because Limba uses OverlayFS, the system does require a kernel built with support for that filesystem.

In the March blog post, Klumpp also provided a real-world application bundle for those interested in testing Limba. The application in question is Neverball, the same video game that Larsson released in an xdg-app bundle in February. Whether or not this series of events will lead to a rise in Neverball bundle tournaments at development events has yet to be seen; hopefully bundles with other types of application will be forthcoming.

It will be interesting to see where Limba goes next. At the moment, it does not have as much of a sandbox system in which to isolate the bundled application, but that is on the agenda and it is worth noting that xdg-app's sandbox is still under construction as well.

Considering how much overlap there is between the designs, one might be tempted to think that the success of the projects would come down to how software distributors feel about the runtime issue. If GNOME (or another project) makes a compelling argument for its all-in-one runtime, that could draw more developers toward xdg-app. But distributors might find it easier to work within the existing package-distribution model, which Limba targets.

In the longer term, though, both projects make a case that their respective bundle format provides a stable platform that the application developer can count on. Whether the large-runtime approach of xdg-app or the more traditional system-provided package environment of Limba provides better stability is something only time will tell.

Realtime using the PRU

By Jake Edge
April 8, 2015

ELC 2015

Realtime applications on Linux are generally run on the RT_PREEMPT kernel, but Ron Birkett presented an alternative at the 2015 Embedded Linux Conference (ELC) in San Jose: using the programmable realtime unit (PRU) available on some ARM chips. It is, in fact, a popular alternative, as several of the Linux drone makers presenting at the conference were using the PRU to offload various realtime tasks from Linux. That was the main difference Birkett said he noticed from ELC Europe last October—there, he was the only one talking about PRUs.

Birkett introduced himself as a firmware developer, "not a Linux kernel guy", working on Sitara ARM processors for TI. The PRU was specifically designed as a special-purpose RISC processor to help with realtime requirements by minimizing response latency. Beyond that, it is "really cool" what the PRU can do when you hook it up to Linux.

Realtime does not mean "ultra fast", he reminded attendees. Instead, it means that there is determinism in the system; that events will happen when the system designer thinks they will happen. If you need to get up for work at 6am every day or lose your job, he said, that is a realtime requirement at some level. Typically, though, realtime means more than just deterministic latency; it also means that the response time will be quite low.

But the PRU has not always existed, and we have been making realtime systems for years. Why does the PRU do determinism better than other options? That is a question he was setting out to answer in the talk, he said.

Realtime is typically only one element of the complex systems we are building these days. Getting realtime response requires trading something off, which is throughput. But what if you want both throughput and realtime response? Normally, you can't have both.

The AM335x system-on-chip (SoC) family (as found in the BeagleBone Black) has a Cortex-A8 CPU that is designed for throughput. It has a long, deep pipeline and multiple levels of memory and cache. But cache throws away determinism, he said. That leads to people using a 2GHz processor to solve a 200MHz problem, because they have to look at the worst-case latencies with a cold cache, nothing in the processor pipeline, and slow RAM access.

Enter the PRU. Typically, there are two PRU cores, each with its own dedicated instruction and data RAM, without any caching. There is a dedicated interconnect between the two PRUs, with some shared RAM, an interrupt controller, and some peripherals. The PRU does connect out to the rest of the system, but you lose determinism when you do so. Putting everything required for the realtime portion of the application inside the PRU "box" gives assurances on access times. In the PRU, access to the instruction RAM takes one cycle and the data and shared RAM can be accessed in three cycles.

Each PRU core is a 32-bit RISC processor that runs at 200MHz. There is no pipeline and instructions are executed in a single 5ns cycle. He showed the system diagram for one particular flavor of PRU, which had a scratchpad to move the entire register set between the PRU cores, an interrupt controller, some dedicated peripherals (e.g. a UART), and a fast I/O interface that has 30 general-purpose (GP) inputs and 32 GP outputs per PRU core.
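The timing figures from the talk follow directly from the clock rate, as the short calculation below shows. It is only a sketch of the arithmetic; the function names are invented here, and it assumes that nothing else stalls an access.

```python
PRU_CLOCK_HZ = 200_000_000  # 200MHz, as stated in the talk


def cycle_ns(clock_hz: int) -> float:
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def cycles_to_ns(cycles: int, clock_hz: int = PRU_CLOCK_HZ) -> float:
    """Time taken by a given number of cycles at the given clock rate."""
    return cycles * cycle_ns(clock_hz)


if __name__ == "__main__":
    print("one instruction or instruction-RAM access:", cycles_to_ns(1), "ns")
    print("data or shared RAM access (3 cycles):", cycles_to_ns(3), "ns")
```

At 200MHz, one cycle is 5ns, so the three-cycle accesses to data and shared RAM take 15ns. Because there is no cache, those numbers hold for every access rather than only for the best case.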

The GPIO controllers have direct access to the pins, unlike other processors where there are multiple levels of controllers and other hardware between the processor and the pin. There are several different input modes including a 16-bit parallel capture. By way of comparison, his team wrote a small program to simply blink an LED attached to a GP pin for both the CPU and the PRU. Looking at the output on an oscilloscope, the 1GHz Cortex-A8 could transition the pin in 200ns, while the PRU could do it in 5ns.

There are lots of different things that can be done with such a device, Birkett said. For drones, one PRU is often used to handle the radio-control interface, while the other is used to drive the pulse-width modulation (PWM) for the motors. There are also dedicated peripherals as part of the PRU that can be used as an extra UART, timer, or PWM controller that is accessible from the Cortex-A8.

The PRU does not support interrupts. Instead, it must poll the interrupt controller to determine if an interrupt has occurred. Polling is more deterministic; asynchronous interrupts can cause jitter in the execution time.
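A toy model makes the determinism argument concrete. The sketch below assumes a polling loop that checks the interrupt controller once every fixed number of cycles, starting at cycle zero; the function names and the loop length used in the example are hypothetical and do not come from the talk.

```python
def detection_latency(event_cycle: int, poll_period: int) -> int:
    """Cycles between an event and the next poll that observes it.

    Polls are assumed to happen at cycles 0, poll_period, 2 * poll_period, ...
    """
    remainder = event_cycle % poll_period
    return 0 if remainder == 0 else poll_period - remainder


def worst_case_latency(poll_period: int) -> int:
    """Largest latency over every possible event position within one period."""
    return max(detection_latency(t, poll_period) for t in range(poll_period))


if __name__ == "__main__":
    period = 4  # hypothetical loop length in cycles
    print("event at cycle 10 waits", detection_latency(10, period), "cycles")
    print("worst case:", worst_case_latency(period), "cycles")
```

In this model the delay before an event is noticed can never exceed the length of the loop, and the loop itself is never preempted partway through, so the worst case is known in advance.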

So the PRU makes a great complement to a high-end core like the Cortex-A8, he said. PRUs are available in the AM335x and AM437x and planned for more SoCs in the future.

An audience member asked about support for I²C on the PRU. Birkett said that it is easily done in software on the PRU, as are SPI and other communication protocols. There are no open-source implementations, yet, but there are plans to release code for those over time.

It is not possible to run Linux on the PRU—it doesn't make sense to do so even if you could. Linux will run on the main CPU and communicate with the PRU using interrupts or messages. There is only 8KB of instruction RAM available, so some kind of bare-metal stack in C or assembly makes the most sense. You could run a small realtime operating system (RTOS), but even that might be difficult.

There is a C compiler for the PRU available to use, though it is not free software. There have been a few years' worth of work on optimization put into the compiler, so it generates "pretty good code" at this point, Birkett said. There is a GCC version available too, though it lacks the optimizations that the TI compiler provides.

Linux's role is to load the firmware for the program into the PRU's instruction RAM, initialize the resources (e.g. memory, interrupts) for the device, and manage its execution. Meanwhile, Linux can continue doing whatever general-purpose processing it needs to.

Linux and the PRU communicate using interrupts via the remoteproc framework or with messages using rpmsg on top of virtio. Birkett noted that he had often heard kernel developers say that new features should not be added, but that existing facilities should be enhanced, if possible, instead. That is why remoteproc and rpmsg were chosen to be supported for the PRU.

As Birkett noted at the outset, the PRU was mentioned in several other talks throughout the conference. Since Dronecode was one of the themes at ELC this year, and the BeagleBone Black was a common platform for drones, the PRU came up frequently. It frees Linux up to do other tasks, such as computer vision processing, mapping, navigation, video streaming, and the like. Since the realtime needs for drones tend to be small and specialized, offloading them to hardware targeted for that kind of task seems to make a great deal of sense. Other use cases are undoubtedly out there as well.

[I would like to thank the Linux Foundation for travel support to San Jose for ELC.]

Page editor: Jonathan Corbet