LWN.net Weekly Edition for March 16, 2017
Notes from Linaro Connect
The first of two 2017 Linaro Connect events was held March 6 to 10 in Budapest, Hungary; your editor had the privilege of attending. Reports from a number of the sessions there have appeared in separate articles. There were a number of discussions at the event that, while not being enough to fill an article on their own, were nevertheless worthy of some attention.
Connect is an interesting event, in that it is a combination of an architecture-specific kernel developers' gathering and a members-only meeting session. Not being a member, your editor only participated in the former aspect. Sessions at Connect are usually short (25 minutes) and focused on a specific topic; they also routinely run over their allotted time. There is an emphasis on discussion, especially in the relatively unstructured "hack sessions" that occupy much of the schedule. Many of the sessions are focused on training: how to upstream code, for example, or kernel debugging stories in Mandarin (video).
A lack of Mandarin language skills will prevent coverage of that last session, but there were others that were somewhat more comprehensible to the Mandarin-impaired.
Kernelci.org
Mark Brown, Tyler Baker, and Matt Hart ran a session about kernelci.org, arguably one of the community's most underappreciated testing resources. This operation, run by Linaro and BayLibre, performs automatic build-and-boot testing for the kernel; its infrastructure is hosted by Linaro and Collabora. Tests are run on the ARM, ARM64, MIPS, and x86 architectures. For every commit that is made, tests are run with every in-tree defconfig file — over 260 builds for every commit. The resulting kernels are booted on over 100 different boards. This operation, Brown said, greatly increases the likelihood that kernels will build and, as a result, the number of failed configurations is going down over time. That, in turn, makes merge windows less stressful.
As Baker explained, this testing structure is driven by the LAVA tool; it serves as a scheduler and job runner for board farms. LAVA is used and distributed by Debian, he said.
LAVA is completing a big transition to the v2 release, which makes a number of significant changes, Hart said. Job files are now created with Jinja2 templates, an improvement over the hand-written JSON used in v1. Jobs are run asynchronously, without polling, and ZeroMQ is used for communications. ReactOBus is used to run jobs from messages. LAVA v1 tried to apply a fair amount of magic to hide the differences between different test systems, but that proved hard to work with. So v2 requires more explicit configuration in this area.
The v2 system is settling in, but a permanent home for the ReactOBus daemon is yet to be identified. [Video]
Load tracking
Vincent Guittot ran a session on load tracking — keeping track of how much load each process puts on the CPU. Accurate load tracking can help the scheduler make better decisions about task placement in the system; it can also be helpful when trying to minimize the system's power consumption. The per-entity load tracking (PELT) mechanism in the kernel is better than what came before, but it is proving to not work as well as desired, especially when it comes to power management. The window-assisted load tracking (WALT) patches (described in this article) improve the situation, but that work has not made it into the mainline.
The complaints with PELT are that it is not responsive enough and that its metrics are not always stable. The tracking of small tasks can be inaccurate, causing a mostly idle CPU to appear to be busy. Load is not propagated between control groups when a task is migrated which, among other things, can cause erroneous CPU-frequency changes. The good news is that around 20 patches improving PELT have been merged since 4.7; they fix the small-task tracking and load-propagation issues. Upcoming work should improve the handling of blocked loads and address some of the frequency scaling issues.
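The responsiveness complaint can be made concrete with a small model of PELT's geometric decay. The sketch below is a simplified floating-point approximation, not the kernel's code: it assumes the mainline design in which time is divided into periods of roughly 1ms (1024µs) and each period's contribution is decayed by a factor y chosen so that y^32 = 0.5. Those constants come from the kernel's implementation rather than from the session, and the function names are made up for illustration.

```python
# Simplified model of PELT-style geometric decay (illustrative only; the
# kernel uses fixed-point arithmetic and also accounts for partial periods).
Y = 0.5 ** (1 / 32)  # decay factor per ~1ms period, chosen so that Y**32 == 0.5


def ramp_up(periods):
    """Tracked utilization (0..1) after `periods` fully busy periods from idle."""
    util = 0.0
    for _ in range(periods):
        # Decay the accumulated history, then add the newest fully busy period.
        util = util * Y + (1.0 - Y)
    return util


def periods_to_reach(target):
    """Number of fully busy periods needed before utilization reaches `target`."""
    util = 0.0
    periods = 0
    while util < target:
        util = util * Y + (1.0 - Y)
        periods += 1
    return periods


if __name__ == "__main__":
    print(f"after 32 busy periods: {ramp_up(32):.3f}")
    print(f"periods to reach 80%: {periods_to_reach(0.8)}")
```

In this model a task that suddenly becomes fully busy needs about 32 periods before its tracked utilization reaches half of the maximum, which gives a feel for why the signal is seen as slow to respond.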
A related problem is that the community has lacked realistic benchmarks to measure the results of load-tracking changes; that is being addressed with new tests. There are some interesting interactions with the processor's thermal management mechanisms, though. When asked, Guittot said that there has not been a lot of power-consumption testing so far; most of the work to this point has been focused on performance.
Future work includes improving utilization tracking for realtime tasks, which are currently not part of the load-tracking mechanism. There are also some practical problems on current hardware. Realtime tasks want to run at the maximum frequency, but a frequency change on a HiKey board takes 1.5ms. A realtime task needing 2ms of run time will not get maximum-frequency performance. A more responsive load-tracking mechanism could help the scheduler ensure that the CPU is running at the needed speed. There is also a focus on improving responsiveness, which comes down to ensuring that the CPU frequency is increased quickly when the need arises. A slow ramp-up will lead to observable behaviors like jumpy scrolling. Finally, there is a desire to improve the responsiveness of the PELT system, perhaps by introducing the windowing technique used in WALT. [Slides]
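The HiKey numbers mentioned above (a 1.5ms frequency change and a task needing 2ms of run time) can be turned into a rough estimate. The sketch below assumes that the frequency change is requested when the task starts, that the CPU keeps running at the old frequency until the change completes, and that execution speed scales linearly with frequency. The 600MHz and 1200MHz values are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def completion_time_ms(work_ms_at_max, switch_latency_ms, low_freq, max_freq):
    """Estimate how long a task takes when the frequency ramp-up is delayed.

    work_ms_at_max: run time the task would need at max_freq.
    switch_latency_ms: time the CPU stays at low_freq before the change lands.
    Assumes execution speed is proportional to CPU frequency.
    """
    speed = low_freq / max_freq  # relative speed while still at low_freq
    done_before_switch = switch_latency_ms * speed
    if done_before_switch >= work_ms_at_max:
        # The task finishes before the frequency change takes effect.
        return work_ms_at_max / speed
    # Remaining work runs at full speed once the change has completed.
    return switch_latency_ms + (work_ms_at_max - done_before_switch)


if __name__ == "__main__":
    # Hypothetical frequencies; latency and run time are the figures quoted above.
    print(completion_time_ms(2.0, 1.5, low_freq=600, max_freq=1200))
```

Under these assumptions most of the task's execution happens below the maximum frequency, so the run takes noticeably longer than the 2ms it would need at full speed.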
ARM Mali drivers
Many ARM systems come equipped with the ARM Mali GPU. In a session on the state of the Mali drivers, John Reitan and Ketil Johnsen described some of the work that is being done in this area. There is a software development kit for the Vulkan graphics API available under the BSD license. Not all GPUs support Vulkan, though; in particular, the HiKey board used by many ARM developers has no Vulkan support. Within a month or so ARM should be releasing its compute library that allows running code on the GPU. It may be useful for image-recognition tasks and more. It will show up on the ARM GitHub page once it's ready.
With the news items out of the way, the audience quickly moved the discussion to the topic its members were really interested in: prospects for open-sourcing the Mali driver code. The answer was that ARM has no intention of doing so, mostly out of fear of unspecified "patent issues". The risk of patent trolls is simply too great to allow that release to happen. This was not, of course, the answer that the audience wanted to hear, but nobody was particularly surprised.
Arnd Bergmann suggested that perhaps a free Vulkan driver could be released; Vulkan is simpler than the full OpenGL API and might thus pose a lower risk. The speakers are not lawyers and could not respond to that suggestion beyond agreeing that it is worth considering. Meanwhile, there is a possibility that free drivers for some subcomponents could be released in the relatively near future.
A related pain point around Mali is the lack of device-tree bindings in the kernel. The normal rule is that bindings are only accepted for drivers that are, themselves, in the kernel; there is no Mali driver there, thus no bindings. But that has led every SoC vendor to come up with its own customized bindings. There has been talk of loosening the rules a bit to allow the addition of bindings for some out-of-tree drivers to reduce this pain.
John Stultz pointed out that running the Mali drivers on mainline kernels is often difficult, and wondered if there were any improvements expected in that area. Development effort on the binary-only driver tends to be focused on kernels the customers are using, and those kernels are usually old. Internally, the Mali driver does usually work on the mainline, but it can take months for the patches to get out to the rest of the world.
It's also hard for distributors who would like to make the binary-only driver available to their users. One recent improvement, at least, is that the license on that driver has changed to allow it to be distributed. But it is still difficult to make a package that works on even a subset of boards. Meanwhile, every driver release tends to break systems, and the driver tends to break with kernel updates. As Grant Likely pointed out, having to keep the kernel and the user-space driver code in lockstep makes the creation of any sort of generic distribution difficult. It was agreed that a better job needs to be done here. [Video]
For those interested in other Connect sessions, the full set of videos and slides from the "BUD17" event can be found on this page. The next Connect will be held September 25 to 29 in San Francisco, California.
[Thanks to Linaro and the Linux Foundation for funding your editor's travel to Connect.]
2038: only 21 years away
Sometimes it seems that things have gone relatively quiet on the year-2038 front. But time keeps moving forward, and the point in early 2038 when 32-bit time_t values can no longer represent times correctly is now less than 21 years away. That may seem like a long time, but the relatively long life cycle of many embedded systems means that some systems deployed today will still be in service when that deadline hits. One of the developers leading the effort to address this problem is Arnd Bergmann; at Linaro Connect 2017 he gave an update on where that work stands.
That work, he said, is proceeding on three separate fronts, the first of which is the kernel itself. He has been working for the last five years to try to prepare the kernel for 2038. Much of that work involves converting 32-bit timestamps to 64-bit values, even on 32-bit systems. Some 32-bit timestamps also show up in the user-space API, which complicates the issue considerably. There is a plan for the enhancement of the user-space API with 2038-clean versions of the problematic system calls, but it has not yet gotten upstream. One recent exception is the statx() system call, which was merged for 4.11; statx() will serve as the year-2038-capable version of the stat() family of calls. There are quite a few other system calls still needing 2038-clean replacements, though.
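For readers who want to see where the early-2038 deadline comes from, the following sketch works out the last second that a signed 32-bit time_t can represent and what the value becomes one second later. It is an illustration in Python rather than kernel code, and the helper names are invented for the example.

```python
import struct
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit time_t can hold


def wrap_to_int32(seconds):
    """Reinterpret a seconds count as a signed 32-bit integer, as a 32-bit time_t would."""
    return struct.unpack("<i", struct.pack("<I", seconds & 0xFFFFFFFF))[0]


def to_utc(seconds):
    """Convert seconds since the epoch (possibly negative) to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


last_good = to_utc(INT32_MAX)
after_wrap = to_utc(wrap_to_int32(INT32_MAX + 1))

if __name__ == "__main__":
    print("last representable second:", last_good.isoformat())
    print("one second later wraps to:", after_wrap.isoformat())
```

The last representable moment is 03:14:07 UTC on January 19, 2038; one second later the wrapped value lands in December 1901.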
There is one other person, Deepa Dinamani, working on the kernel side of things; she started as an Outreachy intern and has continued to work on the problem after the internship ended. Dinamani has a set of virtual filesystem layer patches in hand, which address one of the hardest problems, and she has plans for some other system calls as well. One of the trickier ones might be setsockopt(), which isn't easily fixed or emulated at the glibc level. There are device-mapper and input subsystem patches in an advanced state. Bergmann had a patch for the video4linux subsystem, but that was rejected and needs a new approach; a similar situation exists for the audio subsystem. Other areas needing work in the kernel are key management and realtime clocks.
For some system calls, there will be no replacement, since the best approach appears to be emulation in the C libraries — the second focus for the year-2038 effort. There has been a lot of work done in the glibc community in particular, he said; the plan is to be fully backward compatible at that level. That means that it will be possible to build a program with either 32-bit or 64-bit timestamps, and to use the larger timestamps even on older kernels. In other words, the glibc developers are trying to make things work everywhere, with a minimum of disruption. (See this draft design document for lots of details on the glibc plan.)
The third focus is on distribution builds, which can only really proceed once the first two pieces are in place. Most distributors, Bergmann said, are unlikely to even bother with 32-bit systems in 2038, so they won't have much to worry about. One big exception may be Debian, which seems interested in maintaining support, even though it looks like it will be a painful task. It may require a full rebuild at some point, which isn't much fun for anybody involved, but it is at least a process that is known to work. Compatibility is key in such a system; there is code being deployed now that may not be 2038-clean, but people want it to keep working if at all possible.
One big area of concern is automobiles. A lot of devices, such as handsets, will have long since failed for any of a number of other reasons by then, so there is little point in ensuring that they can survive 2038. But people keep cars going for a long time. There may still be cameras in use by then, and there are highly likely to be many deeply embedded systems, such as building infrastructure, still running. Some of these systems are going to fail in 2038. That is why it is important to get the problem fixed as soon as possible.
There are some things that are going to be difficult to fix, though, even when the kernel, C libraries, and distributions are mostly taken care of; many of these are the result of the use of 32-bit time_t values in file formats. Thus, for example, cpio will fail, which is problematic because it is used by the RPM package format. The NFSv3, ext3, and XFS filesystems all have problems resulting from their use of 32-bit timestamps. The first two are likely to have gone out of use long before the problem hits, and plans for the repair of XFS are in the works. Then, of course, there is a whole list of applications that nobody has yet noticed are broken, and lots of in-house applications that cannot be fixed by the community.
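The file-format problem is different from the system-call problem because the field width is fixed by the format itself. The sketch below uses a hypothetical record layout with a 4-byte signed timestamp field; it is not the actual cpio, ext3, or XFS layout, only an illustration of what happens when a format reserves 32 bits for a time value.

```python
import struct

# Hypothetical on-disk layouts: a signed little-endian timestamp stored in
# either 4 bytes (the problematic case) or 8 bytes (a widened field).
NARROW = struct.Struct("<i")
WIDE = struct.Struct("<q")


def encode_mtime(seconds, layout=NARROW):
    """Pack a seconds-since-epoch value into the fixed-width timestamp field."""
    return layout.pack(seconds)


def fits(seconds, layout=NARROW):
    """Return True if the timestamp can be stored in the given layout."""
    try:
        encode_mtime(seconds, layout)
        return True
    except struct.error:
        return False


if __name__ == "__main__":
    first_bad = 2**31  # one second past the signed 32-bit limit
    print("fits in 4 bytes:", fits(first_bad))
    print("fits in 8 bytes:", fits(first_bad, WIDE))
```

Widening the field fixes new files, but every reader and writer of the format must agree on the new layout, which is why such formats cannot be repaired by changes to the kernel or the C library alone.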
When asked which tools he is using for this work, Bergmann replied that his core technique involves building kernels with the 32-bit time types removed completely. That will quickly point out the places that still need to be fixed. Beyond that, he said, it's mostly a manual process. It was suggested that sparse or GCC plugins could maybe help with this task.
As things wound down, John Stultz asked how much the work in the BSD camp, which has (in some variants) already solved its year-2038 problems, might help with Linux. The answer would appear to be "not much". BSD-based distributions have the advantage of being able to rebuild everything from scratch, so they do not need to maintain user-space ABI compatibility in the same way. There is some value in the work that has been done to fix applications with year-2038 problems, but it's not clear how much that will help the Linux community.
[Thanks to Linaro and the Linux Foundation for funding your editor's travel to Connect.]
Page editor: Jonathan Corbet