
LWN.net Weekly Edition for September 17, 2020

Welcome to the LWN.net Weekly Edition for September 17, 2020

This edition contains the following feature content:

  • Key signing in the pandemic era: Debian rethinks its key-signing requirements.
  • BPF in GCC: the state of BPF support in the GNU toolchain.
  • OpenPGP in Rust: the Sequoia project: a new OpenPGP implementation approaches its 1.0 release.
  • Android kernel notes from LPC 2020: progress toward running Android on mainline kernels.
  • Modernizing the tasklet API: a new callback convention, and a possible path toward removing tasklets entirely.
  • News from PHP: releases, features, and syntax: the latest from the PHP development community.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Key signing in the pandemic era

By Jake Edge
September 16, 2020

The pandemic has changed many things in our communities, even though distance has always played a big role in free software development. Annual in-person gatherings for conferences and the like are generally paused at the moment, but even after travel and congregating become reasonable again, face-to-face meetings may be less frequent. There are both positives and negatives to that outcome, of course, but some rethinking will be in order if that comes to pass. The process of key signing is something that may need to change as well; the Debian project, which uses signed keys, has been discussing the subject.

In early August, Enrico Zini posted a note to the debian-project mailing list about people who are trying to get involved in Debian, but who are lacking the necessary credentials in the form of an OpenPGP key signed by other Debian project members. The requirements for becoming a Debian Maintainer (DM) or Debian Developer (DD) both involve keys with signatures from existing DDs; two signatures for becoming a DD or one for becoming a DM. Those are not the only steps toward becoming formal members of Debian, but they are ones that may be hampering those who are trying to do so right now.

DDs and DMs use their keys to sign packages that are being uploaded to the Debian repository, so the project needs to have some assurance that the keys are valid and are controlled by someone that is not trying to undermine the project or its users. In addition, votes in Debian (for project leaders and general resolutions) are made using the keys. They are a fundamental part of the Debian infrastructure.

Individual DDs have their own policies regarding when they are willing to sign someone's key, Zini said, but they often require meeting in person and showing government-issued identification. "Meeting in person has always been a good safe bet, if only for the [reason] that it's been accepted without question for many years." That is difficult to do these days, so it makes sense to think about alternatives:

For example, speaking of myself only, if my goal is to raise the cost of impersonation or sock puppet identities, then probably signing someone's key after having worked with them online for a significant time, would require a much higher cost than showing up at a keysigning party with a fake ID good enough to fool me. [...]

I think the world has changed enough in the last months that currently perceived project expectations about key signing are getting out of alignment with practical realities, and it might be time to explore other options.

That sparked a long discussion. Many participants were glad that Zini had raised the subject. It turns out that there is a fair amount of diversity in people's requirements before they are willing to sign the key of someone relatively new to the community.

Federico Ceratto wondered about "the real threat that we want to mitigate" with the key-signature requirement. He suggested that it was "a malicious DD uploads a package containing a backdoor". Jonas Smedegaard added "a malicious DD votes twice". But, as Russ Allbery pointed out, the votes in Debian are generally not particularly close—and may not be consequential enough outside of the project to be worth the effort of sabotaging them.

What have we voted on that you think anyone would care sufficiently about to do the tedious and time-consuming work required to get a fake identity with voting privileges?

I'm dubious of the threat model. Injecting malicious code into the archive seems to have a far, far higher reward to effort ratio than voting in our rare and generally not very close project votes.

Johannes Schauer described his situation with a prospective DM; that person has been working with him for a few months, signing Git commits and tags, as well as email, with their key. He doesn't think that a government ID makes any real difference in whether he should trust (and thus sign) that key:

Why would it be wrong of me to sign the key of this person? No matter who is behind that key: the person with that key has shown to produce great contributions for a couple of months *or* there is a really dedicated evil person trying some scheme over a really long period of time with me. If the latter is the case, would a person with that much commitment not also be able to fool me with a fake national ID?

He suggested that any key-signing policy be based on prospective new members establishing their key as being associated with work benefiting the project over a period of a few months. Smedegaard described his methodology as: "I will sign the key of someone whom I feel I would be able to recognize if randomly bumping into them years later on a bus". That generally comes from spending a little time with the person face to face, but he has occasionally been able to get comfortable about signing keys based solely on online experiences with someone.

The standard of "doing useful work" associated with a particular key was considered reasonable by multiple participants in the thread. Alexandre Viau noted that he became a DD because he "seemed to produce work that is good enough to be let in the archive", but that he still needed to meet with two other DDs in coffee shops to "prove" that he "received or intercepted emails" to the proper address in order to get his key signed. He will be signing the key of someone he is sponsoring as a DM even though he does not plan to ask for ID or to meet in person:

Feel free to attribute whatever value that you want to that signature. I think that given my history with that person it holds much more values than the 2-minutes KSP [key-signing party] ones.

Alberto Garcia said that signing a key is not really the same thing as trusting the person who holds that key; it is simply a verification that you have communicated securely with that person using that key.

That means that you communicate with that person in a trusted way, not that you necessarily have to trust what they do. And it doesn't even matter if the name written on the key is the same that appears on the passport or ID card (people can use a different name for a variety of reasons).

The lack of a requirement for government-issued ID as part of the key-signing process surprised Adrian Bunk, who thought it was "the sole fixed requirement for keysigning". He wondered why there was any key-signing requirement without specifying that photo IDs must be presented and scrutinized. As noted multiple times in the thread, though, most people are not experts in detecting forged IDs so the value of examining them is somewhat limited.

As he has been doing over the last year or two, former project leader Sam Hartman summarized the thread. It went in a lot of different directions, as he described:

I don't think we were seeking a consensus, and we didn't find one. What we did find is a number of approaches that seem to have sufficient support. If one of those works for you as a person contemplating signing a key, my take is that you should go for it.

One idea that came up in response to Hartman's summary was from "Ángel", who suggested using expiration dates on the PGP signatures as a way to work around the current inability to meet in person. More permanent signatures could be used after that. Another suggestion, from Pierre-Elliott Bécue, was to lower the number of required signatures from one to zero for new DMs, but to increase the number of DD sponsors needed from one to two or three.

Given that DDs are free to sign keys based on their own criteria, there will need to be other mechanisms used to enforce any rules on the keys that get accepted into the DD and DM keyrings. In mid-September, Zini posted a message on behalf of the Debian account managers (DAM) that described that group's policies going forward. It explicitly removes the signature requirement, replacing it with: "The person controlling the GPG key needs to have an established track record of work within/for the project."

Key signing will still be done, as it provides evidence that a person only has a single identity in Debian. DAM is formalizing that idea ("A natural person may only have one identity in Debian."), but it is not explicitly requiring key signing to enforce it. Instead, it is introducing the idea of a "key endorsement", where project members can explicitly state that they have interacted with someone using a particular key, along with the details of that interaction. That information can be used when deciding whether to grant DD or DM status. In addition:

If your key has no trust path towards the Debian Web of Trust when you are applying, we will require that you GPG-sign a statement saying that the identity of the person controlling the key corresponds to what is in at least one key User ID, and that the person does not already have a DM or DD account under a different name.

Key endorsements mean that one can join Debian with a key that is not connected to their legal identity - as long as the key is connected to a significant history and reputation within Debian. We however still strongly encourage people to cross-sign keys as much as possible.

Overall, the changes represent a reasonable middle ground without usurping any of the rights that DDs have as individuals to determine how and when they sign keys. As the announcement put it:

This mail effectively moves the entry barrier from "meet 2 random people, somewhere" to "you are represented by the work you did and do in Debian". We believe that this fits better both the current COVID-19 situation, and the general do-ocracy attitude of Debian.

So what started out as a way to perhaps temporarily handle the problems associated with the current pandemic has effectively morphed into a more flexible system overall. New Debian contributors in places where project members are scarce have always been at something of a disadvantage by needing to travel to a conference or other gathering for their signatures. Now anyone can potentially become a DD or DM without ever leaving the comfort of home.

Comments (3 posted)

BPF in GCC

By Jake Edge
September 15, 2020

LPC

The BPF virtual machine is being used ever more widely in the kernel, but it has not been a target for GCC until recently. BPF is currently generated using the LLVM compiler suite. Jose E. Marchesi gave a pair of presentations as part of the GNU Tools track at the 2020 Linux Plumbers Conference (LPC) that provided attendees with a look at the BPF for GCC project, which started around a year ago. It has made some significant progress, but there is, of course, more to do.

There are three phases envisioned for the project. The first is to add the BPF target to the GNU toolchain. Next up is to ensure that the generated programs pass the kernel's verifier, so that they can be loaded into the kernel. That will also require effort to keep it working, Marchesi said, because the BPF world moves extremely fast. The last phase is to provide additional tools for BPF developers, beyond just a compiler and assembler, such as debuggers and simulators.

Binutils support for BPF has been upstream since August 2019, while the GCC BPF backend was added in September 2019. In August 2020, just a few weeks before he gave the talk, support for BPF in GDB was added along with a simulator for it. The simulator is intended to be used with a board file for the DejaGnu test framework in order to run the GCC test suite for the BPF backend.

The binutils support is complete at this point, he said, as is the GCC backend. The GDB support is basic; it only handles loading a BPF program, single stepping, setting breakpoints, examining BPF registers, and listing the assembler code. Similarly, the simulator support is basic; it is integrated with GDB and most of the BPF instructions are supported so programs can run on it. There are plans to add various kinds of kernel contexts to the simulator so that different BPF program types can be run; for example, a program to be attached to a kprobe would be provided with a kprobe context in the simulator. The simulator also needs support for more kernel helper functions beyond just printk().

In addition to support for standard BPF, the project has been adding support for "experimental BPF" (or xBPF, though he said "the name is not important"). It adds features that get around the limitations imposed by the kernel verifier so that the full GCC test suite can be run. There are thousands of GCC tests that will not even build for BPF because they need features the language does not support, such as indirect calls and functions that need to do their own save and restore of the register values. Beyond just the compiler tests, xBPF extensions are needed for debugging BPF in GDB using the DWARF format; that will allow getting backtraces from BPF programs, for example.

With those pieces in place, Marchesi said, the project has turned to support for BPF type format (BTF), which is the debugging format used by BPF. It is similar to Compact C type format (CTF), which has been added to the GNU toolchain; CTF and BTF share a common ancestor. When the -g option to GCC is used to request debugging information, it should generate BTF and not DWARF for BPF programs. BTF is integral to the "compile once, run everywhere" (CO-RE) plan for being able to run BPF programs on multiple kernel versions.

The basic problem is that when BPF is compiled, it uses a set of kernel headers that describe various kernel data structures for that particular version, which may be different from those on the kernel where the program is run. Until relatively recently, that was solved by distributing the BPF as C code along with the Clang compiler to build the BPF on the system where it was going to be run. Now, the compiler generates BPF along with BTF information about the data structures and struct members being used so that the BPF loader can fix up those references when it loads the program on a different kernel version. "It is amazing that it works", but it seems to work well, he said.
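
For readers who have not seen CO-RE in action, here is a minimal, hypothetical sketch of the kind of BPF program it enables; it is not taken from the talk, and it uses libbpf's BPF_CORE_READ() macro along with a vmlinux.h header generated from the running kernel's BTF (today this path is built with Clang, the toolchain CO-RE currently targets):

    /* Hypothetical CO-RE example: read a field from a kernel structure.
     * The field offset is recorded as a BTF relocation and patched by the
     * loader for whatever kernel the program ends up running on. */
    #include "vmlinux.h"            /* kernel types generated from BTF */
    #include <bpf/bpf_helpers.h>
    #include <bpf/bpf_core_read.h>

    SEC("kprobe/do_sys_openat2")
    int probe_open(struct pt_regs *ctx)
    {
            struct task_struct *task = (struct task_struct *)bpf_get_current_task();
            pid_t pid = BPF_CORE_READ(task, pid);   /* offset fixed up at load time */

            bpf_printk("openat2() called by pid %d", pid);
            return 0;
    }

    char LICENSE[] SEC("license") = "GPL";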

By looking at the LLVM BPF backend, Marchesi learned that there is a class for handling debugging information as part of the intermediate representation (IR). Adding a new debugging format is a matter of extending that class to support it; currently LLVM supports DWARF, CodeView, and BTF. The situation for GCC is completely different. There is something called debug_hooks, but it is called from many different places within the compiler: from the frontend, backends, link-time optimization (LTO) stage, and so on. There are several different formats that can be generated via those debug_hooks, many of which are legacy (e.g. VMS dbg, DBX); his team recently wanted to add CTF support as well.

Initially, the plan was to simply extend the current scheme, which is "very old", he said, for CTF, but the feedback from the GCC maintainers was that the legacy format support should not hold things back. It was suggested that using the DWARF support was the right path forward for adding new formats, so CTF and BTF will be added via that mechanism. Once it is working, older formats can be ported to the new scheme and, eventually, the older mechanism can be removed. One thing that needs to be considered in all of this is that the new debugging formats (e.g. BTF, CTF) are compact, while DWARF is large and comprehensive; when only generating one of the more compact formats, the overhead of using the DWARF path may be problematic.

There was a fair amount of overlap between the two talks (first talk: raw video [YouTube] and slides [PDF]; second talk: raw video [YouTube] and slides [PDF]), but the second was targeted at engaging the LLVM and BPF developers at LPC. That effort was not particularly successful, as attendees from those projects were apparently busy elsewhere at the conference. But the issues raised will need to be resolved at some point. This thread on the BPF mailing list indicates that there may be some resistance to the xBPF plan, however.

There were three separate items that Marchesi raised in that second talk. The first is with regard to the declaration of BPF helper functions, which are auto-generated into bpf_helpers.h. The existing declarations look like this:

    static __u32 (*bpf_get_prandom_u32)(void) = (void *) 7;

Both GCC and LLVM fail if those declarations are used without optimization level two (i.e. -O2) or higher; LLVM generates an invalid instruction and GCC emits an error. Instead of the helper number being cast to a void *, the GCC hackers have come up with a kernel_helper attribute that will allow the declaration to work at any optimization level:

    static __u32 (*bpf_get_prandom_u32)(void)
        __attribute__((kernel_helper(7)));

He wondered if LLVM could use the same solution; it is more robust than the existing code, he said. In the chat, Mark Wielaard suggested that silence meant assent, but in reality, until that can be worked out, GCC will have to do its best to support the existing declaration, Marchesi said.

The BPF FAQ says that there is no signed division instruction because it would be rarely used. He wondered, though, why it would be a frequently asked question if signed division was so infrequently used. In any case, its lack is a big problem for the GCC test suite, which has lots of tests that use signed division. So support for four instructions (sdiv, smod, and their 32-bit versions) was added to xBPF. He asked if the BPF developers would reconsider signed division. In the chat, Lorenz Bauer said that adding the support would be tricky due to the kernel's support for just-in-time (JIT) compilation of BPF.

Adding those instructions to xBPF brings up another problem area, Marchesi said. Instructions have to be assigned to opcodes, which are a finite resource. If BPF expands in the kernel, it could use opcodes that xBPF has already used. For example, the signed division instructions are currently using the last two available opcodes in each of the ALU and ALU64 instruction classes. He wondered if there could be a range of opcodes that were set aside for extensions. Many instruction classes are running out of opcodes, but there are 23 in LD and 28 in ST that are available, so perhaps some space could be found there. Those questions will be taken to the mailing lists, Marchesi said, to try to resolve them that way.

Having a second toolchain that supports BPF will clearly be a benefit; both GCC and LLVM have gotten better over the years due to their "competition". It would seem that the approach taken by the GCC hackers is different, at least from a testing standpoint, than that taken by LLVM; those areas will need to be worked out before too long. Beyond that, though, the GCC BPF simulator and GDB support will bring new tools to the table for BPF developers.

Comments (13 posted)

OpenPGP in Rust: the Sequoia project

By John Coggeshall
September 11, 2020

In 2018, three former GnuPG developers began work on Sequoia, a new implementation of OpenPGP in Rust. OpenPGP is an open standard for data encryption, often used for secure email; GnuPG is an implementation of that standard. The GPLv2-licensed Sequoia is heading toward version 1.0, with a handful of issues remaining to be addressed. The project's founders believe that GnuPG, the de facto standard implementation of OpenPGP today, leaves much to be desired. They hope to fix this with a reimplementation of the specification using a language with features that will help protect users from common types of memory bugs.

While GnuPG is the most popular OpenPGP implementation — especially for Linux — there are others, including OpenKeychain, OpenPGP.js, and RNP. OpenPGP has been criticized for years (such as this blog post from 2014, and another from 2019); the Sequoia project is working to build modern OpenPGP tooling that addresses many of those complaints. Sequoia has already been adopted by several other projects, including keys.openpgp.org, OpenPGP CA, koverto, Pijul, and KIPA.

Sequoia was started by Neal H. Walfield, Justus Winter, and Kai Michaelis; each worked on GnuPG for about two years. In a 2018 presentation [YouTube] (slides [PDF]) Walfield discussed their motivations for the new project. In his opinion, GnuPG is "hard to modify" — mostly due to its organic growth over the decades. Walfield pointed out the tight coupling between components in GnuPG and the lack of unit testing as specific problem areas. As an example, he noted that the GnuPG command-line tool and the corresponding application libraries do not have the same abilities; there are things that can only be done using the command-line tool.

Community is a big part of the Sequoia project, as Walfield explained in his presentation. The project is financially backed by the p≡p (pep) and Wau Holland foundations, with "all development done in the open". Before code was even written, the project founders met with prominent members of the OpenPGP community as well as end users to discuss the project's plans and make sure their approach was sound. The project's current status, Git repository, and issue tracker are all available. The repository logs indicate approximately 30 contributors to the project and, since announcing the pending version 1.0 release in April 2020, three releases have been made. The latest release, version 0.19.0, was made in August 2020 — its most notable improvement is the inclusion of Windows Cryptography API: Next Generation (CNG) as a backend, replacing Nettle which has issues in non-POSIX environments.

Because it is written in Rust, Sequoia benefits from all of the memory-safety advantages that the language provides. The repository shows significant efforts are being made to write unit tests to prevent regressions and improve quality. Unlike GnuPG, where the command-line gpg tool is more powerful than the library, Sequoia is an OpenPGP library first; all of its functionality is available via the exposed APIs. The project plans to provide two "levels" of API, a low-level unopinionated implementation of the OpenPGP specification and a high-level API with sensible defaults to make common tasks like signing and signature verification easier for users. Walfield's presentation was clear that, while the project endeavored to be unopinionated on the low-level API implementation, it does avoid the outdated parts of the specification such as the use of MD5 hashes.

Sequoia is targeted at "modern platforms" including Linux, Windows, macOS, Android, and iOS. When possible, this includes using existing cryptographic tooling; it is a design goal of the project to tightly integrate with platform-specific cryptographic services. For example, Sequoia plans to make use of the Secure Enclave coprocessor on iOS devices when available. It also provides a foreign function interface (FFI) for integrating the project with programs written in other languages. Currently, Sequoia offers C and Python FFI bindings. Readers should note that programs using the bindings lack memory safety, so the project's rules must be followed in order to use them correctly.

Walfield responded to my question on the Sequoia development mailing list with information about the current state of the project and the upcoming version 1.0 release:

First, for our upcoming 1.0 release (anytime now (tm), although the API and feature set have been stable for months; we are just documenting everything, and carefully reviewing the API & code), we are only releasing the low-level API, i.e., the sequoia-openpgp crate, and its dependencies.

This means readers shouldn't expect Sequoia to be an end-user replacement for tools like GnuPG just yet; the first major release will be focused on a library for developers. For the project to be ready for end users, it will need the equivalent of the gpg command found in GnuPG. For Sequoia, this is the sq command-line tool, which is not included for version 1.0. Walfield described sq as "missing quite a bit of functionality," explaining "we want to reserve the right to change its interface" before it is released to the public.

The sq tool isn't the only thing missing from the version 1.0 release; key-storage services are still in progress. According to Walfield, it is one of the project's "top priorities" after the version 1.0 release.

Key storage is one of the areas where Sequoia's plans are unique compared to other OpenPGP implementations. For added security, the project plans to use process separation between the services handling public and private keys; Cap'n Proto will be used for interprocess communication. Walfield notes in his presentation that process separation is not always possible, such as in iOS environments. When it is unavailable, Sequoia plans to use a shared SQLite database to communicate between the two services in a process Walfield described as "colocation".

Conceptually, Sequoia takes an identity-based approach to its public keyrings, where the keyring is designed to be "more like a per-domain address book than a PGP keyring." Keys are designed to be stored and accessed using a user-assigned Petname, with the ability to associate arbitrary structured data that will be useful in the implementation of trust models. Walfield also argues that this approach is more in line with how users actually think about keys: associated with names, rather than a collection of abstract IDs. Further, all keys are assigned a "realm" in Sequoia indicating the intended purpose of the key; presently realms include "contacts" and "software update key" designations. When completed, Sequoia's keyring service will automatically update public keys from remote servers (similar to parcimonie), ensuring that changes like new sub-keys and revocations are discovered in a timely fashion. The API documentation indicates this can be done using anonymized services like Tor in addition to more common TLS-encryption methods.

The private keyring service is slated to offer an optional one-password solution for unlocking local keys. The library will provide access to the private keyring using what Walfield describes as a "Smartcard-like" API. Currently Sequoia does not support Smartcards, but based on this ticket, the community would like to see it happen in the future. Sequoia's private keyring service will be written with forward secrecy in mind, by using the OpenPGP specifications supporting the distinction between data "at rest" (encrypted storage) and data "in motion" (encrypted transmission), which is not found in many other implementations. It is a sound security practice to frequently rotate "in motion" keys, but that needs to be done in a way that still allows archived encrypted data to be decrypted. A presentation (slides [PDF]) by Winter compares the project's forward secrecy features to other OpenPGP implementations.

Wrapping up

Those who are interested in getting involved in the project should be aware that it requires contributors to assign their copyright to the p≡p foundation. The project further states that code may then end up being published under multiple licenses, but "all software is also published under a GNU License as Free Software, without exception."

In all, it is good to see a project focused on making OpenPGP a more easily accessible technology, and it looks like the project has made steady progress since it started almost three years ago. The project's documentation also provides a reasonable starting point for using the library in applications. That said, Sequoia has a ways to go before it will be a trusted cryptographic tool; for starters, the code still needs to be audited. The project's status page states it "has not been audited yet, but as soon as we release the core Sequoia crate, it will be audited by a third party." Readers may be interested in checking out the project's contributors page, which provides details on their mailing list and IRC channel. As Walfield said, there isn't a timetable for when Sequoia version 1.0 will land, but it does sound like it will be soon. End users of OpenPGP will still have to wait a bit longer, however, before Sequoia becomes a viable alternative.

Comments (41 posted)

Android kernel notes from LPC 2020

By Jonathan Corbet
September 10, 2020

LPC

In its early days, the Android project experienced a high-profile disconnect with the kernel community. That situation has since improved considerably, but there are still differences between Android kernels and the mainline. As a result, it is not possible to run Android on a vanilla kernel. That situation continues to improve, though; much evidence to that effect was on display during the Android microconference at the 2020 Linux Plumbers Conference. Several sessions there showed the progress that is being made toward unifying the Android and mainline kernels — and the places where there is still some work to be done.

The generic kernel image

Todd Kjos started things off by introducing the Android Generic Kernel Image (GKI) effort, which is aimed at reducing Android's kernel-fragmentation problem in general. It is the next step for the Android Common Kernel, which is based on the mainline long-term support (LTS) releases with a number of patches added on top. These patches vary from Android-specific, out-of-tree features to fixes cherry-picked from mainline releases. The end result is that the Android Common Kernel diverges somewhat from the LTS releases on which it is based.

From there, things get worse. Vendors pick up this kernel and apply their own changes — often significant, core-kernel changes — to create a vendor kernel. The original-equipment manufacturers begin with that kernel when creating a device based on the vendor's chips, but then add changes of their own to create the OEM kernel that is shipped with a device to the consumer. The end result of all this patching is that every device has its own kernel, meaning that there are thousands of different "Android" kernels in use.

There are a lot of costs to this arrangement, Kjos said. Fragmentation makes it harder to ensure that all devices are running current kernels — or even that they get security updates. New platform releases require a new kernel, which raises the cost of upgrading an existing device to a new Android version. Fixes applied by vendors and OEMs often do not make it back into the mainline, making things worse for everybody.

The Android developers would like to fix this fragmentation problem; the path toward that goal involves providing a single generic kernel in binary form (the GKI) that all devices would use. Any vendor-specific or device-specific code that is not in the mainline kernel will need to be shipped in the form of kernel modules to be loaded into the GKI. That means that Android is explicitly encouraging vendor modules, Kjos said; the result is a cleaner kernel without the sorts of core-kernel modifications that ship on many devices now.

This policy has already resulted in more vendors actively working to upstream their code. That code often does not take the form that mainline developers would like to see; some of it is just patches exporting symbols. That has created some tension in the development community, he said.

He concluded by saying that the Android 11 release requires all devices to ship with kernels based on the Android Common Kernel; Android 12 will require shipping with the GKI instead. Tim Bird asked how vendors plan to cope when a patch they need isn't integrated into the mainline or the Android Common Kernel; Kjos answered that the current plan is to add vendor hooks via tracepoints. The details, though, have not yet been worked out.

ABI enforcement

Later, Matthias Männich talked about GKI ABI enforcement, the purpose of which is to ensure a stable ABI for modules so that GKI updates do not end up breaking devices in the field. This is not a simple task; the kernel ABI is large, and it is hard to catch changes in every part of it. He emphasized that this work is in no way trying to stabilize the mainline kernel ABI, or even the ABI for LTS kernels. It is only intended to keep the kernel ABI stable within a specific Android version.

While ABI changes are not welcome in GKI updates, configuration changes are allowed as long as they don't change the interface as seen by modules. The kernel and modules are all built with a single toolchain using a "hermetic build" process wherein all needed libraries are provided independently of the system the kernel is built on. Compiler updates are carefully examined to ensure that they will not result in any ABI changes; Android would rather not upgrade than risk problems, he said.

Within the ABI itself, the goal is to keep everything that is observable stable. That task is obviously easier if the set of observable aspects is minimized; kernel symbol namespaces help in that regard. They also help to prevent kernel symbols from being used accidentally. The kernel-module interface is established by looking at the symbols that are actually used by vendor modules; those naturally have to be exported. Everything that turns out not to be used is trimmed from the GKI, though, making it unavailable. When a vendor needs a new symbol, a request is made to the Android Open Source Project; assuming the request makes sense, the symbol will appear in a subsequent GKI update.
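
As a rough illustration of the namespace mechanism (not something shown in the talk), a vendor module using a namespaced GKI export might look like the sketch below; the symbol and namespace names here are invented:

    /* Hypothetical vendor module using a symbol exported into a namespace.
     * On the kernel side, the export would look something like:
     *     EXPORT_SYMBOL_NS_GPL(gki_do_something, GKI_VENDOR);
     */
    #include <linux/module.h>
    #include <linux/init.h>

    extern int gki_do_something(void);   /* made-up namespaced GKI export */

    /* Without this declaration, modpost complains that the module uses a
     * symbol from a namespace it never imported. */
    MODULE_IMPORT_NS(GKI_VENDOR);

    static int __init vendor_init(void)
    {
            return gki_do_something();
    }

    static void __exit vendor_exit(void)
    {
    }

    module_init(vendor_init);
    module_exit(vendor_exit);
    MODULE_LICENSE("GPL");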

Android on mainline

Sumit Semwal talked for a while about what it takes to boot Android on a mainline kernel. It turns out that, in the generic case, there is only one patch needed at this point: anonymous VMA naming. The Android kernel also requires inline encryption, but that has been merged for the 5.9 release.

The situation gets more complicated on real hardware, of course. For devices using the Snapdragon 845 system-on-chip, a number of out-of-tree drivers are required. One of them, the lt9611 HDMI bridge driver used on Pixel 3 devices, has been queued for the 5.10 merge window. The Xiaomi Pocophone F1 can run on the 5.9-rc1 kernel with just a few patches for the touchscreen, WiFi, and audio devices.

Android may, in theory, be able to boot with a single patch to the mainline kernel, but the project is still carrying 485 patches on top of the 5.9-rc kernel, he said. About 30 of those are currently being discussed for merging; 78 of them are intended to be upstreamed. Another 25 are being worked on by Linaro with the intent of getting them upstream. There are 54 patches that will eventually be replaced by alternatives; these include the ION memory allocator. That leaves 260 patches currently not on a path for upstreaming; many of them have to do with the GKI build or configuration changes. There are ten patches that could be considered for upstream, but they need an upstream user as well.

The most active discussions around upstreaming currently focus on features like inline encryption (now merged) and the incremental filesystem (discussed further below). Anonymous VMA naming was first posted by Colin Cross in 2013, but has yet to be merged; a new effort to merge that work is underway now. DMA-buf heaps are moving forward as a replacement for ION.

Work that is not yet going upstream includes DRM notifiers, which lack an in-kernel user and thus will not be considered for merging; that patch is not being posted currently. More DMA-buf heap providers will be needed to fully replace ION. These, too, lack in-kernel users; changing that will require vendors to upstream their drivers that need those features.

Incremental filesystem

One significant Android feature that has not yet seen much discussion in the mainline is the incremental filesystem; Paul Lawrence ran a brief session dedicated to this work. The goal behind the incremental filesystem is to allow the launch of a newly downloaded app to happen immediately, even if the process of downloading the app to the device has not yet completed. To make that happen, files that are being downloaded are made to appear as if they were already present on the device. Reads from such a file will succeed if the relevant blocks are present; otherwise the app will have to wait until those blocks show up.

Files on an incremental filesystem are read-only, but the filesystem itself is not. A file's blocks can be delivered out of order, and the filesystem will keep track of things accordingly. It is implemented as a stacking filesystem, meaning that there is a "real" filesystem underneath where the files are ultimately stored. Most directory operations are passed through directly to the underlying filesystem, while reads require interpreting the file and returning the expected data (once it is available). Writes (only allowed when the file is being created) are done via a special ioctl() call.

This work was first posted to the lists in 2019, but it has not yet received much serious consideration. Expect new versions in the near future as the Android project works to get this feature into the mainline kernel.

Comments (10 posted)

Modernizing the tasklet API

September 14, 2020

This article was contributed by Marta Rybczyńska

Tasklets offer a deferred-execution method in the Linux kernel; they have been available since the 2.3 development series. They allow interrupt handlers to schedule further work to be executed as soon as possible after the handler itself finishes. The tasklet API has its shortcomings, but it has stayed in place while other deferred-execution methods, including workqueues, have been introduced. Recently, Kees Cook posted a security-inspired patch set (also including work from Romain Perier) to improve the tasklet API. This change is uncontroversial, but it provoked a discussion that might lead to the removal of the tasklet API in the (not so distant) future.

The need for tasklets and other deferred execution mechanisms comes from the way the kernel handles interrupts. An interrupt is (usually) caused by some hardware event; when it happens, the execution of the current task is suspended and the interrupt handler takes the CPU. Before the introduction of threaded interrupts, the interrupt handler had to perform the minimum necessary operations (like accessing the hardware registers to silence the interrupt) and then call an appropriate deferred-work mechanism to take care of just about everything else that needed to be done. Threaded interrupts, yet another import from the realtime preemption work, move the handler to a kernel thread that is scheduled in the usual way; this feature was merged for the 2.6.30 kernel, by which time tasklets were well established.

An interrupt handler will schedule a tasklet when there is some work to be done at a later time. The kernel then runs the tasklet when possible, typically when the interrupt handler finishes or when the task returns to user space. The tasklet callback runs in atomic context, inside a software interrupt, meaning that it cannot sleep or access user-space data, so not all work can be done in a tasklet handler. Also, the kernel only allows one instance of any given tasklet to be running at any given time; multiple different tasklet callbacks can run in parallel. Those limitations of tasklets are not present in more recent deferred-work mechanisms like workqueues. Even so, the current kernel contains more than a hundred users of tasklets.
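
To make the discussion concrete, here is a minimal, hypothetical driver fragment using the traditional tasklet API; the device structure and all names are invented, and the callback receives the opaque unsigned long value that Cook's patches do away with:

    #include <linux/interrupt.h>

    struct my_dev {
            struct tasklet_struct tl;
            /* ... device state ... */
    };

    /* Runs later, in softirq context; it must not sleep. */
    static void my_dev_tasklet(unsigned long data)
    {
            struct my_dev *dev = (struct my_dev *)data;   /* unchecked cast */

            /* ... process completed work for dev, recycle buffers, wake waiters ... */
    }

    static irqreturn_t my_dev_irq(int irq, void *arg)
    {
            struct my_dev *dev = arg;

            /* Acknowledge the device, then defer the rest of the work. */
            tasklet_schedule(&dev->tl);
            return IRQ_HANDLED;
    }

    static void my_dev_setup(struct my_dev *dev)
    {
            tasklet_init(&dev->tl, my_dev_tasklet, (unsigned long)dev);
    }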

Cook's patch set changes the parameter type for the tasklet's callback. In current kernels, callbacks take an unsigned long value that is specified when the tasklet is initialized. This is different from other kernel mechanisms with callbacks; the preferred way in current kernels is to use a pointer to a type-specific structure. The change Cook proposes goes in that direction by passing the tasklet context (struct tasklet_struct) to the callback. The goal behind this work is to avoid a number of problems, including the need to cast the unsigned long value to a different type (without proper type checking) in the callback. The change allows the removal of the (now) redundant data field from the tasklet structure. Finally, this change mitigates possible buffer-overflow attacks that could overwrite the callback pointer and the data field. This is likely one of the primary objectives, as the work was first posted (in 2019) on the kernel-hardening mailing list.

Plotting the removal of tasklets

The patch set caused no controversies, but the discussion changed direction following this comment from Peter Zijlstra, who said: "I would _MUCH_ rather see tasklets go the way of the dodo [...] Can't we stage an extinction event here instead?" In a response, Sebastian Andrzej Siewior suggested that tasklets could be replaced with threaded interrupts, as they also run in atomic context. Dmitry Torokhov suggested immediately expiring timers instead. Cook replied that the change could not be done mechanically and gave some examples of more complicated usage of tasklets. One such case is the AMD ccp crypto driver, which combines tasklets with DMA engines, while another is the Intel i915 GPU driver, which schedules GPU tasks with tasklets.

In the following messages, Thomas Gleixner "grudgingly" acked the patch set, but also spoke in favor of removing tasklets: "I'd rather see tasklets vanish from the planet completely, but that's going to be a daring feat." The developers agreed that removing tasklets would be a logical next step, but that this is a bigger task than improving their API. The Kernel Self-Protection Project has added a dedicated task for this objective.

The removal of the tasklet API has been discussed before; LWN covered it in 2007. At that time, the main argument for the removal of tasklets was to limit latencies (since tasklets run in software interrupt mode, they can block even the highest-priority tasks). The argument against removing tasklets was a possible performance loss for drivers that need to react quickly to events. At that time, threaded interrupts were not yet included in the mainline.

In current kernels, tasklets can be replaced by workqueues, timers, or threaded interrupts. If threaded interrupts are used, the work may just be executed in the interrupt handler itself. Those newer mechanisms do not have the disadvantages of tasklets and should satisfy the same needs, so developers do not see a reason to keep tasklets. It seems that any migration away from tasklets will be done one driver (or subsystem) at a time. For example, Takashi Iwai already reported having the conversion ready for sound drivers.
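
As an illustration of what such a conversion might look like, here is a hypothetical workqueue-based version of the same kind of deferral (again with invented names); unlike a tasklet callback, the work function runs in process context and is allowed to sleep:

    #include <linux/interrupt.h>
    #include <linux/workqueue.h>

    struct my_dev {
            struct work_struct work;
            /* ... device state ... */
    };

    /* Runs from a kernel worker thread; sleeping and taking mutexes is fine. */
    static void my_dev_work(struct work_struct *work)
    {
            struct my_dev *dev = container_of(work, struct my_dev, work);

            /* ... do the deferred processing for dev ... */
    }

    static irqreturn_t my_dev_irq(int irq, void *arg)
    {
            struct my_dev *dev = arg;

            schedule_work(&dev->work);
            return IRQ_HANDLED;
    }

    static void my_dev_setup(struct my_dev *dev)
    {
            INIT_WORK(&dev->work, my_dev_work);
    }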

Current API changes

While the removal of tasklets remains a longer-term goal, the developers are proceeding with the API changes. The modifications in the tasklet API performed by Cook's patch set are minimal and consist of creating a new initialization macro and adding one initialization function. In current kernels, tasklets are declared with:

     #define DECLARE_TASKLET(name, func, data) \
          struct tasklet_struct name = { NULL, 0, ATOMIC_INIT(0), func, data }

To allow compatibility with existing users, all calls to the "old" DECLARE_TASKLET() were changed to DECLARE_TASKLET_OLD with the following definition:

     #define DECLARE_TASKLET_OLD(name, _func)        \
          struct tasklet_struct name = {             \
               .count = ATOMIC_INIT(0),              \
               .func = _func,                        \
          }

The same modifications were done to the DECLARE_TASKLET_DISABLED() macro. The conversion to DECLARE_TASKLET_OLD() turned out to be mechanical, since all those users provided zero as the data parameter.

A following patch included a new version of the declaration macro that does not contain that data parameter:

     #define DECLARE_TASKLET(name, _callback)        \
          struct tasklet_struct name = {             \
               .count = ATOMIC_INIT(0),              \
               .callback = _callback,                \
               .use_callback = true,                 \
          }

In the new API, the callback function is stored in the callback field of the structure rather than in func; the callback itself simply takes a pointer to the tasklet_struct structure as its only argument:

    void (*callback)(struct tasklet_struct *t);

That structure will normally be embedded within a larger, user-specific structure, the pointer to which can be obtained with the container_of() macro in the usual way. The patch set also adds a function to initialize a tasklet at run time, with the following prototype:

     void tasklet_setup(struct tasklet_struct *t,
          void (*callback)(struct tasklet_struct *));

The tasklet subsystem will invoke the callback in either the new or the old mode, depending on how the tasklet was initialized; beyond that, the behavior of tasklets is unchanged.
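
Converting the hypothetical driver fragment shown earlier to the new interface would look roughly like this sketch (invented names, not code from the patch set):

    #include <linux/interrupt.h>

    struct my_dev {
            struct tasklet_struct tl;
            /* ... device state ... */
    };

    /* New-style callback: it receives the tasklet itself, no opaque data value. */
    static void my_dev_tasklet(struct tasklet_struct *t)
    {
            struct my_dev *dev = container_of(t, struct my_dev, tl);

            /* ... deferred processing using dev ... */
    }

    static void my_dev_setup(struct my_dev *dev)
    {
            tasklet_setup(&dev->tl, my_dev_tasklet);
    }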

Where to from here

The team working on the change submitted a number of patches to convert all tasklet initializations in the kernel to the new tasklet_setup() function. The remaining task is to remove tasklets from those users entirely; the work in some subsystems has already started. Developers are welcome to help with the conversion of all subsystems to the new API and, eventually, with the removal of all tasklet users from the kernel. There is certainly plenty of will on the part of the kernel developers to do so, but this is likely going to take a few kernel development cycles.

Comments (none posted)

News from PHP: releases, features, and syntax

By John Coggeshall
September 16, 2020

As the PHP project nears its 8.0 release, which is currently slated for late November, there are a number of interesting things to report from its development mailing list. For one, the syntax of the attributes feature has finally been settled on after an acrimonious debate largely over the minutiae of the voting process. In addition, some releases were made and a new proposal to add any() and all() as core library functions was discussed.

New releases and general news

The PHP project has recently released three new versions; two in the PHP 7 series (7.3.22 and 7.4.10) and PHP 8.0beta3. Both PHP 7 releases were for bug fixes, addressing approximately 20 issues, which can be seen in the release notes for 7.4.10 and 7.3.22. The most notable of these fixes addressed a language-wide memory leak when using compound assignments, along with crashes when xml_parser_free() and array_merge_recursive() are called.

While the project continues to provide bug-fix releases for PHP 7, development on PHP 8.0 is steaming ahead. The community has succeeded thus far in keeping with its release schedule; it is still on-target for general availability of PHP 8.0 on November 26. One noteworthy recent decision by the project was to drop support for OpenSSL version 1.0.1.

Originally, PHP 8.0beta3 was to be the last beta release before entering into the release-candidate (RC) phase, when implementation details regarding APIs and behavior should stop changing. That plan changed, however, at the request of Nikita Popov. In the request to release manager Sara Golemon, Popov said more time was needed, suggesting eliminating the final RC5 release in exchange for an extra beta release:

We're close to done with the warning to Error exception promotion task, but haven't really started on reviewing and consolidating parameter names yet (in preparation for named parameters). It would be good to have that work mostly finalized before RC1, so we can limit the number of nominally BC-breaking changes past RC1 (I expect we'll still fix some things that slipped through the cracks, but at least we should prevent any mass changes).

The two issues referred to by Popov are an effort to reclassify errors and warnings within the PHP engine and the necessary internals work to support the approved named arguments RFC. Dmitry Stogov agreed with Popov, explaining that it would give him more time to clean up the just-in-time compilation that is new in PHP 8.

Golemon approved the request, and the beta4 release is now scheduled for September 17. Since the scheduling change effectively swaps one planned release (RC5) for another (the new beta4), the final release date will remain unchanged for now; Golemon, however, isn't closing the door on pushing the final release to December 10.

This decision does have an impact on development beyond PHP 8.0, however. Until a release-candidate branch is created, no development past the 8.0 release can take place in the repository. Thus, this release-schedule change will delay developers who want to start committing features slated for PHP 8.1.

Proposal: any() and all()

Contributor Tyson Andre made a proposal for two new core language functions: any() and all(). These new functions are designed to work with the pseudo-type iterable, which is a PHP array or object implementing the Traversable interface. The intent behind any() and all() is to check each element of an iterable, returning a boolean value indicating whether any or all of the elements evaluate as true, respectively. This is similar to Python's any() and all(), except that the PHP version allows passing a function to be used as a predicate. Here is a look at how PHP currently solves this problem, and how a function like all() would be useful:

    // Check condition for $item (current PHP)
    foreach($my_list as $item) {
        if(!isOkay($item)) {
            throw new Exception("Invalid");
        }
    }

    // Same logic with the proposed all() function
    if(!all($my_list, fn($item) => isOkay($item))) {
        throw new Exception("Invalid");
    }

The idea has been around for a long time, but has never made it into the language. A conceptually similar implementation of all() and any() first appeared as a pull request in 2015. After two years, that pull request was closed due to a lack of feedback from the contributor. Then, in 2018, another attempt was made — again, it was abandoned by the contributor. The third time might be the charm, as it has received a fair amount of feedback from the community and is under active discussion. The last time the concept was proposed, Golemon felt it didn't need to be in core, but has yet to give an opinion on Andre's attempt; Larry Garfield has supported its inclusion in PHP 8.1, pending some changes. All in all, many different languages implement various interpretations of what Andre is proposing with these functions, making for a reasonable argument that PHP should do the same.

PHP attributes syntax conclusion

We recently wrote about the saga of attributes in the upcoming PHP 8.0 release. Attributes are used to attach metadata to classes, functions, object properties, and so on. There has been considerable debate over the feature's syntax. The original implementation used <<>>, as in <<Attribute('param1','param2')>>. The syntax was then changed to @@Attribute('param1', 'param2'), something that Derick Rethans has called "a terrible mistake" and, with the support of Benjamin Eberlei, hoped to change. On August 10, with the blessing of Golemon, Rethans opened the re-vote on attributes syntax via a new request for comments (RFC). The goal of the vote was to decide a new attributes syntax among four options proposed by Rethans and Eberlei: @@Attr, #[Attr], @[Attr], and <<Attr>>. As the votes began to be cast, the various parties continued to advocate for their opinion on how things should play out.

Theodore Brown, who originally proposed (and succeeded in) changing attribute syntax from <<Attr>> to @@Attr, was one of the more vocal voices opposing the effort to change the syntax again. He made it clear that he wasn't changing his mind. In an email regarding his vote to the mailing list, he called the RFC "fundamentally flawed." As voting continued, another mega-thread emerged as developers quibbled over their respective positions among the choices. Golemon spoke up to assert that those who were voting for the @[...] version of the syntax "are making a terrible choice", adding:

We have options with varying degrees of backward compatibility issues (<<>> none, @@ some, #[..] and @[..] a bit more than some), and only one which offers forward compatibility (#[..]). So why vote in favor of the option with the highest BC breaking probability and no FC?

The debate over the change again quickly escalated. Two days after voting had started, the fight transitioned into a parliamentary issue when Brown pointed out that the RFC "was rushed to vote after less than the minimum two week period required after it was brought up on the list." Technically, Brown was correct: the voting began approximately 30 hours short of two weeks' time. Golemon responded to Brown indicating that she wasn't inclined to derail the vote over it, but she was open to discussion:

So, 30 hours short of 2 weeks. I'm going to ascribe good intentions in trying to get the issue resolved in the minimal timeframe. The fact active discussion was ongoing makes this a questionable choice, but in my opinion, purely on a matter of time, quibbling over 30 hours is splitting hairs. Maybe compromise on adding time to the vote end period so that the total is greater than 4 weeks? [...] You're right to raise the concern. And let's wag a finger over it at least. If others agree that it's premature, we can stop the vote, but I'm not inclined to disrupt the process over such a small variance.

Rethans wrote that he chose to start the vote when he did based on Golemon's previously-expressed desire to have it done by beta3, but he miscalculated the date for the release. Rethans then added he had no problem extending the length of the vote per Golemon's suggestion.

That wasn't enough to satisfy Brown, who added that he felt the rules on resurrecting rejected proposals were also being violated. Brown claimed that, when he first proposed changing the attribute syntax, he omitted a syntax option he preferred (@:) because it had been previously rejected in an earlier failed attributes proposal. The proposal by Rethans and Eberlei did not include the @: syntax Brown omitted, but did include other previously-rejected options such as <<Attr>>. Brown felt that this made the RFC and vote invalid, writing:

But if we can vote again on '#[]' and '<<>>' after they were declined, why can't we also vote again for '@:'? This syntax has the advantage of being equally short as '@@' without any BC break.

Brown's complaints launched another mega-thread, including more gripes regarding the timing of the vote and even challenging the justification for the vote at all. Andreas Leathley, upon reviewing the RFC policies, argued that voting on the proposal was not opened 30 hours early, but rather eight days early:

After reading https://wiki.php.net/rfc/howto it is stated clearly there that an RFC has to be created and be "Under Discussion" for at least two weeks. So you were actually wrong that the RFC was one day early - it was at least 8 days early, as the RFC was created and announced on the 4th of August and then put to vote on the 10th of August.

After considerable heated discussion over the rules, Brown ultimately made four requests in an effort to resolve the impasse. He asked that the 50+ votes already cast be invalidated and that voting be restarted after the full two weeks had passed, that the vote include previously-rejected options such as @:, that the RFC add a section on the backward-compatibility impacts of each syntax option, and that the RFC provide a summary of the arguments made on the internals mailing list for and against each option.

Several of these requests were accepted without much challenge; the biggest point of contention between the parties was the invalidation of votes that had already been received. There was a fair amount of support from community members for invalidating the vote, but choosing that road meant the votes that had already been cast might not get recast in a second vote. By August 19, Rethans and Eberlei relented; the votes were invalidated and a new vote opened after an appropriate amount of time had passed. The new vote also included the additional information at the request of Brown, and was closely monitored by him for any potential irregularities.

For all of the back and forth, in the end Rethans and Eberlei got their way with the #[Attr] syntax winning out against the competition (33 out of 64 total votes cast). Due to all of the delays caused by the uproar, the feature was unable to make it into the PHP 8.0beta3 release as planned; it is expected to make its first appearance in PHP 8.0beta4.

Wrapping up

With the entire PHP internals community focused on getting PHP 8.0 released, discussions of features for PHP 8.1 (such as any() and all()) are a lower priority at the moment. Once PHP 8.0 reaches the release-candidate stage, the door will open for commits destined for PHP 8.1 and features targeted for that release will pick up speed. As long as another mega-debate over syntax doesn't erupt over the next couple of weeks, it looks likely there will be a new version of PHP available in time for the holidays.

Comments (6 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: GNOME 3.38; Moment.js "retires"; Quotes; ...
  • Announcements: Newsletters; conferences; security updates; kernel patches; ...
Next page: Brief items>>

Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds