
Leading items

Welcome to the LWN.net Weekly Edition for January 14, 2021

This edition contains the following feature content:

  • Debian discusses vendoring—again: the distribution once more wrestles with packages that bundle their dependencies.
  • A license change for Nmap: wording in the new Nmap Public Source License led distributions to question whether the scanner had become non-free.
  • Restricted DMA: a patch set to constrain device DMA on systems that lack an IOMMU.
  • Old compilers and old bugs: a long-fixed GCC bug resurfaces, raising the question of the kernel's minimum compiler version.
  • A possible step toward integrity measurement for Fedora: a proposal to ship IMA file signatures in Fedora's RPM packages.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Debian discusses vendoring—again

By Jake Edge
January 13, 2021

The problems with "vendoring" in packages—bundling dependencies rather than getting them from other packages—seem to crop up frequently these days. We looked at Debian's concerns about packaging Kubernetes and its myriad Go dependencies back in October. A more recent discussion in that distribution's community looks at another famously dependency-heavy ecosystem: JavaScript libraries from the npm repository. Even C-based ecosystems are not immune to the problem, as we saw with iproute2 and libbpf back in November; the discussion of vendoring seems likely to recur over the coming years.

Many application projects, particularly those written in languages like JavaScript, PHP, and Go, tend to have a rather large pile of dependencies. These projects typically simply download specific versions of the needed dependencies at build time. This works well for fast-moving projects using collections of fast-moving libraries and frameworks, but it works rather less well for traditional Linux distributions. So distribution projects have been trying to figure out how best to incorporate these types of applications.

This time around, Raphaël Hertzog raised the issue with regard to the Greenbone Security Assistant (gsa), which provides a web front-end to the OpenVAS vulnerability scanner (which is now known as Greenbone Vulnerability Management or gvm).

[...] the version currently in Debian no longer works with the latest gvm so we have to update it to the latest upstream release... but the latest upstream release has significant changes, in particular it now relies on yarn or npm from the node ecosystem to download all the node modules that it needs (and there are many of them, and there's no way that we will package them individually).

The Debian policy forbids download during the build so we can't run the upstream build system as is.

Hertzog suggested three possible solutions: collecting all of the dependencies into the Debian source package (though there would be problems creating the copyright file), moving the package to the contrib repository and adding a post-install step to download the dependencies, or removing gsa from Debian entirely. He is working on updating gsa as part of his work on Kali Linux, which is a Debian derivative that is focused on penetration testing and security auditing. Kali Linux does not have the same restrictions on downloading during builds that Debian has, so the Kali gsa package can simply use the upstream build process.

He would prefer to keep gsa in Debian, "but there's only so much busy-work that I'm willing to do to achieve this goal". He wondered if it made more sense for Debian to consider relaxing its requirements. But Jonas Smedegaard offered another possible approach: analyzing what packages are needed by gsa and then either using existing Debian packages for those dependencies or creating new ones for those that are not available. Hertzog was convinced that wouldn't be done, but Smedegaard said that the JavaScript team is already working on that process for multiple projects.

Hertzog ran the analysis script described on that page and pointed to the output from the package.json file of gsa. He said that it confirmed his belief that there are too many dependencies to package; "Even if you package everything, you will never ever have the right combination of version of the various packages."

To many, that list looks daunting at best, impossible at worst, but Smedegaard seemed unfazed, noting several reasons to believe that those dependencies can be handled. But Hertzog pointed out that the work is not of any real benefit, at least in his mind. He cannot justify spending lots of time packaging those npm modules (then maintaining them) for a single package "when said package was updated in Kali in a matter of hours". He thinks the distribution should focus its efforts elsewhere:

By trying to shoehorn node/go modules into Debian packages we are creating busy work with almost no value. We must go back to what is the value added by Debian and find ways to continue to provide this value while accepting the changed paradigm that some applications/ecosystems have embraced.

He said that Debian is failing to keep up with the paradigm change in these other ecosystems, which means that "many useful things" are not being packaged. Pirate Praveen agreed that there are useful things going unpackaged, but disagreed with Hertzog's approach of simply using the upstream download-and-build process. Praveen thinks that a mix of vendoring (bundling) for ultra-specific dependencies and creating packages for more generally useful modules is the right way forward. It comes down to distributions continuing to provide a particular service for their users:

All the current trends are making it easy for developers to ship code directly to users. Which encourages more isolation instead of collaboration between projects. It also makes it easy for shipping more proprietary code, duplication of security tracking or lack of it. Debian and other distributions have provided an important buffer between developers and users as we did not necessarily follow the priorities or choices of upstream developers exactly always.

One of the reasons Smedegaard felt that the dependencies for gsa could be handled via Debian packages is that gsa (and other large projects) tend to overspecify the versions required; in many cases, other versions (which might already be packaged for Debian) work just fine. But figuring that out is "a substantial amount of work", Josh Triplett said in a lengthy message. He cautioned against the "standard tangent" where complaints about the number of dependencies for these types of projects are aired.

[...] people will still use fine-grained packages and dependencies per the standard best-practices of those communities, no matter the number or content of mails in this thread suggesting otherwise. The extremes of "package for a one-line function" are not the primary issue here; not every fine-grained dependency is that small, and the issues raised in this mail still apply whether you have 200 dependencies or 600. So let's take it as a given that packages *will* have hundreds of library dependencies, and try to make that more feasible.

He said that disregarding a project's guidance on the versions for its dependencies is fraught, especially for dynamically typed languages where problems may only be detected at run time. For those ecosystems, the normal Debian practice of having only one version of a given library available may be getting in the way. Relaxing that requirement somewhat could be beneficial:

I'm not suggesting there should be 50 versions of a given library in the archive, but allowing 2-4 versions would greatly simplify packaging, and would allow such unification efforts to take place incrementally, via transitions *in the archive* and *in collaboration with upstream*, rather than *all at once before a new package can be uploaded*.

Triplett outlined the problems that developers encounter when trying to package a project of this sort. They can either try to make it work with the older libraries available in Debian, upgrade the libraries in Debian and fix all the resulting problems in every package that uses them, or simply bundle the required libraries. The first two are enormously difficult in most cases, so folks settle for bundling, which is undesirable but unavoidable:

Right now, Debian pushes back heavily on bundling, and *also* pushes back heavily on all of the things that would solve the problems with unbundled dependencies. That isn't sustainable. If we continue to push back on bundling, we need to improve our tools and processes and policies to make it feasible to maintain unbundled packages. Otherwise, we need to build tools and processes and policies around bundled dependencies. (Those processes could still include occasional requirements for unbundling, such as for security-sensitive libraries.)

Adrian Bunk was concerned about how security problems would be handled in a world with multiple library versions. He said that these ecosystems seem uninterested in supporting stable packages for three to five years, as is needed by distributions such as Debian stable or Ubuntu LTS. More library proliferation (version-wise) just means more work for Debian when the inevitable CVE comes along, he said.

Triplett responded that he is not expecting there to be a lot of different library versions, only that at times it might make sense to have more than one:

I'm talking about packaging xyz 1.3.1 and 2.0.1, as separate xyz-1 and xyz-2 packages, and allowing the use of both in build dependencies. Then, a package using xyz-1 can work with upstream to migrate to xyz-2, and when we have no more packages in the archive using xyz-1 we can drop it.

That's different from requiring *exactly one* version of xyz, forcing all packages to transition immediately, and preventing people from uploading packages because they don't fork upstream and port to different versions of dependencies.

It seems safe to say that few minds were changed in the course of the discussion. Bunk and Triplett seemed to talk past each other a fair bit. And no one spoke up with some wild new solution to these problems. But the problems are not going to disappear anytime soon—or ever. Without some kind of shift, bundling will likely be the path of least resistance, at least until some hideous security problem has to be fixed in enough different packages that bundling is further restricted or prohibited. That would, of course, then require a different solution.

The approach currently being taken by Smedegaard, Praveen, and others to tease out the dependencies into their own packages has its attractions, but scalability and feasibility within a volunteer-driven organization like Debian are not among them. The size and scope of the open-source-creating community is vastly larger than Debian or any of its language-specific teams, so it should not come as a surprise that the distribution is not keeping up. Debian is hardly alone with this problem either, of course; it is a problem that the Linux distribution community will continue to grapple with.

Comments (86 posted)

A license change for Nmap

By Jake Edge
January 13, 2021

It may be kind of an obvious statement, but licensing terms matter in our communities. Even a misplaced word or three can be fatal for a license, which is part of the motivation for the efforts to reduce license proliferation in free-software projects. Over the last few months, various distribution projects have been discussing changes made to the license for the Nmap network scanner; those changes seemed to be adding restrictions that would make the software non-free, though that was not the intent. But the incident does serve to show the importance of license clarity.

On October 3, Nmap 7.90 was released; it came with a new license, the Nmap Public Source License (NPSL) version 0.92. The link here goes to the Wayback Machine as the usual location for the NPSL was updated to version 0.93 in mid-January. Previous versions of Nmap were available under the GPLv2, with some additional wording with regard to the project's definition of a "derivative work".

The license change was noted openly in the release announcement and changelog for Nmap 7.90: "Upgraded the Nmap license [from] a sort of hacked-up version of GPLv2 to a cleaner and better organized version (still based on GPLv2) now called the Nmap Public Source License to avoid confusion." It did not take long for distributions to start noticing and reacting to the change. In a mid-October message on the development mailing list for the GNU Guix distribution, Marius Bakke asked whether the new license turned Nmap into non-free software:

...which states:
Proprietary vendors: This license does not allow you to redistribute Nmap source code or the executable for use with your software (stand alone or on an appliance).
...I'm fairly certain this is not an acceptable license for Guix, or free software distributions in general.

So I think we should revert the license change, as well as the update to 7.90 which introduced the new license.

There was general agreement that the text (which came from the annotations, not the license itself) constituted a "field-of-use" restriction, which runs afoul of "freedom 0" in the FSF Free Software Definition. Looking more closely, Bakke noted problematic text in the license itself:

Having re-read the original text (without the annotations), the thing that stands out is:
Proprietary software companies wishing to use or incorporate Covered Software within their programs must contact Licensor to purchase a separate license. Open source developers who wish to incorporate parts of Covered Software into free software with conflicting licenses may write Licensor to request a waiver of terms.

[...] So a "proprietary software company" cannot use or incorporate nmap within a program, even if that program is free (as in software)?

Those messages resulted in a Debian bug being opened, also positing that the new license was non-free. A Gentoo bug was also opened; eventually, that led to a GitHub issue being created in the Nmap repository in early December, which is where Nmap creator Gordon "Fyodor" Lyon apparently found out about the controversy.

Lyon said that he agreed that the wording in Section 0 was poor and did not reflect what the project was trying to express:

The intent is to simply explain that because the NPSL terms do not allow inclusion within proprietary software, "proprietary software vendors" (like anyone else) would have to purchase the Nmap OEM license to include it within their proprietary software. If a "proprietary software vendor" also releases free software in compliance with the NPSL, that is great and we certainly don't want or mean to take away that right from them. We are planning to rewrite this before the next Nmap release.

He also agreed that the wording in two other sections could be improved. In a later message, Lyon outlined the intent of the changes, which he seemed to indicate are meant to be free-software-friendly terms:

Our main goal with the license, in case it's not clear, is to allow unlimited free use of Nmap by end users (including commercial use), while charging companies who want to build and sell commercial products on top of Nmap through our Nmap OEM program. The goal has been to avoid the much more common tact of having a "free version" with limitations and then a non-free pro version with extra features. So we need to either have a license which supports the current business model, or switch to a different business model, or give up on a business model and lay off the developers who are currently working full time improving Nmap. So we're trying to find the best balance for everyone.

But several commenters in that GitHub issue were uncomfortable with the wording of his comment, which still sounds like a field-of-use restriction because "commercial products" could be free software and thus should not be restricted. As Micah Cowan noted, Lyon probably misspoke: "what was actually meant here, is 'while charging companies who want to build and sell proprietary commercial products'". Lyon agreed with that toward the end of December, and that's where things stood over the holidays and into January.

In early January, the Fedora distribution made a formal determination that the NPSL 0.92 license was not acceptable for software shipped by Fedora. Void Linux also chose to roll back to an earlier version of Nmap, from before the license change. But on January 12, Lyon announced a new version of the license that is meant to address the problematic clause:

Version 0.92 included:
Proprietary software companies wishing to use or incorporate Covered Software within their programs must contact Licensor to purchase a separate license.
While Version 0.93 rewrites that sentence as:
Companies wishing to use or incorporate Covered Software within their own products may find that our Nmap OEM product (https://nmap.org/oem/) better suits their needs.
I also updated the Nmap Changelog to note that we are retroactively offering the previous Nmap 7.90 and 7.91 releases under this newer version of the NPSL so users and distributors can choose to receive Nmap under either license.

Lyon also admitted that more work needed to be done on the NPSL, "possibly scrapping it altogether and moving to an already established license", but hoped that this fix would alleviate the immediate issues. That remains to be seen, as several commenters were still not entirely happy with some parts. As Fedora program manager Ben Cotton put it:

Thanks for being so responsive. The following change (I think it's a change) in the annotation is still of concern to Fedora:
This license does not allow for redistributing Nmap for use with (or incorporating it's source code within) proprietary hardware. This includes stand-alone software distribution or inclusion on a hardware appliance, docker container, virtual machine, etc.
It is pretty vague and could be interpreted, for example, to preclude Lenovo shipping Fedora Workstation with nmap pre-installed. Most hardware is proprietary to some degree, so this is a pretty broad restriction.

While that text is not part of the license, and is only an annotation, it is still relevant to any court interpretation of the NPSL, Cotton said. It may well be that an NPSL 0.94 is needed to fully patch the problem in the near term. After that, Nmap can hopefully take some more time to clean up the language or replace the license entirely. The project does have copyright assignments that allow it to relicense as needed.

At some level it is a bit surprising that things got to this point. According to the Nmap 7.90 changelog, the NPSL project got its start back in 2006, but proceeded in fits and starts after that, until the 7.90 release suddenly brought in the new license. It would seem that the license was discussed in the Nmap community in 2006 and 2013, at least, but there was, apparently, little wider review. Raising the profile of the proposed license change would likely have headed off many of the problems that cropped up more or less immediately.

This episode should probably serve as a warning to other free-software projects. Certainly those contemplating a license change should probably attempt to involve the wider community for review, perhaps by using the OSI license-discuss mailing list. In addition, proposed license "patches" might also benefit from wider review. Beyond that, though, there are good reasons to stick with known licenses. It would not seem to be that hard to find an existing license that fulfills the same goals as the NPSL. Doing so would make this kind of exercise much less fraught.

Comments (23 posted)

Restricted DMA

By Jonathan Corbet
January 7, 2021
A key component of system hardening is restricting access to memory; this extends to preventing the kernel itself from accessing or modifying much of the memory in the system most of the time. Memory that cannot be accessed cannot be read or changed by an attacker. On many systems, though, these restrictions do not apply to peripheral devices, which can happily use direct memory access (DMA) on most or all of the available memory. The recently posted restricted DMA patch set aims to reduce exposure to buggy or malicious device activity by tightening up control over the memory that DMA operations are allowed to access.

DMA allows devices to directly read from or write to memory in the system; it is needed to get reasonable I/O performance from anything but the slowest devices. Normally, the kernel is in charge of DMA operations; device drivers allocate buffers and instruct devices to perform I/O on those buffers, and everything works as expected. If the driver or the hardware contains bugs, though, the potential exists for DMA transfers to overwrite unrelated memory, leading to corrupted systems and unhappy users. Malicious (or compromised) hardware can use DMA to compromise the system the hardware is attached to, making users unhappier still; examples of this type of attack have been posted over the years.

One way to address this problem is to place an I/O memory-management unit (IOMMU) between devices and memory. The kernel programs the IOMMU to allow access to a specific region of memory; the IOMMU then keeps devices from straying outside of that region. Not all systems are equipped with an IOMMU, though; they are mostly limited to the larger processors found in desktop machines, data centers, and the like. Mobile systems usually lack an IOMMU.

The restricted DMA patch set, posted by Claire Chang, is an attempt to apply some control to DMA operations on systems without an IOMMU. To do so, it builds on an old, relatively obscure kernel mechanism called the "swiotlb", which stands for "software I/O translation lookaside buffer". The swiotlb was originally created to facilitate operations with devices that have annoying DMA limitations, such as the inability to address all of the memory in the system. The core mechanism used within the swiotlb is bounce buffering: allocating a buffer in a region that the device in question is able to access, then copying data between I/O buffers and this bounce buffer as needed. Copying the data clearly slows I/O operations, but it is far better than not using DMA at all.
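
The core bounce-buffering idea is simple enough to sketch in a few lines of ordinary C. This is conceptual code, not the kernel's swiotlb implementation: a static array stands in for the device-reachable memory region, and pool management is reduced to a bump allocator.

    #include <stddef.h>
    #include <string.h>

    /* Conceptual sketch only; the real swiotlb is considerably more involved. */
    #define POOL_SIZE (1 << 20)
    static char restricted_pool[POOL_SIZE];   /* stands in for device-reachable memory */
    static size_t pool_used;

    /* "Map" a kernel buffer for DMA by staging it through the restricted pool. */
    static void *bounce_map(const void *kernel_buf, size_t len, int to_device)
    {
        if (pool_used + len > POOL_SIZE)
            return NULL;                        /* pool exhausted */

        void *bounce = restricted_pool + pool_used;
        pool_used += len;
        if (to_device)
            memcpy(bounce, kernel_buf, len);    /* copy outgoing data into the pool */
        return bounce;                          /* the device transfers to/from here */
    }

    /* "Unmap" after the I/O completes: copy any results back to the real buffer. */
    static void bounce_unmap(const void *bounce, void *kernel_buf, size_t len,
                             int from_device)
    {
        if (from_device)
            memcpy(kernel_buf, bounce, len);    /* copy incoming data back out */
        /* A real implementation would also release the pool slot here. */
    }

In the kernel, this staging happens behind the regular DMA-mapping API, so a driver need not know whether its buffers are being bounced.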

Chang's patch set enhances the swiotlb by allowing it to allocate a specific range of physical memory and associate it with a given device; this range can be specified in a devicetree using the new restricted-dma-pool "compatible" property. All DMA operations involving that device will be bounced through that range of memory, effectively isolating devices from the actual I/O buffers seen by the rest of the system.

Using this kind of bounce-buffering offers some benefit on its own. Your editor, who has written device drivers in the past, would never have committed such an error, but it is not unheard of for driver bugs to result in a device performing DMA when the rest of the system thinks it should be idle. Having memory buffers seemingly randomly overwritten in unreproducible ways can (again, your editor relies on the word of others for this) result in corrupt data, painful debugging sessions, and excessive alcohol use. By separating the buffer used by the device from the buffer used by the kernel, restricted DMA can mitigate many of the more unpleasant effects of this sort of bug.

Readers may be wondering, though, how the use of the swiotlb will protect the system against a malicious or compromised device; such devices may well ignore polite requests to restrict their DMA activities to the designated area, after all. The answer is that it will not protect systems from this type of attack — at least, not on its own. The evident intent, though, is to pair restricted DMA with trusted firmware implementations that are able to restrict DMA operations to specific ranges of memory; these restrictions are set up at (or before) boot time and cannot be changed by the kernel. So the trusted firmware can constrain a device's access to the designated region, while the restricted DMA mechanism causes all DMA operations to go through that region. Together, these mechanisms provide a way to enable DMA without allowing a device to access arbitrary memory, all without an IOMMU in the system.

The amount of setup work required suggests that this capability will not be present on most general-purpose systems anytime soon. But on tightly controlled systems — mobile devices, for example — there is clear value in making the additional effort to prevent compromise via a hostile device. It's not clear whether the restricted DMA patches will make it into the mainline in their current form, but chances are that this kind of mechanism will be merged sooner or later.

Comments (18 posted)

Old compilers and old bugs

By Jonathan Corbet
January 11, 2021
The kernel project goes out of its way to facilitate building with older toolchains. Building a kernel on a new system can be enough of a challenge as it is; being forced to install a custom toolchain first would not improve the situation. So the kernel developers try to keep it possible to build the kernel with the toolchains shipped by most distributors. There are costs to this policy, though, including an inability to use newer compiler features. But, as was seen in a recent episode, building with old compilers can subject developers to old compiler bugs too.

On January 5, Russell King reported on a problem he had been chasing for a long time. Some of his 64-bit Arm systems running 5.4 or later kernels would, on rare occasion, report a checksum failure on the ext4 root filesystem. It could take up to three months of uptime for the problem to manifest itself, making it, as King described it, "unrealistic to bisect". He had, however, found a way to more reliably reproduce the failure, making the task of finding out when the problem was introduced plausible, at least.

Starting with King's findings, a number of developers working in the Arm subsystem looked into the issue; their efforts seemed to point to this commit as the culprit. That change, applied in 2019, relaxed the memory barriers used around I/O accessors, optimizing accesses to I/O memory. Reverting this patch made the problem go away.

Some developers might have applied the revert and called the problem solved, but that is not what happened here. Will Deacon, the author of the patch in question, was convinced of its correctness; if the Arm architecture is behaving as specified, there should be no need for the stronger barriers, so something else was going on. Reverting the patch, in other words, made the issue go away by papering over a real problem somewhere else.

Where might that "somewhere else" be? King suggested that it could be somewhere else in the kernel, in the Arm processor itself, or in the cache-coherent interconnect that ties together processor clusters and memory. He thought that a problem in the hardware was relatively unlikely, and that the bug thus lurked somewhere within the kernel. That, naturally, led to a lot of code examination, especially within the ext4 filesystem.

Two days later, King announced that the problem had been found; it indeed was an issue within the ext4 filesystem, but not of the variety that had been expected. A look at the assembly code generated for ext4_chksum() revealed that the compiler was freeing the function's stack frame prior to the end of the function itself. The last line of the function is:

    return *(u32 *)desc.ctx;

Here, desc is a local variable, living on the stack. The compiled function was resetting the stack pointer above this variable immediately before fetching desc.ctx. That led to a window of exactly one instruction where the function was using stack space that had already been freed.

This is a compiler bug of the worst type. The miscompiled code will work as expected almost every time; there is, after all, no other code trying to allocate stack space in that one-instruction window. All bets are off, though, if an interrupt arrives exactly between the two instructions; then the stack will be overwritten and the load of desc.ctx will be corrupted, leading to the observed checksum failure. This is something that will almost never happen, but when it does things will go badly wrong.
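
For illustration, here is a simplified, self-contained sketch of the pattern involved. It is not the actual ext4 code, and the checksum computation is just a placeholder, but it shows a function whose return value is loaded from a descriptor in its own stack frame; that final load is the one the buggy compiler performed after releasing the frame.

    #include <stdint.h>
    #include <string.h>

    /* Simplified sketch, not the real ext4_chksum() implementation. */
    struct desc_like {
        void *tfm;                  /* stand-in for the crypto transform pointer */
        char ctx[4];                /* the computed checksum lands here */
    };

    /* Placeholder "checksum"; the real code calls into the crypto layer. */
    static void compute_crc(struct desc_like *desc, const void *data,
                            unsigned int len)
    {
        uint32_t crc = ~0u;
        const unsigned char *p = data;

        while (len--)
            crc = (crc << 1) ^ *p++;
        memcpy(desc->ctx, &crc, sizeof(crc));
    }

    static uint32_t chksum_like(const void *data, unsigned int len)
    {
        struct desc_like desc;      /* lives in this function's stack frame */

        compute_crc(&desc, data, len);

        /*
         * The buggy GCC 4.9.4 build released the stack frame one instruction
         * before performing this load; an interrupt arriving in that window
         * could overwrite desc and corrupt the returned value.
         */
        return *(uint32_t *)desc.ctx;
    }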

This miscompilation was done by GCC 4.9.4, which was released in August 2016 (4.9.0, the major release on which it is based, came out in April 2014). The relevant bug, though, was reported in 2014 and fixed in November of that year. That fix was seemingly never backported from the (then) under-development 5.x release to 4.9.x, so the 4.9.4 release did not contain it. Interestingly, versions of 4.9.4 shipped by distributors like Red Hat, Android, and Linaro all did have the fix backported, so it only affected developers not using those versions. The bug lurked there for years until finally turning up in King's builds.

One outcome from this episode is a clear illustration of the potential downside of supporting old toolchains. A great deal of effort went into tracking down a bug that had, in fact, been fixed six years ago; that would not have been necessary if developers were not still using 4.9.x compilers.

As it happens, GCC 4.9 is the oldest compiler supported by the kernel, but even that requirement is relatively recent. As of 2018, the kernel still claimed (not entirely truthfully) that it could be built with GCC 3.2, which was released in 2002. As a result of discussions held in 2018, the minimum GCC version was moved forward to 4.6; later it became 4.9.

Fixing GCC 4.9 to address this bug is out of the question; the GCC developers have long since moved on from that release. So, at a minimum, the oldest version of the compiler that can be used for the arm64 architecture will have to be moved forward to 5.1. But that immediately led to the question of whether the oldest version for all architectures should be moved forward.

Ted Ts'o was in favor of that change, but he also pointed out that RHEL 7 (and thus CentOS 7) systems are still stuck with GCC 4.8. As Peter Zijlstra noted, though, it is already necessary to install a newer compiler than the distribution provides to build the kernel on those systems. Arnd Bergmann said that the other known users of GCC 4.9 were Android and Debian 8. Android has since switched over to Clang to build its kernels, and Debian 8 went unsupported at the end of June 2020. So it would appear that relatively few users would be inconvenienced by raising the minimum GCC version to 5.1.

On the other hand, there are some advantages to such a move beyond leaving an unpleasant bug behind. Bergmann argued for this change because it would allow compiling the kernel with -std=gnu11, making it possible to rely on bleeding-edge C11 features. Currently, kernel builds use -std=gnu89, based on the rather less shiny C89 standard. Zijlstra and Deacon both added that moving to 5.1 would allow the removal of a number of workarounds for GCC 4.9 problems.
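
As a small illustration (not taken from the kernel), the fragment below shows the sort of thing a gnu11 build accepts directly: with -std=gnu89, GCC rejects the loop-scoped index declaration outright, and C11 constructs such as _Static_assert are available only as compiler extensions rather than as standard C.

    #include <stddef.h>

    struct item {
        int key;
        int value;
    };

    /* Compile-time check: standard in C11, only a GCC extension under gnu89. */
    _Static_assert(sizeof(struct item) == 2 * sizeof(int),
                   "unexpected struct item layout");

    static int sum_keys(const struct item *items, size_t n)
    {
        int total = 0;

        /* Declaring the index inside the for statement requires C99/C11 mode;
         * -std=gnu89 refuses to compile it. */
        for (size_t i = 0; i < n; i++)
            total += items[i].key;
        return total;
    }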

Given all that, it seems unlikely that there will be much opposition to moving the kernel as a whole to the 5.1 minimum version. That said, Linus Torvalds is unconvinced about the value of such a change and may yet need some convincing. Even if the shift to 5.1 does not happen right away, the writing would seem to be on the wall that GCC 4.9 will not be supported indefinitely. GCC 5.1, released in April 2015, is not the newest thing on the planet either, of course. But hopefully it has fewer lurking bugs while simultaneously making some welcome new features available. Supporting old toolchains has its value, but so does occasionally dropping the oldest of them.

Comments (40 posted)

A possible step toward integrity measurement for Fedora

By Jonathan Corbet
January 8, 2021
The Fedora 34 release is planned for April 20 — a plan that may well come to fruition, given that the Fedora project appears to have abandoned its tradition of delayed releases. As part of that schedule, any proposals for system-wide changes were supposed to be posted by December 29. That has not stopped the arrival of a late proposal to add file signatures to Fedora's RPM packages, though. This proposal, meant to support the use of the integrity measurement architecture (IMA) in Fedora, has not been met with universal acclaim.

The purpose of IMA is to measure whether the integrity of the system is intact, where "integrity" means that the important files in the system have not been corrupted. At its core, this measurement is carried out by reading a file's contents, computing a hash, and comparing that hash to the expected value; if the values match, the file has not been altered. This measurement can be used to prevent the execution (or reading) of corrupted files; it can also be used as part of a remote attestation scheme to convince a remote party that the local system has not been subjected to unauthorized modifications.
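
At its core the measurement is just a hash comparison. The sketch below is ordinary user-space C using OpenSSL, not the kernel's IMA code; it hashes a file with SHA-256 and compares the result against an expected digest, which in IMA's case would come from a signed extended attribute.

    #include <stdio.h>
    #include <string.h>
    #include <openssl/evp.h>

    /* Measure a file: hash its contents and compare with the expected digest.
     * A sketch of the concept only; IMA performs this check in the kernel. */
    static int measure_file(const char *path, const unsigned char *expected,
                            unsigned int expected_len)
    {
        unsigned char digest[EVP_MAX_MD_SIZE];
        unsigned char buf[4096];
        unsigned int digest_len = 0;
        size_t n;
        EVP_MD_CTX *ctx;
        FILE *f = fopen(path, "rb");

        if (!f)
            return -1;
        ctx = EVP_MD_CTX_new();
        EVP_DigestInit_ex(ctx, EVP_sha256(), NULL);
        while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
            EVP_DigestUpdate(ctx, buf, n);
        EVP_DigestFinal_ex(ctx, digest, &digest_len);
        EVP_MD_CTX_free(ctx);
        fclose(f);

        /* Matching hashes mean the file has not been altered since signing. */
        return (digest_len == expected_len &&
                memcmp(digest, expected, digest_len) == 0) ? 0 : 1;
    }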

To perform this measurement, IMA clearly must know what the expected hash for each file is; those hashes are signed with a key trusted by the kernel and stored as extended attributes. Generally, the private key used to sign these hashes is kept in some secure location, while the public key is either stored in a device like a trusted platform module (TPM) or built into the kernel binary. If all works as intended, IMA can thus be used to ensure that systems only run executables that have been blessed by some central authority, that those executables only read configuration files that have been similarly blessed, and so on. It is a mechanism for ensuring that the owner of a system keeps control of it; whether this is a good thing or not depends entirely on who the "owner" is defined to be.
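
Those per-file signatures are stored in the security.ima extended attribute, so their presence can be checked from user space with getxattr(); a minimal sketch follows (reading security.* attributes generally requires privilege, and the attribute is simply absent on unsigned files).

    #include <stdio.h>
    #include <sys/xattr.h>

    /* Report whether a file carries an IMA signature and how large it is. */
    int main(int argc, char **argv)
    {
        unsigned char sig[1024];
        ssize_t len;

        if (argc < 2) {
            fprintf(stderr, "usage: %s <file>\n", argv[0]);
            return 1;
        }
        len = getxattr(argv[1], "security.ima", sig, sizeof(sig));
        if (len < 0) {
            perror("getxattr(security.ima)");
            return 1;
        }
        printf("%s: security.ima present, %zd bytes\n", argv[1], len);
        return 0;
    }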

The actual proposal does not go so far as to implement IMA on Fedora systems; it is limited to including signatures with every file that is shipped in Fedora packages. These signatures "will be made with a key that’s kept by the Fedora Infrastructure team, and installed on the sign vaults". Fedora users would then be able to use IMA to keep their systems from using files that have been modified since they were packaged. An actual IMA setup for Fedora can be expected to come at some future time.

Using stereotypes is always a hazardous business, but it is still probably safe to say that a typical Fedora user is uninterested in the prospect of some central authority controlling which programs may be run on their systems. That said, there may be situations where IMA could be useful. It seems that the push for these signatures is coming from the parts of the Fedora project working on initiatives like Fedora CoreOS, where much of the system is, in fact, meant to be immutable. It seems unlikely that there will be much call for IMA in, say, the desktop edition, but one never knows. It would be surprising indeed if that edition were to enable IMA by default.

In other words, Fedora users need not fear having IMA pushed on them against their will. But there are still reasons for concern about this proposal. One of those is the simple problem of bloating Fedora packages with that signature data; Panu Matilainen looked into it and found that the overhead is 1,745 bytes for each file, nearly doubling the size of smaller packages. A typical Fedora installation has many files; the extra overhead caused by the signatures adds up to a lot of extra disk space and bandwidth that may well not be useful to a large percentage of Fedora users. That has led to suggestions that perhaps the signatures should be stored in separate packages that could be installed if desired.

Florian Weimer, meanwhile, suggested that, to remain in compliance with the GPLv3 license, Fedora would have to make the signing key available on request. It's not clear that releasing that key would actually be required, though. In any case, that would be a concern for anybody distributing locked-down Fedora systems, and not the Fedora project itself. Meanwhile, as Colin Walters pointed out, any IMA implementation in Fedora would have to make it possible for users to supply their own keys.

There were also some questions about whether IMA is the best technology for this sort of assurance; a couple of participants suggested looking at fs-verity instead. It provides more efficient signature storage and verification with every read; support for fs-verity in RPM is evidently in the works. This is an option that needs to be more fully considered, given that Fedora probably needs to make a choice between IMA and fs-verity. Getting one signature into Fedora packages is a bit of a hard sell; adding a second for yet another integrity scheme would be rather harder yet.

There has been no decision made on this proposal as of this writing; that will likely have to happen at a meeting of the engineering steering committee. Given the concerns that have been raised, and the bloat concern in particular, it would not be surprising to see this idea pushed back for another release cycle. That is a lot of extra data for every Fedora user to download and store, and there is currently no established way for those users to actually benefit from it. Support for integrity measurement may eventually come to Fedora, but the feature seems less than fully baked at the moment.

Comments (23 posted)

Page editor: Jonathan Corbet


Copyright © 2021, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds