
Leading items

Welcome to the LWN.net Weekly Edition for July 31, 2025

This edition contains the following feature content:

  • On becoming a Python contributor: a EuroPython keynote on finding a path into CPython development.
  • GrapheneOS: a security-enhanced Android build: an experiment with a privacy-focused Android rebuild.
  • Smaller Fedora quality team proposes cuts: a shrinking quality team looks to trim test coverage and release criteria.
  • Some 6.16 development statistics: where the code in the 6.16 kernel release came from.
  • A proxy-execution baby step: progress toward addressing priority inversion in the CPU scheduler.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.


On becoming a Python contributor

By Jake Edge
July 30, 2025

EuroPython

In the first keynote at EuroPython 2025 in Prague, Savannah Bailey described her path to becoming a CPython core developer in November 2024. She started down that path a few years earlier and her talk was meant to inspire others—not to slavishly follow hers, but to create their own. In the talk, entitled "You don't have to be a compiler engineer to work on Python", she had lots of ideas for those who might be thinking about contributing and are wondering how to do so.

[Savannah Bailey]

She noted that she had recently gotten married and changed her last name from "Ostrowski"; she is "a bit in-between last names right now", she said with a chuckle. Most recently, she has been contributing to the new just-in-time (JIT) compiler for CPython, including as a co-author of PEP 744 ("JIT Compilation"). She is also the maintainer of the argparse module in the standard library, which attendees may "know, use, love, [or] hate".

She is the treasurer for the governing board of the Jupyter Foundation and her day job is to work on Python developer experience at Snowflake. As a mostly self-taught developer, she does not have a computer-science degree, which is something that she "used to be very self-conscious about" but now sees as a strength.

Most importantly, "I am a cat mom of three", complete with a photo of said cats and predictable reactions from the audience. "What's a keynote without a cat picture?" That question started something of a competition throughout EuroPython, featuring more cats, naturally, but also dogs, ferrets, and other pets.

Back to 2020

Bailey said that she wanted to take the audience back to 2020; "no, not that part of 2020, not the sourdough starters and the existential dread". 2020 is when she got her first job working on Python developer tools and her first time working as a product manager; prior to that, she had developed in Python for five years or so. She had "fallen into software" after studying at an engineering school in Canada; she taught herself to program because of the "amazing internships" that were available to programmers.

She started working on an unnamed project in 2020, which became the Pylance language server; it "provides the UX [user experience] bells and whistles and a great editing experience" by adding syntax highlighting, auto-completion, auto-imports, and so on for Python in Visual Studio Code. That exposed her to lots of new things, including "the guts of Python". Pylance is built on top of a static type checker called Pyright, so she learned about type annotations, which she had never used when developing in Python. She and her team were also contributing to open-source projects like typeshed for Python type annotations and referencing PEPs as part of their work. "I was learning bits and pieces about how Python worked under the hood and the open-source ecosystem around it."

After a few months of the "most fun I've ever had in my career", she thought that "it would be so cool to contribute to Python someday". But then came a followup thought: "Let's be real, that's probably never going to happen". She did not know C well—"there's that big C in front of CPython"—and was not working as an engineer anymore at that point. So she did what seemed to be the rational thing and shelved the idea. She thinks that a lot of people have a similar experience where that voice in their head causes them to set aside various goals and dreams because they are "never" going to happen.

Good news

But the good news is that it turns out that she was wrong; "that line of thinking is just 100% wrong", so wrong, in fact, that she wanted to spend a 45-minute keynote slot describing just how wrong she was, she said with a laugh. But, she acknowledged that those feelings of inadequacy are real and did not want to downplay them; "I had those doubts, I still sometimes have those doubts."

Like many others, she originally thought that contributing to Python "always means that you are knee-deep in interpreter internals", which is intimidating for those without the requisite background. The belief is that a Python contributor is "some kind of 10x programming wizard that dreams in bytecode" and spends all of their available time improving Python. There are some contributors and core developers who are like that, perhaps, but the idea that all contributors know "a bunch of stuff that you don't know, and you could never learn, could really not be further from the truth". Python contributors are just people, with different backgrounds, working on various parts of Python at different rates; they have families, hobbies, and other passions beyond just Python.

She thinks that many people already have useful skills for contributing to Python. Those who have written, tested, and debugged Python code, or who like to write documentation, can apply those skills to contributions. Work from other areas of computing likely qualifies as well. "Or, maybe, if you just like being thrown into the deep end with a tricky problem, then I bet you already have skills that are relevant to support contributing".

The main barriers to starting to contribute are threefold. One is a lack of understanding about the different kinds of contributions. Another is figuring out how to apply existing skills to Python contributions. The third is the impostor syndrome "that gets in the way".

She wanted to dispel the notion that a contribution is always a commit that gets added to a repository somewhere. Asking questions that surface a bug, filing a bug or issue that saves time for others, improving documentation to help people avoid an edge case, or reviewing a pull request and asking for clarification, all help move the project forward, Bailey said.

She started contributing by doing bug triage, which is the process of "combing through open issues and helping to figure out if something is actually a bug", which Python versions it affects, how it can be reproduced, and so on. "For maintainers, triage work is gold." If there is a bug, fixing it is, of course, also welcome, but the triage itself saves lots of time for maintainers.

Triage also provides a nice on-ramp for contributors. She learned a lot about how to build CPython, run its tests, and about the workflows that are used in the project. But, most importantly, she got to know the people involved in the project, she said. "It gave me a sense of how the team communicates and makes decisions."

Over time, she started noticing patterns in the bugs and in parts of CPython that had lots of issues that were unaddressed for a long time. One of those areas was the argparse module; without doing triage, she would not have even known that the module needed help.

Documentation is another good way to learn about CPython. It is incredibly valuable to users, both old and new, but is difficult to write—often harder than the code itself. Writing off documentation as "not real work" is common, "but honestly I think that's total garbage".

No internals needed

Something that is either not well-known or has been forgotten is that much of the standard library is written in pure Python; "it's not C, it's not deep internals, it's just Python". So most Python programmers already have the skills needed to contribute there. She suggested starting as you would with a new job: pick something small, keep learning more about the code base, and build up from there.

So far, she had shown that potential contributors do not need to understand the internals of CPython in order to do fruitful work for the project—triage, documentation, pure-Python standard-library code—but she also wanted to point out that those who want to learn about the interpreter can definitely do so. "You can learn if you're interested, it is totally in the realm of possible." She started only knowing about the interpreter at a high level, got interested in the JIT compiler, started asking lots of questions, poking around in the code, reading and re-reading PEPs, and so on. Eventually she started contributing to the JIT.

She started her learning "in a way that felt really natural to me, by applying my background in DevOps and infrastructure to solve some pretty real problems for CPython's JIT compiler". She really loves working on DevOps tasks, such as build tools and continuous integration, which are "unglamorous and hopefully invisible parts of the code base". Doing those jobs right makes "people happier and more productive, even if they never really know why".

When people think of the JIT compiler, they often think things like "PhD-level expertise required", and the topic does seem kind of academic to her at times. But whenever she traces through the interpreter code or reads a pull request for the JIT compiler, she learns lots of new things; sometimes she feels over her head, "but, honestly, I love that feeling". Earlier when she was over her head, it made her feel like she did not belong; it amplified her self-doubt. She has learned "how to get comfortable with being a bit uncomfortable" and she was able to contribute while still learning by applying skills that she already had.

Her DevOps experience was what the JIT project needed at the time she started contributing because it is important to ensure that CPython can reliably be built for and run on various platforms. So, she helped upgrade the LLVM version used to build CPython from 16 to 18 and then to version 19. She also worked on ensuring that the JIT builds correctly for both Intel and Apple-silicon Macs. Making the continuous-integration pipeline "faster, more reliable, and less painful for contributors and other core developers" is another area she worked on; for example, she added new continuous-integration jobs for building the JIT into the free-threaded version of CPython.

She also authored PEP 774 ("Removing the LLVM requirement for JIT builds"), which has been deferred for now; "that's not the point, it may never be accepted, but I would still be proud that I wrote it" because writing a PEP seemed completely out of reach to her when she started out. Writing it did not require a special degree or compiler knowledge; it required DevOps, Python, documentation, and research, which are skills and knowledge that she has. To contribute, "you just need to have something to offer and the courage to start showing up".

Community

The Python community is perhaps the biggest thing that helped her over the hurdle of contributing. Before her first commit she had people cheering her on and it has been the community that has kept her coming back. She noted that the famous quote from Brett Cannon ("I came for the language but I stayed for the community") is "100% true" for her as well.

At PyCon US 2023, scientific Python contributor and former steering council member Carol Willing approached Bailey and asked if she had ever thought about becoming a core developer. She responded by laughing, and then crying, "which was kind of awkward because there were a lot of people around". She said that she was interested, but had no idea where to start, which she thinks is a common feeling. That question really changed things for her, "because it took something that I had only dreamed about and made it feel tangible".

A few months went by without any real progress toward contributing; then Willing was in Seattle, near where Bailey lives, and the two got together for dinner. There, Willing gave her a bit of a pep talk after finding that Bailey was still interested but was "still figuring out if I can even be useful here". Willing suggested contacting other core developers to get their perspectives on contributing.

The next day, she reached out to her Microsoft colleague Brandt Bucher, who started the CPython JIT project, to ask him if they could chat. That led to several months of nearly weekly meetings for him to answer questions and for them to talk about CPython internals; he also supported her first contributions to CPython along the way. The technical help was useful, of course, but it was the conversations and the relationships with Bucher, Willing, and others "that made me feel like I belonged".

She was sharing all of this "not because I think it is extraordinary"; that is not how she sees it: "I think it's a story about community". It is about people taking time to share what they know and to make others feel welcome. Not everyone has those connections—she did not when she started working on Python development tools—but they can be built over time. "You can reach out, you can ask questions, you can take those first steps yourself."

Her ask for attendees was to "find your community"; maybe that's on the Python discussion forum, at EuroPython, in the CPython issue tracker, in the documentation or translation communities, or elsewhere. "You should introduce yourself, you should ask questions, you should offer to help, because that's actually how it all starts."

She showed pictures of attendees at the Python Language Summit and core developer sprint, noting that no two people in those pictures took the same path to contributing. She had presented her path in the talk, "your path might look totally different and that's not just okay, it's actually really really important". There is no need for 100 of her working on CPython; it is the diversity of skills and perspectives that is needed. "That's actually how Python grows, how open source stays open, not just to code but also to people."

Getting started contributing does not mean you need to know everything first; "wherever you start, you should just start". She did note that doing so can be scary at first, "but you should just do it scared". Worrying about asking naive questions or being wrong is natural, but can be overcome; beyond that, the naive question is probably one that others have but are too scared to ask. The smartest people she knows "are wrong literally all the time, and they ask a lot of questions and admit when they don't know the answer". Maybe, once you get started contributing, "you'll realize that it's not really that scary after all".

Toolkit

With luck, she had inspired some curiosity about contributing and given attendees the sense that it is possible for them to contribute; but where should people go for more information? Her "contribution toolkit" had four sites that she suggested people explore, starting with the CPython GitHub repository. It has issues and pull requests for the whole project, including the interpreter, standard library, and documentation; "it all happens here". Watching the activity in the repository is a good way to get started by seeing what changes are being proposed, how decisions are being made, and places where some help is needed.

The PEP index is another site to explore; PEPs document major changes to the language, so they provide some historical context for decisions made by the project. "If you've ever wondered why something works the way it does in Python, there's probably a PEP for it."

For questions about PEPs or anything else Python-related, the Python discussion forum is the right place to go. That is "where a lot of bigger picture conversations happen, around features, governance, packaging, docs, typing"; it is a good place to post ideas. She recommended reading the forum for a while before posting, as there is a lot that can be learned that way; when posting, please "be nice and kind", she asked, so that the forum can "continue to be a really good experience for everyone".

Lastly, she said that the Python Developer's Guide is "really your roadmap to contributing". It covers things like setting up a development environment, building CPython, running tests, submitting pull requests, and so on. There is also information about the criteria for joining the triage team and for becoming a core developer. The experts index in the guide is useful to figure out which core developers are active in any given part of the project. "These are some great places to start: pick a link, open the tab, start poking around."

She closed by showing her response to the vote to promote her to the core-developer team, noting that "it captures what this talk has really been about": the people that make Python possible. Her slides are available in her GitHub repository and a video of the talk should appear before long on the EuroPython YouTube channel.

[I would like to thank the Linux Foundation, LWN's travel sponsor, for travel assistance to Prague for EuroPython.]


GrapheneOS: a security-enhanced Android build

By Jonathan Corbet
July 24, 2025
People tend to put a lot of trust into their phones. Those devices have access to no end of sensitive data about our lives — our movements, finances, communications, and more — so phones belonging to even relatively low-profile people can be high-value targets. Android devices run free software, at least at some levels, so it should be possible to ensure that they are working in their owners' interests. Off-the-shelf Android installations tend to fall short of that goal. The GrapheneOS Android rebuild is an attempt to improve on that situation.

GrapheneOS got its start as "CopperheadOS"; it was reviewed here in 2016. A couple of years later, though, an ugly dispute between the two founders of that project led to its demise. One of those founders, Daniel Micay, continued the work and formed what eventually became GrapheneOS, which is, according to this history page, an independent, open-source project that "will never again be closely tied to any particular sponsor or company". Work on GrapheneOS is supported by a Canada-based foundation created in 2023; there appears to be almost no public information available regarding this organization, though.

At its core, GrapheneOS is an effort to harden Android against a number of threats and to make Android serve the privacy interests of its users. It is based on the Android Open Source Project, but removes a lot of code and adds a long list of changes. Some of those, such as a hardened malloc() library or the use of additional control-flow-integrity features, will be mostly invisible to users (unless they break apps, of course, which has evidently been known to happen). Others are more apparent, but it is clear that a lot of effort has gone into making the security improvements as unobtrusive as possible.

Installation

Some Android rebuilds prioritize supporting a wide range of devices, often with an eye toward keeping older devices working for as long as possible. GrapheneOS is not one of those projects. The list of supported hardware is limited to Google Pixel 6 through Pixel 9 devices, with some trailing-edge support for Pixel 4 and 5 devices. Even then, though, the newer devices are strongly recommended:

8th/9th generation Pixels provide a minimum guarantee of 7 years of support from launch instead of the previous 5 year minimum guarantee. 8th/9th generation Pixels also bring support for the incredibly powerful hardware memory tagging security feature as part of moving to new ARMv9 CPU cores. GrapheneOS uses hardware memory tagging by default to protect the base OS and known compatible user installed apps against exploitation, with the option to use it for all apps and opt-out on a case-by-case basis for the few incompatible with it.

My phone had been making it clear for a while that it could not be counted on in the future, but the prospect of buying a new one inspired a lot of trepidation. Each new device seems to come with more privacy-hostile "features" and intrusive AI "assistants"; finding all of the necessary "disable" switches is a tedious and error-prone task. That, along with the news that Google's "Gemini" feels increasingly entitled to a device-owner's data regardless of its configuration, inspired the purchase of a Pixel 9 device that would be used to experiment with GrapheneOS to see if it could replace stock Android for everyday use.

Flashing the firmware of an expensive device is always a bit of a nervous prospect; the GrapheneOS installer is designed to minimize the amount of fingernail biting involved in the process. There are two installation methods described in the documentation — a web-based install, and one that works from the command line. Naturally, I chose the command-line version. The instructions are straightforward enough: download the installation image, connect the device, and run the supplied script. Said script ran to completion and confidently declared victory at the end, but the device still only booted into normal Android — a repeatable result, but not quite the intended one.

Some investigation turned up the (undocumented) fact that the web installation method is seen as being rather more reliable than the command-line version. So I tried that, and it worked as intended; the GrapheneOS experiment had begun in earnest.

First impressions

[GrapheneOS screenshot] Stock Android includes some nice features to make the move to a new device as easy as possible — unsurprising, given the strong incentive to get people to make that move often. Most of the data, apps, and configurations that were on the old device will be automatically moved to the new one. GrapheneOS has no such feature; a newly installed phone is a blank slate that must be reconfigured from the beginning. One should expect to spend a lot of time rediscovering all of those settings that were set just right some years ago.

As can be seen from the screenshot to the right, the initial GrapheneOS screen is an austere and monochromatic experience. The system handles color just fine, but color is something for the owner to configure, it seems.

A stock Android install comes with a large set of apps out of the box, many of which the user likely never wanted in the first place, and many of which often cannot be deleted. GrapheneOS does not have all of that stuff. It comes with its own versions of the web browser, camera app, PDF viewer, and app store. Notably, GrapheneOS does not include the Google Play store or any apps from there (but keep reading for Google Play). The app store offers all of 13 apps in total.

The web browser is a Chromium fork called Vanadium. It enables strict site isolation on mobile devices (which Chrome evidently does not) and adds a number of code-hardening features. The documentation strongly recommends avoiding Firefox, which is described as "more vulnerable to exploitation".

The camera app is said to be the best available in a writing style that is often encountered with GrapheneOS:

GrapheneOS Camera is far better than any of the portable open source camera alternatives and even most proprietary camera apps including paid apps. On Pixels, Pixel Camera can be used as an alternative with more features

The camera app strips Exif metadata by default, and location metadata must be enabled separately if it is wanted.

App stores

One other thing that can be installed from the GrapheneOS store is the Accrescent app store, which is an alternative repository that claims a focus on security and privacy. It provides access to a few dozen more apps, including Organic Maps, the Molly Signal fork, and IronFox, a hardened version of Firefox.

With those app stores, one can enable a certain amount of basic phone functionality, but the sad fact is that many of us will need a bit more than that. One alternative, of course, is F-Droid, which can certainly be installed and used on GrapheneOS. Hard-core security-oriented people, including those in the GrapheneOS community, tend to look down on F-Droid (see this article for an example), but it is a useful source for (mostly) free-software apps.

In the end, though, it will often come down to using the Google Play store; an Android device can be nearly useless for many people without the apps found there. GrapheneOS offers a sandboxed version of Google Play that turns it into an ordinary app without the special privileges that Google Play has on stock Android systems. It worked without a hitch here; the documentation says that some apps may not work properly, but I did not encounter any.

It is worth noting that Android provides an "integrity API" that can be used to query the status of the software running on the device. Among other things, it can attest to whether the secure-boot sequence was successfully executed, or whether the device is running an official Android build. GrapheneOS implements this API and, since it uses the secure-boot machinery, can pass the first test, but it is not an official image and cannot pass the second. Some apps care about the results of these queries and may refuse to work if they get an answer they don't like.

GrapheneOS will put up a notification for each use of this API, so it is easy to see which apps are using it. Most don't, but some definitely do. I saw a few apps query this API, but did not encounter any that refused to work; booting securely was good enough for them. Some others are pickier; a short list of apps that refuse to run under GrapheneOS is available. Testing any important apps before committing to an alternative build like GrapheneOS is thus an important bit of diligence. One just has to hope that a future app update won't make a working app decide to stop cooperating; this is a definite risk factor associated with using any alternative Android build.

Security features

GrapheneOS includes a number of security and privacy features beyond the under-the-hood hardening. Many of them are designed to make the device work as if the owner of the device actually owns it. For example, the provisioning data included with Android, which tells the device how to work with carriers around the world, allows those carriers to specify that features like tethering are not to be made available. GrapheneOS never quite got around to implementing that part of the system. There is, instead, an option to prevent the phone from being downgraded to older, less-secure cellular protocols.

Standard Android gives control over some app permissions, but does not let users deny network access to an app. GrapheneOS does provide that control, though network access is enabled by default for compatibility reasons. If network access is disabled, the app in question sees a world where that access is still available, but, somehow, the device just never finds a signal. So apps should not refuse to run just because network access is unavailable (though they may, of course, fail to run correctly).

There is a "sensors" permission bit that controls access to any sensors that are not subject to one of the other permissions; these include the accelerometer, compass, thermometer, or any other such sensors that may be present. This permission, too, is enabled by default but can be turned off by the owner.

The storage scopes feature can put apps into a sandbox where they believe they have full access to the device's shared storage, but they can only access the files they have created themselves. There is also a contact scopes feature that allows apps to believe they have full access to the owner's contacts, while keeping most or all of that data hidden from those apps.

GrapheneOS supports fingerprint unlocking, just like normal Android, with one difference: after five consecutive failures, the fingerprint feature is disabled for 30 minutes. An owner being forced to supply a finger to unlock a device can thus disable that functionality quickly by using an unrecognized finger. For those whose privacy needs are more stringent, a duress PIN can be configured; entering that PIN causes the device to immediately wipe all of its data. Needless to say, this self-destruct feature should be used with care.

There is a special app that can audit the state of a GrapheneOS device and, using the hardware security features, provide an attestation that the device has not been tampered with or downgraded to an older software version.

The project makes frequent releases, and installed GrapheneOS systems update aggressively. The project updated to the Android 16 release in early July, slightly less than one month after Google released that version. In the default configuration, the device will automatically reboot after 18 hours of inactivity as a way of pushing all data to (encrypted) rest; that also has the effect of making the device run the latest software version.

See also this page comparing a long list of security features across several Android-based builds.

Governance and community

One potential caveat is that the development community behind GrapheneOS is somewhat murky. As mentioned, a foundation exists to support this system, but there is little information about how the foundation operates beyond an impressively long list of ways to donate. The public registry information shows three directors: Micay, Khalykbek Yelshibekov, and Dmytro Mukhomor, but there is no public information on how directors are chosen or how the foundation uses its funds.

There is a vast set of repositories containing the project's source, but there is little information on how one might contribute or what the development community is up to. Some information can be found on the build-instructions page. The project runs a set of chat rooms and a forum, but they seem to be dominated by user-oriented conversation rather than development. Participation by the project in the forums comes from a generic "grapheneos" account.

In a response to a private query, the project claimed to have ten active, paid developers, most of whom are full time. One gets the feeling, though, that Micay is still the driving force behind GrapheneOS; if nothing else, the project's belligerent fediverse presence bears a lot of resemblance to his previous interaction patterns. What would happen if he were to depart the project is far from clear. There is a potential risk here that is hard to quantify.

Overall impressions

Setting up the device with GrapheneOS required a couple of days of work, much of which was dedicated to reproducing the apps and configuration on the older device. A certain amount of time must be put into setting the privacy features appropriately and giving apps the permissions they need to work. In the end, though, the device works just as well as its predecessor, with all the needed functionality present, and a lot of unneeded functionality absent. I have committed willingly to using it, and have no intention of going back.

The system is undoubtedly more secure, even if the invisible hardening changes do not actually do anything. The sandboxing is tighter, there is more control over what apps can do, and there is no AI jinni doing its best to escape its bottle.

The remaining problem, of course, is that, for many people, GrapheneOS alone will not be enough, and it will be necessary to let the nose of proprietary software into the tent. The documentation says that logging into the Play Store is not required, but it insisted on a login for me, re-establishing the umbilical connection to Google that installing GrapheneOS had cut. The keyboard does not support "swipe" typing; users who want that will likely end up installing GBoard, which poses privacy risks of its own. The GrapheneOS messaging app works, but Google's app can filter out some spam, so one might as well toss it on. There are some reasonable, privacy-respecting weather apps on F-Droid these days, but the proprietary, privacy-trashing ones have better access to weather alerts (at least in countries that still have functioning weather agencies) and red-flag warnings. Android Auto is highly useful, and it works fine in GrapheneOS, but it requires its own level of special access permissions.

Then there is the whole slew of banking apps, ride-share apps, airline apps, and so on that, seemingly, are indispensable in modern life. Each of these pokes another hole into the private space that GrapheneOS has so carefully created. It is possible to live and thrive without these tools, and many of us know people who do, but the tools exist and are popular for a reason. For many, it is simply not possible to get by without using proprietary software, much of which is known to be watching our every move and acting in hostile ways.

Putting GrapheneOS onto a phone, at least, forces an awareness of each hole that is being poked, and provides an incentive to minimize those holes as much as possible. When potentially malicious software has to be allowed onto a device that contains many of our closest secrets, the system will at least do its best to keep that software within its specified boundaries and unable to do anything that it is not specifically allowed to do. Installing GrapheneOS orients a device more toward the interests of its owner; that, alone, is worth the price of admission.


Smaller Fedora quality team proposes cuts

By Joe Brockmeier
July 28, 2025

Fedora's quality team is looking to reduce the scope of test coverage and change the project's release criteria to drop some features from the list of release blockers. This is, in part, an exercise in getting rid of criteria, such as booting from optical media, that are less relevant. It is also a necessity, since the Red Hat team focusing on Fedora quality assurance (QA) is only half the size it was a year ago.

The team is responsible for a host of activities, which include testing software, running test days, maintaining tools for test automation, and coordinating the Fedora release process with the release-engineering team. The quality team is composed of Red Hat employees and Fedora community contributors, but it is fair to say that the bulk of the team's work is done by those employed by Red Hat.

Unfortunately, according to an announcement by Kamil Páral, a member of the team, there is a somewhat urgent need to reduce its workload. Six out of ten Red Hat employees who had been working on the team have chosen to move to other teams within Red Hat over the past nine months, or have left Red Hat altogether. Only one new person has joined the team. Páral pointed out in the announcement that this was not the result of a layoff or intentional reduction of the quality team; he said that the moves were "truly decisions of our colleagues", some of whom opted to move to AI-focused roles or other jobs within Red Hat.

Red Hat is hiring for at least one more quality engineer, but with only five people currently doing the work of ten, the team is looking to shed some tasks with "a poor price/benefit ratio". Páral said that the team is looking to make permanent changes to the way it operates, rather than making temporary adjustments and waiting for Red Hat to hire enough people to keep things as-is:

We decided to look at this change as an opportunity to re-examine what's really important to Fedora right now, and aim to use our resources as wisely as possible for the future. We're not sure that spending a lot of time on manual testing of relatively less-widely-used functionality is the best thing the RH team could be doing, even if we had more people to continue doing it. We will be hiring in future [...] but we want to be able to consider the best possible way to spend all of the paid team's time going forward, and perhaps spend it on other things that will be more valuable in the end, like automation or supporting individual teams.

There is also the fact that Fedora has continued to add more and more to the testing pile over time; it's rarer for things to be removed. Páral said that some of the changes being proposed could have been done a long time ago, "we just lacked the right impetus".

Proposals

The announcement lists nine changes to Fedora's release criteria that the quality team would like to make, but only five have full proposals so far. These will need approval by the Fedora Engineering Steering Committee (FESCo) or the Fedora Council; the quality team cannot simply change the release criteria by itself. However, Páral points out that since the team does not have enough people to cover everything, if a proposal is rejected by FESCo or the council "an alternative solution might be needed".

Most of the proposals involve removing criteria from the list of release blockers; that does not mean that a feature or deliverable will no longer be included, it simply means that bugs related to those features or deliverables become non-blocking. If a bug is found in one of the de-prioritized areas during testing for a new Fedora release, it will not delay the release.

One of the full proposals so far is dropping optical-media boot as a release-blocking feature. This should not be too controversial; most laptops and desktops do not even include a CD-ROM or DVD-ROM drive of any sort these days. Even if a system has optical media, it should also be capable of booting from a USB drive.

The team also wants to drop dual-booting from Intel-based Apple hardware as a release blocker. Again, this seems like a feature that is of limited relevance now that Apple has phased out Intel systems in favor of its own Arm-based silicon and is dropping support for those systems with newer releases of macOS. At any rate, the testing team lacks the hardware to test Intel dual-boot; in October last year, Páral had to put out a call for someone with an Intel-based Mac to test dual-booting Fedora with macOS because the quality team had no such hardware to test with. Igor Jagec answered the call then, but it is easy to see why the quality team is unwilling to block releases for hardware it has to go scrounging for.

Another proposal seeks to pare down the list of applications that are release-blocking for the Workstation edition. Right now, the criterion says that, for Workstation on x86_64, "all applications installed by default which can be launched from the Activities menu" must have basic functionality or it's a release-blocking bug. Páral makes the case that this criterion has "multiple long-standing issues", such as a lack of agreement on what "basic functionality" actually means, which leads to problems during blocker-review meetings. That is compounded by the fact that bugs in these applications tend to be found very close to the final release, and that automating testing of them is particularly difficult; the time spent could be better used on other work.

The quality team would also like to limit the release-blocking status of BIOS systems to specific and simple-to-test scenarios. The team's justification for this is that it considers BIOS-only systems to be a rarity at this point, and available UEFI hardware that has a compatibility-support module (CSM) for BIOS is getting harder to find.

That does not mean Fedora users with older BIOS-based hardware would be entirely out in the cold. For example, it would still be a release blocker if Fedora failed to install on BIOS-based systems "which use the default automatic partitioning layout to a single empty SATA or NVMe drive". But bugs that affect more complex partitioning layouts or alternative storage types would no longer be considered release-blocking. Bugs that impact upgrading Fedora systems with a BIOS would also be considered blockers, and Fedora's cloud images will require BIOS support because Xen on AWS EC2 still uses BIOS boot.

For Fedora 43, the team has proposed dropping that criterion and limiting release-blocking applications to a list of 11 basic applications, such as the default web browser, file manager, image viewer, and system settings. Culling the list, Páral said, may reduce the quality of less-critical applications, but it would "reduce the likelihood of long blocker arguments, pre-release crunches and stress".

In the interest of having fewer arguments about blocker bugs, the team also wants to be "stricter about changes between Beta and Final" unless the changes were agreed on ahead of time. That seems likely to be approved by FESCo, depending on the details in the final proposal.

Just one Arm

There is also a proposal to focus on only one popular 64-bit Arm device, such as the Raspberry Pi 4, as a release blocker. Currently, there are three separate lists of allegedly supported Arm hardware for Fedora that have developed over time: a supported-platforms page that has not been updated since 2020, a second supported-platforms page with different hardware, and the reference platforms for the Fedora IoT edition. In the Arm proposal, Páral notes that slimming down the list to one device would not only reduce the burden of testing, "it would also clarify that confusion and remove the burden of reconciling and updating all these lists". Being able to run Fedora in a virtual machine on Arm would still be a requirement for release, however, which should cover cloud use cases.

There are two other Arm-related items that may be more controversial when the full proposals are published: dropping desktops on 64-bit Arm as release-blocking criteria and asking the Fedora Council whether Fedora IoT "still makes sense" as an edition. Presumably, the question is whether IoT should be demoted to a Fedora spin, which would not be release-blocking by default. It became an edition in 2020, with the Fedora 33 release; LWN took a look at Fedora IoT at the time.

That is a fair question; the IoT mailing list and discussion category on Fedora's Discourse forum show few signs of life. There are no posts under the IoT category from June 24 through July 24, and only four emails sent to the mailing list during the same period. Three of the emails are automated meeting reminders, and the other is a notification from Páral of the proposal to limit release-blocking Arm hardware.

I emailed Peter Robinson, who drove the original effort to promote Fedora IoT to edition status while he was employed by Red Hat, to get his thoughts on its possible demotion. He wondered why the quality team would be driving an effort to change IoT's status; it is the upstream of some of Red Hat's "edge" products, and he said that one might think that other Red Hat employees would be more involved with the project. Ultimately, though, he said it might be good for IoT to be demoted because it might allow "a reset of expectations" and experimentation with different technologies. For example, he said he did not think that OSTree or bootc were the right choices for IoT. (LWN covered bootc last June.)

He also questioned the decision to reduce focus on aarch64 hardware, given that the number of Arm laptops and single-board computers (SBCs) is on the rise. That is true, but it is also something of a chicken-and-egg situation; Fedora does not seem to have a large user base on Arm, but to get there it needs better Arm support.

The Fedora Workstation and KDE Plasma Desktop editions have installation images for Arm systems; one might expect that if KDE or GNOME are failing tests on Arm then a Fedora release is not ready to be shipped. It is unclear, however, how many people actually use Fedora as a desktop operating system on Arm hardware. In the quality team meeting on July 21 (meeting log), Páral said that testing desktops on Arm was one of the slowest things to do in QA, and that there were too few users to justify it "according to some data estimates" from former Fedora Project Leader (FPL) Matthew Miller.

Adam Williamson, who leads Fedora's quality team, said during the meeting that he thought there was some discussion that "Fedora is just so slow for desktop purposes on SBCs that most people wind up using the distros with out-of-tree patches". Neal Gompa said that KDE on Raspberry Pi 4 was "reasonably performant", but Fedora Workstation "chugs" with the larger problem being that "GTK4 just flat out crashes on RPi systems since moving to Vulkan".

Kashyap Chamarthy asked Gompa if there were "a non-trivial portion of users who care about aarch64 desktops". Gompa said that Fedora KDE on Arm use was growing "in large part because we're actively promoting it that way", but that the current release-blocking criteria for Arm hardware seemed like a mess.

Sorting out that mess, as well as other work that has traditionally fallen to the quality team, will require more hands than the team has available at the moment. The team is asking for more involvement from other teams—such as the Workstation working group, KDE special-interest group, and cloud working group—when it comes to manual testing, debugging issues, and validating fixes.

It is also looking for new maintainers for the Fedora Packager Dashboard and its data parser, oraculum. Currently the team is trying to find maintainers within Red Hat to take these over but will send out an announcement to the larger community if that is unsuccessful. If both of those efforts fail, the dashboard may go away entirely.

Reactions

So far, the quality scope-reduction announcement has not generated as much response as one might expect. This is probably, at least in part, because it's peak vacation season in much of the Northern Hemisphere; many of the usual participants in Fedora discussions may be enjoying time away from their computers. It may also be because of the venue chosen for the discussion; Páral asked that feedback be restricted to the Discourse forum when he sent the announcement to the fedora-devel mailing list. Fedora is a project with a long history, and many of its participants still prefer to discuss things via email rather than web forums.

Most of the responses that have come in so far are largely in favor of the proposals. Pat Kelly thought the team was taking a good approach to the situation, and that it would serve the community well in the long run.

"P G" agreed that the proposals seemed reasonable, but wondered about the impact of the "resourcing situation" on the initiative from Fedora's strategy 2028 plan to add accessibility features in its editions to the release-blocking criteria. Accessibility features are part of test criteria now, but they do not block a release.

Sumantro Mukherjee was supposed to drive an effort to help prepare Fedora's editions so that the project could make accessibility tests must-pass for release. Unfortunately, he is one of the people who have recently left the team to work on AI-related things. It is unclear who might fill that gap. Williamson replied to P G that he hoped that it would not impact the initiative, but it could do so if the team is unable to find ways to automate the needed testing.

Fedora, in its early days, was not known for good quality control. It has done much to remedy that over the years, and it generally enjoys a good reputation as a solid, user-friendly distribution today. One hopes that the project can find ways to maintain its quality despite having fewer people being paid to focus on the task.


Some 6.16 development statistics

By Jonathan Corbet
July 28, 2025
The 6.16 development cycle was another busy one, with 14,639 non-merge changesets pulled into the mainline — just 18 commits short of the total for 6.15. The 6.16 release happened on July 27, as expected. Also as expected, LWN has put together its traditional look at where the code for this release came from.

Work on 6.16 came from 2,057 developers, a reasonably high number relative to previous releases. Of those, though, 310 contributed their first patch to the kernel this time around, the highest new-contributor rate since the release of 6.12 (335 new developers) in late 2024. The most active contributors this time around were:

Most active 6.16 developers

By changesets
  Kent Overstreet               358   2.4%
  Herbert Xu                    214   1.5%
  Matthew Wilcox                191   1.3%
  Krzysztof Kozlowski           163   1.1%
  Bartosz Golaszewski           157   1.1%
  Johannes Berg                 150   1.0%
  Rob Herring                   141   1.0%
  Eric Biggers                  135   0.9%
  Alex Deucher                  130   0.9%
  Dmitry Baryshkov              116   0.8%
  Jakub Kicinski                115   0.8%
  Michael Rubin                 110   0.8%
  Thomas Weißschuh              108   0.7%
  Marc Zyngier                  105   0.7%
  Jani Nikula                   105   0.7%
  Ingo Molnar                   105   0.7%
  Thomas Zimmermann             104   0.7%
  Ian Rogers                    102   0.7%
  Christoph Hellwig             101   0.7%
  Filipe Manana                 101   0.7%

By changed lines
  Rob Herring                 19768   2.7%
  Ian Rogers                  18048   2.4%
  Ben Skeggs                  16659   2.2%
  Kuniyuki Iwashima           16351   2.2%
  Herbert Xu                  15624   2.1%
  Kent Overstreet             12108   1.6%
  Johannes Berg               11170   1.5%
  Antonio Quartulli           10560   1.4%
  Eric Biggers                 8469   1.1%
  AngeloGioacchino Del Regno   8387   1.1%
  James Morse                  8045   1.1%
  Keke Li                      7934   1.1%
  Nicolas Pitre                7825   1.1%
  Mauro Carvalho Chehab        7335   1.0%
  Richard Fitzgerald           6531   0.9%
  Inochi Amaoto                6496   0.9%
  Jani Nikula                  6266   0.8%
  David Howells                5931   0.8%
  Wesley Cheng                 5454   0.7%
  Tiwei Bie                    5237   0.7%

For the third time in a row, the developer with the most changesets is Kent Overstreet, who continues to work to stabilize the bcachefs filesystem (though the future of that work in the kernel is in doubt). Herbert Xu worked extensively on refactoring within the crypto subsystem (of which he is the maintainer). Matthew Wilcox's commit count is dominated by the conversion of the F2FS filesystem to use folios, but he made many other changes in the memory-management subsystem as well. Krzysztof Kozlowski contributed small improvements throughout the driver subsystem, and Bartosz Golaszewski did a lot of refactoring, mostly within the pin-control and GPIO subsystems.

The numbers in the "changed lines" column are relatively small this time around; this cycle lacked the old-code removals and massive amdgpu header dumps that often show up here. That said, Rob Herring appears at the top by virtue of having removed support for some unused USB controllers. Ian Rogers updated the Intel performance-monitoring event definitions. Ben Skeggs made a number of changes to the nouveau graphics driver. Kuniyuki Iwashima removed support for the DCCP network protocol.

The top testers and reviewers for 6.16 were:

Test and review credits in 6.16

Tested-by
  Daniel Wheeler              106   6.6%
  Timur Tabi                   60   3.7%
  Arnaldo Carvalho de Melo     50   3.1%
  Thomas Falcon                41   2.5%
  Judith Mendez                38   2.4%
  Tomi Valkeinen               34   2.1%
  Tony Luck                    27   1.7%
  Fenghua Yu                   25   1.5%
  Venkat Rao Bagalkote         23   1.4%
  Oleksandr Natalenko          23   1.4%
  Shaopeng Tan                 20   1.2%
  Ingo Molnar                  20   1.2%
  Mark Broadworth              20   1.2%
  Babu Moger                   19   1.2%
  Mor Bar-Gabay                18   1.1%
  Weilin Wang                  18   1.1%

Reviewed-by
  Dmitry Baryshkov            254   2.5%
  Konrad Dybcio               238   2.4%
  Simon Horman                197   2.0%
  Ilpo Järvinen               178   1.8%
  Krzysztof Kozlowski         176   1.8%
  Chao Yu                     172   1.7%
  Geert Uytterhoeven          145   1.5%
  David Sterba                137   1.4%
  Andy Shevchenko             122   1.2%
  Vasanthakumar Thiagarajan   121   1.2%
  Rob Herring                 107   1.1%
  David Lechner               106   1.1%
  Linus Walleij                99   1.0%
  Neil Armstrong               98   1.0%
  Hannes Reinecke              96   1.0%
  Christoph Hellwig            93   0.9%

Daniel Wheeler is still the permanent resident at the top of the testing column; Timur Tabi made a debut in the second position by testing a long series of nouveau patches. On the review side, both Dmitry Baryshkov and Konrad Dybcio reviewed changes related mostly to Qualcomm devices. In the end, 1,375 commits (9.4% of the total) in 6.16 contained Tested-by tags, while 7,518 commits (51.4%) had Reviewed-by tags.

(Subscribers can consult the LWN Kernel Source Database 6.16 page for more details on this activity.)

Work on 6.16 was supported by 209 employers, a fairly typical number. The most active employers were:

Most active 6.16 employers

By changesets
  Intel                  1655  11.3%
  (Unknown)              1295   8.8%
  Red Hat                1117   7.6%
  (None)                  948   6.5%
  Google                  927   6.3%
  Linaro                  817   5.6%
  AMD                     789   5.4%
  Qualcomm                528   3.6%
  SUSE                    434   3.0%
  Meta                    420   2.9%
  Oracle                  383   2.6%
  NVIDIA                  354   2.4%
  Huawei Technologies     349   2.4%
  Arm                     343   2.3%
  Renesas Electronics     323   2.2%
  Linutronix              280   1.9%
  NXP Semiconductors      211   1.4%
  IBM                     203   1.4%
  (Consultant)            196   1.3%
  Collabora               183   1.3%

By lines changed
  Intel                 68272   9.2%
  (Unknown)             67619   9.1%
  Red Hat               53431   7.2%
  Google                51405   6.9%
  Arm                   35050   4.7%
  NVIDIA                34034   4.6%
  Qualcomm              33462   4.5%
  AMD                   33176   4.5%
  (None)                31215   4.2%
  Linaro                30288   4.1%
  Huawei Technologies   19337   2.6%
  Amazon.com            18993   2.6%
  Collabora             16594   2.2%
  Meta                  15560   2.1%
  NXP Semiconductors    12227   1.6%
  Renesas Electronics   11416   1.5%
  SUSE                  11092   1.5%
  BayLibre              10658   1.4%
  OpenVPN Inc.          10546   1.4%
  Amlogic               10425   1.4%

As usual, there are no real surprises in this list.

Bugs, new and old

When developers fix a bug in the kernel, they normally try to identify the commit that introduced that bug in the first place; the addition of a "Fixes" tag to the commit changelog documents that relationship. This tag can be useful for people deciding whether to backport a patch, but it also can provide a picture of how long bugs live in the kernel. For the 6.16 release, a look at these tags produces the following picture. For each previous kernel release, the "Fixed" column shows how many commits were named in Fixes tags, and the "By" column is the number of 6.16 commits that identified the fixed commits.
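
A Fixes tag is a single trailer line in the changelog naming the guilty commit by its abbreviated hash and subject line; for example (an illustrative trailer, not a commit cited in this article):

    Fixes: 123456789abc ("subsys: change that introduced the bug")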

Releases fixed in v6.16

  Release    Fixed   By
  v6.15        228  259
  v6.14        111  142
  v6.13         92  104
  v6.12         81   98
  v6.11         77   88
  v6.10         55   61
  v6.9          53   58
  v6.8          62   73
  v6.7          49   60
  v6.6          35   38
  v6.5          41   41
  v6.4          33   40
  v6.3          46   60
  v6.2          34   41
  v6.1          25   32
  v6.0          20   21
  v5.19         39   49
  v5.18         22   28
  v5.17         29   34
  v5.16         13   13
  v5.15         20   26
  v5.14         18   23
  v5.13         20   24
  v5.12         15   17
  v5.11         13   15
  v5.10         18   21
  v5.9          21   22
  v5.8          16   16
  v5.7          19   20
  v5.6          11   15
  v5.5          12   14
  v5.4          13   13
  v5.3           8    8
  v5.2           9    9
  v5.1          13   15
  v5.0          16   16
  v4.20          4    5
  v4.19         14   15
  v4.18          9   11
  v4.17          8   10
  v4.16          9    8
  v4.15         13   14
  v4.14         11   13
  v4.13          5    7
  v4.12          7    7
  v4.11          6    6
  v4.10         14   16
  v4.9           7   10
  v4.8          12   13
  v4.7           8    8
  v4.6           4    5
  v4.5           9   10
  v4.4           5    5
  v4.3           6    6
  v4.2           6    7
  v4.1           3    3
  v4.0           5    5
  v3.19          2    2
  v3.18          6    6
  v3.17          4    4
  v3.16          3    2
  v3.15          6    6
  v3.14          7    7
  v3.13          4    3
  v3.12          5    7
  v3.11          5    5
  v3.10          5    5
  v3.9           1    2
  v3.8           6    6
  v3.7           2    3
  v3.6           1    1
  v3.5           6    8
  v3.4           2    2
  v3.3           3    3
  v3.2           3    3
  v3.1           2    2
  v3.0           3    3
  v2.6.39        3    3
  v2.6.38        1    1
  v2.6.37        2    2
  v2.6.36        2    2
  v2.6.34        1    1
  v2.6.32        3    4
  v2.6.31        2    2
  v2.6.30        4    4
  v2.6.29        7    9
  v2.6.28        2    2
  v2.6.27        1    1
  v2.6.25        2    2
  v2.6.24        1    1
  v2.6.23        1    1
  v2.6.22        3    3
  v2.6.21        2    2
  v2.6.20        2    3
  v2.6.19        2    2
  v2.6.18        2    2
  v2.6.15        2    2
  v2.6.14        2    2
  v2.6.13        1    1
  v2.6.12        2   17

This picture looks similar for every release; there is always a long tail of bugs that are fixed many years after having been introduced. Almost every release made in the Git era (the last 20 years) is represented here. There are even 16 commits in 6.16 that fix bugs introduced in the initial commit in the kernel repository; these are bugs that predate the use of Git.

Amusingly (but, again, typically), there are two commits (3637e457eb00 and 81bf912b2c15) in 6.16 that were fixed by four commits (d33724ffb743, d433981385c6, 38e93267ca68, and ca4f113b0b4c) in 6.15. This seemingly clairvoyant development activity is usually an artifact resulting from the cherry-picking of commits between branches that is done in some subsystems.

As of the start of the 6.17 merge window, there are 11,451 non-merge commits waiting in the linux-next tree, suggesting that 6.17 will be a slightly smaller release than 6.16 was. LWN will, of course, keep you informed of the significant changes in this merge window; stay tuned.


A proxy-execution baby step

By Jonathan Corbet
July 29, 2025
Priority inversion comes about when a low-priority task holds a resource that is also needed by a high-priority task, preventing the latter from running. This problem is made much worse if the low-priority task is unable to gain access to the CPU and, as a result, cannot complete its work and free the resources it holds. Proxy execution is a potential solution to this problem, but it is a complex solution that has been under development for several years; LWN first looked at it in 2020. The 6.17 kernel is likely to contain an important step forward for this long-running project.

The classic solution for priority inversion is priority inheritance; if a high-priority task finds itself blocked on a lock, it lends its priority to the lock holder, allowing the holder to progress and release the lock. Linux implements priority inheritance for the realtime scheduling classes, but that approach is not really applicable to the normal scheduling classes (where priorities are far more dynamic) or the deadline class (which has no priorities at all). So taking a different tack is called for.
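
Before moving on to that different tack, it is worth seeing how the existing realtime priority inheritance looks from user space. This is a minimal, hedged sketch using the standard POSIX threads API, not code from the kernel or from the patch series discussed below:

    #include <pthread.h>

    /*
     * Create a mutex using the priority-inheritance protocol: if a
     * high-priority thread blocks on this mutex, the kernel boosts
     * the priority of the current holder until the lock is released.
     */
    static pthread_mutex_t lock;

    static int init_pi_mutex(void)
    {
            pthread_mutexattr_t attr;
            int ret;

            ret = pthread_mutexattr_init(&attr);
            if (ret)
                    return ret;
            ret = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
            if (!ret)
                    ret = pthread_mutex_init(&lock, &attr);
            pthread_mutexattr_destroy(&attr);
            return ret;
    }

Under the hood, glibc implements such mutexes with priority-inheritance futexes, which is where the kernel support mentioned above comes into play.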

That tack is proxy execution. While priority inheritance donates a task's priority to another, proxy execution also donates the waiting task's available CPU time. In short, if a high-priority ("donor") task finds itself waiting on a lock, the lock holder (the "proxy") is allowed to run in its place, using the donor's time on the CPU to get its work done. It is a relatively simple idea, but the implementation is anything but.

The next step

This patch series from John Stultz (containing the work of several developers) pushes the proxy-execution project one significant step forward. It starts by adding a new kernel configuration option, SCHED_PROXY_EXEC, to control whether the feature is built into the kernel. At this point, proxy execution is incompatible with realtime preemption, and with the extensible scheduler class as well, so the kernel cannot (yet) be built with all of those features enabled.

The kernel's massive task_struct structure, used to represent a task in the system, optionally contains a field called blocked_on. This field, which is only present if mutex debugging is enabled, tells the kernel which mutex (if any) a task is currently waiting for. The proxy-execution series makes this field unconditional, so that it is always available for the kernel to refer to. It provides a crucial link that lets the kernel determine which task needs to be allowed to run so that it can release the mutex in question.
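
Conceptually, after this change, the relevant part of the structure reduces to something like the following sketch (a simplification for illustration, not the literal declaration in include/linux/sched.h):

    struct task_struct {
	    /* ... hundreds of other fields ... */

	    /* The mutex this task is currently waiting on, or NULL. */
	    struct mutex *blocked_on;

	    /* ... */
    };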

One of the key changes in the overall proxy-execution project is the separation of "scheduling context" from "execution context". The scheduling context is essentially a task's position in the scheduler's run queue, while the execution context describes the task that actually runs when the scheduling context is selected for execution. In current kernels, the two contexts are always the same, but proxy execution will change that situation. One task's scheduling context may be chosen to run, but the holder of the lock that task is waiting for is the one that will get the CPU.

That leads to a bit of an accounting problem, though. The CPU time used by the execution context will be charged against the scheduling context — the proxy will burn a bit of the donor's time slice so that it can get its work done. But the total CPU time usage of the execution context should be increased to reflect the time it spends running in the proxy mode. That is the time value that is visible to user space; having it reflect the actual execution time of the task makes it clear that the task is, indeed, executing.

Normally, when a task finds itself blocked on a mutex, that task is deactivated (removed from the run queue) and not further considered by the scheduler until the mutex is released. Another important change made to support proxy execution is that a task that is blocked in this way is, instead, left on the run queue, but marked as being blocked. If the scheduler picks that task for execution, it will see the special flag, follow the blocked_on pointer to the lock in question, and from there it can find the owner of that lock. The lock owner can then be run in the blocked task's place.

That, at least, is the idea, but there are complications. For example, the lock-holding task may, itself, be blocked on a different lock. So the scheduler cannot just follow one pointer; it must be prepared to follow a chain of them until it finds something that can actually be run to push the whole chain forward. The current series includes an implementation of that logic. For extra fun, the situation could change while the scheduler is following that chain, so it must check at the end and, if it appears that the state of the relevant tasks has changed, bail out and restart from the beginning.
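
A rough sketch of that chain walk might look like the code below. The helper names (task_is_blocked() and lock_owner()) are invented for illustration; the real code in the series must also hold the appropriate run-queue and mutex locks while walking, which is where much of the complexity lies.

    /* Illustrative only: follow blocked_on links to a runnable task. */
    static struct task_struct *walk_blocked_chain(struct rq *rq,
						  struct task_struct *donor)
    {
	    struct task_struct *owner = donor;

	    while (task_is_blocked(owner)) {
		    struct mutex *lock = owner->blocked_on;

		    owner = lock_owner(lock);
		    if (!owner || task_cpu(owner) != cpu_of(rq))
			    return NULL;  /* give up: deactivate and repick */
	    }
	    /*
	     * The chain may have changed while it was being walked; the
	     * caller must re-check the state and restart if so.
	     */
	    return owner;
    }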

The EEVDF scheduler can, in some circumstances (described in this article), leave a task on the run queue even though that task has used its time slice and is not actually eligible to run. In the current patch series, if the lock holder turns out to be in this "deferred dequeue" state, the scheduler just gives up, deactivates the blocked task, and tries again. Dealing with this special case is just one of many details that have been left for future work.

The biggest limitation of the proxy-execution work, as seen in this patch series, comes about if the lock holder is running on a different CPU than the blocked task. There is a whole set of complications that are involved in this case, so the code doesn't even try. Unless the two tasks are running on the same CPU, the blocked task will, once again, be deactivated to wait in the old-fashioned way.

On a modern-day system — even on a small, mobile system — there are enough CPUs that the chances of both tasks running on the same one will be relatively small most of the time. That, in turn, means that proxy execution, in its state at the end of this patch series, is not yet a particularly useful feature for users. It is, however, useful for developers who are trying to understand this work; limiting proxy execution to same-CPU tasks makes the series much easier to review.

That review has been done, and this series is now staged to go upstream during the 6.17 merge window. Even if it is not a complete solution, it is a significant step toward that solution.

Donor migration

The next step can be seen in this patch series adding "donor migration". In simple terms, it handles the different-CPU case by migrating the donor task to the CPU where the lock holder is running. At that point, its scheduling context will be on the correct run queue to allow the proxying to happen.

Of course, nothing in the scheduler is quite that simple. System administrators can set CPU masks on groups of tasks that limit them to a subset of the CPUs on the system. So it may well be that the donor task cannot actually run on the CPU where the proxy is. Lending its scheduling context on that CPU is fine, but the scheduler has to take care to migrate the donor task back to a CPU it is allowed to run on once that task becomes runnable. It also would not do for the scheduler's load-balancing code to migrate either task somewhere else while proxy execution is happening, so more care is required to disable migration in that case. Once again, there may be a whole chain of tasks involved; migrating all of them at the outset is more efficient than doing the job piecemeal.

Finally, there is the question of what happens when the mutex that caused all of this work is finally released. Having some unrelated task swoop in and grab it before the task that donated its CPU time gets a chance to do so seems unfair at best. It could also happen reasonably often, especially in situations where the donor task has to be migrated back to its original CPU before it can run. To avoid this problem, the mutex code is enhanced to recognize that a lock has been released as the result of proxy execution, and to hand the lock directly to the donor task in that case.

Getting this work into shape for merging could take a little while yet; the current posting is the 20th version, and more are likely to come. Once it is in, the problem still will not be completely solved, though the end will be coming into sight. It looks like proxy execution will, eventually, as the result of persistent effort by a number of developers, be a part of the mainline kernel.

Comments (2 posted)

Extending run-time verification for the kernel

By Daroc Alden
July 30, 2025

There are a lot of things people expect the Linux kernel to do correctly. Some of these are checked by testing or static analysis; a few are ensured by run-time verification: checking a live property of a running Linux system. For example, the scheduler has a handful of different correctness properties that can be checked in this way. Nam Cao posted a patch series that aims to extend the kinds of properties that the kernel's run-time verification system can check, by adding support for linear temporal logic (LTL). The patch set has seen eleven revisions since the first version in March 2025, and recently made it into the linux-next tree, from where it seems likely to reach the mainline kernel soon.

Run-time analysis is present everywhere in the kernel; lockdep, for example, is a kind of run-time verification. But instrumenting the whole kernel for each kind of verification that people may want to perform is infeasible. The run-time verification subsystem allows for tracking more complex properties by hooking into the kernel's existing tracing infrastructure. For example, run-time verification can be used to ensure that a system schedules tasks correctly; there are options to ensure that task switches only occur during a call to __schedule(), that the scheduler is called in a context where it is safe to do so, and various other properties of the scheduler interface that depend on the global state of the system. Each property that is checked in this way is represented by a per-CPU or per-task state machine called a monitor. Tracing events drive the transitions in these machines. If they ever reach an error state, the kernel can be configured to log an error message or panic.

The use of state machines has the nice property of keeping the actual overhead of the monitors as low as possible. A 2019 paper by Daniel Bristot de Oliveira, Tommaso Cucinotta, and Rômulo Silva de Oliveira showed that the overhead of updating a state machine was actually lower than the overhead of just recording tracing events to a file for later analysis. Because state machines, ironically, do not track much state, the per-task memory usage of the system is quite small as well.

Writing state machines by hand is a tedious process, though, so the kernel includes an rvgen tool that can convert a state machine described in Graphviz's DOT format into appropriate C code. There is a bit of manual work to do in order to connect the generated state machine to the correct tracing events, but rvgen also generates appropriate kernel configuration and header files, and provides a checklist of what the programmer will need to implement themselves.

The problem Cao ran into was that simple deterministic state machines are too inflexible to easily represent some desirable properties. For example, it would be nice to have a monitor that can detect priority inversion in realtime tasks, but representing this property as a state machine is complex and error prone. Cao's solution is to add another specification language to rvgen that can handle more complicated statements. The resulting code is still compiled to a state machine — specifically, a non-deterministic Büchi automaton — but it can express properties about the future execution of a task more easily.

The new specification language has a custom syntax, but the underlying semantics are taken from linear temporal logic (LTL), which is a kind of modal logic. LTL extends classical Boolean logic with a notion of time. Unlike some more complicated modeling systems, LTL only deals with a single, discrete, non-branching timeline — hence the "linear" part of the name. In addition to the fundamental operations on Booleans (such as "or" and "not"), LTL has two new operators, "next" and "until". In LTL, "next A" means that some proposition A must be true on the next time-step. Similarly, "A until B" means that A must be true at all subsequent points in time until (and possibly after) B is true.

Just as classical logic has derived operators such as "implies", these two temporal operators can be combined to produce more helpful operators like "eventually" and "always". This makes it possible to express constraints such as "a task that acquires a lock must release the lock before exiting" as something like "it is always the case that a task that acquires a lock does not exit until it releases the lock". In Cao's proposed syntax, that would look like this:

    RULE = always (ACQUIRE imply ((not EXIT) until RELEASE))

Upper-case words correspond to events or rules; the first rule of the file is used to generate the state machine. Lower-case words are operators. Cao's simple code does not implement operator precedence, so parentheses are mandatory on pain of surprising behavior.
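
For reference, the standard LTL constructions for the derived operators, written in the article's syntax, are (this is textbook LTL, not anything specific to Cao's implementation):

    eventually A = (true until A)
    always A = (not (eventually (not A)))

With those definitions, the RULE above expands mechanically into the two primitive temporal operators.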

The code generator is currently fairly basic. The above rule compiles to a five-state non-deterministic state machine, but many sets of states are unreachable. To illustrate the kind of state machine produced for a simple property like the above, I took the generated Büchi machine and flattened it into a deterministic state machine shown in the diagram below. Red edges represent acquire events, blue edges represent exit events, and green edges represent release events. After pruning unreachable states, the machine looks like this:

[A complicated state machine diagram]

State s0 is the rejecting state, which indicates that there was a problem. Much of the complexity in this example comes from correctly tracing situations where it is not certain whether a lock has been acquired or not. In any case, this kind of automaton would be painful to write by hand; the generated code is much easier to deal with. In order to use it, the programmer must fill in the implementation of the ltl_atoms_init() function, which sets the initial state of the monitor, and then arrange for the ltl_atom_update() function to be called from appropriate tracepoints. The rest of the integration with the run-time verification subsystem is handled by the generated code. The actual state machine itself is generated and placed in a separate header file.
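
The glue code the programmer supplies is small; a hedged sketch follows. Only the function names ltl_atoms_init() and ltl_atom_update() come from the patch set's description; the argument lists and the LTL_* atom identifiers here are illustrative stand-ins for the generated API.

    /* Set the initial truth values of the monitor's atomic propositions. */
    static void ltl_atoms_init(struct task_struct *task)
    {
	    /* A newly tracked task holds no lock and has not exited. */
	    ltl_atom_update(task, LTL_ACQUIRE, false);
	    ltl_atom_update(task, LTL_RELEASE, false);
	    ltl_atom_update(task, LTL_EXIT, false);
    }

    /* Called from a tracepoint when the watched lock is taken. */
    static void probe_lock_acquire(void *data, struct task_struct *task)
    {
	    ltl_atom_update(task, LTL_ACQUIRE, true);
    }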

The patch set includes two example definitions for run-time monitors using the new syntax. Both have to do with ensuring that realtime tasks do not sleep incorrectly, and are simple enough that they probably could have been written by hand. But the hope is that having a generator available will enable other kernel developers to write more complicated run-time checks in their areas of expertise.

Comments (4 posted)

Rethinking the Linux cloud stack for confidential VMs

July 25, 2025

This article was contributed by Carlos Bilbao

There is an inherent limit to the privacy of the public cloud. While Linux can isolate virtual machines (VMs) from each other, nothing in the system's memory is ultimately out of reach for the host cloud provider. To accommodate the most privacy-conscious clients, confidential computing protects the memory of guests, even from hypervisors. But the Linux cloud stack needs to be rethought in order to host confidential VMs, juggling two goals that are often at odds: performance and security.

Isolation is one of the most effective ways to secure the system by containing the impact of buggy or compromised software components. That's good news for the cloud, which is built around virtualization — a design that fundamentally isolates resources within virtual machines. This is achieved through a combination of hardware-assisted virtualization, system-level orchestration (like KVM, the hypervisor integrated into the kernel), and higher-level user-space encapsulation.

On the hardware side, mechanisms such as per-architecture privilege levels (e.g., rings 0-3 in x86_64 or Exception Levels on ARM) and the I/O Memory Management Unit (IOMMU) provide isolation. Hypervisors extend this by handling the execution context of VMs to enforce separation even on shared physical resources. At the user-space level, control groups limit the resources (CPU, memory, I/O) available to processes, while namespaces isolate different aspects of the system, such as the process tree, network stack, mount points, MAC addresses, etc. Confidential computing adds a new layer of isolation, protecting guests even from potentially compromised hosts.

In parallel to the work on security, there is a constant effort to improve the performance of Linux in the cloud — both in terms of literal throughput and in user experience (typically measured by quality-of-service metrics like low I/O tail latency). Knowing that there is room to improve, cloud providers increasingly turn to I/O passthrough to speed up Linux: bypassing the host kernel (and sometimes the guest kernel) to expose physical devices directly to guest VMs. This can be done with user-space libraries like the Data Plane Development Kit (DPDK), which bypasses the guest kernel, or hardware-access features such as virtio Data Path Acceleration (vDPA), which allow paravirtualized drivers to send packets straight to the smartNIC hardware.

But hardware offloading exemplifies a fundamental friction in virtualization, where security and performance often pull in opposite directions. While it is true that offloading provides a faster path for network traffic, it has some downsides, such as limiting visibility and auditing, increasing reliance on hardware and firmware, and circumventing OS-based security checks of flows and data. The uncomfortable reality is that it's tricky for Linux to provide fast access to resources while concurrently enforcing the strict separation required to secure workloads. As it happens, the strongest isolation isn't the most performant.

A potential solution to this tension is extending confidential computing to the devices themselves by making them part of the VM's circle of trust. Hardware technologies like AMD's SEV Trusted I/O (SEV-TIO) allow a confidential VM to cryptographically verify (and attest to) a device's identity and configuration. Once trust is established, the guest can interact with the device and share secrets by allowing direct memory access (DMA) to its private memory, which is encrypted with its confidential VM key. This avoids bounce buffers — temporary memory copies used when devices, like GPUs when they are used to train AI models, need access to plaintext data — which significantly slow down I/O operations.

The TEE Device Interface Security Protocol (TDISP), an industry standard published by the PCI-SIG, defines how a confidential VM and device establish mutual trust, secure their communications, and manage interface attachment and detachment. A common way to implement TDISP is using a device with single root I/O virtualization (SR-IOV) support — a PCIe feature that a physical device can use to expose multiple virtual devices.

In those setups, the host driver manages the physical device, and each virtual device assigned to a guest VM acts as a separate TEE device interface. Unfortunately, TDISP requires changes in the entire software stack, including the device's firmware and hardware, host CPU, and the hypervisor. TDISP also faces headwinds because not all of the vendors are on board. Interestingly, NVIDIA, one of the biggest players in the GPU arena, sells GPUs with its own non-TDISP architecture.

Secure Boot

Beyond devices, many other parts of the Linux cloud stack must change to accommodate confidential computing, starting right at boot. To understand how, we need to look at Secure Boot. A typical sequence is shown in the area outlined in red in the figure below. First, the firmware verifies the shim pre-bootloader using a cryptographic key embedded in the firmware's non-volatile memory by the OEM, along with a database of valid signatures (DB) and a revocation list (DBX) used to reject known-bad binaries (such as a compromised first-stage bootloader) and revoked certificates. Once verified, shim is loaded into system memory and execution jumps to it.

Shim then does a similar check on the next step, the bootloader (usually GRUB), using a key provided by the Linux distribution. Finally, the bootloader verifies and loads the kernel inside the guest VM. The guest kernel can read the values of the Platform Configuration Registers (PCRs) stored in a virtual Trusted Platform Module (TPM) that the hypervisor provides (e.g. using swtpm) to get the digests of all previously executed components and verify that they match known-good values.

[Secure Boot]

Extra steps need to take place during boot to set up for confidential computing. In the figure above, a secure VM service module (SVSM) on the left becomes the first component to execute, verifying the firmware itself while running in a special hardware mode known as VMPL0 (Intel's equivalent is VTL0). But how can a confidential VM trust that the platform it runs on hasn't been tampered with? In traditional Secure Boot, the chain of trust relies on a virtual TPM (vTPM) provided by the host. However, the hypervisor itself is now untrusted, so the guest cannot rely on a TPM controlled by it. Instead, the SVSM, or other trusted component isolated from the host, must provide a vTPM that supplies measurements for remote attestation. This allows the guest OS to verify the integrity of the platform and decide whether it is safe to run.

The details of remote attestation can vary depending on the model followed; the most well-known is the Remote ATtestation procedureS (RATS) architecture. In this model, three actors play a role:

  • Attester: Dedicated hardware like AMD's Platform Security Processor (PSP) that generates evidence about its current state (e.g., firmware version) by signing measurements with a private key stored within it.
  • Verifier: A remote entity that evaluates the evidence's integrity and trustworthiness. To do so, it consults an endorser to validate that the signing key and reported measurements (digests) are legitimate. The verifier can also be configured to enforce appraisal policies — for example, rejecting systems with outdated firmware versions from receiving secrets.
  • Endorser: A trusted third party, typically the hardware vendor, that provides certificates confirming that the signing key belongs to genuine cryptographic hardware. The endorser also supplies reference measurement values used by the verifier for validation.

The final product is an attestation result prepared by the verifier, confirming that the measured platform components match expected good values. A Linux confidential VM can use this report — including a vTPM quote with the current PCR values signed by a vTPM private key and a nonce supplied by the guest (to prevent replay attacks) — to decide whether to continue booting.
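
Pulling those pieces together, the guest-side decision reduces to a few checks. The sketch below is purely illustrative — the types and helpers (vtpm_quote, signature_valid(), pcrs_match()) are invented for this article, not any real attestation API:

    /* Illustrative only: should the guest trust this platform and boot? */
    static bool attestation_acceptable(const struct vtpm_quote *quote,
				       const u8 *nonce, size_t nonce_len,
				       const struct attestation_result *res)
    {
	    /* The quote must cover the nonce the guest just generated... */
	    if (memcmp(quote->nonce, nonce, nonce_len) != 0)
		    return false;		/* possible replay */
	    /* ...be signed by the vTPM key the verifier endorsed... */
	    if (!signature_valid(quote, res->vtpm_public_key))
		    return false;
	    /* ...and report PCR digests matching known-good values. */
	    return pcrs_match(quote->pcrs, res->expected_pcrs);
    }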

Secure Boot helps prevent malicious code from executing early in the boot sequence, but it can also increase boot time by a few seconds. Adding confidential computing to the equation slows things down even more. For most Linux users, the slight delay of Secure Boot is negligible and well worth the security benefits. But, in cloud environments, even a few extra seconds for guest boot can be consequential — small delays quickly add up at fleet scale. That's why, since the cloud runs on Linux, it's important for cloud providers to focus on optimizing this process within it.

To complicate things even more, there are different flavors of confidential computing. For example, instead of using an SVSM, Microsoft's Linux Virtualization-Based Security (LVBS) opts for a paravisor, as shown in the figure below. In LVBS, the paravisor is a small Linux kernel that runs in a special hardware mode (e.g. VTL0) after the bootloader. This design has the advantage of being vendor-neutral, but also has drawbacks, such as a significantly larger attack surface than the SVSM. Even though there are many ways to implement confidential VMs in Linux, we still lack a clear, shared understanding of the trade-offs between them.

[LVBS boot]

Once the confidential VM is booted, two major sources of runtime overhead are DRAM encryption and decryption, along with hardware enforcement of memory-access permissions. That said, because this work happens inline within the memory controller, the delay is usually small, though the impact can vary depending on the workload, particularly for cache-sensitive applications.

A separate, more significant performance hit comes from the process of accepting memory pages. Before a confidential VM can access DRAM, each page must be explicitly accepted by the guest. This step binds the guest physical address (gPA) of the page to a system physical address (sPA), preventing remapping — that is, once validated, the hardware enforces this mapping, and any attempt by the hypervisor to remap the gPA to a different sPA via nested page tables will trigger a page fault (#PF). The validation process is slow and requires the guest kernel to spend virtual-CPU cycles issuing hypercalls and causing VMEXITs, since it cannot directly execute privileged instructions like PVALIDATE on x86 processors. Only components running in special hardware modes — such as the SVSM at VMPL0 — can execute them directly. To avoid this cost at runtime, the SVSM (or whatever component is used) should pre-accept all memory pages early during the boot process.
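
As a sketch of why pre-acceptance helps, the loop below shows the work that would otherwise be paid for lazily, page by page, at run time. The names are invented for illustration; accept_page() stands in for whatever privileged validation step (PVALIDATE on AMD) the platform requires.

    /* Illustrative only: accept all guest memory once, at boot. */
    static void pre_accept_memory(u64 start_gpa, u64 end_gpa)
    {
	    u64 gpa;

	    for (gpa = start_gpa; gpa < end_gpa; gpa += PAGE_SIZE)
		    /*
		     * Bind this guest-physical page to its system-physical
		     * page; later remapping attempts will fault.
		     */
		    accept_page(gpa);
    }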

Scaling

Fleet scalability — meaning how many guest VMs can be created — is also impacted by confidential computing. The most significant hardware limitations come from architectural constraints: for example, the number of available address-space identifiers (ASIDs). Each confidential VM requires a unique ASID in order to be tagged and isolated; without a unique ASID, the hardware cannot differentiate between encrypted memory regions belonging to different VMs. The maximum number of ASIDs that Linux can use is typically capped by the BIOS and limited to a few hundred. That might seem enough, but modern multicore processors can have hundreds of cores, each hosting one or even two virtual CPUs with simultaneous multithreading. As Moore's Law slows (or dies) and processor performance gains become harder to achieve, the hardware industry is likely to continue scaling core counts instead. Thus, without scalable support in Linux for confidential VMs, the cloud risks underutilizing cores.

A possible solution to the hardware scalability problems would be hybrid systems, where Linux could run both confidential and conventional VMs side by side. Today, kernel-configuration options enforce an all-or-nothing approach — either the system hosts only encrypted VMs or it hosts no encrypted VMs. Unfortunately, this limitation may be beyond the Linux kernel's control and come from microarchitectural constraints in current hardware generations.

In confidential VMs, swap memory needs to be encrypted to preserve the confidentiality of data even when moved to disk. Likewise, when the VMs communicate over the network — particularly through host-managed NICs — they must establish secure end-to-end sessions to maintain data integrity and confidentiality across untrusted host networks. Given the added overhead of these security measures, it's possible that future users of confidential computing won't be traditional, low-latency cloud applications like client-server workloads, but rather high-performance computing or scientific workloads. While these batch-oriented applications may still experience some performance impact, they generally have a higher tolerance for latency — not because they are inherently less sensitive to it, but because they lack realtime human interaction (e.g., there are no users sitting in front of a browser waiting for a reply).

Live migration is another important aspect of the cloud, allowing VMs to move between hosts (such as during maintenance in specific regions of the fleet) with minimal impact on the VMs — ideally without a noticeable disruption, as IP addresses can be preserved using virtual LAN technologies like VXLAN. However, after migration, the attestation process must be repeated on the destination node. While pre-attesting a destination node (as a plan B option) can help reduce overhead, unexpected emergencies in the fleet may force the VM to migrate again shortly after arrival. Worse still, because the guest VM no longer implicitly trusts the host, it must also verify that its memory and execution context were correctly preserved during migration, and that any changes were properly tracked throughout the live migration. To facilitate all of this, a migration agent running in a separate confidential VM can help coordinate and secure live migration.

In conclusion

Hardware offloading has always implied a tradeoff in virtualization: it improves I/O performance but weakens security. Thanks to confidential computing, Linux can now achieve the former without sacrificing the latter. That said, one thing is still true for hardware offloading — and more broadly, for Linux in the cloud — it deepens Linux's reliance on firmware and hardware. In that sense, trust doesn't grow or shrink; it simply shifts. In this case, it shifts toward OEMs (hardware and device manufacturers).

But what happens if (or when) an attacker exploits vulnerabilities or backdoors in hardware or firmware? Unlike software, hardware is difficult to verify, leaving open the risk of hidden compromises that can undermine the entire security model. Open architectures like RISC-V may offer a solution with hardware designs that can be inspected and audited. This speaks to the security value of transparency and openness — ultimately the only way to eliminate the need to trust third parties.

Cloud providers are already expected to respect user privacy, but confidential computing turns that promise into more than just a leap of faith taken in someone else's computer. That shift puts the guest Linux kernel in an awkward spot. Cooperation with the host can be genuinely useful — say, synchronizing schedulers to make the most of NUMA layouts, or avoiding guest deadlocks. But the host is also, unavoidably, untrusted.

This means that Linux finds itself trying to work with something it's supposed to be protected from. As a consequence, a lot has to change in the Linux cloud stack to truly accommodate cloud confidential computing. Is this a worthwhile investment for the overall kernel community? As the foundation of the modern public cloud, Linux is in a good position to explore the potential of confidential VMs.

Comments (9 posted)

Page editor: Jonathan Corbet


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds