
Leading items

Welcome to the LWN.net Weekly Edition for October 23, 2025

This edition contains the following feature content:

  • Git considers SHA-256, Rust, LLMs, and more: a look at current activity in the Git project as it works toward its 3.0 release.
  • Explicit lazy imports for Python: PEP 810 proposes a lazy keyword for deferring module imports.
  • A brief history of RubyGems.org: part one of a series on the Ruby community's gem-hosting service and the people behind it.
  • The RubyGems.org takeover: the power struggle between Ruby Central and longtime maintainers of Ruby packaging tools.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Git considers SHA-256, Rust, LLMs, and more

By Jonathan Corbet
October 21, 2025
The Git source-code management system is a foundational tool upon which much of the free-software community is based. For many people, Git simply works, though perhaps in quirky ways, so the activity of its development community may not often appear on their radar. There is a lot happening in the Git world at the moment, though, as the project works toward a 3.0 release sometime in 2026. Topics of interest in the Git community include the SHA-256 transition, the introduction of code written in Rust, and how the project should view contributions created with the assistance of large language models.

Moving to SHA-256

Hashes are a core part of how Git works; they are used to identify commits, but also to identify the individual files ("blobs") managed in a Git repository. The security of the repository (and, specifically, the integrity of the chain of commits that leads to any given state of the repository) is no stronger than the security of the hash that is used. Git, since the beginning, has used the SHA-1 hash algorithm, which is increasingly viewed as being insecure. It has been understood for years that, sooner or later, Git will have to move to using a different hash algorithm.

So far, that move has been repeatedly pushed to the "later" column. That is not to say that no work has been done in that area; LWN first covered the effort to move to SHA-256 in 2020, with an update in 2022. Git has had the ability to manage a repository using SHA-256 hashes since the 2.29 release in 2020. That is only part of the job, though; before SHA-256 can be widely used, there needs to be a solution for interoperability between SHA-1 and SHA-256 repositories. Git is a distributed system, with hundreds or thousands of repositories existing even for relatively small projects. Converting all of those repositories to a new hash function simultaneously is simply not going to happen, so there must be a way to move commits between repositories using different hash functions.

Writing that sort of interoperability code is the kind of task that few developers are aching to take on. So it is not surprising that, in this case, few have. The task has fallen to brian m. carlson, who has done almost all of the SHA-256 work. This work is progressing slowly; a patch series focused mostly on documentation updates looks set to land in the next Git release. But, as carlson said recently, there is a lot still to be done if the planned 3.0 release is to switch to SHA-256 by default:

The SHA-256 interoperability work is not done yet. My estimate of this work is 200–400 patches, of which about 100 are done. If the original schedule is maintained, this would require writing up to 75 patches and sending in 100 patches per cycle, which is unrealistic without additional contributors.

He also pointed out that some of the Git-based forge systems are more advanced than others with regard to readiness for this change. The project as a whole seems undecided as to whether the completion of the interoperability code is a required feature for the 3.0 release or not. There is a desire, though, to set some sort of date for the SHA-256 switch, to put pressure on forges and such to be ready, if for no other reason.

Rust

When Linus Torvalds first wrote Git in 2005, he naturally wrote it in C, and that is still the language that the project uses. As is the case with many other C projects, though, there is an interest in moving to a safer language — Rust, in this case. Some Git developers are already working in Rust; notably, carlson is implementing some of the SHA-256 interoperability code in that language. There is also a reimplementation of the xdiff library in Rust by Elijah Newren that is making the rounds. Rust, it seems, is in Git's future.

The first step in that direction is likely to be this patch series from Patrick Steinhardt, which introduces an optional Rust module as a "trial balloon" to help users and distributors adapt to the new build requirements. The series includes a documentation change indicating that Rust will become mandatory for building Git as of the 3.0 release. This change seems likely to land in a near-term Git release as well. Steinhardt has also been working on some improvements to Git's continuous-integration infrastructure to enable testing the Rust side of the build.

Large language models

Many projects have been struggling with whether (and how) to accept code that was produced with the help of large language models (LLMs); the Git project is no exception. Some projects are cautiously opening the door to such contributions; Git is being more cautious than most. Partly, that may be a result of its 2025 Google Summer of Code experience, where nearly all of the proposals received were LLM-generated; a first attempt at a related policy was considered at that time. Christian Couder recently posted an updated proposed policy for LLM-generated code that, in part, reads:

The Developer's Certificate of Origin requires contributors to certify that they know the origin of their contributions to the project and that they have the right to submit it under the project's license. It's not yet clear that this can be legally satisfied when submitting significant amount of content that has been generated by AI tools.

Another issue with AI generated content is that AIs still often hallucinate or just produce bad code, commit messages, documentation or output, even when you point out their mistakes.

To avoid these issues, we will reject anything that looks AI generated, that sounds overly formal or bloated, that looks like AI slop, that looks good on the surface but makes no sense, or that senders don't understand or cannot explain.

There has been some discussion of this proposal, with carlson saying that it is not firm enough. Chuck Wolber worried that it reads like a total rejection of LLM-generated code, which he seemingly does not support. Elijah Newren said that he has already contributed some LLM-generated documentation and wondered if it needed to be reverted. Git maintainer Junio Hamano has posted a firmer variant of the proposed policy that is derived from the one used by the QEMU project. More discussion is to be expected, but it seems that the Git project will remain relatively unwelcoming to machine-generated contributions for the foreseeable future.

Other stuff

It will probably not be in the next release, but sometime thereafter Git will include some documentation of its data model contributed by Julia Evans. A change that more users may notice, from Phillip Wood, is the switch to "main" as the default branch name. There has been a desire to move away from "master" for some time; the change is likely to be made in the 3.0 release. The biggest concern about that change at this point, seemingly, is the existing body of Git tutorials using "master", which could prove especially confusing for just the sort of new users those tutorials are aimed at. To head off that confusion, Git is likely to include one other change providing a hint for people who want to change the name back.

The Git project celebrated its 20th anniversary this year; in those two decades, Git has become one of the most important tools in a software developer's toolbox. After all that time, it remains clear that the job is not yet done. Development of Git is proceeding rapidly, and does not appear to be set to slow down anytime soon.

Comments (41 posted)

Explicit lazy imports for Python

By Jake Edge
October 20, 2025

Importing modules in Python is ubiquitous; most Python programs start with at least a few import statements. But the performance impact of those imports can be large—and may be entirely wasted effort if the symbols imported end up being unused. There are multiple ways to lazily import modules, including one in the standard library, but none of them are part of the Python language itself. That may soon change, if the recently proposed PEP 810 ("Explicit lazy imports") is approved.

Consider a Python command-line tool with multiple options, some of which require particular imports that others do not need; then a user invokes it with --help and has to wait for all of those imports to load before they see the simple usage text. Once they decide which option they were after, they have to wait again for the imports before the tool performs the operation they wanted. What if, instead, those imports could be delayed until they were actually needed in the Python code? That is the basic idea behind lazy imports.

Our last look at the idea was in December 2022, just after the steering council (SC) rejected a PEP for lazy imports because of concerns it would fracture the ecosystem. Unlike the current proposal, PEP 690 ("Lazy Imports") switched importation to be lazy by default. In the decision announcement, the SC saw problems with the feature because "it becomes a split in the community over how imports work", which leads to a need to test code "in both traditional and lazy import setups". But the SC did see the value in the underlying idea:

A world in which Python only supported imports behaving in a lazy manner would likely be great. But we cannot rewrite history and make that happen. As we do not envision the Python [language] transitioning to a world where lazy imports are the default, let alone only, import behavior. Thus introducing this concept would add complexity to our ecosystem.

Explicit, not implicit

Along the way, various suggestions had been made for a more explicit version of lazy imports, rather than defaulting to lazy and requiring an opt-out for the traditional behavior. After the rejection, one of the authors of PEP 690, Carl Meyer, asked if there was interest in a more explicit version; there was interest, and more discussion, but it ultimately did not go anywhere until now. The other author of PEP 690, Germán Méndez Bravo, joined with a long list of authors, notably including SC member Pablo Galindo Salgado, to create PEP 810, which adds the following syntax:

    lazy import foo
    lazy from foo import bar

The lazy keyword is soft, which means that it can be used in any other context (in particular, as a variable, function, or class name) without confusion. There are restrictions on lazy imports, though; for one thing, the wildcard import:

    from foo import *

cannot be done lazily, so putting lazy in front of that is an error. There are also multiple contexts where lazy imports cannot be specified; for one, they are only allowed at the global (i.e. module) level. The lazy keyword cannot be used inside functions or classes, in try/except blocks, or for from __future__ imports. Originally, it was also disallowed inside with blocks, but some of the feedback in the lengthy discussion thread caused that to change, as we will see.
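
Since the syntax is not part of any released Python, the following is only a sketch based on the PEP's description; it shows the soft keyword coexisting with ordinary uses of the name:

    lazy = "still a valid variable name"  # the soft keyword causes no conflict

    lazy import json             # only this statement form is treated specially
    lazy from os import path     # likewise for from-imports at module level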

The lazy keyword only makes the import potentially lazy; there are some mechanisms, including a global flag and a lazy-imports filter, that can make the interpreter ignore the keyword, which turns the statements into regular, non-lazy (or "eager") imports. The flag, which can be set on the command line, in an environment variable, or with the sys.set_lazy_imports() function, has three settings. If it is set to "normal" (or is unset), only lazy imports are handled lazily; "none" disables lazy imports entirely, while "all" makes every module-level import (except for import *, or in try blocks) lazy.
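
Based on that description (and hedging that the final API may differ), controlling the flag from code would look something like this sketch:

    import sys

    # Settings described in the PEP: "normal" (the default), "none", and "all".
    sys.set_lazy_imports("none")  # ignore the lazy keyword; every import is eager
    sys.set_lazy_imports("all")   # make every eligible module-level import lazy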

If an import is lazy, the names being imported are bound to lazy proxy objects that are only instantiated when they are referred to. For example:

    lazy import abc  # abc is now bound to a lazy proxy
    lazy from foo import bar, baz  # bar and baz are bound to lazy proxies

    abc.get_cache_token()  # first use of abc; loads the module

    bar()  # resolves bar, which loads foo; baz is still a proxy
    baz()  # resolves baz, does not reload foo

Beyond the explicit keyword, a module can have a __lazy_modules__ variable containing a list of module names (as strings); those modules will be treated as if the lazy keyword was applied when they are encountered in an import statement. It is meant to be used for code that may run on versions of Python that lack support for lazy, since setting __lazy_modules__ will have no effect on those versions, while using lazy would be a syntax error.
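
A minimal sketch of that mechanism, with module names chosen only for illustration:

    # On an interpreter that implements PEP 810, these imports behave as if the
    # lazy keyword had been written; older versions simply ignore the variable.
    __lazy_modules__ = ["json", "sqlite3"]

    import json     # treated as "lazy import json" where supported
    import sqlite3  # likewise; imported eagerly on older Pythons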

The process of turning a lazy proxy object into a concrete object is known as "reification". It uses the standard Python import machinery to import a module, which may effectively be a no-op if it has already been imported in the meantime, and to assign a module object to the name in sys.modules. If a symbol from the module is being referenced (e.g. bar() above), the symbol from the imported module is bound to the corresponding global variable (bar) in the importing module. If the import fails, or the symbol is not present, the usual exception (e.g. ImportError) is raised at the site where the reification happens, but with additional traceback information about the site of the lazy-import statement added in.
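
That error-reporting behavior can be pictured with a hypothetical missing module:

    lazy import no_such_module  # hypothetical name; no error here, just a proxy

    ...

    no_such_module.main()  # ModuleNotFoundError (a subclass of ImportError) is
                           # raised here, at first use, with extra traceback
                           # context pointing back at the lazy-import statement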

The truly massive PEP has a great deal of detail on the semantics, reference implementation, and backward-compatibility considerations for the feature. Due to the explicit nature of the opt-in, problems with backward compatibility are likely to be minimal. The timing of when import errors are reported might cause some problems, but if an existing import is not triggering an error, adding lazy should not change that in any way. Another thing to keep in mind is that reification is done using the state of the import machinery (e.g. various sys.path* variables and the __import__() function) at that time, not what it was at the time the lazy import was evaluated.
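
To illustrate that last point, a small sketch with a hypothetical module and path:

    import sys

    lazy import plugin_helpers  # hypothetical module, not yet on sys.path

    sys.path.insert(0, "/opt/site-plugins")  # hypothetical directory added later

    plugin_helpers.setup()  # reification uses sys.path as it is now, so the
                            # module can be found under /opt/site-plugins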

Discussion

Galindo Salgado announced the PEP and opened the discussion about it on October 3. Overall, the reaction has been quite positive, which is not much of a surprise given that the previous PEP was popular. Reducing startup time for applications is something that many developers have worked on—to the point that some organizations have either forked CPython or are patching their version to support some mechanism for lazy imports. For example, Meta has lazy imports in its Cinder fork of CPython; back in 2022, Méndez Bravo wrote an extensive look at the path to making lazy imports work for the Instagram code base using Cinder.

Part of the reasoning behind reviving the lazy-imports effort is to give companies like Meta (and others currently using lazy imports via internal changes to CPython) an "official, supported mechanism" so that "they can experiment and deploy without diverging from upstream CPython", Galindo Salgado said in the discussion thread.

Several people expressed concern about the global "all" flag that makes all imports lazy; for example, Adam Turner said:

I worry that this will become a hidden secret hack ™ that will proliferate as a way to easily increase performance with zero other changes, etc. This means that I as a [library] author would potentially be liable for an influx of bug reports that my library doesn't work with lazy imports, even though I haven't tested for or intend to use them.

The flag is only meant for advanced users who are willing and able to test their applications under those circumstances, Galindo Salgado said (note that the flag value changed from "enabled" to "all" along the way):

Docs will make the trade-offs clear: if you enable the global mode, you're expected to use the filter and own exclusions. It's fine for maintainers to close reports that only fail under -X lazy_imports=enabled without a minimal repro using explicit lazy. This is not really a problem: maintainers can simply decide what they support. If users open issues saying "your library could support this better," that's just information: you can close it if you don't want to invest, or act on it if you do. As a maintainer myself, I don't see harm in users surfacing that feedback, and we'll make sure the docs hammer home that enabling the flag means you accept those trade-offs.

Lazy importing already exists for Python in various forms, including in the LazyLoader that is part of importlib in the standard library; there are other options in the Python Package Index (PyPI) as well. So library maintainers likely have been dealing with users reporting lazy-import problems, though it may well not have been a common complaint.
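
For comparison, the standard-library approach follows the recipe from the importlib documentation; it finds the module eagerly and only defers executing it, a distinction that comes up again below:

    import importlib.util
    import sys

    def lazy_import(name):
        # Find the module eagerly, but defer executing it until first use.
        spec = importlib.util.find_spec(name)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        sys.modules[name] = module
        loader.exec_module(module)
        return module

    json = lazy_import("json")  # module object exists, but is not yet executed
    json.dumps({})              # first attribute access triggers the real load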

There are certain applications, like pip, that need to ensure their imports are done eagerly, as Damian Shaw pointed out: "Pip has to import [its] modules eagerly so that installing a wheel doesn't allow the wheel to insert itself into the pip namespace and run arbitrary code as part of the install step". He wondered if those imports should be done inside a "with contextlib.nullcontext()" block, which would cause the imports to be eager since they are in a do-nothing (nullcontext()) with block (though that has changed since this exchange). That would work, Galindo Salgado said, but a simpler solution would be for pip to explicitly disable lazy imports for itself with sys.set_lazy_imports("none"); the security implications section of the PEP was updated with that information as a result of the exchange.

Anticipating ImportError

There is a fairly common pattern in Python programs to switch to a different module when an import is not found, but it uses a try block to do so. Daniel F Moisset asked about a lazy alternative for something like:

    try:
        from typing import LiteralString
    except ImportError:
        LiteralString = str

Will Shanks suggested explicitly reifying the import at the point where it is about to be used, in order to catch the exception, but one of the PEP authors, Brittany Reynoso, pointed out that just using the module in the usual way (perhaps wrapped in a try block catching ImportError) will reify it if needed. Oscar Benjamin had a somewhat different formulation:

What I want is something like:

    try:
        lazy import numpy
        using_numpy = True
    except ImportError:
        using_numpy = False

    # Here numpy is still only lazily imported

Obviously this requires some part of the import machinery to at least access the filesystem so it could not be a completely lazy import.

"Thanos" suggested using importlib.util.find_spec('numpy') to determine if numpy can be found, which works, but is not a guarantee that actually doing the import will not raise an exception.

Brett Cannon, developer of importlib and LazyLoader, noted that a "key benefit of this PEP over LazyLoader" is that "it makes finding the module lazy (LazyLoader is eager in finding the module but lazy for loading)". But Alyssa Coghlan thought that the PEP needed to justify that choice "as it's what rules out using lazy imports for cheap existence checking". She did not necessarily want to see a change in semantics for the feature, but hoped for something to be added to the "Rejected Ideas" section of the PEP.

In a multi-part update of the PEP, Thomas Wouters, another of the seven PEP authors, added a justification for doing all of the module resolution at reification time. The concerns are that finding a module can often be a significant part of the performance cost of an import, especially on network filesystems. In addition, exceptions would be raised at different times for different kinds of problems, which could be confusing. "The current design is simpler: with full lazy imports, all import-related errors occur at first use, making the behavior consistent and predictable."

Interaction with types

One of the benefits of lazy imports is that they can be used to avoid importing typing information at run time; since the advent of static types for Python, an enormous ecosystem for typing has sprung up, nearly all of which is only needed when a static typechecker is being used—not at run time. The "FAQ" section of the PEP notes that the common pattern for avoiding the import of type annotations at run time can be switched to a lazy import:

    from typing import TYPE_CHECKING
    if TYPE_CHECKING:
        from collections.abc import Sequence, Mapping

    def process(items: Sequence[str]) -> Mapping[str, int]:
        ...

    # could instead be:

    lazy from collections.abc import Sequence, Mapping  # No run-time cost

    def process(items: Sequence[str]) -> Mapping[str, int]:
        ...

The PEP also mentions the possibility of automatically making "if TYPE_CHECKING" imports lazy down the road.

As David Ellis pointed out, though, lazy imports can be used to avoid problems with circular imports; that only works if those imports remain lazy, which will not happen if lazy imports are disabled using the flag. He wondered if there was a need to add a filter that is analogous to the lazy-imports filter but forcefully opts into lazy for some modules. In the message linked above, Wouters said the PEP authors preferred not providing another filter:

And yes, it is intentional that disabling lazy imports globally would expose import cycles that only work with lazy imports. The advice for import cycles remains pretty much the same, even with lazy imports: refactor the code so you don't have an unresolvable cycle.

Benjamin did not agree with Wouters's refactoring suggestion, due to circular typing imports, which are not uncommon. Jon Harding expanded on that, noting: "Library authors will (perhaps unintentionally) create circular references, only to later get bug reports from those (presumably) rare users that disable lazy imports." In general, there is an asymmetry in the proposal:

The PEP discusses how lazy imports become potentially lazy during evaluation. Have the PEP authors contemplated an analogous potentially eager state that eagerly reifies lazy imports except for those which reference partially imported modules?

Galindo Salgado said that the authors believe that using the flag, either to enable or disable lazy imports, is an advanced feature; its users have to take responsibility for any breakage experienced. Wouters agreed and added another rejected-ideas entry; "We think it's reasonable for package maintainers, as they update packages to adopt lazy imports, to decide to not support running with lazy imports globally disabled."

with

Using a try/except block with a lazy import could easily lead to confusion, since normally the intent is clearly to catch import problems in the block, but they won't actually occur until later. Various ideas about changing import exception handling were discussed, but the PEP authors have "decided to forbid them", Galindo Salgado said in a post trying to narrow down the discussion some. At that point, the thread had gone well over 200 posts in less than a week. The authors are "taking the approach of nailing the core feature set first and then allowing people to build on top of it later", he emphasized.

One area that the authors were still soliciting input on was the question of lazy imports inside with blocks. Paul Ganssle had a lengthy post describing a backward-compatibility strategy for modules that already have some form of lazy imports. Under the assumption that the lazy-imports feature was added for Python 3.15 (due in October 2026), existing library code does not have an easy path: "The __lazy_modules__ mechanism allows you to opt in to lazy semantics for 3.15, but I already have lazy semantics, and switching to __lazy_modules__ for Python <3.15 would be a regression for my users." He suggested using a context manager like lazy_imports from the eclectic utils (etils) package as a way to bridge that gap:

    __lazy_modules__ = ["foo", "bar"]

    from ._compat import lazy_importer

    with lazy_importer(__lazy_modules__):
        import foo
        from bar import blah

If lazy imports were allowed inside with, Python 3.15 and above could simply turn lazy_importer() into contextlib.nullcontext() and the __lazy_modules__ would take care of handling foo and bar lazily. But with the PEP as it stands, that will not work. Furthermore:

The backwards compatibility use case only works if with statements are allowed from the beginning — if they are forbidden in 3.15 and then allowed in 3.16+, we are still stuck with a hack in 3.15.

Several people opted for simplicity, suggesting that lazy imports in with blocks could be added later. But Ganssle said that excluding with is actually complicating things:

I would argue that allowing lazy imports is the simpler option, since forbidding them requires a more complicated implementation and requires specifying things like "when a lazy import appears in a context manager using the __lazy_modules__ mechanism, it is not an error but rather imported eagerly".

He is referring to an example he gave in an earlier message. Specifying a module import as lazy using __lazy_modules__ leads to a potentially confusing situation:

    __lazy_modules__ = ["foo", "bar"]

    import foo # lazy

    with blah():
        import bar  # eager

Some of the same arguments could be made with regard to try, but the intent is different; context managers can be used to suppress exceptions, but that usage is not widespread, whereas catching exceptions with except is ubiquitous. As Ganssle put it: "on Github I see about 10k uses of suppress(ImportError) and 3.5M uses of except ImportError". In addition, for a clean backward-compatibility picture using a context manager as he described, with would need to support lazy imports from the outset.

Those arguments seem to have been sufficient, as the PEP authors decided to allow lazy imports in with blocks, Galindo Salgado said:

There are many legitimate use cases where with is used for managing lifetime or scoping behavior rather than just suppressing exceptions, and the syntax is explicit enough that users know what they're doing. It fits Python's "consenting adults" model as with carries broader semantics than just error handling. For the genuinely problematic cases (like contextlib.suppress(ImportError)), we think linters are the right abstraction to catch them rather than hard language restrictions.

Off to the SC

That led to a final PEP update, which was sent to the SC for its consideration on October 14. The short ten-day discussion period before SC submission surprised some, but Galindo Salgado assured commenters that the SC would not have time to start its review for a few weeks. That should give people more time to review it, but, after more than 300 comments, the authors "certainly feel it has been discussed enough for us to feel confident on the design".

There was, of course, the inevitable bikeshedding about names—"defer" was a popular replacement for lazy—and about the placement of the new keyword. There are quite a few who feel that from imports should look like:

    from foo lazy import bar

The PEP addresses that question (it is not missing much, if anything) by pointing out that the authors also preferred that form, but found that it was already legal syntax. White space is not significant in the from statement, so:

    from . lazy import bar

    # is equivalent to

    from .lazy import bar

Some felt that deprecating that corner case was worth it, while others saw having lazy imports always start with the lazy keyword as an advantage. That is typical when choosing a bikeshed color, of course.

One would guess that PEP 810 has an excellent chance, given that Galindo Salgado and Wouters were both on the SC that rejected PEP 690, so they presumably know how to shape something that will be acceptable this time around. Having something in the language itself will help libraries and applications converge on a single lazy-import implementation, rather than the hodge-podge of solutions in use today. Even in the unlikely event that the PEP is rejected outright, it will still have been a useful exercise: it is hard to imagine that any lazy-import feature would make the cut when both the implicit (PEP 690) and explicit (PEP 810) approaches have failed, so if the topic is raised again, it will be easy to point to good reasons not to pursue it.

Comments (7 posted)

A brief history of RubyGems.org

By Joe Brockmeier
October 17, 2025
Ruby libraries and applications are distributed via a packaging format called a gem. RubyGems.org has been the central hosting service for gems since about 2010. This article is part one of a two-part series on the RubyGems.org takeover by Ruby Central. Understanding the history of RubyGems.org, and the contributor community behind it, is vital to making sense of the current power struggle between Ruby Central and members of the Ruby community who have maintained those services and tools for many years.

Finding gems

Like many open-source projects, the RubyGems.org service and tooling developed organically over the years to meet the needs of the growing Ruby community. Yukihiro Matsumoto, better known as "Matz" in the community, released the first version of Ruby in December 1995. Early Ruby release announcements seem to be lost to time, but Matsumoto provided a timeline in his "Ruby After 25 Years" talk, which is available on YouTube for those interested in Ruby history.

The Ruby community needed a home for open-source Ruby projects, which led to the creation of RubyForge in 2003. The hosting costs for the service were funded by Ruby Central, the nonprofit founded in 2001 by David Alan Black and Chad Fowler after they'd organized the first RubyConf. For most of its existence, Ruby Central's primary focus has been putting on Ruby events, but it has continued to help fund Ruby hosting services since RubyForge was launched.

Ruby did not have a package manager when it was introduced; according to a 2006 article, work on the RubyGems package-management framework began in November 2003. It was created by Paul Brannan, David Alan Black, Chad Fowler, Richard Kilmer, and Jim Weirich. The first release was announced in March 2004, nearly a decade after Ruby appeared. The gem command-line utility is a frontend for RubyGems; it can be used to do everything from creating, installing, and managing gem packages to publishing them to a gem server. Development of the framework is separate from the language, but RubyGems has been included with Ruby since the Ruby 1.9.0 release in 2007.

In June 2006, Fowler announced the start of integration of RubyGems with RubyForge, which led to the forge becoming the central repository for gems. GitHub, for a short time, also served as a source of Ruby gems, which meant that there were two popular sources for gems. However, GitHub dropped its automatic gem-building service when it moved from Heroku to Rackspace in 2009, which meant that it was up to the community to maintain RubyForge as a reliable service to serve and find Ruby gems.

Growing pains

"Reliable", however, is not necessarily the first word that would come to mind about the early days of gem hosting. There were frequent complaints about RubyForge's slowness and occasional downtime. In 2009, Nick Quaranto proposed "a fresh take" on gem hosting based on an MIT-licensed Ruby on Rails project he called Gemcutter. Gem hosting would move to RubyGems.org, with support from Ruby Central for hosting costs. Development for RubyGems moved to GitHub (archived repository here) in December 2010, and RubyForge was shut down entirely in 2014. The operation of RubyGems.org, as well as the development and maintenance of the software that powered it, were handled by volunteers.

Aside from gem hosting, the Ruby community had another package-management problem: dependency management. It was simple to install individual gems, but applications written in Ruby would have dependencies on specific versions of many gems. As with other package-management tools, users suffered from "dependency hell" trying to assemble the right combination of gems to run complex applications. This was exacerbated by the popularity of Ruby frameworks like Merb and Ruby on Rails that helped developers create web applications with many dependencies.

To solve the dependency-management problem, Carl Lerche and Yehuda Katz started the Bundler project in 2008. Bundler was designed to track and install the exact gems and versions needed by simply typing "bundle install". André Arko joined in on its development in 2010, around the time of the 0.9.5 release. Arko became Bundler's most active developer that year; Lerche's last contribution was in 2011, and Katz's was in 2012.

Bundler adoption took off, which had a side effect; it put a lot of stress on RubyGems.org because it made it much easier for users to grab many gems automatically. But, as Arko noted in his blog post about the history of Bundler, the tool was slow at first. Early iterations of Bundler would receive a list of all gems on RubyGems.org every time a user ran bundle install. Quaranto, Arko said, "pragmatically wrote a new API for RubyGems.org, shipped it, and let us know that we could use it" to list individual gems instead of "every gem in existence".

Adoption continued to climb; Arko said that the tool was being downloaded about 30,000 times per day by 2012. That meant a lot of users hammering RubyGems.org:

The growing number of Bundler users slowly built up until October 2012, when we discovered that Bundler was effectively running a DDoS attack against RubyGems.org when the servers went down, hard. There was no way for the existing architecture to handle the huge number of requests coming in at all times. We had to completely disable the dependency API, and Bundler went back to being slow.

At this point, a team including myself, Terence Lee, Larry Marburger, and others, took the time to design, implement, deploy, and scale a separate Bundler API web application to serve the dependency API for Bundler users. With the cooperation of the RubyGems.org team, including Evan Phoenix and David Radcliffe, we were even able to make the original API urls continue to work.

Bundler was eventually included as a default gem in 2018 with the Ruby 2.6.0 release.

Ruby Together

The work that enabled RubyGems.org to meet increased usage was, of course, taken on by volunteers from the Ruby community. In 2015, Arko announced the formation of a nonprofit called Ruby Together to try to fund some of that work:

All of the infrastructure used by Ruby developers today, including Bundler, RubyGems, and RubyGems.org is maintained and developed by volunteers. While it's good that no one company controls resources shared by the community, it's terrible that the only people who work on our shared infrastructure are doing so for free and in their spare time.

Beyond funding development for package-management tools, Ruby Together was meant to fund incident response for RubyGems.org. Arko and David Radcliffe, who maintained the RubyGems.org servers, were the initial team funded by Ruby Together.

Moving from RubyForge to RubyGems.org had not solved the reliability problems for gem hosting. Arko wrote a retrospective on the first 18 months of Ruby Together in September 2016; he mentioned a security issue that led to a RubyGems.org outage, "about 3 years ago", that lasted more than a week:

There was a security issue, and although the team knew about the security issue and planned to fix it, everyone on the team had a day job. The best we could do was plan to fix the problem that weekend. Unfortunately, a motivated hacker figured out how to break in during the week.

We had to take the servers down and create completely new servers from scratch. We also had to download and verify every single one of the hundreds of thousands of .gem files, to make sure the hacker hadn't replaced any of them while he had access.

Arko also recounted the work that Ruby Together had helped fund, and lamented that membership had remained flat since it started:

We haven't seen any new companies join for almost six months. If that keeps up, we won't even be able to keep up paying for a few hours of work per week. If companies keep taking community benefits without giving back, it takes us straight back to unsustainable volunteers burning themselves out.

At the end of 2016, the nonprofit had revenue of $268,237, and expenses of $305,183. A look at all of the organization's tax filings shows that its revenue peaked in 2017 at $324,331. While Ruby Together paid for some development work and operations, Ruby Central covered the hosting costs for RubyGems.org.

This is, by now, a familiar story—an open-source community creates a service that is widely used but suffers the tragedy of the commons. Everyone wants to use the fruits of the project, but no one wants to pay for it.

Centralized

After years of trying to persuade companies to fund the organization, Ruby Together board member Jonan Scheffler suggested that the two nonprofits merge. Ruby Central announced the merger in 2021. Ruby Together was dissolved and Ruby Central was to provide support for RubyGems, Bundler, and operations of RubyGems.org in addition to paying for hosting. Development of the open-source projects and management of RubyGems.org continued to be led by the community with informal governance. Ruby Central formed an Open Source Software (OSS) committee in August 2023 to formalize processes and lay "the groundwork for the continued success of these essential Ruby Central tools".

All of that is what led up to the recent power struggle between Ruby Central and Ruby volunteers working on the tools and services. In the next article, we will dig into that struggle and its aftermath so far.

Comments (7 posted)

The RubyGems.org takeover

By Joe Brockmeier
October 20, 2025

In September, a group of long-time maintainers of Ruby packaging tools had their GitHub privileges revoked by the nonprofit corporation Ruby Central in what many people are calling a hostile takeover. Ruby Central and its board members have issued several public statements that have, so far, failed to satisfy many in the Ruby community. In response, some of the former contributors to RubyGems are working on an alternative service called gem.coop. On October 17, ownership of the RubyGems and Bundler repositories was handed over to the Ruby core team, even though those projects had never previously been part of core Ruby. The takeover and subsequent events have raised a number of questions in the Ruby community.

Ruby Central is a nonprofit that was formed by David Alan Black and Chad Fowler in 2001 to organize events for the Ruby community. It soon began supporting other initiatives, such as RubyForge, which shut down in 2014, and has helped pay for RubyGems.org hosting since its inception. However, Ruby Central has always been primarily an organization to put on conferences—it has not been actively involved in maintenance or operations until its merger with Ruby Together. The work to maintain and operate RubyGems.org, the Ruby community's hosting service for Ruby gem packages, has been undertaken primarily by volunteers for most of its existence. LWN covered this in more detail in the article "A brief history of RubyGems.org".

Takeover

Development of RubyGems, Bundler, and software for RubyGems.org has been maintained in repositories under the RubyGems GitHub organization for many years. Organizations are used to manage shared accounts for multiple repositories; organization administrators can configure the roles and permissions granted to users for one or more repositories under the organization. Note that GitHub roles are only visible to members of an organization with push access to a repository; it is not possible to verify a person's role in a repository without that access, which makes it impossible for outsiders to audit these changes.

On September 9, a RubyGems maintainer renamed the GitHub organization from "RubyGems" to "Ruby Central", added Ruby Central's director of open source Marty Haught as a maintainer, and removed everyone else. This is according to a document provided by Ellen Dash, who said that the takeover happened "with no warning or communication" to the other maintainers of these projects. Joel Drapper named Hiroshi Shibata as the maintainer who handed control of the organization to Haught in his timeline of the events.

Dash said that Shibata refused to revert the changes unless Haught gave permission to do so. Drapper's report indicates that Haught met with some of the maintainers on Zoom and explained that he had been working on operational planning. He was putting together an agreement that operators of the RubyGems.org service would be required to sign. Shibata had jumped the gun.

Martin Emde, one of the maintainers who had been locked out, submitted a pull request to the RubyGems RFC repository with a proposal for RubyGems organizational governance on September 14. The proposal was based on the Homebrew project's governance policy. Mike McQuaid, who helped create Homebrew's policy, offered his help in refining the policy for RubyGems. A fair amount of discussion took place between RubyGems maintainers about the policy over a few days.

On September 15, Dash said, access was restored after Haught gave Shibata permission. I emailed Emde about the events; he said that when the maintainers' access was restored "all of us were asked 'not to seek revenge' even though any one of us could have removed" Haught and Shibata. On September 18, Haught replied to the governance pull request discussion and said:

I've taken a first pass on this and this is a great start. I'll dig into specifics as I have more time. I'm committed to find the right governance model that works for us all. More to come.

To date, Haught has not replied to the discussion again. That day, Dash said, Haught once again "revoked GitHub organization membership for all admins on the RubyGems, Bundler, and RubyGems.org maintainer teams" with no explanation. She added that Ruby Central refused to restore GitHub permissions and also revoked access to the bundler and rubygems-update gems on RubyGems.org. "I will not mince words here: This was a hostile takeover." (Emphasis in the original.)

Emde said Haught had claimed the original changes were a mistake, "but then broke that truce in the middle of formalizing a clearer governance". He also said that Haught did not believe Ruby Central was right to take the repositories. "He knows that they were taken from us unfairly."

Takeover becomes public

All of this had happened more or less quietly until Dash spoke out about it on September 19, and published her timeline of the events. Dash, who had been a RubyGems maintainer for many years and had acted as a contractor for Ruby Central on a part-time basis since it absorbed Ruby Together, said she was resigning from her position "effective immediately".

Valerie Woolard, president of Ruby Central's board, said that the changes were "part of an effort to harden our supply chain security posture and will be followed by discussions as how to develop a sustainable governance model going forward". She also referred people to a post by Ruby Central called "Strengthening the Stewardship of RubyGems and Bundler". It said, in part:

Moving forward, only engineers employed or contracted by Ruby Central will hold administrative permissions to the RubyGems.org service.

In addition, with the recent increase of software supply chain attacks, we are taking proactive steps to safeguard the Ruby gem ecosystem end-to-end. To strengthen supply chain security, we are taking important steps to ensure that administrative access to the RubyGems.org, RubyGems, and Bundler is securely managed. This includes both our production systems and GitHub repositories. In the near term we will temporarily hold administrative access to these projects while we finalize new policies that limit commit and organization access rights. This decision was made and approved by the Ruby Central Board as part of our fiduciary responsibility. In the interim, we have a strong on-call rotation in place to ensure continuity and reliability while we advance this work

According to Drapper's timeline, the on-call rotation mentioned in the post was provided by Shopify employees.

Ruby Central promised a community Q&A session with Haught, members of the Ruby Central board, and its executive director, Shan Cureton, on September 23. The post was updated on September 25 to say the Q&A had been postponed because it was scheduled "on a major holiday in addition to it being an inconvenient time for our global community", it being the start of Rosh Hashanah.

In the place of the Q&A, Cureton provided a video update that said, "sponsor questions about supply-chain risk made one thing clear: we needed to close governance and access gaps quickly". She also indicated that there were recent departures which made the changes "urgent", though she did not name the people involved. Drapper did, however, in a blog post about the insufficiency of Ruby Central's security measures published on September 30. He identified RubyGems lead maintainer André Arko and security engineer Samuel Giddins as the people who had departed.

I emailed Arko with questions about these events. He said that he and Giddins had "clearly stated we were continuing as project maintainers". He also said that Haught confirmed that he still considered Arko the team lead for the RubyGems and Bundler projects at that time.

Arko had, however, announced a "new kind of Ruby management tool" on August 25 that could someday replace RubyGems and Bundler. The tool, rv, is a Rust-based Ruby "language manager" patterned after the uv package-management tool for Python. The end goal for rv is "a completely new kind of management tool" that would handle everything from installing Ruby to managing gems and more. The team working on rv includes Giddins and Sam Stephenson, creator of the rbenv Ruby version-manager tool. Arko said that he had learned "some people think working on two related open source projects at once is an impossible conflict of interest".

Conflicting stories

Ruby Central's public communications have claimed that the takeover was about supply-chain risk and a need to live up to its responsibilities related to RubyGems.org infrastructure and open-source projects on GitHub. There is no question that Ruby Central does have a reasonable claim to "ownership" over the operation of RubyGems.org infrastructure. The nonprofit has helped to pay for hosting since the beginning, and the merger with Ruby Together made Ruby Central the sole funding source for paid operations of RubyGems.org.

However, there is no indication that the open-source maintainers ever agreed to hand over any authority to Ruby Central for the RubyGems and Bundler open-source projects. The merger agreement between Ruby Together and Ruby Central does not convey control of those projects. It is left to Ruby Central to decide "whether to start, continue, or terminate fundraising and programming efforts" as a continuation of work that Ruby Together had done, but that does not imply ownership of those projects. It does mention an open-source committee that is supposed to propose development work budgets to the board.

Ruby Central apparently formed such a committee in August 2023, but did not announce it until November 2024. None of the RubyGems or Bundler contributors or Ruby Central's "Open Source Team" were involved in this committee, except Haught. A blog post promised that Ruby Central would "discuss the details of how the committee works" in the future. If a post explaining the committee and its work was ever published, I cannot find it.

Before the takeover, development of these projects carried on as it had for years: some paid work funded by a nonprofit, with most of it done on a volunteer basis and governed by lightweight contributor policies. See the RubyGems POLICIES.md and Bundler POLICIES.md for more. The policies are not as comprehensive as one might hope, but they were in place, and maintainers had every reason to believe that they would be followed.

Supply chain

The claim that this was urgently necessary due to supply-chain issues has also been questioned. The prevailing counter-theory seems to be that Ruby Central moved when and how it did due to funding problems and influence from a major sponsor: Shopify.

Ruby Central had recently dealt with what it called supply-chain issues. In August, an application-security company, Socket, published its research on what it called "a long-running supply chain attack in the RubyGems ecosystem".

Since March 2023, a threat actor had published dozens of malicious gems that were advertised as automation tools for Instagram, Telegram, TikTok, WordPress, and others. While the gems did provide the promised functionality, they also sent user credentials to "threat-actor controlled infrastructure".

Haught published a blog post about the attack on August 25. He credited RubyGems maintainer Maciej Mensfeld with initial detection of the attack and Josef Šimánek for his assistance in removing malicious gems. At the time, Haught said that the incident "shows our security systems working as intended: threats were detected, removed, and contained before they could cause widespread harm".

Socket and Ruby Central have positioned this as a "supply-chain attack", but it did not fit the profile of one. Publishing gems that are malicious from the get-go is not really a supply-chain attack; it's simply a threat actor publishing malicious software through RubyGems.org and enticing people to use it. The problem is one of inadequate review of gems before publishing, not a matter of subverted infrastructure or smuggling a malicious payload into a popular project. Nothing that Ruby Central has done regarding removing maintainer account access seems to address the problem of malicious gems at all.

Funding

Ruby Central was struggling with funding problems, though. In a talk given at the Baltic Ruby conference on June 13, Haught talked about the RubyGems.org budget and funding. The budget for 2024 was $1.2 million, and $1.4 million for 2025; however, he said, "we haven't quite raised enough money to cover the budget for this year", which was something that he had to deal with. The video is available on YouTube.

Before the pandemic, Haught said, Ruby Central had made a lot of its money from conferences, "and so that funded all this work previously" but that was no longer the case. "So now the open-source program has to figure out how to fund itself". That had prompted Ruby Central to spin up a corporate sponsorship program in 2024.

The nonprofit had received a lot of grant funding in the past two years, he said, but that money was running out. "So, now I'm in the position of figuring out how to replace grant money when it's no longer available to us." According to a graph he showed during the talk, about 62% of funding was from grants, about 15% was in the form of donated services, and about 23% came from individual membership or corporate sponsorships.

Note that a large percentage of Ruby Central's budget would be allocated to salaries for Haught and Cureton. Both were hired after the 2023 tax year, which is the last filing publicly available; however, the executive director position was advertised with a range between $120,000 and $150,000. Emde speculated that Haught would have a higher salary than Cureton, but he was unsure.

Loss of Sidekiq

One reason for the budget shortfall, aside from the post-pandemic malaise, is the loss of a major corporate sponsor: Sidekiq. The company withdrew its $250,000 sponsorship after Ruby Central announced that Ruby on Rails creator David Heinemeier Hansson (often referred to simply as "DHH") would keynote the final RailsConf event in July. Sidekiq creator Mike Perham said that he rescinded the grant because "We cannot tolerate hateful people as leaders in our communities." David Celis wrote a blog post on September 19 that gives one perspective on some of the Ruby community's grievances with Hansson.

Hansson had keynoted or had been interviewed at RailsConf from 2006 through 2021, with a break in 2016 due to a scheduling conflict. But, in 2022, Hansson was essentially uninvited following a controversial "no politics" policy at 37signals (a company Hansson co-founded) that prompted about a third of employees to leave and drew a lot of negative attention to Hansson and 37signals.

No doubt, Hansson was not pleased at being uninvited from a conference about a technology he created. That led to the creation of the Rails Foundation and Rails World conference. Having a competing Rails event seems to have also contributed to the decline in Ruby Central's conference income, and its decision to end RailsConf after this year.

The purpose here is not to get deeply into those controversies, but to acknowledge the fact that Hansson has publicly and regularly taken positions on topics outside of Ruby that alienate quite a few people. That, in turn, put Ruby Central in a bit of a bind; some people (and sponsors) would be upset if Hansson was at RailsConf, others would be upset if he was not. There was no option that would please everyone, so it was a matter of choosing who to upset.

The first time around, Ruby Central chose to distance itself from Hansson. This year, it chose to give him the stage, and that cost the organization a significant chunk of its $1.4 million budget. Many people have also taken note of the fact that Hansson joined the Shopify board of directors last year.

Shopify and Ruby Central

Drapper said that he was told by "an anonymous source" that Ruby Central was presented with a long-term funding proposal at the Rails World 2025 conference, held September 4 through September 5, "but this would only happen if certain RubyGems maintainers were removed". Dash said that the maintainer to be removed was Arko. Drapper also claims that "Shopify specifically put immense financial pressure on Ruby Central to take full control of the RubyGems GitHub organisation and Ruby gems".

Freedom Dumlao, a member of the Ruby Central board, did not identify any sponsor specifically, but seemed to confirm that the board was reacting to a demand related to funding and had to decide quickly:

A deadline (which as far as I understand, we agreed to) loomed. Either Ruby Central puts controls in place to ensure the safety and stability of the infrastructure we are responsible for, or lose the funding that we use to keep those things online and going.

Dumlao said that conversations were ongoing with maintainers, but that the board had a deadline that was less than 24 hours away. In Arko's response to me on October 19, he said that he had been told directly that Ruby Central had been required to force him out of the projects as a condition of receiving corporate sponsorships. He said that he was aware some companies had been unhappy with his ideas to raise money for maintainer work, and for deprioritizing their feature requests. No amount of unhappiness, though, "justifies Ruby Central stealing the project repo" to kick him out.

The community responds

Ruby Central's statements about the takeover did not seem to satisfy many people in the Ruby community. Drapper called one statement "AI-generated corporate speak" that "bears no signature from anyone at Ruby Central willing to take responsibility". Šimánek, who was a part-time contractor for Ruby Central as well as one of the maintainers, said that the organization was using supply-chain security as an excuse to remove people from projects "they never owned [...] and now claim ownership themselves".

McQuaid said that he had met with people on both sides to try to mediate the dispute. His take was that Ruby Central managed things poorly, "including removing literally the most active member of the RubyGems organisation by mistake who has declined to return." That would appear to be former RubyGems and Bundler maintainer David Rodríguez, who updated his GitHub profile to say that he had been kicked out. "I was informed that they would unilaterally remove fellow maintainers from the project in order to keep funds from Shopify."

Ruby community member Justin Searls, however, said he was not rushing to take sides. He did not have a clear picture, he said, but "I don't believe this is a cut-and-dry case of altruistic open-source maintainers being persecuted by oppressive corporate interests". He also provided a timeline of actions by Arko to provide "broader historical context", such as a comment that suggests Heroku should fund Ruby Together if it expects Arko to continue backporting fixes to an old version of Bundler, and adding a post-install message to Bundler asking users to support Ruby Together. Searls urged others "not to rush to judgment about who's at fault in the current conflict".

Former Shopify employee Jean Boussier has defended Shopify. He was employed by the company from November 2013 to August 2025, but left "mainly because of my constant friction with the CEO". Despite that, Boussier said that he is unconvinced that Shopify threatened to pull funding or that the takeover was orchestrated by the company. He also noted that he had contacted two former coworkers who assured him, "Shopify never threatened to pull Ruby Central's funding, nor threatened not to renew it".

Ruby Central responds

Ruby Central published an update on September 30, signed by Cureton, that apologized for the confusion caused by failing to communicate "earlier and in more detail". It denied that what had happened constituted a takeover and said: "We accept responsibility for how our initial communications created the impression of sponsor-driven action."

Cureton denied that sponsors had directed Ruby Central's actions: "The Board acted independently, and financial support was NOT conditioned on taking these steps." The update also said that the organization would publish regular updates on Fridays, with news on the status of the repositories "soon". A brief weekly update was published on October 3; it noted that "discovery work related to supply-chain security and governance concerns" was ongoing and would be shared "as soon as we're able".

On October 9, Cureton published a "post-incident review" of an "AWS root-access event". It said that Drapper's blog post on September 30, which demonstrated that Ruby Central had failed to revoke Arko's access as part of its supply-chain cleanup, "raised concerns that a former maintainer continued to have access to the RubyGems.org production environment". The review includes a detailed analysis of Arko's accesses to the RubyGems.org AWS account; if anything, that analysis demonstrates that Ruby Central had not, in fact, done a thorough job of its stated goal of improving security. In one of his replies, Emde said that Ruby Central "never previously had, and I would argue still doesn't have the capacity to maintain this service independently" of the long-time maintainers they locked out:

We were responsible for protecting billion dollar companies and every Ruby developer in the world, from being hacked. The US government has previously coordinated drills with package repository maintainers. It's hard to overstate how big a responsibility it was and this has always been handled outside of Ruby Central. [Cureton] and Marty are not qualified nor is it least privilege, to hold such access just for the purpose of being able to take it away at their discretion.

Ruby Central's post casts Arko in a sinister light but concludes that there was no evidence that the "security incident" actually compromised anything.

Ruby Central also shared an exchange with Arko from early August, to "provide additional context to the community about our decision to formalize production access". It said that Ruby Central had been reviewing its contractor budget and planned to stop working with Arko's consultancy, "which had been receiving approximately $50,000 per year for providing the secondary on-call service". It included an email from Arko sent on August 3 that offered to provide secondary on-call services at no charge "in exchange for access to production HTTP access logs, containing IP addresses and other personally identifiable information (PII)".

The board and leadership team, Cureton said, "determined that this proposal crossed important ethical and legal boundaries", which set in motion Ruby Central's actions to revoke access from maintainers. It was selectively sharing communications with Arko "to be transparent about what occurred, what we have learned, and what we are doing to prevent it" in the future.

Arko responded the same day. About two weeks after Ruby Central took over the GitHub organization and stated it was performing a security audit, Arko said that "someone asked if I still had access", and he found that he did.

I discovered (to my great alarm), that Ruby Central's "security audit" had failed. Ruby Central also had not removed me as an "owner" of the Ruby Central GitHub Organization. They also had not rotated any of the credentials shared across the operational team using the RubyGems 1Password account.

Arko said that he wrote to Haught on September 30 to disclose that the organization had not terminated all of his access. He said Haught responded three days later, asking him to confirm whether he had any production data, server logs, access logs, or PII. Arko replied that he did not; he also noted in the blog post that he has no interest in any PII "commercially or otherwise", but confirmed that he was interested in acquiring "company-level information with no information about individuals included in any way". He also argued that his actions "were taken in defense of the service that Ruby Central was paying me to support and defend".

Ruby Central published another update on October 10. This included an email from Haught on September 18 that informed Arko that Ruby Central was "pausing" on-call rotations and directed him to send a pro-rated invoice. It said that there had been no live Q&A "yet" due to a risk of "spreading incomplete information" and excluding contributors who could not participate in real time.

Additionally, it said that a lawyer had sent Ruby Central a cease-and-desist letter on Arko's behalf with a claim that he owns the Bundler trademark, "along with various other demands". Cureton said that Ruby Central did not expect to make further public comments until those issues were resolved.

On October 10, Perham wrote that Ruby Central is "smearing Andre in public so they can justify their hostile takeover of the rubygems/rubygems repo after the fact". Arko told me that it was "wildly hypocritical" that the nonprofit published an idea he asked them about, while "keeping their own critical decisions completely secret". He also said that Ruby Central's actions were unneeded:

The "Fork" button has been there the whole time, and Ruby Central could have used it at any point to have as much security and control as they could possibly want. Ruby Central's unelected board violated the written policies of the projects they now claim to own in order to take them from their maintainers of over ten years. Their claims they had to steal the projects for legal reasons are now obviously false, since they have already passed off the stolen projects to a different outside party.

The pass-off that Arko is referring to is the announcement on October 17 by Ruby creator Yukihiro Matsumoto (a.k.a. "Matz") that the core Ruby team will be assuming stewardship of RubyGems and Bundler. The repository ownership will change in order "to ensure long-term stability and alignment with the broader Ruby ecosystem".

According to Arko, however, none of the locked-out maintainers of RubyGems or Bundler were contacted about this transition ahead of time. Emde and Rodríguez confirmed, in email replies to me, that they were not consulted. Emde also said he believed that if Matz knew the details "he would make the right decision". Ruby Central, he said, "took the repositories from us. They know it. We know they know it." The project was healthy, well-maintained, and actively developed, he said; so well-maintained, in fact, that this situation is the first time that many people have thought about RubyGems and how much work goes into the project. He also emphasized that there were six maintainers affected by the takeover, not just Arko:

The smearing of André reveals Ruby Central's focus more than it explains the situation. I think it's convenient for them to have a scape goat but it distracts from the others that were harmed by this.

gem.coop and transition

Since there is little indication that Ruby Central is going to reverse course, it seemed inevitable that there would be a fork or alternative effort from the community. That happened in early October when Emde announced gem.coop. The goal is for that project, eventually, to become a new server for the gems ecosystem.

The site lists Arko, Dash, Emde, Giddins, Rodríguez, and Šimánek as the "cooperative" behind the initiative. The service currently caches gems published to RubyGems.org; it is not possible to publish gems directly to it—at least not yet. According to the site, its governance will be modeled after Homebrew. The governance documents are on GitHub.

There has been no public activity in the gem.coop code repository since October 12. Arko told me that this is due to a focus on finishing the project's governance. "Once project leadership is elected, we expect to resume work on gem server features."

Despite the length, this is a much-abbreviated overview of what's known publicly about the RubyGems.org takeover so far. No doubt, there is more yet to be uncovered and more to come. It is always disappointing to see this type of drama in open-source communities; this episode should serve as yet another warning to open-source projects to get their governance in order before experiencing a similar scenario.

Comments (3 posted)

Large language models for patch review

By Jonathan Corbet
October 16, 2025
There have been many discussions in the free-software community about the role of large language models (LLMs) in software development. For the most part, though, those conversations have focused on whether projects should be accepting code output by those models, and under what conditions. But there are other ways in which these systems might participate in the development process. Chris Mason recently started a discussion on the Kernel Summit discussion list about how these models can be used to review patches, rather than create them.

Mason's focus was on how LLMs might reduce the load on kernel maintainers by catching errors before they hit the mailing lists, and by helping contributors increase the quality of their submissions. To that end, he has put together a set of prompts that will produce reviews in a format that maintainers are used to: "The reviews are meant to look like emails on lkml, and even when wildly wrong they definitely succeed there". He included a long list of sample reviews, some of which hit the mark and others of which did not.

The prompts are interesting in their own right; they can be seen as constituting the sort of comprehensive patch-review documentation that nobody ever quite got around to writing for humans to use. Perhaps that reflects a higher level of confidence that the LLM will actually read all of this material. These prompts add up to thousands of lines of material, starting with core guidance like:

Struct changes → verify all users use the new struct correctly

Public API changes → verify documentation updates [...]

Tone Requirements:

  • Conversational: Target kernel experts, not beginners
  • Factual: No drama, just technical observations
  • Questions: Frame as questions about the code, not accusations

Most of the prompts consist of guidance specific to subsystems like locking ("You're not smart enough to understand smp_mb(), smp_rmb(), or smp_wmb() bugs yet") and networking ("Socket can outlive its file descriptor"). All told, it resembles the sort of rule collection one saw in the expert systems that were going to take over the world in the 1980s. As noted in the README file, "the false positive rate is pretty high right now, at ~50%", so there is still some room for improvement.

In the ensuing discussion, nobody seemed to think that using LLMs in this way was a bad idea. Sasha Levin called it "a really great subject to discuss", and said that, in the previous discussions on LLM use by kernel developers, the concerns that were raised about LLMs drowned out any attempt to find the places where they could be useful. Paul McKenney remarked that using this technology to review code written by others "seems much safer than using it to generate actual code". Krzysztof Kozlowski noted that Qualcomm has created a similar system and made it available.

There were some concerns raised about the proprietary nature of these systems; Konstantin Ryabitsev was just one of a few who drew parallels with the BitKeeper experience that (briefly) brought kernel development to a halt just over 20 years ago. Laurent Pinchart stated clearly that there are limits to how much proprietary tools can be used or required:

Forcing contributors to pay for access to proprietary tools is not acceptable. Forcing contributors to even run proprietary tools is not acceptable.

He also expressed concerns that the companies behind LLMs would make them available to developers for free to encourage adoption — until the community is well locked in, at which point access could quickly become expensive. Mason, though, was unworried about lock-in, saying that the prompts are sufficiently generic to be adaptable to any system. James Bottomley suggested that LLMs would not be proprietary forever, but Pinchart argued against relying on proprietary software in the hope that there will eventually be free alternatives.

There was some disagreement over who an LLM-based review tool should be created for. Mason's target was maintainers, but Andrew Lunn argued that the plan should be for developers to run these tools themselves before posting code for review. That, he said, would further reduce the workload on maintainers, who would only need to run LLM review to verify that the submitter had already done so.

Pinchart, along with others, pointed out that getting developers to use the tools (such as checkpatch.pl) that exist now is difficult; he wondered how submitters could be encouraged to run any new tools. Tim Bird suggested annotating patches with a list of the tools that have been run on them so that maintainers could see that history. Bottomley, instead, said that these tools should be run automatically on patches sent to the mailing lists, much like the checks that the 0day robot runs on posted patches now. Bird, though, said that running the tools should be expected of submitters. "It then becomes a cost for the contributor instead of the upstream community, which is going to scale better."

Mason was clear in his belief that LLM-generated reviews should happen in public as part of the submission process:

I think it's also important to remember that AI is sometimes wildly wrong. Having the reviews show up on a list where more established developers can call bullshit definitely helps protect against wasting people's time.

Linus Torvalds, in his one contribution to the discussion, agreed. He was about the only one to express concerns about the technology, saying "I think we've all seen the garbage end of AI, and how it can generate more work rather than less". Mason agreed that Torvalds's concerns were relevant, based on his own experience:

My first prompts told AI to assume the patches had bugs, and it would consistently just invent bugs. That's not the end of the world, but the explanations are always convincing enough that you'd waste a bunch of time tracking it down.

Torvalds mentioned the scraper problem as well. His concerns notwithstanding, he believes that this technology will prove helpful, but he feels that its initial adoption has to be aimed at making life easier for maintainers. "So I think that only once any AI tools are actively helping maintainers in a day-to-day workflow should people even *look* at having non-maintainers use them".

The conversation wound down shortly after that. One clear conclusion, though, is that these tools seem destined to play an increasing role in the kernel-development process. At some point, we will likely start seeing machine-generated reviews showing up on the mailing lists; then, perhaps, the real value of LLM-based patch-review tools will start to become clear. It will be interesting to see how the inevitable related discussion at the 2025 Maintainer Summit in December plays out.

Comments (81 posted)

DebugFS on Rust

By Daroc Alden
October 22, 2025

Kangrejos

DebugFS is the kernel's anything-goes, no-rules interface: whenever a kernel developer needs quick access to internal details of the kernel to debug a problem, or to implement an experimental control interface, they can expose them via DebugFS. This is possible because DebugFS is not subject to the normal rules for user-space-interface stability, nor to the rules about exposing sensitive kernel information. Supporting DebugFS in Rust drivers is an important step toward being able to debug real drivers on real hardware. Matthew Maurer spoke at Kangrejos 2025 about his recently merged DebugFS bindings for Rust.

Maurer began with an overview of DebugFS, including the things that make implementing a Rust API tricky. DebugFS files can outlive the private data that they allow access to, since someone may hold a file descriptor open after the underlying object has gone away. Also, DebugFS directory entries can be removed at any time, or will be automatically removed when the parent directory entry is destroyed. "That will come back to haunt us." Finally, DebugFS directories have to be manually torn down; they aren't scoped to an individual kernel module.

[Matthew Maurer]

All of this comes together to make a set of lifetime constraints that's difficult to faithfully model in Rust. At first, Maurer thought to implement a DebugFS file as a weak reference-counted pointer to a Rust trait object. That doesn't work for several reasons, including the fact that DebugFS files don't have a destruction callback. Also, DebugFS gives files one word of private data — normally used as a pointer to the object they are concerned with — but Rust pointers to trait objects are two words wide (one pointer to the object, and one to its virtual method table).
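The two-word layout of trait-object references is easy to demonstrate outside the kernel; this standalone snippet (not from the bindings themselves) compares a plain reference with a trait-object reference:

    use std::mem::size_of;

    trait Stats {
        fn summarize(&self) -> String;
    }

    struct Counter(u64);

    impl Stats for Counter {
        fn summarize(&self) -> String {
            format!("count: {}", self.0)
        }
    }

    fn main() {
        // A plain reference is one word: the address of the object.
        assert_eq!(size_of::<&Counter>(), size_of::<usize>());
        // A trait-object reference is two words: the address of the
        // object plus the address of its virtual method table.
        assert_eq!(size_of::<&dyn Stats>(), 2 * size_of::<usize>());
    }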

These problems aren't insurmountable — Maurer could have just added an additional pointer indirection — but that wouldn't be elegant. He wanted to find a solution that naturally fits with the lifecycle of a DebugFS directory entry, while only having one word of private data and minimal overhead. The design that Maurer ended up proposing was to have the directory entries reference-counted such that they are not destroyed until all of their child objects have been dropped, and the directory itself has been dropped. To accomplish this, two different interfaces would be exposed to Rust: a simple one for DebugFS directories with simple lifetimes, as well as a more complex, general one.
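That ownership design can be modeled in ordinary Rust with reference counting; in this sketch (with invented names, not the actual bindings), each file holds a reference to its parent directory, so the directory is torn down only after the directory handle and every child are gone:

    use std::sync::Arc;

    struct Dir {
        name: String,
    }

    impl Drop for Dir {
        fn drop(&mut self) {
            // Stands in for removing the DebugFS directory entry.
            println!("removing directory {}", self.name);
        }
    }

    struct File {
        _parent: Arc<Dir>, // keeps the parent directory alive
        name: String,
    }

    fn main() {
        let dir = Arc::new(Dir { name: String::from("driver0") });
        let file = File {
            _parent: Arc::clone(&dir),
            name: String::from("stats"),
        };
        drop(dir); // the directory handle is gone...
        println!("file {} is still usable", file.name);
        drop(file); // ...but teardown only happens now
    }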

The simpler API, which Maurer called the "File API", has the DebugFS file actually own its associated data. Exposing some existing Rust data is as simple as wrapping it in a debugfs::File<T>; by default, the read and write operations for the file will convert the value to or from a string and read or update it as appropriate. The programmer can attach their own callbacks, instead, to implement custom behaviors. The downside is that there is no way to have multiple files reference the same data (without some internal reference-counted pointer), and it's not possible to conditionally provide a file based on whether some run-time value is true or false.
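As a rough illustration of the shape of that API, here is a user-space model of a file that owns its data and converts it to and from text on reads and writes; the names are invented for this sketch and do not match the merged bindings:

    use std::fmt::Display;
    use std::str::FromStr;

    // Model of the "File API": the file owns its data outright.
    struct DebugFile<T> {
        data: T,
    }

    impl<T: Display + FromStr> DebugFile<T> {
        fn new(data: T) -> Self {
            DebugFile { data }
        }

        // Default read operation: render the owned value as text.
        fn read(&self) -> String {
            self.data.to_string()
        }

        // Default write operation: parse text back into the value.
        fn write(&mut self, input: &str) -> Result<(), T::Err> {
            self.data = input.trim().parse()?;
            Ok(())
        }
    }

    fn main() {
        let mut threshold = DebugFile::new(42u32);
        assert_eq!(threshold.read(), "42");
        threshold.write("100\n").unwrap();
        assert_eq!(threshold.read(), "100");
    }

Because the file owns its value outright, two files cannot present the same piece of data; that is the limitation noted above.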

The more complex API, the "Scope API", allows multiple files to refer to the same data, to refer to multiple separate structures in any combination, to create files conditionally, etc. In turn, it can't delete individual subdirectories or files — the whole DebugFS directory needs to be released at once.
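A similarly hypothetical model shows the trade-off: several files can present views of one piece of data, but they can only be torn down together, when the scope itself is dropped:

    // Model of the "Scope API": the scope owns the data, and each file
    // is just a named view of it. Names are invented for illustration.
    struct Scope<T> {
        data: T,
        files: Vec<(String, fn(&T) -> String)>,
    }

    impl<T> Scope<T> {
        fn new(data: T) -> Self {
            Scope { data, files: Vec::new() }
        }

        // Any number of files can refer to the same underlying data.
        fn create_file(&mut self, name: &str, render: fn(&T) -> String) {
            self.files.push((String::from(name), render));
        }

        fn read(&self, name: &str) -> Option<String> {
            self.files
                .iter()
                .find(|(n, _)| n == name)
                .map(|(_, render)| render(&self.data))
        }
    }

    struct NicStats {
        rx: u64,
        tx: u64,
    }

    fn main() {
        let mut scope = Scope::new(NicStats { rx: 10, tx: 3 });
        scope.create_file("rx", |s| s.rx.to_string());
        scope.create_file("tx", |s| s.tx.to_string());
        assert_eq!(scope.read("rx").as_deref(), Some("10"));
        assert_eq!(scope.read("tx").as_deref(), Some("3"));
        // Dropping the scope removes every file at once; individual
        // files cannot be deleted separately, as described above.
    }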

Maurer went through examples of how to use each API; while a bit complex, the use of the file API could be substantially simplified if Rust gains built-in in-place initialization. Neither API was terribly surprising — but the obscure contortions (read: cool hacks) required to make them work efficiently were considerably more interesting.

Pointer smuggling

As previously mentioned, DebugFS provides only a single word of private data for file structures, which is ordinarily a pointer to the underlying data for the DebugFS file, a property that Maurer wanted to preserve. But part of the utility of DebugFS is that the developer can override the file operations with arbitrary functions; that makes it easy to trigger actions in a driver in response to reads or writes to a DebugFS file. It would be possible to do this by making the user of DebugFS fill out a struct file_operations, but Maurer wanted a less verbose API. The ergonomic way to encode this in the Rust APIs is to allow the programmer to attach a function or closure to the debugfs::File object. Somehow, those function pointers need to make their way into the file_operations structure used by DebugFS. But Maurer also didn't want the API to need to allocate space for the structure at run time — he wanted the appropriate structure to be generated statically, at compile time, making the entire Rust DebugFS interface allocation-free.

Maurer's solution relies on the fact that, in Rust, every function and closure has its own unique type at compile time. This is done because it makes it easier for the compiler to apply certain optimizations — a call through a Rust function value can often be lowered to a direct jump or a jump through a dispatch table, instead of a call through an actual pointer. This makes Rust function types unique zero-sized types: there is no actual data associated with them, because the type is enough for the compiler to determine the address of the function.
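The effect is easy to observe in ordinary Rust; in this standalone snippet, the function item occupies zero bytes, while the coerced function pointer is a full word:

    use std::mem::{size_of, size_of_val};

    fn answer() -> u32 {
        42
    }

    fn main() {
        // `f` has the unique, zero-sized type of the `answer` function
        // item; the function's identity lives entirely in the type.
        let f = answer;
        assert_eq!(size_of_val(&f), 0);

        // Coercing to a function pointer produces real run-time data.
        let p: fn() -> u32 = answer;
        assert_eq!(size_of_val(&p), size_of::<usize>());

        assert_eq!(f(), p());
    }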

The *_callback_file() functions in his new API, which take callbacks to implement the read and write operations on a file, don't actually store the provided function pointers anywhere. Instead, the type of the callback is passed as a generic argument to the code that fills out instances of the file_operations structure. When the Rust code is monomorphized during compilation, a different file_operations structure is generated for each file that uses a different set of callbacks. The generic code turns the type of the function back into a pointer to the actual function itself, and calls it. Since the conversion is done at compile time, the pointer to the callback never actually has to be stored anywhere outside the file_operations structure at run time. This trick effectively "smuggles" the function pointer through the type system, which lets Maurer pass off the work of constructing all of the needed file_operations structures to the compiler's monomorphization implementation and avoid allocating.
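Stripped of the kernel specifics, the mechanism can be sketched in user-space Rust as follows; the names here (FileOps, read_shim(), make_ops()) are invented for illustration, and the real bindings are considerably more careful, but the core trick is the same: the callback rides along as a type parameter, and a shim monomorphized for that type recreates and calls it:

    use std::mem::{size_of, MaybeUninit};

    // Stand-in for the C file_operations structure: it stores only an
    // ordinary function pointer, with no room for extra callback data.
    struct FileOps {
        read: fn() -> String,
    }

    // Generic shim: `F` is a zero-sized function type, so a value of
    // it can be conjured from nothing and called. This is only sound
    // for inhabited zero-sized types without invariants, which is why
    // the real code keeps it as an unsafe internal detail.
    fn read_shim<F: Fn() -> String>() -> String {
        assert_eq!(size_of::<F>(), 0, "callback must capture nothing");
        let f: F = unsafe { MaybeUninit::<F>::uninit().assume_init() };
        f()
    }

    // Monomorphization generates a distinct read_shim::<F>(), and thus
    // a distinct FileOps, for every callback type passed in; the
    // callback value itself is never stored anywhere at run time.
    fn make_ops<F: Fn() -> String>(_callback: F) -> FileOps {
        FileOps { read: read_shim::<F> }
    }

    fn main() {
        let ops = make_ops(|| String::from("debugfs says hi"));
        println!("{}", (ops.read)());
    }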

The reaction to this explanation was mixed. While everyone present agreed that it was clever, and permitted writing a nice API, there was some sentiment that it might be too clever. Gary Guo pointed out one potential problem with the (unsafe) code that Maurer wrote to turn a function type back into an actual function pointer: while it was correct for function types, attempting to use it with other zero-sized types could cause undefined behavior, because it didn't ensure that internal invariants of the type are checked.

There are some zero-sized types where the actual address of the value is important, Guo explained. For example, a programmer could create a zero-sized type representing that the data at a particular address is readable. Alice Ryhl suggested restricting the function to only operate on types that implement the Copy trait, since they can't have invariants that rely on having a stable address. Maurer replied that he wasn't worried in this case, because the function was intended as an internal implementation detail of the DebugFS interface, but agreed that in the general case requiring the type to implement the Copy trait would make sense. One of the assembled developers asked Pierre-Emmanuel Patry whether he anticipated that supporting code like this would be a problem for gccrs; he did not think that it would impose any additional burden, since some parts of the standard library already rely on the behavior of function types.

Andreas Hindborg asked for more details on why smuggling a pointer through the type system like this was permitted — specifically, why Maurer had claimed that the type needed to be "inhabited" for the trick to work. Zero-sized types can either have one valid value (the typical case), or no valid values, Maurer explained. So, if someone tried to use his trick to create a pointer to a type that exists, but where constructing a value of the type is impossible, they could break Rust's type system — which is why the helper function is unsafe.

Hindborg asked whether the pointer-smuggling trick was documented anywhere. Maurer replied: "It's well documented in the code", to general laughter. Guo asked whether they could just change the DebugFS C structure to have two pointers, and avoid this whole workaround. Maurer passed the question off to Greg Kroah-Hartman, who answered that he didn't think they could, because it would impact the layout of the inode structure, which is widely used outside DebugFS. In his opinion, this was a case of "you optimized for fun" — the equivalent C code just allocates and eats the cost of an additional pointer indirection. But he didn't think there was anything wrong with odd techniques being used here; in many ways, it's what DebugFS is there for.

Ultimately, the pointer-smuggling solution did remain in the final patch set that was merged for the 6.18 kernel. The trick is unlikely to be adapted for use in wider contexts in the kernel's Rust bindings, though.

Comments (5 posted)

Page editor: Jonathan Corbet


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds