
Fedora SIG changes Python packaging strategy

By Joe Brockmeier
July 16, 2025

Fedora's NeuroFedora special-interest group (SIG) is considering a change of strategy when it comes to packaging Python modules. The SIG, which consists of three active members, is struggling to keep up with maintaining the hundreds of packages that it has taken on. What's more, it's not clear that the majority of packages are even being consumed by Fedora users; the group is trying to determine the right strategy to meet its goals and shed unnecessary work. If its new packaging strategy is successful, it may point the way to a more sustainable model for Linux distributions to provide value to users without trying to package everything under the sun.

NeuroFedora SIG

The goal of the NeuroFedora SIG is to make it easy to use Fedora as a platform for neuroscience research. The SIG provides more than 200 applications for computational neuroscience, data analysis, and neuro-imaging, as well as a Fedora Comp-Neuro Lab spin with many applications for neuroscience preinstalled.

On June 6, Ankur Sinha started a discussion on the NeuroFedora issue tracker about changing the SIG's packaging strategy. He said that the group was no longer sure that providing native packages was the right way to serve the target audience:

A lot of our packages are python libraries but I'm seeing more and more users just rely on pip (or anaconda etc.) to install their packages instead of using dnf installed packages. I was therefore, wondering if packaging up Python libraries off PyPI was worth really doing.

Sinha said he had discussed the topic at Flock, which was held in early June, with Karolina Surma, a member of Fedora's Python SIG. Surma said that the Python SIG focuses on core Python packages, such as Python itself, pip, and setuptools; Red Hat's Python developers had attempted to automatically convert all of the Python Package Index (PyPI) to RPMs in a Copr repository, but Surma said the resulting packages were not of sufficient quality to recommend. For Python libraries, it was up to packagers to determine if the use case required system packages.

Sinha tried to capture some of the reasons for and against providing native Python packages via the SIG. Since most of the upstream documentation refers to pip for installing modules from PyPI, Sinha thought it was unlikely that most users would turn to DNF for Python libraries. Pip also allows users to install different versions of libraries in virtual environments, which cannot be done with DNF. On the other hand, Sinha said that distribution packaging is important because it allows packagers to push out updates in a timely manner. "When installing from forges directly, there doesn't seem to be a way of users being notified if packages that they're using have new updates."

Benjamin Beasley said that the main reason to package libraries should be to support packaged Python-based tools and applications. He added that he saw value in having packages for command-line tools so that users could use them without having to care about what language the tools were written in or knowing how to use language-specific tools or repositories. Distribution packaging could be reserved for "software that is difficult to install, with things like system-wide configuration files, daemons, and awkward dependencies". If a Python library is not needed by other packages, on the other hand, there is not much justification for packaging it.

Testing and documentation versus packaging

Sinha asked if it even made sense to package applications that could be installed from PyPI and suggested that for some pip-installable packages it might be better for the SIG to provide documentation on installation rather than maintaining a package. For example, a pip package might require Fedora's GTK3 development package (gtk3-devel) to be installed in order to build and install correctly. He wondered if the SIG should package that application or just maintain documentation on how to install it with pip on Fedora.

After a few days, Sinha put out an update with his plan; he would create a list of Python packages of interest to the SIG, write a script to see if they could be successfully installed using pip, and orphan those that were not required to be native packages. He also asked again for input and specifically tagged SIG member Sandro Janke.

Two days later Janke appeared and agreed that it made sense to reduce the number of packages maintained by the SIG, as the current workload was challenging. He said that it would be a good idea to standardize on one tool (e.g. pip or uv) to recommend to users. "Supporting (and understanding) one tool well is better than supporting many tools half-heartedly". He was less convinced that the SIG should take responsibility for testing and documentation of Python modules that were not maintained by the SIG. "When it comes to local installation my first instinct is that the onus is with the user."

After gathering feedback from the SIG, Sinha floated a general guideline of "prioritizing software that could not be installed from an upstream index". He said he would post to Fedora's python-devel mailing list to gather feedback, but it was ultimately up to the NeuroFedora SIG since it would have to do the work.

On June 25, Sinha started that discussion on Fedora's python-devel list. To lighten its packaging load and better address user needs, Sinha said that the SIG had discussed a new strategy. The group would prioritize packaging software that was either difficult to install or completely unavailable on PyPI. If a package was installable with pip, then it would be dropped from Fedora.

Instead, the group would continue to test that software it cared about could be installed using pip; the SIG would provide documentation about software status and additional information required to get it working. NeuroFedora contributors would also report any problems to the respective upstreams and submit fixes when possible.

Michael Gruber said that the proposal was coming at just the right time. It would help to reduce the set of RPM-packaged applications, which would be beneficial when Fedora has to do a mass rebuild of packages due to a Python change. Gruber also liked the idea of having a method to test whether PyPI packages were installable on Fedora; that might be beneficial to other language ecosystems as well.

Making a list

No one seemed to object to the plan, and Sinha wrote a script to find the leaf packages—those that are not required as dependencies for other packages—among the Python packages maintained by the SIG. He posted a list of more than 110 packages that the group could consider dropping. Beasley went through the list and provided a summary of packages that should be kept for various reasons, including some that were not Python packages but that offered Python bindings. That brought the list down to nearly 70 packages, including a handful that had already been orphaned or retired for the upcoming Fedora 43 release.
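The article does not include Sinha's script, but the leaf-finding approach it describes can be sketched with `dnf repoquery --whatrequires`. The following is a minimal illustration of the idea, not the SIG's actual code; the package names are hypothetical examples.

```python
"""Sketch: find "leaf" packages, i.e. those nothing else requires.

Not the SIG's actual script; package names are hypothetical.
"""
import subprocess


def reverse_deps(package: str) -> list[str]:
    """Ask dnf which other packages require the given one."""
    out = subprocess.run(
        ["dnf", "repoquery", "--whatrequires", package, "--qf", "%{name}\n"],
        capture_output=True, text=True, check=True,
    ).stdout
    # Drop blanks and self-references from the query output.
    return [line for line in out.splitlines() if line and line != package]


def leaf_packages(rdeps: dict[str, list[str]]) -> list[str]:
    """Given a package -> reverse-dependency mapping, keep the leaves."""
    return sorted(pkg for pkg, deps in rdeps.items() if not deps)


# Example with a precomputed mapping (hypothetical package names):
rdeps = {
    "python3-foo": [],           # nothing requires it: a leaf, candidate to drop
    "python3-bar": ["somesim"],  # required by a packaged application: keep it
}
print(leaf_packages(rdeps))  # ['python3-foo']
```

On a real system, `rdeps` would be built by calling `reverse_deps()` for each SIG-maintained package; the pure filtering step is separated out so it can be reasoned about without touching dnf.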

One outstanding question is which versions of Python to test modules against. Each Fedora release has a default version of Python—version 3.13 in Fedora 42—but older and newer versions of Python are available from its repositories as well. For example, users can install Python 3.6, 3.9, 3.10, 3.11, 3.12, or 3.14 alongside the default version if they need an older or newer version for whatever reason. Janke noted that upstream maintainers are often slow to adapt to changes in the latest Python releases. He asked if the SIG should support the "bleeding edge" or the oldest default version in a supported Fedora release. As of this writing, that would be Fedora 41, which also uses Python 3.13.

On July 5, Sinha said that he had implemented a pytest-based checker to verify whether pip-installed packages install cleanly on a Fedora system. He also has an implementation of a searchable package list (sample screenshot here) that will display software that has been tested by the NeuroFedora SIG. The SIG has not deployed the changes yet, or finalized the list of packages to be dropped, but it seems likely this work will land in time for the Fedora 43 release in September.

At the moment, this strategy is limited to the NeuroFedora SIG, and will only impact a small percentage of Fedora packages. However, this hybrid approach, which packages the software that most benefits from being included in the distribution while testing and documenting packages that can be installed directly from the language repository, might be worth examining on a larger scale and beyond the Python ecosystem.

The number of people interested in maintaining native packages for Fedora (and other distributions) has not kept up with the growth of software available in various language ecosystems. NeuroFedora's strategy might be a good way for Linux distributions to support developers and users while lessening the load on volunteer contributors.




A sensible idea

Posted Jul 16, 2025 15:41 UTC (Wed) by nickodell (subscriber, #125165) [Link]

This change makes a lot of sense, and I expect that it will be pretty uncontroversial. Also, it's neat that Fedora supplies multiple versions of Python to use - on other OSes, I frequently need to install deadsnakes or similar to test code with different versions of Python.

Dependencies

Posted Jul 16, 2025 17:40 UTC (Wed) by ballombe (subscriber, #9523) [Link]

I do not know Fedora, but in Debian the majority of python packages are added because they are dependencies of other packages or are required to build the documentation of other packages.

Could/Should dnf be modified to install python packages?

Posted Jul 16, 2025 17:53 UTC (Wed) by prarit (subscriber, #27126) [Link]

I know that glosses over the pip vs dnf distinction but it would give users 'one stop shopping' for package installs.

Better use of language

Posted Jul 16, 2025 18:14 UTC (Wed) by sionescu (subscriber, #59410) [Link] (1 responses)

I should like to make a plea to all journalists, to stop using "consume" and go back to the good old-fashioned "use".

Better use of language

Posted Jul 16, 2025 19:18 UTC (Wed) by antiphase (subscriber, #111993) [Link]

See also "leverage" and "utilize".

What about user apps that might depend on these in the future

Posted Jul 16, 2025 18:29 UTC (Wed) by jsanders (subscriber, #69784) [Link] (9 responses)

As an author of a Python-based GUI application used by users (Veusz), and a packager (for Fedora), I worry that dropping packages like these may be dropping future dependencies for new features added to my application.

Wouldn't it make more sense to develop a lightweight way of wrapping a pip package inside an rpm one, so that the maintenance burden is significantly reduced?

What about user apps that might depend on these in the future

Posted Jul 16, 2025 19:19 UTC (Wed) by joib (subscriber, #8541) [Link]

Yes, some kind of meta package manager, that could use a number of different backends (dnf/rpm, pypi, crates/cargo, cran, cpan etc etc.).

The problem is of course the impedance mismatch, and various policy issues. And to be honest, many package managers just aren't that good. I've tried to manage a number of long lived python environments with pip or conda, upgrading packages etc just like with a distro package manager. But it just doesn't work, sooner or later the environment gets itself into a pretzel and you have to start over.

What about user apps that might depend on these in the future

Posted Jul 16, 2025 19:29 UTC (Wed) by mattdm (subscriber, #18) [Link] (1 responses)

We should make it so dnf recognizes curated system-wide Python packages as filling the dependency, and can install them if needed.

What about user apps that might depend on these in the future

Posted Jul 17, 2025 19:04 UTC (Thu) by raven667 (subscriber, #5198) [Link]

Having spent a lot of time with Perl and RPM, I think that building bridges between the language-specific packaging communities and the distro packaging communities makes a ton of sense. While there are obviously technical incompatibilities to work out, the disconnect seems primarily social, not technical: each community wants to be able to solve its immediate problems directly without needing expensive coordination with other, tangentially related communities that need to bikeshed about naming conventions or other issues that slow down the development process.

The problem is that their needs do overlap and affect one another, and they end up re-solving the same problems with different technology based on their differing experience, like how the conda team basically invented its own independent distro runtime and build system to handle Python binary dependencies. Maybe a team with better lines of communication with Fedora or Debian would have found a different solution that used the distro tools for managing the underlying runtime, taking advantage of the mature tooling and documentation work that has already been done.

Or why Python has wheels, while for Perl I can easily manage all my dependencies using cpanspec and RPM, with basically full fidelity of the dependency graph generated automatically and minimal handholding on my part. I wish every language with built-in repository dependency management worked as well, so that you wouldn't need an entirely bespoke infrastructure but could output, say, an RPM that is relocatable into the venv if you don't want it system-wide. That might work better now; there seem to be much more comprehensive macros than the last time I tried, although I didn't find anything as easy as cpanspec for walking a whole dependency tree and making packages for all of it. I think some information isn't well documented even for pip, since one often needs to supply a constraints file to prevent it from pulling in incompatible dependencies that aren't flagged in the PyPI metadata.

I just think there are a lot of ways the PyPI, CPAN, node, Maven, cargo, Go, etc. communities could be brought together to bridge their different approaches to software change management, with supporting features added to distro tools, naming conventions, and so on, collecting the best tools and ideas from each to enhance the others.

The system works better when the tool can manage the entire dependency graph. With containers, the graph kind of ends at the kernel and the container orchestrator, but with system-wide packages, or even venvs that depend on local C libraries, the graph crosses language and system tools, so each tool will have an incomplete view of the state and eventually make a mess, or at least be frustrating to work with ("I can see the library right there!!").

What about user apps that might depend on these in the future

Posted Jul 17, 2025 13:44 UTC (Thu) by jafd (subscriber, #129642) [Link] (1 responses)

Do you have reasons to believe the python modules you need would be dropped without checking if they have any reverse dependencies first?

What about user apps that might depend on these in the future

Posted Jul 17, 2025 16:50 UTC (Thu) by Wol (subscriber, #4433) [Link]

What use is that? As I understand it, the question is "What happens if I *introduce* a reverse dependency?". Unless the SIG happens to possess a time machine, the reverse dependency check will find nothing because there is nothing to find.

Cheers,
Wol

What about user apps that might depend on these in the future

Posted Jul 17, 2025 13:52 UTC (Thu) by jafd (subscriber, #129642) [Link] (3 responses)

> Wouldn't it make more sense to develop a lightweight way of wrapping a pip package inside an rpm one, so that the maintenance burden is significantly reduced?

There exists venvrpm to produce an RPM of a virtualenv. However, to make this work really well, one would need to somehow gather all dependencies, including transitive ones, and build them into the virtualenv in the correct order. The SRPM needs to contain all sources and be buildable without the internet, and I think this used to be the major headache of coming up with a sensible packaging policy for Go and Rust things.

(This also means bumping your release each time you need to update a transitive dependency, of course, same as when a Go module gets an update.)

What about user apps that might depend on these in the future

Posted Jul 17, 2025 15:03 UTC (Thu) by huntermatthews (subscriber, #4490) [Link] (2 responses)

Google is failing me - could I get a link to venvrpm please? I have a number of deployment issues and am looking at various solutions.

What about user apps that might depend on these in the future

Posted Jul 17, 2025 15:19 UTC (Thu) by jzb (editor, #7867) [Link] (1 responses)

I think rpmvenv is actually what you want: rpmvenv on PyPI.

What about user apps that might depend on these in the future

Posted Jul 17, 2025 15:45 UTC (Thu) by jafd (subscriber, #129642) [Link]

Yes, this one. I feel dysgraphic today — I looked at its name in the other tab and confidently retyped it wrongly.

The Incredible Shrinking Fedora

Posted Jul 16, 2025 22:40 UTC (Wed) by comex (subscriber, #71521) [Link] (1 responses)

Sounds very similar to the proposal to make Flathub the preferred source of GUI applications (taking priority over RPMs and Fedora's self-built Flatpaks):

https://pagure.io/fedora-workstation/issue/463

Is Fedora becoming a distribution that wants to distribute as little as possible?

Perhaps it's more accurate to say Fedora is just accepting the revealed preferences of its users.

The Incredible Shrinking Fedora

Posted Jul 17, 2025 18:10 UTC (Thu) by raven667 (subscriber, #5198) [Link]

I think it's funny because we are going back to something more like Fedora Core / Extras where they started out, because packaging the whole world in a way that satisfies everyone is infeasible.

I'm using Silverblue and Bazzite and for desktops prefer this change-management style, where the whole OS is considered one single versioned firmware image and local config, local data, and local apps are distinct, separately managed things. To me that keeps the conceptual complexity down, because you can treat huge swaths of software as a single abstract unit, even when those units are built up using traditional atomized packaging techniques.

Debian has a huge number of volunteers who work on packaging, and it's still very difficult to manage everything that way, at the smallest atomic unit of software.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds