Fedora SIG changes Python packaging strategy
Fedora's NeuroFedora special-interest group (SIG) is considering a change of strategy when it comes to packaging Python modules. The SIG, which consists of three active members, is struggling to keep up with maintaining the hundreds of packages that it has taken on. What's more, it's not clear that the majority of packages are even being consumed by Fedora users; the group is trying to determine the right strategy to meet its goals and shed unnecessary work. If its new packaging strategy is successful, it may point the way to a more sustainable model for Linux distributions to provide value to users without trying to package everything under the sun.
NeuroFedora SIG
The goal of the NeuroFedora SIG is to make it easy to use Fedora as a platform for neuroscience research. The SIG provides more than 200 applications for computational neuroscience, data analysis, and neuro-imaging, as well as a Fedora Comp-Neuro Lab spin with many applications for neuroscience preinstalled.
On June 6, Ankur Sinha started a discussion on the NeuroFedora issue tracker about changing the SIG's packaging strategy. He said that the group was no longer sure that providing native packages was the right way to serve the target audience:
A lot of our packages are python libraries but I'm seeing more and more users just rely on pip (or anaconda etc.) to install their packages instead of using dnf installed packages. I was therefore, wondering if packaging up Python libraries off PyPI was worth really doing.
Sinha said he had discussed the topic at Flock, which was held in early June, with Karolina Surma, a member of Fedora's Python SIG. Surma said that the Python SIG focuses on core Python packages, such as Python itself, pip, and setuptools; Red Hat's Python developers had attempted to automatically convert all of the Python Package Index (PyPI) to RPMs in a Copr repository, but Surma said the resulting packages were not of sufficient quality to recommend. For Python libraries, it was up to packagers to determine whether the use case required system packages.
Sinha tried to capture some of the reasons for and against providing native Python packages via the SIG. Since most of the upstream documentation refers to pip for installing modules from PyPI, Sinha thought it was unlikely that most users would turn to DNF for Python libraries. Pip also allows users to install different versions of libraries in virtual environments, which cannot be done with DNF. On the other hand, Sinha said that distribution packaging is important because it allows packagers to push out updates in a timely manner. "When installing from forges directly, there doesn't seem to be a way of users being notified if packages that they're using have new updates."
Benjamin Beasley said that the main reason to package libraries should be to support packaged Python-based tools and applications. He added that he saw value in having packages for command-line tools so that users could use them without having to care about what language the tools were written in or knowing how to use language-specific tools or repositories. Distribution packaging could be reserved for "software that is difficult to install, with things like system-wide configuration files, daemons, and awkward dependencies". If a Python library is not needed by other packages, on the other hand, there is not much justification for packaging it.
Testing and documentation versus packaging
Sinha asked if it even made sense to package applications that could be installed from PyPI and suggested that for some pip-installable packages it might be better for the SIG to provide documentation on installation rather than maintaining a package. For example, a pip package might require Fedora's GTK3 development package (gtk3-devel) to be installed in order to build and install correctly. He wondered if the SIG should package that application or just maintain documentation on how to install it with pip on Fedora.
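In that scenario, the SIG's documentation would boil down to a short recipe along these lines (the gtk3-devel dependency comes from Sinha's example; the package name here is hypothetical):

```shell
# Install the system build dependencies first; a pip build of a
# GTK3-based module needs the distribution's development headers.
sudo dnf install gtk3-devel python3-devel gcc

# Then install the (hypothetical) application itself with pip,
# in a virtual environment to keep it out of the system site-packages.
python3 -m venv ~/.venvs/neuro
~/.venvs/neuro/bin/pip install example-neuro-tool
```

Maintaining a few lines of documentation like this is considerably cheaper than carrying a spec file through every Fedora mass rebuild.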
After a few days, Sinha put out an update with his plan; he would create a list of Python packages of interest to the SIG, write a script to see if they could be successfully installed using pip, and orphan those that were not required to be native packages. He also asked again for input and specifically tagged SIG member Sandro Janke.
Two days later, Janke appeared and agreed that it made sense to reduce the number of packages maintained by the SIG, as the current workload was challenging. He said that it would be a good idea to standardize on one tool (e.g. pip or uv) to recommend to users: "Supporting (and understanding) one tool well is better than supporting many tools half-heartedly". He was less convinced that the SIG should take responsibility for testing and documentation of Python modules that were not maintained by the SIG. "When it comes to local installation my first instinct is that the onus is with the user."
After gathering feedback from the SIG, Sinha floated a general guideline of "prioritizing software that could not be installed from an upstream index". He said he would post to Fedora's python-devel mailing list to gather feedback, but it was ultimately up to the NeuroFedora SIG since it would have to do the work.
On June 25, Sinha started that discussion on Fedora's python-devel list. To lighten its packaging load and better address user needs, Sinha said that the SIG had discussed a new strategy. The group would prioritize packaging software that was either difficult to install or completely unavailable on PyPI. If a package was installable with pip, then it would be dropped from Fedora.
Instead, the group would continue to test that software it cared about could be installed using pip; the SIG would provide documentation about software status and additional information required to get it working. NeuroFedora contributors would also report any problems to the respective upstreams and submit fixes when possible.
Michael Gruber said that the proposal was coming at just the right time. It would help to reduce the set of RPM-packaged applications, which would be beneficial when Fedora has to do a mass-rebuild of packages due to a Python change. Gruber also liked the idea of having a method to test if PyPI packages were installable on Fedora, and that might be beneficial to other language ecosystems as well.
Making a list
No one seemed to object to the plan, and Sinha wrote a script to find the leaf packages—those that are not required as dependencies by any other package—among the Python packages maintained by the SIG. He posted a list of more than 110 packages that the group could consider dropping. Beasley went through the list and provided a summary of packages that should be kept for various reasons, including some that were not Python packages but that offered Python bindings. That brought the list down to nearly 70 packages, including a handful that had already been orphaned or retired for the upcoming Fedora 43 release.
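The core of a leaf-package search like Sinha's can be sketched in a few lines of Python. The real script presumably queries Fedora's repositories for reverse dependencies (e.g. with `dnf repoquery --whatrequires`); the package names and dependency data below are purely hypothetical:

```python
# Sketch of a leaf-package finder: a package is a "leaf" if no other
# package in the set under consideration depends on it. The dependency
# mapping here is hypothetical; real data would come from repository
# queries such as `dnf repoquery --whatrequires <pkg>`.
def find_leaves(requires: dict) -> set:
    """requires maps each package name to the set of packages it depends on."""
    needed = set()
    for deps in requires.values():
        needed |= deps
    return set(requires) - needed

# Hypothetical slice of the SIG's package set.
sig_packages = {
    "python-libneuroml": set(),
    "python-pyneuroml": {"python-libneuroml"},
    "python-example-tool": set(),
}

# python-libneuroml is required by python-pyneuroml, so only the other
# two packages are leaves and thus candidates for dropping.
print(sorted(find_leaves(sig_packages)))
# ['python-example-tool', 'python-pyneuroml']
```

Leaves found this way are only *candidates*: as Beasley's review showed, some still need to be kept for reasons a dependency graph cannot capture.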
One outstanding question is which versions of Python to test modules against. Each Fedora release has a default version of Python—version 3.13 in Fedora 42—but older and newer versions of Python are available from its repositories as well. For example, users can install Python 3.6, 3.9, 3.10, 3.11, 3.12, or 3.14 alongside the default version if they need an older or newer version for whatever reason. Janke noted that upstream maintainers are often slow to adapt to changes in the latest Python releases. He asked if the SIG should support the "bleeding edge" or the oldest default version in a supported Fedora release. As of this writing, that would be Fedora 41, which also uses Python 3.13.
On July 5, Sinha said that he had implemented a pytest-based checker to verify whether pip-installed packages install cleanly on a Fedora system. He also has an implementation of a searchable package list (sample screenshot here) that will display software that has been tested by the NeuroFedora SIG. The SIG has not deployed the changes yet, or finalized the list of packages to be dropped, but it seems likely this work will land in time for the Fedora 43 release in September.
At the moment, this strategy is limited to the NeuroFedora SIG and will only affect a small percentage of Fedora packages. However, this hybrid approach, packaging the software that most benefits from being included in the distribution while testing and documenting packages that can be installed directly from the language repository, is something that might be worth examining on a larger scale and beyond the Python ecosystem.
The number of people interested in maintaining native packages for Fedora (and other distributions) has not kept up with the growth of software available in various language ecosystems. NeuroFedora's strategy might be a good way for Linux distributions to support developers and users while lessening the load on volunteer contributors.
Posted Jul 16, 2025 18:29 UTC (Wed)
by jsanders (subscriber, #69784)
Wouldn't it make more sense to develop a lightweight way of wrapping a pip package inside an rpm one, so that the maintenance burden is significantly reduced?
Posted Jul 16, 2025 19:19 UTC (Wed)
by joib (subscriber, #8541)
The problem is, of course, the impedance mismatch, and various policy issues. And to be honest, many package managers just aren't that good. I've tried to manage a number of long-lived Python environments with pip or conda, upgrading packages and so on just like with a distro package manager. But it just doesn't work; sooner or later the environment gets itself into a pretzel and you have to start over.
Posted Jul 17, 2025 19:04 UTC (Thu)
by raven667 (subscriber, #5198)
The problem is that their needs do overlap and affect one another, and they end up re-solving the same problems with different technology based on their differing experience. For example, the conda team basically invented its own independent distro runtime and build system to handle Python binary dependencies; maybe a team with better lines of communication with Fedora or Debian would have found a different solution that used the distro tools for managing the underlying runtime, taking advantage of the mature tools and documentation that already exist.
Or why Python has wheels, while for Perl I can easily manage all my dependencies using cpanspec and RPM, with basically full fidelity of the dependency graph generated automatically with minimal handholding on my part. I wish every language with built-in repository dependency management worked as well, such that you wouldn't need an entirely bespoke infrastructure but could output, say, an RPM that is relocatable into the venv if you don't want it system-wide. That might work better now; there seem to be much more comprehensive macros than the last time I tried, although I didn't find anything as easy as cpanspec for walking a whole dependency tree and making packages for all of it. I think some info isn't well documented even for pip, since one often needs to supply a constraints file to prevent it from pulling incompatible dependencies that aren't documented in the PyPI metadata already.
I just think there are a lot of ways the pypi, cpan, node, maven, cargo, go, etc. communities could be brought together to bridge their different approaches to software change management, with supporting features added to distro tools, naming conventions, etc., collecting the best tools and ideas from each of them to enhance the others.
The system works better when the tool can manage the entire dependency graph. With containers, the graph more or less ends at the kernel and container orchestrator, but with system-wide packages, or even venvs that depend on local C libraries, the graph crosses language and system tools, so each tool will have an incomplete view of the state and will eventually make a mess, or at least be frustrating to work with ("I can see the library right there!!").
Posted Jul 17, 2025 13:52 UTC (Thu)
by jafd (subscriber, #129642)
There exists venvrpm to produce an RPM of a virtualenv. However, to make this work really well, one would need to somehow gather all dependencies, including transitive ones, and build them into the virtualenv in the correct order. The SRPM needs to contain all sources and be buildable without internet access, and I think this used to be the major headache in coming up with a sensible packaging policy for Go and Rust things.
(This also means bumping your release each time you need to update a transitive dependency, of course, same as when a Go module gets an update.)
Posted Jul 17, 2025 15:19 UTC (Thu)
by jzb (editor, #7867)
I think rpmvenv is actually what you want: rpmvenv on PyPI.
Posted Jul 16, 2025 22:40 UTC (Wed)
by comex (subscriber, #71521)
https://pagure.io/fedora-workstation/issue/463
Is Fedora becoming a distribution that wants to distribute as little as possible?
Perhaps it's more accurate to say Fedora is just accepting the revealed preferences of its users.
Posted Jul 17, 2025 18:10 UTC (Thu)
by raven667 (subscriber, #5198)
I'm using Silverblue and Bazzite, and for desktops I prefer this change-management style, where the whole OS is considered one single versioned firmware image and local config, local data, and local apps are distinct, separately managed things. To me, that keeps the conceptual complexity down, because you can treat huge swaths of software as a single abstract unit, even when those units are built up using traditional atomized packaging techniques.
Debian has a huge number of volunteers who work on packaging, and it's still very difficult to manage everything that way, at the smallest atomic unit of software.