
Recent disruptive changes from Setuptools

May 21, 2025

This article was contributed by Karl Knechtel

In late March, version 78.0.1 of Setuptools — an important Python packaging tool — was released. It was scarcely half an hour before the first bug report came in, and it quickly became clear that the change was far more disruptive than anticipated. Within only about five hours 78.0.2 was published to roll back the change, and multiple discussions were started about how to limit the damage caused by future breaking changes. Nevertheless, many users still felt the response was inadequate. Some previous Setuptools releases have also caused problems on a smaller but still notable scale, and hopefully the developers will be more cautious going forward. But there are also lessons here for the developers of Python package installers, ordinary Python developers and end users, and even Linux distribution maintainers.

Python packaging

Python code is commonly distributed using a Python-native packaging system, rather than as a Linux package or standalone executable. The basic ideas have been described by LWN before.

Packages are available in two standard formats: the "wheel" format, and the "sdist" (or source distribution) format. Both are compressed archives, but a wheel has been pre-built and can be installed without much work besides copying files. An sdist, on the other hand, roughly represents the developer's source tree (although it might omit tests, documentation, etc.), and must be built on the end user's machine. This is generally only useful for packages that contain non-Python code, but it allows for additional setup prior to installation. An sdist also includes instructions (perhaps including Python code that will be automatically run) to compile non-Python code, arrange files, and ultimately "build" a corresponding wheel.
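
Both formats are, at bottom, ordinary archives; a quick way to see the difference is to peek inside each with the standard library (the file names below are hypothetical):

```python
# A wheel is a zip file laid out ready to be copied into site-packages; an
# sdist is a tarball of the source tree plus metadata. The file names here
# are placeholders for whatever package is of interest.
import tarfile
import zipfile

print(zipfile.ZipFile("example-1.0-py3-none-any.whl").namelist())
print(tarfile.open("example-1.0.tar.gz").getnames())
```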

Creating packages — in either format — can be quite complex. Usually, therefore, most of the work is delegated to a build backend, which provides logic to create an sdist or wheel directly from a source repository, as well as to create a wheel from an sdist. To install an sdist, an installer such as pip will check the sdist's metadata to determine the right build backend, ensure that backend is installed, and invoke it via a standard API. The resulting wheel is then installed as if it had been downloaded directly.
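
The standard interface is small. Roughly speaking (this is a sketch of the hook signatures, not a real backend), a build backend is just an importable module that exposes a couple of functions:

```python
# Sketch of the main hooks a build backend exposes under the standard
# (PEP 517) interface; the bodies are placeholders, not an implementation.

def build_sdist(sdist_directory, config_settings=None):
    """Create an sdist in sdist_directory and return its file name."""
    ...

def build_wheel(wheel_directory, config_settings=None,
                metadata_directory=None):
    """Create a wheel in wheel_directory and return its file name."""
    ...

# An installer does roughly the equivalent of:
#     import setuptools.build_meta as backend
#     wheel_name = backend.build_wheel("/tmp/built-wheels")
# and then installs the resulting wheel as if it had been downloaded.
```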

By default, this process uses a kind of build isolation: the backend is installed in an isolated, temporary virtual environment. This ensures that builds don't interfere with each other, allows different packages to use different versions of Setuptools, and means that Setuptools doesn't need to be installed ahead of time in pip's own environment.
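
In simplified terms (this sketch is illustrative, not how pip is actually structured), an isolated build amounts to something like this:

```python
# Simplified sketch of build isolation: create a throwaway virtual
# environment, install the build backend into it, and invoke the backend
# there. The sdist path is a placeholder.
import subprocess
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "build-env"
    venv.EnvBuilder(with_pip=True).create(env_dir)
    python = str(env_dir / "bin" / "python")   # Scripts\python.exe on Windows

    # Install whatever the sdist declares as its build requirements;
    # Setuptools and wheel are the legacy defaults.
    subprocess.run([python, "-m", "pip", "install", "setuptools", "wheel"],
                   check=True)

    # Ask the backend, running inside the isolated environment, to build a
    # wheel from the unpacked sdist.
    subprocess.run(
        [python, "-c",
         "import os; os.makedirs('/tmp/built-wheels', exist_ok=True); "
         "from setuptools import build_meta; "
         "print(build_meta.build_wheel('/tmp/built-wheels'))"],
        cwd="/path/to/unpacked-sdist",   # placeholder for the sdist's source tree
        check=True,
    )
```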

Historically — since back when end users were expected to resolve and install dependencies themselves — Setuptools was the standard tool for a variety of development tasks. Nowadays, much of its functionality has been deprecated, but it still acts as a build backend — and for legacy support reasons, packages are assumed to use it by default.
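
Concretely, an installer looks for a [build-system] table in the project's pyproject.toml and falls back to Setuptools when there isn't one; the sketch below approximates pip's defaults as I understand them:

```python
# Sketch: how an installer picks a build backend. If no pyproject.toml (or
# no [build-system] table) exists, Setuptools is assumed; the fallback
# values shown here approximate pip's behavior.
import tomllib
from pathlib import Path

LEGACY_DEFAULT = {
    "requires": ["setuptools>=40.8.0", "wheel"],
    "build-backend": "setuptools.build_meta:__legacy__",
}

def choose_build_system(project_dir):
    pyproject = Path(project_dir) / "pyproject.toml"
    if not pyproject.exists():
        return LEGACY_DEFAULT
    data = tomllib.loads(pyproject.read_text())
    return data.get("build-system", LEGACY_DEFAULT)

print(choose_build_system("."))
```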

What went wrong

Projects built with Setuptools may describe some of their metadata in a configuration file named setup.cfg (which uses a simple INI-like format, with key-value pairs organized into sections). The unreleased Setuptools 78.0.0 added stricter validation for the contents of such files. Historically, hyphens and underscores were treated as equivalent in the names of keys appearing in the file — behavior originally implemented by the distutils standard library module. In 2018, a bug report claimed that this normalization caused problems with tox (a Python tool for automating development tasks such as testing). Setuptools developer Jason R. Coombs agreed that there was a problem; after some changes to the Setuptools configuration-parsing code, a deprecation warning for hyphens in key names was added in 2021.
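
To illustrate the behavior in question (the setup.cfg content below is made up), the old parsing code simply treated a hyphen in a key name as if it were an underscore:

```python
# Illustration only: distutils-era parsing accepted "description-file" as a
# spelling of "description_file" by normalizing hyphens in key names.
import configparser

SETUP_CFG = """
[metadata]
description-file = README.rst
"""

parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)
for key, value in parser.items("metadata"):
    print(key.replace("-", "_"), "=", value)   # description_file = README.rst
```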

Setuptools version 78 sought to turn that warning into an error. Developers creating new sdists would be able to fix the problem trivially, but the change also immediately broke the automated wheel-building process for many sdists already available on the Python Package Index (PyPI).

In fact, this change was already known to be incompatible with the popular Requests library — its setup.cfg includes provides-extra and requires-dist keys, where the standard names are provides_extra and requires_dist respectively. But, since Requests uses only Python code, and already publishes a wheel, it would only cause problems for those (such as Linux distribution maintainers) who insist on building directly from source.

The 78.0.1 patch simply removed Requests from the Setuptools integration tests. Setuptools developer Anderson Bravalheri also submitted a pull request to Requests to fix the problem with setup.cfg (though as it turns out, the next minor version of Requests is expected to modernize its packaging setup completely, so as to avoid any future issues). However, packages with now-invalid setup.cfg files turned out to be much more common than expected — about 12,000 packages are estimated to have been affected.

Reporting and updating

When a resulting installation failure was first reported, Bravalheri's initial response was fairly dismissive:

Hi @andy-maier, please contact the package developers and inform about the problem highlighted in the error message: [...] This [key name in setup.cfg] has been deprecated in 2021.

He followed that up with:

We bumped the major version of setuptools from 77.0.3 to 78.0.0 to indicate an intentional breaking change, namely the removal for the deprecated handling of --separated options in setup.cfg. They are described in this section: https://setuptools.pypa.io/en/latest/history.html#deprecations-and-removals.

At some level, this response makes sense. The project correctly followed its semantic-versioning policy, and the deprecation had occurred years ago. In principle, users could avoid problems by simply not upgrading to the new Setuptools version, and developers could patch their setup.cfg files to meet the new expectations. However, things aren't really that simple. Updating released packages is a lot of work for everyone — and it's clear that many maintainers resent this kind of churn. Besides, publishing a fixed release doesn't prevent attempts to install older versions whose setup.cfg files are now invalid.

On the user side, meanwhile, build isolation makes it difficult to keep Setuptools downgraded — because of all the temporary copies installed automatically. Historically, there was no good way for packages to specify what version of Setuptools they used. That's possible now, but projects using Setuptools are under no pressure to change anything. And the standards-mandated default is to use the latest available version. The net result is that every sdist following any kind of deprecated practice becomes a ticking time bomb.
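
What "possible now" means in practice is that a project can pin its build backend in pyproject.toml's [build-system] table; the version bound below is only an example, not a recommendation:

```python
# Illustrative pyproject.toml content (the cap on Setuptools is an example),
# parsed here just to show its structure.
import tomllib

PYPROJECT = """
[build-system]
requires = ["setuptools>=61,<78"]
build-backend = "setuptools.build_meta"
"""

print(tomllib.loads(PYPROJECT)["build-system"]["requires"])
```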

Worse yet, every package that breaks has the potential to cause major ripple effects, in today's world of "software ecosystems" where projects often have large graphs of transitive dependencies. Some of the projects broken this time around turned out to be abandoned or poorly maintained, but used by many others. For example, the stringcase module broke due to the change, but it is seemingly abandoned; when I learned of this, I decided to republish the library following modern packaging standards.

Similar events

This is not the first time that a breaking change in a major Setuptools version was disruptive. In recent memory, arguably the most similar case is that of Setuptools version 72, which attempted to remove the setuptools.command.test module. This, too, was following up on a long-standing deprecation, in this case of using setup.py directly to run tests. Running tests isn't even relevant to installing packages from PyPI — but because setup.py is ordinary Python code, any setup.py that still imported the module would fail with an ImportError at build time. Notably, the Requests project was impacted by this as well, even though its developers were no longer using that workflow.

Less notably, Setuptools version 77 was intended to enforce standards for how license files are referred to in project metadata, while also implementing the new standard enabling the use of SPDX license expressions in package metadata. However, the implementation caused problems for some large multi-language projects (notably for Apache projects) where the source tree has relevant license files in a parent directory of the actual Python project. Setuptools version 71 changed its vendoring system to favor separately-installed versions of its dependencies (as a stepping stone toward de-vendoring); this caused problems for users when those dependencies removed functionality that was still present in the vendored version. An internal reorganization in version 69 broke Astropy's continuous-integration (CI) system, and another in version 70 broke PyTorch. Not every new version causes widespread problems, but it's certainly possible to continue this list.

Criticism

Aside from the GitHub issue, users critiqued the handling of the incident on Hacker News and Reddit. In particular, many questioned why the use of hyphens should have been deprecated in the first place. Indeed, one might infer that Setuptools can't really ever safely remove support for anything — as much as the developers might like to clean up the code, Hyrum's Law reigns supreme. Setuptools is seen as critical infrastructure for Python, and it's clear that the Python world will be stuck with old packages following deprecated packaging practices for quite some time.

The problematic version 78.0.1 was also never deleted or "yanked" from PyPI (yanking would prevent pip from installing it unless explicitly selected by an exact version number), which caused further objections.

One might argue that the simple fact that Setuptools is on version 78 points to a problem in itself — that Setuptools development is moving far too quickly. This is not entirely fair: about half of the Setuptools major versions date to before the implementation of the new pyproject.toml-based system, with version 39 being released in March 2018. Even so, about five new major versions of Setuptools now appear per year. Progress on updating legacy projects to modern standards, needless to say, has been far slower. While I'd prefer not to see Setuptools development artificially slowed, it would be nice to have breaking changes bundled together so that they occur less often.

Discussion

More importantly, though, there needs to be better awareness of such changes in Setuptools, and better processes for addressing them. On the official Python discussion forums, pip developer Damian Shaw started a thread about how build backends could improve their approach to deprecating and removing functionality, summarizing the overall problem for the Python ecosystem:

In this ecosystem it has meant that any build backend that is heavily relied on that makes a backwards incompatible change can break user workflows on a very large scale, and the tooling available for users to recover isn't well developed.

The ensuing discussion revealed that the documentation for some other build backends recommends setting an upper bound for the backend version in the project metadata. However, doing so can potentially break packages in the future — for example, if a newer version of the backend is required to work under a new version of Python.

Another approach is to have the installer use different logic to choose a build backend version. Paul Moore, a pip developer, initially resisted such a change:

I don't think it's inevitable, and the UX [user experience] design is one of the things I'm concerned about. IMO, pip shouldn't be taking on complexity here - I don't see any evidence that this is a major issue. We've never had any problems like this with any other build backend. And while I don't want to blame setuptools, I also don't want pip's feature set to be dictated by mishaps in setuptools' release management.

However, the discussion on the initial Setuptools bug report revealed that pip already provides some useful functionality here: a hack that allows end users to constrain the version of the build backend used for isolated builds. The pip team then discussed improving the surrounding user interface and making this an officially blessed approach to the problem. An analogous approach is explicitly supported by uv; Poetry users installing dependencies in a development environment may have to wait for the resolution of a GitHub issue. Pip users can also just disable build isolation, which might fix the problem in some cases.
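
As best I can tell, the hack in question relies on pip honoring the PIP_CONSTRAINT environment variable when it populates isolated build environments, so a constraints file can cap the Setuptools version used to build sdists. A rough sketch (the package name and version bound are illustrative):

```python
# Rough sketch of the workaround: point PIP_CONSTRAINT at a constraints file
# and pip will honor it when installing build dependencies for isolated
# builds. Names and bounds here are illustrative.
import os
import subprocess
import sys
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as f:
    f.write("setuptools<78\n")
    constraints_file = f.name

subprocess.run(
    [sys.executable, "-m", "pip", "install", "some-sdist-only-package"],
    env=dict(os.environ, PIP_CONSTRAINT=constraints_file),
    check=True,
)
```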

Regarding front-end tools failing to expose warnings, Bravalheri proposed that installers should show build warnings to end users by default. This is not the first time the problem of deprecations has been raised for the language. Python developer Steve Dower noted that end users can't really do anything about the warnings. However, if a project only publishes an sdist, the maintainers won't see the output from trying to build a wheel unless they test installation locally. Even if end users can't fix the problem, they could issue bug reports about the warnings if they are shown.

A final problem with communication between installers and build backends is that the existing API doesn't seem to provide a good way to separate warnings from the rest of the build output. Moore suggested working with the Setuptools developers to design one:

So why don't we ask the setuptools maintainers what they would like from build frontends? Not right now - emotions are too high at the moment - but once things have cooled down, we should work with setuptools to understand how we can help things go more smoothly in future.

He offered a list of suggested paths for working with the Setuptools developers, including looking into ways to get deprecation warnings in front of users:

There was no way for setuptools to get the deprecation warning in front of their users? Then frontends could add a way for backends to issue "priority warnings", which will be displayed even when normal output is suppressed.

Hopefully, something like this will be implemented in the future.

In summary, the Python packaging system is amazingly complex — to the point that a quibble over hyphens versus underscores can break thousands of packages and lead to hundreds of discussion posts. But with this complexity comes opportunity for many parties to improve the experience.


Index entries for this article
GuestArticles: Knechtel, Karl
Python: Packaging



PEP517

Posted May 21, 2025 15:44 UTC (Wed) by sam_c (subscriber, #139836) [Link] (6 responses)

There's a far bigger issue brewing with PEP517 vs legacy use. The Python ecosystem is moving towards PEP517 for packaging which formalises how one creates and then installs a wheel. This intentionally doesn't handle the case where you need to install non-Python data files and friends.

The change, however, is at odds with how people have used Python `setup.py` as "Pythonic Makefiles" (to borrow a phrase from a friend). See https://github.com/pypa/setuptools/issues/2088#issuecomme... onwards for the latest instance of this blowing up.

PEP517

Posted May 22, 2025 5:14 UTC (Thu) by geofft (subscriber, #59789) [Link] (4 responses)

Can you explain what you mean by "Pythonic Makefiles"? I've been in this space (perhaps too deep in this space) for a while and I don't really follow - what is the thing you are trying to do that cannot be done?

My first reading of the comment linked from the comment thread you linked (https://github.com/pypa/packaging-problems/issues/576#iss...) is that while `setup.py install` is happy to put files in an OS-owned directory, pip is not, and for reasons like this, "distro maintainers hate pip because it does NOT do what distro maintainer want." If I'm reading this right, this is exactly backwards: pip's refusal to put files in OS-owned directories started out as a patch from distro maintainers overriding the behavior of upstream pip, because they specifically wanted pip to stop overwriting OS-owned files. That change slowly got upstreamed in a form that hopefully let the OS maintainers drop their patches.

A little later in that thread it's mentioned that "python setup.py install" is used to install e.g. systemd units in /lib. The reason setup.py was able to do this is that it was just an ordinary Python script - but that gives you a way forward; the ability to run ordinary Python scripts has not been deprecated, no matter how much setuptools being invoked through an ordinary script has been deprecated. Is there a path forward where your setup.py script does not use setuptools (which is a Python-packaging-specific tool) but uses generic Python functionality like open("/lib/...", "w")?

(There may be a need for a library that handles some of the details of doing this, like destdir; I think one could write such a library and have it be far simpler and far more useful for this purpose than setuptools.)

Or am I misunderstanding the problem?

PEP517

Posted May 22, 2025 12:56 UTC (Thu) by zahlman (guest, #175387) [Link] (1 responses)

> My first reading of the comment linked from the comment thread you linked (https://github.com/pypa/packaging-problems/issues/576#iss...) is that while `setup.py install` is happy to put files in an OS-owned directory, pip is not, and for reasons like this, "distro maintainers hate pip because it does NOT do what distro maintainer want." If I'm reading this right, this is exactly backwards: pip's refusal to put files in OS-owned directories started out as a patch from distro maintainers overriding the behavior of upstream pip, because they specifically wanted pip to stop overwriting OS-owned files. That change slowly got upstreamed in a form that hopefully let the OS maintainers drop their patches.

It's... a lot more complicated than this, unfortunately. First, to clarify a few things about your description:

Pip originally was happy to "put files" in *certain* OS-owned directories. Specifically, it will put them within any *Python environment*. If you tell it to install for the system environment, and give it sudo rights, it would obediently write files to (typically) `/usr/lib/pythonX.Y/site-packages`. Without sudo, it could still write to a user-specific directory; if you run the system Python as a regular user, Python can still `import` from there.

Eventually, distros realized that not only do those system-level installs cause obvious problems for the system package manager, but even a user-level install can cause problems for scripts that come with the system. This is especially problematic when you have a package manager (or interface thereto) *written in Python* that still does useful things when run without sudo. Hence https://peps.python.org/pep-0668/ (I think that's the "upstreamed patch" you have in mind).

Notably, though, this change was *entirely orthogonal* to the transition to the new PEP 517 system.

Separately, the wheel format can *only* put files *into the current Python environment* - as defined by the `sysconfig` standard library module. (By "current" I mean the one that pip is running under.) It can't ordinarily put them in /etc - or ~/.config , even though that isn't system-owned.
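
For illustration, the complete set of locations a wheel's files can end up in is whatever `sysconfig` reports for the interpreter that pip is running under:

```python
# The install scheme for the interpreter running this code; a wheel's files
# can only be placed under these paths.
import sysconfig

for name, path in sysconfig.get_paths().items():
    print(f"{name:10} {path}")
```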

However, pip *can* effectively put files in those places (assuming filesystem permissions) if it installs from an sdist and builds from source - because that will request the build backend to do whatever is necessary to make a wheel, including running arbitrary Python code from the package. In the case of Setuptools, it's effectively equivalent to `setup.py install`. (At least the last I checked, pip run with sudo *will not drop permissions* when it invokes the build backend!) But this is not playing nice with either the Python ecosystem *or* with the Linux distro's package management.

----

But the comment is talking about something completely different. It's not about end users causing problems for the distro maintainer by using pip to install a third-party package. It's about the distro maintainer *trying to use* pip to *create a system* package.

They might try to do this by installing the code into a staging folder with pip, then using another tool to pack that tree into .deb or .rpm or whatever. As described above, theoretically the pip/Setuptools combination could be wrangled to write files to /path/to/staging/etc or whatever is needed, even while the Python environment is nominally rooted at /path/to/staging/usr. But it appears that this doesn't work with explicitly provided Setuptools features - you'd have to leverage the fact that you can write arbitrary code in `setup.py`, and do your own file I/O etc. And that code might not play nice with *other* packaging flows....

(Also, pip doesn't really know how to install cross-environment. It figures out installation paths based on the Python executable that's being used to run it. Newer versions have a `--python` argument to try to fix this - but it works by tracking down the other Python executable and *spawning it to re-run the pip code*.)

PEP517

Posted May 23, 2025 1:37 UTC (Fri) by sam_c (subscriber, #139836) [Link]

I'm not talking about using pip here (just invoking `setup.py install` or whatever will have the same issue) but rather any software where upstream expect data files to be installed to standard locations using their own setup.py, and all of their documentation refers to it and so on, which will stop working. The Python community hasn't done a great job of communicating this change.

But I'll reply more to the other comment rather than duplicating it.

PEP517

Posted May 23, 2025 1:48 UTC (Fri) by sam_c (subscriber, #139836) [Link] (1 responses)

> Can you explain what you mean by "Pythonic Makefiles"?

There's a lot of software that used `setup.py` as an entrypoint and relied on it being able to install to arbitrary locations on disk, often using it only because the software was written in Python and they felt it was natural that the build system also be in Python, despite it not needing to be.

It would also often (as the article covers) handle the testsuite and other administrative tasks, as well as "more elaborate" installation.

> The reason setup.py was able to do this is that it was just an ordinary Python script ...

Precisely.

> Is there a path forward where your setup.py script does not use setuptools (which is a Python-packaging-specific tool) but uses generic Python functionality like open("/lib/...", "w")?

Sure, that can be done - but if the software is installed via a wheel, that's not going to work, because there's no way to map extra data file locations in the wheel to real locations on the system. And none of the authors of these pieces of software relying on this functionality working ever seem to be aware of the problem.

i.e. All of it is solvable, it's just that it's a lot of work across a lot of different projects, and I don't think many of them are aware of the need to even do it. The software relying on this generally doesn't want to be installed as a wheel (it's usually "written in Python, not a library that should be imported"), they've just been piggybacking on setuptools and the setup.py entrypoint for years.

PEP517

Posted May 29, 2025 18:33 UTC (Thu) by zahlman (guest, #175387) [Link]

> And none of the authors of these pieces of software relying on this functionality working ever seem to be aware of the problem.

> i.e. All of it is solvable, it's just that it's a lot of work across a lot of different projects, and I don't think many of them are aware of the need to even do it.

My experience of discussion in Python packaging circles has been: people are very aware that Python code gets used and distributed in a wide variety of ways beyond the standard packaging system. But the people who try to "solve packaging" have very little idea of how to track down major projects that work this way. It often seems like the best they can do is to trawl through PyPI looking for sdists that will break in the future and send emails to whatever maintainer email addresses are listed in the metadata. If they don't publish an sdist then there's no clear way to figure out what their needs are to service them better.

A bit of speculation as well as *personal* investigation from Paul Moore (of the pip team) led to his submission of PEP 722, subsequently rejected in favour of PEP 723 by Ofek Lev (author of Hatch). That makes life easier for people who write single-file scripts with dependencies. But there are still huge unanswered questions about several other kinds of projects, e.g. large monorepos.

I had a bunch of post IDs saved on the Python Discourse forums about this general topic, but I lost those when I got kicked out (https://zahlman.github.io/posts/2024/07/31/an-open-letter... ; you may also remember me from https://lwn.net/Articles/988894/) and my local bookmark collection is quite large and hard to organize.

If you personally know people (or projects) that use deprecated workflows around setup.py, I'd heavily encourage you to warn them - and link to https://blog.ganssle.io/articles/2021/10/setup-py-depreca... . If you know projects that use only Python but publish only a "legacy" sdist instead of wheels, https://pradyunsg.me/blog/2022/12/31/wheels-are-faster-pu... is a solid reference.

If they otherwise make inappropriate use of setup.py (e.g. to convey metadata that could be in a pyproject.toml), probably the best way to warn them is to point out what happened to Requests with Setuptools version 78 (and 72): twice in less than a year, one of the most popular projects out there, written in pure Python, had its packaging break in easily avoidable and completely unnecessary ways. Packaging is part of maintenance, too.

But for those who don't publish to PyPI and whose projects aren't a dependency for anything else, there might not actually *be* a need to do anything - except to stop upgrading local installations of Setuptools.

PEP517

Posted May 22, 2025 12:09 UTC (Thu) by zahlman (guest, #175387) [Link]

>The Python ecosystem is moving towards PEP517 for packaging which formalises how one creates and then installs a wheel.

PEP 517 isn't so much about defining the process, as it is about providing an interface that allows installers to automate it. `pyproject.toml` allows you to specify a build backend, but the default is still Setuptools, and pip will be able to install practically any legacy package that was designed for the `setup.py install` workflow. It's *that workflow itself* that's being deprecated (https://blog.ganssle.io/articles/2021/10/setup-py-depreca...); but the existing `setup.py` doesn't need modification. Setuptools implements its part of the PEP 517 contract basically by simulating what `setup.py install` would do; and it will continue to work that way with perhaps some internal simplification.

>This intentionally doesn't handle the case where you need to install non-Python data files and friends.

I don't follow. The wheel format explicitly provides for a folder of data files, which has subfolders for several different standard install locations (https://packaging.python.org/en/latest/specifications/bin...); and Setuptools explicitly provides configuration options to say what data files to include in the wheel and which subfolder they belong to (https://setuptools.pypa.io/en/latest/userguide/datafiles....). I'm not sure what it means for a data file to be "non-Python"; if your Python project has a C extension and a data file, neither the wheel format, nor Setuptools nor the installer will care whether the Python code reads the data or the C code does (or both).
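
To make that concrete, here is a minimal (hypothetical) setup.py using the mechanism I mean - the listed files end up in the wheel's data directory and get installed relative to the target environment, not to absolute paths like /etc:

```python
# Hypothetical setup.py: the listed files are packed into the wheel's .data
# directory and installed relative to the environment's "data" location
# (e.g. the venv prefix), not to arbitrary absolute paths.
from setuptools import setup

setup(
    name="example",
    version="0.1",
    data_files=[
        ("share/example", ["conf/example.conf"]),
    ],
)
```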

But yes, some developers will be disrupted if they're using `setup.py` from higher level tooling, or want to install files to places outside of any Python environment (like /etc or something). Or I guess if they're trying to use Setuptools to make a .rpm or .deb etc. Eventually that will all go away.

But alternatives exist; and this is anyway much less disruptive than an end-user's install failing because a transitive dependency couldn't be built from source locally.

I HATE PYTHON

Posted May 21, 2025 16:14 UTC (Wed) by hDF (subscriber, #121224) [Link] (35 responses)

Every time I try to do damn near anything on a computer, I somehow end up dealing with Python packaging issues. I resent the language greatly for it.

I HATE PYTHON

Posted May 21, 2025 16:33 UTC (Wed) by dskoll (subscriber, #1630) [Link] (34 responses)

Seconded. I hate that you can't use pip3 to easily install system-wide packages if your Python installation comes from an OS package and that it wants you to set up a venv. Drives me completely nuts.

I'm more of a Perl person than a Python person, and Perl happily lets you install both OS packages and packages from CPAN and manages to keep them in separate directories. Why can't Python do this?

(I'm guessing it can if you know the right magic, but the warning from pip3 not to do it is pretty stern.)

I HATE PYTHON

Posted May 21, 2025 19:05 UTC (Wed) by intelfx (subscriber, #130118) [Link] (25 responses)

> I'm more of a Perl person than a Python person, and Perl happily lets you install both OS packages and packages from CPAN and manages to keep them in separate directories. Why can't Python do this?

Because Python's support for "keep[ing] them in separate directories" is called a venv.

> but the warning from pip3 not to do it is pretty stern

Yes, and rightly so.

I HATE PYTHON

Posted May 21, 2025 20:01 UTC (Wed) by dskoll (subscriber, #1630) [Link] (24 responses)

Because Python's support for "keep[ing] them in separate directories" is called a venv.

Yeah, but that's not a great solution for installing system-wide packages from source. Surely it would not have been the end of the world for Python to include a standard set of paths (different from the main system paths) for installing packages from source? A sort of default built-in venv?

AFAIK (correct me if I'm wrong) Python is the only major language that doesn't have this sort of thing.

I HATE PYTHON

Posted May 21, 2025 23:36 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (23 responses)

That's called pip install --user, if you're installing for a single user.

If you're installing for all users on the system, then you are in one of three positions:

* You are the distro. Then this behavior was your idea in the first place, because upstream Python does not do this (i.e. tell pip to fail with an error saying not to fiddle with package-managed files). If you don't like that behavior, don't put it in your distro's version of Python.
* You are the sysadmin. Then you can use the site module's configuration hooks to set up and customize how modules are imported and where they may be installed, and then use pip install --prefix or pip install --target to manually install packages into it. Your distro may have already configured such a thing for you (check with python -m site). See https://docs.python.org/3/library/site.html
* You are building a container image or the like. Then you are either using somebody else's container-targeted distro, in which case you've effectively appointed yourself sysadmin of the container (see previous bullet), or else you're building it from scratch with just the software you actually use (see first bullet). If you don't want to be a sysadmin, then use a venv instead of a full container - venvs are much cheaper and simpler than containers!

I HATE PYTHON

Posted May 22, 2025 2:13 UTC (Thu) by dskoll (subscriber, #1630) [Link] (21 responses)

I'm in option 2 - sysadmin.

Any normal C or C++ program built with ./configure installs itself in /usr/local/ by default and doesn't interfere with system packages.

Any normal Perl package built with a Makefile.PL will do the same by default and not interfere with system packages.

Why can't Python do that by default? I shouldn't need to know much about Python just to use software written in it.

I HATE PYTHON

Posted May 22, 2025 5:23 UTC (Thu) by geofft (subscriber, #59789) [Link] (20 responses)

> Any normal C or C++ program built with ./configure installs itself in /usr/local/ by default and doesn't interfere with system packages.

_Usually_ doesn't interfere! It's true that it doesn't physically overwrite files, but what if it installs a binary like bash or perl or python or something and then breaks existing system packages when PATH=/usr/local/bin:/usr/bin?

What if it installs newer and incompatible / buggy / differently-configured shared libraries at the same SONAME? /usr/local/lib is on the default shared library search path.

What if your Perl package installs a newer and incompatible / buggy / differently-configured Perl library and shows up first on @INC and breaks some OS-provided Perl command? At least on my system, /usr/bin/perl -e 'print "@INC\n"' has /usr/local before /usr.

Python does the same thing. It will look for Python code in /usr/local. It will to this day still happily install there with (e.g.) "sudo pip install --break-system-packages," which exists simply to opt into this risk if you're comfortable with this risk. I don't have a good explanation for _why_ this problem seemed to affect Python more than it affected C users and Perl users. The best guess I have is that there are far more Python users installing third-party packages, and they are far less likely to be sysadmins, than C or Perl users installing third-party packages, so it was much more common to hit these unlikely-but-certainly-possible scenarios in Python.

I HATE PYTHON

Posted May 22, 2025 12:43 UTC (Thu) by kpfleming (subscriber, #23250) [Link]

A large factor in this dilemma is that many prominent distributions ship, and rely on, tooling written in Python (including package managers). Installing a different version of a package that was already installed by the distribution can break those tools, and results are not hilarious.

I believe some distributions have considered having a totally separate Python installation to be used by the distribution's tools, which brings its own complications (such as users using their package manager to determine that 'requests' is installed, but complaining that 'import requests' fails in their own programs).

It is an unfortunate situation, but has led to some people (including me) never using the distro-provided Python for anything other than temporary or one-off scripts.

I HATE PYTHON

Posted May 22, 2025 13:07 UTC (Thu) by Karellen (subscriber, #67644) [Link] (15 responses)

but what if it installs a binary like bash or perl or python or something and then breaks existing system packages

Then that's a terrible, broken package that no-one should be installing on any system under any circumstances. Also, you should probably add its authors/maintainers to a list of people to avoid installing any software from in the future.

I guess I don't understand enough about python packaging to have any clue why a problem that wild would even be on your radar. It feels like an abyss I'm frankly kind of terrified to look into, for fear of dropping SAN points.

I HATE PYTHON

Posted May 22, 2025 14:19 UTC (Thu) by geofft (subscriber, #59789) [Link] (14 responses)

You think bash is a terrible, broken package? :)

I guess part of the problem here is that Python package managers install packages' dependencies too. So, yes, I agree that if `wget ftp.gnu.org/zardoz.tgz && tar xf zardoz.tgz && cd zardoz && ./configure && sudo make install` causes a /usr/local/bin/bash to exist, that is terrible, broken, and fully unacceptable. But `sudo pip install zardoz` may well pull in a newer version of a dependency (probably not bash, of course, but there are plenty of CLI tools written in Python, and occasional needs for CLI dependencies not written in Python).

Another way of looking at this is, _if_ there is a need for a newer version of some common dependency, something like automake will make you figure that out on your own, at which point you as a sysadmin will ponder what's going on before you install a newer version of bash to /usr/local/bin. The Python ecosystem prefers not to make you figure that out on your own. There are good arguments for both philosophies, but this is a fundamental part of why there are orders of magnitude more people who install Python packages than have ever typed the letters "./configure".

I HATE PYTHON

Posted May 22, 2025 16:22 UTC (Thu) by Karellen (subscriber, #67644) [Link] (13 responses)

Python package managers install packages' dependencies too.

Even non-Python dependencies, which are generally distributed outside of Python package repos?!?

Oh. My. God.

That does clear up some confusion I was having around binary packages for Python though. Like, why do you need binaries for a scripting language at all?

Anyway, I've been doing some stuff in Python recently, just using the system python. I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

I HATE PYTHON

Posted May 22, 2025 16:53 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (5 responses)

> Even non-Python dependencies, which are generally distributed outside of Python package repos?!?

Eh... sort of but not really. The de facto standard tool for installing truly non-Python dependencies is conda, but conda is an entirely different ball of wax, with its own separate package repositories etc., and most distros either do not package it at all, or at least do not install it by default. Conda also sets up a totally isolated Python installation, so it actually does play reasonably well with the system Python (in the sense that it usually does not touch the system Python at all).

The problem with pip (not conda) is that, in practice, there are a ton of libraries written in Python, including the Python *bindings* for many non-Python libraries. Installing any Python package that depends on those libraries will cause them to be installed into /usr/local if no venv or --user is active, and then plenty of system packages may break.

> Anyway, I've been doing some stuff in Python recently, just using the system python. I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

Gentle recommendation: Learn uv, venv, and *maybe* pip, and ignore literally everything else in this space. You don't really need to know venv and pip for that matter, because uv handles both of those use cases, but it's good to understand them in broad conceptual terms since they are the closest things that Python has to standard package tooling.

Rationale: uv handles 90%+ of use cases exceptionally well, has very good performance compared to most of the other tooling, is actively supported and maintained, favors correctness over footguns (especially in terms of how it resolves dependencies), and does not massively deviate from the de facto best practices that most other tooling follows (so if uv suddenly gets replaced by yet another New Shiny, you'll probably be able to migrate from it without massive effort on your part).

I have previously heard good things about Poetry and a couple of other solutions, and I probably would have recommended them to you in the past, but nowadays, the common wisdom is that uv's high performance blows most of the competition out of the water, to the point where it doesn't make much sense to touch any non-uv tool unless you really need a feature that uv lacks (or your project is already using one of the alternatives, of course). These days, there are not a whole lot of things missing from uv (at least as far as I know), so that's an increasingly niche reason to use something else.

I HATE PYTHON

Posted May 23, 2025 1:56 UTC (Fri) by aragilar (subscriber, #122569) [Link] (4 responses)

90% is optimistic: uv is very much embedded in the webdev part of the ecosystem (it's the same line as pipenv and poetry) so solves their needs well (modulo assumptions uv makes which are its primary source of speedups but also cause issues), but outside webdev it makes things harder to package for linux distros (because now you have two languages to manage), it doesn't play well with the numerical ecosystem because of its assumptions (and so conda is used), and it tries to hide the real need for design when managing an install (e.g. look at how it installs python, and all the minor things that break due to trying to make an almost-static install). If you are in its target group, it's a significant improvement over pipenv/poetry (because they do make the same assumptions, and helped pave the path to uv), but otherwise to me it seems like another wedge splitting apart the Python ecosystem.

I HATE PYTHON

Posted May 23, 2025 2:14 UTC (Fri) by geofft (subscriber, #59789) [Link] (3 responses)

> it doesn't play well with the numerical ecosystem because of its assumptions (and so conda is used)

I've spent the last week at PyCon in a room with the Conda folks and other people working with numerical/scientific computing in Python, trying to both solve these problems for the PyPI-based ecosystem (pip, now also uv and poetry and others) and trying to do it in a way where we're compatible with the Conda ecosystem / where they can adopt the same designs. Honestly the PyPI ecosystem has been _pretty_ good over the last few years for this task, certainly far better than a decade ago when `pip install numpy` would always try to build from source. NVIDIA has resumed offering first-class support for the PyPI ecosystem in the last few years after very loudly quitting it in 2019 (https://medium.com/rapids-ai/rapids-0-7-release-drops-pip...). There's still a lot left to do, but we should be at the point (and are, in my experience) where most people, certainly most individuals, will have just as good a time with pip/uv/etc., so I am genuinely curious what sorts of things you personally find don't work.

> and it tries to hide the real need for design when managing an install (e.g. look at how it installs python, and all the minor things that break due to trying to make an almost-static install).

I'm one of the maintainers of those Python builds and I am absolutely open to bug reports, either here or on our bugtracker. Going after all these minor things is basically my top priority at the moment. Please let me know, I love having more test cases. :-)

I HATE PYTHON

Posted May 23, 2025 10:46 UTC (Fri) by aragilar (subscriber, #122569) [Link] (2 responses)

I want to draw a distinction between the underlying standard/tools, which are mostly fine apart from some key spots I know are mostly being worked on (and which I am happy to use), and the higher level wrappers (of which poetry and uv are the poster children, and which I have an issue being described as working for 90%).

I'd classify things on a spectrum from papercuts to blockers. Things listed on https://gregoryszorc.com/docs/python-build-standalone/mai... are close to papercuts, in that they can be avoided by using a different build (or even building from source, plus they're actually documented), but when tools like uv default to using them implicitly, they start moving towards blockers (as people tend to go with the defaults, and so it becomes a matter of debugging layers of issues). Once certain IDEs start getting involved, you're having to debug someone's poorly configured machine and wasting hours or days fixing it.

Then there are the parts of the ecosystem where pre-built binaries cannot be used (e.g. MPI), where you end up adding flags like `--no-binary` and rebuilding half the ecosystem anyway because you need to link to a single BLAS. uv is really not designed for this (neither is poetry, they assume too much about wheels), and while you can (as I have) create your own private index to control what the installers see, using pip is far easier (and less likely to break).

Then there are the fun bits like the PyPI packages of Jupyter plugins just straight up not working (though the conda version works), due to wheels not being able to install to specific paths (because they need to integrate with the system they are installed on).

This all flows from the assumption that pre-built PyPI-associated binaries (with all their limitations) are a solution (rather than a workaround) to users needing to set up their machine properly. It's worth comparing this to R, where the installer on Windows and MacOS sets up a proper dev environment (with compilers), and so these kind of problems do not happen.

I HATE PYTHON

Posted May 24, 2025 22:47 UTC (Sat) by geofft (subscriber, #59789) [Link] (1 responses)

Thank you, this is very helpful and I appreciate you taking the time to write this up! A few of those quirks in that document are gone now (e.g., the musl build is now a dynamic binary, not a static one, so extensions work fine); I'll update it.

> Once certain IDEs start getting involved, now you're having to debug someone poorly configured machine and wasting hours or days fixing.

Just to clarify, do you mean that people use their IDEs to set up uv and thus python-build-standalone, and that makes things harder to debug because it's harder to figure out what their setup is? Or is there something specific to usage inside some IDE versus at the CLI?

> Then there are the parts of the ecosystem where pre-built binaries cannot be used (e.g. MPI), you end up to start adding flags like `--no-binary` and rebuilding half the ecosystem anyway because you need to link to a single BLAS.

This is definitely one of the things we spent time talking about last weekend (and it's also called out on https://pypackaging-native.github.io/key-issues/abi/ , an excellent website written by folks who work on the scientific Python ecosystem and would like the non-conda ecosystem to work well.) It's certainly a deficiency right now, but it's also a high priority for the scientific Python community as a whole. While I'm sure people have concrete examples, if you have a specific favorite prebuilt binary wheel or package name that doesn't work well or a combination of libraries that need to share a single BLAS, again, I love having more test cases. :)

> [...] uv is really not designed for this [...] using pip is far easier (and less likely to break).

uv and pip both install the same packages from the same places, and when you build from source with --no-binary you can link your exact system libraries when using either uv or pip, so I'm curious about this - is this because you can do venv --system-site-packages (or use pip without a venv in some fashion) and get some compiled Python packages from the OS, and uv doesn't support that? Or something else?

> Then there are the fun bits like the PyPI packages of Jupyter plugins just straight up not working (though the conda version works), due to wheels not being able to install to specific paths (because they need to integrate with the system they are installed on).

This one is new to me - do you have a specific plugin / example command that doesn't work? (I know there are tons of people using Jupyter with extensions in production using pip + venv, so I'm assuming it's not "any extension.")

> This all flows from the assumption that pre-built PyPI-associated binaries (with all their limitations) are a solution (rather than a workaround) to users needing to set up their machine properly. It's worth comparing this to R, where the installer on Windows and MacOS sets up a proper dev environment (with compilers), and so these kind of problems do not happen.

Yes, conditional on actually setting up a proper dev environment with compilers and library dependencies, which has its own host of problems. :) This was, after all, the status quo of pip until about a decade ago, and the fact that it was a very bad experience is both why conda came into existence and why pip started doing wheels. (Unless you mean that Python or uv or some other installer should install _its own_ compiler toolchain, independent of what is on the host, and build stuff from source, including C libraries like BLAS? That's an intriguing idea....) Conda supports R as well as Python, and my impression is that its R support is popular for largely the same reasons, in that building all your R packages from source is a difficult experience.

I HATE PYTHON

Posted May 28, 2025 9:55 UTC (Wed) by aragilar (subscriber, #122569) [Link]

> Just to clarify, do you mean that people use their IDEs to set up uv and thus python-build-standalone, and that makes things harder to debug because it's harder to figure out what their setup is? Or is there something specific to usage inside some IDE versus at the CLI?

Sorry, I meant that IDEs provide Python installs as well, and now you've got some unreproducible conda-uv-IDE frankenstate where python points somewhere, pip somewhere else, and who knows what `sys.path` will be. Too many tools want to be in charge (for legitimate reasons) and only pip is really set up to be a bit player where it plays nicely with others (though naturally it too can mess things up, as noted by PEP 668). Hence why I push people to understand what their tools are doing, and not mix them arbitrarily.

> This is definitely one of the things we spent time talking about last weekend (and it's also called out on https://pypackaging-native.github.io/key-issues/abi/ , an excellent website written by folks who work on the scientific Python ecosystem and would like the non-conda ecosystem to work well.)

Yeah, I've provided PRs to it previously :) To me the solution here was the "local" or "custom" wheel tag, which I suggested a while back, but didn't go anywhere.

> uv and pip both install the same packages from the same places

It's not the source that's the problem, it's that uv wanted a venv (at least when I tried it) whereas it should have used (in that particular case) the (purpose-built) conda environment.

The jupyter plugin (I don't recall its name, I'd need to dig into my notes) in this case was to do VNC on the web, and my colleague (who is not a fan of conda) said he needed to use both conda and pip to get it working. When I had time I found out there was a separate binary that was being shelled out to (I think) to set up a websocket or something, but it would only be set up right with the conda install, because it needed some post-install stuff (I'm not sure why it couldn't have been configured correctly).

> Conda supports R as well as Python

My impression (based on the astronomers and biologists I know) is R from conda is used when people want to do R<->Python connections using rpy2 or similar, and most actual R users just use CRAN. This may not be accurate across the ecosystem though (I'm very much embedded in the university ecosystem).

> Yes, conditional on actually setting up a proper dev environment with compilers and library dependencies, which has its own host of problems.

While I don't think this is trivial, I do think we (the Python community) can do a lot better (and if we assume pre-built wheels can be used, then issues like the discussion around the future of setuptools get brushed aside). R on Windows (and possibly MacOS) installs its own copy of GCC plus a bunch of other "standard" tools, and I've not heard complaints there (I have previously suggested Python on Windows should make it easy to install MSVC, but I got pushback against that). Conda definitely made things better on Windows, but at least when I tried using it to install packages without wheels, it apparently hadn't added the configuration that the conda gcc should work with the conda python, which seemed like a missed opportunity to me.

It also encourages groups to not use PyPI to upload their packages, as the assumption is that you should produce wheels (NVIDIA was one example, but I know of others that I'm not going to name, otherwise someone will squat on their packages). I think having an index where all packages get registered, plus a separate (downstream?) index which is a self-contained wheel archive, would go a long way toward solving this; telling people that the high-level tools only work with the wheel archive would then be a better experience, as PyPI isn't forced to become a worse conda.

I HATE PYTHON

Posted May 22, 2025 16:59 UTC (Thu) by geofft (subscriber, #59789) [Link] (5 responses)

> Like, why do you need binaries for a scripting language at all?

Two reasons. One, it's a general-purpose programming language; it happens to be good at scripting, but it also happens to be good at other things. For instance, it's very widely used for scientific computing (even before the current AI furor), and an extremely common thing to do with it is to install wrappers around BLAS/LAPACK and use them. (Most of these Python programmers don't even know that the thing they're using is BLAS/LAPACK at its core.)

Two, the thing that a scripting language does, very often, is to call binaries! Graphviz is a good example - the common Python bindings call 'dot' as a subprocess. Yes, you can install it on your own (and the version you get from pip wants you to do exactly that), but there are good reasons to want a version that matches what was tested by the wrapper, etc.

(Also also, if you insist on getting these things from the OS, that means you need sudo to install them. Weirdly enough, the Python ecosystem is probably the best fully unprivileged package manager out there for binaries on Linux. I'm not saying it's good, we have a lot of work to do, but the other options aren't better.)

> I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

Don't let me stop you from taking up goat farming, but if you are willing to take a recommendation for what to use, give uv (https://docs.astral.sh/uv/) a shot. (I recommend the provided installer, but if your OS has a relatively recent version, that's fine too.) The approach of uv is to abstract all the virtual environment stuff from you (and, really, to treat it as an implementation detail).

In fact you can write a single-file Python script with some metadata about its dependencies and use "uv run myscript.py", and behind the scenes it will create a temporary virtual environment with those dependencies if it doesn't have one and run your script, completely independent of anything going on with the OS.
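
For a concrete (made-up) example, such a script carries its dependency metadata in a specially formatted comment block, and `uv run myscript.py` takes care of the rest:

```python
# /// script
# requires-python = ">=3.12"
# dependencies = [
#     "requests",
# ]
# ///

# uv reads the metadata block above, builds a temporary environment with
# requests installed, and then runs the script in it. (Hypothetical example.)
import requests

print(requests.get("https://lwn.net/").status_code)
```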

One other thing uv does is it also is able to install its own version of Python, generally noticeably newer than what your OS will have, further increasing how independent it is of the OS. I am one of the maintainers of those Python builds, so if you don't like something about how they build, feel free to yell at me. You will probably be justified in doing so; there are certainly several open issues. :) But it works pretty well, and it avoids the entire class of problems of conflicting with the OS.

uv is relatively new (a little over a year, and half these features for less than that). It's rapidly gaining consensus, and I think there's a strong consensus that _something like_ uv is the right approach, whether or not this particular software project is the way forward. But that's why you're not seeing a consensus quite yet. There's been a lot of work in the past few years to improve Python packaging. It is starting to pay off well, but only just starting. The single-file stuff, in particular, is extremely under-advertised, and I think it's likely the way most sysadmin types (among whom I count myself) would like to be doing things.

Depending on what you're doing, I'm not sure things will be better in another language. If you are writing interesting enough Python that you've moved beyond what the standard library offers, and especially if you're doing things where you'd want compiled libraries like BLAS/LAPACK, dependency problems have to be solved somehow. (I'm curious, incidentally, what the sort of dependencies you're installing are.) Or put another way - there's a reason there's enough stuff on your OS that's written in Python that doing "sudo pip install" puts it at risk. Despite all the problems you see, it does actually solve several more problems very well.

I HATE PYTHON

Posted May 24, 2025 0:14 UTC (Sat) by himi (subscriber, #340) [Link] (4 responses)

I've run into issues with uv's managed python installs, specifically with the netaddr package - which is an unmaintained beast of a thing, admittedly, but also a common enough requirement (particularly in the OpenStack context, where I work) that I didn't have much choice. In the end I just went with the system python and the distro's python3-netaddr package, because I couldn't find a way to get the managed python install to successfully build netaddr from source (netaddr's fault, not uv's), or provide some way to inject my own build into the environment.

Unfortunately, I don't have notes about what exactly I did, and it was on my previous machine so it's rather hard to go back and re-run things to give a proper bug report, or see if things have changed in the ~6 months since then (a distinct possibility given how fast uv has been evolving) . . .

I'd like to say that fixing the issue of unmaintained or badly maintained packages isn't something that the Python packaging community should be responsible for . . . and it really isn't . . . but there are degrees of failure, some more graceful than others, and it felt like I was hitting a brick wall rather than something more accommodating. Obviously a managed python install is a very curated and controlled environment and there's a limit to the degree of flexibility that you can provide, but that can make them unusable sometimes, particularly when it comes to oddball cases like netaddr.

I HATE PYTHON

Posted May 24, 2025 1:05 UTC (Sat) by himi (subscriber, #340) [Link]

Hah, I seem to have picked exactly the wrong time to try stuff with netaddr, because it's apparently a lot more live and maintained than it was when I hit these issues . . . That doesn't change the broader point, of course, but failing to note the changes would be rude to the netaddr devs . . .

I HATE PYTHON

Posted May 24, 2025 1:10 UTC (Sat) by geofft (subscriber, #59789) [Link] (1 responses)

Er, are you sure you're thinking about netaddr? That's a super straightforward pure Python package and so building it should be trivial, but also it has platform-independent wheels so you don't even need to do that.

For completeness, I just checked `uv run --with netaddr python`, and then pinned to various recent and not-so-recent versions, including the versions packaged in Debian stable/oldstable which are well older than uv itself. I also tried with `--no-binary` to force a local build. All seemed to work fine.

(Looking around Debian for similar-sounding arch-dependent packages... did you mean netifaces? That one has compiled code and a pretty dusty build system, but it does seem to work fine with uv, too. I did get a failure the first time I tried because I didn't have a C compiler on my PATH, and I got a failure the second time because netifaces caches its configure-esque checks and uv leaves that around, but it works if you blow that away and retry.)

Regarding old versions, I actually think that's a reasonable expectation, provided that the actual code is compatible with your current Python version. Sometimes you might need to do things like pin an older setuptools in the build environment (to get back to the subject of the article), and maybe that process needs better docs and error messages, but it's the sort of thing that should be doable.

I HATE PYTHON

Posted May 24, 2025 1:54 UTC (Sat) by himi (subscriber, #340) [Link]

Yes, insufficient caffeine on a lazy Saturday morning caused me to confuse netaddr for the real culprit, netifaces. If it's working now that may well be due to changes since I ran into this issue - uv development is almost as fast as its runtime, so it can be hard to keep up. It may also have been using a more modern Python version than netifaces wanted? I honestly can't recall the details, and I'd have to do a lot of work to rebuild enough context to try and replicate the issues.

One of the things that would have been useful, I think, would be a way to use the system python's packages to resolve certain dependencies - possibly by importing them into the managed python install, or into the venvs that uv builds. Obviously assuming you're working with sufficiently compatible versions (which may well be hard to figure out) . . . It's not going to be something you want to use regularly, and it'd need to be explicitly chosen per-package, but it might be useful if the problem packages are a ways down the dependency tree and unresolvable otherwise.

I HATE PYTHON

Posted May 24, 2025 1:15 UTC (Sat) by himi (subscriber, #340) [Link]

ugh . . . I am, apparently, an idiot when I haven't had enough coffee - the package I was having issues with was netifaces, not netaddr. Definitely a case where editing comments would be useful, if only to minimise my embarrassment . . .

More about Python wheels ("binary packages")

Posted May 29, 2025 19:00 UTC (Thu) by zahlman (guest, #175387) [Link]

> That does clear up some confusion I was having around binary packages for Python though. Like, why do you need binaries for a scripting language at all?

Python source distributions and binary distributions are both "binaries" in the sense that they're two different flavours of compressed archives; but I assume you're talking about wheels (binary distributions).

Generally speaking, they may "install non-Python dependencies", but they don't do this by putting anything in system folders (you're meant to always be able to install anything without sudo). Rather, the archive will contain compiled shared libraries which are then put within the standard install paths for the Python code; and then the Python code can find them with relative paths if necessary.

This happens because Python *isn't* simply "a scripting language" (to the extent that the term means anything any more); in particular, it's often used to interface to code in C (and other languages) for performance reasons, and also simply because the libraries exist and it's useful to access them from Python.

If your system Python includes NumPy, that copy might be very stripped down - though it still makes use of "local" binaries. For example, my installation includes `/usr/lib/python3/dist-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so` weighing about 5.5MB. If I separately install a wheel in a new virtual environment, that environment will get its own copy of that which is 10MB, as well as a 25MB .so for OpenBLAS (the system package, by contrast, can be built to use a shared BLAS library).

The libraries generally are compiled with broad compatibility in mind (see https://peps.python.org/pep-0600/, https://github.com/pypa/manylinux and https://packaging.python.org/en/latest/specifications/pla... if you want gory technical details), and of course C is still the dominant language for this. But in principle you can use anything as long as you can figure out explicit bindings from Python (for example, with tools like Cython or SWIG for C) or use dlopen (which is wrapped in the Python standard library by "ctypes").
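
To make the "bundled shared library" pattern concrete, here is a minimal sketch of the ctypes route mentioned above; libfoo.so and its add() function are hypothetical stand-ins for whatever compiled code a wheel actually ships:

    import ctypes
    from pathlib import Path

    # Load a shared library shipped alongside this module, via a relative path.
    _lib = ctypes.CDLL(str(Path(__file__).parent / "libfoo.so"))
    _lib.add.argtypes = (ctypes.c_int, ctypes.c_int)
    _lib.add.restype = ctypes.c_int

    def add(a: int, b: int) -> int:
        return _lib.add(a, b)

Projects with real bindings usually go the Cython/SWIG route instead, but the "library file sitting next to the Python code" layout is the same.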

I HATE PYTHON

Posted May 22, 2025 13:39 UTC (Thu) by dskoll (subscriber, #1630) [Link] (2 responses)

Any software that broke the system in the ways you describe, and was not quickly fixed, would find itself no longer being used.

The problems you mentioned almost never occur in practice.

I HATE PYTHON

Posted May 22, 2025 14:23 UTC (Thu) by geofft (subscriber, #59789) [Link] (1 responses)

> Any software that broke the system in the ways you describe, and was not quickly fixed, would find itself no longer being used.

Correct. That is why it was in fact fixed in the way I describe. :-)

> The problems you mentioned almost never occur in practice.

They do occur in practice. I'm the primary author and instigator of the Python standards doc that made "--break-system-packages" happen. You are more than welcome to yell at me about it - many people have - but we did this for a reason, with the involvement of both people working on Python-ecosystem packaging tools and people repackaging Python for Linux distros, both of whom saw systems broken in the ways I described and wanted to see it fixed.

The need for user-level protection of site-packages

Posted May 29, 2025 19:23 UTC (Thu) by zahlman (guest, #175387) [Link]

> They do occur in practice.

For what it's worth: I've wanted to blog about the topic for quite some time, but I realize that while I "know" these problems occur in practice, I can only wave my arms around hypotheticals and not point at concrete known examples.

It's not exactly *intuitive* that a Python package installed with pip, for the system Python, with `--user` (i.e. without requiring sudo) could cause problems. I understand that it certainly can (since system tools written in Python and *run as* a user will see those packages on `sys.path`). But to my understanding (of course I don't use pip this way myself ;) ), user `site-packages` folders aren't supposed to show up on `sys.path` when Python runs as root. So it's unclear how much damage can really be done; the worst I was able to come up with myself involved user-level malware attempting to socially engineer its way into a privilege escalation.

I'd be happy for any links to existing write-ups of, for example, the specific ways in which some poor unfortunate user broke Apt etc.

--user installs can also cause problems

Posted May 22, 2025 13:36 UTC (Thu) by zahlman (guest, #175387) [Link]

You generally have the right of it, but please note that the same "externally-managed-environment" protection system applies to the `pip install --user` target location as well - see e.g. https://stackoverflow.com/questions/75608323 . Quoting from the PEP:

> The python3 executable available to the users of the distro and the python3 executable available as a dependency for other software in the distro are typically the same binary. This means that if an end user installs a Python package using a tool like pip outside the context of a virtual environment, that package is visible to Python-language software shipped by the distro. If the newly-installed package (or one of its dependencies) is a newer, backwards-incompatible version of a package that was installed through the distro, it may break software shipped by the distro.

> This may pose a critical problem for the integrity of distros, which often have package-management tools that are themselves written in Python. For example, it’s possible to unintentionally break Fedora’s dnf command with a pip install command, making it hard to recover.

> This applies both to system-wide installs (sudo pip install) as well as user home directory installs (pip install --user), since packages in either location show up on the sys.path of /usr/bin/python3.

I HATE PYTHON

Posted May 21, 2025 20:02 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (1 responses)

> (I'm guessing it can if you know the right magic, but the warning from pip3 not to do it is pretty stern.)

It's not magic. It's one flag.

sudo pip install --break-system-packages [...]

The problem is, now your system package manager doesn't know that you went behind its back and installed random things into /usr with a third-party tool. At best, this causes some confusion about which files are managed by the system and which are "managed" by you manually running pip. At worst, it actually does cause some system packages to break, because they were tested against version X of libfoo and you installed version Y. Pip is entirely right to tell you not to do it - it can and will break your system in unpredictable ways, hence why it is called --**break**-system-packages and not e.g. --global-install or some innocuous-sounding thing.

Pip used to work in the exact manner you describe (silently overwriting system-wide packages if invoked with appropriate privileges). It was changed because too many users were accidentally breaking their systems (as discussed at the time in https://peps.python.org/pep-0668/#motivation). So now, it will only break your system if you ask it to do so, which IMHO is a win.

Also, I know that you do have an OS package manager to break, because you would not be getting that error message if you didn't. Pip looks for a file named EXTERNALLY-MANAGED in the standard-library directory, and that file is not distributed with upstream Python, so it can only exist if your distro put it there to indicate that pip should not touch it. In other words: your problem is between you and your distro, and pip is just the messenger.
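
If you're curious whether your distro has set that marker, you can check from Python itself; per PEP 668 the file lives in the standard-library directory. This is just a diagnostic sketch, not anything pip actually runs:

    import sysconfig
    from pathlib import Path

    # PEP 668: distros drop an EXTERNALLY-MANAGED file into the stdlib directory
    # to tell installers like pip to leave this environment alone.
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    print(marker, "exists" if marker.exists() else "is absent")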

Here are some alternatives that are less terrible:

* pip install --user [...], which installs into a subdirectory of $HOME.
* Make a venv. A venv is not heavy or difficult to work with, although Debian users need to install python3-venv. It's literally just a directory with a copy of (or symlink to) the Python binary, plus some scripting gunk to tell Python and pip to use that directory as the installation root (instead of some directory or combination of directories under /usr/lib).
* Use your OS package manager to install the desired OS packages instead. Pip doesn't support doing this automatically, presumably because it does not want to build out integrations with APT and RPM and Pacman and... etc., as well as converting the PyPI name of a package into the name used by the OS repository. Don't ask me how Perl figured that one out, I'm not a Perl person.
* Configure Python so that it has a separate package directory under /usr/local where you can do whatever you want, without breaking the OS package manager, and then pip will not complain about system-wide installations. See https://docs.python.org/3/library/site.html and https://packaging.python.org/en/latest/specifications/ext... for how this works internally.

I HATE PYTHON

Posted May 22, 2025 0:43 UTC (Thu) by nliadm (subscriber, #94000) [Link]

There's also the "INSTALLER" metadata file, which can (should) be used to record what installed a given python package. I think Fedora-likes and Debian-likes also install python packages from their repositories into one path (/usr/lib) and configure pip to install into another (/var/lib) and put both in the Python search path. You can still break utilities by installing system-wide, but it's much easier to get the distro-provided tools working again by just nuking the system-wide pip install.

I HATE PYTHON

Posted May 22, 2025 9:13 UTC (Thu) by ceplm (subscriber, #41334) [Link]

It can and it does … on some other distributions. Complain to your python/setuptools packager.

(There's no need for all-caps vitriol.)

Posted May 22, 2025 13:22 UTC (Thu) by zahlman (guest, #175387) [Link] (2 responses)

> Perl happily lets you install both OS packages and packages from CPAN and manages to keep them in separate directories. Why can't Python do this?

Python doesn't do any of the installation. The entire packaging system, and package ecosystem, is kept at arm's length from core language development.

Installers like pip can and do install packages from PyPI separately from Python packages provided by a Linux distro. Not only that, but pip has separate system-level and user-level install locations. The system package manager will (typically) use /usr/lib/pythonX.Y/dist-packages. When installing for the system Python, pip will use /usr/lib/pythonX.Y/site-packages when run with sudo, and ~/.local/lib/pythonX.Y/site-packages otherwise.

The reason your distro now insists on using virtual environments is because *this folder arrangement is still problematic*. What happens is that a user-installed module shadows a system-provided one that a system script is trying to use. The ~/.local packages are only visible when running that system script as the user, but *this still causes enough problems* that distros decided "just don't give sudo rights to pip" isn't enough protection.

I genuinely don't understand why "easily installing *system-wide* packages" is such a priority. But you *can* do this with venvs (with sudo of course) - for example: make a venv somewhere within /opt and install there - and if it's an "application" with a driver script, maybe even symlink it from /usr/local/bin. In fact, newer versions of pipx (https://pipx.pypa.io/stable/) wrap this exact process for applications (using a shared vendored copy of pip). I have more about this in a personal blog article here: https://zahlman.github.io/posts/2025/01/07/python-packagi...

It may help to think of the virtual environment as effectively a separate installation of Python - because it's close to that, just with symlinks to the "base" Python executable and special access to the standard library, instead of copying it all. If you get annoyed with remembering to "activate" a virtual environment, keep in mind that this actually does very little. Generally you can get by with just giving the path to the virtual environment's `python` symlink.
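
A quick way to see the "separate installation" framing for yourself (the path below is hypothetical; any venv's interpreter will do):

    # Run with the venv's own interpreter, e.g. /opt/myapp/venv/bin/python,
    # with no "activation" step at all.
    import sys

    print(sys.prefix)                      # points into the venv directory
    print(sys.base_prefix)                 # points at the base Python it was made from
    print(sys.prefix != sys.base_prefix)   # True inside a venv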

(There's no need for all-caps vitriol.)

Posted May 22, 2025 17:07 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

Fun fact: If a distro wants to support --user, it's not outrageously hard to do so. You just have to make sure that any system packages that require Python set the -s flag (on /usr/bin/python) or the PYTHONNOUSERSITE environment variable (to any value), and then whatever is in the user's home directory is silently ignored. I imagine there are varying levels of difficulty here:

* For stuff managed by whatever init you are using (let's not start a systemd holy war here), this should be trivial.
* For stuff managed by systemd --user (or any equivalent, as well as session-related stuff like GDM/KDM/login/getty/etc.), this should also be trivial.
* That just leaves binaries that the user runs. If it's run through the desktop shell, this can be configured in the .desktop file, and if it's run from the command line, it can be configured in the shebang line... unless you are using the #!/usr/bin/env python trick, in which case you already used up your one argument and cannot pass flags through to Python (or, for that matter, ask env to set the environment variable for you). But if you're the distro, you should know where Python is installed and do not need to use /usr/bin/env in the first place.

I imagine the main reason they don't do this in practice is the shebang. All of the other things can plausibly use the environment variable, which is preferable because it Just Works and does not require pervasively editing /usr/bin/python command line flags everywhere. The shebang, on the other hand, does require pervasive editing and also has the downside of not being automatically inherited by child processes. So it's a question of whether distros are willing to spend the time to support that.
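
For anyone who wants to poke at the mechanism, here is a small diagnostic sketch: run it once plainly and once with -s (or with PYTHONNOUSERSITE set) and compare the output. The exact user-site path depends on the platform:

    import site
    import sys

    # With "python3 -s" or PYTHONNOUSERSITE=1, ENABLE_USER_SITE is false/None and
    # the user's per-user site-packages directory stays off sys.path entirely.
    user_site = site.getusersitepackages()
    print("user site:  ", user_site)
    print("enabled:    ", site.ENABLE_USER_SITE)
    print("on sys.path:", user_site in sys.path)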

(There's no need for all-caps vitriol.)

Posted May 29, 2025 21:16 UTC (Thu) by zahlman (guest, #175387) [Link]

> unless you are using the #!/usr/bin/env python trick, in which case you already used up your one argument and cannot pass flags through to Python (or, for that matter, ask env to set the environment variable for you). But if you're the distro, you should know where Python is installed and do not need to use /usr/bin/env in the first place.

More to the point, the distro *shouldn't* use /usr/bin/env this way, because that allows the user to override which Python is used - in particular, after "activating" a virtual environment, that one would generally be used instead, which will be Bad for system-provided scripts.

(But for the user's use, some versions of env support the -S flag, which *does* allow for passing command-line options like -s along to Python. And, of course, you can always put a shell wrapper around the startup process too.)

> I imagine the main reason they don't do this in practice is the shebang. All of the other things can plausibly use the environment variable, which is preferable because it Just Works and does not require pervasively editing /usr/bin/python command line flags everywhere. The shebang, on the other hand, does require pervasive editing and also has the downside of not being automatically inherited by child processes.

If the child is also Python code, there are other ways to run that code in a new process, but they aren't trivial changes.

I HATE PYTHON

Posted May 23, 2025 4:24 UTC (Fri) by raven667 (subscriber, #5198) [Link] (1 responses)

> I'm more of a Perl person than a Python person, and Perl happily lets you install both OS packages and packages from CPAN and manages to keep them in separate directories. Why can't Python do this?

I work with much more Perl than Python too, primarily on RHEL-style machines, and I don't run CPAN as root to install system-wide files; I make RPMs. Perl doesn't need a Python-wheel-style package format because the RPM integration with the Perl/CPAN ecosystem is _excellent_: by working directly with the system-wide package database, including dependency management, you don't end up with two different ecosystems with a partial dependency graph fighting each other and breaking stuff - you just use the normal RPM mechanisms for detecting and resolving conflicts.

From what I can tell the provided rpmmacros to pass Makefile.PL/Build.PL CLI options are comprehensive, and the provided scripts reliably collect the inbound and outbound runtime dependencies using simple regex matching for /^(use|require) (.+)/ and /^package "(.+)"/, with more specific builddeps templated in by `cpanspec`. It helps a lot that there is a standard naming convention mapping Perl packages to RPM package and library dependency names, e.g. Perl lib `Foo::Bar` is RPM `perl-Foo-Bar` and provides `perl(Foo::Bar)`.

Python setuptools can create an RPM, but it doesn't populate it with any of the data pip uses to resolve dependencies AFAICT, and the RPM build macros/scripts don't seem to have any way to parse import statements to make that list either, or an agreed-upon way to map between the two systems; if there are better tools which can give RPM the same info that pip/uv have I haven't found them yet.

I don't know enough about either ecosystem's native packaging metadata formats to comprehensively compare and contrast them - I don't use them; I just build my in-house Perl using RPM and let the built-in rpmbuild scripts handle it. But I have found building Python packages for system-wide install to be more complicated, and I never found a tool as good as `cpanspec` or even `rpmbuild-newspec -t perl`, such that I've pretty much given up on it and just package an entire venv as an RPM that other scripts can depend on, rather than making individual RPMs for each part of the dependency tree (for NAPALM).

I don't know if it's because Python and PyPI operate at a much larger scale than CPAN, so you see problems more often, or if the culture around backwards compatibility is very different, but with Perl I've rarely or never had an app with an incompatible shared dependency with some other system-wide installed utility, or even libraries with breaking changes over the last 10-15 years; in fact some of the code in the darkest depths of the application appears to be written in a perl4 style and still works fine. I've definitely had to be very careful with Python, because it's been far more common for some library in the dependency tree to bump a new major version that is incompatible and break the app at runtime; even when building a virtualenv I sometimes have to mask library versions until other parts catch up. I've also had the case where (re)installing a Python app in a venv with a version that is more than six months old becomes difficult unless you froze a list of exactly which versions of the entire venv library set were available at the time the app was initially released, because it's basically guaranteed that something will have made an incompatible change in that time and the app won't run until you figure out which libraries' versions you need to pin back - which is very tedious if you run an app for longer than a 6-week sprint cadence.

Maybe my old Perl code is just depending on stagnant unmaintained libraries, stable because they are old and unchanging, but most of what we use is in RHEL or EPEL so it must still be in active use by common shipping software, and we still haven't had nearly the problems we've had with Python. But I can't make a definitive statement as to the true reasons why.

I wouldn't mind a deeper dive comparing CPAN to PyPI and NPM to see what problems each solved differently and why. CPAN is the OG; maybe they learned something about how to do library management at scale, or maybe there are language support differences which mean that the same solution in one ecosystem can't be translated to the other for technical reasons, or maybe there are longstanding policy differences that preclude some family of technical solutions to these problems. I dunno, and can definitely stand to learn more about it.

I HATE PYTHON

Posted May 23, 2025 6:59 UTC (Fri) by AdamW (subscriber, #48457) [Link]

"if there are better tools which can give RPM the same info that pip/uv have I haven't found them yet."

Sure there are. Modern Fedora (and RHEL, 10.x at least; I don't remember how much is in 9.x) Python packages - with sane upstreams - can automatically generate both build and runtime dependencies and provides. Fedora Python packages always provide "python3dist(modulename)" for each module they contain; this is auto-generated.

https://docs.fedoraproject.org/en-US/packaging-guidelines... is the sample spec from the modern packaging guidelines. You can see that there's very little in it, because an awful lot is done automatically.

breaking compatibility is like nicotine

Posted May 21, 2025 16:57 UTC (Wed) by ballombe (subscriber, #9523) [Link] (4 responses)

...the only way to stop is not to start.

How could they break 12000 packages and not notice?

breaking compatibility is like nicotine

Posted May 22, 2025 12:06 UTC (Thu) by lunaryorn (subscriber, #111088) [Link] (3 responses)

Well, it's all about perspective.

PyPI hosts about 630,000 packages by its own account. You say they broke 12000 packages and didn't notice. I'd say setuptools broke only two percent of all known packages.

Now, finding those two percent might present a bit of a challenge. I don't think Python has any kind of infrastructure for large scale automated rebuilds across the whole ecosystem to test for such regressions.

Considering this, and the wildly heterogeneous Python packaging and tooling landscape, I find it surprising that they only broke 12000 packages.

breaking compatibility is like nicotine

Posted May 22, 2025 17:05 UTC (Thu) by mb (subscriber, #50428) [Link] (2 responses)

>Now, finding those two percent might present a bit of a challenge.

One of their own dependencies broke due to the change:

> In fact, this change was already known to be incompatible with the popular Requests library [...]
> The 78.0.1 patch simply removed Requests from the Setuptools integration tests

breaking compatibility is like nicotine

Posted May 23, 2025 14:09 UTC (Fri) by lunaryorn (subscriber, #111088) [Link]

Yes, I've read that, and - among other things - it sure indicates that this change wasn't quite done well. From what I've seen, the setuptools maintainers sometimes seem to have... peculiar opinions.

But we should still consider the number in perspective.

The setuptools maintainers really don't have a chance at all to assess the impact of their changes upfront. The whole Python ecosystem is neither prepared nor equipped for large-scale testing of its foundations. It desperately lacks the equivalent of Rust's crater, all the more so given its fragmented tooling.

Their only way to see what they'd break is to roll out the change and see. Considering this, it's quite surprising that things don't break more frequently, and in worse ways.

breaking compatibility is like nicotine

Posted May 29, 2025 21:24 UTC (Thu) by zahlman (guest, #175387) [Link]

Just to clarify (in case I didn't adequately communicate it in the article):

Requests isn't a dependency of Setuptools; rather, the *tests* for Setuptools would *try to install* a separate copy of Requests - from source - in a separate environment, for the purpose of testing integration with pip. Setuptools doesn't try to download anything by itself (it asks pip to do so in these tests), so it doesn't have any use for Requests.

In 78.0.1, Requests was removed from the set of example packages used for this kind of integration testing. (As far as I can tell, this was *not* rolled back when the breaking change was rolled back.) The rationale was that pre-built versions of Requests already exist; this "build" process is trivial (no code is actually compiled, metadata is just shuffled around); and the overwhelming majority of Requests users will get the pre-built version (you essentially have to request the source version explicitly if you want it, and there's normally no reason to do so for anyone except perhaps Linux distro maintainers).

2to3?

Posted May 21, 2025 17:15 UTC (Wed) by mb (subscriber, #50428) [Link] (18 responses)

>it would be nice to have breaking changes bundled together so that they occur less often.

Just like, ehrm... the success story of the 2 to 3 transition?

2to3?

Posted May 22, 2025 13:46 UTC (Thu) by zahlman (guest, #175387) [Link] (17 responses)

My personal experience was: I could see that 3.0 and 3.1 were botched and people were arguing about the Unicode string prefixes still, so I skipped them. Starting on 3.2 was a breath of fresh air - a whole bunch of things I'd considered serious design problems with the language had simply been fixed, and things *worked the way they were supposed to*. I no longer had to wonder why decoding a string from raw byte data could produce an error claiming something went wrong with encoding, or vice-versa. Beginners were no longer being taught on day 1 to introduce arbitrary code execution exploits into their programs (`input()`). The new print function allowed for amazing and elegant things; and `print 1,` now meant the same thing as `print (1,)`, like you'd expect from basically everything else in the language; and redirecting to a file didn't involve arcane C++-like syntax. There was no longer a speed bump between the `int` and separate `long` types. And so on and so forth.

And then I looked around me at people grumbling and dragging their heels, and I had no comprehension of it.

And they got extra time to fix everything, and missed the official deadline, and kept grumbling after that, and I still didn't understand it.

And I still don't.

2to3?

Posted May 22, 2025 17:15 UTC (Thu) by mb (subscriber, #50428) [Link] (3 responses)

>and I still didn't understand it.
>And I still don't.

Big break-the-world changes are bad, because they cost a lot of time and a lot of money to fix, all at once.

For me it is easy to make small deprecation removal changes together with the required feature development. I don't have to ask for additional time or money. It can be done during the normal development process and nobody complains.

But it is *much* harder to get time and money for doing the one big major-version-bump transition that broke the world.
For lots of projects that's basically impossible to do.

I still maintain Python 2 scripts and they will never be replaced with Python 3. Ever.
Python 2 code is even still being actively developed, because the whole ecosystem of the project is stuck at a Python 2 interpreter. And that also will not change for these projects. Ever.
Python 2 will have to die no earlier than the projects that still use it.

Please don't make this mistake with anything ever again.

2to3?

Posted May 29, 2025 21:52 UTC (Thu) by zahlman (guest, #175387) [Link] (2 responses)

> For lots of projects that's basically impossible to do.

I don't suppose you know any places that are willing to hire or contract people to try anyway?

2to3?

Posted May 29, 2025 21:58 UTC (Thu) by mb (subscriber, #50428) [Link] (1 responses)

I think you are 10 years too late ;-)

2to3?

Posted Jun 2, 2025 0:33 UTC (Mon) by zahlman (guest, #175387) [Link]

... But you say that there is still active 2.x development somewhere. I would like to discourage this or steer it away if possible, because I consider 2.x to be actually broken.

2to3?

Posted May 22, 2025 22:22 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (12 responses)

Python was horribly broken until around 3.5, when they finally re-allowed "+" and format operations for mixed strings/bytes ( https://peps.python.org/pep-0461/ ).

I spent countless hours debugging issues caused by an error handler somewhere that did not have a correct encode/decode for one of the arguments. It was painful.

2to3?

Posted May 29, 2025 21:49 UTC (Thu) by zahlman (guest, #175387) [Link] (11 responses)

Python 3 before 3.5 supported concatenation of bytes to bytes (and of course str to str); Python 3 today still does not support concatenation of bytes to str or vice-versa, and won't going forward. The bytes type also still doesn't have a .format method, although minds might more conceivably be changed about that.

PEP 461 is about the % operator for formatting, which still doesn't allow mixed types - it just allows you to have, for example, a byte with value 37 followed by a byte with value 115 in a bytes (or bytearray) and then replace those with a different sequence of bytes.
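
A small illustration of the distinction, on Python 3.5 or later:

    # What PEP 461 added: %-formatting where both the template and the argument are bytes.
    header = b"Name: %s" % (b"value",)
    assert header == b"Name: value"

    # What it did not add: mixing a str into a bytes template still fails.
    try:
        b"Name: %s" % ("text",)
    except TypeError as exc:
        print("still rejected:", exc)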

It's a little hard to imagine how an error handler failing to convert between str and bytes could cause such a headache - because an error handler would presumably be formatting *an error message*, which means the left-hand side should clearly be a string *literal*. And in Python 3 before 3.5, just as today, formatting a bytes into that with % would simply give the wrong result (with a 'b' prefix and quotes and escapes). And once you see badly formatted error messages in the logs, you just track them down to the handler according to the template they're formatted into. (Supposing, for the sake of argument, that the formatting isn't actually beneficial in determining what went wrong in the code that had the original error.)

If you mean that code decoded byte data into strings in order to be formatted, and then wasn't properly converted back to bytes, I sympathize; but the original 2.x code in this case *was then already broken* and could have easily failed silently instead (by implicitly using the wrong encoding on either side, or both sides). I understand that many data protocols use "tags" in a binary data stream that are intended to be readable as ASCII (e.g. fourCCs, or HTTP headers interpreted before an encoding has been negotiated); but insertion into such binary data is fundamentally an array-splicing operation, not a text formatting one. (Or if you create HTTP headers from scratch, you must sanitize the data anyway - see https://stackoverflow.com/questions/5251824 - and catching an exception from .encode('ASCII') is as good a way as any.)

2to3?

Posted May 30, 2025 1:33 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

> PEP 461 is about the % operator for formatting, which still doesn't allow mixed types - it just allows you to have, for example, a byte with value 37 followed by a byte with value 115 in a bytes (or bytearray) and then replace those with a different sequence of bytes.

Yes, thanks for the correction.

> It's a little hard to imagine how an error handler failing to convert between str and bytes could cause such a headache - because an error handler would presumably be formatting *an error message*, which means the left-hand side should clearly be a string *literal*.

You lose the original error message and get a useless type mismatch exception as a result. Which is NOT helpful when you're trying to debug the code that led to the error using logs.

> If you mean that code decoded byte data into strings in order to be formatted, and then wasn't properly converted back to bytes, I sympathize; but the original 2.x code in this case *was then already broken* and could have easily failed silently instead (by implicitly using the wrong encoding on either side, or both sides).

Py2 was almost perfect for data processing. The strings were just bytes and encodings could be ignored, for the most part. That's how Go treats strings, btw.

And I have _never_ _ever_ had a case where Py3 strings helped me. Not once was I in a situation: "Wow! That string/bytes split saved the day for me, good job Python developers!"

2to3?

Posted May 30, 2025 7:46 UTC (Fri) by mb (subscriber, #50428) [Link] (5 responses)

>And I have _never_ _ever_ had a case where Py3 strings helped me

I guess you probably only process texts with ASCII compatible encoding, right?

For people processing texts with non-ASCII encodings, py3 is a *huge* step in the right direction.
It basically put an end to incorrect decoding, as it forces developers to do the correct thing.

Ich möchte nicht mehr zurück. (I don't want to go back.)

2to3?

Posted May 30, 2025 21:34 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> I guess you probably only process texts with ASCII compatible encoding, right?

If you want, I can tell you the tales of DOS vs. DOS Alternative encodings for matrix printers. Or about KOI vs. Win1251 settings for the Linux console fonts.

The problem is that you can't really do anything meaningful with true strings, unless you:

1. Just work within ASCII.
2. Know EXACTLY what you're doing and have enough context information.

Case 1 is trivially handled by UTF-8 encoding (ASCII stays ASCII), and it actually covers the majority of needs. Case 2 requires you to be aware of the current locale (i->I or i->İ), permissible string splits, invisible characters, LTR switches, etc.

So Python ends up with the disadvantages of both approaches. Python strings are not just bags of bytes; they are some magic black boxes that mere mortals are not supposed to open. Yet they don't provide enough context information to do "interesting" stuff, like being able to correctly split strings at word breaks, or handle case conversion.

2to3?

Posted May 30, 2025 21:43 UTC (Fri) by mb (subscriber, #50428) [Link] (3 responses)

Nope. The actual problem was that many developers only ever tested ASCII and delivered their software if it was "good". And that's trivially true, just as you said.
Today that breaks.
And that's a good thing.
If you handle strings you *need* to be aware of (2). Always.

2to3?

Posted May 30, 2025 21:46 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> Nope. The actual problem was that many developers only ever tested ascii and delivered their software if it was "good".

And I would take that over Py3 string mess. It could have been easily fixed by just setting the string encoding to utf-8.

> If you handle strings you *need* to be aware of (2). Always.

No, you don't. If you're only using string operations for protocol-level stuff, like splitting the HTTP headers on a ":" character, then you can just treat strings as bags of bytes. Same for filesystem operations.

2to3?

Posted May 30, 2025 21:49 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

yeah, right. If you don't use strings ("just treat strings as bags of bytes"), then you don't need to use strings.
That's trivially true and I don't see the problem. Just use bytes.

2to3?

Posted May 30, 2025 21:58 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> yeah, right. If you don't use strings ("just treat strings as bags of bytes"), then you don't need to use strings.

That is correct. The problem is that the type for string literals in Python is, well, a string. It pops up in a myriad of places in the standard library.

For example, the exception message is (you guessed it) a string.

> That's trivially true and I don't see the problem. Just use bytes.

Well, I recommend you try your own advice.

2to3?

Posted Jun 2, 2025 1:40 UTC (Mon) by zahlman (guest, #175387) [Link] (3 responses)

> And I have _never_ _ever_ had a case where Py3 strings helped me. Not once was I in a situation: "Wow! That string/bytes split saved the day for me, good job Python developers!"

Every time that you open a file in text mode in Python 3, you are being helped by the new system - which enables Python to read those files in "universal text mode" while applying a text encoding at the same time. So now you just supply an 'encoding' keyword argument, and you don't have to separately decode each line (and don't need to worry about how newlines are encoded!). Nor do you ever wrestle with the 'codecs' standard library module.
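
As a small illustration (the file name here is hypothetical):

    # Text mode: decoding and newline translation happen as you read.
    with open("notes.txt", encoding="utf-8") as f:
        for line in f:                   # each line is already str, newlines normalized
            print(line.rstrip("\n"))

    # Binary mode: you get raw bytes and make every decision yourself.
    with open("notes.txt", "rb") as f:
        text = f.read().decode("utf-8")  # explicit, or it stays bytes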

Every time you specify the wrong encoding when opening a file, Python 3 helps you by forcing you to attempt the decoding at the correct point in your program - thus reporting the error at the time you read from the file, instead of arbitrarily far into the rest of your code.

Every time you *don't* see a UnicodeDecodeError from a line of code that calls .encode, or a UnicodeEncodeError from a line of code that calls .decode, you are being helped by Python 3's refusal to convert implicitly from one type to the other (with a poorly documented encoding choice) before using your own explicit conversion back - see e.g. https://stackoverflow.com/questions/5096776 and https://stackoverflow.com/questions/9644099 (and a whole pile of duplicates, whether or not they've been recognized as such yet).

Since Python 3.3 (see https://peps.python.org/pep-0393/), every time you use len() on a string that includes characters outside the BMP, Python 3 helps you by always correctly counting the characters you put into the string, and not *sometimes, implicitly* decomposing them into surrogate pairs because you have a "narrow Unicode build". It also helps you by saving space versus older "wide" builds in cases where you *don't* have such characters in the string. (For that matter, it can store an ASCII string as a proper Unicode string with only one byte per character.)

Python 3 lets you use a bytes object as a sequence of byte values, i.e., unsigned 8-bit integer values. Every time you index into it and get an integer result, rather than a length-1 bytes object, you benefit. Similarly, since Python 3.2, you benefit from int.from_bytes and int.to_bytes existing - an enhancement that would make little sense to add as long as there's still a risk of people trying to use bytes to represent text directly. Similarly, Python 3 gave you bytes.fromhex, and 3.5 added bytes.hex - which would again be nonsense when the type is used as a defective representation of text. (Needing to reach for strangely named standard library methods could be viewed as having been a kind of safety measure here.)
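
A few of those conveniences in one place (the values are arbitrary examples):

    data = bytes.fromhex("01ff")            # b'\x01\xff'
    print(data[0])                          # 1 - indexing yields an int, not a length-1 bytes
    print(data.hex())                       # '01ff' (bytes.hex arrived in 3.5)
    print(int.from_bytes(data, "big"))      # 511
    print((511).to_bytes(2, "big"))         # b'\x01\xff'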

Every time you had to deal with legacy binary data that has embedded text data, or text in a legacy encoding - say, script data from an old reverse-engineered game, with custom control codes to indicate who's speaking or to control the text scroll etc., or uses some bytes to represent custom glyphs with two ASCII characters on the same tile - Python 3 helped you by keeping the type calculus clear. It does this by reserving a type that only ever represents the raw data, and not any extracted, custom-decoded strings, which must be in the other type. I don't personally use type annotations for much of anything, but type checkers like Mypy certainly appreciate the extra clarity, too.

2to3?

Posted Jun 2, 2025 2:32 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> Every time that you open a file in text mode in Python 3, you are being helped by the new system - which enables Python to read those files in "universal text mode" while applying a text encoding at the same time.

Thanks for reminding me. Yes, I sometimes forget to add the binary mode and this often bites me later when Python inanely converts files.

Bonus point if the computer where it's deployed has an ISO encoding, and not UTF-8.

> Every time you specify the wrong encoding when opening a file, Python 3 helps you by forcing you to attempt the decoding at the correct point in your program - thus reporting the error at the time you read from the file, instead of arbitrarily far into the rest of your code.

Or not reporting, and just silently corrupting the data.

> Since Python 3.3 (see https://peps.python.org/pep-0393/), every time you use len() on a string that includes characters outside the BMP, Python 3 helps you by always correctly counting the characters you put into the string

Sigh. Now it's clear that people like you are the result of this string disaster. You never once stopped to ask: "WHAT is the length of string?"

Is it the number of characters? The number of code points? The number of graphemes? What exactly do you mean by it? Why do you even _want_ to know the length of a string?

Py3 string length does not even uphold this equality: "len(string) = len(split1) + len(split2)" for all arbitrary strings split on the "character" boundary!

2to3?

Posted Jun 2, 2025 10:23 UTC (Mon) by intelfx (subscriber, #130118) [Link] (1 responses)

> Sigh. Now it's clear that people like you are the result of this string disaster.

Whatever point you were trying to make, I'm sure it would have been better off *without* this personal attack.

2to3?

Posted Jun 2, 2025 12:51 UTC (Mon) by Wol (subscriber, #4433) [Link]

I can understand Cyberax' frustration ...

To clarify his point (and yes it took me a while to get it)

"why do other people think it's perfectly okay to trash my data when I haven't told them to!".

File opens should NOT DEFAULT to "assume the contents are mutable and can be altered by the computer without warning". It's a disaster in waiting ...

Still, it goes back as far as ftp, so that's a long way - I can still remember trying to recover the contents of a binary data file that had been transferred in text mode ... and even if it is text, I can remember trying to get printers to work because some stupid system thought it would treat it according to the system convention not the printer definition ...

Cheers,
Wol

I don't get why people expect no breaking changes

Posted May 21, 2025 18:30 UTC (Wed) by vmpn (subscriber, #55435) [Link] (11 responses)

If you want software to be maintainable long term, you have to have the ability to make backwards-incompatible releases to clean up technical debt.

While breaking consumers of the app is not great, they have to bear some responsibility for consuming a new major version that documents the incompatibility.

I don't get why people expect no breaking changes

Posted May 21, 2025 18:58 UTC (Wed) by mb (subscriber, #50428) [Link] (7 responses)

>some responsibility for consumption of new major version

Well, have you read the article section "Reporting and updating"?

>If you want software to be maintainable long term,
>you have to have ability to make backwards incompatible releases to clean up technical debt.

That depends on what the technical debt is, what the advantage of removing it is and how many downstream users are affected.
This ratio doesn't look too good for a hyphen removal that breaks 12000 packages.

I don't get why people expect no breaking changes

Posted May 21, 2025 19:16 UTC (Wed) by vmpn (subscriber, #55435) [Link] (6 responses)

> Well, have you read the article section "Reporting and updating"?
I did, and I find the response - stating you've been warned for 3+ years and took no action to remediate, so please don't be surprised your workflow broke - right on the money.

I don't get why people expect no breaking changes

Posted May 21, 2025 19:51 UTC (Wed) by mb (subscriber, #50428) [Link] (5 responses)

It takes one package in the dependency chain that I don't have control over to break the build.

Please always keep in mind the benefit and the costs of breaking changes as a trade off.
A simple breakage that breaks 12k packages easily wastes several years of net human worktime. So is it worth it to save you 5 hours of maintenance cost during the next 5 years?

It's always easy to expect everybody to update and do the work, if you don't have to actually spend/pay the work.

I don't get why people expect no breaking changes

Posted May 22, 2025 0:40 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

> It's always easy to expect everybody to update and do the work, if you don't have to actually spend/pay the work.

We can turn that around very easily. The Setuptools developers were, technically, under no obligation to revert the change at all. They could just as easily have said "What part of 'NO WARRANTY' and 'major version bump' did you not understand?" and WONTFIXed all the bug reports. That they were generous enough to revert their change should be taken with gratitude, not contempt for the change having been made in the first place.

I don't get why people expect no breaking changes

Posted May 22, 2025 5:51 UTC (Thu) by mb (subscriber, #50428) [Link]

I was not talking about whether to revert or not, but whether to deprecate or not.

I don't get why people expect no breaking changes

Posted May 22, 2025 10:00 UTC (Thu) by tux3 (subscriber, #101245) [Link] (2 responses)

If we take the view that the setuptools developers are under no obligation to play nice and be considerate, then it follows that their users are under no obligation to respond with gratitude. You're perfectly allowed to do that, but it always cuts both ways.

You can take a "let others deal with it" attitude, but you can't force users to like it. Doing unpopular things is a reliable way to become unpopular.

I don't get why people expect no breaking changes

Posted May 29, 2025 5:05 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

FOSS is a gift. Gratitude is the normal social response to gifts. Nobody was forced to use Setuptools. Or at least, the Setuptools developers themselves did not force anyone to use Setuptools.

I don't get why people expect no breaking changes

Posted May 29, 2025 6:41 UTC (Thu) by Wol (subscriber, #4433) [Link]

> FOSS is a gift. Gratitude is the normal social response to gifts.

??? Depends on the giver.

I don't remember the story (I don't like Dickens), but he wrote a wonderful short story about a Christmas meal thrown for "the deserving" (or maybe the undeserving).

Oh - and as far as development "aid" is concerned, most gift horses most definitely SHOULD be looked in the mouth.

Cheers,
Wol

I don't get why people expect no breaking changes

Posted May 22, 2025 17:08 UTC (Thu) by dbnichol (subscriber, #39622) [Link] (2 responses)

There are two differences here from a regular library API break:

1. This is a build tool. The expected input to it is released software that, by definition, can't be changed to adjust to the API changes. Imagine if GNU Make decided to stop supporting some old Makefile features that have long been considered undesirable. That might be just fine for your project that builds with make, but you could hardly expect this new version of GNU Make to be adopted since it would break builds of released software. In other words, breaking the API of a build tool has a big impact since the typical API consumer is frozen.

2. Unfortunately for setuptools, the latest version is used implicitly for all old Python source releases. That puts it in a situation where API breaks are especially damaging since there's no opt-in. With hindsight, it probably would have been better to take distutils from Python 2.7 or something like that and brand that as the implicit default build tool.

I don't get why people expect no breaking changes

Posted May 27, 2025 10:25 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

Python dumped distutils in favour of setuptools, because being too static and unmoving is not a good characteristic in Python.

I don't get why people expect no breaking changes

Posted May 29, 2025 22:00 UTC (Thu) by zahlman (guest, #175387) [Link]

> With hindsight, it probably would have been better to take distutils from Python 2.7 or something like that and brand that as the implicit default build tool.

distutils specifically would just not have cut it. By the time it was removed from the standard library it was considered an unmaintainable mess that nobody really knew what to do with.

It would have been nice for Setuptools to be able to extract a "core" build system that is restricted to the part that actually implements PEP 517 (discussed in other comment threads), which could then be used separately from the part that enables workflows on the developer's machine. Flit and Hatch work this way (using flit-core and Hatchling respectively). However, a "Setuptools-core" would have to have been restricted to only handling projects that can configure everything in `setup.cfg` (and `pyproject.toml`) since `setup.py` is the same file used for implementing, for example, arbitrary new sub-commands for command-line developer use, or for setting up tests in arbitrary Python code (subsequently run with `setup.py test`).

Hidden build warnings?

Posted May 22, 2025 0:40 UTC (Thu) by marcH (subscriber, #57642) [Link] (11 responses)

> Regarding front-end tools failing to expose warnings, Bravalheri proposed that installers should show build warnings to end users by default.

Dear Lazyweb, how can hiding build warnings ever be considered a good idea? Baffles me.

Hidden build warnings?

Posted May 22, 2025 6:44 UTC (Thu) by Wol (subscriber, #4433) [Link] (8 responses)

> Dear Lazyweb, how can hiding build warnings ever be considered a good idea? Baffles me.

Do you really want no-clue-bies to be shown an endless stream of warnings? I get p***ed off with dealing with people who don't want to understand. Do I really want to have to deal with even more people who *can't* understand?

Don't get me wrong, hidden warnings I don't know how to find are a real pita. But no-clue-bies panicking over harmless warnings they don't understand are worse.

Cheers,
Wol

Hidden build warnings?

Posted May 22, 2025 9:17 UTC (Thu) by ceplm (subscriber, #41334) [Link] (1 responses)

Yes and yes. This is UNIX, we are supposed to be all consenting adults here. If somebody doesn’t understand what happens when he runs `rm -rf`, he will learn soon.

Hidden build warnings?

Posted May 22, 2025 13:57 UTC (Thu) by zahlman (guest, #175387) [Link]

Well, no, that's the thing. Python and pip are cross-platform, and the same build process can be triggered on a Windows end-user's machine just as easily.

Hidden build warnings?

Posted May 22, 2025 19:50 UTC (Thu) by marcH (subscriber, #57642) [Link] (5 responses)

> Do you really want no-clue-bies to be shown an endless stream of warnings?

1. People _building_ things are not supposed to be "no-clue-bies" (unfortunately: https://lwn.net/Articles/1022219/), but even then, people running "pip install ..." are supposed to be somewhat computer-literate.

2. Countless times, I was the very first one to stop some old build warning in a project with a _very experienced_ team. Why? Because NO ONE looks at build logs unless the build fails. And even when it fails, people try to read as little as they can - only what they need to fix the build and carry on. So, for the sake of the 0.0001% of developers who do look at build warnings, do NOT hide build warnings - ever. Of course this does not mean "spam the build logs with a ton of garbage". But a "warning" should by definition not be garbage.

Hidden build warnings?

Posted May 22, 2025 20:08 UTC (Thu) by pizza (subscriber, #46) [Link] (3 responses)

> 1. People _building_ things are not supposed to be "no-clue-bies" (unfortunately: https://lwn.net/Articles/1022219/), but even then, people running "pip install ..." are supposed to be somewhat computer-literate.

You're the one who keeps going on about how *users are not programmers*.

Someone cut-n-pasting instructions they found elsewhere isn't going to be expected to understand (much less fix) these warnings. They just want to use that software.

(At $dayjob-1, those users consisted mostly of PhDs playing around with bleeding-edge research tools. I'm the one who had to somehow shoehorn that "works for me on $desktop" mudball into something that could make it through a CI environment for deployment onto custom hardware that had some very strict python packaging requirements due to interaction with $$$ EDA tooling)

Hidden build warnings?

Posted May 22, 2025 23:20 UTC (Thu) by marcH (subscriber, #57642) [Link] (2 responses)

> > are supposed to be somewhat computer-literate.

> You're the one who keeps going on about how *users are not programmers*.

I know it's getting more and more difficult to accept nowadays but it's generally impossible to classify people in only two binary categories. Someone using a command-line can be neither an iPhone user, nor a developer. There are plenty other literacy levels in the middle. You just gave a great example yourself.

> Someone cut-n-pasting instructions they found elsewhere isn't going to be expected to understand (much less fix) these warnings. They just want to use that software.

So what?

If they copy/paste instructions they don't understand, they ALSO ignore warnings they don't understand; fact! So, there is absolutely zero need to hide any of those warnings (as long as they don't fill up the terminal). It makes absolutely zero difference to careless people who don't know what they're doing, and it makes a huge difference for the 0.0001% of people who care.

I understand it can be counter-intuitive to care for the 0.0001%. But based on years of experience, it works great.

Also, when their house of pip cards finally blows up and they call you for help, YOU can see those warnings.

Hidden build warnings?

Posted May 22, 2025 23:37 UTC (Thu) by pizza (subscriber, #46) [Link] (1 response)

> Also, when their house of pip cards finally blows up and they call you for help, YOU can see those warnings.

That was most of my last job.

No thank you.

Hidden build warnings?

Posted May 23, 2025 8:16 UTC (Fri) by taladar (subscriber, #68407) [Link]

You might not like the job in general, but it certainly doesn't get any easier if you first have to talk them through sending you some obscure log file and then interpret the very same warnings they could have sent you far more easily if they had just been shown to them.

Hidden build warnings?

Posted May 27, 2025 10:29 UTC (Tue) by LtWorf (subscriber, #124958) [Link]

Since with Python it is impossible to distribute an application, it's normal that applications tell users to run pip install to obtain them.

Hidden build warnings?

Posted May 22, 2025 13:54 UTC (Thu) by zahlman (guest, #175387) [Link] (1 responses)

The warnings would mostly be shown to people who didn't necessarily even *expect that a build would take place*. Pip doesn't warn in advance that it's going to install from an sdist, because it doesn't know in advance that it might have to (in order to satisfy dependency version constraints).
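For readers who want to rule out surprise builds entirely, one workaround is to tell pip to accept only pre-built wheels; a minimal sketch (the package name is just a placeholder):

    # Refuse anything that is not a pre-built wheel; if only an sdist is
    # available, the install fails instead of silently starting a build.
    pip install --only-binary=:all: somepackage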

You may also be interested in https://github.com/pypa/pip/issues/9140 .

Hidden build warnings?

Posted May 22, 2025 20:01 UTC (Thu) by marcH (subscriber, #57642) [Link]

> The warnings would mostly be shown to people who didn't necessarily even *expect that a build would take place*.

Darn.

I think these build warnings should still be shown. In fact, pip should not even try to hide the difference between building something and not building anything; why would it do that? User interface always matters, but "pip install" is not something "consumers" are expected to use; it's not an iPhone that should pretend everything is fine when some glitch happens.

Also, it's incredibly hard to deprecate anything. A very reasonable amount of "spamming" end-users can only help.

Googling error messages and warnings is about the most useful troubleshooting step there is; everyone does it all the time. Don't prevent that.

Just stop hiding stuff already. Ignoring logs is easy; everyone does it ALL THE TIME. Digging out useful warnings buried in obscure log files with unknown names, or behind some obscure --debug-whatever option, is the most time-consuming and horrific user experience there is.

> You may also be interested in https://github.com/pypa/pip/issues/9140 .

I'm not interested because I wrote "dear Lazyweb" :-D
Also, Python packaging looks like something where you can too easily lose... mental health points. A bit disappointing for a scripting language, but what do I know (not much).

OK, I read the description of 9140 and I think I agree with it. I upvoted it.

Frequent updates

Posted May 22, 2025 5:47 UTC (Thu) by geofft (subscriber, #59789) [Link] (3 responses)

> One might argue that the simple fact that Setuptools is on version 78 points to a problem in itself — that Setuptools development is moving far too quickly. This is not entirely fair: about half of Setuptools major versions date to before the implementation of the new pyproject.toml-based system, with version 39 being released in March 2018. However, we still now see about five new major versions of Setuptools per year. Progress on updating legacy projects to modern standards, needless to say, has been far slower. While I'd prefer not to see Setuptools development artificially slowed, it would be nice to have breaking changes bundled together so that they occur less often.

This is intuitively appealing from the point of view of "Wow, I was just broken by an update," because you extrapolate from that to the belief that _every_ update might break you. But the whole point of frequent releases like this is the exact opposite: that almost no updates do break you.

Keep in mind that the question here is not how much breakage happens. That happens at its own pace regardless of version number. The question is what the experience is of having the breakage happen to you. If there are three breaking changes that affect you happening in the span of five years, again, it's intuitively appealing to say that you want to be disrupted once instead of three times. But actually dealing with all three changes at once is much more difficult. If you don't have a handle on what broke, you have much more code to bisect. If you are trying to adopt a new feature, you now have a roulette wheel that will with one-third probability impose three times the work on you before you are able to adopt that feature. And in any case, you still have to actually deal with all three changes, so you're not doing any less work.

If you really, really want to have coarser-grained updates, there are various ways to pull that off; you can set a global constraints file on setuptools and upgrade versions rarely, you can use some sort of frozen mirror, you can use the OS-packaged version of setuptools, etc. If you're getting all 78 versions of setuptools, that's because you're installing directly from PyPI, and presumably have decided that subjecting yourself to whatever version of everything is current on PyPI is worth doing.
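As a concrete sketch of the constraints-file approach mentioned above (the version cap is purely illustrative, and the PIP_CONSTRAINT environment variable is assumed because, unlike the plain --constraint option, it is also picked up when pip populates isolated build environments):

    # constraints.txt -- cap the build backend; the exact pin is illustrative,
    # not a recommendation.
    setuptools<78

    # Export the constraint so isolated build environments see it too, then
    # install as usual ("some-sdist-only-package" is a placeholder).
    export PIP_CONSTRAINT=$PWD/constraints.txt
    pip install some-sdist-only-package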

The Linux kernel, for instance, is on its 74th release in the 2.6.x series. It has a little more whimsy in its actual version numbering scheme than setuptools' spartan integers, but it's not actually conveying anything different. It has an infamously strict rule on backwards compatibility. And yet people do not upgrade to every minor release and pay good money for the ability to stay on a minor release for a decade (or more). I don't think I've ever seen a proposal that the fact that the kernel is on version 6.14 means that kernel development is moving far too quickly, and it would be nice to have fewer mainline kernel releases, each with a heftier set of changes.

Frequent updates

Posted May 22, 2025 11:46 UTC (Thu) by pizza (subscriber, #46) [Link]

> Keep in mind that the question here is not how much breakage happens. That happens at its own pace regardless of version number. The question is what the experience is of having the breakage happen to you. If there are three breaking changes that affect you happening in the span of five years, again, it's intuitively appealing to say that you want to be disrupted once instead of three times. But actually dealing with all three changes at once is much more difficult.

You forget that not everyone has the luxury of upgrading everything they have deployed in lock-step. Especially given that every [sub-[sub-[sub-]]]component they depend on has its own independent lifecycle.

...Case in point being setuptools iterating about 5x faster than actual Python releases, and 10-25x faster than most deployment environments [1]. But more anecdotally, the lion's share of my time at $dayjob-1 was spent constantly cleaning up the messes from the impedance mismatch between some rapidly-iterating bleeding-edge dependencies, others provided by EDA tool vendors [2], and everything in between.

[1] i.e., LTS/EL platforms with 2+ year release cadences and ~5-year standard support lifecycles
[2] More accurately, python code _generated_ by said EDA tools.

Frequent updates

Posted May 22, 2025 12:01 UTC (Thu) by aragilar (subscriber, #122569) [Link]

I would agree that lots of small breaking changes tend to be easier to handle than a large one (as long as the effort to fix them is minimal, which is not always the case). I think, though, this all assumes said changes are worth the pain, and maybe in some cases a major break is better because it forces a reset of how things are thought of.

From what I've seen of setuptools issues, because of its long legacy, any changes setuptools makes to try to modernise will break large numbers of packages (and because wheels as distribution artefacts vs. build caches are in many ways a least-common-denominator option, people keep relying on legacy distutils and setuptools features). This then means that as soon as a package's packaging config is not kept up to date (even though it may need no other changes), issues arise. It also means the long tail of versions breaks in painful ways.

The obvious solution is to be picky about your dependency tree, and vendor and rewrite as needed, but if we're routing around issues with setuptools, at what point does it make sense to fork it (either freezing the old version, or making a more stable fork) rather than all the other packages?

Frequent updates

Posted May 22, 2025 14:06 UTC (Thu) by zahlman (guest, #175387) [Link]

The thing is that these kinds of breaking changes in Setuptools can break thousands of packages while still only impacting a very small percentage of packages. Both because of how big PyPI is, and because of ecosystem effects (when your package breaks, so do your dependents, transitively). If Setuptools bundled together multiple changes on the order of the one in version 78.0.0, it would IMHO be pretty unlikely that any given package had to deal with multiple of them at once. (Although the kinds of projects that are destined to be hit by multiple such changes eventually might be better off just switching to a different build backend now.)

Also, if Setuptools had a slower release cadence for major versions, packages could think more seriously about setting an upper bound on the Setuptools version used to build them, since they would need to re-evaluate that decision less frequently.
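For concreteness, such an upper bound would live in the package's pyproject.toml build requirements; a minimal sketch, with an illustrative version range rather than a recommendation:

    [build-system]
    # Cap the backend at the last major version the project was tested with;
    # the numbers here are purely illustrative.
    requires = ["setuptools>=61,<79"]
    build-backend = "setuptools.build_meta"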

License issues

Posted May 22, 2025 12:40 UTC (Thu) by jelly1 (subscriber, #70659) [Link]

Another recent issue with setuptools is that the MIT LICENSE file was removed [1], making it impossible to package in distributions. This has since been reverted, but in dependencies of setuptools the license file is still being removed. [2]

[1] https://github.com/pypa/setuptools/issues/5001
[2] https://github.com/jaraco/jaraco.itertools/issues/23

Empathy?

Posted May 22, 2025 13:08 UTC (Thu) by lunaryorn (subscriber, #111088) [Link] (4 responses)

I pity the setuptools maintainers.

They keep dragging around a big pile of odorous excrement from two decades of failed approaches to Python packaging, but whenever they make even the slightest attempt at cleaning up this huge mess, things break left and right, because there's always this one small, old, unmaintained package, with obscure packaging scripts no one understands anymore, that for some obscure reason half the world seems to depend upon, and for which three decades of deprecation period wouldn't be enough.

When they achieve the occasional trace amount of success, no one notices, no one appreciates it, no one pays.

But when they fail at a mere two percent of existing packages, the whole world comes down on them with indignation and outrage, often from people who would not even in their dreams live up to the standards of maintenance they seem to expect from the volunteer setuptools maintainers.

They never stand any chance. The community doesn't help them, and they don't have any kind of tooling for quality assurance on a scale of 630k packages.

We wouldn't expect a well-paid carpenter to build a skyscraper with but a hammer, but we somehow seem to expect unpaid volunteers to build the seven wonders of Python packaging in their spare time.

Empathy?

Posted May 27, 2025 10:33 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (3 responses)

Let's not forget that Python had distutils as part of the standard library but dropped it.

The entire idea that pip and setuptools and venv exist outside of the standard library and aren't part of Python is the mistake here.

Empathy?

Posted May 27, 2025 12:18 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 response)

How do you reconcile calling being outside of the CPython repository "the mistake" with:

> Because being too static and unmoving is not a good characteristic in python.

Is the stdlib far harder to change than an external package? You also get the possibility of using newer {venv,pip,setuptools} with older Python releases (or older versions with newer Python in the case of regressions[1]).

Empathy?

Posted May 28, 2025 17:25 UTC (Wed) by raven667 (subscriber, #5198) [Link]

I read that with a little sarcasm, which makes more sense to me.

Empathy?

Posted May 29, 2025 22:13 UTC (Thu) by zahlman (guest, #175387) [Link]

In point of fact, `venv` *is* part of the standard library, and pip is bootstrapped by the standard library `ensurepip`.

The main advantage of this setup is that it allows pip to be developed by a separate team, *with a separate release cadence*, which also automatically backports the new features to all the currently-supported Python versions (all the pip team has to do is test on those versions and be conservative about using new Python features).

The downside is that this model encourages copying pip into each virtual environment separately, and considerable know-how is required to manage a single copy of pip and empower it to install cross-environment (even though it *should* be easy and even though there's really nothing about being written in Python that actually causes a problem for this). There's also generally low awareness that pip (since version 22.3) *can* install cross-environment (although it uses a pretty brutal hack for this internally).
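The cross-environment capability referred to here is exposed as pip's --python option (added in pip 22.3); a minimal sketch, with the paths and package name chosen purely for illustration:

    # Create a virtual environment without its own copy of pip, then use a
    # single, centrally-managed pip to install into it.
    python -m venv --without-pip /tmp/demo-env
    pip --python /tmp/demo-env/bin/python install requests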

As for Setuptools, my understanding is that PEP 517 (and to a lesser extent 518 and 621) was *explicitly* designed to solicit alternative build backends. And there was really never a good way and a good moment to extract "just a" build backend from the project.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds