
I HATE PYTHON

Posted May 22, 2025 13:07 UTC (Thu) by Karellen (subscriber, #67644)
In reply to: I HATE PYTHON by geofft
Parent article: Recent disruptive changes from Setuptools

> but what if it installs a binary like bash or perl or python or something and then breaks existing system packages

Then that's a terrible, broken package that no-one should be installing on any system under any circumstances. Also, you should probably add its authors/maintainers to a list of people to avoid installing any software from in the future.

I guess I don't understand enough about python packaging to have any clue why a problem that wild would even be on your radar. It feels like an abyss I'm frankly kind of terrified to look into, for fear of dropping SAN points.



I HATE PYTHON

Posted May 22, 2025 14:19 UTC (Thu) by geofft (subscriber, #59789) (14 responses)

You think bash is a terrible, broken package? :)

I guess part of the problem here is that Python package managers install packages' dependencies too. So, yes, I agree that if `wget ftp.gnu.org/zardoz.tgz && tar xf zardoz.tgz && cd zardoz && ./configure && sudo make install` causes a /usr/local/bin/bash to exist, that is terrible, broken, and fully unacceptable. But `sudo pip install zardoz` may well pull in a newer version of a dependency (probably not bash, of course, but there are plenty of CLI tools written in Python, and occasional needs for CLI dependencies not written in Python).

Another way of looking at this is, _if_ there is a need for a newer version of some common dependency, something like automake will make you figure that out on your own, at which point you as a sysadmin will ponder what's going on before you install a newer version of bash to /usr/local/bin. The Python ecosystem prefers not to make you figure that out on your own. There are good arguments for both philosophies, but this is a fundamental part of why there are orders of magnitude more people who install Python packages than have ever typed the letters "./configure".
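
(A minimal sketch of the recursive-dependency point, if you want to see it concretely - this assumes some package, say "requests", is already installed:)

    import importlib.metadata

    # Every installed distribution declares its own dependencies, which the
    # installer resolves recursively - this is how one "pip install" fans out.
    print(importlib.metadata.requires("requests"))
    # e.g. ['charset_normalizer<4,>=2', 'idna<4,>=2.5', 'urllib3<3,>=1.21.1', ...]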

I HATE PYTHON

Posted May 22, 2025 16:22 UTC (Thu) by Karellen (subscriber, #67644) (13 responses)

> Python package managers install packages' dependencies too.

Even non-Python dependencies, which are generally distributed outside of Python package repos?!?

Oh. My. God.

That does clear up some confusion I was having around binary packages for Python though. Like, why do you need binaries for a scripting language at all?

Anyway, I've been doing some stuff in Python recently, just using the system python. I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

I HATE PYTHON

Posted May 22, 2025 16:53 UTC (Thu) by NYKevin (subscriber, #129325) (5 responses)

> Even non-Python dependencies, which are generally distributed outside of Python package repos?!?

Eh... sort of but not really. The de facto standard tool for installing truly non-Python dependencies is conda, but conda is an entirely different ball of wax, with its own separate package repositories etc., and most distros either do not package it at all, or at least do not install it by default. Conda also sets up a totally isolated Python installation, so it actually does play reasonably well with the system Python (in the sense that it usually does not touch the system Python at all).

The problem with pip (not conda) is that, in practice, there are a ton of libraries written in Python, including the Python *bindings* for many non-Python libraries. Installing any Python package that depends on those libraries will cause them to be installed into /usr/local if no venv or --user is active, and then plenty of system packages may break.
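
(You can ask Python where an unqualified install would land - a minimal sketch, noting that some distros patch these paths:)

    import sys
    import sysconfig

    # Outside a venv this typically prints a system-wide location (on Debian
    # derivatives, pip redirects installs to /usr/local/...), which is exactly
    # where they can shadow distro-managed packages.
    print(sys.prefix)
    print(sysconfig.get_paths()["purelib"])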

> Anyway, I've been doing some stuff in Python recently, just using the system python. I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

Gentle recommendation: learn uv, venv, and *maybe* pip, and ignore literally everything else in this space. Strictly speaking you don't need venv and pip, because uv covers both of those use cases, but it's good to understand them in broad conceptual terms, since they are the closest things Python has to standard package tooling.
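
(For that conceptual picture, here's roughly what venv + pip amount to, driven from Python rather than the CLI - a minimal sketch, with "requests" as a stand-in package and Unix paths assumed:)

    import subprocess
    import venv

    # Create an isolated environment, then install into it using that
    # environment's own pip (it's Scripts\pip.exe on Windows).
    venv.create("demo-env", with_pip=True)
    subprocess.run(["demo-env/bin/pip", "install", "requests"], check=True)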

Rationale: uv handles 90%+ of use cases exceptionally well, has very good performance compared to most of the other tooling, is actively supported and maintained, favors correctness over footguns (especially in terms of how it resolves dependencies), and does not massively deviate from the de facto best practices that most other tooling follows (so if uv suddenly gets replaced by yet another New Shiny, you'll probably be able to migrate from it without massive effort on your part).

I have previously heard good things about Poetry and a couple of other solutions, and I probably would have recommended them to you in the past, but nowadays, the common wisdom is that uv's high performance blows most of the competition out of the water, to the point where it doesn't make much sense to touch any non-uv tool unless you really need a feature that uv lacks (or your project is already using one of the alternatives, of course). These days, there are not a whole lot of things missing from uv (at least as far as I know), so that's an increasingly niche reason to use something else.

I HATE PYTHON

Posted May 23, 2025 1:56 UTC (Fri) by aragilar (subscriber, #122569) (4 responses)

90% is optimistic: uv is very much embedded in the webdev part of the ecosystem (it's in the same lineage as pipenv and poetry), so it solves that group's needs well (modulo the assumptions uv makes, which are its primary source of speedups but also cause issues). Outside webdev, it makes things harder to package for linux distros (because now you have two languages to manage), it doesn't play well with the numerical ecosystem because of those assumptions (and so conda is used there), and it tries to hide the real need for design when managing an install (e.g. look at how it installs python, and all the minor things that break due to trying to make an almost-static install). If you are in its target group, it's a significant improvement over pipenv/poetry (because they make the same assumptions, and helped pave the path to uv), but otherwise it seems to me like another wedge splitting the Python ecosystem apart.

I HATE PYTHON

Posted May 23, 2025 2:14 UTC (Fri) by geofft (subscriber, #59789) (3 responses)

> it doesn't play well with the numerical ecosystem because of its assumptions (and so conda is used)

I've spent the last week at PyCon in a room with the Conda folks and other people working with numerical/scientific computing in Python, trying to solve these problems for the PyPI-based ecosystem (pip, now also uv and poetry and others) in a way that is compatible with the Conda ecosystem / where they can adopt the same designs. Honestly, the PyPI ecosystem has been _pretty_ good over the last few years for this task, certainly far better than a decade ago, when `pip install numpy` would always try to build from source. NVIDIA has resumed offering first-class support for the PyPI ecosystem in the last few years after very loudly quitting it in 2019 (https://medium.com/rapids-ai/rapids-0-7-release-drops-pip...). There's still a lot left to do, but we should be at the point (and are, in my experience) where most people, certainly most individuals, will have just as good a time with pip/uv/etc., so I am genuinely curious what sorts of things you personally find don't work.

> and it tries to hide the real need for design when managing an install (e.g. look at how it installs python, and all the minor things that break due to trying to make an almost-static install).

I'm one of the maintainers of those Python builds and I am absolutely open to bug reports, either here or on our bugtracker. Going after all these minor things is basically my top priority at the moment. Please let me know, I love having more test cases. :-)

I HATE PYTHON

Posted May 23, 2025 10:46 UTC (Fri) by aragilar (subscriber, #122569) (2 responses)

I want to draw a distinction between the underlying standards/tools, which are mostly fine apart from some key spots I know are being worked on (and which I am happy to use), and the higher-level wrappers (of which poetry and uv are the poster children, and which I take issue with being described as working for 90% of cases).

I'd classify things on a spectrum from papercuts to blockers. Things listed on https://gregoryszorc.com/docs/python-build-standalone/mai... are close to papercuts, in that they can be avoided by using a different build (or even building from source, plus they're actually documented), but when tools like uv default to using them implicitly, they start moving towards blockers (as people tend to go with the defaults, and so it becomes a matter of debugging layers of issues). Once certain IDEs start getting involved, you end up having to debug someone's poorly configured machine, wasting hours or days on fixes.

Then there are the parts of the ecosystem where pre-built binaries cannot be used (e.g. MPI), where you end up adding flags like `--no-binary` and rebuilding half the ecosystem anyway because you need to link against a single BLAS. uv is really not designed for this (neither is poetry; they assume too much about wheels), and while you can create your own private index to control what the installers see (and I have), using pip is far easier (and less likely to break).

Then there are the fun bits like the PyPI packages of Jupyter plugins just straight up not working (though the conda version works), due to wheels not being able to install to specific paths (because they need to integrate with the system they are installed on).

This all flows from the assumption that pre-built PyPI-associated binaries (with all their limitations) are a solution (rather than a workaround) to users needing to set up their machines properly. It's worth comparing this to R, where the installer on Windows and macOS sets up a proper dev environment (with compilers), and so these kinds of problems do not happen.

I HATE PYTHON

Posted May 24, 2025 22:47 UTC (Sat) by geofft (subscriber, #59789) (1 response)

Thank you, this is very helpful and I appreciate you taking the time to write this up! A few of those quirks in that document are gone now (e.g., the musl build is now a dynamic binary, not a static one, so extensions work fine); I'll update it.

> Once certain IDEs start getting involved, now you're having to debug someone poorly configured machine and wasting hours or days fixing.

Just to clarify, do you mean that people use their IDEs to set up uv and thus python-build-standalone, and that makes things harder to debug because it's harder to figure out what their setup is? Or is there something specific to usage inside some IDE versus at the CLI?

> Then there are the parts of the ecosystem where pre-built binaries cannot be used (e.g. MPI), you end up to start adding flags like `--no-binary` and rebuilding half the ecosystem anyway because you need to link to a single BLAS.

This is definitely one of the things we spent time talking about last weekend (and it's also called out on https://pypackaging-native.github.io/key-issues/abi/ , an excellent website written by folks who work on the scientific Python ecosystem and would like the non-conda ecosystem to work well.) It's certainly a deficiency right now, but it's also a high priority for the scientific Python community as a whole. While I'm sure people have concrete examples, if you have a specific favorite prebuilt binary wheel or package name that doesn't work well or a combination of libraries that need to share a single BLAS, again, I love having more test cases. :)

> [...] uv is really not designed for this [...] using pip is far easier (and less likely to break).

uv and pip both install the same packages from the same places, and when you build from source with --no-binary you can link your exact system libraries when using either uv or pip, so I'm curious about this - is this because you can do venv --system-site-packages (or use pip without a venv in some fashion) and get some compiled Python packages from the OS, and uv doesn't support that? Or something else?

> Then there are the fun bits like the PyPI packages of Jupyter plugins just straight up not working (though the conda version works), due to wheels not being able to install to specific paths (because they need to integrate with the system they are installed on).

This one is new to me - do you have a specific plugin / example command that doesn't work? (I know there are tons of people using Jupyter with extensions in production using pip + venv, so I'm assuming it's not "any extension.")

> This all flows from the assumption that pre-built PyPI-associated binaries (with all their limitations) are a solution (rather than a workaround) to users needing to set up their machine properly. It's worth comparing this to R, where the installer on Windows and MacOS sets up a proper dev environment (with compilers), and so these kind of problems do not happen.

Yes, conditional on actually setting up a proper dev environment with compilers and library dependencies, which has its own host of problems. :) This was, after all, the status quo of pip until about a decade ago, and the fact that it was a very bad experience is both why conda came into existence and why pip started doing wheels. (Unless you mean that Python or uv or some other installer should install _its own_ compiler toolchain, independent of what is on the host, and build stuff from source, including C libraries like BLAS? That's an intriguing idea....) Conda supports R as well as Python, and my impression is that its R support is popular for largely the same reasons, in that building all your R packages from source is a difficult experience.

I HATE PYTHON

Posted May 28, 2025 9:55 UTC (Wed) by aragilar (subscriber, #122569)

> Just to clarify, do you mean that people use their IDEs to set up uv and thus python-build-standalone, and that makes things harder to debug because it's harder to figure out what their setup is? Or is there something specific to usage inside some IDE versus at the CLI?

Sorry, I meant that IDEs provide Python installs as well, and now you've got some unreproducible conda-uv-IDE frankenstate where python points one place, pip somewhere else, and who knows what `sys.path` will be. Too many tools want to be in charge (for legitimate reasons), and only pip is really set up to be a bit player that plays nicely with others (though naturally it too can mess things up, as noted by PEP 668). Hence why I push people to understand what their tools are doing, and not mix them arbitrarily.
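
(When I'm handed one of these machines, the first thing I run is something like the following - a minimal sketch, but it usually exposes the mismatch immediately:)

    import shutil
    import sys

    print(sys.executable)       # the interpreter actually running this code
    print(shutil.which("pip"))  # the pip found first on PATH - may belong to a different install entirely
    print(sys.path)             # where imports will actually be resolved from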

> This is definitely one of the things we spent time talking about last weekend (and it's also called out on https://pypackaging-native.github.io/key-issues/abi/ , an excellent website written by folks who work on the scientific Python ecosystem and would like the non-conda ecosystem to work well.)

Yeah, I've provided PRs to it previously :) To me the solution here was the "local" or "custom" wheel tag, which I suggested a while back, but it didn't go anywhere.

> uv and pip both install the same packages from the same places

It's not the source that's the problem, it's that uv wanted a venv (at least when I tried it) whereas it should have used (in that particular case) the (purpose-built) conda environment.

The jupyter plugin (I don't recall its name, I'd need to dig into my notes) in this case was to do VNC on the web, and my colleague (who is not a fan of conda) said he needed to use both conda and pip to get it working. When I had time I found out there was a separate binary being shelled out to (I think to set up a websocket or something), but it would only be set up right with the conda install, because it needed some post-install steps (I'm not sure why it couldn't have been configured correctly).

> Conda supports R as well as Python

My impression (based on the astronomers and biologists I know) is that R from conda is used when people want to do R<->Python connections using rpy2 or similar, and most actual R users just use CRAN. This may not be accurate across the ecosystem though (I'm very much embedded in the university ecosystem).

> Yes, conditional on actually setting up a proper dev environment with compilers and library dependencies, which has its own host of problems.

While I don't think this is trivial, I do think we (the Python community) can do a lot better (and if we assume pre-built wheels can be used, then issues like the discussion around the future of setuptools get brushed aside). R on Windows (and possibly macOS) installs its own copy of GCC plus a bunch of other "standard" tools, and I've not heard complaints there (I have previously suggested Python on Windows should make it easy to install MSVC, but I got pushback against that). Conda definitely made things better on Windows, but at least when I tried using it to install packages without wheels, it apparently hadn't added the configuration so that the conda gcc would work with the conda python, which seemed like a missed opportunity to me.

It also encourages groups not to use PyPI to upload their packages, as the assumption is that you should produce wheels (NVIDIA was one example, but I know of others that I'm not going to name, otherwise someone will squat on their packages). I think having an index where all packages get registered, plus a separate (downstream?) index which is a self-contained wheel archive, would go a long way towards solving this; if people were then told that the high-level tools only work with the wheel archive, that would be a better experience, as PyPI wouldn't be forced to become a worse conda.

I HATE PYTHON

Posted May 22, 2025 16:59 UTC (Thu) by geofft (subscriber, #59789) (5 responses)

> Like, why do you need binaries for a scripting language at all?

Two reasons. One, it's a general-purpose programming language; it happens to be good at scripting, but it also happens to be good at other things. For instance, it's very widely used for scientific computing (even before the current AI furor), and an extremely common thing to do with it is to install wrappers around BLAS/LAPACK and use them. (Most of these Python programmers don't even know that the thing they're using is BLAS/LAPACK at its core.)
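
(If you want to see this for yourself, numpy can report which BLAS/LAPACK it was built against - a minimal sketch:)

    import numpy as np

    # numpy's compiled core dispatches linear algebra to whatever BLAS/LAPACK
    # it was built against; this prints the build configuration.
    np.show_config()

    a = np.ones((500, 500))
    b = a @ a  # this matrix multiply runs in compiled BLAS code, not in the interpreter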

Two, the thing that a scripting language does, very often, is to call binaries! Graphviz is a good example - the common Python bindings call 'dot' as a subprocess. Yes, you can install it on your own (and the version you get from pip wants you to do exactly that), but there are good reasons to want a version that matches what was tested by the wrapper, etc.
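
(The pattern is nothing more exotic than this sketch - not the bindings' actual code, just the shape of it; it assumes a "graph.dot" file exists:)

    import shutil
    import subprocess

    # How a wrapper typically calls out to graphviz: find 'dot' (wherever it
    # came from - OS package or pip-installed wheel) and run it as a subprocess.
    dot = shutil.which("dot")
    if dot:
        subprocess.run([dot, "-Tpng", "graph.dot", "-o", "graph.png"], check=True)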

(Also also, if you insist on getting these things from the OS, that means you need sudo to install them. Weirdly enough, the Python ecosystem is probably the best fully unprivileged package manager out there for binaries on Linux. I'm not saying it's good, we have a lot of work to do, but the other options aren't better.)

> I've been thinking about looking into non-system-python stuff, and even packaging some of my code, but been put off by the multitude of python packaging systems, and an apparent lack of consensus around which to use. I think I'll just give them all a miss now, and try looking at another language altogether. Or maybe take up goat farming instead.

Don't let me stop you from taking up goat farming, but if you are willing to take a recommendation for what to use, give uv (https://docs.astral.sh/uv/) a shot. (I recommend the provided installer, but if your OS has a relatively recent version, that's fine too.) The approach of uv is to abstract all the virtual environment stuff from you (and, really, to treat it as an implementation detail).

In fact you can write a single-file Python script with some metadata about its dependencies and use "uv run myscript.py", and behind the scenes it will create a temporary virtual environment with those dependencies if it doesn't have one and run your script, completely independent of anything going on with the OS.
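
For example (a minimal sketch; "requests" is just a stand-in dependency), the metadata is a specially formatted comment block at the top of the file (PEP 723), and `uv run myscript.py` does the rest:

    # /// script
    # requires-python = ">=3.12"
    # dependencies = ["requests"]
    # ///
    import requests

    print(requests.get("https://example.com").status_code)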

One other thing uv does is install its own version of Python, generally noticeably newer than what your OS will have, further increasing how independent it is of the OS. I am one of the maintainers of those Python builds, so if you don't like something about how they are built, feel free to yell at me. You will probably be justified in doing so; there are certainly several open issues. :) But it works pretty well, and it avoids the entire class of problems of conflicting with the OS.

uv is relatively new (a little over a year old, and half these features are newer than that). It's rapidly gaining adoption, and I think there's a strong consensus that _something like_ uv is the right approach, whether or not this particular software project is the way forward. But that's why you're not seeing full consensus quite yet. There's been a lot of work in the past few years to improve Python packaging. It is starting to pay off well, but only just starting. The single-file stuff, in particular, is extremely under-advertised, and I think it's likely the way most sysadmin types (among whom I count myself) would like to be doing things.

Depending on what you're doing, I'm not sure things will be better in another language. If you are writing interesting enough Python that you've moved beyond what the standard library offers, and especially if you're doing things where you'd want compiled libraries like BLAS/LAPACK, dependency problems have to be solved somehow. (I'm curious, incidentally, what the sort of dependencies you're installing are.) Or put another way - there's a reason there's enough stuff on your OS that's written in Python that doing "sudo pip install" puts it at risk. Despite all the problems you see, it does actually solve several more problems very well.

I HATE PYTHON

Posted May 24, 2025 0:14 UTC (Sat) by himi (subscriber, #340) (4 responses)

I've run into issues with uv's managed python installs, specifically with the netaddr package - which is an unmaintained beast of a thing, admittedly, but also a common enough requirement (particularly in the OpenStack context, where I work) that I didn't have much choice. In the end I just went with the system python and the distro's python3-netaddr package, because I couldn't find a way to get the managed python install to successfully build netaddr from source (netaddr's fault, not uv's), or provide some way to inject my own build into the environment.

Unfortunately, I don't have notes about what exactly I did, and it was on my previous machine so it's rather hard to go back and re-run things to give a proper bug report, or see if things have changed in the ~6 months since then (a distinct possibility given how fast uv has been evolving) . . .

I'd like to say that fixing the issue of unmaintained or badly maintained packages isn't something that the Python packaging community should be responsible for . . . and it really isn't . . . but there are degrees of failure, some more graceful than others, and it felt like I was hitting a brick wall rather than something more accommodating. Obviously a managed python install is a very curated and controlled environment and there's a limit to the degree of flexibility that you can provide, but that can make them unusable sometimes, particularly when it comes to oddball cases like netaddr.

I HATE PYTHON

Posted May 24, 2025 1:05 UTC (Sat) by himi (subscriber, #340)

Hah, I seem to have picked exactly the wrong time to try stuff with netaddr, because it's apparently a lot more live and maintained than it was when I hit these issues . . . That doesn't change the broader point, of course, but failing to note the changes would be rude to the netaddr devs . . .

I HATE PYTHON

Posted May 24, 2025 1:10 UTC (Sat) by geofft (subscriber, #59789) (1 response)

Er, are you sure you're thinking about netaddr? That's a super straightforward pure Python package and so building it should be trivial, but also it has platform-independent wheels so you don't even need to do that.

For completeness, I just checked `uv run --with netaddr python`, and then pinned to various recent and not-so-recent versions, including the versions packaged in Debian stable/oldstable which are well older than uv itself. I also tried with `--no-binary` to force a local build. All seemed to work fine.

(Looking around Debian for similar-sounding arch-dependent packages... did you mean netifaces? That one has compiled code and a pretty dusty build system, but it does seem to work fine with uv, too. I did get a failure the first time I tried because I didn't have a C compiler on my PATH, and I got a failure the second time because netifaces caches its configure-esque checks and uv leaves them around, but it works if you blow that away and retry.)

Regarding old versions, I actually think that's a reasonable expectation, provided that the actual code is compatible with your current Python version. Sometimes you might need to do things like pin an older setuptools in the build environment (to get back to the subject of the article), and maybe that process needs better docs and error messages, but it's the sort of thing that should be doable.

I HATE PYTHON

Posted May 24, 2025 1:54 UTC (Sat) by himi (subscriber, #340)

Yes, insufficient caffeine on a lazy Saturday morning caused me to confuse netaddr with the real culprit, netifaces. If it's working now, that may well be due to changes since I ran into this issue - uv development is almost as fast as its runtime; it can be hard to keep up. It may also have been using a more modern Python version than netifaces wanted? I honestly can't recall the details, and I'd have to do a lot of work to rebuild enough context to try and replicate the issues.

One of the things that would have been useful, I think, would be a way to use the system python's packages to resolve certain dependencies - possibly by importing them into the managed python install, or into the venvs that uv builds. Obviously assuming you're working with sufficiently compatible versions (which may well be hard to figure out) . . . It's not going to be something you want to use regularly, and it'd need to be explicitly chosen per-package, but it might be useful if the problem packages are a ways down the dependency tree and unresolvable otherwise.
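
(For plain venvs, something adjacent does exist - the stdlib can create an environment that falls through to the system packages for anything it doesn't have itself; a per-package equivalent in uv is what I'm wishing for. A minimal sketch:)

    import venv

    # A venv that can see the system Python's site-packages in addition to
    # its own installs - the coarse-grained version of what's described above.
    venv.create("hybrid-env", system_site_packages=True, with_pip=True)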

I HATE PYTHON

Posted May 24, 2025 1:15 UTC (Sat) by himi (subscriber, #340)

ugh . . . I am, apparently, an idiot when I haven't had enough coffee - the package I was having issues with was netifaces, not netaddr. Definitely a case where editing comments would be useful, if only to minimise my embarrassment . . .

More about Python wheels ("binary packages")

Posted May 29, 2025 19:00 UTC (Thu) by zahlman (guest, #175387)

> That does clear up some confusion I was having around binary packages for Python though. Like, why do you need binaries for a scripting language at all?

Python source distributions and binary distributions are both "binaries" in the sense that they're both just archive files (typically a tarball and a zip archive, respectively); but I assume you're talking about wheels (binary distributions).

Generally speaking, they may "install non-Python dependencies", but they don't do this by putting anything in system folders (you're meant to always be able to install anything without sudo). Rather, the archive will contain compiled shared libraries which are then put within the standard install paths for the Python code; and then the Python code can find them with relative paths if necessary.

This happens because Python *isn't* simply "a scripting language" (to the extent that the term means anything any more); in particular, it's often used to interface to code in C (and other languages) for performance reasons, and also simply because the libraries exist and it's useful to access them from Python.

If your system Python includes Numpy, it might be very stripped down - though it still makes use of "local" binaries. For example, my installation includes `/usr/lib/python3/dist-packages/numpy/core/_multiarray_umath.cpython-312-x86_64-linux-gnu.so`, weighing about 5.5MB. If I separately install a wheel in a new virtual environment, that environment will get its own copy of that file, which is 10MB, as well as a 25MB .so for OpenBLAS (the system package can be smaller because it links against the distro's shared BLAS build instead).
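
(If you want to see what's bundled, a minimal sketch - run it inside the venv in question:)

    import pathlib

    import numpy

    # List the compiled extension modules shipped inside the numpy package.
    # (On manylinux wheel installs, the bundled OpenBLAS actually lives in a
    # sibling "numpy.libs" directory rather than inside the package itself.)
    pkg = pathlib.Path(numpy.__file__).parent
    for so in sorted(pkg.rglob("*.so")):
        print(so.relative_to(pkg))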

The libraries are generally compiled with broad compatibility in mind (see https://peps.python.org/pep-0600/, https://github.com/pypa/manylinux and https://packaging.python.org/en/latest/specifications/pla... if you want the gory technical details), and of course C is still the dominant language for this. But in principle you can use anything, as long as you can figure out explicit bindings from Python (for example, with tools like Cython or SWIG for C) or use dlopen (which is wrapped in the Python standard library by "ctypes").
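
(As a minimal sketch of the dlopen route - the library name here is Linux-specific:)

    import ctypes

    # Load the system C math library and call one of its functions directly,
    # with no generated bindings at all.
    libm = ctypes.CDLL("libm.so.6")
    libm.cos.restype = ctypes.c_double
    libm.cos.argtypes = [ctypes.c_double]
    print(libm.cos(0.0))  # prints 1.0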

