
Python, packaging, and pip—again

By Jake Edge
January 24, 2024

Python packaging discussions seem like they often just go around and around, ending up where they started and recapitulating many of the points that have come up before. A recent discussion revolves around the pip package installer, as they often do. The central role that is occupied by pip has both good points and bad. There is a clear need for something that can install from the Python Package Index (PyPI) immediately after Python itself is installed. Whether there should be additional features, including project management, that come "inside the box", as well, is much less clear—not unlike the question of which project management "style" should be chosen.

Early in 2023, we tried to cover a wide-ranging discussion regarding Python packaging, its tools, the governance of the Python Packaging Authority (PyPA), and more. The "authority" part of the name, which originally was meant as something of a joke, is not entirely accurate; there are efforts underway to update (or replace) PEP 609 ("Python Packaging Authority (PyPA) Governance") with a new "packaging council" and a clearer mandate. Meanwhile, there has been some progress in the packaging world since our article series, but it seems likely that none of the participants are completely happy with its extent. There is still a huge amount to do.

There are some PEPs that are being discussed and worked on in that area, including PEP 735 ("Dependency Groups in pyproject.toml"), which is authored by Stephen Rosen and sponsored by Brett Cannon; it will be accepted or rejected by the PEP delegate, Paul Moore. It specifies a way to store package dependencies in a pyproject.toml file that is analogous to the requirements.txt file that pip uses today. The TOML file is meant to be used by various other tools, such as IDEs and program launchers, in ways that are well beyond what the current format can provide. In addition, the requirements.txt file format is not standardized; trying to standardize it now would raise lots of backward-compatibility concerns.

It is against that backdrop, and apparently under the assumption that the PEP will be accepted, that Luca Baggi asked if pip would be changed to add dependencies to pyproject.toml. Moore, who is also a pip maintainer, noted that pip "isn't a project manager, it's an installer"; there are other tools for that job. There was a bit of discussion about what it might look like to add the feature, but "sinoroc" agreed with Moore that it is not in pip's scope: "pip does not deal with modifying this kind of data, pip is only a consumer of such data".

Pip maintainer Pradyun Gedam said that he would like to see pip expand into some "additional parts of the workflow" but that there simply are not enough developers to handle the existing load; "if we can't keep up with existing feature [maintenance], it gets worse when we add more features". That led Damian Shaw to suggest that it may be time to "consider bundling a tool with the CPython installer that does support this kind of package/environment manager workflow", though he recognized there would be a high bar for any project to cross to get bundled that way.

Moore cautioned against extending pip further, noting that many of the existing workflow tools use pip under the hood, so any changes would need to take that into account. Beyond that, for a new workflow tool to get adopted, a new PEP, along the lines of PEP 453 ("Explicit bootstrapping of pip in Python installations"), would need to be written and get approved. Things have changed in the ten years since that PEP, so it makes sense to consider a new path, but:

[...] I doubt there's a realistic possibility of anyone ([packaging] council, PyPA or community) being able to come to a decision on which workflow tool is going to be blessed as the "official answer". We've had way too many unproductive discussions on this in the recent past for me to think there's anything even remotely like a consensus.

Sinoroc suggested that pipx, which automatically installs command-line applications from PyPI into virtual environments and adds them to the user's path, might be a better choice for bootstrapping these days. Moore pointed out that pipx uses pip, so it does not remove the need for pip as part of the initial Python install. Both pip and pipx are experimentally available as standalone zipapp applications, which might mean that Python could stop shipping pip, but even then it is still "a non-trivial process getting from 'install Python' to 'ready to go'", he said.
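The zipapp format that those standalone builds use is itself part of the standard library and simple to demonstrate; this sketch packs a toy application (invented for illustration) into a single runnable .pyz file:

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

# Build a trivial single-file application and pack it as a zipapp,
# the same format the experimental standalone pip/pipx builds use.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "app"
    src.mkdir()
    (src / "__main__.py").write_text('print("hello from a zipapp")\n')

    pyz = Path(tmp) / "app.pyz"
    zipapp.create_archive(src, pyz)

    # A zipapp runs anywhere a matching interpreter is available;
    # no installation step is needed.
    out = subprocess.run([sys.executable, str(pyz)],
                         capture_output=True, text=True, check=True)
    print(out.stdout.strip())
```

That "no installation step" property is what makes zipapps attractive for bootstrapping: a single file could be shipped with CPython without vendoring a whole tool into the standard library.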

When pip was added to the install, it was done to provide a means for users to install the workflow tool of their choice (as well as other packages of interest); it was not meant to be the be-all and end-all, Moore said. But Shaw thinks that users see it differently: "With pip bundled with the official installer and pypi.org displaying pip install package-name on every package page, the impression is that pip is the blessed official tool to use for managing 3rd party packages, not merely a simple way to bootstrap to your favorite tool of choice." If there is sufficient interest by both CPython developers and those of a project-management tool, he said, it may be worth looking at bundling such a tool to supplant pip.

But some of the difficulties that pip struggles under, such as needing to adhere to the CPython release cycle and to vendor all of its dependencies, would also affect any other tool that gets shipped. It is hard to see the developers of other tools being willing to do so, Moore said. In addition, even if there were candidates, as a core developer he does not "have the appetite to get CPython sucked into the 'which tool is best' controversies this would involve". Perhaps unfortunately, though, making that choice is exactly what last year's Python user survey showed was most desired, Brendan Barnwell said. Not making a choice is:

[...] ultimately incompatible with the goal of solving the fragmentation problem of Python packaging tooling. As long as the tools that are bundled with Python leave out large chunks of functionality that people want, while the ones that aren't bundled compete and none is clearly endorsed by Python, users will feel confused and irritated. It doesn't exactly have to be choosing the "best" one but I think a choice does need to be communicated about which tool(s) are recommended.

But core developer Steve Dower concurred with Moore; he also does not want to get pulled into trying to make that controversial choice. Furthermore, he pointed out that the core developers are not particularly sympathetic to packaging concerns:

And I've raised packaging-like questions with the broader core developer group before - the responses tend to range between "what's packaging" to "I don't care about packaging" to "they can do whatever they want". I'm afraid the most support from the core team comes from those of us who participate in this category, which is very few of us.

That division is not healthy, Moore noted, especially given that the responsibility for ensuring access to PyPI should be shared between the PyPA and core-development team. Since that is not happening, solutions to some of the problems that users complain about cannot come about:

Other solutions are possible. Making zipapps work better, and shipping a pip (or pipx) zipapp with core Python might be an option, for example. But only if the core devs take a more active interest in the deployment side of the developer experience.

That's pretty much where things stand; there was a bit more discussion, which continues as of this writing, about pip and its central—privileged—role in the ecosystem. That is, of course, much like many of the other, interminable discussions that are ongoing in the packaging category of the Python discussion forum. Incremental progress is seemingly being made, but the main problem identified by the user survey—and the huge number of complaints before that—remains. It is not at all clear what, if anything, will break the logjam.



Python, packaging, and pip—again

Posted Jan 25, 2024 0:44 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (15 responses)

IMHO they really need to pick an option: Lead, follow, or get out of the way.

Lead: Make a new official packaging tool (or add such functionality to pip).
Follow: Endorse somebody else's packaging tool.
Get out of the way: Disband PyPA and hand its authority over to somebody who is actually willing to do one of the above things.

I don't care which they choose at this point. The status quo is intolerable.

Python, packaging, and pip—again

Posted Jan 25, 2024 5:03 UTC (Thu) by Baughn (subscriber, #124425) [Link] (14 responses)

I used ‘sudo pip install’ for years, and it usually kind of worked. Never saw anything suggesting that wasn’t the right way to do it, though it did turn me off Python after the second time it broke my OS.

Python, packaging, and pip—again

Posted Jan 25, 2024 6:47 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (1 responses)

That no longer works now, finally.

Python, packaging, and pip—again

Posted Jan 25, 2024 13:28 UTC (Thu) by ceplm (subscriber, #41334) [Link]

It doesn’t work, meaning it doesn’t break your system, but it still works, meaning it installs into the `/usr/local/` tree (or the equivalent on your system).

Python, packaging, and pip—again: system/platform Python

Posted Jan 25, 2024 22:37 UTC (Thu) by geofft (subscriber, #59789) [Link] (11 responses)

Yeah, half of why I got involved in making PEP 668 happen was to turn the oral tradition that "sudo pip install" is a terrible idea into an actual written policy somewhere. You now get an error message, but it was mostly about reaching the consensus that the Python community thinks it's a terrible idea and it's not just the individual experience of people who have gotten burned by it.
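The mechanism PEP 668 standardized is deliberately simple: a distribution drops a marker file named EXTERNALLY-MANAGED into the interpreter's stdlib directory, and installers refuse to modify that environment. A minimal sketch of the check an installer performs:

```python
import sysconfig
from pathlib import Path

def externally_managed() -> bool:
    """Return True if this interpreter's environment is marked as
    distro-managed per PEP 668 (an EXTERNALLY-MANAGED file in the
    stdlib directory)."""
    marker = Path(sysconfig.get_path("stdlib")) / "EXTERNALLY-MANAGED"
    return marker.is_file()

print(externally_managed())
```

On a distro Python that ships the marker this returns True and pip aborts with the error message described above; in a virtual environment or a self-built Python, it returns False.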

There is an underlying cause to this problem: Python is both a product for end users - a language in which you can write code yourself - and a systems language used by your operating system and applications you run. This makes perfect sense for Python as a language: one major target audience of end users is people putting together an operating system or writing applications. But for a concrete Python installation, it is not really possible to serve two masters in this way.

If everyone were installing Python as a standalone, self-contained package from somewhere, it would be totally fine to include whatever installer in the base Python distribution along with however many dependencies it needs. And it would be totally fine to let users install what they like inside that Python distribution, because they can just download another copy if they need to do something incompatible. (It's not the perfect solution - the real answer is that you should be guided towards setting up virtual environments - but it would not break anything for this to be permitted.) For the installation of Python used by your OS itself, neither of these are fine. The OS just needs Python as a runtime; it doesn't have any need to ship a package manager, for the same reason that Android doesn't come with Maven. It usually wants to avoid shipping one, partly for size reasons, partly because it comes with its own package manager which wants to manage the same set of files. And if you let people install what they like systemwide with e.g. "sudo pip install," they will eventually break their OS by upgrading a dependency of some OS component.

The real answer here is what's been called "system Python"/"platform Python" - the Python installation used by the OS itself should be somewhere out of the way like /usr/libexec/system-python/bin/python3. Then one option is not to have a package providing a /usr/bin/python3 at all, and expect people to install Python from outside the distribution into their home directory or /usr/local. Or, if you want to package Python as a product for end users, that's fine: this non-default package can come with all the installer tools that an end user would immediately want, and nothing in the OS depends on it, so the local sysadmin can use those installer tools as much as they want. In other words, "sudo pip install" becomes every bit as safe as, say, installing Emacs packages under sudo. The danger moves to "sudo /usr/libexec/system-python/bin/pip install," which might not even exist.

Fedora tried this, and as far as I can tell the big sticking point is that sometimes people actually did want to install things in ways that affected OS-provided commands. The immediate thing that caused it to get reverted was breaking dnf plugins (https://bugzilla.redhat.com/1483342), because dnf itself is written in Python. I have also heard that people who use their OS-packaged version of Ansible (/usr/bin/ansible) expect to be able to "sudo pip install" Ansible plugins, and that breaks in this model. I'm not sure what the answer is - probably each of these applications should have its own virtual environment? But that could lead to the OS having to package up the same Python package multiple times, once for each of the applications they ship - which has a certain theoretical purity to it, but is probably a pain in practice. https://fedoraproject.org/wiki/Changes/Platform_Python_Stack has more discussion about how it was tried and links to what went wrong. (To my knowledge, no other distro has even attempted this.)

So! People of LWN! This is a call to figure out how one would successfully ship "system Python" in a real-world Linux distro without that breakage. Once we do it, we can make the whole Python experience safer and less confusing.

Python, packaging, and pip—again: system/platform Python

Posted Jan 26, 2024 5:14 UTC (Fri) by gps (subscriber, #45638) [Link] (2 responses)

Fedora should plod ahead and do it anyways. A system Python, isolated from and unusable by others, that does not have its own package manager is the right thing to do.

Those dnf and Ansible users are wrong. Ignore their complaints of how something "worked" in the past, cut through to their underlying needs, and provide a proper alternative. The answer is never going to be pip modifying the global state of a shared runtime across all applications on the system.

Nothing that comes as a system (OS-provided and -managed) package should ever allow another package manager it does not fully own and operate to install additional things for its use. Those either need to come from the system package manager, or from a system-package-manager-blessed sub-package-manager specific to each application that wants this, in its own hermetic application world with zero interference with others and no impact on global state. I.e., never sudo pip.

Python, packaging, and pip—again: system/platform Python

Posted Jan 26, 2024 21:41 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

IMHO the most straightforward way to fix this is to repackage those dnf plugins as Fedora packages. I'm not terribly familiar with how Fedora packaging works, but I find it hard to believe that the systemwide package manager is incapable of installing these plugins into a hypothetical system-wide Python environment. These are Fedora-specific plugins. They really shouldn't be Pip's problem.

Python, packaging, and pip—again: system/platform Python

Posted Jan 28, 2024 10:28 UTC (Sun) by MKesper (subscriber, #38539) [Link]

That's the only way to sanity. I'm glad pip refuses sudo pip install nowadays. But even installing globally for the user isn't the right thing to do in CI, for example. Just always use a virtualenv.
That so few Python core devs give a shit about packaging does not raise hope for the future of the Python ecosystem, though.
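Creating such an environment is a one-liner via the standard-library venv module (the directory name here is arbitrary):

```python
import tempfile
import venv
from pathlib import Path

# Create a throwaway virtual environment; with_pip=False keeps this
# demo fast, while real use would leave pip enabled so packages can
# be installed into the environment.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "venv"
    venv.create(env_dir, with_pip=False)

    # Every venv carries a pyvenv.cfg describing its base interpreter.
    cfg = (env_dir / "pyvenv.cfg").read_text()
    print(cfg)
```

In CI, the equivalent is simply `python -m venv .venv` followed by using `.venv/bin/pip`, which keeps the runner's system Python untouched.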

Python, packaging, and pip—again: system/platform Python

Posted Jan 27, 2024 15:17 UTC (Sat) by donald.buczek (subscriber, #112892) [Link] (2 responses)

> The real answer here is what's been called "system Python"/"platform Python" - the Python installation used by the OS itself should be somewhere out of the way like /usr/libexec/system-python/bin/python3. Then one option is not to have a package providing a /usr/bin/python3 at all, and expect people to install Python from outside the distribution into their home directory or /usr/local.

Yes, we adopt a similar approach with a dedicated system Python located at `/usr/local/system/python3`. It's primarily for essential startup and maintenance scripts that require minimal dependencies. For the users we maintain an extensive archive of Python versions, currently totaling 65, each bundled with a comprehensive suite of scientific packages to ensure reproducibility of our scientific work.

All these Python installations are read-only. Frequently used versions are cached locally, while others, including old ones, remain transparently accessible over the network.

Paths like `/usr/bin/python` are just wrappers that point to the recommended default version of these installations. We don't hesitate to update that. If anything breaks for a user, they can just select the previous build's version via the shebang path, a wrapper command, or by sourcing a profile file.

For adventurous users eager to test additional packages, `pip` is the go-to for installations in `~/.local`. The more discerning crowd leans towards `pyenv` for its stability. Then there are the Conda converts, swayed by the siren calls of the Internet, who frequently overlook the abundance of pre-installed packages and plunge into the depths of home directory bloat. They endure the ritual of letting the Conda installer rewrite their `.bashrc` and accept the sluggish login delay as a rite of passage, all for the privilege of navigating a sea of superfluous files.

Python, packaging, and pip—again: system/platform Python

Posted Jan 31, 2024 18:46 UTC (Wed) by intelfx (subscriber, #130118) [Link] (1 responses)

> we adopt a similar approach with a dedicated system Python located at `/usr/local/system/python3` <...> For the users we maintain an extensive archive of Python versions

Who are "we"? That sounds interesting.

Python, packaging, and pip—again: system/platform Python

Posted Feb 1, 2024 9:19 UTC (Thu) by donald.buczek (subscriber, #112892) [Link]

> Who are 'we'? That sounds interesting.

Apologies for any confusion. By "we", I'm referring to the IT group of a German research institute [1]. We manage 102 workstations and 153 server systems for our users. Our compute cluster has 10,024 cores, 48TiB of memory, and several A100 GPUs across 50 systems as of today. We don't use distributions; instead, we build everything from source locally and apply continuous rolling upgrades. We've developed our own software for everything, including the package manager, authentication, management, firewall, backup, archives, and cluster queuing software. We rely on Linux user separation for security in a multi-user environment.

All systems, from aging workstations to high-end cluster servers, run identical software. Users have access to a common file system namespace for home directories and scientific projects, etc., via autofs/nfs. This setup is unusual for these days, but it works very well. We're doing interesting stuff, and our users generally love us. :-)

[1]: https://www.molgen.mpg.de/en/it, but be aware that the data is highly obsolete.

Python, packaging, and pip—again: system/platform Python

Posted Jan 28, 2024 7:20 UTC (Sun) by amarao (guest, #87073) [Link]

Excellent overview, thank you.

There is one more case for system python with system packages, enhanced by pip.

Some Python libraries need non-Python dependencies: compilers, headers for libraries. It's much easier to install the pre-packaged system library, because that handles all the dependencies. And then there are the two or three Python libraries you need that aren't in the OS, so: pip install.

And there are two types of machines: single-purpose (we configured it; if it works, we repeat this config on many machines, no problem) and multipurpose (e.g. desktop).

The second will be burned badly by sudo pip install; the first will either not work (fixable) or will work just fine.

The same goes for containers. If I want a container with Ansible, sudo pip install is the way.

Python, packaging, and pip—again: system/platform Python

Posted Jan 30, 2024 16:24 UTC (Tue) by smurf (subscriber, #17840) [Link] (3 responses)

Meh. What's the problem? The System Python is /usr/bin/python3 and friends; if you need your own environment (because the system doesn't package something that you need, or with the wrong version), use a venv. If you want to live dangerously anyway, install to $HOME/local/share (user) or /usr/local (root).

All of this already works. So what exactly is the problem?

Python, packaging, and pip—again: system/platform Python

Posted Jan 30, 2024 17:32 UTC (Tue) by geofft (subscriber, #59789) [Link] (2 responses)

The problem is that (until PEP 668 was implemented) very few users knew that the latter option was "live dangerously" as opposed to "the normal thing you do when you're using Python." Almost every package's home page says "Foobar is available on PyPI! Install it by running 'pip install foobar'." So you do that, and it gets you a permission error, and you know the normal way to solve those when installing software is to put 'sudo' in front. At no point in this process do you even learn about the concept of a venv.

In any case, it seems pretty un-Pythonic for there to be a "live dangerously" option at all ("There should be one, and preferably only one, obvious way to do it," "If the implementation is hard to explain, it's a bad idea," etc.). There is a perfectly good safe option: everything, including each system application and each user development environment, has its own environment. What's the fundamental downside of this approach (apart from non-fundamental downsides like disk space usage or getting from here to there)? Let's get rid of the term "virtual" environment while we're at it - what's the need to have a systemwide environment at all? Is there anything that you can do with a systemwide environment that you cannot do with per-project environments?

Python, packaging, and pip—again: system/platform Python

Posted Feb 9, 2024 9:34 UTC (Fri) by sammythesnake (guest, #17693) [Link] (1 responses)

What I yearn for is something like virtualenv, with integration between pip and the distro's package manager to install suitable distro packages (where available) system-wide, plus something semantically similar to symlinks within individual virtualenvs.

You'd need much better support for having multiple versions of a package installed simultaneously than most distros currently have (generally they only support it in specific cases where there are separate foo-1 and foo-2 packages, and that isn't scalable), and some story for keeping track of which versions can be removed because they're no longer needed and which need to be kept in place...

Ideally, such a system would have a standardised interface between distros and work for much much more than just python, of course.

I've thought quite a lot about ways to do something like this using things like containers/bind mounts etc. but it's... intricate :-/

Python, packaging, and pip—again: system/platform Python

Posted Feb 9, 2024 9:46 UTC (Fri) by Wol (subscriber, #4433) [Link]

When the LSB was in its early days, I thought it was all arse-about-face, and tried to get it to do a u-turn but failed. And because I failed, I think that's why the LSB failed too and is pretty irrelevant now... It sounds like you want the same thing.

Okay, it did achieve something, but the whole focus was on telling applications what the distro was providing. WHO CARES! If the distro doesn't provide what you need, and you can't bundle it (which is pretty much true of EVERY luser - ie most users), then you're stuffed.

What you need is a mechanism for the app to call the distro package manager and say "please provide" !!!

I'm not saying this would be easy - it won't! - but it'd be a damn sight more useful for 99% of the user base than what LSB provides the other way round!

Cheers,
Wol

Python, packaging, and pip—again

Posted Jan 25, 2024 5:32 UTC (Thu) by sroracle (subscriber, #124960) [Link] (16 responses)

It's incredible to me that this problem, which is now more than a decade old with extremely little material progress, is still being disowned by the leadership of the Python ecosystem. There is no accountability for the total lack of ownership of this problem. It seems like an endless game of pointing fingers at one another and complaining that consensus is impossible.

At some point, some prominent Python developer is going to have to stand up and take responsibility for crafting an official solution (or working towards sanctioning one of the many existing ones). And in order to get to that point, the community needs to recognize that perfect is the enemy of good. Total consensus will not be achieved. The community just needs a blessed path forward that caters to the common case rather than to the whim of every single Python developer, packager, and user on the planet. The latter is of course impossible. But in these discussions it routinely appears to be the opinion of senior Python community members that it is necessary.

Python, packaging, and pip—again

Posted Jan 25, 2024 10:06 UTC (Thu) by Wol (subscriber, #4433) [Link] (4 responses)

Where's Guido when you need him?

You're spot on, you need someone who (a) has the trust of the community, and (b) has the balls to say "This is what we're doing, naysayers be damned!". This is pretty much the definition of leadership - going ahead with the reasonable expectation that most people will follow, regardless of their own feelings, because they trust you.

Cheers,
Wol

Python, packaging, and pip—again

Posted Jan 25, 2024 13:50 UTC (Thu) by aragilar (subscriber, #122569) [Link] (3 responses)

There is a BDFL for packaging (and one for PyPI). They are the ones who approve packaging (or PyPI/index) PEPs. It's because everyone puts pressure on them that there's the plan for the packaging council (which is modelled on the switch from Guido to the steering council).

The "build the thing and everyone will follow" *was* tried (distutils2 by the core Python/web community; bento by the scientific/data-science community). bento failed (I'm not sure when support for it was finally removed from scipy, probably somewhere between 2015 and 2018), while distutils2 (which was supposed to replace distutils in the stdlib) eventually became what we now know as setuptools, burning out at least a few maintainers and core devs along the way.

Nothing is stopping more experimentation and innovation, but you can't really have the "one true tool" if you only target (at best) a third of the Python community (which is what I've seen tools advocated as "one true tool" do).

Python, packaging, and pip—again

Posted Jan 29, 2024 9:28 UTC (Mon) by jwilk (subscriber, #63328) [Link] (2 responses)

> distutils2 (which was supposed to replace distutils in the stdlib) eventually became what we now know as setuptools

Uh? setuptools predates distutils2 by 5 years or so.

Python, packaging, and pip—again

Posted Jan 31, 2024 9:45 UTC (Wed) by aragilar (subscriber, #122569) [Link] (1 responses)

The name "setuptools" predates distutils2, but the original setuptools was forked to become distribute (there was even an infographic: http://www.fengbingji.com/wp-content/uploads/2013/10/3c98... which seems to have mostly disappeared from the internet), which sort-of became distutils2, which was then fed back into distribute and eventually just took over the name "setuptools". https://www.pypa.io/en/latest/history/ gives more details (it seems to be lacking the latest changes, and doesn't really cover the "conda" side of things, but it gives a starting point and links to the mailing lists where discussions happened).

Python, packaging, and pip—again

Posted Feb 8, 2024 12:21 UTC (Thu) by jwilk (subscriber, #63328) [Link]

> distribute […] sort-of became distutils2, which was then fed back into distribute

I've seen no evidence of this happening.

Python, packaging, and pip—again

Posted Jan 25, 2024 13:28 UTC (Thu) by aragilar (subscriber, #122569) [Link] (5 responses)

What do you think is the common case?

The challenge for packaging is there *isn't* one. The "Python Developers Survey 2022" (https://lp.jetbrains.com/python-developers-survey-2022/ – done by JetBrains, though advertised on PyPI, so probably the best data we have, but is likely biased in ways we do not know) has under "Purposes for Using Python" a number of graphs which demonstrate that there are instead multiple divergent use-cases, all of which packaging has to support (these aren't tiny niches, these are major parts of the ecosystem, and why Python is the way it is). This fact isn't some new revelation, it's been the norm for at least the past 15 years, and it's practically an internet law now that those suggesting "Python packaging should just support the common use case" are really saying "my use case is the only valid one, and Python packaging should only support that" (someone should come up with a pithy name for it).

As for material progress over the last 10 years, pulling from https://www.pypa.io/en/latest/history/, we had:
* the (2018) switch from the old cheeseshop to the new warehouse codebase (which cleaned up a significant amount of legacy and enabled the use of a CDN, making downloads as well as the rest of the infrastructure significantly more reliable—no more random download failures).
* the introduction of wheels (2013), with further clarifications of platform-specific wheels coming in the years after (e.g. manylinux)—especially on Windows, this made installation of the scientific stack (and any other projects with binary extensions) significantly easier.
* PEP 517 and 518 (2016/2017), which meant you no longer had to interface with the whole of distutils; instead, a clear and straightforward API could be used to build projects, allowing the wider use of Rust, or being able to defer to cmake or meson for building C/C++/Fortran libraries.

Python, packaging, and pip—again

Posted Jan 25, 2024 19:54 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (4 responses)

There are, to a gross approximation, two common cases:

* People who need conda.
* People who do not.

This leaves us with, broadly, two options:

1. Ignore conda, and solve for the remaining use cases. Broadly speaking, this means "install wheels, manage dependencies between wheels, don't touch anything that is not a wheel, don't add new features to wheels beyond basic dependency-management metadata and the like." People who need conda will continue to use it and PyPA will continue to ignore them.
2. Move everybody to conda, and people who don't need all of its features can just ignore the parts they don't need. This would require conda to be willing to take on the role of standard Python package manager, and I'm not sure if they are willing or reasonably able to do that. But it does have the advantage of one package manager for everybody.

Yes, there are like a hundred corner cases where neither of these options is ideal. But either one of them would be better than the status quo, and there would be a clear path toward improving those corner cases (or folding them into the common case).

Python, packaging, and pip—again

Posted Jan 25, 2024 23:06 UTC (Thu) by intelfx (subscriber, #130118) [Link] (1 responses)

> Move everybody to conda

Please, no.

Conda is an abomination.

Python, packaging, and pip—again

Posted Jan 26, 2024 0:29 UTC (Fri) by cytochrome (subscriber, #58718) [Link]

Very insightful and compelling. Geez.

I, myself, have found conda to be extremely useful in my scientific programming work. The ability to install packages from various channels, create specified reproducible virtual environments, and transfer these environment specifications via YAML files to other users has worked quite well.
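For those unfamiliar with the workflow, a conda environment file of the kind described might look like this (package names and versions are illustrative):

```yaml
# environment.yml: a reproducible environment spec that another user
# can recreate with `conda env create -f environment.yml`
name: sci-work
channels:
  - conda-forge
dependencies:
  - python=3.11
  - numpy
  - scipy
  - pip
  - pip:
      - some-pypi-only-package   # hypothetical PyPI-only dependency
```

Note that the spec can mix conda packages (including non-Python ones like compilers or BLAS) with pip-installed packages, which is a large part of conda's appeal for scientific work.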

Python, packaging, and pip—again

Posted Jan 26, 2024 19:41 UTC (Fri) by kmccarty (subscriber, #12085) [Link] (1 responses)

> Move everybody to conda

This is a non-starter for many, ever since Anaconda changed its license to be no longer free for large-scale commercial use, and given that the default configuration of Conda includes Anaconda packages. See e.g. https://stackoverflow.com/q/74762863

While it might be possible for commercial users to use some Conda repositories without licensing issues, I know of at least one very large company in the energy sector that has expressly forbidden Conda usage internally as a result of this change.

Python, packaging, and pip—again

Posted Jan 26, 2024 21:45 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

That would certainly be a valid rationale for choosing option 1 instead of option 2. The question is whether the Conda people are committed to their non-free licensing, or if they can somehow be reasoned with. Technically, the actual code is FOSS, so in theory I suppose you could spin up a PSF-owned FOSS repository (with no usage restrictions or other licensing problems), but if it results in splitting the community in half, it's probably a bad idea.

Python, packaging, and pip—again

Posted Jan 25, 2024 22:07 UTC (Thu) by da4089 (subscriber, #1195) [Link]

I agree.

I believe that this is ultimately an existential issue for Python. Choosing to ignore it is unsustainable.

Personally, I think extending (or "fixing") pip is the right solution today. It has the history and the mindshare. Changing to something else is just going to cause more pain. Sometime back in the past, building this into the python executable would have been better: python --install <package> would have made sense, but there's no point changing now.

Python, packaging, and pip—again

Posted Jan 25, 2024 22:53 UTC (Thu) by geofft (subscriber, #59789) [Link] (3 responses)

Maybe I'm just not reading things well today, but can you clarify what you mean by "this problem"? Do you mean the need for a project management tool (not a package manager - pip is already there) within the stock Python distribution?

As an end user, I'm not sure this problem has ever quite occurred to me in those terms. Outside of my day job, which has a lot of custom tooling for project and dependency management, I've never written anything that is more complicated than "pip install -r requirements.txt" would be suitable for. I'm totally willing to believe that it is a problem, but I think step one is getting people to believe that this really is a problem that needs solving and that we should see the requirements.txt file convention as a partial solution.

But since you mention people pointing fingers at each other, and I don't think I've really seen any discussion of this problem at all, I'm wondering if you mean something else.

Python, packaging, and pip—again

Posted Jan 26, 2024 10:50 UTC (Fri) by cyperpunks (subscriber, #39406) [Link] (2 responses)

Try to install pytorch from source, start here.

https://github.com/pytorch/pytorch

Python, packaging, and pip—again

Posted Jan 26, 2024 13:07 UTC (Fri) by aragilar (subscriber, #122569) [Link]

pip install . (which will give a CPU-only build) should work for pytorch (I tried this, then got bored of waiting for the vendored C++ deps to build, but the readme says this should work), but naturally it would be better to use a build with GPU support (which is naturally going to require installing a build system for compiling to GPU).

This isn't the fault of Python packaging (unless you expect it to handle all languages, and even zig can't do that yet ;) ). If the package is well designed (I'm not sure how easy it is to avoid the vendored C++ libraries in pytorch and reuse existing installed versions), then it should be simple to have a tool that maps the metadata from Python packaging to that needed by your general-purpose distro/package manager (whether it's conda, a Linux one, nix, homebrew, etc.), with some additional metadata added to cover things that are not currently exposed (e.g. which versions of a C library are needed). There are efforts to fix that as well (see the in-progress PEP 725, for example).
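PEP 725, still a draft at the time of writing, proposes an [external] table in pyproject.toml for exactly this kind of non-Python metadata. A sketch based on the draft (the syntax may well change before acceptance):

```toml
# Draft PEP 725 external-dependency metadata (illustrative only;
# the PEP is in progress and its syntax is not final):
[external]
build-requires = [
  "virtual:compiler/c",        # some C compiler is needed at build time
]
host-requires = [
  "pkg:generic/openssl",       # link against a system-provided OpenSSL
]
```

The idea is that a distro or general-purpose package manager could read this table and map the identifiers to its own package names, rather than every packager rediscovering the dependencies by trial and error.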

Python, packaging, and pip—again

Posted Jan 26, 2024 14:18 UTC (Fri) by geofft (subscriber, #59789) [Link]

I build PyTorch from source regularly, linked against dependencies that have also been built from source. It's a breeze compared to building TensorFlow from source in the same way. :)

But also, I don't think this is what the article is about, at all. I certainly agree there are lots of things about compiling software that could be made better. (Very little of what's hard about building either PyTorch or TensorFlow has much to do with Python itself.) But this article isn't "there are things that are hard about compiling software," it's talking about a specific question of pip's scope and suitability in the default Python installation.

(Unless you're proposing that the answer is pip should grow so big as to be better than both CMake and Bazel at compiling large C++/CUDA projects, and all of that should be in the default Python distribution?)

Python, packaging, and pip—again

Posted Jan 25, 2024 19:11 UTC (Thu) by cyperpunks (subscriber, #39406) [Link] (1 responses)

Building and shipping Python modules that are not pure Python is basically the same problem as building and shipping shared libraries.

That is one of the most complex topics in computing, as it's done differently on every modern platform.

Linux, macOS, and Windows each treat shared libraries in special ways and follow specific rules.

The only tool that has a fairly good understanding of all these technical details is CMake, hence the sane way forward for Python is to integrate the wanted $tool with CMake. Any other approach will end up as a low-quality re-implementation of CMake anyway.

Python, packaging, and pip—again

Posted Jan 26, 2024 8:25 UTC (Fri) by aragilar (subscriber, #122569) [Link]

This exists for CMake (scikit-build) and meson (meson-python, which is used by numpy and scipy).
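Concretely, opting into either backend is just a matter of the [build-system] table; for example, with meson-python (version constraints omitted for brevity):

```toml
# Build via meson-python, the PEP 517 backend used by numpy and scipy
[build-system]
build-backend = "mesonpy"
requires = ["meson-python"]

# ...or, for CMake-based projects, via scikit-build-core:
# [build-system]
# build-backend = "scikit_build_core.build"
# requires = ["scikit-build-core"]
```

In both cases `pip install .` then drives the native build system through the standard PEP 517 hooks, so pip itself never needs to understand CMake or Meson.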

Python, packaging, and pip—again

Posted Jan 25, 2024 20:08 UTC (Thu) by ringerc (subscriber, #3071) [Link]

I've actually given up on using Python for most of my personal work because I spend so much time fighting pip, pipx, pipenv, setup.py, setup.cfg, pyproject.toml, Pipfile, and other WTFery that it's just not worth it. Especially for dev dependencies.

There is no clear big picture "right way" that actually works. Everyone is left to stumble in the dark, navigating a bunch of mutually semi-incompatible yet semi-interdependent tools and config files. Some of them are sort of deprecated but don't say so anywhere in their own docs. Others are incomplete. Few refer to the others and how to make them play nice together. Have fun using pipx for installs + pipenv for dev including dev dependencies.

Different subsets of runtimes and tools support different subsets of other tools. Few say so. Writing an Azure function? Will it understand pyproject.toml? Who knows‽

It's a massive drag and it has become much worse over time.

Python, packaging, and pip—again

Posted Jan 27, 2024 4:59 UTC (Sat) by pj (subscriber, #4506) [Link]

For better or worse, every 5 years or so I have to go re-choose a python packaging method.

It used to be: use setuptools and write a setup.py.

Now it's: write a pyproject.toml and use pip-tools to compile that into a requirements.txt. Oh, and make an optional 'dev' dependency that includes twine so I can build a wheel for pypi.

This scheme makes it so that pyproject.toml has the package names that get used, and requirements.txt has the package _versions_ (and their transitive dependencies) that I've tested against.
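The split described here, loose names in pyproject.toml with pinned versions compiled into requirements.txt, might look like the following (project and package names are illustrative):

```toml
# pyproject.toml: names only. Pin versions with pip-tools, e.g.
#   pip-compile pyproject.toml --output-file requirements.txt
# then test and deploy against the pinned set.
[project]
name = "myproject"
version = "0.1"
dependencies = ["requests", "click"]

[project.optional-dependencies]
dev = ["twine", "build"]   # tools for building and uploading wheels
```

Running pip-compile again later refreshes the pins (and transitive dependencies) without touching the human-edited file.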

All that said: protocols over specific implementations. Don't build another tool, instead define the protocol packaging tools use to talk to repositories, and make pip the pilot implementation.

Python, packaging, and pip—again

Posted Jan 27, 2024 23:41 UTC (Sat) by smitty_one_each (subscriber, #28989) [Link]

I submit that the least-worst way ahead might be a kickstarter to pay someone at the PSF level to whittle the whole pipeline problem down to the minimal Venn diagram of well-tested, blessed and supported tool(s).

Python, packaging, and pip—again

Posted Jan 28, 2024 7:11 UTC (Sun) by amarao (guest, #87073) [Link]

The lack of a proper dependency manager within Python is a nail in the coffin of the "batteries included" slogan.

Python, packaging, and pip—again

Posted Feb 1, 2024 12:05 UTC (Thu) by callegar (guest, #16148) [Link]

It would be great if virtual envs had some kind of deduplication among them. If I am not mistaken, conda provides some sort of deduplication among its environments?


Copyright © 2024, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds