
Users and Python packaging

By Jake Edge
February 8, 2023

Python packaging

A lot of digital ink has been expended in recounting the ongoing Python packaging saga, which is now in its fourth installment (earlier articles: landscape survey, visions and unification, and pip-conda convergence). Most of that covered conversations that took place in November and the discussion largely settled down over the holidays, but it picked up again with a packaging-strategy thread that started in early January. That thread was based on the results of a user survey about packaging that was meant to help guide the Python Packaging Authority (PyPA) and other interested developers, but the guidance provided was somewhat ambiguous—leading to lots more discussion.

Survey

The packaging survey was analyzed by Shamika Mohanan, who summarized the results in late November. Tzu-ping Chung noted that there seemed to be conflicting answers to two of the questions asked:

People seem to generally disagree with the statement "Python packaging deals well with edge cases and/or unique project requirements", but also feels that "Supporting a wider range of use cases" is the least impactful thing for the PSF [Python Software Foundation] and PyPA to focus on. I kind of sense there's some interesting thinking going on but can't quite identify what exactly.

Brett Cannon suggested a cause, perhaps somewhat cynically: "People want their problem solved, but don't care about others and their problems." Paul Moore had a different take, though; he thought that respondents wanted the packaging community to "continue focusing on the majority cases" and not spend time on supporting edge cases. "To me, that fits with the desire for an 'official' workflow and associated tool(s) - the edge cases would be outside the 'official' workflow, and hence not a priority." Somewhat in keeping with Cannon's comment, however, Moore said that the pip developers have found that "people really don't take kindly to being told their use case is unusual or unsupported". Dustin Ingram thought that by catering too much to the edge cases, everything suffers: it "leads to both the majority cases and edge cases being poorly supported".

At the end of December, Ralf Gommers announced the pypackaging-native site that is meant to collect "the key problems scientific, data science, ML/AI [machine learning/artificial intelligence] and other native-code-using projects & authors have with PyPI, wheels and Python packaging". But on the site he purposely does not offer "more than hints at solutions", since it is meant to foster discussion, larger projects, and Python Enhancement Proposals (PEPs). The site came up regularly in the strategy discussion as a fairly neutral recounting of the trials and tribulations associated with packaging, especially for scientific-Python projects (like NumPy and SciPy) where there are native dependencies, such as for compilers, build tools, and specialized libraries (e.g. for CUDA).

Strategy

In the strategy thread, Mohanan began by noting that several survey respondents had explicitly called for a single tool that would unify the various disparate efforts and eliminate the fragmentation. "In a nutshell, there are too many tools and users are not sure which ones to use." She asked if there is a path toward some kind of unification, and what it would entail.

It is not simply a matter of tools, Filipe Laíns said: "just unifying tools won't do any good, we need to do a full UX [user experience] analysis and then design something that fills the needs of users more naturally". pip already tries to be the single-tool solution, but it suffers from a need to continue supporting all of its historic use cases, which makes it hard to change. In addition, pip and other packaging tools all suffer from a lack of maintainer time.

Moore readily agreed, but noted that for some unified tool to be successful, "it will be at least as important to decide what workflows we won't support, as what we will". The lesson learned from pip is that trying to support every workflow just ends up in a mess, he said. But one problem is that respondents to the survey have likely been making an "implicit assumption that their preferred workflow will be supported" in whatever unified tool comes about. Ultimately, that is where the tension will lie.

But Moore is not so sure that a single, unified tool is worth the enormous amount of effort it would take to get there; it would be a substantial improvement for users, but it would take a lot of resources that might be better used in other ways. "Even as a funded project, with its own hired resources, it would consume a big chunk of attention from the packaging community [...]". He thinks there may be a better way:

One alternative that I think we should consider is continuing to work on the goal of splitting out the various "components" of packaging into reusable libraries. (In particular, I'd like to see an ecosystem where "pip is the only viable approach to installing packages" is no longer the self-evident truth that it is today.) Projects like installer, build, packaging and resolvelib are good examples of this. Using those libraries, and more like them, it will be a lot easier to build new workflow tools, any one of which has the potential to become the unified solution people want. It's definitely going to result in more complexity on the way to something simpler, and I'm pretty sure it's not what the survey respondents were imagining, but I feel that it might be a better trade off for the long term.

Cannon said that in his work on Python in Visual Studio Code (VS Code), he also encounters the problem of everyone expecting that their workflow will be supported. "People often don't have exposure to other workflows, so they innately think their workflow is 'normal', so if something doesn't work for them then it obviously is broken for everyone, right?" Changing to a more common workflow is not something that some people will be willing to do.

He agreed with Moore's suggested approach of reusable libraries for packaging; "If we can get everything backed by standards and then make using those standards easy by making sure they have backing libraries, then we can make it easier for people to experiment a bit as to what the proper UX should be for any front-end packaging tool." He also noted that the complexity in Python packaging is a double-edged sword: "While people bemoan Python's packaging story as being too complex, [its] flexibility is what helped it become the glue language of the programming world."
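As a rough illustration of what "standards with backing libraries" can look like in practice, the following minimal sketch uses only the standard library's importlib.metadata module, which reads the standardized metadata that installers record for each installed distribution. The helper name installed_requirements is hypothetical and does not come from the discussion; this is a sketch of the idea, not a description of how any PyPA tool is built.

```python
from importlib import metadata


def installed_requirements():
    """Map each installed distribution name to (version, declared requirements)."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if not name:
            # Skip distributions whose metadata is missing or damaged.
            continue
        version = dist.metadata.get("Version")
        # dist.requires is None when no requirements are declared.
        result[name] = (version, list(dist.requires or []))
    return result


if __name__ == "__main__":
    for name, (version, requires) in sorted(installed_requirements().items()):
        print(f"{name} {version}: {len(requires)} declared requirement(s)")
```

Because the metadata format is standardized, a front-end tool could presumably build on a function like this without caring which installer put the packages there.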

Conda

The elephant in the room, though, is the scientific-Python world and the conda ecosystem that goes along with it. H. Vetinari worried that it was not really being considered in the discussion. "As long as the ecosystem currently being served by conda cannot be folded back into 'the One True Way', we have not actually solved the schisms in python packaging (i.e. everyone can just use the same tool)." From Vetinari's perspective, either the monumental task of solving most of the problems described at pypackaging-native needs to happen in the unified tool, or to "define large parts of the data science ecosystem as out of scope". Neither of those is particularly palatable, but the PyPA has generally considered the needs of the scientific-Python world as being out of scope for its tools because of how much work it would take to handle the non-Python dependencies, Vetinari said.

But Moore made a distinction between "use cases" and "workflows"; he is adamantly opposed to labeling any particular use case as "out of scope". "I do expect that we may need to ask users to do things in certain ways in order to address their use cases [...]". Beyond that, though, conda has to install its own Python build, which means that anyone who wants to get, say, NumPy through conda, has to suddenly switch away from the Python they are already using; pip and others are meant to work with any (and every) Python installation:

The PyPA focus is on supporting the "standard" builds of Python (python.org, Linux distro builds, Windows Store, self-built interpreters, Homebrew, ...) Solutions that require the user to switch to a different Python build don't fit that remit. I don't think that "declare all of those Python builds as out of scope" has any more chance of being acceptable to the SC [steering council] than "declare a big chunk of the user base" does.

Vetinari suggested that looking at the problem differently might help: "For brainstorming about those solutions, I'd really like us not to think in terms of 'python.org installers' or 'Windows store builds' or 'conda', but in terms of anything that can satisfy the same relevant set of requirements." Moore agreed with that, but noted there is some amount of tension on the boundary between the PyPA and the core language (in the form of the steering council):

But I think that "how do people get Python" is part of the question about a unified solution. In much the same way that the "rust experience" isn't just cargo, it's also rustup. But just as we can't assume the SC would be OK with ignoring a chunk of the user base, I don't think we can assume the SC will accept dropping those ways of getting Python. We can ask, but we can't assume.

This is where the boundary between packaging (the PyPA) and the SC blurs. I personally think that the SC's "hands off" approach to packaging puts us in a bad place as soon as we get close to areas where the SC does have authority. Distutils was dropped from the stdlib, and the packaging community had to pick up the slack. We have to address binary compatibility, but we don't have control over the distribution channels for the interpreter. We provide tools for managing virtual environments, but we don't control the venv mechanism. We install libraries, but we don't control the way import hooks work. Etc.

Vetinari "fully agreed" that solving all of the problems will eventually require support at the language level, thus steering council involvement.

Unification

The discussion cue mentions "unification", Pradyun Gedam said, but everyone seems to have a different idea of exactly what would be unified. He came up with a list of six different aspects of the problem space that might be amenable to unification:

  1. Unification of PyPI/conda models (i.e. the non-Python code dependency problem).
  2. Unification of the consumer-facing tooling (i.e. consuming libraries).
  3. Unification of the publisher-facing tooling (i.e. publishing libraries).
  4. Unification of the workflow setups/tooling (i.e. organising files, running tests, linters, etc).
  5. Unification/Consistency in the deployment processes (i.e. going from source code -> working application somewhere).
  6. Unification/Consistency in `python` usage experience. (i.e. the rustup/pyenv aspects of this, which is absolutely a thing that affects users' "Python Packaging" experience (think pip != python -m pip, or python -m pip vs py -m pip, or python being on PATH but not pip etc))
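
The last item's "pip != python -m pip" point can be sketched in a few lines. The example below, which uses only the standard library, compares the running interpreter with whatever pip script happens to be first on PATH, and builds a pip invocation that is tied to one interpreter. The function names pip_command and describe_setup are hypothetical, chosen for illustration only.

```python
import shutil
import sys


def pip_command(interpreter=None):
    """Build a pip invocation tied to one specific interpreter."""
    return [interpreter or sys.executable, "-m", "pip"]


def describe_setup():
    """Report the running interpreter and which `pip`, if any, is first on PATH."""
    return {
        "interpreter": sys.executable,
        "pip_on_path": shutil.which("pip"),  # None when no pip script is on PATH
        "unambiguous_pip": pip_command(),
    }


if __name__ == "__main__":
    for key, value in describe_setup().items():
        print(f"{key}: {value}")
```

If the pip found on PATH lives under a different installation than the interpreter, a bare "pip install" would likely install into an environment other than the one "python" runs from, which is the mismatch the list item alludes to.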

That list, which others in the thread added on to, gives a nice, quick overview of where the complexity lies. These problems affect multiple users, including Python application users and developers, Python Package Index (PyPI) module developers, system administrators and operators, and so on. Each of those users has their own set of constraints (e.g. operating system, architecture), habits, and prejudices, but they are trying to share various global resources in a way that makes sense for them. It is perhaps not surprising that there is no unified solution that covers everyone. In fact it is hard to see that all of the things in that list can be fully united—at least in any kind of near-term time frame.

There is more to come on this Python-packaging journey for sure. Participants have been nothing if not verbose, with many good points being raised, as well as ideas for how to get where most everyone seems to want to go: a unified user experience for Python package installation. Something that just works for "everyone" and that every project can point to as its means of installation. Further unification could come later—if it ever comes about at all. Stay tuned ...





Users and Python packaging

Posted Feb 9, 2023 0:52 UTC (Thu) by jhoblitt (subscriber, #77733) [Link] (4 responses)

After switching to conda... It is hard to imagine ever going back. It is amazing how poor the tooling is in the python world compared to other languages with much smaller user bases.

Users and Python packaging

Posted Feb 9, 2023 14:48 UTC (Thu) by auc (subscriber, #45914) [Link]

The Elephant in the room, yes ... most Python core devs don't appear to see it. Looks too much like the Postgres avatar ? :)

Users and Python packaging

Posted Feb 9, 2023 15:40 UTC (Thu) by southey (guest, #9466) [Link] (1 responses)

All I can say is great that it works for you. I have yet to see that it helps me in any way and it continues to be the worst solution ever every time I have to deal with it. Just last week a student using conda with R couldn't install a package because the R version in the conda environment was old. Took ages to realize that because the Linux distro had installed the latest version. Sure that is rare, but it should have never happened.

One of the reasons for using a distro is that versions are updated with respect to the various libraries provided. That does not happen with conda so you are left with maintaining each Python package across each conda environment used. I just don't know how people can use it while keeping packages up to date especially across multiple computers. This is particularly an issue with new features and other changes that come with new versions of Python and packages. Code that works in one environment fails in another resulting in lost time troubleshooting and solving it. The worst aspect is when it was the developer that used it as that has larger concerns (was a major issue with packages that needed to support multiple versions of Python 2 and 3).

I have yet to find any language that has better tooling than Python as they are all virtually the same. Newer languages may appear to be better (whatever that means to you) as the developers have usually learnt from experiences with other languages. However, that quickly disappears once you get a reasonable developer and user base.

(Yes, I am a sample size of one and that is 100% of the user cases.)

Users and Python packaging

Posted Feb 9, 2023 16:20 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link]

> I have yet to find any language that has better tooling than Python as they are all virtually the same.

I had the impression that ruby gems were a rounder wheel.

Users and Python packaging

Posted Feb 16, 2023 12:42 UTC (Thu) by callegar (guest, #16148) [Link]

Conda has some very nice ideas and at the same time some aspects that appear simply too rough by today's standards. No antialiasing in some applications, for instance, is something that really delivers a bad impression. A bootstrapping problem, I understand: https://github.com/ContinuumIO/anaconda-issues/issues/6833

Users and Python packaging

Posted Feb 9, 2023 7:51 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (13 responses)

It is my personal opinion that they are massively overthinking this.

Users don't *care* about majority cases vs. edge cases. In fact, users don't want to think about "use cases" at all, if they can help it. Users want a tool that Just Works with a minimum of fuss.[1] If such a tool had existed 20 years ago, then everything would be fine by now, but you can't get there from here. The *lack* of such a tool has resulted in the development of an elaborate ecosystem[2] of workarounds and partial solutions. Projects and workflows have grown up around them. An entire generation of Python programmers have gotten used to the idea that Python Packaging Is Hard, and they've learned to make do with what was available. This has resulted in users developing opinions about How Packaging Ought To Work, which is frankly a bad thing. In just about every language that does packaging right, the answer to "How does packaging work?" is, as far as the average user is concerned, "It works very well, thank you for asking." Python is nowhere near that ideal.

The only way I can imagine the situation improving is if PyPA does a complete about-face on their current approach:

0. Stop telling everyone that you're unofficial. Stop saying that packaging.python.org is "just" documenting well-known solutions. If you have to persuade the SC to give you more authority, or if you have to start writing PEPs for 80% of everything you do, that's just the price of admission.
1. Pick a tool. For the sake of argument, I'll assume that tool is Pip, but you could just as easily pick pipenv, or Poetry, or some other thing, but pick *something.*
2. Take the Unicode approach: Pip (or whatever tool you selected) must make a reasonable attempt to support all use cases. If there is a tool that does X, and X is related to Python packaging, then you accept any pull requests that would add X to Pip, unless they're broken or otherwise problematic. This includes copying code from other projects, if the licenses are compatible. (Do not pick a tool unless its maintainer is willing to commit to this.)
2½. To avoid making the UX any worse than it already is, a limited amount of bikeshedding will likely need to happen on those pull requests, but such bikeshedding should not last for longer than a week or two at most. Develop a voting system or steal Debian's, if needs be, just don't get bogged down in nonsense.
3. Once Pip (or whatever) supports all use cases of tool Y, then tool Y stops being recommended on packaging.python.org. Maybe move it to a different page or something if you don't want to hurt the maintainer's feelings.
4. If Pip is selected: Make a serious effort to improve Pip's default behavior so that it better comports with user expectations. Maybe this means that Pip needs to change names to avoid breaking backcompat, or maybe that means you need to introduce a new tool. Point is, the One Tool To Rule Them All must not install into system site-packages by default. Years of experience has shown that basically nobody wants that, except perhaps on Windows, and maybe not even there.
5. If you identify a use case that is really hard for Pip (or whatever) to support, then you write a PEP laying out what users are supposed to do instead of using Pip, and explaining why it is out of scope for Pip. Such a PEP should be well-justified, and should not describe a workflow that is significantly more painful than whatever users are doing today. It should preferably not require the use of other tools, except to the extent that those tools are compatible with Pip. If possible, try to work with the maintainer of the other tool to enable this compatibility, rather than declaring the whole thing out of scope.[3]
6. Workflows are not use cases. Implementation details are not use cases. If a user tells you that (for example) they want their packages installed in a very specialized way so that their non-Python software can reach inside the venv and fiddle with it, or (for example) they want to do some convoluted task "in one command" (they won't accept two or three commands), that is out of scope and they just have to live with it or hack on top of it.[4]
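
As a minimal sketch of the "must not install into system site-packages by default" idea from point 4, a tool could compare sys.prefix with sys.base_prefix, which differ inside an environment created by the standard venv module. The function names below are hypothetical, and the check is an assumption about how such a guard might look rather than how pip behaves by default; it also would not recognize a conda environment as isolated.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter prefix differs from the base installation."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


def check_install_target(allow_system=False):
    """Refuse, by default, to target the base interpreter's site-packages."""
    if in_virtual_environment() or allow_system:
        return "ok"
    return "refused: not in a virtual environment"


if __name__ == "__main__":
    print(check_install_target())
```

pip already offers an opt-in behavior along these lines through its --require-virtualenv option; the proposal in point 4 amounts to making something like it the default.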

This would be a long, slow, painful, and toilsome road for the PyPA to go down. But that's the hole that Python has dug for itself. The first step is to put down the shovel.

[1]: https://xkcd.com/353/
[2]: https://xkcd.com/1987/
[3]: "Other tool" is a phrase which here means "conda." Conda compatibility is a hard problem to solve, and I'm not expecting this to happen any time soon. It would be nice if (as a starting point) Pip and venv were capable of hosting C libraries and other non-Python software, and conda were capable of installing such things into venvs in addition to its native environments. This would require significant work on both sides, and probably a fair amount of discussion of the specific mechanisms involved.
[4]: https://xkcd.com/1172/

Users and Python packaging

Posted Feb 9, 2023 14:50 UTC (Thu) by auc (subscriber, #45914) [Link] (3 responses)

> Conda compatibility is a hard problem to solve

What do you mean by this ? Compatibility with what ?

Users and Python packaging

Posted Feb 9, 2023 17:25 UTC (Thu) by smurf (subscriber, #17840) [Link] (2 responses)

Did you ever try to install any nontrivial software whose installation instructions read "run 'conda XXX'" into a non-conda environment? When the people involved all use conda and thus the non-conda tooling is nonexistent and/or suffers from bit rot?

Good luck.

Users and Python packaging

Posted Feb 9, 2023 20:10 UTC (Thu) by smurf (subscriber, #17840) [Link] (1 responses)

NB, in my case it was CadQuery. After miniconda was done my disk was 4.5 GBytes smaller, separate Python3.10 installation included.

That amount of space might be insignificant when you're doing some scientific stuff and your data is measured in petabytes, but to me this is not what a sensible packaging solution should ever do.

Users and Python packaging

Posted Feb 11, 2023 3:08 UTC (Sat) by himi (subscriber, #340) [Link]

> That amount of space might be insignificant when you're doing some scientific stuff and your data is measured in petabytes, but to me this is not what a sensible packaging solution should ever do.

Trust me, it's far from trivial in an HPC context. The thing that people do in that context (when they're following the bouncing ball provided by the doco, and that doco points them at conda) is to do the install in their home directory, which is generally small, tightly constrained by quotas (including inode quotas), and low performance. All that random gunk that's taking up 4.5GB of space will likely hit those constraints /very/ quickly, particularly the inode quotas (which no one ever thinks about these days, since they're basically irrelevant on any modern filesystem . . . except high performance distributed filesystems like, for example, Lustre).

Just as importantly, even if they /can/ get it installed, the performance characteristics of the filesystem mean that it's going to behave in ways they're not anticipating, and the fact that it's pulled in /all/ of its own system libraries and dependencies means it's going to be unable to use the underlying hardware efficiently (particularly network/IB/etc, but also potentially GPUs and even the full capabilities of the native CPU).

These problems apply to pip as well, in different ways, and to just about anything else out there packaging python plus native code, but conda makes it particularly hard because it assumes that it's in control of /everything/ related to the python installation. It also likes to do things like adding itself to the user's shell profile, which can then break /other/ things the user wants to run (since other tools end up trying to load libraries from the conda installation rather than the system libraries or optimised versions of things provided by vendors or the local apps team).

HPC is outside the scope of most of these packaging discussions, sure, but it does provide some solid reasons why the approach that conda currently uses is probably not what we want to standardise on . . .

Users and Python packaging

Posted Feb 10, 2023 19:33 UTC (Fri) by opsec (subscriber, #119360) [Link] (8 responses)

> It is my personal opinion that they are massively overthinking this.

Funny, I was thinking they are massively simplifying by limiting it to python 8-)

Coming from the FreeBSD ports tree, I'm wondering why the packaging topic is being seen as a python specific issue. All distros need to package many projects written in one or more languages and need to cope with several dependency tracking systems (at least one for each language).

So shouldn't this whole discussion be much broader ? It makes it more complex, but we'll save a few iterations by that ?

Users and Python packaging

Posted Feb 11, 2023 3:39 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (3 responses)

No, because then you'd have to talk about Rust and Go and all the other languages that are perfectly happy with their existing packaging systems, and their immediate reaction will be "Why would we change anything? It works so well!" IMHO it's intractable anyway, because application developers and distro developers have diametrically opposed preferences in most technical matters:

* Distros want stability. App developers want the new shiny thing.
* Distros want every app to use the same version of libfoo. App developers want to use whatever version of libfoo supports their use case.
* Distros want exactly one copy of libfoo on the system. App developers do not care about this at all, as long as it works.

No amount of "discussion" is going to resolve those conflicts, and it is pointless to try and make a packaging solution that satisfies all of them at once.

Users and Python packaging

Posted Feb 11, 2023 8:41 UTC (Sat) by Wol (subscriber, #4433) [Link]

And you've completely ignored the user :-)

The reason I switched to gentoo, was I needed "recent" features in an application (lilypond) which was not available from the distro repository. (Okay, I was mad and also fancied tackling the learning curve - I didn't regret it :-)

Cheers,
Wol

Users and Python packaging

Posted Feb 11, 2023 16:06 UTC (Sat) by pizza (subscriber, #46) [Link] (1 responses)

> * Distros want stability. App developers want the new shiny thing.

I disagree with that. App developers want platform stability, except for one new shiny thing (or things) that is different for every app.

Evidence? Once an app is "working" they just keep shipping it with what it was developed against, unless forced to update to something newer by external pressure.

Users and Python packaging

Posted Feb 11, 2023 22:33 UTC (Sat) by NYKevin (subscriber, #129325) [Link]

Sure, but the point is, app developers want their app to work. Distros want the whole platform to work. This is a source of friction.

Users and Python packaging

Posted Feb 16, 2023 9:04 UTC (Thu) by callegar (guest, #16148) [Link] (3 responses)

IMHO Python is very specific wrt regular packaging that is done in a distro. Imagine having to build a distribution and packaging tools for it with the special rule that you cannot have two versions of the same item unless the package manager is capable of creating a self-contained mini-os for each variant. So, if you need say libjpeg8 for something and libjpeg62 for something else, the package manager should be able to create two separate environments where everything is duplicated but for the different libjpegs. Then you find out that you need libssl3 and libssl1.1 and the package manager would need to split out a new mini-system for libjpeg8+libssl3, one for libjpeg62+libssl3, one for libjpeg8+libssl1.1... and so on. Incidentally, this is the main reason why "pip installing at the system level" is always the wrong thing. There simply cannot be a single system python other than a carefully and painfully hand-crafted, fragile one put together to deal with the distro infrastructure's needs, and it is going to be so delicate that any attempt at touching it can immediately break it. On any system it is unavoidable to have a combinatorial explosion of "pythons", one for every possible useful/needed combination of package versions, and IMHO it is simply unrealistic to think that packaging tools born at the OS level, which view the system as a "single thing", can deal with it.
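
The arithmetic behind that explosion can be sketched with the libjpeg/libssl example: with one version chosen per package, the worst-case number of distinct environments is the product of the number of versions of each package. The names choices and environments below are hypothetical, and the sketch assumes every combination is actually needed, which is the upper bound rather than the typical case.

```python
from itertools import product

# Version choices taken from the libjpeg/libssl example in the comment.
choices = {
    "libjpeg": ["8", "62"],
    "libssl": ["3", "1.1"],
}


def environments(options):
    """Enumerate every combination of one version per package."""
    names = sorted(options)
    return [
        dict(zip(names, combo))
        for combo in product(*(options[name] for name in names))
    ]


if __name__ == "__main__":
    envs = environments(choices)
    print(len(envs))  # 2 * 2 = 4 combinations
    for env in envs:
        print(env)
```

Adding a third package with three versions would raise the bound to twelve, which is why the count grows so quickly with the size of the dependency set.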

Users and Python packaging

Posted Feb 16, 2023 15:45 UTC (Thu) by smurf (subscriber, #17840) [Link] (2 responses)

> So, if you need say libjpeg8 for something and libjpeg62 for something else,

Then you need to badger whoever still needs jpeg62 to freakin' update their code and get with the times. Or do it yourself and submit a patch. Still way cheaper (and, most likely, way more secure) than writing your own JPEG decoder.

> IMHO it is simply unrealistic to think that packaging tools born at the OS level, which view the system as a "single thing", can deal with it.

It should be. It's why distros exist, and IMHO they do a passable job for the most part.

Python (and C) cannot deal with diverging versions. There is no way for libjpeg62 to co-exist with libjpeg8 in the same program, and the same applies for versioned Python modules. As dependencies become more complex, the chance that you see colliding requirements approaches one, and the only way around that is to update your dependencies. You need to do that anyway, long-term.

Users and Python packaging

Posted Feb 16, 2023 17:25 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> There is no way for libjpeg62 to co-exist with libjpeg8 in the same program

C libraries support mangling everything. You still can't use them in the same TU, but they can co-exist at runtime with a suitable dollop of `#define jpeg_func myjpeg_func` lines in a header, moving the headers in the install tree, and changing the SONAME on the library. I wouldn't call it fun, but it is at least possible.

I've yet to see a way to make it work reliably for Python at all because the lookup names are in the importing source without a preprocessor to get in there and mix all the names around.

Users and Python packaging

Posted Feb 20, 2023 17:51 UTC (Mon) by callegar (guest, #16148) [Link]

> Python (and C) cannot deal with diverging versions. There is no way for libjpeg62 to co-exist with libjpeg8 in the same program.

Python and C "cannot deal with diverging versions" in quite different ways.

Two versions of the same library cannot generally coexist in C *in the same program*, but they can certainly coexist *in the same environment*. This is the reason why you can `apt install` libjpeg62 and libjpeg8 at the system level and have programs in the same /usr/bin use either of the two.

Two versions of the same package cannot coexist in Python at all. They cannot coexist at the environment level. If you need two programs using two versions of the same package either you play tricks at packaging time, vendoring the packages inside the programs, or you patch the programs to use the same version of the package, or the typical distribution package manager will never be able to deal with this case, because the typical distribution level package manager does not know (and probably does not want to know) about having separate environments.
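
The import-system side of this can be shown with a small, self-contained sketch. It writes two "versions" of one module into separate temporary directories, puts both on sys.path, and imports the name twice. The module name demo_pkg_for_version_clash and the function import_two_versions are hypothetical; a real installer would not even get this far, since both versions would want the same directory name in site-packages.

```python
import importlib
import os
import sys
import tempfile


def import_two_versions():
    """Put two 'versions' of one module on sys.path and import the name twice."""
    name = "demo_pkg_for_version_clash"  # hypothetical module name
    with tempfile.TemporaryDirectory() as root:
        dirs = []
        for version in ("1.0", "2.0"):
            path = os.path.join(root, f"v{version}")
            os.makedirs(path)
            with open(os.path.join(path, f"{name}.py"), "w") as fh:
                fh.write(f'__version__ = "{version}"\n')
            dirs.append(path)

        sys.path[:0] = dirs  # both versions are now visible side by side
        importlib.invalidate_caches()
        try:
            first = importlib.import_module(name)
            second = importlib.import_module(name)
            return first.__version__, second.__version__, first is second
        finally:
            # Undo the changes so the demonstration leaves no trace.
            for path in dirs:
                sys.path.remove(path)
            sys.modules.pop(name, None)


if __name__ == "__main__":
    print(import_two_versions())  # ('1.0', '1.0', True)
```

Only the first directory on sys.path wins, and the second import returns the very same module object from sys.modules, so the 2.0 copy is never reachable under that name within the environment.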

Users and Python packaging

Posted Feb 10, 2023 22:49 UTC (Fri) by poc (subscriber, #47038) [Link] (6 responses)

I don't want to install Python packages. I want to install software.

My distro already has a package installer and it installs everything I need from package repositories. Why does Python have to be different from C, Java, C++, Go, etc? (rhetorical question). Unless this is solved, the rest of the discussion is academic as far as I'm concerned.

Users and Python packaging

Posted Feb 10, 2023 22:57 UTC (Fri) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

> Why does Python have to be different from C, Java, C++, Go, etc? (rhetorical question).

I will answer it anyway. Python is not different. Other than for C and C++ (Conan exists but is more of a niche player), language-specific package managers are popular even among Linux users, because they often want the latest libraries that distributions aren't packaging, or they want a cross-platform workflow and other operating systems don't have it packaged. Go has go install built in. Maven is very popular for Java, etc.

Users and Python packaging

Posted Feb 11, 2023 16:02 UTC (Sat) by pizza (subscriber, #46) [Link] (1 responses)

You missed the point -- users care about *software* being installed -- not language-specific library packages.

You can't install a "python application" like you can a C, Java, Go, Rust, etc. application, unless you're using a distribution (or similar mechanism) that hides all of the (very) messy Python packaging situation from the user.

Users and Python packaging

Posted Feb 11, 2023 18:25 UTC (Sat) by poc (subscriber, #47038) [Link]

Exactly my point.

Users and Python packaging

Posted Feb 11, 2023 0:21 UTC (Sat) by mpr22 (subscriber, #60784) [Link]

> My distro already has a package installer and it installs everything I need from package repositories.

I'm very happy for you that you do not need any piece of software that your distro doesn't package.

Users and Python packaging

Posted Feb 11, 2023 9:07 UTC (Sat) by cyperpunks (subscriber, #39406) [Link] (1 responses)

> My distro already has a package installer and it installs everything I need from package repositories. Why does Python have to be
> different from C, Java, C++, Go, etc? (rhetorical question)

The core of the problem is that shipping Python code is both very easy and extremely complex.

For pure Python code which is coded correctly so it works on Python 2.7, 3.6, 3.7, 3.8, 3.9, 3.10 and 3.11: just download the script and run it.

For Python code that uses a large set of C/C++ libs it can be compared to maintaining a large subset of several Linux distros, which is extremely complex.

Try to ship a modern C++ lib that works on rhel7, rhel8, rhel9, ubuntu 18.04->ubuntu 22.10, Fedora 37, opensuse 15.4, alpine linux, gentoo, debian 10 and 11, now add macOS (x86 and arm) and Windows.

I would guess it's fewer than a handful of people in the industry who manage to do this.

Users and Python packaging

Posted Feb 11, 2023 21:14 UTC (Sat) by mbunkus (subscriber, #87248) [Link]

I actually do ship a C++ application that runs on all of those platforms. For certain ones I have to bundle a very small set of libraries that aren't available or are too old. My configuration system takes care of the detection & deciding which copies to use (system ones or the bundled ones). For others I have to install certain libraries from some well-known third-party applications.

The caveat is that by "ship" I actually mean "ship the source code that can be compiled on all of them". Wrt. binaries I also ship those, but those are distro+version-specific binaries compiled on each of them with an additional cross-platform binary in AppImage format (this one really runs on all of the mentioned platforms, at the cost of bundling all libraries save for libc).

As for packaging Python for a wide range of software: Borg backup provides all-in-one bundles that run on a very wide range of distributions & versions. It is basically an archive containing a Python binary & all the libraries required by the Python interpreter & all the Python modules required by Borg itself. It's unpacked into a temporary directory on the fly on each invocation. To me this seems to have all the drawbacks of an AppImage (having to bundle the world), with a sprinkling of having to do a bunch of I/O each time it's called, making it… super bad. Or something.

With today's distros, we application developers only have a very limited number of ways to distribute stuff:

1. Offer cross-distro/cross-version compatible binaries that bundle the world
2. Offer one binary for each (distro, version) pair we want to support
3. Only distribute source code & let others do 1 or 2

That's the reality. Python isn't different. All languages face the same problems here, and the solutions all boil down to one of the points above for any non-trivial piece of software:

• node.js applications? install the world, this is basically bundling and therefore 1
• letting packages get packaged by distros themselves? this is basically option 3 chosen by the developer, and the distro packages then use option 2
• commercial software coming as installable shell scripts? bundle the world again (e.g. Graylog coming as a Debian package that contains its own Java interpreter & all the assorted libraries; it's really funny) → option 1
• Python applications? either come along as software & let the user deal with it (option 3) or tell you exactly how to run a venv (which is basically bundling the world) or offer you a bundled Python interpreter as Borg does (which really is bundling the world)
• all those programs telling you to just run a Docker container? they're obviously bundling the world _including the C library_

Users and Python packaging

Posted Feb 14, 2023 14:17 UTC (Tue) by peter-b (guest, #66996) [Link]

Just a note to people following along: "H. Vetinari" is an obvious pseudonym, referencing the character of "Havelock Vetinari" from Terry Pratchett's "Discworld" novels. It's a fancy alternative to "J. Bloggs" or "A.N. Other".

Users and Python packaging

Posted Mar 7, 2023 7:43 UTC (Tue) by smammy (subscriber, #120874) [Link]

Aside from the comparatively “easy” problem of pip defaulting to system-level installation, my sense is that the most widespread issue is with packages that require external tools to build, usually because they need to compile some C code or want to link to a non-Python library.

I'm trying to think of whether there's a language package manager that does significantly better than Python on this and I'm drawing a blank. With most languages I can think of, it still comes down to, “oh yeah, you have to install such-and-such compiler and this and that library and these header files and make sure pkg-config knows about it,” and so on.

Does anyone know of a language package manager that does a particularly good job with this stuff?
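To make the "install such-and-such compiler and make sure pkg-config knows about it" step concrete, here is a rough sketch of a preflight check a build script might run before compiling. The helper names and the default tool list are hypothetical; the only external command it relies on is `pkg-config --exists`, which exits with status 0 when the named library is known.

```python
import shutil
import subprocess
from typing import Optional


def missing_build_tools(tools=("cc", "pkg-config")):
    """Return the names from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def pkg_config_knows(library: str) -> Optional[bool]:
    """Ask pkg-config whether `library` is installed.

    Returns None when pkg-config itself is unavailable, so the caller can
    tell "library missing" apart from "cannot check".
    """
    if shutil.which("pkg-config") is None:
        return None
    completed = subprocess.run(["pkg-config", "--exists", library])
    return completed.returncode == 0


if __name__ == "__main__":
    print("missing tools:", missing_build_tools())
    print("libjpeg known to pkg-config:", pkg_config_knows("libjpeg"))
```

A check like this only reports the problem earlier and more clearly. It does not install the compiler or the headers, which is the part that a language-level package manager generally leaves to the system.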


Copyright © 2023, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds