
PyTorch and the PyPI supply chain

Posted Jan 12, 2023 11:44 UTC (Thu) by kleptog (subscriber, #1183)
In reply to: PyTorch and the PyPI supply chain by bferrell
Parent article: PyTorch and the PyPI supply chain

The problem is not new features, it's the bugs. Sure, there are big, commonly used projects like Django which are well tested, but there's a long tail of smaller packages for things that only a much smaller group of developers use. If you're a developer using one of these packages in a slightly unusual setting, you will find bugs.

No problem: create a patch, send it to the developer, they merge it and push a new minor release to PyPI, and you can get on with it. In my experience, a month or two is the usual turnaround time. This fits in the release cycle; we just pause the ticket until the upstream release. Telling us to "use the version shipped by the distribution" is equivalent to saying "work around this bug for the next year or two". And it's not just one bug, it's several, across several different packages. Eventually, tracking which workarounds are waiting on which upstream release becomes a significant amount of work.

Besides, workarounds are annoying; this is open source, and we should be fixing the upstream packages, not working around the issues elsewhere.

I know projects with the strict rule that all packages must be installed from Debian. And it *almost* works. If a package is missing features, you can simply tell the product owner it's not possible yet. But there are always a few packages for which the Debian release is simply buggy, and the bug is such a corner case, affecting basically only you, that it's not going to be fixed there (upstream has fixed it in a new version, and Debian isn't going to bump the version). So you end up making an exception for just this handful of packages (basically using py2deb). And hope the list doesn't grow too long.

The step from there to "just pull everything from PyPI with version/hash pinning" is very, very small.
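The hash-pinning half of that step is mechanically simple. As a minimal Python sketch (names here are illustrative, not pip's actual internals), this is the check that pip's hash-checking mode performs per downloaded artifact:

```python
import hashlib

def check_pinned_hash(artifact: bytes, pinned_digest: str) -> bool:
    """Return True if the artifact's SHA-256 matches the pinned digest,
    analogous to what `pip install --require-hashes` does per file."""
    return hashlib.sha256(artifact).hexdigest() == pinned_digest

# Hypothetical "downloaded" bytes standing in for a wheel from PyPI.
wheel = b"example wheel contents"
pin = hashlib.sha256(wheel).hexdigest()  # digest recorded at pin time

assert check_pinned_hash(wheel, pin)             # untampered: accepted
assert not check_pinned_hash(wheel + b"!", pin)  # modified: rejected
```

In practice the pins live in a requirements file (e.g. generated with `pip-compile --generate-hashes` from pip-tools), so a later compromise of the package on PyPI fails the install instead of running attacker code.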



PyTorch and the PyPI supply chain

Posted Jan 12, 2023 13:53 UTC (Thu) by farnz (subscriber, #17727)

And don't forget that predicting what a distro will have when you release your new version is hard. If you're going to release in April 2023, and something you depend upon has a necessary update released in December 2022, is that update going to be in the current stable Debian release when you make your release?

If you guess wrong, you end up in one of two sub-optimal situations:

  1. If you assume the update won't be submitted by the Debian Developers in time to make it into Debian 12 (Bookworm), or that the release will slip into May, and then the update makes it in anyway, you're carrying sub-optimal code to handle the pre-update version of your dependency.
  2. If you assume it will make it in, and then something forces Debian to delay the release, or the update can't go into Bookworm before the freeze without breaking something critical, your project doesn't run on Debian stable on the day of release, because you're waiting for an update.

Bundling from a vendored source neatly sidesteps this: if Bookworm has the dependency version you need, then unbundling can be done then, while if it doesn't, no problem, you've got the vendored code. And language repositories like PyPI make it simpler still, because they're already working in terms of a dependency tree, not copied code, so you can look and go "aha, when I build the Debian package, I can unbundle libfoo, since Debian has the right version of libfoo already".
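That unbundling decision can be expressed in the Debian packaging itself. A hypothetical debian/control stanza (package names invented for illustration) depends on the distro's copy instead of shipping the vendored one:

```
# debian/control (hypothetical): depend on Debian's python3-libfoo
# rather than building in the copy vendored under src/vendor/libfoo/
Package: python3-myproject
Architecture: all
Depends: ${misc:Depends}, ${python3:Depends}, python3-libfoo (>= 1.2)
```

The versioned dependency documents exactly which distro release the unbundling assumes; if stable ends up shipping an older libfoo, the package simply won't install there and the vendored copy stays in play.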


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds.