Courtès: What's in a package
The first surprise when starting packaging PyTorch is that, despite being on PyPI, PyTorch is first and foremost a large C++ code base. It does have a setup.py as commonly found in pure Python packages, but that file delegates the bulk of the work to CMake. The second surprise is that PyTorch bundles (or "vendors", as some would say) source code for no less than 41 dependencies, ranging from small Python and C++ helper libraries to large C++ neural network tools. Like other distributions such as Debian, Guix avoids bundling: we would rather have one Guix package for each of these dependencies. The rationale is manifold, but it boils down to keeping things auditable, reducing resource usage, and making security updates practical.
Posted Sep 22, 2021 22:01 UTC (Wed)
by timrichardson (subscriber, #72836)
[Link] (2 responses)
Posted Sep 23, 2021 17:17 UTC (Thu)
by developer122 (guest, #152928)
[Link] (1 responses)
(oh, did I mention all AMD drivers rely on loading and calling into the AtomBIOS ROM stored on the GPU? Yeah, that fight was lost internally between AMD and ATI *years* ago. See: https://www.phoronix.com/forums/forum/linux-graphics-x-or...)
Posted Sep 23, 2021 18:03 UTC (Thu)
by flussence (guest, #85566)
[Link]
It says a lot about how little cultural progress they've made internally that their latest CPUs didn't even have a real cpufreq driver for 2 years before Valve stepped in.
Posted Sep 23, 2021 7:02 UTC (Thu)
by LtWorf (subscriber, #124958)
[Link] (4 responses)
Posted Sep 23, 2021 7:41 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (2 responses)
Posted Sep 23, 2021 9:43 UTC (Thu)
by LtWorf (subscriber, #124958)
[Link] (1 responses)
Posted Sep 23, 2021 12:18 UTC (Thu)
by t-v (subscriber, #112111)
[Link]
Similar to what NAR suggests, my impression from hanging out on the PyTorch forums is that most people pick whatever version they want and then copy-paste whatever
https://pytorch.org/get-started/locally/
tells them to.
From the forums it looks like people are using conda a lot with PyTorch. I don't know if there is a way to distinguish CI-based downloads from human ones in PyPI.
Posted Sep 23, 2021 13:00 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link]
Posted Sep 23, 2021 7:46 UTC (Thu)
by t-v (subscriber, #112111)
[Link]
Posted Sep 23, 2021 8:34 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
> Long story short: “unbundling” is often tedious, all the more so in this case. We ended up packaging about ten dependencies that were not already available or were otherwise outdated or incomplete, including big C++ libraries like the XNNPACK and onnx neural network helper libraries.
This is why people vendor things in the first place.
Posted Sep 23, 2021 12:47 UTC (Thu)
by swilmet (subscriber, #98424)
[Link] (5 responses)
Bundling/vendoring a dependency is sometimes done because that dependency is not evolving well, in the direction that we want. So we simply pick up an older version that works well for us, and that's it. It's open source, after all.
Posted Sep 23, 2021 13:05 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
- options to use external copies;
Of course, there are some that we have patched for our own purposes (with upstreamed PRs, usually merged depending on upstream activity levels) and there's no public version that is viable yet.
I don't like doing it because it's a huge PITA, but when Windows and macOS are target platforms and you depend on lots of external libraries (no, Homebrew and MacPorts are not suitable in the general case for macOS), shipping copies is way *less* work than walking users through how to build with their pet copies (of which there's usually poor uniformity) and the `PATH` shenanigans that usually end up being required.
Posted Sep 24, 2021 4:10 UTC (Fri)
by pabs (subscriber, #43278)
[Link] (1 responses)
Posted Sep 24, 2021 11:51 UTC (Fri)
by swilmet (subscriber, #98424)
[Link]
Posted Sep 24, 2021 2:47 UTC (Fri)
by JanC_ (guest, #34940)
[Link] (1 responses)
Posted Sep 24, 2021 12:01 UTC (Fri)
by swilmet (subscriber, #98424)
[Link]
And it's definitely possible to make the "light fork" parallel-installable with the main, upstream version, so that Linux distros can install both, see for example:
Posted Sep 23, 2021 13:19 UTC (Thu)
by martin.langhoff (guest, #61417)
[Link]
As components mature, they get more users, move a bit slower (as they've accrued more complexity, so moving too fast breaks stuff), and start cleaning up their dependencies, build reproducibility/testability, etc.
The distro packager has an unenviable role in pushing for a lot of this maturation to happen, often facing antagonism from the developers. But it's a key step. Once upon a time, foundational pieces of today's stack such as MySQL and PostgreSQL were a gnarly mess to package...
Yeah, they'd just put a curl ... | bash ... command on their website...
Personally, I doubt that people have PyTorch installed automatically through dependencies that much; it would be hit-or-miss whether it works with their hardware etc.
As someone with a lot of involvement in PyTorch, maybe some comments from my personal point of view (not speaking for PyTorch):
That said, I can see how packaging PyTorch is a huge task and I personally look forward to the day when most people can just grab it from Debian...
- mangling symbols to avoid conflicts when co-existing with the "real" thing in a process;
- mangling library names to avoid runtime loader problems; and
- moving headers to a subdirectory to avoid conflicting with a "real" install.
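The three measures listed above could look roughly like this in a build helper. This is a minimal sketch, not any particular project's tooling: the `myproj` prefix, both function names, and the file layout are hypothetical. It generates a C header of `#define` renames (one common way to mangle symbols) and derives a prefixed library name and a private header subdirectory.

```python
def mangle_header(symbols, prefix="myproj_"):
    """Return the text of a C header that renames each vendored symbol.

    `prefix` is a hypothetical project prefix; including this header before
    the vendored library's own headers makes `inflate` compile as
    `myproj_inflate`, so it cannot clash with a system copy in the same process.
    """
    guard = f"{prefix.upper()}MANGLE_H"
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    # De-duplicate and sort so the generated header is stable across runs.
    for name in sorted(set(symbols)):
        lines.append(f"#define {name} {prefix}{name}")
    lines += ["", f"#endif /* {guard} */", ""]
    return "\n".join(lines)


def vendored_layout(lib, prefix="myproj"):
    """Return a mangled library file name and a private header directory.

    The `.so` suffix and the `include/<prefix>/<lib>` layout are assumptions
    chosen for illustration (Linux-style naming).
    """
    return {
        "library": f"lib{prefix}_{lib}.so",
        "include_dir": f"include/{prefix}/{lib}",
    }


header = mangle_header(["inflate", "deflate", "crc32"])
layout = vendored_layout("zlib")
print(header)
print(layout)
```

With names like these, the bundled copy and the "real" library can be loaded side by side, since neither symbols, shared-object names, nor header paths overlap.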
https://developer.gnome.org/documentation/guidelines/main...
(this is used, for instance, for the different major versions of GTK; they can co-exist in the same prefix).
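To make the idea of parallel installation concrete, here is a minimal sketch for a hypothetical library `foo`. The function name and the exact naming scheme are assumptions for illustration; the point is only that the API version is embedded in the header directory, the library name, and the pkg-config file name, so two major versions never share a path.

```python
def parallel_install_paths(name, api_version, prefix="/usr"):
    """Return versioned install paths for one major version of a library.

    Every path embeds `api_version`, so different major versions installed
    under the same prefix do not overwrite each other's files.
    """
    versioned = f"{name}-{api_version}"
    return {
        "include_dir": f"{prefix}/include/{versioned}",
        "library": f"{prefix}/lib/lib{versioned}.so",
        "pkgconfig": f"{prefix}/lib/pkgconfig/{versioned}.pc",
    }


old = parallel_install_paths("foo", "1")
new = parallel_install_paths("foo", "2")

# No file or directory is shared between the two major versions.
conflicts = set(old.values()) & set(new.values())
print(old)
print(new)
print(conflicts)
```

A "light fork" could use the same trick with a different name instead of a different version number.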
