
Nightly PyTorch builds compromised

Anybody who installed a nightly release from the PyTorch machine-learning library between December 25 and 30 will want to uninstall it immediately:

At around 4:40pm GMT on December 30 (Friday), we learned about a malicious dependency package (torchtriton) that was uploaded to the Python Package Index (PyPI) code repository with the same package name as the one we ship on the PyTorch nightly package index. Since the PyPI index takes precedence, this malicious package was being installed instead of the version from our official repository. This design enables somebody to register a package by the same name as one that exists in a third party index, and pip will install their version by default.

This malicious package has the same name torchtriton but added in code that uploads sensitive data from the machine.
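The way a same-named package on a second index can displace the intended one can be illustrated with a toy model. This is a minimal sketch, not pip's actual resolver. It assumes that candidates from every configured index are pooled and that the highest version wins, whichever index supplied it. The index labels, the version numbers, and the function `pick_candidate` are all hypothetical.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    version: tuple[int, ...]  # simplified version, e.g. (2, 0, 0)
    index: str                # label of the index that offered it


def pick_candidate(name: str, indexes: dict[str, list[Candidate]]) -> Candidate:
    """Pool candidates from all indexes and return the highest version.

    Toy model only: no index has priority over another here.
    """
    pool = [c for cands in indexes.values() for c in cands if c.name == name]
    if not pool:
        raise LookupError(f"no candidate found for {name!r}")
    return max(pool, key=lambda c: c.version)


# Hypothetical data: a private nightly index and a public index both
# offer a package called "torchtriton".
indexes = {
    "nightly": [Candidate("torchtriton", (2, 0, 0), "nightly")],
    "public": [Candidate("torchtriton", (3, 0, 0), "public")],
}

chosen = pick_candidate("torchtriton", indexes)
print(chosen.index)  # the public copy is chosen: it has the higher version
```

In this model, anyone who can publish a higher version under the same name on the public index gets their package selected, and nothing in the name records which index it was supposed to come from.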




Nightly PyTorch builds compromised

Posted Jan 2, 2023 16:29 UTC (Mon) by pbonzini (subscriber, #60935) [Link] (2 responses)

Why wasn't the package uploaded to PyPI in the first place?

Nightly PyTorch builds compromised

Posted Jan 2, 2023 17:45 UTC (Mon) by khim (subscriber, #9252) [Link]

It's a nightly build; they were experimenting, I guess.

And weren't sure whether they wanted it in the stable release.

Nightly PyTorch builds compromised

Posted Jan 2, 2023 18:08 UTC (Mon) by SLi (subscriber, #53131) [Link]

Interesting. My first reaction was instead to question whether any package should be pulled in from PyPI that automatically.

Nightly PyTorch builds compromised

Posted Jan 2, 2023 19:29 UTC (Mon) by bluca (subscriber, #118303) [Link] (13 responses)

Looking forward to hearing again how language-specific package managers are the future and distributions are useless and obsolete.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 9:34 UTC (Tue) by oldtomas (guest, #72579) [Link] (10 responses)

You won't get that.

Instead you'll get colourful fireworks on how to fix a social problem with technical means :-)

Happy New Year!

Nightly PyTorch builds compromised

Posted Jan 4, 2023 1:03 UTC (Wed) by bluca (subscriber, #118303) [Link] (1 responses)

Looking at the rest of the comments, it seems like you were spot-on.

Nightly PyTorch builds compromised

Posted Jan 11, 2023 5:34 UTC (Wed) by oldtomas (guest, #72579) [Link]

Glad to see there are some exceptions to our expectation :)

Nightly PyTorch builds compromised

Posted Jan 5, 2023 7:48 UTC (Thu) by groshu (subscriber, #113270) [Link] (7 responses)

But isn't "fixing a social problem with technical means" a definition of the thing we call "security"?

Nightly PyTorch builds compromised

Posted Jan 5, 2023 8:38 UTC (Thu) by Wol (subscriber, #4433) [Link] (6 responses)

I think it's called "security theatre".

It's the difference between hiring a company to provide a guy to check everyone's id, and employing a guy who recognises everyone's face ... night and day ...

Cheers,
Wol

Nightly PyTorch builds compromised

Posted Jan 5, 2023 13:41 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (5 responses)

There's still a level of problem with the former (well, at least when there's insufficient training). My state had very bendable and almost rubbery licenses for a time and people from far-away states were very suspicious of it. They're better now and have some interesting features in them (the transparent hologram window still confuses some people, but it is far better than the older style).

There's also the story of a grocery store in some Midwest state denying a Washington DC license because "DC isn't a state, how can they have driver licenses?" until the police showed up and said "no, this is fine". I recall hearing of disbelief in diplomatic passports as well.

Nightly PyTorch builds compromised

Posted Jan 5, 2023 15:01 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

I've experienced places in the US refusing service for alcohol because my colleague's passport was clearly fake, since passports are blue and have the word passport on them, whereas his was red and had the "obviously misspelt" word passeport on it.

Nightly PyTorch builds compromised

Posted Jan 5, 2023 22:28 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

Things don't change, do they ... :-)

35 years ago, a colleague told me stories of his time in Texas. I believe America had plastic licences even then ...

Anyways, the police stopped him and asked for his licence, so he handed them a piece of green paper.

"What's this!?"
"A driving licence."
"How do I know it's a driving licence?"
"It says so. On the front. In big black letters."

Cheers,
Wol

Nightly PyTorch builds compromised

Posted Feb 25, 2023 15:45 UTC (Sat) by nix (subscriber, #2304) [Link]

That of course assumes they can read whatever language the words are written in. I might point here at the tale of the most nefarious driving offender in Ireland, a protean master of disguise whose appearance was never the same twice, a Mr. Prawo Jazdy: <http://news.bbc.co.uk/1/hi/northern_ireland/7899171.stm>.

(Lest anyone think this is a joke about the Irish, it was of course an English (Welsh border) council which managed the impressive trick of putting up dual-language Welsh road signs where the Welsh "translation" was the Welsh for "I am out of the office at the moment but will be back on Monday." You'd think they could have at least spotted that the day of the week was in the translation but not the original and that something *must* be wrong, but nooo...)

Nightly PyTorch builds compromised

Posted Feb 7, 2023 16:56 UTC (Tue) by JanC_ (guest, #34940) [Link] (1 responses)

1. Allowing every state & territory & colony of the US to issue its own completely different & incompatible driving license
2. Using driving licenses & other random things instead of proper standardized ID cards as identification

… and then being surprised that the whole setup is confusing, error prone, easy to falsify, and raising suspicion?

Nightly PyTorch builds compromised

Posted Feb 7, 2023 19:25 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

RealID is supposed to fix at least a baseline of things. However, the requirement date keeps getting pushed back further and further…

Nightly PyTorch builds compromised

Posted Jan 3, 2023 15:46 UTC (Tue) by ballombe (subscriber, #9523) [Link] (1 responses)

What is interesting here is that it seems the attackers have searched CI logs for that kind of situation...
Using the internet during CI build is always dangerous.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 20:06 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Well, the Internet without verified downloads (hash and/or GPG) at least. Lock files would have helped a lot here I imagine.

Nightly PyTorch builds compromised

Posted Jan 2, 2023 22:00 UTC (Mon) by NightMonkey (subscriber, #23051) [Link] (6 responses)

I'd really like to know more about how to short-circuit the precedence rules in pip to avoid this. In general, I want no surprises in sourcing modules and libraries in any programming language. I'd like the option to try one source, have an md5sum or other hash to lock my dependency on, and fail if that isn't available when requested.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 7:22 UTC (Tue) by ms (subscriber, #41272) [Link]

Exactly. Having multiple repositories where the repository name does not prefix the namespace is pretty bonkers. Or better yet, there should be no central repository and just use URLs of each repo as the name. I.e. the Go design is basically right.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 9:32 UTC (Tue) by kleptog (subscriber, #1183) [Link] (3 responses)

Pip does allow matching on sha256sum and failing if it doesn't match. However, it's not the default and not exactly user-friendly either. There are other options controlling repositories but they're not very helpful.

Python package repositories weren't created with an actual design, so this kind of thing wasn't really considered.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 20:07 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (2 responses)

How's the hash matching work when each platform/version has its own wheel? Or is this yet another wheel feature that is only really supported if you are pure Python?

Nightly PyTorch builds compromised

Posted Jan 3, 2023 21:37 UTC (Tue) by anselm (subscriber, #2796) [Link]

IIRC you can list multiple acceptable hashes per package.
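The idea of accepting several hashes for one package, for example one per platform-specific wheel, can be sketched as follows. This is a minimal illustration of the check, not pip's implementation. The functions `sha256_of` and `verify_artifact` and the file name are hypothetical. In pip itself, the corresponding feature is the hash-checking mode enabled with `--require-hashes`, where each requirement line may carry more than one `--hash=sha256:...` option.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, allowed: set[str]) -> None:
    """Raise ValueError unless the file's digest is one of the allowed ones."""
    actual = sha256_of(path)
    if actual not in allowed:
        raise ValueError(f"{path.name}: sha256 {actual} is not in the allowed set")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical artifact standing in for a downloaded wheel.
        wheel = Path(tmp) / "example-1.0-py3-none-any.whl"
        wheel.write_bytes(b"not a real wheel")

        # Pretend the lock file lists two digests: this wheel's and
        # another platform's (a placeholder value here).
        allowed = {sha256_of(wheel), "0" * 64}
        verify_artifact(wheel, allowed)  # passes silently

        wheel.write_bytes(b"tampered contents")
        try:
            verify_artifact(wheel, allowed)
        except ValueError as err:
            print("rejected:", err)
```

Because the check is on file contents and not on names or index order, a substituted package fails it no matter which index served it.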

Nightly PyTorch builds compromised

Posted Jan 4, 2023 7:56 UTC (Wed) by auxsvr (guest, #120007) [Link]

poetry lock files contain hashes for all platforms, including tarballs.

Nightly PyTorch builds compromised

Posted Jan 3, 2023 16:14 UTC (Tue) by SnoopJ (guest, #162807) [Link]

There isn't much in the way of control of precedence in pip, unfortunately. It's a requested feature [1] but there has been relatively little work to make it work. PyPA does define a standard (.pypirc) for configuring indexes that gives very good control over precedence, but pip has zero support for it (honestly I don't know what *does* support it).

The gold standard (imo) for avoiding this kind of mistake is to set up your own index that is capable of falling back onto PyPI, and use `--index-url` instead. One of the pip maintainers publishes the tool `simpleindex` [2] for doing this, letting you specify explicitly which packages you want from your own index, and falling back to PyPI for the rest. There's also `devpi` [3] but it's substantially more complicated to operate.

Honestly, it feels like a huge mistake for pip to keep the `--extra-index-url` feature. It's hard to use safely and I think a big reason that pip hasn't grown a better way to do it is because it's "good enough" if you're willing to overlook the massive attack vector it brings along for the ride with any internal packages.
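The routing idea described above (explicitly listed packages come from your own index, everything else falls back to PyPI) can be sketched like this. It is an illustration of the concept only and does not reflect how `simpleindex` or `devpi` are configured or implemented. The function `index_for`, the private index URL, and the package names are hypothetical. The name normalization follows the rule given in PEP 503.

```python
import re

PYPI = "https://pypi.org/simple/"


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def index_for(
    package: str,
    private_packages: set[str],
    private_index: str,
    fallback: str = PYPI,
) -> str:
    """Return the single index that is allowed to serve `package`.

    A name on the private list is never looked up on the fallback index,
    so a same-named upload there cannot be selected.
    """
    private = {normalize(p) for p in private_packages}
    if normalize(package) in private:
        return private_index
    return fallback


# Hypothetical private index and package list.
PRIVATE_INDEX = "https://packages.example.invalid/simple/"
PRIVATE_PACKAGES = {"torchtriton"}

print(index_for("torchtriton", PRIVATE_PACKAGES, PRIVATE_INDEX))
print(index_for("requests", PRIVATE_PACKAGES, PRIVATE_INDEX))
```

The design point is that each name maps to exactly one source, so there is no pooling step in which a public package can outrank a private one.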

[1] e.g. https://github.com/pypa/pip/issues/6045 and https://github.com/pypa/pip/issues/4263
[2] https://github.com/uranusjr/simpleindex
[3] https://github.com/devpi/devpi

Nightly PyTorch builds compromised

Posted Jan 5, 2023 12:56 UTC (Thu) by NAR (subscriber, #1313) [Link] (4 responses)

Apart from the ubiquitous curl ... | sudo instructions there are also instructions around to add third-party repositories (sometimes the addition of the third-party repository is itself bundled in a package). What would happen if such a third-party repository tried to give e.g. a malicious bash package to the users?

Nightly PyTorch builds compromised

Posted Jan 5, 2023 16:21 UTC (Thu) by mbunkus (subscriber, #87248) [Link] (3 responses)

This would most likely work just fine. When you add such a repository, be it an APT repo for Debian-based systems or an RPM repository for RPM-based distros such as Fedora/RHEL/openSUSE, you pretty much always add a GPG key that the repo signs its package lists (APT repos) or the packages themselves (RPM packages) with to the list of trusted GPG keys. See e.g. installation instructions on my MKVToolNix home page at https://mkvtoolnix.download/downloads.html#debian

Then it's just a matter of adding a package called "bash" with a slightly higher version number to that repository, and a subsequent manual package upgrade should pick it up.

That being said, it will likely not be installed automatically. In the Debian-based world there's the "unattended-upgrades" mechanism/package that takes care of installing updates automatically. However, it's pretty much always configured to only download updates from specific APT sections (e.g. from the "security" section). Though I'm not sure how easy it is to fake it.

Both apt & dnf will show where packages are downloaded from; therefore you might spot that "bash" is coming from a server you don't necessarily expect it from. It might also just be overlooked if the number of downloaded packages is big.

It's minimally harder to set up an APT/dnf repository than it is to provide a malicious shell script & a sudo-curl-bash one-liner. But there's no real security there.

Nightly PyTorch builds compromised

Posted Jan 6, 2023 14:04 UTC (Fri) by mw_skieske (guest, #144003) [Link] (1 responses)

you can "harden" specific package repo files on fedora/rpm distros with the option "includepkgs" which will only download the listed packages from that repo and ignore everything else from that URL.

see: https://man7.org/linux/man-pages/man5/yum.conf.5.html

however this is not really a solution if you don't trust the controlling instance of a remote package server.

i.e. if ms vs code repo content or their signing get somehow compromised they can just replace the "code" binary which you maybe want from this repository to include malicious content.

on the plus side for the attacker this might be much harder to detect for end users.

fwiw the official ms package for fedora does not automatically set these restrictions and I'm not aware of many repositories that do something like this.

I believe this option is ultimately not a security option but more of a bandaid against accidentally installing a package from a wrong repository.
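The effect of an allowlist such as "includepkgs" can be sketched as a filter over what a repository offers. This is a simplified model and not the package manager's own code. It assumes shell-style glob patterns matched against bare package names, and the function `visible_packages` and the package lists are hypothetical.

```python
from fnmatch import fnmatchcase


def visible_packages(repo_packages: list[str], includepkgs: list[str]) -> list[str]:
    """Return the packages from one repository that pass the allowlist.

    An empty allowlist is treated as "no restriction" in this sketch.
    """
    if not includepkgs:
        return list(repo_packages)
    return [
        pkg
        for pkg in repo_packages
        if any(fnmatchcase(pkg, pattern) for pattern in includepkgs)
    ]


# Hypothetical third-party repository that also ships a package named "bash".
offered = ["code", "code-insiders", "bash"]

print(visible_packages(offered, ["code"]))   # ['code']
print(visible_packages(offered, ["code*"]))  # ['code', 'code-insiders']
```

As the comment above notes, this only limits which names the repository may supply; it does nothing about a compromised copy of a package that is on the list.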

Nightly PyTorch builds compromised

Posted Jan 6, 2023 21:23 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

> i.e. if ms vs code repo content or their signing get somehow compromised they can just replace the "code" binary which you maybe want from this repository to include malicious content.

I doubt this is a solvable problem. You have to trust *somebody* (unless you want to download all of the source code and audit it by hand, in which case you should probably be using Gentoo instead of Debian), and in practice that probably has to be the packager, not the upstream (because the packager may have to carry patches or otherwise modify the software to be suitable for distribution). If you trust the packager, then you trust them, end of story. If you don't trust them, then you can't (shouldn't) run any software they give you.

APT preferences Pin-Priority

Posted Feb 7, 2023 17:24 UTC (Tue) by JanC_ (guest, #34940) [Link]

You can use an origin-based Pin-Priority (see apt_preferences(5) for how this works) to prevent this.
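How an origin-based pin keeps a third-party "bash" from replacing the distribution's can be sketched with a simplified model of the selection rule in apt_preferences(5): the candidate with the highest priority wins, the version only breaks ties, and a negative priority excludes a candidate. The sketch ignores installed-version and downgrade rules and uses plain integer tuples in place of Debian version comparison. The origins, versions, and the function `select_candidate` are hypothetical.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    package: str
    version: tuple[int, ...]  # simplified stand-in for a Debian version
    origin: str               # host name of the repository


def select_candidate(
    candidates: list[Candidate],
    pins: dict[str, int],
    default_priority: int = 500,
) -> Optional[Candidate]:
    """Pick by highest pin priority first, then by highest version."""
    scored = []
    for cand in candidates:
        priority = pins.get(cand.origin, default_priority)
        if priority < 0:
            continue  # negative priority: never install from this origin
        scored.append((priority, cand.version, cand))
    if not scored:
        return None
    return max(scored, key=lambda item: (item[0], item[1]))[2]


# Hypothetical origins: the distribution archive and a third-party repo
# that offers a newer "bash".
candidates = [
    Candidate("bash", (5, 2), "deb.example.org"),
    Candidate("bash", (5, 3), "thirdparty.example.com"),
]

unpinned = select_candidate(candidates, pins={})
pinned = select_candidate(candidates, pins={"thirdparty.example.com": 100})

print(unpinned.origin)  # thirdparty.example.com (higher version wins)
print(pinned.origin)    # deb.example.org (higher priority wins)
```

With the pin in place, the third-party origin can still supply packages that exist nowhere else, but it loses to the distribution for any name they share.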


Copyright © 2023, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds