Working toward securing PyPI downloads
An effort to protect package downloads from the Python Package Index (PyPI) has resulted in a Python Enhancement Proposal (PEP) and, perhaps belatedly, some discussion in the wider community. The basic idea is to use The Update Framework (TUF) to protect PyPI users from some malicious actors who are aiming to interfere with the installation and update of Python modules. But the name of the PEP and its wording, coupled with some recent typosquatting problems on PyPI, caused some confusion along the way. There are some competing interests and different cultures coming together over this PEP; the process has not run as smoothly as anyone might want, though that seems to be resolving itself at this point.
PEP 458 ("Surviving a Compromise of PyPI") has been around since 2013 or so; LWN looked at the PEP and the related PEP 480 ("Surviving a Compromise of PyPI: The Maximum Security Model") in 2015. PEP 458 proposes a mechanism to provide cryptographic-signature protection for PyPI packages using TUF. No changes would be required for package authors or those downloading the packages, but client-side programs like pip would be able to ensure that they are getting the latest version of the code, and that the code itself is what is stored on PyPI.
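To make those two client-side guarantees concrete, here is a minimal sketch of the kind of check a tool like pip could run after a download. It is not the PEP's design or TUF's actual metadata format: the `TargetInfo` record, its field names, and `verify_download()` are hypothetical, and the sketch assumes the record was already authenticated by verifying signatures on the repository metadata, which is the part TUF itself provides.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetInfo:
    """Hypothetical trusted record for one package file (not TUF's real schema)."""
    length: int            # expected size in bytes
    sha256: str            # expected hex digest of the file
    metadata_version: int  # version of the signed metadata this record came from


def verify_download(data: bytes, trusted: TargetInfo, last_seen_version: int) -> bool:
    """Return True only if the metadata is not stale and the file matches it."""
    # Freshness: reject metadata older than what this client has already seen,
    # which would indicate an attempt to roll the client back.
    if trusted.metadata_version < last_seen_version:
        return False
    # Integrity: cheap length check first, then the cryptographic hash.
    if len(data) != trusted.length:
        return False
    return hashlib.sha256(data).hexdigest() == trusted.sha256
```

A caller would fetch the file, look up its record in the already-verified metadata, and refuse to install anything for which `verify_download()` returns `False`. The security of the scheme rests on how the metadata is signed and distributed, which this sketch deliberately leaves out.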
A message was posted to the Python discussion forum in mid-November about the PEP. One of the PEP's authors, Trishank Kuppusamy, wondered about the status of it, but the message mostly seemed to be aimed at Donald Stufft, who is the "BDFL-Delegate" for the PEP. That means that the steering council has deferred the decision on the PEP to Stufft; a week or so later, he had seemingly replied to the request but via some other channel.
Neither message was the general call for discussion that might be expected for a PEP, however, so many readers may have skipped over the thread entirely. In early December, Guido van Rossum noted in the thread that he had merged a pull request (PR) for a bunch of updates to the PEP, but cautioned that discussion of the PEP should be done on the forum rather than in a PR. That is the normal course of action on a PEP; clearly Van Rossum was a bit surprised that it wasn't being followed in this case.
In a reply to Van Rossum, Sumana Harihareswara said that the PEP was ready for community review; "[...] the plan is for contractors to start work on PyPI this month (on implementing the foundations for cryptographic signing (and malware detection, which is not relevant to this PEP))". Van Rossum and Paul Moore were both rather puzzled about the process being followed, some of the terminology, and, to a certain extent, the intent of the PEP itself. Van Rossum said:
Moore pointed out that the PEP is confusing even for some who are familiar with Python packaging:
Culture clash
A reply from Kuppusamy did not really address the concerns, however. It summarized the issues that had been raised, but indicated an intent to address them via PR, rather than through a discussion as is typical in the PEP process. That was not what Moore and Van Rossum were after, however. Moore said:
Part of the difficulty is that the PEP is being written by TUF researchers, so it reads to some extent like the use of TUF is a foregone conclusion. As Moore put it: "The abstract does actually give a reasonable overview of what's being proposed, as long as I take 'implement TUF' as a goal in itself [...]" Van Rossum suggested that there may be a culture clash present. Steve Dower agreed with the culture-clash assessment, but noted it was not really the packaging community behind it, as Van Rossum proposed; "The clash is with the people behind TUF, not the packaging community in general."
Harihareswara said: "There have been some miscommunications here but no one has meant to bypass the community." She went on to outline the history of the PEP, along with a number of links to discussions and the like in various places, including in person with members of the packaging community and in multiple threads on the distutils-sig mailing list. She also noted that there was a 2018 gift from Facebook of $100,000 "to be used for PyPI security work, specifically on cryptographic signing and malware detection".
That money is part of what revived PEP 458, which had been languishing for a few years. The Python Software Foundation (PSF), which administers the gift money, put out a Request for Information (RFI) in late August with a rather ambitious schedule, given that getting the PEP accepted was part of it. The intent was to start work on an implementation on December 2. So, in part, the culture clash may also have been between the needs of the procurement process and those of the wider Python community.
PEP 458 had been marked as "deferred" in March, but was revived in some in-person discussions at PyCon shortly thereafter. Because the gift was partly targeting cryptographic signing, PEP 458 seemed like it might fit the bill.
But, as Christian Heimes said, the PEP is not really living up to its title:
Furthermore, he doesn't see the compromise of PyPI as the most important problem it is facing. There have been no reports of PyPI corruption along the way, but there are an ever-increasing number of other problems that cause bad code to end up on users' systems:
Moving forward
In a lengthy post covering replies to multiple messages along the way, Stufft pointed out that the proposal is coming from volunteers, so what they want to focus on is not really in the purview of the Python community per se:
We do have some funds that we plan on using to implement this PEP if it is accepted, and perhaps @sumanah or @EWDurbin [Ernest W. Durbin III] could better answer this question, those funds were given to us with the understanding we'd use them to implement, among other things, "cryptographic signing" (though it was left open ended what exactly that entailed) so even in that case we have limitations on how we're able to direct work to be done since part of it needs to be implementing a cryptographic signature scheme for PyPI (part of it is also implementing malicious package detection, but that doesn't have a PEP because it's just a new feature of the PyPI code base and doesn't have ramifications for projects beyond PyPI really).
He said that there were three paths forward that he could see. PEP 458 could be discussed and refined until it is ready to be accepted, someone could write a competing PEP that would offer a choice, or the whole thing could just be dropped, which leaves the status of the gift somewhat up in the air. No one argued in favor of dropping the idea, nor has anyone stepped up with an alternative. The conversation generally turned toward making the PEP better with an eye toward getting it accepted. To that end, Moore offered some concrete suggestions on improvements, including revising the title. He also noted that the situation is something new for the project:
Another of the PEP co-authors, Marina Moore, posted a list of items that needed to be addressed for the PEP, which received a number of "like" votes. As noted in the thread, though, "like" has no clear semantics; Paul Moore said that his "like" meant that he was in favor of addressing those items and was awaiting further proposed text, which did not seem to be forthcoming:
Part of the problem is that the PEP authors are not up to speed on how the Python community works, as Harihareswara pointed out. She suggested that they be provided with concrete guidance on how to proceed with a discussion of this sort. Normally, a core developer will shepherd a PEP, which would have helped here. Stufft, as BDFL-Delegate, could also have guided the process, but he has been distracted with some real-life issues of late. Given that, Harihareswara thought it would make sense to look for a sponsor; Paul Moore would seem an obvious choice, but he is also busy with other things right now.
In the meantime, the process does seem to be getting back on track. There have been some answers and proposed wording posts from Moore, along with comments on those, revisions, and so on. All of that is ongoing as of this writing. One gets the sense that it is moving in a good direction, though perhaps not at the speed some would hope for. While it gets sorted out, Durbin said that the PSF will use a subset of the funding to add automated malware detection capabilities to the upload portion of PyPI.
No one really seemed opposed to using TUF to try to ensure the integrity of packages that users get from PyPI (and local mirrors of PyPI), but the process to get there has been a bit haphazard, perhaps. As enumerated in the PEP, TUF does handle an impressive array of attacks against a repository like PyPI. Thwarting those attacks certainly seems worth doing even if there are other more common and prominent threats to the integrity of the PyPI ecosystem that also need work.
| Index entries for this article | |
|---|---|
| Python | Packaging |
