
Working toward securing PyPI downloads

By Jake Edge
December 11, 2019

An effort to protect package downloads from the Python Package Index (PyPI) has resulted in a Python Enhancement Proposal (PEP) and, perhaps belatedly, some discussion in the wider community. The basic idea is to use The Update Framework (TUF) to protect PyPI users from some malicious actors who are aiming to interfere with the installation and update of Python modules. But the name of the PEP and its wording, coupled with some recent typosquatting problems on PyPI, caused some confusion along the way. There are some competing interests and different cultures coming together over this PEP; the process has not run as smoothly as anyone might want, though that seems to be resolving itself at this point.

PEP 458 ("Surviving a Compromise of PyPI") has been around since 2013 or so; LWN looked at the PEP and the related PEP 480 ("Surviving a Compromise of PyPI: The Maximum Security Model") in 2015. PEP 458 proposes a mechanism to provide cryptographic-signature protection for PyPI packages using TUF. No changes would be required for package authors or those downloading the packages, but client-side programs like pip would be able to ensure that they are getting the latest version of the code—and that the code itself is what is stored on PyPI.
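The client-side checks described above can be illustrated with a minimal Python sketch. It is not the PEP's design or the TUF reference implementation; the metadata layout and the function names (make_metadata, check_not_rollback, verify_download) are hypothetical, and the sketch assumes that the signatures on the metadata have already been verified by some other means, which is the part that TUF actually specifies. It only shows the idea that a client compares what it downloaded against the length and hash recorded in trusted metadata, and refuses metadata older than what it has already seen.

```python
import hashlib


def make_metadata(version, targets):
    """Build a simplified, hypothetical stand-in for repository metadata.

    A real TUF deployment signs this data and the client verifies the
    signatures before trusting any field; that step is omitted here.
    """
    return {
        "version": version,
        "targets": {
            name: {
                "length": len(data),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
            for name, data in targets.items()
        },
    }


def check_not_rollback(trusted_version, new_metadata):
    """Reject metadata that is older than what the client already trusts."""
    if new_metadata["version"] < trusted_version:
        raise ValueError("metadata version went backwards; possible rollback")
    return new_metadata["version"]


def verify_download(name, data, metadata):
    """Check downloaded bytes against the length and hash in the metadata."""
    expected = metadata["targets"].get(name)
    if expected is None:
        raise ValueError(f"{name} is not listed in the metadata")
    if len(data) != expected["length"]:
        raise ValueError(f"{name} has an unexpected length")
    if hashlib.sha256(data).hexdigest() != expected["sha256"]:
        raise ValueError(f"{name} does not match the expected hash")
    return True
```

In this sketch, a file that was altered after the metadata was produced fails the length or hash check, and a repository that serves stale metadata fails the version check; neither requires any action from package authors or from the person running the installer.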

A message was posted to the Python discussion forum in mid-November about the PEP. One of the PEP's authors, Trishank Kuppusamy, wondered about the status of it, but the message mostly seemed to be aimed at Donald Stufft, who is the "BDFL-Delegate" for the PEP. That means that the steering council has deferred the decision on the PEP to Stufft; a week or so later, he had seemingly replied to the request but via some other channel.

Neither message was the general call for discussion of the PEP that might have been expected, however, so it would seem that many may have skipped over the thread entirely. In early December, Guido van Rossum noted in the thread that he had merged a pull request (PR) for a bunch of updates to the PEP, but cautioned that discussion of the PEP should be done on the forum rather than in a PR. That is the normal course of action on a PEP; clearly Van Rossum was a bit surprised that it wasn't being followed in this case.

In a reply to Van Rossum, Sumana Harihareswara said that the PEP was ready for community review; "[...] the plan is for contractors to start work on PyPI this month (on implementing the foundations for cryptographic signing (and malware detection, which is not relevant to this PEP))". Van Rossum and Paul Moore were both rather puzzled about the process being followed, some of the terminology, and, to a certain extent, the intent of the PEP itself. Van Rossum said:

A bit of feedback: to a relative outsider of the packaging world, various references to "The RFI" (or "The RFP"?) are very mysterious. It took me way too long to understand what was going on (there's funding from Facebook for PyPI and/or pip work?), and I'm still not sure! Next time can you all be a little clearer about that?

Moore pointed out that the PEP is confusing even for some who are familiar with Python packaging:

Even as an insider of the packaging world, I can make little sense of this PEP. I just skimmed the current version of the PEP and was confronted with a wall of text covering security related issues, that I don't really follow. And yet, from @sumanah's comment, it sounds like we might have people lining up to implement this.

Culture clash

A reply from Kuppusamy did not really address the concerns, however. It summarized the issues that had been raised, but indicated an intent to address them via PR, rather than through a discussion as is typical in the PEP process. That was not what Moore and Van Rossum were after, however. Moore said:

I think you're missing the point - posting here isn't to gather feedback that you can go away and address, it's to have a discussion, where you engage with the community and come to a consensus about the way to address people's concerns. As it stands, this PEP seems to have been developed largely out of sight of the community, and that's what bothers me. Maybe no-one will be interested in having an extensive discussion, given that this is a pretty specialised proposal. But people should still be given the opportunity to engage, and discussions on a github PR in the PEPs repository isn't, in my view, sufficient for that purpose.

Part of the difficulty is that the PEP is being written by TUF researchers, so it reads to some extent like the use of TUF is a foregone conclusion. As Moore put it: "The abstract does actually give a reasonable overview of what's being proposed, as long as I take 'implement TUF' as a goal in itself [...]" Van Rossum suggested that there may be a culture clash present. Steve Dower agreed with the culture-clash assessment, but noted that, contrary to what Van Rossum had proposed, the packaging community was not really the source of it: "The clash is with the people behind TUF, not the packaging community in general."

Harihareswara said: "There have been some miscommunications here but no one has meant to bypass the community." She went on to outline the history of the PEP, along with a number of links to discussions and the like in various places, including in person with members of the packaging community and in multiple threads on the distutils-sig mailing list. She also noted that there was a 2018 gift from Facebook of $100,000 "to be used for PyPI security work, specifically on cryptographic signing and malware detection".

That money is part of what revived PEP 458, which had been languishing for a few years. The Python Software Foundation (PSF), which administers the gift money, put out a Request for Information (RFI) in late August with a rather ambitious schedule given that getting the PEP accepted was part of it. The intent was to start work on an implementation on December 2. So in part the culture clash may also have been between the needs of the procurement process and those of the wider Python community.

PEP 458 had been marked as "deferred" in March, but was revived in some in-person discussions at PyCon shortly thereafter. Because the gift was partly targeting cryptographic signing, PEP 458 seemed like it might fit the bill.

But, as Christian Heimes said, the PEP is not really living up to its title:

It took me a while to realize what I don't like about PEP 458. It mixes the issue "How to [survive] a compromise of PyPI" with a technical solution (TUF). It feels like the PEP is tailored for TUF without exploring alternatives or even verifying if the PEP is asking the right questions. TUF might be the only viable solution, but it's impossible to gauge when the text is written as "Any security framework you like as long as it is TUF".

Furthermore, he doesn't see a compromise of PyPI as the most important problem that the repository is facing. There have been no reports of PyPI corruption along the way, but there is an ever-increasing number of other problems that cause bad code to end up on users' systems:

Personally I see malicious content and package trust as a more pressing issue than a compromise of PyPI infrastructure. As a member of the Python security team (PSRT) I'm getting reports about typo squatting or malicious packages every week. (Fun fact: There [were] four email threads about malicious content on PyPI this month and today is just Dec 4.)

Moving forward

In a lengthy post covering replies to multiple messages along the way, Stufft pointed out that the proposal is coming from volunteers, so what they want to focus on is not really in the purview of the Python community per se:

While we could discuss the relevant merits of different problems to focus on solving, we don't really get to dictate what exactly we have people willing to work on. This particular PEP was written by volunteers (If memory serves me correct they were grad students at the time) and wasn't directed work from the PSF or the community. We don't get to tell volunteers what to work on (unless they ask us), so we have to judge contributions as they come in, not what we'd rather they work on. So in terms of whether we ultimately decide to accept this PEP we don't really get to decide the effort spent writing/discussing it would be better spent on something else.

We do have some funds that we plan on using to implement this PEP if it is accepted, and perhaps @sumanah or @EWDurbin [Ernest W. Durbin III] could better answer this question, those funds were given to us with the understanding we'd use them to implement, among other things, "cryptographic signing" (though it was left open ended what exactly that entailed) so even in that case we have limitations on how we're able to direct work to be done since part of it needs to be implementing a cryptographic signature scheme for PyPI (part of it is also implementing malicious package detection, but that doesn't have a PEP because it's just a new feature of the PyPI code base and doesn't have ramifications for projects beyond PyPI really).

He said that there were three paths forward that he could see. PEP 458 could be discussed and refined until it is ready to be accepted, someone could write a competing PEP that would offer a choice, or the whole thing could just be dropped, which leaves the status of the gift somewhat up in the air. No one argued in favor of dropping the idea, nor has anyone stepped up with an alternative. The conversation generally turned toward making the PEP better with an eye toward getting it accepted. To that end, Moore offered some concrete suggestions on improvements, including revising the title. He also noted that the situation is something new for the project:

There's a somewhat new situation here that we're having to navigate. We have got some volunteers, we've got some money to let them do what they propose, but we still need to ensure (as a community) that we want what they are offering, and someone is willing to pay for.

Another of the PEP co-authors, Marina Moore, posted a list of items that needed to be addressed for the PEP, which received a number of "like" votes. As noted in the thread, though, "like" has no clear semantics; Paul Moore said that his "like" meant that he was in favor of addressing those items and was awaiting further proposed text, which did not seem to be forthcoming:

There still seems to be this misunderstanding that the response to feedback should be a revised PEP - and that's absolutely not the point here, we want a discussion first with consensus before the PEP gets updated.

Part of the problem is that the PEP authors are not up to speed on how the Python community works, as Harihareswara pointed out. She suggested that they be provided with concrete guidance on how to proceed with a discussion of this sort. Normally, a core developer will shepherd a PEP, which would have helped here. Stufft, as BDFL-Delegate, could also have guided the process, but he has been distracted with some real-life issues of late. Given that, Harihareswara thought it would make sense to look for a sponsor; Paul Moore would seem an obvious choice, but he is also busy with other things right now.

In the meantime, the process does seem to be getting back on track. There have been some answers and proposed wording posts from Moore, along with comments on those, revisions, and so on. All of that is ongoing as of this writing. One gets the sense that it is moving in a good direction, though perhaps not at the speed some would hope for. While it gets sorted out, Durbin said that the PSF will use a subset of the funding to add automated malware detection capabilities to the upload portion of PyPI.

No one really seemed opposed to using TUF to try to ensure the integrity of packages that users get from PyPI (and local mirrors of PyPI), but the process to get there has been a bit haphazard, perhaps. As enumerated in the PEP, TUF does handle an impressive array of attacks against a repository like PyPI. Thwarting those attacks certainly seems worth doing even if there are other more common and prominent threats to the integrity of the PyPI ecosystem that also need work.





Working toward securing PyPI downloads

Posted Dec 20, 2019 17:02 UTC (Fri) by trishankkarthik (guest, #136236) [Link]

Jake, the way I see it, I was doing pro bono security work for PyPI based on my graduate research. If some members of the community mistakenly thought we were sidestepping whatever ill-defined processes, that is on them, not me. Me, I was trying to do what was best for the community, and do not need people to misrepresent me. As you say, the process seems to be back on track now, so we shall see.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds