"Critical" projects and volunteer maintainers
Over the last five decades or so, free and open-source software (FOSS) has gone from an almost unknown quantity available to only the most technically savvy to underpinning much of the infrastructure we rely on today. Much like software itself, FOSS is "eating the world". But that has changed, and is still changing, the role of the maintainers of all of that code; when "critical" infrastructure uses code from a FOSS project, suddenly, and perhaps without warning, that code itself becomes critical. But many maintainers of that software are volunteers who did not set out to become beholden to the needs of large companies and organizations when they released their code; they were just scratching their own itch, and now lots of others are clamoring for theirs to be scratched as well.
The supply-chain security problem is clearly a serious one that needs to be addressed. The Log4j incident provides a recent example of how a security vulnerability in a fairly small component can ripple out across the internet by way of dependency chains. Some projects depended directly on Log4j, but many others became vulnerable because they were using some other library or package that depended on Log4j—directly or indirectly.
Dependency chains are often lengthy, and thus more vulnerable to the intentional injection of malware, in the various language-specific package repositories. Sites like the Python Package Index (PyPI) provide a huge palette of components that can be used by applications or other libraries. The pip tool that comes with Python will happily install PyPI packages along with all of their dependencies, recursively. Many other languages have similar repositories and tooling.
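As a rough illustration of what "recursively" means here, the sketch below walks a small, made-up requirements table and collects everything that installing one package would pull in. The package names and the install_set() helper are hypothetical; a real installer such as pip also has to resolve version constraints, extras, and environment markers, all of which are ignored in this sketch.

```python
def install_set(requirements, package):
    """Return every package that installing `package` would pull in."""
    needed = set()
    stack = [package]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already handled; also guards against cycles
        needed.add(current)
        # Queue the direct requirements so their requirements get visited too.
        stack.extend(requirements.get(current, ()))
    return needed


# Hypothetical packages and requirements, not real PyPI metadata.
REQUIREMENTS = {
    "app": ["web-framework", "logging-helper"],
    "web-framework": ["template-engine", "logging-helper"],
    "template-engine": ["markup-escaper"],
    "logging-helper": [],
    "markup-escaper": [],
}

print(sorted(install_set(REQUIREMENTS, "app")))
# ['app', 'logging-helper', 'markup-escaper', 'template-engine', 'web-framework']
```

The application names only two requirements, but four packages end up installed alongside it; a flaw in any of them is a flaw in the application.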
Critical components
There are multiple efforts these days to identify the most critical dependencies and to provide assistance to those projects so that they do not end up in the position of a pre-Heartbleed OpenSSL—or represent that one project in the classic xkcd. For example, the Open Source Security Foundation (OpenSSF) has its Alpha-Omega project that is identifying projects needing assistance with their security. PyPI has also been identifying its packages that have been downloaded the most over the last six months based on its public data sets; those that are in the top 1% are deemed "critical". Roughly 3500 projects have been identified in this manner and the maintainers of those projects are being offered a free security key to help them set up two-factor authentication (2FA) for their PyPI accounts.
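The selection rule described above, the top 1% of projects by downloads over six months, can be sketched in a few lines. This is only an illustration under stated assumptions: the critical_projects() helper, the rounding-up of the cutoff, the alphabetical tie-breaking, and the sample download counts are all invented here and are not taken from PyPI's actual implementation.

```python
def critical_projects(downloads, percent=1):
    """Return the names in the top `percent` percent of projects by downloads."""
    if not downloads:
        return []
    # Most-downloaded first; ties are broken alphabetically (an assumption).
    ranked = sorted(downloads, key=lambda name: (-downloads[name], name))
    # Round the cutoff up using integer arithmetic, and keep at least one project.
    cutoff = max(1, (len(ranked) * percent + 99) // 100)
    return ranked[:cutoff]


# Hypothetical six-month download counts for 200 made-up projects.
sample = {f"pkg{i:03d}": i * 1000 for i in range(1, 201)}
print(critical_projects(sample))
# ['pkg200', 'pkg199']
```

A rule like this depends only on download counts, so it cannot tell an actively maintained library from an abandoned one that is still being fetched automatically.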
Authentication using 2FA is not currently required for any packages, but PyPI plans to require it for maintainers of critical projects "in the coming months". Once that goes into effect, maintainers who have not enabled 2FA (using a security key or time-based one-time password (TOTP) application) will presumably not be able to make changes, such as updating the package. That, of course, has its own risk, in that a critical package may not be able to get the update it needs for some serious vulnerability because its maintainers failed to sign up for 2FA.
On July 8, Skip Montanaro posted a message to the Python discussion forum noting that a defunct project of his, lockfile, had been identified as critical. The project had been marked as deprecated at the top of its README (with alternatives listed) and has not seen any releases since 2015. He wondered why it was considered critical and asked: "What should I do to get rid of this designation?"
Donald Stufft said that the package is being downloaded roughly 10 million times per month. Dustin Ingram pointed to the FAQ in the security-key giveaway announcement that says "once the project has been designated as critical it retains that designation indefinitely", so lockfile will be considered critical henceforth. The lockfile module is part of the OpenStack project; the README file for lockfile suggests contacting the openstack-dev mailing list for assistance in moving away from it.
It turns out that "no OpenStack projects declare direct dependencies on lockfile since May 2015", according to "fungi", who is a system administrator for OpenStack. But lockfile is still used by parts of the OpenStack project. In a perfect demonstration of the insidious nature of dependency chains, fungi tracked down its use by the project:
I've found that some OpenStack projects depend on ansible-runner, which in turn depends on python-daemon, which itself declares a dependency on lockfile. I'll need to confer with other contributors on a way forward, but probably it's to either help python-daemon maintainers replace their use of lockfile, or help ansible-runner maintainers replace their use of python-daemon.
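The kind of detective work fungi describes can be approximated with a breadth-first search over a dependency graph. In this sketch, only the ansible-runner, python-daemon, and lockfile links come from the quote above; the "some-openstack-project" and "another-library" entries, the hand-written graph, and the dependency_chain() helper are hypothetical stand-ins.

```python
from collections import deque

# Hand-written graph; only ansible-runner -> python-daemon -> lockfile
# comes from the quote, the other entries are placeholders.
DEPENDS_ON = {
    "some-openstack-project": ["ansible-runner", "another-library"],
    "ansible-runner": ["python-daemon"],
    "python-daemon": ["lockfile"],
    "another-library": [],
    "lockfile": [],
}


def dependency_chain(graph, root, target):
    """Return the shortest chain of packages from root to target, or None."""
    queue = deque([[root]])
    seen = {root}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for dep in graph.get(path[-1], []):
            if dep not in seen:
                seen.add(dep)
                queue.append(path + [dep])
    return None


chain = dependency_chain(DEPENDS_ON, "some-openstack-project", "lockfile")
print(" -> ".join(chain))
# some-openstack-project -> ansible-runner -> python-daemon -> lockfile
```

The project at the head of the chain never names lockfile itself, which is why a check of direct dependencies alone would miss it.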
So most or all of the downloads of this "critical" PyPI project are probably for continuous-integration testing of OpenStack, and the components that use lockfile should likely have replaced it with something else nearly eight years ago. Hugo van Kemenade suggested encouraging people to stop using it; "if you're still in a position to release, emit a DeprecationWarning on import suggesting the replacements. Or something noisier like a UserWarning." Paul Moore noted that marking it as deprecated did not work, nor did ceasing releases in 2015; "I'm not at all sure 'tell people not to use it' is a viable strategy for getting marked as 'not critical'."
Opinions
On July 9, Armin Ronacher posted his thoughts about PyPI's 2FA requirement; that post was extensively discussed here at LWN, at Hacker News, and elsewhere. Ronacher makes it clear that he does not see 2FA as an unreasonable burden for maintainers of PyPI projects, but he does wonder where it all leads. For one thing, it is, apparently, only critical packages at PyPI that will be required to have 2FA set up, so "clearly the index [PyPI] considers it burdensome enough to not enforce it for everybody".
That creates something of a double standard. As Ronacher put it, he did not set out to create a critical package; that was something that happened organically. But the kinds of problems that can be prevented through 2FA, such as a malicious actor logging into PyPI with stolen credentials, can happen with any package, not just popular ones. "In theory that type of protection really should apply to every package."
But there is also a question of what else might be required down the road. When the projects at PyPI today were created, there was no mention of 2FA, so, as Ronacher pointed out, other requirements may be added later as well:
There is a hypothetical future where the rules tighten. One could imagine that an index would like to enforce cryptographic signing of newly released packages. Or the index wants to enable reclaiming of critical packages if the author does not respond or do bad things with the package. For instance a critical package being unpublished is a problem for the ecosystem. One could imagine a situation where in that case the Index maintainers take over the record of that package on the index to undo the damage. Likewise it's more than imaginable that an index of the future will require packages to enforce a minimum standard for critical packages such as a certain SLO [service level objective] for responding to critical incoming requests (security, trademark laws etc.).
Some of those requirements make perfect sense from a security standpoint; in fact, some should perhaps be in place already. But there is now an ongoing discussion about disallowing projects from being deleted from PyPI. Obviously deleting a project that other projects rely on is kind of an antisocial act, but it does seem like something the author (and probably copyright holder) should be allowed to do. It can lead to chaos like the famous left-pad fiasco, however.
The recent 2FA push from PyPI led a maintainer to accidentally remove all of the old releases of the atomicwrites package. As noted by Stufft in the PyPI deletion discussion mentioned above, he restored the atomicwrites releases at the request of the maintainer, but "it took about an hour to restore 35 files". Finding a way to head off those kinds of mistakes would be useful in addition to preventing downstream chaos when a maintainer deletes their project.
What I like about the cargo-vet approach is that it separates the concerns of running an index from vetting. It also means that in theory that multiple competing indexes could be provided and vetting can still be done. Most importantly it puts the friction of the vetting to the community that most cares about this: commercial users. Instead of Open Source maintainers having to jump through more hoops, the vetting can be outsourced to others. Trusted "Notaries" could appear that provide vetting for the most common library versions and won't approve of a new release until it undergoes some vetting.
Reaction
Django developer James Bennett had a sharply worded reply to Ronacher on July 11 (which was also discussed at Hacker News and no doubt elsewhere). In much of it, Bennett seems to be reacting to the arguments that others are making, rather than those that Ronacher made. But Bennett's main complaints with Ronacher are that the cargo vet approach is flawed and that those who release FOSS have a responsibility to users in an "ethical and social sense", even though any legal responsibility has been disclaimed in the license. "Yeah, if you publish open-source code you do have some responsibilities, whether you want them or not."
Bennett's list of responsibilities for a FOSS maintainer seems generally reasonable, "because what they really boil down to is the basic societal expectation of 'don't be an asshole'". But he is raising a strawman here, since Ronacher never argued that maintainers should be (allowed to be) assholes. Ronacher simply wondered what other requirements might be imposed on maintainers over time; some of those that he mentioned (e.g. a service level objective) would be quite onerous for a volunteer maintainer.
Bennett's weakest argument seems to be that Ronacher owes more to his users than he might voluntarily choose to give because his work on FOSS has opened various doors for him. It is a fairly strange argument, in truth. Overall, Bennett seems to be addressing lots of things that Ronacher did not say, or even imply. Ronacher was, at heart, trying to figure out where the boundaries are, not claiming that they had already been crossed.
It seems vanishingly unlikely that PyPI will be establishing two-day security-fix timelines, for example, on its critical projects, but there are surely lots of companies and other organizations out there that wish it would. There is a general tendency for all humans (and their constructs like companies) to shirk responsibilities if they can find another to pin them on. Companies and organizations that are shipping software that is dependent on the FOSS supply chain need to be deeply involved in ensuring that the code is secure.
Doing that work will cost a lot of money and take a lot of time. We are seeing efforts to do that work, and the PyPI 2FA requirement is one of those pieces. It is hardly a panacea, but it is a useful step.
As Luis Villa noted last year, FOSS maintainers are being asked to do more and more things; often they are being asked to do so without any compensation, though perhaps "doors opening" counts to a limited extent. As more critical projects are identified, it is likely we will see more conflicts of this nature. What happens when a maintainer does not want to follow the recommendations of OpenSSF (or some other similar effort) on changes? Forks are generally seen as a hostile move, but one suspects that may ultimately happen for projects that find themselves at odds with sponsoring organizations. That is a rather different world than the one FOSS grew up in.
