Code authenticity checking

By Nathan Willis
May 1, 2013

Cryptographically signed binary packages are a mainstay of modern Linux distributions, used for verifying packages against both tampering and accidental corruption. But as ever fewer users need to compile their software from source, it is possible to forget that verifying the authenticity and integrity of binary packages depends first on the packagers' ability to verify the upstream source code. In April, Allan McRae posted an entry on his blog that looked at the verifiability of the source code releases made by a number of well-known projects, with less-than-perfect results.

McRae is a core Arch Linux developer, and he undertook his verifiability survey because Arch's packaging tool makepkg recently gained support for checking the PGP signatures of source files. Naturally, such verification is only possible when the source bundle includes a PGP signature, but McRae also looked at four other interrelated factors: whether or not the signing key can be easily verified as authentic, whether checksums for the source files are available, whether the checksums are signed, and whether the checksums are available on a different server than the one hosting the source files. This last criterion is a guard against a server compromise; if an attacker can replace the source files, he or she can also replace the checksum with one that matches the replacement source.
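The checksum half of these criteria is simple to illustrate. What follows is a minimal sketch, not makepkg's implementation: the function names are invented for this example, SHA-256 is an assumed choice of algorithm, and the published checksum is assumed to have been fetched from a different server than the one hosting the source file.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a downloaded file against a checksum obtained out of band.

    `published_hex` is assumed to come from a different server than the
    source file itself; otherwise a single compromise defeats the check.
    """
    expected = published_hex.strip().lower()
    return hmac.compare_digest(sha256_of_file(path), expected)
```

A matching checksum only shows that the file is the one the checksum's publisher described; whether that publisher is the real upstream is the job of the signature and key checks.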

To provide a meaningful sample population for his survey, McRae looked at 63 "core" packages common to most Linux distributions. "For me, that basically means the packages required to build a fairly minimal booting system. This is essentially the package list from Linux From Scratch with a few additions that I see as needed …" His results are presented in a color-coded table; green cells indicate that a package meets all of the verification criteria, red indicates that no verification criteria pass, and yellow indicates that some but not all criteria are met. Of the 63 packages examined, ten offered no means of verification and 14 others offered only partial verification.
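The color coding can be sketched as a small function. The field names below are this sketch's own labels for the five factors McRae looked at, not names taken from his table, and the all/none/some rule is the simplified one described here; his actual assessments involved more judgment than a boolean check.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class Verifiability:
    """One row of a hypothetical survey table (field names are assumptions)."""
    has_pgp_signature: bool
    key_readily_verified: bool
    has_checksums: bool
    checksums_signed: bool
    checksums_on_other_server: bool


def color_for(row):
    """Green if every criterion passes, red if none does, yellow otherwise."""
    results = astuple(row)
    if all(results):
        return "green"
    if not any(results):
        return "red"
    return "yellow"
```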

The packages offering no verification mechanism are file, flex, iana-etc, isl, kbd, libarchive, pkg-config, procps, psmisc, and which. Those packages that partially meet the measured criteria fall into two basic categories: those that provide PGP signatures but have a difficult-to-verify key (gawk, groff, patch, sed, sudo, sysvinit, texinfo, and xz), and those that provide their checksum data on the same server as their files (bzip2, perl, tzdata, and zlib). In addition, gmp and less fell somewhere in between, providing a PGP signature, but making the public key or key ID available only on the same server as the source release.

To be sure, McRae's specific criteria and red/yellow/green assessments draw some rather arbitrary lines; as he observes, several of the projects have other means of verification available, and he admits that the definition of a "readily verifiable" key includes keys signed by keys that he trusts. But the aggregate picture is the important one: most of the packages are green (which is good news), but roughly 16% of them (ten of the 63) offer no source verification whatsoever (which is far from good). He also notes that the picture seems to rapidly deteriorate "as you move further away from this core subset of packages needed for a fairly standard Linux system".
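The aggregate numbers can be checked directly from the package lists given above. This is only a tally of what the article reports; the variable names are this sketch's own.

```python
TOTAL_PACKAGES = 63

# Package lists as reported in the article.
no_verification = ["file", "flex", "iana-etc", "isl", "kbd", "libarchive",
                   "pkg-config", "procps", "psmisc", "which"]
hard_to_verify_key = ["gawk", "groff", "patch", "sed", "sudo", "sysvinit",
                      "texinfo", "xz"]
checksum_on_same_server = ["bzip2", "perl", "tzdata", "zlib"]
key_on_same_server = ["gmp", "less"]

partial = hard_to_verify_key + checksum_on_same_server + key_on_same_server
fully_verifiable = TOTAL_PACKAGES - len(no_verification) - len(partial)


def share(count):
    """Percentage of the surveyed packages that `count` represents."""
    return 100 * count / TOTAL_PACKAGES


print(f"no verification:      {len(no_verification):2d} ({share(len(no_verification)):.1f}%)")
print(f"partial verification: {len(partial):2d} ({share(len(partial)):.1f}%)")
print(f"fully verifiable:     {fully_verifiable:2d} ({share(fully_verifiable):.1f}%)")
```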

Best practices

McRae's post was picked up on the oss-security mailing list, where the talk turned to how to establish a set of common guidelines for releasing source code with verifiable authenticity. Alan Coopersmith commented that X.org has received complaints asking it to do more, but without concrete suggestions. "If there was a common standard, with instructions, we'd be far more likely to spend the time to adopt it, than just a 'make signatures appear somewhere, in an unspecified format'". Eric H. Christensen concurred, saying he was interested in establishing a "best practices" recommendation for Red Hat—but, he asked, what really constitutes the best way to disseminate releases? A recommendation would advise against using MD5 for checksumming, he said, and although he favors PGP for signatures, perhaps it has its drawbacks as well.

Indeed, Alistair Crooks replied with a lengthy list of questions one might ask about a PGP-signed release, addressing everything from the key management practices employed by the signing entity to the options specified when generating the key itself (such as whether it is an RSA or DSA key and whether or not it requires a passphrase). A PGP signature proves only that a person with access to the signing key attests that the signed message had a particular value at a particular time, he argued, which does not provide much authentication:

So, all in all, what you have is a digest, signed by someone who knows the key, or who has access to the creds (if any) for the key, or who has found out the key creds, albeit with timestamp info for when the signature took place.

I'm not sure what using PGP gains us?

But the majority seemed to feel that PGP in fact provides quite a few gains. Nicolas Vigier and Florian Weimer both commented that key-continuity over multiple releases safeguards against a man-in-the-middle attacker replacing a release. Weimer noted that "hosting sites have been compromised, or serve their content exclusively over a mirror network which literally anyone can join." Kurt Seifried agreed, but acknowledged that the "real problem" with PGP is the cost of implementing it:

Key creation/storage/management/backup/etc is all non trivial and not free. Is the cost of this worth it?

I think if we are going to push this we need to come up with a pretty good set of guidelines that are easy to follow and implement.
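The key-continuity argument made by Vigier and Weimer amounts to remembering which key signed earlier releases and flagging any change. A rough sketch of that idea follows; it assumes the fingerprint of the signing key has already been extracted by an OpenPGP tool, and the project name, fingerprints, and in-memory dictionary are all hypothetical stand-ins.

```python
def check_key_continuity(pins, project, fingerprint):
    """Compare a release's signing-key fingerprint with the one seen before.

    `pins` maps project names to the fingerprint recorded at first use.
    Returns "first-seen", "unchanged", or "changed"; a "changed" result is
    not proof of an attack, only a prompt to look for a key transition
    statement or other explanation.
    """
    normalized = fingerprint.replace(" ", "").upper()
    known = pins.get(project)
    if known is None:
        pins[project] = normalized  # trust on first use
        return "first-seen"
    if known == normalized:
        return "unchanged"
    return "changed"


# Hypothetical fingerprints, for illustration only.
pins = {}
print(check_key_continuity(pins, "exampleproj", "AAAA BBBB CCCC DDDD"))
print(check_key_continuity(pins, "exampleproj", "aaaabbbbccccdddd"))
print(check_key_continuity(pins, "exampleproj", "1111 2222 3333 4444"))
```

This says nothing about whether the first key seen was genuine, which is the part of the problem that key signing and out-of-band publication are meant to address.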

Daniel Kahn Gillmor responded that even a simple "the release manager has the OpenPGP key for the project on the keyring of her personal development machine" workflow raises the bar for would-be attackers and would constitute an improvement for the numerous projects that do not currently sign their releases at all. But he still advocated producing guidelines:

I don't want us to spec out a byzantine ruleset that would put people off from *starting* to sign their releases. Maybe such a policy could break out the sophisticated stuff into the form of "baseline", "level 1", "level 2", etc.

That way we could encourage all projects to get to the "baseline" (which should be short and simple) without requiring them to "level up" right away (to offline key storage, key transition statements, etc).

He pointed to the existing OpenPGP best practices page at the Debian Grimoire wiki as an example.

For his part, McRae had comparatively simple goals in mind:

Despite PGPs limitations, what I really like to see when a release is made is a PGP signed release announcement email to the relevant mailing list with decent checksums and the file size in it. Bonus points if that email gets mirrored or at least archived on a different server than the source code. I figure for most open source software, a false release email would be spotted fairly quickly...
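What a packager might do with such an announcement can be sketched briefly. The example below assumes that the email's PGP signature has already been verified by other means and that the file size and a SHA-256 checksum have been copied out of it; the function name and return convention are invented for this illustration.

```python
import hashlib
import os


def check_download(path, expected_size, expected_sha256):
    """Check a downloaded file against values from a signed announcement.

    Returns a list of problems; an empty list means both checks passed.
    """
    problems = []
    actual_size = os.path.getsize(path)
    if actual_size != expected_size:
        problems.append(
            f"size mismatch: expected {expected_size}, got {actual_size}")
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_sha256.strip().lower():
        problems.append("SHA-256 mismatch")
    return problems
```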

Identity cleft

Although a general consensus developed around the idea of crafting a "best practices" recommendation, Crooks's questions about the limitations of PGP signatures raised some valuable points—such as the importance of distinguishing between identity and trust. Some mistook his original email for a call to ditch PGP signatures on source releases, since they do not offer absolute security, but Crooks said that was a misinterpretation. "It's a bit disappointing that my advice (in pointing out ways that PGP can be worked around in order to diminish integrity and security) was categorised as an attack on PGP itself - I shall take that as a reminder that I should be more clear in what I write." Rather, he said, he hoped to "warn against the magic 'it's signed, so it's gospel' myth by pointing out the problems of key management."

The crux of Crooks's argument is that a PGP signature should not be trusted just because it is associated with a known identity (person or team of developers). As he elaborated in one message in the ensuing thread:

I don't know if you've ever done one of the key signing parties, where you get handed government id, and that is supposed to define someone's identity. It tells age, name, and ability to keep a dead pan face in front of a camera. It says nothing about how trust-worthy someone is, in the sense that I would compile/run software written by them.

Elsewhere in the same message, he noted:

I know lots of people who write software. Some of their personal lives are train wrecks. Some I wouldn't trust to sit the right way on a toilet seat. But, for various reasons, such as mentoring, peer-programming, peer review, stringent regression tests, personal audits of their work, or because of random audits, etc, I would trust the software they write.

In essence, this is an argument that the signer only earns trust by demonstrating their reliability over time. For individuals, the issue might be the quality of the software (as Crooks discussed above), but to trust the signature on a project's releases, more stringent requirements may be necessary. As he added later: "You actually know very little about the key before and after the signing took place; so you have no way of ascertaining whether the key has been used to sign other things fraudulently."

Taken together with Crooks's earlier questions about key options and the security of the machines used to sign releases, it would seem that proper key management remains an area where there is still work to be done. Crooks did reiterate his support for encouraging PGP signatures on all source releases; he simply cautioned that blind trust in the presence of a signature can lead to a false sense of security.

Then again, a campaign to persuade more upstream software projects to integrate easily verified PGP signatures and out-of-band checksums into their release process has to start somewhere. There are practical challenges to consider, such as the role that popular hosting providers play in the build and release processes (Stuart Henderson noted that GitHub and Bitbucket dynamically generate tar archives, which complicates signing releases).

But convincing individual developers and release managers might be the best way to convince hosting services to adapt. Presumably few people enjoy seeing their project marked with the red "No verification available" label. It would certainly be informative to conduct a large-scale examination of McRae's criteria on other popular open source projects. Just over 38% of the core packages McRae examined could not be "readily validated"—a sizable minority, but a minority nonetheless; that gives one hope that the community in general takes source verifiability as a matter worth addressing.


Code authenticity checking

Posted May 2, 2013 3:18 UTC (Thu) by pabs (subscriber, #43278)

The Debian guide for upstreams mentions signing of git tags. Unfortunately it doesn't mention signing release tarballs. I hope some guidelines develop from this discussion so that Debian can link to them.

Code authenticity checking

Posted May 2, 2013 5:46 UTC (Thu) by Comet (subscriber, #11646)

There's good and there's bad to a decent security approach.

Process for Exim release preparation documented at: https://github.com/Exim/exim/wiki/EximRelease

Git tags are signed, detached PGP signatures distributed alongside the tarballs, release announcement email PGP-signed and contains checksum information in multiple checksum algorithms. Separately, we have a policy of PGP keys owned by people, not role setups, and adding @exim.org UIDs to the keys and cross-signing at face-to-face meetups. This started well, since we sorted that out at a face-to-face meeting.

Our biggest problem right now is that we've newer talent who have done a lot of the work, but none of the very few people who can cut a release has time to see it through. I've cut the past few releases, I'm not going to have time soon and we're months overdue, with many great new features building up.

It turns out that the intersection of the sets of people "will write C code to update an MTA" and "understand PGP and have gotten their key into the strong set, and can meet up with existing developers face-to-face to add @exim.org UIDs" is rather small, and *that's* our major logistical problem right now.

Formalised policy documents? We have those (er, in email minutes from a meeting). What's needed is a way to communicate more broadly what a strong set is, why PGP signatures matter, and to try to grow the usage of PGP to more and more people. The current advocacy approaches "work" (and I do that) but are not scaling well enough. We need a major step change upwards.

Code authenticity checking

Posted May 2, 2013 8:44 UTC (Thu) by talex (guest, #19139)

0install.net defines an XML format for listing available releases and digests, with a GPG signature on the end. For example, here's the feed for the SERSCIS Access Modeller tool:

http://www.serscis.eu/0install/serscis-access-modeller

(View Page Source to see the GPG signature)

Distributions should be able to update their packages from these feeds automatically if they want to.

> Github and Bitbucket dynamically generate tar archives, which complicates signing releases

0install digests are for the unpacked archive contents (like a Git tree hash, though it predates Git), which avoids this problem.
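The general idea of hashing unpacked contents rather than the archive can be sketched as follows. This is not 0install's actual manifest algorithm, only an illustration of the approach: it covers regular file contents and relative paths, and ignores permissions, symlinks, and timestamps, which a real scheme would need to decide about.

```python
import hashlib
import os


def tree_digest(root):
    """Digest a directory tree independently of how it was archived.

    Each regular file contributes a "<sha256>  <relative path>" line; the
    lines are sorted by path and the resulting listing is hashed.
    """
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as handle:
                file_hash = hashlib.sha256(handle.read()).hexdigest()
            entries.append((rel, file_hash))
    listing = "".join(f"{h}  {rel}\n" for rel, h in sorted(entries))
    return hashlib.sha256(listing.encode("utf-8")).hexdigest()
```

Two tarballs generated at different times from the same tree would differ byte-for-byte but unpack to the same digest under a scheme like this.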

SOURCE Code authenticity checking

Posted May 2, 2013 12:26 UTC (Thu) by etienne (guest, #25256)

A better name for the title is "Source code authenticity checking", it does nothing about checking the "compiled code" (in case the RPM has been downloaded from a compromised site, where SRPM was unchanged), and for libraries it does nothing about checking either the library file on the disk, nor the shared library loaded into memory...
Considering the number of people compiling themself...

SOURCE Code authenticity checking

Posted May 4, 2013 23:03 UTC (Sat) by lsl (subscriber, #86508)

Package verification is a solved problem for RPMs and signatures are checked automatically. A compromised mirror doesn't matter much as long as the user doesn't force installation of a bad package.

But the whole signing infrastructure becomes worthless if some packager uploads a compromised source code archive into the system. That's what the article is about.

Planting the root of trust is another thing. The current expectation is that the initial installation is done with trusted media.

Code authenticity checking

Posted May 2, 2013 19:59 UTC (Thu) by Trou.fr (subscriber, #26289)

Just a reminder: even basic "the release manager has the OpenPGP key for the project on the keyring of her personal development machine" can help detect backdoors such as the one that was inserted in vsftpd :

http://scarybeastsecurity.blogspot.fr/2011/07/alert-vsftp...

Or :
https://forums.proftpd.org/smf/index.php?topic=5206.0

Or would have helped detect :
http://www.phpmyadmin.net/home_page/security/PMASA-2012-5...

Or this one:
http://www.phpmyfaq.de/advisory_2010-12-15.php

Or this one :
http://piwik.org/blog/2012/11/security-report-piwik-org-w...

And so on... And if distros did automate sig checking, it would help early detection.

Code authenticity checking

Posted May 2, 2013 20:31 UTC (Thu) by jmorris42 (guest, #2203)

Here is an idea for people to shoot holes in.

If source releases had a machine identifiable way to obtain a signed manifest of file checksums a central repository could collect and monitor them. Once you have that it can spot new signing keys that haven't been signed by the old key (a sign that the signing authority changed as a normal event). It could collect revocations and warn if an old key pops back up. Once one or more such repositories existed package building workflows could integrate automated support for validating source tarballs, git pulls, etc.

So how about for every .tar.gz made available a .release file also appear, with release notes and either just a hash of the .tar.gz or a detailed list of hashes for the contents, followed up with a GPG signature and the full public key that signed it including any attesting signatures.

To implement this baseline all that would need to be created is a utility to crank out the .release files and for someone to create the first repository to track them.
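A very rough sketch of what such a utility might emit, purely for discussion: the "archive-sha256:" line and the overall layout are invented here, and the GPG signing step and the attached public key are left out entirely.

```python
import hashlib
import tarfile


def build_release_manifest(tarball_path, release_notes=""):
    """Return the unsigned text of a hypothetical .release file.

    Contains the hash of the .tar.gz itself, followed by one
    "<sha256>  <member name>" line per regular file inside it.
    """
    with open(tarball_path, "rb") as handle:
        archive_hash = hashlib.sha256(handle.read()).hexdigest()
    lines = [f"archive-sha256: {archive_hash}"]
    if release_notes:
        lines.append(f"notes: {release_notes}")
    with tarfile.open(tarball_path, "r:gz") as archive:
        for member in sorted(archive.getmembers(), key=lambda m: m.name):
            if member.isfile():
                content = archive.extractfile(member).read()
                lines.append(f"{hashlib.sha256(content).hexdigest()}  {member.name}")
    return "\n".join(lines) + "\n"
```

The output would still need to be signed, and a tracker would still need to decide what to make of the key that signed it.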

Don't understand the details of git enough to say exactly how it would need to be extended to make .release files available for each tagged release but it should be a fairly straightforward problem as well.

More critical projects could have more strict control over keys and chains of trust, etc. The central trackers could simply note all those details, leaving it to the individual to decide which chains of trust mean what to them. In other words, only define mechanism and not policy.

Code authenticity checking

Posted May 2, 2013 21:56 UTC (Thu) by mathstuf (subscriber, #69389)

> Don't understand the details of git enough to say exactly how it would need to be extended to make .release files available for each tagged release but it should be a fairly straightforward problem as well.

Well, git could store them as objects within a namespace (e.g., refs/releases/v1.0). I don't know how other version control systems would cope.

Code authenticity checking

Posted May 3, 2013 17:15 UTC (Fri) by talex (guest, #19139)

> If source releases had a machine identifiable way to obtain a signed manifest of file checksums

Like this?

http://0install.net/interface-spec.html#implementations
http://0install.net/interface-spec.html#signatures

> a central repository could collect and monitor them.

Like this? http://roscidus.com/0mirror/

> To implement this baseline all that would need to be created is a utility to crank out the .release files

Like this? http://0install.net/0release.html

Code authenticity checking

Posted May 2, 2013 20:50 UTC (Thu) by tfheen (subscriber, #17598)

The comment about pkg-config not being verifiable is wrong. The git tags are signed, and tar.gz releases are just convenience releases, you should really run auto* yourself and build from the real source, git.

(Or just use the binary packages provided by your distribution.)

Code authenticity checking

Posted May 9, 2013 10:15 UTC (Thu) by grawity (subscriber, #80596)

> (Or just use the binary packages provided by your distribution.)

This doesn't really work if you are the pkg-config package maintainer for said distribution, and want to verify the sources before building binary packages...