
Network access by Debian package builds

By Nathan Willis
September 14, 2016

The Debian project is known for strictly adhering to its various internal rules and guidelines; a package that some developers feel is in violation of the Debian Free Software Guidelines, for example, is sure to spawn considerable debate, regardless of its popularity or its origin. Recently, a Debian package maintainer realized that a particular package violated a guideline prohibiting network access during the build process, thus sparking a discussion about that guideline and what its ultimate purpose is.

On September 7, Vincent Bernat wrote to the debian-devel list, noting that the python-asyncssh package, when building, runs a unit test that attempts a DNS lookup. That would seem to violate section 4.9 of the Debian packaging policy, which states "For packages in the main archive, no required targets may attempt network access."

The issue was reported by Chris Lamb in a bug tagged as "serious." Fixing that violation, Bernat said, is simple enough—just disable the test in question—but Bernat wondered whether or not the policy rule is genuinely useful. Since the test originates in the upstream project, Debian would have to carry the patch indefinitely, adding what might be deemed considerable overhead for little real gain.

Furthermore, the test in question performs a lookup for a host named fail, which is expected not to succeed, and the purpose of the test is to ensure that the package handles the lookup failure gracefully. The build, Bernat said, works in an isolated network namespace and the package builds reproducibly with or without network access. Consequently, he asked for feedback on ignoring the violation in this case:

I have the impression that enforcing every word of the policy in the hard sense can bring endless serious bugs. [...] I appear as a bad maintainer because I don't feel this is an important bug.

Any thoughts?

Various responses to Bernat's question suggested alternatives to removing the test, but much of the discussion focused on the purpose of the no-network-access rule. Christian Seiler, among others, said the rule was intended to prevent information leaks, which a DNS lookup would certainly cause.

Paul Wise, however, felt that the rule was intended to ensure that nothing outside of the local build environment had any impact on the result of the build process. Steve Langasek concurred with that viewpoint, noting that:

If your package requires the network to build, we have a hard time auditing to make sure that the package actually contains the source for what's built. While some failures may "just" be test cases, it's better to enforce a blanket policy that packages should build without a connection to the public Internet rather than waste time figuring out which failures "really" impact the package contents.

It would appear that there are more than a few packages in the archive that do violate the no-network-access rule. Christoph Biedl said "a certain package (name withheld) did a *lot* of DNS traffic in the test suite, so far nobody has shown concerns". And the test in python-asyncssh, as written, is problematic: if there is a host named fail in the local DNS zone, the lookup will succeed. While unlikely, this is possible; a better test would be to look up a hostname that is guaranteed to be nonexistent by IANA rules (such as .invalid). In addition to being a better test of lookup-failure handling, that change would avoid the risk of an information leak through the lookup request (or, at least, reduce the risk, depending on the behavior of the nameserver).
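To make the difference concrete, here is a minimal Python sketch of a lookup-failure check; it is not the actual python-asyncssh test. The function name `resolves` and its injectable `resolver` parameter are assumptions made for illustration; only `socket.getaddrinfo` and `socket.gaierror` come from the standard library.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves, False if the lookup fails.

    The resolver argument is a hypothetical hook: it defaults to the
    real getaddrinfo() but can be replaced by a stub so that a test
    exercises the failure path without sending any DNS query.
    """
    try:
        resolver(hostname, None)
    except socket.gaierror:
        # Lookup failure is the case the test wants handled gracefully.
        return False
    return True


def failing_resolver(hostname, port):
    """Stub that always fails, as a lookup under .invalid should."""
    raise socket.gaierror(-2, "Name or service not known")
```

With the default resolver, `resolves("fail")` depends on the local DNS zone, whereas a name such as `example.invalid` falls under a reserved top-level domain. Passing `failing_resolver` removes the network from the test entirely.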

Russ Allbery contended, though, that there is no genuinely important information leak anyway. He made that comment in reply to Thomas Goirand, who suggested that the issue of attempting network access was valid, but questioned whether "serious" was an appropriate severity level:

I don't think it is a so big issue if a package is doing some network operation, but doesn't fail building if there's no Internet connectivity. The only problem (as Christian mentioned) would be a privacy concern in some cases. In such a case, the severity would be "important", but not "serious" (ie: probably not serious enough to be an RC bug), and it'd be nice if the subject of the bug was reflecting the privacy concern rather than the "no network during build" policy thing (though I can imagine it'd be harder to file the bug).

Others on the list, starting with Gregor Hermann, suggested revising the wording of the rule itself. Allbery proposed two rules, one saying that the package build "must not fail when it doesn't have network access" and another that warns against leaking privacy-related information; several similar variations arose from other participants in the thread. But Adam Borowski replied that attempts to distinguish between different types of network usage are, ultimately, doomed to fail, making such an effort pointless:

As there's no way to distinguish such details automatically, and as data/privacy leaks can be quite surprising, I'd strongly prefer the nice, simple rule of "no attempt to access outside network, period".

If _some_ network accesses are allowed, we can't easily spot the bad ones. With the current wording of the policy, iptables ... -j LOG is all you need for a QA check.

I'd amend the policy to say explicitly "localhost doesn't count as network, DNS lookup do".
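As an illustration of the sort of QA check Borowski describes (this sketch is not from the thread), the following Python fragment scans kernel log lines written by an iptables LOG rule and reports packets sent anywhere other than localhost. The `BUILDNET: ` prefix, the function name, and the abbreviated sample lines are all made up for the example; the `DST=`, `PROTO=`, and `DPT=` fields follow the usual format of LOG-target output.

```python
import ipaddress
import re

# Hypothetical prefix, as might be set with iptables' --log-prefix option.
LOG_PREFIX = "BUILDNET: "

# Fields of interest in a LOG-target line.
FIELD = re.compile(r"\b(DST|PROTO|DPT)=(\S+)")

# Made-up, abbreviated sample lines for illustration only.
SAMPLE = [
    "kernel: BUILDNET: IN= OUT=eth0 SRC=192.0.2.10 DST=192.0.2.53 "
    "LEN=60 PROTO=UDP SPT=40000 DPT=53",
    "kernel: BUILDNET: IN= OUT=lo SRC=127.0.0.1 DST=127.0.0.1 "
    "LEN=60 PROTO=TCP SPT=40002 DPT=8022",
]


def outside_accesses(log_lines, prefix=LOG_PREFIX):
    """Yield (dst, proto, dport) for logged packets that left localhost."""
    for line in log_lines:
        if prefix not in line:
            continue
        fields = dict(FIELD.findall(line))
        dst = fields.get("DST")
        if dst is None:
            continue
        if ipaddress.ip_address(dst).is_loopback:
            continue  # "localhost doesn't count as network"
        yield dst, fields.get("PROTO"), fields.get("DPT")
```

Run over `SAMPLE`, the generator reports only the DNS query to 192.0.2.53 and skips the loopback connection, which matches the amendment proposed above.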

Borowski reiterated the privacy-leak angle, saying that even innocent-looking DNS lookups violate "the Dissident Test"—that is, a user performing the build in some location where state-sponsored surveillance is a threat could put themselves at risk of investigation.

But not everyone found the "Dissident Test" argument persuasive. Allbery, in particular, contended that the lookup of a well-known hostname did not reveal significant personally identifiable data, saying:

If you are a dissident building software in an environment where even a DNS query might give away your activity, you seriously need to be using an isolated container or other precautions. It is completely unreasonable and unrealistic to expect all Debian source packages to meet this standard, even if we were trying (which we're not; we've had software that does DNS queries during the build in Debian for twenty years and no one has ever noticed before now), to a level of confidence that a dissident with this type of safety concern would need.

Moreover, Allbery continued, the entire issue is somewhat overblown:

I don't think this argument passes the sniff test for conversations with upstream. We already have enough issues with upstream over licensing, where we've decided that our very aggressive stance is worth the effort. Please let's not pick fights that *aren't* worth the effort and will cause upstream to look at us like we're paranoid nit-pickers. This sort of thing is really bad for cooperation with other projects.

Goirand concurred with that sentiment, noting that Debian already has a contentious relationship with some upstream projects. Zlatan Todoric also agreed, saying "I also feel that we are losing too much energy on this and this is not sustainable long term, nor fun."

The discussion eventually tapered off without a firm conclusion as to whether or not Debian policy should be amended and with no guidelines for a broad approach to assessing future network accesses that occur during package builds. It seems that the status quo will remain in place, then. Each package maintainer will have to individually assess any problematic network-access attempts in build targets, and some of those access attempts may survive for some time if they are determined neither to pose a serious privacy risk nor to impact the result of the build.

On the surface, it might seem like the Debian project has decided to let a minor rules violation slip through the cracks. Bernat had closed the bug as wontfix prior to raising the issue on the discussion list and has not revisited it. But, at another level, the project could be said to have taken the underlying issues—privacy and reproducible builds—seriously and determined that those principles were being upheld. And that would surely be considered Debian behaving as it usually does, with a strict attention to detail.



Network access by Debian package builds

Posted Sep 15, 2016 5:35 UTC (Thu) by pabs (subscriber, #43278) [Link]

I think this article mis-characterises my (Paul Wise) position in the thread, which was about *access/communication*, not about impact.

Network access by Debian package builds

Posted Sep 15, 2016 9:48 UTC (Thu) by lamby (subscriber, #42621) [Link]

One thing I'd like to underline here is that the bugs in question were about a small subset of packages that build *successfully* regardless of whether there is internet access or not, but still make some connection to the internet. In other words, this was not a discussion on whether builds should be permitted to require network access.

Clearly, requiring internet access and downloading third-party code, etc. is extremely suspect, but that's actually a different topic altogether to the rather more subtle one being discussed here.

(Whilst the original bug submitter, I am absent from the aforementioned debian-devel thread as I studiously avoid motivation-sapping games of bug severity "ping-pong" and Policy wording hermeneutics…)

Network access by Debian package builds

Posted Sep 15, 2016 11:02 UTC (Thu) by emk (subscriber, #1128) [Link] (7 responses)

As a non-Debian project maintainer who has _probably_ written a unit test somewhere, in some project, during the last decade, that tests negative DNS lookup, I feel that Debian's goals here are reasonable enough. But if they want to enforce this policy, they should do it centrally in their build system, without patching packages and without bugging upstream. One possible solution might be to run builds in a network namespace that has no access to the outside world. This way, every build would appear to be run offline, and no packages need to be changed.

If Debian feels that setting up a simple network namespace during a build is too much work and too much disruption, I'm not sure that upstream should be obliged to invest the effort required to uphold this policy, either. Technically, it's better to solve this problem once.

Network access by Debian package builds

Posted Sep 15, 2016 11:25 UTC (Thu) by lamby (subscriber, #42621) [Link] (6 responses)

> One possible solution might be to run builds in a network namespace that has no access to the outside world.

Alas not. The issue here is that a developer who "just" builds a package on their own machine without such a restriction will not only reveal themselves privacy-wise (the merits of which are discussed elsewhere), but also runs the risk of using code from the internet just as before with all the obvious reproducibility, reliability and security implications.

Paradoxically, I suspect this latter problem would only be more likely if Debian's official build network added such network namespacing, as developers would be inclined to think it was a solved issue.

This not only applies to developers rebuilding packages locally, but to all Debian derivatives - they would now have to introduce such a restriction to stay at parity.

Network access by Debian package builds

Posted Sep 15, 2016 16:40 UTC (Thu) by alonz (subscriber, #815) [Link] (3 responses)

My reading of the previous poster was that debian-buildpackage would set up this “no-access” namespace, so it would apply to local builds as well—not just to the Debian build infrastructure.

Network access by Debian package builds

Posted Sep 15, 2016 18:27 UTC (Thu) by flussence (guest, #85566) [Link]

Gentoo has a similar method ($FEATURES=network-sandbox), with a SOCKS proxy available for the rare cases that really need it. It's not on by default, but it probably should be.

Network access by Debian package builds

Posted Sep 15, 2016 20:45 UTC (Thu) by derobert (subscriber, #89569) [Link] (1 responses)

dpkg-buildpackage generally isn't run as root. AFAIK, that means it can't set up a new network namespace. (Or at least unshare(CLONE_NEWNET) fails with EPERM when run as non-root on my box).

Network access by Debian package builds

Posted Sep 16, 2016 12:13 UTC (Fri) by fishface60 (subscriber, #88700) [Link]

There are setuid helper programs like https://github.com/projectatomic/bubblewrap which would allow you to isolate the network.

Network access by Debian package builds

Posted Sep 15, 2016 18:46 UTC (Thu) by emk (subscriber, #1128) [Link] (1 responses)

As alonz suggested, I'm arguing that if Debian cares about forbidding network access during builds, then the debbuild (or whatever the relevant tool is called) should create a network namespace that's isolated from the outside world, and use it when running the package's build script. This is sort of like a network-only Docker container, except using the relevant kernel APIs directly. And this could be enforced for all Debian package builds.

Basically, this problem has an enormously time-consuming and annoying "social" solution that will annoy upstream maintainers and still overlook packages breaking the rules (as has apparently been the case for 20 years). But there's also a relatively simple "technical" solution that will strictly enforce the necessary policy by simulating an offline build machine. This could be as simple as a short C program invoked as "network_sandbox cmd arg1 arg2" inserted in the right place. Then you only have to fix the packages that assume an actual working internet connection, and not the packages that do DNS lookups on "example.invalid" or which talk to localhost.
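A rough sketch of such a wrapper, written in Python rather than C and offered only as an illustration of the idea: the name `network_sandbox` comes from the comment above, while everything else is an assumption. It calls the kernel's `unshare()` through `ctypes` and then executes the given command. As noted earlier in the thread, the call fails with EPERM for an unprivileged user unless it is combined with a user namespace or a setuid helper.

```python
import ctypes
import os
import sys

CLONE_NEWNET = 0x40000000  # value from <linux/sched.h>


def unshare_network():
    """Move this process into a new, empty network namespace (Linux only).

    Raises OSError on failure, for example EPERM when unprivileged.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWNET) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def main(argv):
    """Hypothetical entry point: network_sandbox cmd [arg...]."""
    if len(argv) < 2:
        print("usage: network_sandbox cmd [arg...]", file=sys.stderr)
        return 2
    try:
        unshare_network()
    except OSError as exc:
        print(f"network_sandbox: cannot isolate network: {exc}",
              file=sys.stderr)
        return 1
    try:
        # Replaces this process; returns only if the exec itself fails.
        os.execvp(argv[1], argv[1:])
    except OSError as exc:
        print(f"network_sandbox: {argv[1]}: {exc}", file=sys.stderr)
        return 127
```

A script would end with `sys.exit(main(sys.argv))`. Note that a fresh network namespace contains only a loopback interface, and that interface starts out down, so packages whose tests talk to localhost would still need it brought up inside the sandbox.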

We have these shiny new kernel features. Docker uses them every day. Let's use them in other places if it provides real benefits.

Network access by Debian package builds

Posted Sep 22, 2016 10:09 UTC (Thu) by rleigh (guest, #14622) [Link]

The underlying schroot tool already has unshare support, so networking could be disabled for the duration of the build step. Support for enabling this functionality needs adding to sbuild. It likely needs a bit of refinement and integration work to make the tools work together properly, but the pieces are all there. It just needs someone to do the work.

Build versus test

Posted Sep 15, 2016 20:00 UTC (Thu) by epa (subscriber, #39769) [Link] (2 responses)

Surely this isn't about building the program, but testing it. The package build process should distinguish between a build step (which can be required not to access the network, as per policy) and a testing step, which happens after a copy of the built package has been made. Test failure might cause the package build to be aborted, but nothing the test code does can affect the content of the package. Then the policy can have more relaxed requirements for the test part, or just require network-using test suites to be flagged specially so paranoid users can avoid running them.

Build versus test

Posted Sep 22, 2016 17:25 UTC (Thu) by pboddie (guest, #50784) [Link] (1 responses)

> the python-asyncssh package, when building, runs a unit test that attempts a DNS lookup

I seem to remember a suggestion (by Barry Warsaw, I think) involving a network proxy environment variable workaround that had already been employed to neuter the irritating proliferation of setuptools in Python packaging scripts, with its tendency to want to "dial out" for various things. Such a conflation of concerns has been rife in Python packaging for a long time, so it isn't a surprise to see a network-dependent unit test getting run during a build process for a Python package.

Build versus test

Posted Sep 30, 2016 15:56 UTC (Fri) by pboddie (guest, #50784) [Link]

Here is what seems to get done:

export http_proxy=127.0.0.1:9
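The same workaround can be sketched in Python for a build driver that launches the package's build as a subprocess. This is an illustrative sketch only: the helper name `offline_proxy_env` is hypothetical, and setting `https_proxy` as well is an assumption that goes beyond the line quoted above.

```python
import os

# Port 9 is the "discard" service; normally nothing listens there, so
# requests routed through this "proxy" fail instead of leaving the host.
DEAD_PROXY = "127.0.0.1:9"


def offline_proxy_env(base=None):
    """Return a copy of the environment with proxies pointed at a dead port."""
    env = dict(os.environ if base is None else base)
    env["http_proxy"] = DEAD_PROXY   # as in the export line above
    env["https_proxy"] = DEAD_PROXY  # assumption: cover HTTPS clients too
    return env
```

The result could be passed as the `env` argument of `subprocess.run()`. This only affects tools that honor the proxy variables; a direct DNS lookup like the one in the python-asyncssh test would not be stopped by it.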

Network access by Debian package builds

Posted Sep 21, 2016 17:44 UTC (Wed) by bernat (subscriber, #51658) [Link]

Hey!

I didn't close the bug before the discussion. I closed it after the first post by Russ: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=830568;.... I should have motivated the whole thing but this kind of "dissonance" between fellow maintainers makes me uncomfortable and I already had enough of endless debates.


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds