Debian to require reproducible builds

Paul Gevers has slipped an interesting bit of news into a "bits from the release team" message:

Aided by the efforts of the Reproducible Builds project, we've decided it's time to say that Debian must ship reproducible packages. Since yesterday, we have enabled our migration software to block migration of new packages that can't be reproduced or existing packages (in testing) that regress in reproducibility.
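The rule in the announcement acts as a ratchet: new packages must reproduce, and packages already in testing must not get worse. A minimal sketch of that decision, assuming a hypothetical `may_migrate` helper and a simple boolean reproducibility status (neither is taken from Debian's actual migration software):

```python
from typing import Optional


def may_migrate(reproducible_now: bool,
                reproducible_in_testing: Optional[bool]) -> bool:
    """Decide whether a candidate package may migrate to testing.

    reproducible_now: status of the candidate version.
    reproducible_in_testing: status of the version currently in testing,
        or None if the package is new (hypothetical encoding).
    """
    if reproducible_in_testing is None:
        # New package: it is blocked unless it can be reproduced.
        return reproducible_now
    if reproducible_in_testing and not reproducible_now:
        # Existing package that regressed in reproducibility: blocked.
        return False
    # Unchanged or improved status: reproducibility does not block it.
    return True
```

Under this reading, a package that was already unreproducible in testing and stays that way is not blocked; only new failures and regressions are.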

As Gioele Barabucci pointed out, "reproducible" in this sense is limited to building within an instance of Debian's build environment, which is a tighter requirement than is normally used. It is still a big step forward for reproducible builds.


"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 14:20 UTC (Mon) by marcH (subscriber, #57642) [Link] (18 responses)

> "reproducible" in this sense is limited to...

"Reproducible" sounds binary (yes/no) but it's not at all. I wish there were a more nuanced term but I can't think of any. Also, that ship has sailed.

For instance, maybe I can very easily reproduce your build when we are both using Linux, while it's impossible when only one of us uses Windows. Or maybe it's possible even in that case.

Unless you restrict yourself to a perfectly hermetic build, the environment has a gazillion parameters/inputs and it's practically impossible to get test coverage for all of them.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 16:06 UTC (Mon) by farnz (subscriber, #17727) [Link] (17 responses)

Reproducible is, in and of itself, a binary question: if you follow the same process, do you get the same output, or a different output?

I think, though, that what you're reaching for is the complexity of the reproduction process - how hard is it to reproduce my build? If the build is reproducible only as long as you're running on the Linux 6.12.0 kernel on an Intel Xeon E3-1245v2 with between 4 GiB and 12 GiB of RAM, and you're using a 7,200 RPM HDD for mass storage, that represents a build that's hard to reproduce. If I can run on any hardware using a Debian 10 or later release as my base software install and get the same package, that's less challenging. It becomes even easier to reproduce the build if you can build on any host OS (Windows, macOS, Linux, FreeBSD etc) and get the same bits.

In other words, if it's easy to reproduce, that's (in some sense) better than if it's hard - if you have to spend $100,000 to replicate my setup, and then you can reproduce the same binary, that's a lot harder to do than if you can reproduce the binary on whatever hardware and software you have lying around.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 16:30 UTC (Mon) by marcH (subscriber, #57642) [Link] (10 responses)

> Reproducible is, in and of itself, a binary question: if you follow the same process, do you get the same output, or a different output?

Define "same process"? That's why it's not a binary question.

> I think, though, that what you're reaching for is the complexity of the reproduction process - how hard is it to reproduce *my* build? (emphasis added)

Right: if it's complex then it's not a yes/no question.

Even worse: most people confuse "Can you reproduce _this_ build?" performed in a relatively well defined environment with the much more vague: is "the build" (!?) reproducible.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 16:57 UTC (Mon) by farnz (subscriber, #17727) [Link] (9 responses)

Defining "same process" is simple - it's the process that the person claiming you can reproduce their build says you need to use to get the same binary.

And by your definition, there is no such thing as a yes/no question - even something like "are you in France?" is not a yes/no question, let alone something like "is it possible to travel from Paris to Lille?".

The base question of "is it possible to reproduce a binary build" is yes/no; but, like any other "yes/no" question, there's a whole bunch of detail that comes into "how easy is it to reproduce a build".

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 20:33 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link] (8 responses)

I think a better way of saying it is that the question most people care about isn't whether a build is theoretically reproducible but whether they can reproduce it themselves. As I understand it, Debian's long-term goal with reproducible builds isn't just to let them reproduce builds on their build farm; it's to let any user build a bit-for-bit identical version of the distribution package on their local system. They definitely haven't reached that goal yet.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 20:39 UTC (Mon) by marcH (subscriber, #57642) [Link] (6 responses)

> whether a build is theoretically reproducible but...

Define "theoretically reproducible".

Best of luck.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 21:09 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link] (5 responses)

What I meant was that people don't care if the build can be reproduced by jumping through enough hoops; they want to know if they can do it through some easy to implement process on their own machine. As I understand it, that's what Debian is aiming for. The long-term goal is for reproducibility to be a standard part of the process. Anyone who builds the same source package using the same tools should get the same output except the cryptographic signature. The point of the caveat in the announcement is that they have taken an important step but haven't reached that long-term goal yet.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 21:47 UTC (Mon) by marcH (subscriber, #57642) [Link] (4 responses)

> What I meant was that people don't care if the build can be reproduced by jumping through enough hoops; they want to know if they can do it through some easy to implement process on their own machine.

Again, define "enough" and "easy". It's different for different people/use cases/systems. So it does not really mean anything. How many planets should users have to align to successfully reproduce? How much time and effort should it take them? Do they need the same operating system or not? Etc.

> The point of the caveat in the announcement is that they have taken an important step but haven't reached that long-term goal yet.

If you read very slowly Gioele Barabucci's message quoted in the main article https://lwn.net/ml/all/603a3905-a87b-47c2-b834-12e58bed13..., you can see that this "long-term" goal is not formally defined either because the definition relies on the vague "many different environments slightly different from each other" + a bunch of examples. Where does that stop? There is no line in the sand.

> "Given a package it is possible to build it in many different environments, each of which is slightly different from the previous one (for example uses a different timezone, a different language, a different underlying file system, etc)

The term "reproducible" is exactly as vague as "bug-free": it depends. Because reproducibility issues are just compile-time bugs, that's all. Some bugs are very "popular" and hit almost everyone while other reproducibility bugs are more niche and affect very few environments and users. In any case it's a _spectrum_ of issues. But everyone knows that "bug-free" is just an ideal/direction.

The lack of a formal definition does not make the concept useless, absolutely not. Just like for all other bugs, the number of open reproducibility issues should be minimized to make users' life better and increase the _chances_ that a given build in a given configuration and environment will be successfully reproduced.

PS: you keep demonstrating my initial point, thanks!

"Reproducible" sounds like a yes/no question but it's not

Posted May 12, 2026 7:30 UTC (Tue) by gioele (subscriber, #61675) [Link]

> If you read very slowly Gioele Barabucci's message quoted in the main article https://lwn.net/ml/all/603a3905-a87b-47c2-b834-12e58bed13..., you can see that this "long-term" goal is not formally defined either because the definition relies on the vague "many different environments slightly different from each other" + a bunch of examples. Where does that stop? There is no line in the sand.

I can confirm from my experience that what marcH says ("There is no line in the sand") is mostly true. The line is there (as implicitly defined by reprotest), but it is on the sand and it may change when the wind changes.

It has happened in the past that a bunch of packages that were "reproducible" one day stopped being "reproducible" once reprotest developed the ability to test a new variation (e.g., different timezones). And another day a bunch of packages became "reproducible" once reprotest stopped testing a certain variation (e.g., build paths). The definition of reproducibility changes as our ability to test variations in the build environment extends and improves.
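The effect described above, where the verdict changes with the set of variations being tested, can be sketched with a toy model. Everything here is hypothetical: `toy_build` stands in for a real package build, and `is_reproducible` is an illustration, not reprotest's API.

```python
import hashlib
from typing import Callable, Iterable, Mapping

Env = Mapping[str, str]


def digest(artifact: bytes) -> str:
    """SHA-256 of the build output, used as the identity of a build."""
    return hashlib.sha256(artifact).hexdigest()


def is_reproducible(build: Callable[[Env], bytes],
                    baseline: Env,
                    variations: Iterable[Env]) -> bool:
    """Build once in the baseline environment, then once per variation.

    The result is True only if every varied build matches the baseline.
    """
    reference = digest(build(baseline))
    for overrides in variations:
        varied_env = {**baseline, **overrides}
        if digest(build(varied_env)) != reference:
            return False
    return True


def toy_build(env: Env) -> bytes:
    """A toy 'build' that leaks the timezone into its output."""
    return f"payload built in {env['TZ']}".encode()


baseline = {"TZ": "UTC", "LANG": "C.UTF-8"}
locale_only = [{"LANG": "fr_FR.UTF-8"}]
with_timezone = locale_only + [{"TZ": "Asia/Tokyo"}]

print(is_reproducible(toy_build, baseline, locale_only))    # True
print(is_reproducible(toy_build, baseline, with_timezone))  # False
```

The same toy package passes while only the locale is varied and fails once a timezone variation is added, which is why a qualifier such as "stress-tested reproducible" ties the claim to a particular test.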

For this reason I recently suggested that "reproducible" should be qualified with some sort of adjective or qualifier. For example I proposed "stress-tested reproducible" and "in-$distro reproducible". The former highlights the fact that the reproducibility claim comes from a test (that may change in the future), the latter focuses on the fact that a package is reproducible inside the constraints (and default settings) of the sanctioned build environment of a distro.

"Reproducible" sounds like a yes/no question but it's not

Posted May 12, 2026 10:21 UTC (Tue) by burki99 (subscriber, #17149) [Link] (2 responses)

People who insist on „define“ can be a real pain. It seems quite clear to me what the parent poster is hoping for: something along the lines:

apt-get install reproducable-build-system
rebuild pkg-name
compare pkg-locally-built to pkg-from-repo

If that’s feasible for the majority of packages, then the goal of bringing reproducible builds not just to the distribution but directly to its users seems in sight.
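The final "compare" step in that workflow amounts to checking that two files are bit-for-bit identical. A minimal sketch, assuming the locally built package and the one downloaded from the repository are both available as files (the function names are made up for illustration):

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large packages need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_bits(local_build: Path, from_repo: Path) -> bool:
    """True if the two package files have identical contents."""
    return sha256_of(local_build) == sha256_of(from_repo)
```

A mismatch only says that the two files differ; finding out why takes a closer look at their contents.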

"Reproducible" sounds like a yes/no question but it's not

Posted May 12, 2026 11:23 UTC (Tue) by marcH (subscriber, #57642) [Link] (1 responses)

> People who insist on „define“ can be a real pain.

On the other hand, people not familiar with a topic aggressively correcting other, experienced people is so pleasant :-)

BTW I'm the one trying to highlight the _lack_ of a formal definition; which is fine. Afraid you missed that.

> something along the lines: ...

... and that is extremely useful, zero doubt about that. I've only been trying to show how misleading the term "reproducible" is, which explains why Gioele made the comment he made. I never expected this clarification to unleash so much XKCD 386 (and to demonstrate my point beyond all expectations).

"Reproducible" sounds like a yes/no question but it's not

Posted May 18, 2026 22:18 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

> BTW I'm the one trying to highlight the _lack_ of a formal definition; which is fine. Afraid you missed that.

Who cares if there's a *formal* definition? Is there some requirement I'm not aware of?

"Reproducible" sounds like a yes/no question but it's not

Posted May 12, 2026 5:41 UTC (Tue) by cjwatson (subscriber, #7322) [Link]

We don't have 100% coverage yet, of course - there's a long tail of problems, and the point of the recent change is to add a ratchet so things don't regress. But for packages where we do have coverage, users can in fact run https://manpages.debian.org/trixie/devscripts/debrebuild.... on their local machines to confirm that our build daemon didn't inject anything nefarious. I think that's a useful property.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 17:05 UTC (Mon) by iabervon (subscriber, #722) [Link] (5 responses)

Every build is reproducible, if you "follow the same process" closely enough. It's all a matter of allowing the process to be slightly different ("your RNG doesn't have to produce the same values mine did", "your clock could be different from what mine was in each step", "your username/hostname/directory could be anything you want") and still getting the same output. Exactly what needs to be able to vary is where the shades of meaning come in, and containers allow for specifying things to be part of the process that would traditionally not be reasonable to specify (e.g., "in order to reproduce this build, build in a container where the hostname is the hostname of the physical machine that did the build the first time"), but that isn't necessarily fair game for everyone.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 17:08 UTC (Mon) by farnz (subscriber, #17727) [Link] (3 responses)

If the process isn't recorded, you can't "follow the same process", because you don't know what the process is - and this goes doubly for builds with things like timestamps in, where part of "the process" is "ensure that your build system's timekeeping matches the original build host perfectly".

A reproducible build is one where you can follow the same process, because enough of it is recorded/documented that you can do the same thing and get the same output.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 18:37 UTC (Mon) by marcH (subscriber, #57642) [Link] (1 responses)

> because enough of it...

Again: not binary.

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 18:44 UTC (Mon) by farnz (subscriber, #17727) [Link]

Again, no such thing as a binary question given the restrictions you put on it.

"Reproducible" sounds like a yes/no question but it's not

Posted May 12, 2026 21:31 UTC (Tue) by kpcyrd (subscriber, #183784) [Link]

In case of Debian, the process is very well recorded - the Debian source package specifies build instructions, the exact source code, all patches, and there's also buildinfo files available documenting the exact build environment used by Debian's build servers: https://buildinfos.debian.net/

Having to have a magic clock that can return the same value the build server has observed from their clock is not considered reasonable. Same goes for random number generators, but also "you are not expected to read the build server kernel version" or "your binary shouldn't depend on the order your filesystem driver has responded to readdir".
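Two of the inputs mentioned above, the clock and directory listing order, can be kept out of an artifact by fixing them explicitly. The following is a sketch only, assuming a hypothetical `deterministic_tar` helper built on Python's standard `tarfile` module; it is not how Debian's tooling builds packages. The `epoch` argument plays the role that the `SOURCE_DATE_EPOCH` convention from the Reproducible Builds project usually fills (that convention is not mentioned in the comment above).

```python
import io
import tarfile
from pathlib import Path


def deterministic_tar(root: Path, epoch: int) -> bytes:
    """Archive `root` so the bytes depend only on file names and contents."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.GNU_FORMAT) as tar:
        # Sort entries instead of trusting the filesystem's readdir order.
        for path in sorted(root.rglob("*")):
            arcname = path.relative_to(root).as_posix()
            info = tar.gettarinfo(str(path), arcname=arcname)
            # Replace the real modification time with a fixed timestamp.
            info.mtime = epoch
            # Drop the identity of whoever ran the build.
            info.uid = info.gid = 0
            info.uname = info.gname = ""
            # Normalize permissions so the builder's umask does not leak in.
            executable = info.isdir() or bool(info.mode & 0o100)
            info.mode = 0o755 if executable else 0o644
            if info.isreg():
                with path.open("rb") as f:
                    tar.addfile(info, f)
            else:
                tar.addfile(info)
    return buf.getvalue()
```

Two trees with the same files, created in a different order and at different times, should then produce identical archive bytes for the same `epoch`.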

An openSUSE developer has documented all the things you are expected _not_ to do (although I personally disagree on the "aslr" entry): https://github.com/bmwiedemann/theunreproduciblepackage

"Reproducible" sounds like a yes/no question but it's not

Posted May 11, 2026 23:06 UTC (Mon) by david.a.wheeler (subscriber, #72896) [Link]

No, not quite. If the source code was not malicious, but someone tampered with the build process (like what happened to SolarWinds Orion), then it won't reproduce. Which is a good thing... that means that this kind of attack is countered.

reproducibility within buildd

Posted May 11, 2026 23:43 UTC (Mon) by nickodell (subscriber, #125165) [Link] (2 responses)

> As Gioele Barabucci pointed out, "reproducible" in this sense is limited to building within an instance of Debian's build environment, which is a tighter requirement than is normally used.

I'm confused. Wouldn't that be a weaker requirement, not a tighter requirement?

Weaker or tighter?

Posted May 12, 2026 1:36 UTC (Tue) by geuder (subscriber, #62854) [Link]

Depends on the perspective: It's a weaker requirement for those offering a "reproducible" package. It's a tighter requirement for those reproducing a build.

reproducibility within buildd

Posted May 12, 2026 21:14 UTC (Tue) by kpcyrd (subscriber, #183784) [Link]

I believe this refers to buildinfo files: Debian documents the exact build environment used for every build (which packages were installed, and in which versions). Packages are expected to be reproducible in the environment specified in the buildinfo file. I agree the tighter/weaker aspect is confusing.
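To make "which packages were installed in which version" concrete, here is a rough sketch that reads the `Installed-Build-Depends` field from buildinfo-style text. The sample content and its version numbers are invented for illustration, and the parser ignores details a real tool would have to handle (for example, signed files).

```python
import re
from typing import Dict

# Invented sample in the style of a .buildinfo file; not a real build record.
SAMPLE = """\
Format: 1.0
Source: hello
Version: 2.10-3
Build-Architecture: amd64
Installed-Build-Depends:
 debhelper (= 13.11.4),
 gcc-12 (= 12.2.0-14),
 libc6-dev (= 2.36-9)
"""


def installed_build_depends(buildinfo_text: str) -> Dict[str, str]:
    """Map each pinned build dependency to its exact recorded version."""
    pins: Dict[str, str] = {}
    in_field = False
    for line in buildinfo_text.splitlines():
        if line.startswith("Installed-Build-Depends:"):
            in_field = True
            continue
        if in_field:
            # Continuation lines are indented; anything else ends the field.
            if not line.startswith((" ", "\t")):
                break
            match = re.match(r"\s*(\S+) \(= ([^)]+)\)", line)
            if match:
                pins[match.group(1)] = match.group(2)
    return pins


print(installed_build_depends(SAMPLE))
```

Recreating an environment with exactly these versions is what makes the requirement demanding for the person reproducing the build, even if it is an easier target for the package itself.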

Gentoo has entered the chat...

Posted May 12, 2026 13:52 UTC (Tue) by NightMonkey (subscriber, #23051) [Link]

I don't know if Gentoo's developers have Reproducible Builds on their radar, but this set of tasks seems well-matched with a distro that is almost entirely source-based in its default setup. :) Other binary-based distros could cherry-pick Gentoo's Portage packages (aka build recipes). Cheers.

DSEE for reproducible builds

Posted May 14, 2026 23:07 UTC (Thu) by jreiser (subscriber, #11027) [Link]

The reproducible build idea was part of the "Domain Software Engineering Environment" (DSEE; pronounced "dizzy") which was offered by Apollo Computer Inc around the 1980s and 1990s: thirty to forty years ago. (Apollo Computer and Sun Microsystems were serious competitors in some markets.) Each software component had a Makefile which specified all the tools and versions that were used to build it, and the result of each build contained a unique identifier for that build. Each tool also had such a Makefile (recursively), and the version control system for source code and built components contained the complete history of all components that ever existed. The operating system guaranteed that enough machinery existed to re-create any build in the entire history, using the specified version of each tool at each step. A typical build farm used dozens to hundreds of machines connected by 400MHz token-ring cabling (circular full duplex daisy chain). It was a chore to keep every machine running during a build of the entire system, which might take several days depending on which point-in-time was chosen to be the epoch.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds