
Toward a fully reproducible Debian

Posted Jun 15, 2018 19:27 UTC (Fri) by Karellen (subscriber, #67644)
In reply to: Toward a fully reproducible Debian by xtifr
Parent article: Toward a fully reproducible Debian

Honest question - why do you want build-time time-stamps? If the software built today is the same as the same software built last week, bit-for-bit, does it actually matter when it was built?



Toward a fully reproducible Debian

Posted Jun 15, 2018 19:50 UTC (Fri) by pizza (subscriber, #46) [Link] (15 responses)

It's so you can (relatively) easily tell that something has intentionally changed. Such as when you're actively developing/testing something and don't want to have to bump/change a revision string every time you type 'make'.

Toward a fully reproducible Debian

Posted Jun 15, 2018 20:16 UTC (Fri) by jani (subscriber, #74547) [Link] (11 responses)

> It's so you can (relatively) easily tell that something has intentionally changed.

A timestamp doesn't tell you that. A git commit id does. Use git describe for creating your version string.
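
A minimal sketch of that idea in Python, assuming `git` is on the PATH and the build runs inside a checkout. The helper name `describe_version` and the `"unknown"` fallback are made up for illustration; only the `git describe` flags are real.

```python
import subprocess


def describe_version(repo_dir=".", fallback="unknown"):
    """Return `git describe` output for repo_dir, or fallback if unavailable."""
    try:
        out = subprocess.run(
            # --always: fall back to an abbreviated commit id when no tag matches
            # --dirty:  append "-dirty" when the working tree has local changes
            ["git", "describe", "--tags", "--always", "--dirty"],
            cwd=repo_dir, check=True, capture_output=True, text=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        # git is missing, the directory does not exist, or it is not a repository
        return fallback
    return out.strip() or fallback


if __name__ == "__main__":
    print(describe_version())
```

The result changes only when the commit or the dirty state changes, so two builds of the same clean tree get the same version string.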

Toward a fully reproducible Debian

Posted Jun 15, 2018 22:35 UTC (Fri) by k8to (guest, #15413) [Link] (4 responses)

git hashes cannot be sorted. This is a significant reduction in utility.

Toward a fully reproducible Debian

Posted Jun 15, 2018 22:59 UTC (Fri) by tux3 (subscriber, #101245) [Link] (2 responses)

If that's the only objection, the timestamp of the git commit is as reproducible as the git hash itself. And obviously can be sorted.

Toward a fully reproducible Debian

Posted Jun 16, 2018 8:07 UTC (Sat) by josh (subscriber, #17465) [Link] (1 responses)

And that's what reproducible builds typically do; they export SOURCE_DATE_EPOCH in the environment pointing to either a commit timestamp or changelog timestamp, and then anything that wants a timestamp uses that date.

https://reproducible-builds.org/specs/source-date-epoch/
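
For illustration, a tool that wants to embed a timestamp could honor the variable roughly like this. It is a sketch, not any particular tool's implementation; the function name `build_timestamp` and its `environ` parameter are assumptions made for the example.

```python
import os
import time
from datetime import datetime, timezone


def build_timestamp(environ=os.environ):
    """Return the timestamp to embed in build output, as an aware UTC datetime."""
    # Prefer SOURCE_DATE_EPOCH (seconds since the Unix epoch) when it is set;
    # otherwise fall back to the wall clock, as a development build would.
    epoch = int(environ.get("SOURCE_DATE_EPOCH", time.time()))
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


if __name__ == "__main__":
    print(build_timestamp().strftime("%Y-%m-%d %H:%M:%S UTC"))
```

Formatting in UTC matters here: rendering the same epoch value in the builder's local time zone would bring back a difference between machines.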

Toward a fully reproducible Debian

Posted Jun 19, 2018 6:18 UTC (Tue) by k8to (guest, #15413) [Link]

Hooray!

Glad our armchair commentary is already addressed.

Toward a fully reproducible Debian

Posted Jun 15, 2018 23:20 UTC (Fri) by hmh (subscriber, #3838) [Link]

git commit ids might not be trivially sortable, but "git describe" output can be made to be. OpenWRT (openwrt.org) has been doing that for quite a while; look at how they do it if you're curious.

But that doesn't even matter: you can just use the timestamp of the vcs commit, and embed that instead of wallclock time wherever you need reproducibility.

Toward a fully reproducible Debian

Posted Jun 16, 2018 14:17 UTC (Sat) by pizza (subscriber, #46) [Link] (5 responses)

That requires a commit to be performed every time one builds, which I prefer not to do until it at least passes local smoke testing. That workflow is also not how most software is actually developed.

In my experience supporting users, it's quite useful to have both an actual source "timestamp" (actually, two -- one that is derived from revision control at compile time, and another that is derived and fixed when the source was snapshotted/released) plus a build timestamp.

Saves a lot of going back and forth.

Toward a fully reproducible Debian

Posted Jun 16, 2018 16:02 UTC (Sat) by niner (subscriber, #26151) [Link]

In Rakudo Perl 6 we use a hash of the whole source tree to get the "has it changed in any way" information. This replaced timestamps and is even more precise.
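
A generic sketch of the technique, not Rakudo's actual implementation: walk the tree in a fixed order and feed every relative path and file content into one digest. The function name `tree_hash` and the choice of SHA-256 are assumptions for the example.

```python
import hashlib
import os


def tree_hash(root):
    """Return a SHA-256 hex digest covering every file path and content under root."""
    digest = hashlib.sha256()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fix the traversal order so the digest is deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            digest.update(rel.encode("utf-8") + b"\0")
            with open(path, "rb") as fh:
                data = fh.read()
            # A length prefix keeps file boundaries unambiguous.
            digest.update(str(len(data)).encode("ascii") + b"\0")
            digest.update(data)
    return digest.hexdigest()
```

Unlike a timestamp, the digest is unchanged by a rebuild of identical sources and changes on any edit, including uncommitted ones.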

Toward a fully reproducible Debian

Posted Jun 18, 2018 12:45 UTC (Mon) by error27 (subscriber, #8346) [Link] (3 responses)

The time stamp tells you very little. Most of the time a git commit id works fine, with a + character on the end to show that the tree has been modified locally. That actually gives you more information than the time stamp alone.

Toward a fully reproducible Debian

Posted Jun 18, 2018 13:07 UTC (Mon) by pizza (subscriber, #46) [Link] (2 responses)

> The time stamp tells you very little.

And I'm saying (for a third time) that this "very little" is still quite useful in certain contexts.

> Most of the time a git commit works fine and a + character on the end to show it has been modified locally. That actually gives you more information than just the time stamps.

That only tells you if something has been modified since the last commit, which is simultaneously more info, and less info, than a simple build timestamp can provide.

Toward a fully reproducible Debian

Posted Jun 18, 2018 15:02 UTC (Mon) by johill (subscriber, #25196) [Link]

You could still put a git commit + timestamp if modified, or just plain git commit if unmodified, I guess?
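
Something like the following, as a rough sketch. The name `version_string`, the `+` separator, and the timestamp format are arbitrary choices for illustration; the commit id and the modified flag would come from the VCS.

```python
from datetime import datetime, timezone


def version_string(commit_id, modified, now=None):
    """Plain commit id for a clean tree; commit id plus build time if modified."""
    if not modified:
        # Clean tree: the commit id alone identifies the source, so the
        # version string stays reproducible.
        return commit_id
    if now is None:
        now = datetime.now(timezone.utc)
    # Modified tree: the build is not reproducible from the commit anyway,
    # so a wall-clock timestamp costs nothing and helps tell builds apart.
    return f"{commit_id}+{now.strftime('%Y%m%dT%H%M%SZ')}"
```

With this scheme release builds stay bit-for-bit identical, while local development builds still get a distinct string on every `make`.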

Toward a fully reproducible Debian

Posted Jun 21, 2018 0:19 UTC (Thu) by Comet (subscriber, #11646) [Link]

As an example of the utility: letting applications that link against libraries with backported security fixes detect the existence of updates.

For instance, immediately post-heartbleed I wrote an nginx module `nginx-openssl-version` which lets you turn a situation where the loaded version of OpenSSL is too old into a runtime configuration error. For various deployment scenarios, I was facing people messing with builds without understanding them, and I knew that if nothing was done, we'd regress the Heartbleed fixes. For stuff we deployed, we just included new versions of OpenSSL and were able to key off version numbers. Yes, this did save us at least once, thanks to the predicted less-than-informed tinkering.

After publishing the code as open source, the very first issue filed was Debian/Ubuntu support, because those distributions were backporting security fixes and not changing the version number. Frustrating for me, but fair for their goals, with the limits of the version numbers they could propagate through the library. But ... there was a library build date. So the first feature developed after publication was "be able to enforce a minimum build timestamp". Cue happy Debian/Ubuntu users.

Reproducible builds are just fine: the OpenSSL code gets patched, the reproducible timestamp gets updated, everyone's happy. There would only be an issue if a patch were applied in 2018 but the code still claimed that it was built in 2015.

From the other side of the fence, it was fairly simple to adapt even Exim's codebase to be able to use `$SOURCE_DATE_EPOCH` if found in environ during build; this was committed in 2017-04.

Toward a fully reproducible Debian

Posted Jun 15, 2018 23:09 UTC (Fri) by Karellen (subscriber, #67644) [Link] (1 responses)

Actually, now that I think about it, I'm pretty sure that build-time time-stamps are only fixed if the SOURCE_DATE_EPOCH environment variable is set, in which case they are set to the date stored in that variable. If you don't set it, e.g. during development builds, you should get "honest" timestamps as before.

https://reproducible-builds.org/specs/source-date-epoch/

Toward a fully reproducible Debian

Posted Jun 16, 2018 2:09 UTC (Sat) by pabs (subscriber, #43278) [Link]

In general, timestamps should be removed or replaced with timestamps based on the date the source changed. SOURCE_DATE_EPOCH is only a workaround for situations where build timestamps cannot be removed (like in on-disk formats that contain timestamps).

Toward a fully reproducible Debian

Posted Jun 16, 2018 2:01 UTC (Sat) by guillemj (subscriber, #49706) [Link]

The build-time timestamp updates *on-disk* are neither banned nor discouraged. Quite the opposite: not doing them would break a ton of stuff. What we do is clamp those timestamps, when generating the resulting artifacts, to a fixed value that gets increased at controlled times (usually during the package release process, recorded in debian/changelog, and passed along via the referenced SOURCE_DATE_EPOCH spec). In the case of dpkg and Debian, that includes at least the .deb ar members and the tar archives within them, as well as any other timestamp embedded in the included files by the various other tools, which have been improved to support that spec.

Having a changing timestamp in the resulting files makes no sense, as has been mentioned here. But something I've considered important is still recording the build time, because that tracks information that is otherwise more difficult to get. This information is not really relevant for the generated artifacts, but it helps track events that are initiated externally to the contained build environment: say, data corruption in the filesystem, an accidental file removal, etc. That's why we still record it out-of-band (see the Build-Date field in <https://manpages.debian.org/sid/deb-buildinfo>).
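
To illustrate the clamping idea (this is a sketch using Python's `tarfile`, not how dpkg implements it; the helper names `clamp_mtime` and `make_tar` are made up): on-disk mtimes are left alone, and only the copy written into the archive is capped at the chosen epoch.

```python
import io
import os
import tarfile
import time


def clamp_mtime(epoch):
    """Build a tarfile filter that caps each member's mtime at epoch."""
    def _filter(info):
        # Files older than epoch keep their mtime; newer ones are clamped.
        info.mtime = min(info.mtime, epoch)
        return info
    return _filter


def make_tar(paths, fileobj, epoch=None):
    """Write paths into an uncompressed tar with clamped member mtimes."""
    if epoch is None:
        # Use SOURCE_DATE_EPOCH if set; otherwise fall back to the current time.
        epoch = int(os.environ.get("SOURCE_DATE_EPOCH", time.time()))
    with tarfile.open(fileobj=fileobj, mode="w") as tar:
        for path in sorted(paths):  # fixed member order
            tar.add(path, arcname=os.path.basename(path),
                    filter=clamp_mtime(epoch))


if __name__ == "__main__":
    buf = io.BytesIO()
    make_tar([__file__], buf, epoch=1529000000)
    print(len(buf.getvalue()), "bytes written")
```

Clamping rather than overwriting means files untouched by the build keep their original, older mtimes. A real tool would also have to normalize other metadata such as owner names and IDs, which this sketch leaves as found on disk.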


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds