Merkle trees and build systems
Posted May 29, 2020 10:34 UTC (Fri) by drothlis (guest, #89727)
In reply to: Merkle trees and build systems by dezgeg
Parent article: Merkle trees and build systems
I can't speak about Nix with any authority because I haven't used it, but here are some thoughts:
Nix's Merkle tree structure, as described by dezgeg's comment, is capturing the *dependencies* of a package, rather than the *contents* of a single package.
In Nix, as far as I can tell, there's no sharing/deduplication of individual files across different packages or across different versions of the same package. Section 7.5 of the Nix PhD thesis (which civodul provided in another comment) discusses this problem and offers some workarounds for making deployments more network-efficient, such as calculating binary diffs, but the problem simply doesn't exist if you store the contents as Merkle trees.
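To make the deduplication point concrete, here is a minimal sketch (not Nix's or OSTree's actual format; all names and the listing format are invented) of a content-addressed store where each file is stored once under the hash of its contents, and a directory is just a listing of (name, hash) pairs that is itself stored by hash:

```python
import hashlib

store = {}  # hash -> bytes: the content-addressed object store

def put(data):
    digest = hashlib.sha256(data).hexdigest()
    store[digest] = data  # storing identical content twice is a no-op
    return digest

def put_tree(tree):
    """tree maps file names to bytes, or to nested dicts (directories)."""
    entries = []
    for name in sorted(tree):
        node = tree[name]
        h = put_tree(node) if isinstance(node, dict) else put(node)
        entries.append(f"{name} {h}")
    return put("\n".join(entries).encode())

# Two package versions that share most files:
v1 = put_tree({"bin": {"tool": b"ELF..."}, "README": b"docs v1"})
v2 = put_tree({"bin": {"tool": b"ELF..."}, "README": b"docs v2"})
```

Version 1 creates four objects; version 2 adds only two (the changed README and the new root listing), because the unchanged file and the "bin" directory hash to the same objects and are reused.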
It also means that you can't share/reuse any work done by intermediate build steps *within* a package -- with Nix the granularity is at the level of each package.
According to the Build Systems à la Carte paper, Nix resolves transitive dependencies, so it only stores the hashes of the terminal inputs, ignoring intermediate dependencies (search for "Deep Constructive Traces" in the paper). For "cloud" build systems this means you don't have to wait until intermediate targets are built before deciding whether you need to build the final target (because it may already be in the build farm's cache). The downside is that you can't have early cutoff: say you've added a comment to a source file; if the compiled ".o" is unchanged, early cutoff means you can stop there, because the final build artifact is going to be the same.
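A toy illustration of early cutoff (everything here is hypothetical; the "compiler" just strips comment lines, standing in for a real compiler that ignores comments):

```python
import hashlib

def h(data):
    return hashlib.sha256(data).hexdigest()

def compile_(source):
    # Stand-in "compiler": drops comment lines, so a comment-only
    # edit produces an identical object file.
    return b"\n".join(l for l in source.splitlines()
                      if not l.startswith(b"#"))

def build(source, last_obj_hash):
    obj = compile_(source)
    if h(obj) == last_obj_hash:
        # Early cutoff: the intermediate output is unchanged, so
        # downstream steps (linking, etc.) can be skipped.
        return last_obj_hash, "early cutoff: skipped link step"
    return h(obj), "linked final artifact"

obj_hash, action = build(b"code\n# v1 comment", None)
_, action2 = build(b"code\n# reworded comment", obj_hash)
```

The first build links; the second stops at the ".o" comparison. A system that only hashes terminal inputs (deep constructive traces) would see a changed source hash and miss this shortcut.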
Chapter 6 of the Nix thesis talks about making the build outputs "content addressable" (where the checksum describes the build target's contents instead of the build command + dependencies), but that's described as experimental, and I don't know if it was ever implemented in Nix. Even if it was, the design described in the thesis doesn't use a Merkle tree: it serialises the entire directory, including all its files' contents (like tar), and then calculates a checksum of the result.
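A sketch of that flat, tar-like scheme for contrast (the serialisation format below is invented for illustration; the thesis defines its own format): the whole tree becomes one byte stream with a single checksum, so identical files inside two different trees are never addressed or shared individually.

```python
import hashlib

def serialise(tree):
    """Deterministic, depth-first dump of a {name: bytes|dict} tree."""
    out = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            out.append(b"(dir " + name.encode() + b" " + serialise(node) + b")")
        else:
            out.append(b"(file " + name.encode() + b" " + node + b")")
    return b" ".join(out)

def content_hash(tree):
    # One checksum over the whole stream: no per-file objects exist,
    # so there is nothing below the top-level hash to deduplicate.
    return hashlib.sha256(serialise(tree)).hexdigest()

v1 = content_hash({"bin": {"tool": b"ELF..."}, "README": b"docs v1"})
v2 = content_hash({"bin": {"tool": b"ELF..."}, "README": b"docs v2"})
```

Changing one file changes the single hash, and the unchanged files contribute nothing reusable, which is exactly the difference from a Merkle-tree store.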
P.S. I hope it doesn't sound like I'm dissing Nix. It has many advantages, not least of which is that it's an open-source tool that actually exists and that anyone can use. My article is about sharing a new technique/idea.
---
Another build system I wanted to mention is BuildStream -- but again, I don't know enough about it, and I ran out of word count and time. I'm unsure of the relationship between BuildStream, BuildGrid, and BuildBox, and they have gone through several architectural changes. These projects work together to provide "cloud" capabilities: farming work out to remote build servers, and providing a cloud-based cache of build artifacts. As far as I can tell they have something like this article's "ostree_mod" that works remotely. They used to use OSTree, but they have since developed their own buildbox-casd component. Their reasons for migrating from OSTree, based on a conversation at the London Build Meetup in October 2019:
- They want it to interoperate with HTTP-based CAS ("content addressable store") implementations like Bazel's (a simple HTTP PUT/GET protocol that can be backed by S3, whatever Google's equivalent is, or any plain old HTTP server).
- There's no ostree push. You have to use ostree pull via SSH reverse proxying, etc.
- They've dropped garbage collection in favour of LRU expiry. OSTree guarantees that if you have a ref, you have the full contents of that tree, so it needs GC to expire objects without breaking that guarantee. Instead, whenever they pull a tree they touch all the files in it, and they expire the least-recently-used objects. They were seeing GC pauses of 24 hours with OSTree.
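A toy sketch of that last scheme (class and method names are made up, and a counter stands in for wall-clock time; buildbox-casd's real implementation will differ): pulling a tree touches every object it references, and expiry simply drops the least-recently-used objects, with no stop-the-world GC pass.

```python
class LruCas:
    """A content-addressed store with LRU expiry instead of GC."""

    def __init__(self):
        self.clock = 0          # logical clock: deterministic "time"
        self.objects = {}       # digest -> (data, last_used)

    def _tick(self):
        self.clock += 1
        return self.clock

    def put(self, digest, data):
        self.objects[digest] = (data, self._tick())

    def pull(self, digests):
        # Touch every object referenced by the tree being pulled,
        # marking the whole tree as recently used.
        now = self._tick()
        for d in digests:
            data, _ = self.objects[d]
            self.objects[d] = (data, now)
        return [self.objects[d][0] for d in digests]

    def expire(self, keep):
        # Drop least-recently-used objects until only `keep` remain.
        by_age = sorted(self.objects, key=lambda d: self.objects[d][1])
        for d in by_age[:max(0, len(self.objects) - keep)]:
            del self.objects[d]

cas = LruCas()
cas.put("a", b"A"); cas.put("b", b"B"); cas.put("c", b"C")
cas.pull(["a"])       # "a" becomes the most recently used
cas.expire(keep=2)    # drops "b", the least recently used
```

The trade-off versus OSTree's model is that a ref no longer guarantees its tree is complete; a pull may have to re-fetch expired objects.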
