LWN: Comments on "Pulling Linux up by its bootstraps" https://lwn.net/Articles/983340/ This is a special feed containing comments posted to the individual LWN article titled "Pulling Linux up by its bootstraps". en-us Tue, 07 Oct 2025 14:57:35 +0000 lwn@lwn.net Practicalities https://lwn.net/Articles/986955/ https://lwn.net/Articles/986955/ TRS-80 <div class="FormattedComment"> AMD is saying they'll move to a new open source platform in 2026:<br> <p> <a rel="nofollow" href="https://community.amd.com/t5/business/empowering-the-industry-with-open-system-firmware-amd-opensil/ba-p/599644">https://community.amd.com/t5/business/empowering-the-indu...</a><br> <a rel="nofollow" href="https://www.anandtech.com/show/18853/amd-opensil-planned-to-replace-agesa-firmware-in-client-and-server-in-2026">https://www.anandtech.com/show/18853/amd-opensil-planned-...</a><br> <p> </div> Fri, 23 Aug 2024 08:20:06 +0000 Two compilers before MES https://lwn.net/Articles/985567/ https://lwn.net/Articles/985567/ FransFaase <div class="FormattedComment"> The article does not mention that stage0 contains two compilers for minimal subsets of C before MES is compiled. The first is written in assembly and is thus CPU-specific. The second, which also contains a driver program that includes a pre-processor, is written in the C-like language of the first.<br> <p> I personally feel that MES is the odd fellow in the whole chain of compilers and I have been looking into whether it would be possible to reduce the number of steps. It turned out that the M2 compilers are kind of buggy. 
So, I have been thinking about a stack-based language with which it would be possible to write a C compiler that can be used to compile TCC.<br> <p> See <a rel="nofollow" href="https://www.iwriteiam.nl/Software.html">https://www.iwriteiam.nl/Software.html</a> for what I have been working on, including some attempts to generate a webpage with all the sources that are used to compile stage0.<br> </div> Wed, 14 Aug 2024 12:53:01 +0000 Practicalities https://lwn.net/Articles/984743/ https://lwn.net/Articles/984743/ paulj <div class="FormattedComment"> You don't have to specifically recognise a compiler or logind. You just have to know how to inject code into some binary that your code handles and transforms. Given nearly all binaries are in some well-defined executable format with well-defined entry-points, this is straightforward. <br> <p> Thompson's paper is explicit that his point relates to anything that handles code (interprets or transforms). <br> </div> Wed, 07 Aug 2024 21:22:34 +0000 Very cool https://lwn.net/Articles/984736/ https://lwn.net/Articles/984736/ naesten <blockquote>Very cool, and amazing progress towards the day when we have people doing reproducible builds from hardware all the way up (note: This requires the home fab projects to make progress too).</blockquote> Even without homebrew hardware, it should still count for <em>something</em> if we get bit-for-bit identical results on a sufficiently wide array of hardware/firmware. Requirements I can think of: <ul> <li>Use a mix of CPU vendors. (Hopefully, AMD and Intel aren't colluding.)</li> <li>Use motherboards with firmware of different lineage. 
To avoid any chance that they're all using edk2 to implement UEFI, include plenty of legacy BIOS boards.</li> <li>Use different brands of disk drive and video adapter.</li> <li>Include systems of quite different age, to rule out short-term conspiracies.</li> </ul> Another thing that makes me a bit nervous is the fixed sequence of old GCC versions; it would be more confidence-inspiring if several different paths through GCC history were verified to produce the same result. (It could be worse: some compilers only support building using a specific earlier version, possibly checked into the source repository.) Wed, 07 Aug 2024 20:45:56 +0000 Very cool https://lwn.net/Articles/984576/ https://lwn.net/Articles/984576/ chris_se <div class="FormattedComment"> <span class="QuotedText">&gt; I would summarize this interpretation of Thompson as "supply chain attacks don't have to be visible in source code to be effective."</span><br> <p> Regardless of whether Thompson himself meant it like that or not, I really like your summary. It's catchy enough that one could make a t-shirt out of it. :-)<br> </div> Tue, 06 Aug 2024 08:54:11 +0000 Very cool https://lwn.net/Articles/984569/ https://lwn.net/Articles/984569/ NYKevin <div class="FormattedComment"> As I explained upthread,[1] the original attack is, was, and has always been a fantasy, and so it is logical to conclude that Thompson was not speaking literally. I think it is plausible to read Thompson as anticipating the general(!) category of attack which includes the xz backdoor. 
I would summarize this interpretation of Thompson as "supply chain attacks don't have to be visible in source code to be effective."<br> <p> [1]: <a href="https://lwn.net/Articles/984430/">https://lwn.net/Articles/984430/</a><br> </div> Tue, 06 Aug 2024 03:02:01 +0000 Practicalities https://lwn.net/Articles/984430/ https://lwn.net/Articles/984430/ NYKevin <div class="FormattedComment"> <span class="QuotedText">&gt; Software supply chain attacks are absolutely a concern but they’re going to be much more boring than a super-virus with near oracular abilities to detect and compromise every version of every compiler in use. I think the risk of anything persisting through existing build pipelines for long is remote.</span><br> <p> It should also be emphasized that, due to Rice's theorem, the original hack as described by Thompson is literally impossible. In the general case, you cannot recognize a C compiler (or logind etc.) by looking at its source. You can probably recognize a lightly-modified GCC or Clang by looking at its source, but there will always be some degree of modification where that no longer works (trivially, because you could delete the whole thing and write a new compiler from scratch).<br> <p> Real supply chain attacks are going to look like the xz backdoor, not like the Thompson trusting-trust attack. But I would go further than that: Interpreting Thompson's paper literally is sort of missing the point. Thompson was not *just* warning about this highly convoluted method of compromising logind through the compiler, or even about compromising binaries in general through the compiler. Thompson was warning about the broader and more general principle that the source code is not the final authority for what the machine actually does.<br> <p> xz is a prime example of this principle - if you looked at the OpenSSH source code, you would not see anything suspicious, because the attack wasn't in the OpenSSH source code. 
For that matter, neither was it present in the xzutils source code (as displayed on GitHub). It was present, in part, in one of the xzutils opaque binary test files, which was named in such a way as to suggest that it was a corrupt file for xz to reject as invalid. Another part of the attack was present in a post-configure script. Of course the average developer will not even look at a configure script or makefile unless there is no alternative, but it doesn't matter, because this post-configure script wasn't in Git at all - it was only in the release tarballs. There is a great deal of further complexity after that, but you get the idea. The point is, "read the source code," as an auditing strategy, would never have found this attack, so reading the source code is (by itself) inadequate as a means of auditing.<br> </div> Fri, 02 Aug 2024 22:47:09 +0000 Very cool https://lwn.net/Articles/984347/ https://lwn.net/Articles/984347/ chris_se <div class="FormattedComment"> <span class="QuotedText">&gt; What would make Thompson’s point is a working demonstration of a backdoor that’s durable to even basic countermeasures, or one found in the wild. Yet more science fictions about what an impossibly perfect program could allegedly do aren’t going to cut it.</span><br> <p> I think Thompson's argument is correct in a philosophical sense, but not in a practical sense. I agree with you in that I don't believe that such a super-backdoor exists.<br> <p> But other supply chain attacks are real (as we've seen with e.g. the XZ backdoor). And I applaud any work that tries to make it harder and harder for such an attack to occur undetected. Methods that can detect vastly more sophisticated (and possibly unrealistic) attacks will also help detect the more realistic ones.<br> <p> I also think that most developers aren't thinking enough about supply chain attacks in the modern world. 
So I'm very excited about projects that push these types of ideas more into the current zeitgeist.<br> </div> Fri, 02 Aug 2024 08:43:23 +0000 Proof of DDC https://lwn.net/Articles/984323/ https://lwn.net/Articles/984323/ crhodes <div class="FormattedComment"> <span class="QuotedText">&gt; To our knowledge that is the first non-academic proof of DDC.</span><br> <p> The SBCL Common Lisp system is built, in theory, by arbitrary Common Lisp compilers, and is (again in theory) written in portable Common Lisp. In 2014 this was demonstrated by having the built system be bitwise-identical independent of which compiler was used to build it: <a href="http://christophe.rhodes.io/notes/blog/posts/2014/reproducible_builds_-_a_month_ahead_of_schedule/">http://christophe.rhodes.io/notes/blog/posts/2014/reprodu...</a><br> </div> Thu, 01 Aug 2024 20:00:04 +0000 Very cool https://lwn.net/Articles/984320/ https://lwn.net/Articles/984320/ Phantom_Hoover <div class="FormattedComment"> What would make Thompson’s point is a working demonstration of a backdoor that’s durable to even basic countermeasures, or one found in the wild. Yet more science fictions about what an impossibly perfect program could allegedly do aren’t going to cut it.<br> </div> Thu, 01 Aug 2024 19:26:57 +0000 Practicalities https://lwn.net/Articles/984308/ https://lwn.net/Articles/984308/ Phantom_Hoover <div class="FormattedComment"> Yes, this is why I’ve always found claims that Thompson hacks are a practical possibility to be rather overblown. Software supply chain attacks are absolutely a concern but they’re going to be much more boring than a super-virus with near oracular abilities to detect and compromise every version of every compiler in use. 
I think the risk of anything persisting through existing build pipelines for long is remote.<br> </div> Thu, 01 Aug 2024 18:22:39 +0000 Practicalities https://lwn.net/Articles/984290/ https://lwn.net/Articles/984290/ Cyberax <div class="FormattedComment"> Realistically, the binaries are so small that a non-trivial backdoor that would recognize the compiler source code and infect it is unlikely.<br> </div> Thu, 01 Aug 2024 17:29:18 +0000 BIOS and Coreboot https://lwn.net/Articles/984281/ https://lwn.net/Articles/984281/ farnz <p>Note that Coreboot has the <a href="https://www.coreboot.org/Payloads#SeaBIOS">SeaBIOS</a> payload, which provides the "traditional" BIOS interface using Coreboot services. This means trusting Coreboot and SeaBIOS, but it reduces the amount of closed source in your trusted base. Thu, 01 Aug 2024 16:49:57 +0000 Very cool https://lwn.net/Articles/984280/ https://lwn.net/Articles/984280/ paulj <div class="FormattedComment"> Very cool, and amazing progress towards the day when we have people doing reproducible builds from hardware all the way up (note: This requires the home fab projects to make progress too).<br> <p> This doesn't lay Thompson's worries to rest though. The opposite in fact. It _makes his point_. And even then, this is something that most people are not going to be able to do (by skill, or practicalities such as time). Also, I note the reliance on very old software - which is itself a threat, given what we know about the shelf-life of cryptographic hashes, e.g. see observations of Valerie Aurora. I wrote a bit more on this and diverse double compiling here: <a href="https://paul.jakma.org/2010/09/20/critique-of-diverse-double-compiling/">https://paul.jakma.org/2010/09/20/critique-of-diverse-dou...</a><br> </div> Thu, 01 Aug 2024 16:47:14 +0000 Why not Forth? 
https://lwn.net/Articles/984277/ https://lwn.net/Articles/984277/ salewski <div class="FormattedComment"> I, too, am interested in hearing whether a tiny Forth -- or more broadly, whatever else -- was considered. E.g. sectorlisp?<br> <p> <a href="https://justine.lol/sectorlisp2/">https://justine.lol/sectorlisp2/</a><br> <a href="https://github.com/jart/sectorlisp">https://github.com/jart/sectorlisp</a><br> </div> Thu, 01 Aug 2024 16:33:18 +0000 Practicalities https://lwn.net/Articles/984276/ https://lwn.net/Articles/984276/ mjg59 <div class="FormattedComment"> Coreboot only runs on modern x86 using proprietary blobs (FSP for Intel, AGESA for AMD); you need to go back to roughly decade-old hardware to be blobless. Thankfully that's probably still capable enough for (somewhat slower) bootstrapping. <br> </div> Thu, 01 Aug 2024 16:26:02 +0000 Why not Forth? https://lwn.net/Articles/984273/ https://lwn.net/Articles/984273/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; Personally, I found assembly and C simpler and more fun to work in. We only work on what we feel is fun and worth doing.</span><br> <p> If you're not paid, why do anything else? :-)<br> <p> (That said, I think you can put assembly directly into Forth, no problem :-) (The problem is getting your head round RPN and the fact that everything is back to front :-)<br> <p> Cheers,<br> Wol<br> </div> Thu, 01 Aug 2024 16:12:50 +0000 Why not Forth? https://lwn.net/Articles/984253/ https://lwn.net/Articles/984253/ oriansj <div class="FormattedComment"> We did bootstrap several FORTHS.<br> <p> Only Virgil Dupras ran with it to create duskOS and collapseOS.<br> <p> They just don't provide a path to Linux and GCC yet.<br> <p> But they are quite excellent FORTH bootstraps and for anyone interested in FORTH bootstrapping I do recommend them heavily.<br> <p> Personally, I found assembly and C simpler and more fun to work in. 
We only work on what we feel is fun and worth doing.<br> </div> Thu, 01 Aug 2024 15:02:24 +0000 Practicalities https://lwn.net/Articles/984243/ https://lwn.net/Articles/984243/ anton I am sure that you are aware of coreboot, which, as far as I understand it, is a free-software replacement for UEFI/BIOS. Of course that has to be built in some trusted way, too, and AFAIK it only runs on some hardware, but that points to two ways of getting rid of UEFI/BIOS: <ol> <li> Use the same techniques that coreboot uses to let your bootstrapping system run on bare coreboot-capable hardware. <li> Do the same stuff for coreboot that you did for Linux (may be made easier by coreboot being derived from some Linux kernel AFAIK), but for full trust you will probably want to do that starting with way 1. </ol> Even if few people have coreboot-capable hardware, those can check that the Linux kernels agree with those built on a UEFI system (but of course that does not protect against UEFI doing something evil when the kernel image is booted). <p>Independent of that, a very cool project! Thu, 01 Aug 2024 14:42:00 +0000 Practicalities https://lwn.net/Articles/984222/ https://lwn.net/Articles/984222/ Phantom_Hoover <div class="FormattedComment"> I guess my question here is whether there’s any practical value to building this work off ‘bare metal’ (which is really a swamp full of mysterious proprietary code). The DDC method is an abstract property; could you not implement it just as well building from a simple virtual machine spec? Given that I’m sure the vast majority of runs of this code will take place in qemu that’s arguably already the case, so I wonder if you’d have a more robust system for DDC verifications if the initial base execution model was easier to independently implement than x86.<br> </div> Thu, 01 Aug 2024 14:17:51 +0000 Why not Forth? 
https://lwn.net/Articles/984224/ https://lwn.net/Articles/984224/ Wol <div class="FormattedComment"> I just downloaded Going Forth (I used to have a paper copy years ago), and a quick skim gave me the impression a minimal Forth engine fits inside 512 bytes with ease. Everything else is just pulling in text files to build a fully capable Forth environment, which should be able to build a C compiler pretty quickly - even a somewhat complex C compiler.<br> <p> Cheers,<br> Wol<br> </div> Thu, 01 Aug 2024 14:15:58 +0000 Practicalities https://lwn.net/Articles/984215/ https://lwn.net/Articles/984215/ somlo <div class="FormattedComment"> <span class="QuotedText">&gt; Not much prevents this from being ported to arm64 or RISC-V and being run on CPU with no supervisor mode</span><br> <p> You might find this interesting: <a href="https://archive.fosdem.org/2023/schedule/event/rv_selfhosting_all_the_way_down/">https://archive.fosdem.org/2023/schedule/event/rv_selfhos...</a><br> (disclaimer: I'm the author)<br> <p> It's a (functional, albeit slow) proof of concept for equating the trustability of a running computer to that of a bounded, finite set of software, hardware, and toolchain sources.<br> <p> It boots Fedora (so no reason not to keep supervisor mode :) ) and can run yosys/nextpnr to build the bitstream for its own underlying FPGA.<br> <p> There's an argument that using FPGAs trades performance in exchange for removing the ASIC foundry's ability to predict where on the die it could insert a silicon-based backdoor...<br> </div> Thu, 01 Aug 2024 13:39:34 +0000 Practicalities https://lwn.net/Articles/984210/ https://lwn.net/Articles/984210/ ecashin <div class="FormattedComment"> It's technically correct that no layer of software can be expected to be 100% secure 100% of the time, and that secure hardware and firmware is hard to get right, but that's not a good reason to abandon security goals for any specific layer. 
There's a lot of software that depends on a secure Linux kernel, and getting that right makes the job of attackers harder.<br> <p> A cynical way of looking at the situation is this: Attackers going for money will target people that have thrown their hands up and left all the vulnerabilities in place, because it's easier to attack those systems. Attackers with specific political goals will be more persistent, but the more difficult their job, the more costly their operations, and the more likely they'll be discovered or will fail.<br> </div> Thu, 01 Aug 2024 13:06:29 +0000 Practicalities https://lwn.net/Articles/984170/ https://lwn.net/Articles/984170/ fosslinux <div class="FormattedComment"> (hey, an author of live-bootstrap here!)<br> <p> I totally agree that BIOS + UEFI is a gigantic mess that I think most in the bootstrappable community would consider very difficult to trust. The blocker here is that live-bootstrap currently only supports x86, which obviously has a hard BIOS/UEFI dependency. We have people working on riscv/arm support (which is an effective prereq to running this trustable firmware/hardware). This is obviously a long-term ideal but there is a lot of work to get there.<br> <p> Another thought, on "practical value". We can still do DDC (Diverse Double Compiling) style builds of live-bootstrap on a wide variety of different x86 hardware and hopefully see that they all match checksums at the end (meaning they are either all subverted in the same way or not subverted). Of course, in the long term I hope we can have provable trust.<br> <p> Side note: I'm not totally convinced that BIOS/UEFI trust is egregiously worse than trusting higher level software supply chains; working on this has been a real eye-opener on how fragile and untrustable those software supply chains are. 
But it is surely not ideal.<br> </div> Thu, 01 Aug 2024 12:59:36 +0000 Practicalities https://lwn.net/Articles/984169/ https://lwn.net/Articles/984169/ dottedmag <div class="FormattedComment"> Not much prevents this from being ported to arm64 or RISC-V and being run on a CPU with no supervisor mode, except the sheer amount of work. Then one has to trust silicon, of course.<br> </div> Thu, 01 Aug 2024 12:25:02 +0000 Practicalities https://lwn.net/Articles/984167/ https://lwn.net/Articles/984167/ Phantom_Hoover <div class="FormattedComment"> On the one hand, this is extremely cool, and I love it. But the way it all rests on BIOS, which is a bunch of opaque firmware code and also is an obsolete compatibility mode for vastly more complex modern UEFI firmware, does make it feel somewhat built on sand, and it makes me seriously question any claimed practical value of this work. While it’s eminently possible to put a Thompson backdoor in any given toolchain, the ‘hermetically sealed’ version that can infect every practical route to bootstrapping a clean toolchain, *and* every auditing tool that might pick up the extremely pervasive backdoors required to do so, seems very implausible to me. I think practically we can already have a lot more confidence in the integrity of gcc and linux binaries than in firmware and hardware, so tightening up the former even more isn’t winning any more real security.<br> </div> Thu, 01 Aug 2024 12:01:31 +0000 Why not Forth? https://lwn.net/Articles/984164/ https://lwn.net/Articles/984164/ dottedmag <div class="FormattedComment"> Where is the discussion taking place usually?<br> <p> Some ideas to bounce that _could_ cut down on the amount of bootstrap work:<br> <p> - Throw away problematic configuration/build systems, especially for old fixed versions of software. Their complexity comes from their portability, and here the target is pretty much nailed down. 
A particular approach that worked well for me to trim down compilation dependencies is to run a configuration script, run the compilation, record all the compilation steps and make a shell file to play them back. This approach has a benefit of having zero logic in the resulting build script, and no maintenance burden for fixed software versions. Another benefit for C and especially C++ software is that a ton of separate compilation commands may be merged into one, and that improves compilation speed.<br> <p> - Do the "good enough" implementation of various tools in assembly/whatever language is able to issue syscalls to short-circuit their dependencies.<br> </div> Thu, 01 Aug 2024 10:16:38 +0000 Proof of DDC https://lwn.net/Articles/984151/ https://lwn.net/Articles/984151/ rahulsundaram <div class="FormattedComment"> <span class="QuotedText">&gt;Apologies, I'm trying to figure out what DDC stands for here, is it this?</span><br> <span class="QuotedText">&gt;* [Diverse Double Compiling](<a href="https://dwheeler.com/trusting-trust/dissertation/html/whe">https://dwheeler.com/trusting-trust/dissertation/html/whe</a>...)</span><br> <p> Yep, that's linked from the news post referenced.<br> </div> Wed, 31 Jul 2024 23:49:20 +0000 Proof of DDC https://lwn.net/Articles/984150/ https://lwn.net/Articles/984150/ ms-tg <div class="FormattedComment"> Apologies, I'm trying to figure out what DDC stands for here, is it this?<br> * [Diverse Double Compiling](<a href="https://dwheeler.com/trusting-trust/dissertation/html/wheeler-trusting-trust-ddc.html">https://dwheeler.com/trusting-trust/dissertation/html/whe...</a>)<br> </div> Wed, 31 Jul 2024 23:16:02 +0000 Proof of DDC https://lwn.net/Articles/984149/ https://lwn.net/Articles/984149/ Foxboron <div class="FormattedComment"> I think it's worth pointing out that the Reproducible Builds project / Bootstrappable Builds project did manage to reproduce the Mes C compiler between 3 different Linux distros and arrived at the same 
checksum.<br> <p> To our knowledge that is the first non-academic proof of DDC.<br> <p> https://reproducible-builds.org/news/2019/12/21/reproducible-bootstrap-of-mes-c-compiler/<br> </div> Wed, 31 Jul 2024 22:49:12 +0000 Why not Forth? https://lwn.net/Articles/984141/ https://lwn.net/Articles/984141/ daroc <p> There are quite small Forths available — I've come across <a href="https://github.com/cesarblum/sectorforth">sectorforth</a>, which is less than 512 bytes, but I'm sure there are many others. But I admit it didn't occur to me to ask that question when putting the article together. I briefly corresponded with one of the maintainers, so I'll pass on the question and see if they're willing to provide an answer. </p> Wed, 31 Jul 2024 19:59:52 +0000 Why not Forth? https://lwn.net/Articles/984139/ https://lwn.net/Articles/984139/ Wol <div class="FormattedComment"> I'm surprised they didn't start with Forth! How big is a minimal Forth engine that can start pulling in and compiling source? A few KB (if that) of assembler, and you can build a big system pretty quick!<br> <p> I'm not sure how big the ROM in my Jupiter Ace was, but it wasn't much ...<br> <p> Cheers,<br> Wol<br> </div> Wed, 31 Jul 2024 19:29:36 +0000