Old compilers and old bugs
On January 5, Russell King reported on a problem he had been chasing for a long time. Some of his 64-bit Arm systems running 5.4 or later kernels would, on rare occasion, report a checksum failure on the ext4 root filesystem. It could take up to three months of uptime for the problem to manifest itself, making it, as King described it, "unrealistic to bisect". He had, however, found a way to more reliably reproduce the failure, making the task of finding out when the problem was introduced plausible, at least.
Starting with King's findings, a number of developers working in the Arm subsystem looked into the issue; their efforts seemed to point to this commit as the culprit. That change, applied in 2019, relaxed the memory barriers used around I/O accessors, optimizing accesses to I/O memory. Reverting this patch made the problem go away.
Some developers might have applied the revert and called the problem solved, but that is not what happened here. Will Deacon, the author of the patch in question, was convinced of its correctness; if the Arm architecture is behaving as specified, there should be no need for the stronger barriers, so something else was going on. Reverting the patch, in other words, made the issue go away by papering over a real problem somewhere else.
Where might that "somewhere else" be? King suggested that it could be somewhere else in the kernel, in the Arm processor itself, or in the cache-coherent interconnect that ties together processor clusters and memory. He thought that a problem in the hardware was relatively unlikely, and that the bug thus lurked somewhere within the kernel. That, naturally, led to a lot of code examination, especially within the ext4 filesystem.
Two days later, King announced that the problem had been found; it indeed was an issue within the ext4 filesystem, but not of the variety that had been expected. A look at the assembly code generated for ext4_chksum() revealed that the compiler was freeing the function's stack frame prior to the end of the function itself. The last line of the function is:
return *(u32 *)desc.ctx;
Here, desc is a local variable, living on the stack. The compiled function was resetting the stack pointer above this variable immediately before fetching desc.ctx. That led to a window of exactly one instruction where the function was using stack space that had already been freed.
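In assembly terms, the miscompiled epilogue looked something like the following sketch (the registers and offsets here are invented for illustration; they are not the actual compiler output):

    add    sp, sp, #64       // free the stack frame; sp now points above desc
    ldr    w0, [sp, #-16]    // only now is desc.ctx loaded, from below sp
    ret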
This is a compiler bug of the worst type. The miscompiled code will work as expected almost every time; there is, after all, no other code trying to allocate stack space in that one-instruction window. All bets are off, though, if an interrupt arrives exactly between the two instructions; then the stack will be overwritten and the load of desc.ctx will be corrupted, leading to the observed checksum failure. This is something that will almost never happen, but when it does things will go badly wrong.
This miscompilation was done by GCC 4.9.4, which was released in August 2016 (4.9.0, the major release on which it is based, came out in April 2014). The relevant bug, though, was reported in 2014 and fixed in November of that year. That fix was seemingly never backported from the (then) under-development 5.x release to 4.9.x, so the 4.9.4 release did not contain it. Interestingly, the versions of 4.9.4 shipped by distributors like Red Hat, Android, and Linaro all had the fix backported, so the bug only affected developers building with the unpatched upstream releases. It lurked there for years until finally turning up in King's builds.
One outcome from this episode is a clear illustration of the potential downside of supporting old toolchains. A great deal of effort went into tracking down a bug that had, in fact, been fixed six years ago; that would not have been necessary if developers were not still using 4.9.x compilers.
As it happens, GCC 4.9 is the oldest compiler supported by the kernel, but even that requirement is relatively recent. As of 2018, the kernel still claimed (not entirely truthfully) that it could be built with GCC 3.2, which was released in 2002. As a result of discussions held in 2018, the minimum GCC version was moved forward to 4.6; later it became 4.9.
Fixing GCC 4.9 to address this bug is out of the question; the GCC developers have long since moved on from that release. So, at a minimum, the oldest version of the compiler that can be used for the arm64 architecture will have to be moved forward to 5.1. But that immediately led to the question of whether the oldest version for all architectures should be moved forward.
Ted Ts'o was in favor of that change, but he also pointed out that RHEL 7 (and thus CentOS 7) systems are still stuck with GCC 4.8. As Peter Zijlstra noted, though, it is already necessary to install a newer compiler than the distribution provides to build the kernel on those systems. Arnd Bergmann said that the other known users of GCC 4.9 were Android and Debian 8. Android has since switched over to Clang to build its kernels, and Debian 8 went unsupported at the end of June 2020. So it would appear that relatively few users would be inconvenienced by raising the minimum GCC version to 5.1.
On the other hand, there are some advantages to such a move beyond leaving an unpleasant bug behind. Bergmann argued for this change because it would allow compiling the kernel with -std=gnu11, making it possible to rely on bleeding-edge C11 features. Currently, kernel builds use -std=gnu89, based on the rather less shiny C89 standard. Zijlstra and Deacon both added that moving to 5.1 would allow the removal of a number of workarounds for GCC 4.9 problems.
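One concrete illustration (a generic example of the standards difference, not code from the kernel): gnu89 rejects declaring the loop iterator inside the for statement, while gnu11 accepts it.

    /* Compiles with -std=gnu11; -std=gnu89 rejects the declaration of
     * 'i' inside the for statement. */
    int sum(const int *v, int n)
    {
        int total = 0;
        for (int i = 0; i < n; i++)
            total += v[i];
        return total;
    }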
Given all that, it seems unlikely that there will be much opposition to moving the kernel as a whole to the 5.1 minimum version. That said, Linus Torvalds is unconvinced about the value of such a change and may yet need some convincing. Even if the shift to 5.1 does not happen right away, the writing would seem to be on the wall that GCC 4.9 will not be supported indefinitely. GCC 5.1, released in April 2015, is not the newest thing on the planet either, of course. But hopefully it has fewer lurking bugs while simultaneously making some welcome new features available. Supporting old toolchains has its value, but so does occasionally dropping the oldest of them.
Index entries for this article: Kernel/Build system
Posted Jan 11, 2021 22:09 UTC (Mon) by JoeBuck (subscriber, #2330)
But a given vendor might have backported the GCC fix to 4.9. Guess there could be a config option to force it on.
Posted Jan 11, 2021 22:24 UTC (Mon) by zlynx (guest, #2285)
The memory barrier probably masked the problem by changing how the compiler ordered its machine code. That doesn't mean that adding a memory barrier would fix the same bug in other locations.
Posted Jan 11, 2021 22:34 UTC (Mon) by dxin (guest, #136611)
Posted Jan 12, 2021 0:56 UTC (Tue) by ndesaulniers (subscriber, #110768)
Posted Jan 12, 2021 8:12 UTC (Tue) by marcH (subscriber, #57642)
bleeding edge features like.... C99 comments! No?
Posted Jan 12, 2021 10:25 UTC (Tue) by quietbritishjim (subscriber, #114117)
Posted Jan 13, 2021 8:39 UTC (Wed) by error27 (subscriber, #8346)
Posted Jan 12, 2021 10:51 UTC (Tue) by vstinner (subscriber, #42675)
I love a thriller story like this! With the article, the whole issue makes perfect sense, but I love the puzzle that people had to solve to put the pieces together.
I'm always amazed by developers able to spot in the machine/assembly code that the C code is "miscompiled". The compiler is not the usual suspect.
It's also "funny" that it's still possible to use a variable allocated on the stack after the stack pointer is "reset". It's funny since even if the code is "wrong", well, it works in practice "most of the time".
> That led to a window of exactly one instruction where the function was using stack space that had already been freed.
How likely is it to get an interrupt at one precise instruction? "Unlikely" if I understood correctly, but "it happens" because of Murphy's law ;-)
--
Maybe a special kernel debug mode should send signals frequently to make such bugs more likely to show up?
In Python, I wrote functional tests for EINTR handling. The tests use setitimer() to trigger a SIGALRM signal every 100 ms while the test is running; some tests "sleep" for 200 ms and so should get at least one signal.
Posted Jan 12, 2021 15:09 UTC (Tue) by eru (subscriber, #2753)
This type of bug is not that unusual, but normally it is caused by a programmer returning a pointer to a local variable from a function. Depending on what the caller does, the buggy code may "work" for a long time, and then blow up, when some unrelated code changes are made, or the compiler is changed. Actually this bug was more common in the past. Nowadays smarter compilers and code analysis tools can usually warn about it.
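A minimal, hypothetical example of that pattern; current GCC and Clang diagnose it (-Wreturn-local-addr and -Wreturn-stack-address, respectively):

    /* Returns the address of an automatic variable: the storage is
     * released when the function returns, so the caller gets a dangling
     * pointer that may appear to work until the stack is reused. */
    static const char *greeting(void)
    {
        char buf[16] = "hello";
        return buf;
    }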
Posted Jan 12, 2021 15:58 UTC (Tue) by NYKevin (subscriber, #129325)
> It's also "funny" that it's still possible to use a variable allocated on the stack after the stack pointer is "reset". It's funny since even if the code is "wrong", well, it works in practice "most of the time".
For "regular userspace code," this may be entirely legal depending on how big your stack frames are. But the kernel uses -mno-red-zone to forbid it, precisely because it needs to be interrupt-safe.
See also this Linus rant (which is the only reason I know these things): https://lkml.org/lkml/2014/7/24/584
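As a sketch of what that means (illustrative codegen, not a guarantee of what any particular compiler emits): on x86-64, the SysV ABI reserves the 128 bytes below the stack pointer as a "red zone" that a leaf function may use without moving the stack pointer.

    /* A leaf function: a compiler may spill 'tmp' into the red zone,
     * e.g. "movl %edi, -4(%rsp)", without ever adjusting %rsp.  Signal
     * delivery must skip the red zone, so this is safe in userspace;
     * kernel-mode interrupts make no such promise, hence -mno-red-zone
     * for kernel builds. */
    static int square(int x)
    {
        volatile int tmp = x;   /* volatile forces an actual stack slot */
        return tmp * tmp;
    }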
Posted Jan 12, 2021 16:34 UTC (Tue) by eru (subscriber, #2753)
Posted Jan 12, 2021 20:00 UTC (Tue) by NYKevin (subscriber, #129325)
Which is the most popular architecture by far in terms of desktops and servers. Obviously, phones are a different story, but characterizing it as "only" x86_64 sounds a bit odd to me.
> But any C code that depends on the stack below the stack frame of the current function remaining undisturbed is clearly buggy.
The C abstract machine has no notion of "below the stack frame," so this sentence is meaningless.
Under C, a pointer can point to one of five things:
1. A static object.
2. An automatic object.
3. A dynamic object.
4. NULL or one-past-the-end-of-an-array.
5. Anything else.
For cases 1 and 3, the stack frame is obviously irrelevant. Case 2 is only possible while the function is still executing, because when it returns, we immediately transition to case 5 (according to the C abstract machine). And cases 4 and 5 are always illegal to dereference, regardless of whether you have red-zones or not, meaning an optimizing compiler can interpret your code as unreachable and do wacky things even if the arch would have supported it. In fact, it's generally illegal to even *have* a case 5 pointer, because all of the operations you can do to create a case 5 pointer are themselves UB.
Instead, the entire context of this discussion is around *assembler* code, which has (usually) been emitted by a compiler. The compiler is absolutely entitled to use red zoning on archs where it is supported, unless the user instructs it otherwise (with -mno-red-zone). But this has nothing to do with the "bugginess" of the C code it started with. Buggy C code will obviously result in buggy assembler, but there is no reasonable way for C code to cause a red-zoning violation unless it has wandered pretty far into UB territory to begin with. At that point, the C code isn't so much buggy as utterly meaningless.
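A small illustration of the line between cases 4 and 5 (plain standard-C semantics, nothing kernel-specific):

    /* Case 4 vs. case 5. */
    int sum4(void)
    {
        int a[4] = {1, 2, 3, 4};
        int *end = a + 4;                /* case 4: legal to form and compare */
        int total = 0;
        for (int *p = a; p != end; p++)  /* comparing against 'end' is fine */
            total += *p;
        /* *end would be UB; even computing a + 5 (case 5) is already UB. */
        return total;
    }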
Posted Jan 13, 2021 7:15 UTC (Wed) by jem (subscriber, #24231)
One could argue that variables in the "red zone" *are* part of the stack frame of the function. The compiler writer didn't want to waste a clock cycle to decrement the stack pointer, but that is just a technicality. On the other hand, any code that assumes the stack area below the stack pointer is free to use for some other purpose is clearly buggy.
Posted Jan 13, 2021 9:54 UTC (Wed) by Wol (subscriber, #4433)
Are you talking about the code that executes a new function, and hence is assuming that that space is free to use for allocating a new stack frame? Because that's exactly what this bug is ...
Cheers,
Wol
Posted Jan 13, 2021 14:10 UTC (Wed) by jem (subscriber, #24231)
Posted Jan 12, 2021 21:07 UTC (Tue) by cesarb (subscriber, #6266)
> We do *not* follow the x86-64 ABI wrt redzoning, because we *cannot*: interrupts while in kernel mode *will* use the stack without a redzone.
This is a peculiarity of the x86 architecture (which was kept when AMD extended it to 64 bits): to handle an interrupt, the CPU stores the return address and the processor state (flags) in the stack. Other architectures (including IIRC ARM and MIPS) store them in special registers instead, keeping the stack untouched.
Posted Jan 13, 2021 17:27 UTC (Wed) by Wol (subscriber, #4433)
Cheers,
Wol
Posted Jan 13, 2021 19:17 UTC (Wed) by excors (subscriber, #95769)
(The nesting only allows interrupt handlers to be interrupted by a strictly higher priority interrupt, and there's a limited number of priority levels (e.g. I think 8 is common on Cortex-M4), so the nesting can't get arbitrarily deep. It's useful since you can use high-priority interrupts for real-time tasks, with a guarantee the handler will execute within a certain number of cycles and will never be held up by lower-priority interrupt handlers. That wouldn't be possible if the lower-priority handler was required to manually save its state before allowing any nesting.)
Posted Jan 25, 2021 19:07 UTC (Mon) by immibis (subscriber, #105511)
I'm not sure how that works with exceptions. I think ARM has a separate register set for "aborts", so you can nest an abort within an IRQ, but nesting an abort within an abort may cause problems for you - of course, if your abort handler was unable to save the state without causing another abort, you *already* have problems. That's basically a triple fault in software.
Posted Jan 13, 2021 19:10 UTC (Wed) by jem (subscriber, #24231)
Posted Jan 12, 2021 18:56 UTC (Tue) by pbonzini (subscriber, #60935)
By the time the compiler becomes the suspect, spotting the miscompiled code is easy compared to the steps before!
(That said, I know you worked on a ProactorEventLoop that was probably just as nasty as this kind of bug. I found your blog post while trying to figure out some random hangs that only showed up on Windows and only with Python 3.5. Fortunately the 3.5 EOL meant that *my* bug fixed itself just by waiting a few weeks!).
Posted Jan 13, 2021 3:55 UTC (Wed) by plugwash (subscriber, #29694)
Let's try and put an order of magnitude on this by making some crude assumptions:
1. the processor executes on-average one instruction per clock cycle (IIRC modern cores can perform more than one instruction per clock sometimes, but they can also have to wait for memory)
2. the processor runs at about 2GHz (seems reasonable for an arm64 core).
3. there is a timer interrupt at 1000Hz.
Which would give us an assumed hit-rate from timer interrupts alone of 1 in 2 million executions. That sounds frequent enough to cause problems on a frequently executed codepath, but rare enough to be a massive PITA to debug.
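Spelling that out with plugwash's assumed numbers (assumptions, not measurements):

    #include <stdio.h>

    int main(void)
    {
        const double instrs_per_second = 2e9;  /* assumptions 1+2: 2GHz, ~1 instruction/cycle */
        const double timer_hz = 1000.0;        /* assumption 3: 1000Hz timer */

        /* Instructions between timer ticks = the odds that one particular
         * single-instruction window is hit by any given tick. */
        printf("1 in %.0f\n", instrs_per_second / timer_hz);  /* prints: 1 in 2000000 */
        return 0;
    }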
Posted Jan 12, 2021 12:07 UTC (Tue) by ncm (guest, #165)
Few in that world use ARM, yet. But quite a large number of other bugs motivated the later 4.8.x releases, and 4.9. One might expect that there would be a careful catalog of Known Bugs, with means to step around them, and reviews of code to ensure they are not stepped in, but such an expectation would be wholly wrong.
Posted Jan 12, 2021 15:12 UTC (Tue) by eru (subscriber, #2753)
I'm surprised that they use GCC at all, and not COBOL...
Posted Jan 12, 2021 16:16 UTC (Tue) by ncm (guest, #165)
Finance people generally *prefer* things that are out of maintenance, because they experience even fewer incoming changes. Any infrastructural bug that seems like it ought to be physically possible to work around reliably is A-OK, if the alternative would involve changing something.
Posted Jan 12, 2021 17:35 UTC (Tue) by eru (subscriber, #2753)
Posted Jan 12, 2021 19:27 UTC (Tue) by jwarnica (subscriber, #27492)
Red Hat (in RHEL vWhatever) or Debian (in vWhatever), or similar, by definition, contains a cohesive set of obsolete (but internally cross compatible) things. If you want to backport a particular patch to your obsolete world, feel free, but then compile it with the included and matching obsolete compiler.
Posted Jan 12, 2021 21:06 UTC (Tue) by mathstuf (subscriber, #69389)
While this is the extreme end of the spectrum, there is a line that needs to be drawn. I don't know how many times bug reports have come down to things like "I have some random path in LD_LIBRARY_PATH because of some other problem I hit months ago" or (just this week) "the build picked up a Python2 tool in a Python3 PYTHONPATH-using environment". If someone who doesn't know how to gather the required information (or even just read error messages) has their problem solved by a clean build or other from-scratch setup, that really is the best solution. I don't even know how to document "make sure that tool X that is found is using Python3, not Python2" in a way that someone who couldn't determine from the error message that this is what's happening would be able to act upon effectively.
> "Someone wants to use an obsolete dependency because their world is built on mostly obsolete things? OK, let them!"
Again, there's a limit sometimes as to how varied an environment a single project can reliably support. At some point when someone is asking for a 5 year old deployment of tool X to support other tools that came out yesterday and finding issues, well, what is one to do?
Posted Jan 12, 2021 20:31 UTC (Tue) by atai (subscriber, #10977)
Posted Jan 12, 2021 21:09 UTC (Tue) by Wol (subscriber, #4433)
The problem isn't the code, it's people's inertia.
Cheers,
Wol
Posted Jan 12, 2021 21:10 UTC (Tue) by mathstuf (subscriber, #69389)
Posted Jan 12, 2021 23:16 UTC (Tue) by nivedita76 (subscriber, #121790)
Posted Jan 12, 2021 18:31 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
Posted Jan 22, 2021 0:25 UTC (Fri) by nix (subscriber, #2304)
(There is a nice historical timeline at https://gcc.gnu.org/releases.html so you don't have to dig around on ancient ftp servers because all the ancient historical digging has already been done for us. It's at times like this that I'm reminded of how old this stuff is, how it was made by a generation that is now fast approaching retirement: I'm not exactly young any more, old enough to have had no access to computers at all for the first few years of my life -- but when GCC was first released, I was only ten years old, and it had been under development for some years by then. Free software is older than the children of the microcomputer age. For that matter even the *text editors* that a lot of us use are older than we are: both vi and Emacs date back in some form to roughly 1976.)
Posted Jan 22, 2021 0:43 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
Posted Jan 12, 2021 18:47 UTC (Tue) by alonz (subscriber, #815)
Posted Jan 12, 2021 22:59 UTC (Tue) by ndesaulniers (subscriber, #110768)
Posted Jan 21, 2021 11:59 UTC (Thu) by Hi-Angel (guest, #110915)
Posted Jan 13, 2021 0:04 UTC (Wed) by dvdeug (guest, #10998)
Posted Jan 24, 2021 0:17 UTC (Sun) by david.a.wheeler (subscriber, #72896)
The Ada standard includes "pragma reviewable", a compiler mechanism specifically to enable reviews of the generated code:
https://www.adaic.org/resources/add_content/standards/12r...
It's meant for developing high-assurance software (software has to be pretty critical to justify this level of review, but if it's really critical, this does help counter these kinds of errors).