LWN: Comments on "Shrinking the kernel with link-time garbage collection" https://lwn.net/Articles/741494/ This is a special feed containing comments posted to the individual LWN article titled "Shrinking the kernel with link-time garbage collection". en-us Sun, 07 Sep 2025 09:40:07 +0000 Sun, 07 Sep 2025 09:40:07 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/847713/ https://lwn.net/Articles/847713/ maskray <div class="FormattedComment"> I managed to convince binutils that generic SHF_LINK_ORDER (not just arm) is useful; in particular, __patchable_function_entries needs it. H.J. Lu kindly implemented it and another nice syntax `,unique` in binutils 2.35.<br> <p> SHF_LINK_ORDER unfortunately has some pitfalls. I have a summary in <a href="https://maskray.me/blog/2021-01-31-metadata-sections-comdat-and-shf-link-order">https://maskray.me/blog/2021-01-31-metadata-sections-comd...</a><br> </div> Mon, 01 Mar 2021 05:01:45 +0000 The asteroid belt https://lwn.net/Articles/743420/ https://lwn.net/Articles/743420/ fest3er <div class="FormattedComment"> (I like to pick nits....)<br> <p> I trust you meant Jupiter....<br> </div> Sun, 07 Jan 2018 04:43:00 +0000 Shrinking the kernel with intelligent config https://lwn.net/Articles/743419/ https://lwn.net/Articles/743419/ fest3er <div class="FormattedComment"> Any reason the output of 'lshw' can't be tweaked and combined with a file that lists 'hardware to be added' and 'features to be included' to provide input to a kernel configurator? You don't so much need to know how the hardware is laid out as you need to know which hardware is present. 
The configurator should then be able to enable just those bits of kit and features and dependencies....<br> </div> Sun, 07 Jan 2018 04:39:10 +0000 Shrinking the kernel with intelligent config https://lwn.net/Articles/743176/ https://lwn.net/Articles/743176/ Wol <div class="FormattedComment"> I've got to have a running kernel first.<br> <p> Plus I think there's other stuff wrong with it from my view.<br> <p> Plus I think it defines things as modules, not loaded in.<br> <p> Plus what happens if the plug-in hardware is still awaiting delivery.<br> <p> Localmodconfig is a good idea, don't get me wrong. But it doesn't satisfy all use cases, and building a kernel specifically configured to a known hardware set is one of them.<br> <p> (Don't forget I run gentoo :-) so building targeted kernels is normal. And I've been floating around the linux scene since kernel 1.3 ... :-)<br> <p> Cheers,<br> Wol<br> </div> Fri, 05 Jan 2018 12:01:01 +0000 Shrinking the kernel with intelligent config https://lwn.net/Articles/743127/ https://lwn.net/Articles/743127/ HelloWorld <div class="FormattedComment"> What's wrong with `make localmodconfig`?<br> </div> Fri, 05 Jan 2018 00:40:10 +0000 Shrinking the kernel with intelligent config https://lwn.net/Articles/743067/ https://lwn.net/Articles/743067/ gregkh <div class="FormattedComment"> <font class="QuotedText">&gt; From what I've picked up no-one's even considering this, but what I'd like to see is the kernel config system made much simpler.</font><br> <p> This comes up all the time, everyone wants it done, no one has actually taken the time to do it :(<br> <p> </div> Thu, 04 Jan 2018 21:31:48 +0000 The asteroid belt https://lwn.net/Articles/743055/ https://lwn.net/Articles/743055/ Wol <div class="FormattedComment"> <font class="QuotedText">&gt; The tiny computing space is just like an asteroid field; numerous projects exist, but they lack the required center of gravity for effective and self-sustained communities to form naturally around 
them. Consolidation efforts are moving slowly because of that. The end result is a highly fragmented space with relatively few developers per project and, therefore, fewer resources to rely upon when issues come up. Vulnerabilities are more likely to turn into a security nightmare.</font><br> <p> Actually, as an astronomer, they DON'T lack the required centre of gravity (at least, that's not the main problem). The asteroid belt has insufficient distance between it and Mars and Saturn, so should a cluster of asteroids happen to form and start coalescing (as you'd expect), every time Saturn goes past the gravitational disruption pushes them apart again. They're literally torn apart by gravitational tides from nearby larger objects.<br> <p> Which could also be a good analogy as to what happens to these small projects - the developers are heavily subjected to being lured away by larger projects in the same general eco-system.<br> <p> Cheers,<br> Wol<br> </div> Thu, 04 Jan 2018 20:54:18 +0000 Shrinking the kernel with intelligent config https://lwn.net/Articles/743049/ https://lwn.net/Articles/743049/ Wol <div class="FormattedComment"> From what I've picked up no-one's even considering this, but what I'd like to see is the kernel config system made much simpler.<br> <p> It would be nice if there was a hardware section, so you could, say, select "AMD system". This would then ask you "which processor", "which chipset", "which motherboard" etc, and that will select the options to pull in all the drivers for that specific hardware.<br> <p> If you've got fixed hardware, like a router for example, you would then be able to compile all the drivers into the kernel, plus maybe a load of modules for USB for stuff you want to support. 
It would be great for size reduction - and security - making it far easier to create targeted monolithic kernels with no extra stuff beyond what you need.<br> <p> Cheers,<br> Wol<br> </div> Thu, 04 Jan 2018 20:42:26 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/742173/ https://lwn.net/Articles/742173/ smitty_one_each <div class="FormattedComment"> Indeed, 692F will have you jamming to Talking Heads "Burning Down the House", albeit briefly.<br> </div> Sat, 23 Dec 2017 00:18:05 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/742150/ https://lwn.net/Articles/742150/ liw <div class="FormattedComment"> During this time of the year, in the northern parts of the northern hemisphere, it is good to repeat that often: 640K ought to be enough for everyone. Any hotter and it'll get really uncomfortable.<br> </div> Fri, 22 Dec 2017 18:37:59 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/742143/ https://lwn.net/Articles/742143/ smitty_one_each <div class="FormattedComment"> <font class="QuotedText">&gt; "So what does "tiny" actually mean? Let's define it as a sub-megabyte system or thereabout. "</font><br> <p> "See? See? 
I told you that nobody really *needed* more than 640K of memory," said Gill Bates.<br> </div> Fri, 22 Dec 2017 16:41:32 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/742105/ https://lwn.net/Articles/742105/ linusw <div class="FormattedComment"> I just love this metaphor with the open source gravity field.<br> <p> It comes close to describing how it actually works and corresponds with the social law that people like to belong to something and to get recognition from their peers.<br> </div> Fri, 22 Dec 2017 08:55:55 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/742095/ https://lwn.net/Articles/742095/ flussence <div class="FormattedComment"> Another thing came to mind re-reading this just now: routers. OpenWRT/LEDE has dropped or declined to support quite a few models as Linux has grown too big for them over the years, and the flash size is comparable to what you'd have available to install coreboot in.<br> </div> Fri, 22 Dec 2017 00:20:43 +0000 Compile as one huge object https://lwn.net/Articles/741720/ https://lwn.net/Articles/741720/ mwsealey <div class="FormattedComment"> Multi-file compilation is a thing. You can actually use both together - the key aspect of LTO as per the article is putting each function and data item in its own named section, because the linker can only act on individual sections as an unbreakable unit. Multi-file compilation would still allow you to do this, but also provide greater inlining opportunities at the compiler level (for functional equivalence or tail-calling) which the linker has less information about.<br> <p> It's probably best to leave the MFC part on a per module/subsystem basis (so, anything that gets a built-in.o) and then LTO both the main kernel binary and any modules. 
The difficulty is going to be making sure that public functions do not get LTO'd away: while C99 inlining somewhat forces preservation of the 'original' function even if it inlines everything, LTO could strip it out if there's no out-of-line reference to it, and then your modules wouldn't load.<br> <p> It's a pain in the backside, all told, but the results are always worth it.<br> </div> Mon, 18 Dec 2017 16:06:38 +0000 Let's see some numbers! https://lwn.net/Articles/741719/ https://lwn.net/Articles/741719/ npitre <div class="FormattedComment"> Numbers will come in a subsequent article after more methods are exposed and then compared.<br> </div> Mon, 18 Dec 2017 16:03:42 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/741658/ https://lwn.net/Articles/741658/ pabs <div class="FormattedComment"> This work would probably be useful for Intel's KVM based Clear Containers, now renamed to Kata Containers:<br> <p> <a href="https://katacontainers.io/">https://katacontainers.io/</a><br> </div> Sun, 17 Dec 2017 03:39:59 +0000 Let's see some numbers! https://lwn.net/Articles/741653/ https://lwn.net/Articles/741653/ david.a.wheeler <div class="FormattedComment"> I'd very much like to see before-and-after numbers. If the work reduces memory use by 10 bytes, it probably doesn't matter. I suspect that approaches are more effective when combined, but even so, I'd like to see if the effort appears to have the desired result.<br> </div> Sun, 17 Dec 2017 02:03:28 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/741652/ https://lwn.net/Articles/741652/ ThinkRob <p>One "mainstream" (aka not embedded) use might be those of us running coreboot. Some laptops have pretty limited flash, and there's not enough room for a Linux payload *and* another toy or two. So anything that might get the kernel down to the point where it all fits is nice.</p> <p>That said, "mainstream" might be a stretch. 
I mean, there may be dozens of us. But still... "<i>There are dozens of us! Dozens!</i>" ;)</p> Sun, 17 Dec 2017 01:45:31 +0000 Compile as one huge object https://lwn.net/Articles/741637/ https://lwn.net/Articles/741637/ gutschke <div class="FormattedComment"> Compilers aren't optimized for this use case, and tend to run out of memory if you try to naively load everything into them at the same time. But even if that was taken care of, you are going to run into a lot of conflicts with incompatible reuse of symbols. You should try this one day on any project that is bigger than just a "toy" example. It usually turns out to be insanely painful.<br> <p> You also lose out on a lot of the error messages that you would normally *want* to get, when there are layering violations and code reaches into private implementation details of other parts of the system. So, in practice, you'd have to maintain the code so that it can be compiled in the traditional fashion (that would happen during normal development cycles) and so that it can be compiled as one big file. That's very painful for the developers.<br> <p> As far as I can tell, there are some compilers that internally do something similar to what you are suggesting. And with proper compiler support, it can of course be made to work. But honestly, from the developer's point of view, it ends up looking very similar to what is described in this article. 
And it would still need the same source-code annotations to help the compiler figure out which references are live and which ones aren't.<br> </div> Sat, 16 Dec 2017 17:44:55 +0000 Compile as one huge object https://lwn.net/Articles/741638/ https://lwn.net/Articles/741638/ pbonzini <div class="FormattedComment"> That's exactly what link-time optimization does.<br> </div> Sat, 16 Dec 2017 17:40:48 +0000 Compile as one huge object https://lwn.net/Articles/741633/ https://lwn.net/Articles/741633/ mpr22 Well, it would at least let you know whether your compiler's execution time is polynomial in the size of the translation unit. Sat, 16 Dec 2017 16:16:23 +0000 Compile as one huge object https://lwn.net/Articles/741629/ https://lwn.net/Articles/741629/ epa <div class="FormattedComment"> I would have expected to skip the linking step altogether by concatenating all the source files into a big lump and then building that with optimization turned on. Some munging might be needed for symbols which are defined in multiple source files, but surely it's manageable.<br> </div> Sat, 16 Dec 2017 12:59:15 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/741618/ https://lwn.net/Articles/741618/ flussence <div class="FormattedComment"> This is cool work! It'd be nice for us mainstream users as well as embedded; my vmlinuz images have almost doubled in size since I first put together this desktop…<br> </div> Sat, 16 Dec 2017 07:17:57 +0000 Shrinking the kernel with link-time garbage collection https://lwn.net/Articles/741617/ https://lwn.net/Articles/741617/ foom This "missing forward reference" use case was <a href="https://groups.google.com/d/msg/generic-abi/_CbBM6T6WeM/9nNnwRNHAQAJ">recently discussed</a> on the "generic-abi" email list. The conclusion was that the SHF_LINK_ORDER flag's semantics should be slightly extended to cover this. Unfortunately, it's not yet implemented in either the GNU linker or the GNU assembler. 
However, if you use recent clang and lld, you can see it in action today. (A <a href="https://sourceware.org/ml/binutils/2017-04/msg00000.html">patch</a> was posted for gold, but doesn't seem to have been applied yet.) <p> E.g. here's a slightly modified example from the article. I've removed the ".reloc" hack, and modified the .section line, in red, to create a section marked with SHF_LINK_ORDER (that's the "o" in the "ao" flag): <p> In file test.s: <pre>
	.section .text.foobar,"ax"
	.globl foobar
foobar:
	mov	r3, #0
	mov	r2, #0x5a
1:	strt	r2, [r0]
2:	mov	r0, r3
	bx	lr

	.section .fixup.text.foobar,"ax"
3:	mov	r3, #-57
	b	2b

<b><font color="red">	.section __ex_table.text.foobar, "ao", %progbits,.text.foobar</font></b>
4:	.long	1b, 3b
</pre> In file test2.c: <pre>
int foobar();
int main() {
  foobar();
}
</pre> I run: <pre>
$ clang -target arm-linux-gnueabihf -c -o test.o test.s
$ clang -target arm-linux-gnueabihf -c -o test2.o test2.c
$ clang -target arm-linux-gnueabihf -o test test2.o test.o -Wl,--gc-sections -fuse-ld=lld
</pre> Doing an "objdump -x test", you will see that the "__ex_table.text.foobar" section made it into the output file. If you then comment out the call in main() to "foobar()", you'll see the __ex_table disappears too. <p> I'd recommend the kernel folks get the GNU tools to implement these semantics, so that weird no-op reloc tricks aren't needed. Sat, 16 Dec 2017 04:09:23 +0000