LWN: Comments on "Kernel optimization with BOLT"
https://lwn.net/Articles/993828/
This is a special feed containing comments posted to the individual LWN article titled "Kernel optimization with BOLT".
Sat, 27 Sep 2025 05:02:45 +0000
lwn@lwn.net

Cache sizes
https://lwn.net/Articles/997481/
raven667
<div class="FormattedComment"> Some of what you describe comes down to the fact that, while software people sort of imagine the computer operates in a virtual realm of abstract logic, hardware is actually a physical electrical device, and you can't just abstractly put "more cache" on it the way you could refactor a software program, because of the physical reality of the electrical circuits and wiring that make up the computer.<br> </div>
Fri, 08 Nov 2024 14:42:47 +0000

Grope
https://lwn.net/Articles/997297/
paulj
<div class="FormattedComment"> The free software community in 1998 was largely young people - from late teen students to 30-somethings (Linus was 29). So they're now in their 40s to 60s.<br> <p> I.e., the set of young people who enjoyed mildly vulgar/shocking-to-norms puns then are mostly the same set of people as the older set who today find it juvenile.<br> </div>
Thu, 07 Nov 2024 15:15:34 +0000

Grope
https://lwn.net/Articles/997259/
cmkrnl
<div class="FormattedComment"> Would have sounded lighthearted and witty to a close-knit group of young people in 1998, but today it just sounds juvenile and immature.<br> </div>
Wed, 06 Nov 2024 23:35:50 +0000

Intriguing
https://lwn.net/Articles/997126/
rolandog
<div class="FormattedComment"> I'm also curious as to whether BOLT is smart enough to distinguish functions that need to run in constant time to prevent timing attacks. (Gotta watch the presentation, though... 
Maybe it's addressed there).<br> </div>
Wed, 06 Nov 2024 12:41:43 +0000

Cache sizes
https://lwn.net/Articles/996978/
anton
The L2 cache of many cores is dedicated to the core, too (e.g., on Intel's P-cores for over a decade and on AMD's Zen-Zen5 cores). <p>The reason for keeping the L1 cache small is latency. If the cache grows, the miss rate decreases, but the latency increases. You can see the longer latency nicely in the comparison of L2 sizes and latencies in <a href="https://old.chipsandcheese.com/2024/09/27/lion-cove-intels-p-core-roars/">this article</a>. <p>One reason is that the wires get longer, which increases the time that signals take to travel. <p>You also want to use a virtually-indexed physically-tagged (VIPT) cache as L1 cache, which allows the TLB access and the cache access to be performed in parallel, i.e., with low latency. But that means that a cache way can be at most as large as a page, and the number of ways is limited (you typically don't see more than 16-way set-associative caches, and a lower number of ways is common in L1 caches). The page size is 4KB on AMD64, which limits the L1 cache size to 64KB (and 32KB or 48KB is more common). Apple's Firestorm (M1 P-core) has larger caches (192KB I-cache, 128KB D-cache), but also 16KB pages, which allows a VIPT cache implementation with a 12-way (I) or 8-way (D) set-associative cache.
Tue, 05 Nov 2024 08:06:57 +0000

Cache sizes
https://lwn.net/Articles/996963/
himi
<div class="FormattedComment"> Out of curiosity, how much of that is because larger I and D caches didn't provide as much of a gain as increasing L3 caches? Particularly since the L1 caches are tightly coupled to each core rather than shared across the ever-increasing number of cores - devoting transistors to increasing L1 caches is going to have a very different cost/benefit mix than devoting them to more computational units or shared caches . . 
.<br> </div>
Mon, 04 Nov 2024 23:48:58 +0000

Cache sizes
https://lwn.net/Articles/996840/
paulj
<div class="FormattedComment"> Hell, the AMD *K6* had 32 KiB I and D cache!<br> </div>
Mon, 04 Nov 2024 10:45:09 +0000

Cache sizes
https://lwn.net/Articles/996729/
anton
The claims made in the article (maybe in the talk) about cache sizes are mostly wrong. <p>I-cache and D-cache typically have similar sizes, and if they differ, it's not always the D-cache that is larger. E.g., Zen4 has 32KB I-cache and 32KB D-cache, Zen5 and Raptor Cove have 32KB I-cache and 48KB D-cache, and Gracemont has 64KB I-cache and 32KB D-cache. <p>The sizes of L1 caches generally have not grown in the last 20 years; e.g., the 2003 Athlon 64 has 64KB I-cache and 64KB D-cache, and the 2003 Pentium M has 32KB I-cache and 32KB D-cache. Instead, CPU designers have added an L3 cache since that time. A number of cores have a microoperation cache in addition to the I-cache, but the sizes are hard to compare.
Fri, 01 Nov 2024 18:46:52 +0000

Other BOLT weirdness
https://lwn.net/Articles/996727/
anton
The P-cores of Alder Lake don't support AVX-512 (implemented but disabled), either, unless you are using some early firmware. It's a pity that Intel completely disabled that, even in Xeon-E24xx CPUs where the E-cores are disabled. But don't worry, buy an AMD CPU with a Zen4 or Zen5 core, and you will get AVX-512.
Fri, 01 Nov 2024 18:20:09 +0000

Grope
https://lwn.net/Articles/996169/
atnot
<div class="FormattedComment"> I think it is very amusing how fast I flip from being a filthy degenerate that must be kept away from society for their debauchery to a funless prude as soon as I impinge on people's Sacred ability to make rape "jokes" and non-consensually touch people at conferences. 
Oh well.<br> </div>
Tue, 29 Oct 2024 10:13:37 +0000

Grope
https://lwn.net/Articles/996087/
LtWorf
<div class="FormattedComment"> Moralists never have fun, so hold a grudge against others who have fun instead.<br> </div>
Mon, 28 Oct 2024 14:33:14 +0000

Other BOLT weirdness
https://lwn.net/Articles/995944/
intelfx
<div class="FormattedComment"> <span class="QuotedText">&gt; AVX-512 is stupid. It doesn't work on efficiency cores on Alder Lake</span><br> <p> Perhaps it rather means that Alder Lake is stupid?<br> </div>
Sun, 27 Oct 2024 06:59:56 +0000

Other BOLT weirdness
https://lwn.net/Articles/995942/
Cyberax
<div class="FormattedComment"> <span class="QuotedText">&gt; You can use runtime cpuid feature bit detection and identify the running CPU supports AVX512</span><br> <p> AVX-512 is stupid. It doesn't work on efficiency cores on Alder Lake. Even though P-cores support it.<br> </div>
Sun, 27 Oct 2024 05:57:27 +0000

Other BOLT weirdness
https://lwn.net/Articles/995941/
kmeyer
<div class="FormattedComment"> There is some other BOLT behavior that derives from its use for HHVM: it can replace functions that use AVX512 intrinsics with traps. This is probably not useful for anyone aside from HHVM.<br> <p> <a href="https://github.com/llvm/llvm-project/blob/7b88e7530d4329ff0c7c8638f69b39fa1e540218/bolt/docs/CommandLineArgumentReference.md?plain=1#L352-L355">https://github.com/llvm/llvm-project/blob/7b88e7530d4329f...</a><br> <p> Also, this is ... kind of insane, for library source code? You can use the compile-time feature detection support and identify the compiler target supports AVX512 (-mavx512 or whatever). You can use runtime cpuid feature bit detection and identify the running CPU supports AVX512. 
But if your binaries have been through BOLT with the -trap-avx512 flag, your AVX-accelerated function will just trap with a ud2 instruction.<br> <p> If you find yourself needing to detect, at runtime, this BOLT bastardization of the binary, this ugly hack seems to work: <a href="https://github.com/facebook/folly/blob/d5e10f9d076838374fb7458fa25844ea93e3538f/folly/detail/TrapOnAvx512.cpp#L32-L34">https://github.com/facebook/folly/blob/d5e10f9d076838374f...</a><br> </div>
Sun, 27 Oct 2024 04:35:42 +0000

Intriguing
https://lwn.net/Articles/995909/
jd
<div class="FormattedComment"> It would seem like there is now quite an array of tools for optimising in various ways.<br> <p> But one optimisation can potentially interact with another optimisation, and optimal binary reordering may be affected by compiler optimisation which may in turn be potentially affected by optimal binary reordering.<br> <p> I'm trying to figure out from this article how, exactly, you get the most out of this.<br> <p> I'd also be intrigued to know if this technique could be used effectively with the Verified Software Toolchain. 
VST is fine for producing provably correct binaries, but there are obvious drawbacks to this - there's not a whole lot of optimising you can do and still be certain the binaries are correct.<br> <p> If you can greatly accelerate VST-produced binaries without impacting the proof of correctness in any way, I could imagine scenarios where this could actually be useful.<br> </div>
Sat, 26 Oct 2024 12:35:24 +0000

Grope
https://lwn.net/Articles/995907/
intelfx
<div class="FormattedComment"> There is nothing in the quote or in the original text that would suggest that any kind of _actual_ harassment, "physical assault", or, worse, "sexual assault" has taken place.<br> <p> <span class="QuotedText">&gt; I'd like to please stay as far away from you as possible.</span><br> <p> The feeling is thus mutual.<br> </div>
Sat, 26 Oct 2024 10:01:57 +0000

Grope
https://lwn.net/Articles/995906/
atnot
<div class="FormattedComment"> Sorry, if your idea of "lighthearted fun" is getting "harassed" (direct quote) and physically assaulted into watching a presentation about how proud the author is about his sexual assault joke, I'd like to please stay as far away from you as possible.<br> </div>
Sat, 26 Oct 2024 09:22:24 +0000

Grope
https://lwn.net/Articles/995905/
intelfx
<div class="FormattedComment"> And the problem with these “uncomfortable” “yikes” “whatever this is”, besides people having apparently light-hearted fun, is exactly… what?<br> </div>
Sat, 26 Oct 2024 09:06:17 +0000

Grope
https://lwn.net/Articles/995903/
atnot
<div class="FormattedComment"> <span class="QuotedText">&gt; Alan Cox has grabbed Miguel and forced him to sit down. The two of them are heading to the front. Apparently the harassment in the hallway had reached too high a level. No! He's escaped! 
</span><br> <span class="QuotedText">&gt; "rope" is a pun on "cord" but then creates a great word combined with GNU</span><br> <p> yeah this whole thing is just one yikes after another. jesus christ. As uncomfortable as I still am visiting events like this today, at least it's no longer... whatever this is.<br> </div>
Sat, 26 Oct 2024 08:08:31 +0000

Grope
https://lwn.net/Articles/995882/
willy
<div class="FormattedComment"> It only took 25 years to replace<br> <p> <a href="https://lwn.net/1998/1029/als/rope.html">https://lwn.net/1998/1029/als/rope.html</a><br> <p> (Not sure why it never got released ...)<br> <p> And, damn, that name and the "jokes" being made ... I think we're a bit better now.<br> </div>
Sat, 26 Oct 2024 00:33:23 +0000
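
As a rough check of the VIPT sizing argument in anton's "Cache sizes" comment above (a cache way no larger than a page, so L1 size is at most page size times the number of ways), the arithmetic can be sketched as below. This is only an illustration under that simplified rule; the function name `max_vipt_l1_bytes` and the `EXAMPLES` table are made up for this sketch, and the page sizes and way counts are the ones quoted in the comment, not independently measured values.

```python
KIB = 1024


def max_vipt_l1_bytes(page_bytes: int, ways: int) -> int:
    """Upper bound on L1 size under the simplified rule from the comment:
    one cache way may be at most one page, so size <= page size * ways."""
    if page_bytes <= 0 or ways <= 0:
        raise ValueError("page_bytes and ways must be positive")
    return page_bytes * ways


# (label, page size in bytes, number of ways), using the comment's own numbers.
EXAMPLES = [
    ("AMD64, 4KB pages, 16-way", 4 * KIB, 16),
    ("Firestorm I-cache, 16KB pages, 12-way", 16 * KIB, 12),
    ("Firestorm D-cache, 16KB pages, 8-way", 16 * KIB, 8),
]

for label, page_bytes, ways in EXAMPLES:
    limit_kib = max_vipt_l1_bytes(page_bytes, ways) // KIB
    print(f"{label}: at most {limit_kib}KB")
```

Under these assumptions the bounds come out to 64KB, 192KB, and 128KB, which are the same figures the comment gives for the AMD64 limit and for Firestorm's I-cache and D-cache.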