Profile-guided optimization for the kernel

Posted Sep 4, 2020 5:06 UTC (Fri) by rbrito (guest, #66188)
Parent article: Profile-guided optimization for the kernel

I am eagerly waiting for kernels to be routinely compiled with link-time optimization (LTO) at least, since the prospect of smaller binaries could help a lot with booting more armv5 machines (the space available for the kernel image is limited when the bootloader loads it).

If the kernel also happens to use less memory, so much the better: that is a very nice side effect, and the memory saved could be left to userspace on memory-constrained machines. In fact, this would also help with the kernels that phones use (I'm thinking of LineageOS here). A leaner kernel would mean less thrashing, which can only be a good thing. I hope that apps that use the NDK can also switch to LTO in the relatively near future.

Having distribution kernels for regular computers built with PGO would also be very nice. I really, really hope that we're not far from that being the bread and butter of kernel compilation.

Also on the subject of compiler optimizations, in the last few days I read about a new approach called the machine function splitter that was just merged into LLVM (see https://github.com/llvm/llvm-project/commit/94faadac). It seems to be roughly based on the observation that not every part of a hot function is itself hot. It would be lovely to get something similar in GCC.

In fact, if you think about it, many of these optimizations (like reducing code size for little machines like armv5) may also significantly help the code for big, compute-intensive cloud applications, so everyone would benefit from that...



Profile-guided optimization for the kernel

Posted Sep 4, 2020 22:09 UTC (Fri) by nivedita76 (subscriber, #121790) [Link]

I don't know any details, but the LLVM pass sounds similar to GCC's -freorder-blocks-and-partition:

In addition to reordering basic blocks in the compiled function, in order to reduce number of taken branches, partitions hot and cold basic blocks into separate sections of the assembly and .o files, to improve paging and cache locality performance.
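To make the hot/cold idea concrete, here is a minimal sketch in C of a hot loop with a rarely taken error path. The function names (sum_checked, report_bad_input) are made up for illustration and do not come from the kernel or from either compiler. The cold attribute and __builtin_expect are GCC/Clang extensions that hint at what a profile would otherwise tell the compiler; whether the cold code actually ends up in a separate section depends on the compiler, target, and options.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical error handler. The "cold" attribute tells GCC/Clang that
 * this function is unlikely to run, so it may be placed away from hot code. */
__attribute__((cold, noinline))
static int report_bad_input(size_t index)
{
    fprintf(stderr, "bad input at index %zu\n", index);
    return -1;
}

/* Sums non-negative values into *out and returns 0.
 * A negative value is treated as a rare error: the function returns -1
 * and leaves *out untouched. */
int sum_checked(const int *values, size_t count, long *out)
{
    long total = 0;

    for (size_t i = 0; i < count; i++) {
        /* Hint that this branch is almost never taken (the cold part
         * of an otherwise hot function). */
        if (__builtin_expect(values[i] < 0, 0))
            return report_bad_input(i);
        total += values[i];
    }

    *out = total;
    return 0;
}
```

With real profile data the hints would not be needed: with GCC, one would build with -fprofile-generate, run a representative workload, and rebuild with -fprofile-use, at which point -freorder-blocks-and-partition can use the measured counts to decide which blocks are cold.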


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds