Partial LTO?

Posted Aug 21, 2012 17:19 UTC (Tue) by cesarb (subscriber, #6266)
Parent article: Link-time optimization for the kernel

The kernel already does a lot of partial linking (ld -r) in its build. Could it do partial LTO too? That is, LTO a few of the major directories separately (kernel, mm, fs, net, arch - probably everything but drivers), and do not LTO the final link step. This would give smaller gains, but would also compile faster.



Partial LTO?

Posted Aug 22, 2012 4:50 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246) [Link]

At what granularity is it partially linking? If it's the sort where a library of functions gets partially linked separately from all the places that call that library, the impact of LTO should be noticeably lessened.

Partial LTO?

Posted Aug 22, 2012 13:23 UTC (Wed) by nix (subscriber, #2304) [Link]

It's just ld -r, done in every directory in the kernel tree that has any .o files in it at all. Each directory produces either a module.o or a built-in.o file, and these are then linked together to generate the final modules or the kernel proper (the kernel proper also has a bunch of other object files, but most of the bulk is in built-in.o files). The LTO sections should get carried along with this, and then at full link time the linker plugin should pick them up and do a full LTO.
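To make the flow above concrete, here is a toy Python model of it. This is only a sketch: it does not invoke ld or GCC, and `Obj`, `compile_unit`, `partial_link` and `final_lto_scope` are made-up names for illustration, as are the file names. The one assumption it encodes is the one stated above, that a partial link carries the LTO sections along untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Toy stand-in for an object file: a name plus the source units whose IR it carries."""
    name: str
    ir_units: frozenset


def compile_unit(source):
    # Each compiled file carries its own serialized IR (its "LTO section").
    return Obj(source.replace(".c", ".o"), frozenset({source}))


def partial_link(name, objs):
    # Models `ld -r`: combine the objects and carry every IR unit along unchanged.
    return Obj(name, frozenset().union(*(o.ir_units for o in objs)))


def final_lto_scope(objs):
    # At the final link, the optimizer can see the union of all IR that survived.
    return frozenset().union(*(o.ir_units for o in objs))


# Illustrative file names only.
mm = partial_link("mm/built-in.o", [compile_unit("mm/slab.c"), compile_unit("mm/mmap.c")])
fs = partial_link("fs/built-in.o", [compile_unit("fs/open.c")])
scope = final_lto_scope([mm, fs])
print(sorted(scope))  # ['fs/open.c', 'mm/mmap.c', 'mm/slab.c']
```

In this model the per-directory partial links do not shrink what the final LTO can see; every unit's IR reaches the last step.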

Partial LTO?

Posted Aug 22, 2012 14:22 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246) [Link]

I kinda figured it was something such as by-directory (it certainly appeared that way last time I built a kernel). I was pretty sure the normal flow carried the serialized GIMPLE down to the end for the final LTO. Where the boundary becomes important is if you try to implement cesarb's partial-LTO suggestion.

If you did a "partial LTO", where you merged all the GIMPLE together, generated a new object in place of the partially linked library, and didn't pass the GIMPLE up to the next level, you'd be giving up many of the most interesting optimizations, assuming many of them come from optimizing library and core code into the bodies of drivers and other leaves.

It doesn't make sense to re-codegen at partial link time unless your intent is to throw away the GIMPLE, or provide a library that could be linked both with and without further LTO (which seems... weird?).

Partial LTO?

Posted Aug 23, 2012 17:08 UTC (Thu) by rriggs (subscriber, #11598) [Link]

If you build and then load *any* kernel modules, it is, by definition, partial LTO.

Partial LTO?

Posted Aug 23, 2012 19:33 UTC (Thu) by jzbiciak (✭ supporter ✭, #5246) [Link]

Fair enough, but only at the module boundary. Isn't this the same as any DSO? I mean, it's not as if LTO brings in libc and all the other shared libraries your program depends on when you use it in user space, is it?

The point is, to get the full benefit of LTO in the kernel, you want to be able to inline or optimize across boundaries such as arch/arm/ and kernel/, for example. (At least, if I understood correctly.) If those are both LTO'd at their respective directory boundaries, and only linked traditionally at the last step, you lose those opportunities.
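A rough way to picture the loss is to count which cross-file calls the optimizer could see at all. The Python sketch below is a toy model under stated assumptions: the call list and file names are invented, `top_dir` and `optimizable_edges` are hypothetical helpers, and "visible" only means both ends of the call are in the same LTO unit, not that the compiler would actually inline anything.

```python
def top_dir(path):
    # "arch/arm/entry.c" -> "arch"
    return path.split("/", 1)[0]


def optimizable_edges(calls, lto_groups=None):
    """Return the (caller, callee) pairs that fall inside a single LTO unit.

    lto_groups=None models full LTO at the final link: every edge is visible.
    Otherwise each top-level directory in lto_groups is LTO'd on its own, so an
    edge is visible only when both files live in the same listed directory.
    """
    visible = []
    for caller, callee in calls:
        if lto_groups is None:
            visible.append((caller, callee))
        elif top_dir(caller) == top_dir(callee) and top_dir(caller) in lto_groups:
            visible.append((caller, callee))
    return visible


# Invented call edges between invented files, for illustration only.
calls = [
    ("kernel/sched.c", "kernel/timer.c"),
    ("arch/arm/entry.c", "kernel/sched.c"),
    ("fs/open.c", "mm/slab.c"),
    ("drivers/foo.c", "kernel/timer.c"),
]
groups = {"kernel", "mm", "fs", "net", "arch"}  # the directories suggested in the thread

print(len(optimizable_edges(calls)))      # 4
print(optimizable_edges(calls, groups))   # [('kernel/sched.c', 'kernel/timer.c')]
```

With per-directory LTO and a traditional final link, only the call that stays inside kernel/ remains visible; the arch-to-kernel and fs-to-mm edges are lost even though none of them involves drivers.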

Sure, drivers built as modules miss out. But my gut feeling (which may be wrong!) is that there are still plenty of things compiled in that ought to benefit but won't if you limit LTO to the partial link boundaries.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds