Link-time optimization for the kernel

Posted Aug 22, 2012 8:15 UTC (Wed) by jezuch (subscriber, #52988)
Parent article: Link-time optimization for the kernel

AFAIK there's another benefit of LTO: it can merge identical functions and data from multiple object files. This is especially beneficial for C++, where, for example, every time one #includes <cstream>, the compiler generates several stubs which turn out to be identical and unnecessarily duplicated. This and other optimizations usually result in about 10% reduction in the final binary's size.

That said, I notice that the kernel actually grows with LTO. The modules do get smaller - especially large ones like GPU and filesystem drivers - but not vmlinuz.

Oh, and the development version which will become GCC 4.8 reduces memory use even further. So much that it can actually build itself on my machine with 8 gigs of ram, although with some swapping :)


Link-time optimization for the kernel

Posted Aug 22, 2012 13:34 UTC (Wed) by jwakely (guest, #60262) [Link]

> it can merge identical functions and data from multiple object files

But in practice does it actually do so?

> every time one #includes <cstream>, the compiler generates several stubs which turn out to be identical and unnecessarily duplicated

I assume you mean <iostream>, and AFAIK LTO doesn't change that situation at all (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44952 for a suggestion to make it do so, and other suggestions that wouldn't rely on LTO.)

Link-time optimization for the kernel

Posted Aug 22, 2012 13:37 UTC (Wed) by andikleen (guest, #39006) [Link]

I don't think that works currently with LTO

There were some linker based approaches for this though (it only really needs a checksum per function)

Link-time optimization for the kernel

Posted Aug 22, 2012 18:19 UTC (Wed) by stevenb (guest, #11536) [Link]

GCC doesn't do this, but AFAIU the gold linker's icf.cc does this. From binutils-2.22:src/gold/icf.cc:

// Identical Code Folding Algorithm
// ----------------------------------
// Detecting identical functions is done here and the basic algorithm
// is as follows. A checksum is computed on each foldable section using
// its contents and relocations. If the symbol name corresponding to
// a relocation is known it is used to compute the checksum. If the
// symbol name is not known the stringified name of the object and the
// section number pointed to by the relocation is used. The checksums
// are stored as keys in a hash map and a section is identical to some
// other section if its checksum is already present in the hash map.
// Checksum collisions are handled by using a multimap and explicitly
// checking the contents when two sections have the same checksum.
//
// However, two functions A and B with identical text but with
// relocations pointing to different foldable sections can be identical if
// the corresponding foldable sections to which their relocations point to
// turn out to be identical. Hence, this checksumming process must be
// done repeatedly until convergence is obtained.

Whether this works with LTO, I don't know. And I suppose it requires -ffunction-sections but I'm not sure about that either.
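The checksum-and-iterate idea in that comment can be sketched roughly as follows. This is a simplified illustration, not gold's actual code: the `Section` struct and `fold_classes` function are hypothetical names, relocations are modelled as plain indices into the section list, and exact keys in a `std::map` stand in for checksums so that collisions need no separate handling. It starts by assuming all sections are identical and splits classes apart until nothing changes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a foldable section.
struct Section {
    std::string text;                 // section contents
    std::vector<std::size_t> relocs;  // indices of the sections its relocations point to
};

// Returns one class id per section; sections sharing an id are folding candidates.
std::vector<std::size_t> fold_classes(const std::vector<Section>& sections) {
    const std::size_t n = sections.size();
    std::vector<std::size_t> cls(n, 0);  // start: every section in the same class
    std::size_t num_classes = (n == 0) ? 0 : 1;
    for (;;) {
        // Key = own contents plus the current classes of all relocation targets.
        std::map<std::pair<std::string, std::vector<std::size_t>>, std::size_t> ids;
        std::vector<std::size_t> next(n);
        for (std::size_t i = 0; i < n; ++i) {
            std::vector<std::size_t> target_classes;
            for (std::size_t t : sections[i].relocs) {
                assert(t < n);  // relocation must point at a known section
                target_classes.push_back(cls[t]);
            }
            auto key = std::make_pair(sections[i].text, std::move(target_classes));
            next[i] = ids.emplace(std::move(key), ids.size()).first->second;
        }
        // Each round can only split classes, so an unchanged count means convergence.
        if (ids.size() == num_classes) return next;
        num_classes = ids.size();
        cls = std::move(next);
    }
}
```

With two identical leaf sections and two callers whose text matches but whose relocations point at different leaves, the callers end up in the same class only because the leaves do, which is the situation the second paragraph of the icf.cc comment describes.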

Not a very helpful post, sorry ;-)

Link-time optimization for the kernel

Posted Aug 22, 2012 22:43 UTC (Wed) by andikleen (guest, #39006) [Link]

gold unfortunately does not work with LTO kernel builds at the moment. You can only use it for non-LTO builds.

Link-time optimization for the kernel

Posted Aug 23, 2012 10:05 UTC (Thu) by jwakely (guest, #60262) [Link]

No need to apologise, I'd somehow missed that gold did ICF and thought Microsoft's was the only mainstream linker to support it. Thanks, Steven!

Link-time optimization for the kernel

Posted Sep 11, 2014 4:34 UTC (Thu) by alison (subscriber, #63752) [Link]

I was just reading about Open Mirage

http://anil.recoil.org/papers/2013-asplos-mirage.pdf

and thinking about the security advantages of the "sealed" unikernel with its Write XOR Execute policy. The paper comments about linking:

"Unikernels link libraries that would normally be provided by the host OS, allowing the Unikernel tools to produce highly compact binaries via the normal linking mechanism. **Features that are not used in a particular compilation are not included** and whole-system optimization techniques can be used. In the most specialised mode, all configuration files are statically evaluated, enabling extensive dead-code elimination at the cost of having to recompile to reconfigure the service."

Reading that paragraph motivated me to revisit this LTO discussion. I can't figure out whether LTO commonly includes "dead-code elimination," meaning that functions from a statically linked library that are not used anywhere are not included in the final binary. In other words, are functions either included inline, or not included at all? Thanks to any experts who wander by and know the answer.

Link-time optimization for the kernel

Posted Sep 17, 2014 21:39 UTC (Wed) by nix (subscriber, #2304) [Link]

Functions that are used nowhere (and, if this is a shared library or a binary compiled with -Wl,--export-dynamic, are not in the external symbol table) are eliminated, but that doesn't mean all the rest are inlined. Most will not be inlined (indeed, some may be cloned without being inlined).
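A tiny illustration of that distinction, as a sketch only: the names `used_helper`, `unused_helper` and `app_entry` are made up, and everything sits in one file for brevity where a real case would spread it over a library and a program. Assuming an executable built with something like `g++ -O2 -flto` and the linker plugin, one would expect `nm` on the result to show no `unused_helper`, while `used_helper` may survive as an out-of-line function or be inlined into its caller, depending on what the inliner decides.

```cpp
// Stand-in for a statically linked library: both helpers have external linkage.
int used_helper(int x) { return 2 * x + 1; }
int unused_helper(int x) { return x * x; }  // never referenced anywhere

// Stand-in for the program's entry point; only used_helper is reachable from here.
int app_entry() { return used_helper(20); }
```

If the same code were linked into a shared library, or with -Wl,--export-dynamic, `unused_helper` would be in the external symbol table and would have to be kept.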

Link-time optimization for the kernel

Posted Aug 23, 2012 10:19 UTC (Thu) by mgedmin (subscriber, #34497) [Link]

Wasn't there recently a post about a kernel bug caused by the compiler merging two different functions that turned out to contain identical code? IIRC it was caused by a check of a function pointer in order to determine an object's type misfiring.
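The kind of check that can misfire looks something like the sketch below. It is a made-up example, not the code from that bug report: `Object`, `circle_handler`, `square_handler` and `is_circle` are hypothetical names. The language guarantees that the two handlers have distinct addresses, so the check works in a conforming build; if a linker folds them into one address (gold's `--icf=all`, for instance), `is_circle` would start returning true for squares too.

```cpp
struct Object;
using Handler = int (*)(const Object&);

struct Object {
    Handler handler;  // doubles as a type tag in this sketch
};

// Two distinct handlers whose bodies happen to compile to identical code.
int circle_handler(const Object&) { return 0; }
int square_handler(const Object&) { return 0; }

// Type check by function-pointer identity: valid only while the two
// handlers keep distinct addresses.
bool is_circle(const Object& obj) { return obj.handler == &circle_handler; }

// True when the identity check still tells the two kinds apart.
bool identity_check_holds() {
    Object circle{&circle_handler};
    Object square{&square_handler};
    return is_circle(circle) && !is_circle(square);
}
```

gold also has an `--icf=safe` mode, which is meant to avoid folding functions whose addresses might be compared like this.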


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds