
The Compact C Type Format in the GNU toolchain

Posted Aug 7, 2019 0:55 UTC (Wed) by nix (subscriber, #2304)
In reply to: The Compact C Type Format in the GNU toolchain by roc
Parent article: The Compact C Type Format in the GNU toolchain

> OK but how big was that binary? Things that work well for reasonable-sized programs don't necessarily work well for Firefox or Chromium. If you want people to use CTF to generate FFI glue at runtime, for example, even a small startup penalty is going to cause people to look for alternatives.

Let's try it for an enterprise Linux kernel (because I've got one sitting here waiting). The kernel splits its CTF unusually, so let's use the output of the old deduplicator: vmlinux.ctf (types used only by the core kernel) plus its parent shared_ctf.ctf (types used by more than one module, or by at least one module and the core kernel). Put together these are 1509340 bytes compressed, 4267753 bytes uncompressed. With a good deduplicator you need a *big* program to produce that much CTF, though no doubt a C++-capable CTF would find Chromium to be just such a program.

A thousand cats:

1.26user 0.33system 0:01.52elapsed 104%CPU (0avgtext+0avgdata 3320maxresident)k

A thousand uncompresses (done by hacking libctf to abort on error and free everything immediately after uncompressing). Unsurprisingly gunzip is not free:

34.42user 3.64system 0:38.03elapsed 100%CPU (0avgtext+0avgdata 9472maxresident)k

A thousand dumps of the CTF header redirected to /dev/null (each of which roughly involves open, decompress, sweep for indexes etc., do almost no work, close):

35.28user 2.97system 0:38.23elapsed 100%CPU (0avgtext+0avgdata 9468maxresident)k

That's in the noise: if it costs anything, the indexing costs well under 1% of the cost of decompression. And since doing this sort of thing increases the efficiency of compression, it may in the end *save* time as well as space. (I also tried this with an old-format file: the transparent upgrade pass was also in the noise.)
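
To put the three runs side by side, here is a minimal Python sketch that turns the time(1) totals quoted above into per-invocation figures. The numbers are copied from this comment; the function names are purely illustrative, and nothing here measures anything itself.

```python
RUNS = 1000

# (user, system, elapsed) in seconds for 1000 runs, copied from the
# time(1) output quoted in this comment.
TIMINGS = {
    "cat": (1.26, 0.33, 1.52),
    "uncompress": (34.42, 3.64, 38.03),
    "header dump": (35.28, 2.97, 38.23),
}


def elapsed(name):
    """Total wall-clock seconds for all runs of one experiment."""
    return TIMINGS[name][2]


def cpu_seconds(name):
    """Total user + system CPU seconds for all runs of one experiment."""
    user, system, _ = TIMINGS[name]
    return user + system


def per_run_ms(name):
    """Average wall-clock milliseconds per single invocation."""
    return elapsed(name) / RUNS * 1000.0


def relative_overhead(base, extended, metric=elapsed):
    """Fractional extra cost of `extended` over `base` under `metric`."""
    return (metric(extended) - metric(base)) / metric(base)


if __name__ == "__main__":
    for name in TIMINGS:
        print(f"{name:12s} {per_run_ms(name):7.2f} ms per run")
    for metric in (elapsed, cpu_seconds):
        extra = relative_overhead("uncompress", "header dump", metric)
        print(f"indexing overhead by {metric.__name__}: {extra:.2%}")
```

By either metric the header dump comes out roughly half a percent above plain decompression, about 0.2 ms on top of about 38 ms per run. No run-to-run variance was recorded, so this is consistent with "in the noise" rather than a measured cost.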

Note that the CTF link section merging machinery almost entirely resides in libctf and is intended to be reusable by other projects: it's not ld-specific, and you're not restricted to doing CTF merging the exact same way ld does it. Things like Chromium and Firefox might well elect to postprocess themselves and split up their CTF differently, yielding smaller CTF dictionaries customized for their use. (Right now, you can choose to split along boundaries different from translation unit boundaries, lumping TUs together into bigger units, and you can choose an alternative conflict-resolution strategy where, rather than placing all types in one big dictionary unless they conflict, we place all types in per-TU subdictionaries unless they are used by more than one TU, so the parent dictionary gets a lot smaller. The linker doesn't use any of this stuff yet, but in time it might grow options controlling some of this. There's no point yet since most of that depends on a good deduplicator. The one I haven't written yet. :) )
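
The difference between the two conflict-resolution strategies is easier to see on a toy model. The Python sketch below is only an illustration of the idea as described in this comment: it is not the libctf API, the `split_types` function and its `share_unique` flag are hypothetical names, and real CTF types are not strings. A type is treated as conflicting when two TUs define the same name differently.

```python
from collections import defaultdict


def split_types(tus, share_unique=True):
    """Split types between one shared parent and per-TU children.

    tus maps a TU name to a {type name: definition} dict.

    share_unique=True:  everything goes in the shared parent unless
                        TUs disagree about a definition.
    share_unique=False: everything goes in the per-TU children unless
                        more than one TU uses it (and they agree).

    Returns (shared, per_tu).
    """
    # type name -> {TU name: definition seen in that TU}
    users = defaultdict(dict)
    for tu, types in tus.items():
        for name, definition in types.items():
            users[name][tu] = definition

    shared = {}
    per_tu = {tu: {} for tu in tus}
    for name, by_tu in users.items():
        definitions = set(by_tu.values())
        conflicted = len(definitions) > 1
        used_once = len(by_tu) == 1
        if conflicted or (used_once and not share_unique):
            # Keep each TU's own view of the type in its child.
            for tu, definition in by_tu.items():
                per_tu[tu][name] = definition
        else:
            shared[name] = next(iter(definitions))
    return shared, per_tu


# Made-up example input: two TUs that agree on `int`, disagree on
# `struct foo`, and each have one private type.
EXAMPLE = {
    "a.c": {"int": "int32", "struct foo": "{int x;}", "struct a_only": "{int a;}"},
    "b.c": {"int": "int32", "struct foo": "{long x;}", "struct b_only": "{int b;}"},
}

if __name__ == "__main__":
    for flag in (True, False):
        shared, per_tu = split_types(EXAMPLE, share_unique=flag)
        print(flag, sorted(shared), {tu: sorted(t) for tu, t in per_tu.items()})
```

With the first strategy the private types end up in the shared parent alongside `int`; with the second only `int` stays there, which is the sense in which the parent gets a lot smaller.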

... also of course we'd need clang support for CTF generation and gold and lld support for .ctf section merging *and* C++ support for CTF before Chromium or Firefox would become likely users. That's some way off, I think.


The Compact C Type Format in the GNU toolchain

Posted Aug 7, 2019 1:05 UTC (Wed) by roc (subscriber, #30627)

Those are certainly encouraging results.

