Meaningful comparison of CTF and DWARF

Posted Aug 6, 2019 21:32 UTC (Tue) by roc (subscriber, #30627)
Parent article: The Compact C Type Format in the GNU toolchain

It would be helpful for someone to explain why CTF is more compact than DWARF at representing the same information.

The article touches on that, but not in a way that makes sense to me:
> a DWARF entry is essentially a program in its own right that can be run to generate the needed information. That makes DWARF flexible, but it's also complicated and verbose;

DWARF is complicated and verbose, but DWARF location expressions (the Turing-complete part) are not the problem, because in practice almost all of them are very simple. Also, most location expressions describe the locations of local variables, which CTF completely ignores.

Typically a lot of DWARF data is present to support stack walking of code without frame pointers, and to walk stacks of inlined functions. (Given CTF backtrace support is "still being designed", that makes size comparisons currently not very meaningful.) A very important question is, if you configured a compiler to emit just enough DWARF to represent what CTF represents today, or after backtrace support is implemented in CTF, how would the DWARF compare to CTF in size, and what would be the cause(s) of that?

The article also says
> The CTF data is naturally smaller, but the format also includes compression to reduce the size requirements further.

DWARF supports compression too --- in at least three different ways, unfortunately --- so no advantage for CTF there.

> The linker then has to take these sections and merge them into a larger section, removing any duplicate information. There is a potential problem, though, in that different object files may define conflicting objects using the same names.

This could be a relevant difference. Today's linkers aren't merging duplicate DWARF information across object files. However, compilers use tricks to try to reduce the duplicate information they emit into object files, and there are tools like dwz to merge duplicate DWARF information after linking. So comparing CTF sizes vs DWARF sizes should include a comparison with dwz-processed DWARF. (Also note that there has been significant resistance in LLD at least to proposals to implement DWARF deduplication in the linker, so it's not a given that this approach will actually become ubiquitous.)

From reading https://gcc.gnu.org/ml/gcc-patches/2019-06/msg01711.html I see that CTF can reference strings in the main ELF symbol table. That's a good idea but it's not clear it's responsible for big size wins (and DWARF could easily be extended to do that).

The fact that CTF doesn't support C++ yet is a problem. In my experience DWARF size for C++ and Rust are a much bigger problem than for C because of their much more complex type systems and much larger mangled identifiers, so comparisons of DWARF vs CTF for those languages are more important. Also those languages may well require a significant increase in complexity of CTF, so it seems to me it would make sense to make sure C++ is supported before promulgating a version of CTF that tools are expected to keep compatibility with.

For C++ and other reasons the dream that CTF will be small enough to be "always present" may founder. The relevant comparison is the size of CTF vs the size of the rest of the ELF binary; it seems likely to me that there will always be people who see an advantage in stripping all kinds of debug info. Identifying isomorphic type subgraphs for merging is likely to slow down the link, and that too will encourage people to disable CTF.

Don't get me wrong, I think DWARF is a terrible format that could be greatly improved in compactness and in other ways, and it would be great if CTF can be that, but as a debugger developer the last thing I want is an XKCD 927 situation. In particular it seems that if we get an ecosystem of tools that depend on CTF, then for tasks like debugging with local variable values which are explicitly outside the scope of CTF, we'll have to build and link binaries with both DWARF and CTF, which doesn't sound good.

I apologize if the issues I've raised are answered in online documentation; I searched, but other than ctfout.h I couldn't find the talk or other relevant information.

Meaningful comparison of CTF and DWARF

Posted Aug 6, 2019 23:39 UTC (Tue) by nix (subscriber, #2304) [Link] (10 responses)

What an excellent comment! This sort of thing is why I subscribe to LWN :)

A very important question is, if you configured a compiler to emit just enough DWARF to represent what CTF represents today, or after backtrace support is implemented in CTF, how would the DWARF compare to CTF in size, and what would be the cause(s) of that?

That's a very interesting question. I'm not aware of any way to configure or indeed modify any extant compiler to do that without fairly major surgery, alas :(

DWARF supports compression too --- in at least three different ways, unfortunately --- so no advantage for CTF there.

Actually, there is. CTF has been designed with compression in mind from the start, and a number of its design decisions are aimed to increase compressibility: e.g. the almost complete eschewing of self-identifiers (in favour of implied identifiers via array offsets) and self-description. Self-identifiers are killers for compression ratios because they are all unique, and that means incompressible.

I see that CTF can reference strings in the main ELF symbol table. That's a good idea but it's not clear it's responsible for big size wins

The size wins are mostly that function and data symbols don't need their names to be repeated. The remaining space in CTF string tables is mostly full of structure member names, which do indeed rarely appear in the ELF strtab. We save a bit more space by sorting the strtab. I tried saving more space by slicing CTF strtab entries up on case changes and underscores (combining them back where every user of any of those chunks used all of them) and pointing to those chunks of strings via a "strtab chunks table". It saved about 50--80% of the strtab space... when uncompressed. When compressed, it cost about 10%. So I ripped it out again. Modern compressors are good. But it is possible that when really long identifiers are involved, as in C++, this chunking trick may work better. I still have the code and can try that out when the time comes. Perhaps we can save space by compressing all of the CTF except for the table of chunk indexes, so as not to pollute the compression dictionary...

Identifying isomorphic type subgraphs for merging is likely to slow down the link

The deduplicator I had before was very slow, but I think I now have an algorithm that is O(n) in the number of nodes, i.e. not too bad. (But that remains to be seen, since it's not written yet and the algorithm may be broken). We will soon be able to use multiple threads in GNU ld too, which should help as well (the algorithm is parallelizable).

In particular it seems that if we get an ecosystem of tools that depend on CTF, then for tasks like debugging with local variable values which are explicitly outside the scope of CTF, we'll have to build and link binaries with both DWARF and CTF, which doesn't sound good.

It is very much our intention to collaborate with the rest of the debugging ecosystem. Right now CTF isn't interacting with any of these things because it is very new, consists entirely of sharp edges and it is evolving rapidly. But there has already been internal discussion about where to put CTF in the future once it settles down a bit, and DWARF has obviously come up as a useful umbrella project. This LWN article is arguably part of that: increasing awareness...

We could perhaps have CTF augment DWARF in some future spec revision, so DWARF could optionally drop some of its type representations when CTF is present and use CTF's instead, much as CTF drops some of its strings and uses the ELF strtab's instead: having DWARF augment CTF is probably impractical because it is usually not present. (A future version of CTF, obviously. This version is definitely transitional: as usual, when you start to upstream things, you get really good review comments from all sorts of people, and those trigger changes...)

Meaningful comparison of CTF and DWARF

Posted Aug 7, 2019 0:03 UTC (Wed) by roc (subscriber, #30627) [Link] (9 responses)

> That's a very interesting question. I'm not aware of any way to configure or indeed modify any extant compiler to do that without fairly major surgery, alas :(

I think it would not be very difficult to prototype a DWARF post-processing tool to do this. (If it was very difficult, that would be interesting!)

I really think this ought to be done for size comparisons to be meaningful. If it were possible to make some simple changes to DWARF to reduce its space usage, and configure DWARF producers emit a subset of DWARF that covers the functionality of CTF with not much more size, I think there would be a compelling argument to just do that instead of CTF.

> CTF has been designed with compression in mind from the start, and a number of its design decisions are aimed to increase compressibility: e.g. the almost complete eschewing of self-identifiers

Can you identify the specific DWARF design decisions that make it compress poorly?

> The deduplicator I had before was very slow, but I think I now have an algorithm that is O(n) in the number of nodes, i.e. not too bad.

Does it handle recursive data structures? I actually implemented this for a C static analysis tool many many years ago, and it was pretty hard to find all possible duplicates efficiently. To find all possible duplicates, you pretty much have to start by optimistically assuming all types with the same name are equivalent and then split those equivalence classes when you find they aren't, cascading the splits backwards along your type membership graph.

You can of course give up on finding all possible duplicates, but that would raise the question of how well your deduplicator works on C++ and complex applications.

Of course, a deduplicator for DWARF would face the same issues. I wonder what dwz does...

> We will soon be able to use multiple threads in GNU ld too, which should help as well (the algorithm is parallelizable).

Good. Though, when measuring overhead, it would be more meaningful in the context of a linker that's actually fast, e.g. LLD rather than GNU ld.

> We could perhaps have CTF augment DWARF in some future spec revision, so DWARF could optionally drop some of its type representations when CTF is present and use CTF's instead

That sounds pretty complicated unfortunately.

Meaningful comparison of CTF and DWARF

Posted Aug 7, 2019 12:33 UTC (Wed) by nix (subscriber, #2304) [Link] (8 responses)

> I really think this ought to be done for size comparisons to be meaningful.

I agree! Anyone want to try? :)

> If it were possible to make some simple changes to DWARF to reduce its space usage, and configure DWARF producers emit a subset of DWARF that covers the functionality of CTF with not much more size, I think there would be a compelling argument to just do that instead of CTF.

I'd think we'd need simpler ways to read it as well: better libraries that do what libctf does, but for DWARF. libdwfl is very nice but it doesn't give you a view of the C language's type system, or any language's type system: it gives you a view of DWARF, and you have to dig around in the DWARF spec yourself to figure out how to interpret that. Even for the simple cases with no interpreter this is fiddly and often has spec-version variation (the various tools I wrote never gained DWARF 4 type signature support, for instance, because it was just too much work that gained me *nothing*, just running to stay in place).

And when you get to questions that honestly in most languages usually have really simple answers like "what is the offset of this structure member?" either you suddenly need a full DWARF exprloc interpreter, which libdwfl did *not* provide last I saw, or (far more commonly) one just looks to see what one's personal compiler usually emits for offsets and jams in an expectation that that is there. So now you have a tool that only works for some DWARF versions and only for the output of some compilers. And that is the *norm* for programs trying to use DWARF to do type introspection that aren't dedicated full-featured debuggers written by people who've been immersed in this stuff for years. I can't imagine doing it in under a thousand lines, even with libdwfl's help.

It should not be this hard! The actual norm is that people don't bother trying to introspect into the C type system because it's too difficult if they have to use DWARF, and resort to preprocessor kludges or shared headers or out-of-band communication methods, none of which do remotely as good a job as a nice compact easy-to-read type description would do. CTF is not *just* about compactness. It's about letting normal mortals get useful results out, too.

(And, yes, obviously adding toolchain support for CTF means that CTF only works for the output of compilers with that support, too. It is possible to convert DWARF to CTF as well, and there are multiple tools that do it: but this does mean that you need to generate DWARF first, and if you don't otherwise want it this slows down the link really quite a lot, *on top* of any time hit you take to deduplicate the resulting CTF.)

Meaningful comparison of CTF and DWARF

Posted Aug 7, 2019 22:44 UTC (Wed) by roc (subscriber, #30627) [Link] (3 responses)

> And when you get to questions that honestly in most languages usually have really simple answers like "what is the offset of this structure member?"

I agree implementing a full DWARF location expression interpreter for field offsets is sad. AFAIK the only case in practice where non-constant location expressions are used is C++ virtual base classes. That raises the question: will CTF support C++ virtual base classes? And if so, then how?

If I were you I would simply add C++ virtual base classes as a feature to CTF and ensure CTF encodes enough information for tools to find the VBC offsets at runtime.

But of course we could also effectively retrofit that approach to DWARF by adding a requirement that member location expressions have restricted forms, and ensuring DWARF producers follow that requirement.

> So now you have a tool that only works for some DWARF versions and only for the output of some compilers

It's certainly a problem if consumers independently come up with their own definitions of the allowed forms of member locations. If consumers work together to standardize a single definition, it's no worse than defining a new format or a new version of an existing format. In fact, it's better, assuming you standardize a definition that matches what tools already do.

Meaningful comparison of CTF and DWARF

Posted Aug 9, 2019 11:40 UTC (Fri) by nix (subscriber, #2304) [Link] (2 responses)

> I agree implementing a full DWARF location expression interpreter for field offsets is sad. AFAIK the only case in practice where non-constant location expressions are used is C++ virtual base classes. That raises the question: will CTF support C++ virtual base classes? And if so, then how?

The killer here for me is 'in practice'. That, again, means you are depending on what compilers happen to do now, with no guarantee at all that this will remain true in the future.

CTF doesn't support C++ yet. When it does, I won't consider it done until it supports all of it, even the tricky parts. The one bit I am wondering if I can somehow avoid having to do is ctf_lookup_by_name(), which takes a sort of crippled cut-down C identifier and parses it to yield the terminal type node you're interested in. Doing that with C++ name lookup rules seems likely to be a flaming nightmare... but then maybe the rules don't need to be perfect: they're already very far from perfect even for C and nobody ever noticed in the 17-year lifespan of libctf so far. A lot of programs seem likely to do a lot of type node navigation without wanting libctf to help with parsing anyway, taking the names of individual types and parsing them themselves rather than trusting libctf to do it.

Meaningful comparison of CTF and DWARF

Posted Aug 9, 2019 12:03 UTC (Fri) by roc (subscriber, #30627) [Link] (1 responses)

> That, again, means you are depending on what compilers happen to do now, with no guarantee at all that this will remain true in the future.

We could guarantee this will remain true in the future by promulgating and enforcing a rule saying it should. This is no different in principle to promulgating rules about how CTF should be implemented in tools. In fact it's easier, because current code doesn't have to change.

> When it does, I won't consider it done until it supports all of it, even the tricky parts.

Then I think you ought to do it now, because it will impact the design of CTF quite a bit.

For example, the actual worst thing about virtual base classes is that "what is the offset of this field within this type?" is no longer statically knowable. It depends on inspecting the run-time value of the object. This will have significant impact on your library API.

Meaningful comparison of CTF and DWARF

Posted Aug 9, 2019 22:47 UTC (Fri) by nix (subscriber, #2304) [Link]

I think getting a spec working, getting the linker stuff in etc is far more important, and much faster to do, than the enormous job that is adding C++ support, sorry. It's on the todo list, but it is definitely *not* at the top.

(The library API will need to grow for C++ support in any case.)

Meaningful comparison of CTF and DWARF

Posted Aug 7, 2019 23:00 UTC (Wed) by roc (subscriber, #30627) [Link] (3 responses)

> I agree! Anyone want to try? :)

I would, but no-one's going to pay me to do it, so I probably won't. It's also perhaps not the right time for someone to do it, since CTF is a moving target with backtraces and C++ etc not being supported yet.

I'm quite worried about how this situation is going to evolve over time, especially for situations where full debuginfo is needed. You have said that somehow DWARF could/should be extended so that it can reuse the information contained in the CTF section. That sounds hard and complicated. It also sounds brittle for the future, because if successful, the scope of CTF is likely to increase. Users will find that CTF fits their needs except they just need one more bit of information that's present in DWARF and not CTF, and you will face enormous pressure to add that one little feature, because that's so much easier for those CTF users than switching to full DWARF. As the scope of CTF grows, the overlap with DWARF will grow and the size penalty will grow. I guess you could evolve the "DWARF reusing CTF" spec in lockstep but that would be a big burden on the ecosystem and likely won't happen since the CTF maintainers presumably don't much care about DWARF.

Meaningful comparison of CTF and DWARF

Posted Aug 9, 2019 11:45 UTC (Fri) by nix (subscriber, #2304) [Link] (2 responses)

What you say is true: scope creep is a problem in all software systems. It's just something to watch out for, I suppose. But note that CTF has an internal section layout system, so even if it does creep like hell there won't be a size penalty if the extra bits are in sections which are optional (there are many optional sections already). There would still be a complexity penalty, though, which matters a great deal to me.

I do want to keep CTF tightly focused on type introspection. The backtrace stuff is going to be in an optional CTF section and will only be generated if you ask for it: it hasn't had any implications for other, existing parts of CTF in any of my various half-designs so far. It just uses them. It could potentially be an entirely separate library, generating a separate ELF section, though given how painful it is to add more ELF sections I'm very strongly inclined to keep it inside the .ctf ELF section in a new CTF section instead, so the ELF machinery need not know it is there.

> and likely won't happen since the CTF maintainers presumably don't much care about DWARF

Well, I *am* the CTF maintainer right now and I do care about DWARF. :) It's hard to avoid caring about it since if you want to do debugging of anything with, say, scopes in it, DWARF is the right thing, not CTF.

Meaningful comparison of CTF and DWARF

Posted Aug 14, 2019 18:35 UTC (Wed) by khim (subscriber, #9252) [Link] (1 responses)

> I do want to keep CTF tightly focused on type introspection.

BTW, if it's about introspection... would it be possible to know if a given structure should be passed on stack or in registers? I mean something like this: compare foo, bar and baz. Especially bar and baz. These types have more-or-less identical properties... yet one is passed (and returned!) using registers and other is passed (and returned!) on stack.

Meaningful comparison of CTF and DWARF

Posted Aug 14, 2019 19:18 UTC (Wed) by excors (subscriber, #95769) [Link]

Looks like that's defined by http://itanium-cxx-abi.github.io/cxx-abi/abi.html#value-p... ? Working out whether a type is non-trivial seems non-trivial, though presumably it's easy for the C++ compiler, so it sounds like useful information to expose.