The Compact C Type Format in the GNU toolchain
Posted Aug 6, 2019 23:02 UTC (Tue) by roc (subscriber, #30627)
In reply to: The Compact C Type Format in the GNU toolchain by nix
Parent article: The Compact C Type Format in the GNU toolchain
OTOH it will be a long time, if ever, before almost every binary requires CTF to be present. In the meantime people will want to strip CTF and you won't be able to stop tools adding support for that and people configuring their builds to do so by default.
Posted Aug 6, 2019 23:54 UTC (Tue)
by nix (subscriber, #2304)
[Link] (34 responses)
Fundamentally, there's a *reason* strip(1) doesn't strip CTF by default: it should hardly save any space and it rips out something that offers facilities not otherwise available. The format will be useless if it's stripped out routinely, and it should be small enough that *most* people don't bother. People who need to hunt for every last byte and are willing to use obscure options to do so probably both have a reason and are used to coping with the resulting breakage. (It will certainly break Objective Caml programs to strip out non-loaded sections that you don't recognise, for instance.)
(However... if you really want separated debugging information, we *do* have a CTF archive format that is specifically intended for sticking big piles of CTF into for later mmapping out. If people really want separated debug info, we could in theory arrange to dump all the CTF on the system into a .ctfa, and remove items from the CTFA on package uninstallation, and have libctf know to look there to pick it up -- or just look in /usr/lib/ctf/ -- a tree like /usr/lib/debug/ -- or whatever. My worry is that if you did that, people would soon say oh let's put it in a separate package! And now it's never present and it's useless. Having CTF in a separate file is not really a problem, though it doesn't buy you anything that I can see. Having it in a separate *package* that is not installed when the package is... that's a problem. That's what makes life so hard for systemwide debuggers now: the DWARF is never there.)
Posted Aug 7, 2019 0:11 UTC (Wed)
by roc (subscriber, #30627)
[Link] (33 responses)
> That's what makes life so hard for systemwide debuggers now: the DWARF is never there.
It's not super hard to have debuggers automatically fetch and use system debuginfo packages. Pernosco does this. Even Fedora's gdb gets you most of the way there by telling you the command you need to run. We don't need new formats to solve this particular problem. (OK, to tell the truth, there is one other problem that needs to be fixed: you need an archive of all debuginfo for all versions of packages so you can debug the non-latest version of a package. We've built that for Pernosco too.)
Posted Aug 7, 2019 0:30 UTC (Wed)
by nix (subscriber, #2304)
[Link] (15 responses)
Posted Aug 7, 2019 0:40 UTC (Wed)
by roc (subscriber, #30627)
[Link] (12 responses)
However, you are pretending it is *needed* when for most binaries it currently is not.
Posted Aug 7, 2019 10:52 UTC (Wed)
by nix (subscriber, #2304)
[Link] (11 responses)
I have... painful experiences here. Back when we were converting DWARF to CTF at kernel link time and linking it into kernel modules, we had to actually *hack RPM at build time*, via PATH shuffling and patching of /usr/lib/rpm/find-debuginfo.sh, just to stop RPM stripping out all non-loaded sections on the grounds that they must be unnecessary. It did that no matter what size they were or whether RPM had ever seen them before, and that included ripping out all the CTF that we'd just gone to some lengths to link in.
To me that just seems like unwise behaviour on the part of a packaging system. RPM didn't know what that section was: why was it removing it? It might have been necessary. It *was* necessary for what we were doing, and RPM just removed it without so much as a by-your-leave. So... guess why strip(1) doesn't remove CTF? I don't want anyone who's actually using CTF to have to go through anything like that again just so they can package their own software without it being randomly broken by the packaging system.
Posted Aug 7, 2019 11:45 UTC (Wed)
by roc (subscriber, #30627)
[Link] (1 responses)
But it also seems like that would apply to DWARF debuginfo too. Why ask the compiler to generate DWARF if you're going to strip it out? Yet here we are.
Posted Aug 7, 2019 12:19 UTC (Wed)
by nix (subscriber, #2304)
[Link]
I'd guess that debuginfo, in a world where debuggers are a special thing that is explicitly run by human beings when things go wrong, is something huge that is only *needed* when things go wrong, when there will be a human around who can install the necessary big packages. But you never want to compile something without any debug info for use in a production environment because if things go wrong you then have no debuginfo to use to diagnose it! So -g -O2 has become a sort of de facto standard for CFLAGS.
Of course the "you only need it when things go wrong" attitude has now been retarding the development of always-on systemwide debugging tools for something like fifteen years; but nobody wants to add extensive debuginfo shrinking machinery because it will slow down the link for something that is only rarely needed. It seems to me that the only *reason* debuginfo is only rarely needed is that tools that use debuginfo routinely cannot be developed because it can never be relied on to be present, because it is too big... it's a vicious circle.
Posted Aug 7, 2019 12:19 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (8 responses)
> To me that just seems like unwise behaviour on the part of a packaging system. RPM didn't know what that section was: why was it removing it? It might have been necessary. It *was* necessary for what we were doing, and RPM just removed it without so much as a by-your-leave.
I might be responsible for that. But it is simply that RPM follows normal ELF rules for stripping [*] (unless you define macros that pass additional arguments to find-debuginfo.sh [**]). In general any non-allocated section can be stripped away (or put into a separate .debug file), because being non-allocated simply means that the section isn't needed at runtime.
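The stripping rule described here can be sketched in a few lines. This is a deliberately simplified model, not what eu-strip or find-debuginfo.sh actually implement: real tools also keep bookkeeping sections such as the section name table and track references between sections. The function name, the `keep` parameter, and the section list are hypothetical; only the `SHF_ALLOC` flag value comes from the ELF specification.

```python
SHF_ALLOC = 0x2  # ELF section flag: the section occupies memory at run time


def sections_after_strip(sections, keep=()):
    """Return the names of sections that survive a simplified strip.

    sections: iterable of (name, sh_flags) pairs.
    keep: names to preserve even though they are not allocated.
    """
    kept = []
    for name, flags in sections:
        if flags & SHF_ALLOC or name in keep:
            kept.append(name)
    return kept


# Illustrative section list; the flags are plausible values, not read from a file.
example = [
    (".text", 0x6),        # SHF_ALLOC | SHF_EXECINSTR
    (".data", 0x3),        # SHF_WRITE | SHF_ALLOC
    (".debug_info", 0x0),  # not allocated
    (".ctf", 0x0),         # not allocated
]

print(sections_after_strip(example))                 # ['.text', '.data']
print(sections_after_strip(example, keep={".ctf"}))  # ['.text', '.data', '.ctf']
```

Under this model a non-allocated section survives only if it is named explicitly, which is the role a keep-section style option plays.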
> So... guess why strip(1) doesn't remove CTF? I don't want anyone who's actually using CTF to have to go through anything like that again just so they can package their own software without it being randomly broken by the packaging system.
Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
But I do like CTF and I do hope it will become the default one day. Not to replace DWARF (it should be a companion to that), but to replace .gnu_debugdata [***]. Which is used by various tools now to have an "extra symbol table".
So let's talk about how to integrate this with RPM/elfutils/systemtap/etc. Maybe on the elfutils and/or binutils mailing list?
[*] http://www.linker-aliens.org/blogs/ali/entry/how_to_strip...
Posted Aug 7, 2019 21:48 UTC (Wed)
by nix (subscriber, #2304)
[Link]
Posted Aug 7, 2019 21:55 UTC (Wed)
by nix (subscriber, #2304)
[Link] (6 responses)
(This is not the only tool missing support right now, of course: gold can't link CTF sections either. But I plan to add that and I did also plan to submit changes to elfutils to stop eu-strip throwing the section out. I'm rather unhappy to discover that this is pre-emptively rejected.)
(To deal with the problems of dynamic symbol tables getting stripped out of binaries, Solaris defined .ldynsym, which appears to be much what .gnu_debugdata is, only it's just a symbol table rather than a whole LZMA-compressed ELF object containing a symbol table.)
Plus of course there's not much chance of CTF becoming the default if you insist on stripping it out of executables so nothing that needs it can ever find it. ;)
Posted Aug 7, 2019 22:28 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (5 responses)
It shouldn't be hard to keep it, if a package or distro decides that is the thing they want.
So all we need to do is define some macro that packages can set for find-debuginfo.sh to do "the right thing" and then a package or distro can decide to make that the default.
> I did also plan to submit changes to elfutils to stop eu-strip throwing the section out. I'm rather unhappy to discover that this is pre-emptively rejected.
That is not my intention. Note that I am not a native English speaker. My apologies if I seem to come over as negative.
> CTF does not contain a symbol table, since that would be a waste of space since ELF already has one. Instead, it relies on the ELF symtab. Its function and data object sections are 1:1 ordered in the same order as the ELF symtab (basically, you traverse the ELF symtab and whenever you pass another function symbol, you match it to a function info section entry: whenever you pass another data symbol, you match it to another data object section entry). This saves quite a lot of space: data object section entries in particular are only four bytes each (one type ID).
OK. So how do you deal with .symtab being stripped away by default then?
> Plus of course there's not much chance of CTF becoming the default if you insist on stripping it out of executables so nothing that needs it can ever find it. ;)
Really, I don't understand why you think that is my intention. I might not fully understand all details yet. But I am actually interested in making CTF into something useful.
Will you be at the GNU Tools Cauldron in Montréal, Canada next week?
Posted Aug 7, 2019 22:32 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (1 responses)
Sorry, next month. (September 12 to 15, 2019)
Posted Aug 9, 2019 11:33 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Aug 9, 2019 0:51 UTC (Fri)
by himi (subscriber, #340)
[Link] (1 responses)
Posted Aug 9, 2019 11:21 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Aug 9, 2019 11:32 UTC (Fri)
by nix (subscriber, #2304)
[Link]
... now why didn't I think of that? Probably because back when I was doing this I had *multiple* sections to deal with, with names like .ctf.*, so telling other things what the sections were called would have been quite painful. We have our own internal container format now precisely to avoid this sort of problem, so we could use this quite well.
> That is not my intention. Note that I am not a native English speaker. My apologies if I seem to come over as negative.
Sorry, I completely misparsed your sentence! (See my comment a couple of hops down). Phew, that had me panicking a bit for a moment. :)
> Now that we have it maybe we should make .symtab a compressed section?
Compressed sections in GNU ld at least seem a bit ad hoc. I think you'd need to do quite a lot of work to bfd_elf_final_link and environs to make it possible to have allocated sections that other sections depend upon that are also compressed: every existing section with content that changes after layout time (earlier in bfd_elf_final_link than strtab / symtab layout time) either has unchanging size or is non-allocated (and even there, there are fairly dreadful hacks around .zdebug, which I'm afraid I made a little bit worse with .ctf :) ).
I'm not sure .symtab would compress terribly well, either -- it has a lot of fields with "ID-like" content that only appears once, and thus compresses rather badly. (CTF goes to some lengths to avoid content like this for just that reason). The strtab would certainly compress better, but I can see why you don't compress it -- you don't want to impose decompression costs on the whole strtab on every execution.
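The intuition that one-off "ID-like" values compress badly while name strings compress well can be checked with a small experiment. The data below is entirely synthetic (seeded random 64-bit values and made-up symbol names), so it only illustrates the general effect and says nothing about the ratios a real .symtab or .strtab would reach.

```python
import random
import struct
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

# 2000 unrelated 64-bit values standing in for "ID-like" fields.
id_like = b"".join(struct.pack("<Q", rng.getrandbits(64)) for _ in range(2000))

# 2000 NUL-terminated names with shared prefixes, standing in for a string table.
names = b"".join(b"subsystem_driver_handle_event_%d\0" % i for i in range(2000))


def ratio(data):
    """Compressed size divided by original size (smaller is better)."""
    return len(zlib.compress(data, 9)) / len(data)


print(f"id-like fields: {ratio(id_like):.2f}")
print(f"name strings:   {ratio(names):.2f}")
```

The gap between the two ratios is the point: values that each appear once give the compressor nothing to reuse.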
> Really, I don't understand why you think that is my intention.
A really terrible misparsing of an ambiguous sentence on my part, and you know how hard it is to find alternate meanings of a sentence when you've already fixated on one that is dreadful :) Sorry!
> Will you be at the GNU Tools Cauldron in Montréal, Canada next week?
Alas, I'm going to LPC instead. I'm spending the next two weeks listening to chamber music in the North York moors...
Posted Aug 7, 2019 3:44 UTC (Wed)
by josh (subscriber, #17465)
[Link] (1 responses)
Posted Aug 7, 2019 10:19 UTC (Wed)
by khim (subscriber, #9252)
[Link]
Posted Aug 7, 2019 15:53 UTC (Wed)
by luto (guest, #39314)
[Link] (16 responses)
I would love for the kernel to be able to drop something in /usr/lib/debuginfo.d/find_vdso_debuginfo.sh, for example.
Posted Aug 7, 2019 15:59 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (15 responses)
https://sourceware.org/git/?p=elfutils.git;a=tree;f=dbgse...
Posted Aug 7, 2019 16:34 UTC (Wed)
by luto (guest, #39314)
[Link] (3 responses)
Posted Aug 7, 2019 16:42 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (2 responses)
% dbgserver -F /path/to/base/directory
should find executables / debuginfo / corresponding sources
Posted Aug 7, 2019 16:53 UTC (Wed)
by luto (guest, #39314)
[Link] (1 responses)
As an admin, I would much rather *not* have a systemwide daemon for this, since that implies a path by which one user can attempt to attack another user or the system as a whole. I'd rather each user could, on demand, start up their own instance of whatever libraries and programs are needed to make debugging work. Nothing here should require any form of privilege.
Posted Aug 7, 2019 17:47 UTC (Wed)
by fuhchee (guest, #40059)
[Link]
Any of the above.
>As an admin, I would much rather *not* have a systemwide daemon for this, since that implies a path by which one user can attempt to attack another user or the system as a whole.
Fair enough, though DoS is perhaps the worst of the possible attacks.
> Nothing here should require any form of privilege.
Right.
Posted Aug 7, 2019 18:05 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (6 responses)
If it were possible to easily "register" debuginfo files created through Jenkins or some other build service without having to turn them into distro packages, then allow tools (i.e., GDB) to download them more or less invisibly when needed, that would be really nice.
Also is this limited to just debuginfo files?
The other big problem we have is cores being generated on remote systems which are using system libraries other than the local ones: in this situation we need to obtain the remote system's libc.so and other necessary system libraries. It would be really, really nice if we could register shared libraries from different systems, perhaps indexed via a hash of some kind, then have GDB automatically download them as well.
Of course, before this can be done we need to ensure that the core file contains enough information about the shared libraries to perform the lookup, which I doubt it does today, so this requires more work in other places... however it would be good if the design of this service could be extended in this way in the future if/when it becomes feasible. For example in our system we use the Google coredumper library rather than the kernel to dump cores, and this allows us to add "notes" into the generated core file, so we could take advantage of this without Linux kernel support.
Posted Aug 7, 2019 19:18 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (5 responses)
It seems like we're all thinking roughly alike. Exciting times!
Posted Aug 7, 2019 19:41 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (4 responses)
I compile a program on my build system (I use a sysroot to ensure that it links against sufficiently older system libraries that it can run "anywhere"). I send my program out to run tests on some other system running some random distribution completely different from the one it was built on, which is using a different GNU libc, etc. Maybe Travis, or AWS, or just a local test farm.
It fails and a core is generated. To debug that core I need my program, the debuginfo from my program (if the program is stripped), the core file, and the system libraries from the system it was running on when the core is generated.
I can't see any way that a buildid compiled into my binary can be sufficient to retrieve the runtime system libraries.
Posted Aug 7, 2019 19:52 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (3 responses)
Posted Aug 7, 2019 20:22 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (2 responses)
I think it would also be helpful if the client interface provided separate lookup and download methods rather than forcing them to both be a single method (there can be a simplified "do both" method as well if wanted). I can easily imagine situations where we want to know whether a given buildid exists on the server without actually downloading it.
For example, suppose I have a suite of test servers running random environments; during test runs a core is generated. I want to know if the program under test and/or system libraries for this system already exist in the debug server or not: I just want to look them up but not download them. If they don't exist perhaps I'll include them along with the core file when I bundle up the build results. If they do exist I don't need to add them.
Or perhaps I have an automated way for the test system to upload binaries and/or system libraries that aren't already on the debug server (I understand that upload is not in scope for this project and would need some other process) but I don't want to bother uploading things that I already have so I need to be able to check.
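A split between "lookup" and "download" along the lines requested could look roughly like this. Everything here is hypothetical: the class, its method names, and the in-memory dictionary are stand-ins invented for illustration and are not the dbgserver client interface.

```python
class InMemoryDebugStore:
    """Hypothetical stand-in for a debuginfo server, keyed by build-id."""

    def __init__(self):
        self._files = {}

    def exists(self, build_id):
        """Lookup only: report whether the server has this build-id."""
        return build_id in self._files

    def fetch(self, build_id):
        """Download: return the stored bytes, or None if absent."""
        return self._files.get(build_id)

    def upload(self, build_id, data):
        """Register a file under its build-id."""
        self._files[build_id] = data


def files_to_bundle(store, local_files):
    """Pick the files a test host still needs to ship with a core dump.

    local_files maps build-id -> file contents found on the test host.
    Only a cheap existence check is made; nothing is downloaded.
    """
    return {bid: data for bid, data in local_files.items()
            if not store.exists(bid)}


store = InMemoryDebugStore()
store.upload("aa11", b"system libc")
needed = files_to_bundle(store, {"aa11": b"system libc", "bb22": b"my program"})
print(sorted(needed))  # ['bb22']
```

With a separate existence check, a test host can skip bundling or uploading anything the server already holds.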
A simple program that uses the client interface to look up and/or download files would be very useful, as an example if nothing else (and probably for people who would like to add scripting to systems where it's not so simple to recode them to use it).
Cheers!
Posted Aug 7, 2019 21:09 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (1 responses)
Yes.
> I think it would also be helpful if the client interface provided separate lookup and download methods
Will consider that ... though there may be better ways to service the needs you outline. Deduplication at upload time should be easy too. Re. optimizing packaging of core dumps ... not sure how much sense that makes. The core dump recipient could consult the same debuginfo servers too; or you could preemptively package all the files. Will think on it more.
> A simple program that uses the client interface to look up and/or download files would be very useful
It just appeared in the repo! We employ only the most talented psychics and keyboard monks.
Posted Aug 7, 2019 21:29 UTC (Wed)
by madscientist (subscriber, #16861)
[Link]
If you mean deduplication by the server, that's probably helpful, but it's a lot of wasted effort to upload tens or hundreds of MB of libraries, binaries, etc., only to have it tossed on the floor as duplicate. Consider a build farm with 200 systems, which are upgraded via apt-get update or whatever at random intervals so they have different system libraries, different program instances, etc... having every system upload all its files for every core even though the system libraries might only change once every few weeks or less seems like overkill.
> Re. optimizing packaging of core dumps ... not sure how much sense that makes. The core dump recipient could consult the same debuginfo servers too; or you could preemptively package all the files.
For this I wasn't thinking that the dbgserver code would do that, I was thinking about scripting that users are using with their test clients to bundle results of failures so they can be uploaded to a test server for further investigation. Our current scripting already preemptively packages all the files: what I'd like to be able to do is detect when some/all of these items are not needed and skip that to reduce the size of uploaded artifacts.
When you're talking about moving content into/out of AWS or other cloud providers, the amount of data sent over the network directly equates to $$ spent and reducing it is always welcome.
Thanks for working on this, it'll be very cool!
Posted Aug 7, 2019 22:01 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
I wonder if it can handle more than just debuginfo... musing about having libctf automatically launch dbgserver queries for missing CTF sections now -- so people can have separated CTF if they really want *and* it is as if it were always present. Best of all worlds! For that matter they can also do both -- perhaps an option at CTF generation time which automatically emits separated CTF *if* its size passes some threshold, or some percentage threshold of the total binary size, or the .text size, or something. Of course then you'd have to arrange for the dbgserver to see it, but presumably whatever method is used for separated debuginfo would work for this too.
Posted Aug 8, 2019 12:28 UTC (Thu)
by fuhchee (guest, #40059)
[Link]
It should be a small amount of extra effort to extend it sideways to a 'ctf' sibling to 'debuginfo'.
Posted Aug 7, 2019 22:26 UTC (Wed)
by thoughtpolice (subscriber, #87455)
[Link] (1 responses)
Essentially, every version of every package has a unique hash. We build a reverse mapping from the buildids of the binaries in a package to its unique hash, and upload that metadata along with the package to the package server. We then patch GDB (and elfutils) to look in a specific directory for debug info. This directory is a FUSE filesystem, and when any tool tries to look in `.build-id/...` for the debug info -- it does a query to the package server, obtaining the unique package ID containing the symbols, and transparently installs them through the package manager. It is effectively a version of Microsoft Symbol Server, which is basically what people want, from what I can tell...
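The lookup flow described above can be sketched as two small steps: turn a build-id into the `.build-id/xx/yyyy.debug` path that tools probe, and map such a path back to a package through the reverse index. The function names and the sample index contents are made up for illustration; the two-character directory split is the usual build-id layout under a debug directory. None of this is taken from the dwarffs source.

```python
DEBUG_SUFFIX = ".debug"


def build_id_debug_path(build_id_hex):
    """Relative path under a debug root for a hex build-id."""
    if len(build_id_hex) < 3:
        raise ValueError("build-id too short")
    return f".build-id/{build_id_hex[:2]}/{build_id_hex[2:]}{DEBUG_SUFFIX}"


def package_for_path(path, buildid_to_package):
    """Recover the build-id from a probed path and look up its package.

    Returns None when the path is not a build-id path or the id is unknown.
    """
    parts = path.split("/")
    if (len(parts) != 3 or parts[0] != ".build-id"
            or not parts[2].endswith(DEBUG_SUFFIX)):
        return None
    build_id = parts[1] + parts[2][:-len(DEBUG_SUFFIX)]
    return buildid_to_package.get(build_id)


# Hypothetical reverse index: build-id -> unique package hash.
index = {"abcdef0123": "pkg-hash-1"}

path = build_id_debug_path("abcdef0123")
print(path)                           # .build-id/ab/cdef0123.debug
print(package_for_path(path, index))  # pkg-hash-1
```

A FUSE layer would call something like `package_for_path` on each miss and then ask the package manager to install the result.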
Perhaps we could replace dwarffs with something like dbgserver, however. Or integrate them so there's a single UX. We could for instance, perhaps replace the client tooling with a separate "backend" for our case, and the tools can all just work around that instead...
Posted Aug 7, 2019 23:25 UTC (Wed)
by fuhchee (guest, #40059)
[Link]
> I think it's much more important for your goals to convince distro vendors to cooperate with you than to play tricks with header flags pretending CTF is not really debug info.
It's not. It's type introspection info. It's no more debug info than C++ RTTI is. Programs can perfectly well introspect their own types without being debuggers in any sense.
[**] https://gnu.wildebeest.org/blog/mjw/2017/06/30/fedora-rpm...
[***] https://fedoraproject.org/wiki/Features/MiniDebugInfo
> In general any non-allocated section can be stripped away (or put into a separate .debug file). Because that simply means that the section isn't needed at runtime.
Well, it means it isn't needed by the executable loader. I was *forced* to make libctf non-loadable by internal constraints in ld. Roughly: you cannot have an allocated section whose size is not known before bfd_elf_final_link(); the symtab and strtab are not laid out until halfway through that function; CTF needs to know the offsets of all strings in the strtab and the order of symbols; *and* it's compressed, so its content affects its size. By extension, the section may not be allocated. That doesn't mean it's not going to be used by programs at runtime. It is. (Well, assuming anyone uses it at all. :) ). They load it out of the binary as needed using BFD.
> Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
I guess that means CTF will always be stripped out of RPMs, and libctf and the CTF format by extension will be useless on RPM systems. This seems unfortunate. Is it really so hard to mark .ctf sections as not stripped? If it takes more than a couple of lines, something seems to me to be wrong.
> But I do like CTF and I do hope it will become the default one day. Not to replace DWARF (it should be a companion to that), but to replace .gnu_debugdata [***]. Which is used by various tools now to have an "extra symbol table".
That won't work, I'm afraid. CTF does not contain a symbol table, since that would be a waste of space since ELF already has one. Instead, it relies on the ELF symtab. Its function and data object sections are 1:1 ordered in the same order as the ELF symtab (basically, you traverse the ELF symtab and whenever you pass another function symbol, you match it to a function info section entry: whenever you pass another data symbol, you match it to another data object section entry). This saves quite a lot of space: data object section entries in particular are only four bytes each (one type ID).
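The in-order matching described here can be sketched as a single pass over the symbol table. This is a toy model under stated assumptions: symbols are plain (name, kind) pairs, type info is a list of opaque IDs, and the function name is invented. The real format has further rules about which symbols take part, which this sketch ignores.

```python
def match_ctf_entries(symtab, func_info, data_objects):
    """Pair symbols with type info by walking the symbol table in order.

    symtab: list of (name, kind) with kind "func", "object", or anything else.
    func_info, data_objects: entries stored in the same order as the
    function and data symbols appear in symtab.
    """
    result = {}
    next_func = 0
    next_obj = 0
    for name, kind in symtab:
        if kind == "func":
            result[name] = func_info[next_func]
            next_func += 1
        elif kind == "object":
            result[name] = data_objects[next_obj]
            next_obj += 1
        # Other symbol kinds (sections, files, ...) consume no entry.
    return result


symtab = [("main", "func"), ("counter", "object"),
          (".text", "section"), ("helper", "func")]
matched = match_ctf_entries(symtab, func_info=["F0", "F1"], data_objects=[7])
print(matched)  # {'main': 'F0', 'counter': 7, 'helper': 'F1'}
```

Because names and order come from the symbol table, the type sections need store nothing but the type information itself, which is where the space saving comes from and also why this scheme depends on the symbol table still being present.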
For example rust packages do something like:
%global _find_debuginfo_opts --keep-section .rustc
>
> (To deal with the problems of dynamic symbol tables getting stripped out of binaries, Solaris defined .ldynsym, which appears to be much what .gnu_debugdata is, only it's just a symbol table rather than a whole LZMA-compressed ELF object containing a symbol table.)
Would it be an idea to adopt the .ldynsym from Solaris?
.gnu_debugdata was defined before we had compressed ELF sections in the standard.
Now that we have them, maybe we should make .symtab a compressed section?
https://gnu.wildebeest.org/blog/mjw/2016/01/13/elf-libelf...
It might be easier to talk some ideas over in person.
https://gcc.gnu.org/wiki/cauldron2019
> Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
This can be read as meaning "no version of eu-strip will ever have the special magic", rather than what I believe you meant: "any eu-strip you find in the world right now will not have the necessary special magic".
Hm... we must be talking about something different. Let me be more clear.
I see. So dbgserver_find_executable() is intended to be used with shared libs as well? Or is this part not quite complete?