The Compact C Type Format in the GNU toolchain
Posted Aug 6, 2019 19:25 UTC (Tue) by clugstj (subscriber, #4020)
Parent article: The Compact C Type Format in the GNU toolchain
Posted Aug 6, 2019 19:48 UTC (Tue)
by nix (subscriber, #2304)
[Link]
There is certainly nothing fundamentally *stopping* us growing support for more languages.
Posted Aug 6, 2019 20:37 UTC (Tue)
by wahern (subscriber, #37304)
[Link] (46 responses)
What we need is a critical mass of CTF annotated binaries such that a few projects can make the leap of depending on automagic, type-safe FFI. Then it becomes self-reinforcing. Once CTF is reliably pervasive, who knows what could come next in terms of support for high-order type systems. It's taking that first step that is critical.
If I had my way I'd make built-in DWARF annotations mandatory, but economy of space concerns always seem to win the day even though engineers spend countless hours, days, and even years of time struggling to diagnose and debug production issues because of the lack of annotations.[1] I see nothing but upsides with CTF.
[1] Yes, you can package DWARF annotations on the side, but it's complicated and there's too much friction involved, especially with all the new half-baked build systems out there. Even when projects are capable of doing it they don't, because people systematically underestimate the costs of missing debug symbols.
Posted Aug 6, 2019 21:24 UTC (Tue)
by nix (subscriber, #2304)
[Link] (45 responses)
btw, in re the sizes above: the 1.6GiB -> 7MiB figure is for the old non-ld-toolchain deduplicator: the deduplicator in the linker patches above doesn't really deserve the name because it does no cross-TU deduplication whatsoever. A quick check with the GNU ld patches I have in preparation now (still with no deduplicator) shows GNU ld itself clocking in at:
So with a dreadful deduplicator that I have spent literally no effort on at all, the CTF is a bit smaller than the .text, and perhaps 10% the size of the DWARF. It will only go down as the deduplicator is written, as the file format improves and as better compressors are added (lzma support seems likely, since binutils can already use it for .gnu_debugdata). If CTF ends up adding more than 1% to the size of executables once all this is done I will be quite surprised. Throwing the old kernel-type-focused deduplicator at this file produces CTF 58207 bytes long, already a radical reduction. I expect the ld deduplicator, once I write it, to do a better job.
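The size comparison above can be checked against the per-section listing for GNU ld quoted elsewhere in this thread. The following is only a worked-arithmetic sketch: the byte counts are copied from that listing, while the helper names (`dwarf_total`, `ratio`) are made up for illustration.

```python
# Section sizes in bytes, copied from the listing for GNU ld quoted in
# this thread. Only .text, the DWARF sections and .ctf are needed here.
SECTION_SIZES = {
    ".text": 289042,
    ".debug_aranges": 2320,
    ".debug_info": 1732570,
    ".debug_abbrev": 57088,
    ".debug_line": 241926,
    ".debug_str": 138861,
    ".debug_loc": 563108,
    ".debug_ranges": 50320,
    ".ctf": 180737,
}

# Size quoted above after running the old kernel-type-focused deduplicator.
DEDUPLICATED_CTF = 58207


def dwarf_total(sizes):
    """Sum every section whose name starts with .debug_."""
    return sum(size for name, size in sizes.items()
               if name.startswith(".debug_"))


def ratio(part, whole):
    """Return part / whole as a plain fraction."""
    return part / whole


if __name__ == "__main__":
    dwarf = dwarf_total(SECTION_SIZES)
    ctf = SECTION_SIZES[".ctf"]
    text = SECTION_SIZES[".text"]
    print(f"DWARF total:        {dwarf} bytes")
    print(f"CTF / .text:        {ratio(ctf, text):.1%}")
    print(f"CTF / DWARF:        {ratio(ctf, dwarf):.1%}")
    print(f"deduplicated / CTF: {ratio(DEDUPLICATED_CTF, ctf):.1%}")
```

On these numbers the undeduplicated CTF comes to roughly 63% of .text and about 6.5% of the DWARF, and the 58207-byte figure is about a third of the undeduplicated size.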
Posted Aug 6, 2019 21:42 UTC (Tue)
by josh (subscriber, #17465)
[Link] (6 responses)
But meanwhile, I don't look forward to figuring out a whole new set of incantations to strip it out, for applications and systems that *do* have real space considerations to deal with. I'd still like to have support for "separate debug symbols", and then have the option of much smaller debug symbols.
Posted Aug 6, 2019 22:53 UTC (Tue)
by nix (subscriber, #2304)
[Link] (5 responses)
If people start "saving space" by forcibly stripping this section out, it's useless. It's meant to become as reliably present as, say, the .dynsym section is today. Again, this is *not* debugging information: it's *introspection* information. It's not just meant to be used to find problems in programs that have gone wrong: it's meant for C programs to be able to see their own types, and the types of programs they are interacting with, at runtime, as part of their normal operation. That's not just something that's useful for debugging.
Posted Aug 6, 2019 23:05 UTC (Tue)
by roc (subscriber, #30627)
[Link] (1 responses)
Posted Aug 6, 2019 23:42 UTC (Tue)
by nix (subscriber, #2304)
[Link]
Posted Aug 6, 2019 23:22 UTC (Tue)
by josh (subscriber, #17465)
[Link] (1 responses)
Posted Aug 6, 2019 23:46 UTC (Tue)
by nix (subscriber, #2304)
[Link]
In my experience, binary growth is both inevitable and routine: it even happens when constant effort is imposed to prevent it: the most you can do is slow it down, unless you want to eschew all new features forever. Even busybox grows slowly over time. Even that paragon of the optimizer's art, Elite, grew until it could no longer have known bugs fixed because fixing them would take a few bytes that just weren't there any more. Everything grows.
(Note that the .ctf section is *not* a loaded section. So it won't slow down program startup, make it use more memory or address space, or anything like that. It's loaded as needed by the things that use it.)
Posted Mar 23, 2024 6:08 UTC (Sat)
by adityagurajada (guest, #170330)
[Link]
I am particularly interested in the comments of this responder who says: "... this is *not* debugging information: it's *introspection* information. ... it's meant for C programs to be able to see their own types, and the types of programs they are interacting with, at runtime, as part of their normal operation."
I've been thinking and doodling a lot on how to build C type-system information into a C program, so that one can query that information at "run-time", and do things like build automated print methods to pretty-print run-time structures (as an example).
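To make the pretty-printing idea concrete, here is a minimal sketch of what a generic printer driven by type information could look like. It is an analogy only: it uses the field metadata that Python's ctypes keeps for a structure as a stand-in for what a CTF section would describe, and it does not use libctf. The `Point` and `Segment` types and the `pretty` function are hypothetical.

```python
import ctypes


class Point(ctypes.Structure):
    # Hypothetical example type. ctypes records field names and types,
    # standing in here for the type description CTF would provide.
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


class Segment(ctypes.Structure):
    _fields_ = [("start", Point), ("end", Point),
                ("weight", ctypes.c_double)]


def pretty(value, indent=0):
    """Render a structure by walking its recorded field list."""
    pad = " " * indent
    lines = [f"{pad}struct {type(value).__name__} {{"]
    for name, ftype in value._fields_:
        field = getattr(value, name)
        if isinstance(field, ctypes.Structure):
            # Nested structure: recurse, then splice the first line
            # onto the "name = " prefix.
            nested = pretty(field, indent + 2).lstrip()
            lines.append(f"{pad}  {name} = {nested}")
        else:
            lines.append(f"{pad}  {ftype.__name__} {name} = {field!r}")
    lines.append(f"{pad}}}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(pretty(Segment(Point(1, 2), Point(3, 4), 0.5)))
```

The point of the sketch is that `pretty` contains no knowledge of `Point` or `Segment`: everything it prints comes from the type description, which is the property a run-time-readable type section would give a C program.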
This responder has hit the nail on the head. I'm wondering if you have any tooling that makes this possible. Or, if you would like to get connected off-line, outside this forum, to brainstorm the things necessary to build such a kind of infrastructure.
(How do we share email-IDs via this forum? I'm new to this group.)
Thanks in advance, --AdityA
Posted Aug 6, 2019 21:55 UTC (Tue)
by roc (subscriber, #30627)
[Link] (37 responses)
That's a neat trick, but inevitably it will lead to stripping tools getting --strip-really-unneeded options, which just means more complexity for the ecosystem going forward.
Posted Aug 6, 2019 22:55 UTC (Tue)
by nix (subscriber, #2304)
[Link] (36 responses)
Posted Aug 6, 2019 23:02 UTC (Tue)
by roc (subscriber, #30627)
[Link] (35 responses)
OTOH it will be a long time, if ever, before almost every binary requires CTF to be present. In the meantime people will want to strip CTF and you won't be able to stop tools adding support for that and people configuring their builds to do so by default.
Posted Aug 6, 2019 23:54 UTC (Tue)
by nix (subscriber, #2304)
[Link] (34 responses)
Fundamentally, there's a *reason* strip(1) doesn't strip CTF by default: it should hardly save any space and it rips out something that offers facilities not otherwise available. The format will be useless if it's stripped out routinely, and it should be small enough that *most* people don't bother. People who need to hunt for every last byte and are willing to use obscure options to do so probably both have a reason and are used to coping with the resulting breakage. (It will certainly break Objective Caml programs to strip out non-loaded sections that you don't recognise, for instance.)
(However... if you really want separated debugging information, we *do* have a CTF archive format that is specifically intended for sticking big piles of CTF into for later mmapping out. If people really want separated debug info, we could in theory arrange to dump all the CTF on the system into a .ctfa, and remove items from the CTFA on package uninstallation, and have libctf know to look there to pick it up -- or just look in /usr/lib/ctf/ -- a tree like /usr/lib/debug/ -- or whatever. My worry is that if you did that, people would soon say oh let's put it in a separate package! And now it's never present and it's useless. Having CTF in a separate file is not really a problem, though it doesn't buy you anything that I can see. Having it in a separate *package* that is not installed when the package is... that's a problem. That's what makes life so hard for systemwide debuggers now: the DWARF is never there.)
Posted Aug 7, 2019 0:11 UTC (Wed)
by roc (subscriber, #30627)
[Link] (33 responses)
> That's what makes life so hard for systemwide debuggers now: the DWARF is never there.
It's not super hard to have debuggers automatically fetch and use system debuginfo packages. Pernosco does this. Even Fedora's gdb gets you most of the way there by telling you the command you need to run. We don't need new formats to solve this particular problem. (OK, to tell the truth, there is one other problem that needs to be fixed: you need an archive of all debuginfo for all versions of packages so you can debug the non-latest version of a package. We've built that for Pernosco too.)
Posted Aug 7, 2019 0:30 UTC (Wed)
by nix (subscriber, #2304)
[Link] (15 responses)
Posted Aug 7, 2019 0:40 UTC (Wed)
by roc (subscriber, #30627)
[Link] (12 responses)
However, you are pretending it is *needed* when for most binaries it currently is not.
Posted Aug 7, 2019 10:52 UTC (Wed)
by nix (subscriber, #2304)
[Link] (11 responses)
I have... painful experiences here. Back when we were converting DWARF to CTF at kernel link time and linking it into kernel modules, we had to actually *hack RPM at build time* via PATH shuffling and patching of /usr/lib/rpm/find-debuginfo.sh to even make it possible for RPM to not just strip out all non-loaded sections on the grounds that they must be unnecessary, no matter what size they were or whether RPM had never seen them before, including ripping out all the CTF that we'd just gone to some lengths to link in.
To me that just seems like unwise behaviour on the part of a packaging system. RPM didn't know what that section was: why was it removing it? It might have been necessary. It *was* necessary for what we were doing, and RPM just removed it without so much as a by-your-leave. So... guess why strip(1) doesn't remove CTF? I don't want anyone who's actually using CTF to have to go through anything like that again just so they can package their own software without it being randomly broken by the packaging system.
Posted Aug 7, 2019 11:45 UTC (Wed)
by roc (subscriber, #30627)
[Link] (1 responses)
But it also seems like that would apply to DWARF debuginfo too. Why ask the compiler to generate DWARF if you're going to strip it out? Yet here we are.
Posted Aug 7, 2019 12:19 UTC (Wed)
by nix (subscriber, #2304)
[Link]
I'd guess that debuginfo, in a world where debuggers are a special thing that is explicitly run by human beings when things go wrong, is something huge that is only *needed* when things go wrong, when there will be a human around who can install the necessary big packages. But you never want to compile something without any debug info for use in a production environment because if things go wrong you then have no debuginfo to use to diagnose it! So -g -O2 has become a sort of de facto standard for CFLAGS.
Of course the "you only need it when things go wrong" attitude has now been retarding the development of always-on systemwide debugging tools for something like fifteen years; but nobody wants to add extensive debuginfo shrinking machinery because it will slow down the link for something that is only rarely needed. It seems to me that the only *reason* debuginfo is only rarely needed is that tools that use debuginfo routinely cannot be developed because it can never be relied on to be present, because it is too big... it's a vicious circle.
Posted Aug 7, 2019 12:19 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (8 responses)
> To me that just seems like unwise behaviour on the part of a packaging system. RPM didn't know what that section was: why was it removing it? It might have been necessary. It *was* necessary for what we were doing, and RPM just removed it without so much as a by-your-leave.
I might be responsible for that. But it is simply that RPM follows normal ELF rules for stripping [*] (unless you define macros to give find-debuginfo.sh additional arguments [**]). In general any non-allocated section can be stripped away (or put into a separate .debug file). Because that simply means that the section isn't needed at runtime.
> So... guess why strip(1) doesn't remove CTF? I don't want anyone who's actually using CTF to have to go through anything like that again just so they can package their own software without it being randomly broken by the packaging system.
Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
But I do like CTF and I do hope it will become the default one day. Not to replace DWARF (it should be a companion to that), but to replace .gnu_debugdata [***]. Which is used by various tools now to have an "extra symbol table".
So let's talk about how to integrate this with RPM/elfutils/systemtap/etc. Maybe on the elfutils and/or binutils mailing list?
[*] http://www.linker-aliens.org/blogs/ali/entry/how_to_strip...
Posted Aug 7, 2019 21:48 UTC (Wed)
by nix (subscriber, #2304)
[Link]
Posted Aug 7, 2019 21:55 UTC (Wed)
by nix (subscriber, #2304)
[Link] (6 responses)
(This is not the only tool missing support right now, of course: gold can't link CTF sections either. But I plan to add that and I did also plan to submit changes to elfutils to stop eu-strip throwing the section out. I'm rather unhappy to discover that this is pre-emptively rejected.)
(To deal with the problems of dynamic symbol tables getting stripped out of binaries, Solaris defined .ldynsym, which appears to be much what .gnu_debugdata is, only it's just a symbol table rather than a whole LZMA-compressed ELF object containing a symbol table.)
Plus of course there's not much chance of CTF becoming the default if you insist on stripping it out of executables so nothing that needs it can ever find it. ;)
Posted Aug 7, 2019 22:28 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (5 responses)
It shouldn't be hard to keep it, if a package or distro decides that is the thing they want.
So all we need to do is define some macro that packages can set for find-debuginfo.sh to do "the right thing" and then a package or distro can decide to make that the default.
> I did also plan to submit changes to elfutils to stop eu-strip throwing the section out. I'm rather unhappy to discover that this is pre-emptively rejected.
That is not my intention. Note that I am not a native English speaker. My apologies if I seem to come over as negative.
> CTF does not contain a symbol table, since that would be a waste of space since ELF already has one. Instead, it relies on the ELF symtab. Its function and data object sections are 1:1 ordered in the same order as the ELF symtab (basically, you traverse the ELF symtab and whenever you pass another function symbol, you match it to a function info section entry: whenever you pass another data symbol, you match it to another data object section entry). This saves quite a lot of space: data object section entries in particular are only four bytes each (one type ID).
OK. So how do you deal with .symtab being stripped away by default then?
> Plus of course there's not much chance of CTF becoming the default if you insist on stripping it out of executables so nothing that needs it can ever find it. ;)
Really, I don't understand why you think that is my intention. I might not fully understand all details yet. But I am actually interested in making CTF into something useful.
Will you be at the GNU Tools Cauldron in Montréal, Canada next week?
Posted Aug 7, 2019 22:32 UTC (Wed)
by mjw (subscriber, #16740)
[Link] (1 responses)
Sorry, next month. (September 12 to 15, 2019)
Posted Aug 9, 2019 11:33 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Aug 9, 2019 0:51 UTC (Fri)
by himi (subscriber, #340)
[Link] (1 responses)
Posted Aug 9, 2019 11:21 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Aug 9, 2019 11:32 UTC (Fri)
by nix (subscriber, #2304)
[Link]
... now why didn't I think of that? Probably because back when I was doing this I had *multiple* sections to deal with, with names like .ctf.*, so telling other things what the sections were called would have been quite painful. We have our own internal container format now precisely to avoid this sort of problem, so we could use this quite well.
> That is not my intention. Note that I am not a native English speaker. My apologies if I seem to come over as negative.
Sorry, I completely misparsed your sentence! (See my comment a couple of hops down). Phew, that had me panicking a bit for a moment. :)
> Now that we have it maybe we should make .symtab a compressed section?
Compressed sections in GNU ld at least seem a bit ad hoc. I think you'd need to do quite a lot of work to bfd_elf_final_link and environs to make it possible to have allocated sections that other sections depend upon that are also compressed: every existing section with content that changes after layout time (earlier in bfd_elf_final_link than strtab / symtab layout time) either has unchanging size or is non-allocated (and even there, there are fairly dreadful hacks around .zdebug, which I'm afraid I made a little bit worse with .ctf :) ).
I'm not sure .symtab would compress terribly well, either -- it has a lot of fields with "ID-like" content that only appears once, and thus compresses rather badly. (CTF goes to some lengths to avoid content like this for just that reason). The strtab would certainly compress better, but I can see why you don't compress it -- you don't want to impose decompression costs on the whole strtab on every execution.
> Really, I don't understand why you think that is my intention.
A really terrible misparsing of an ambiguous sentence on my part, and you know how hard it is to find alternate meanings of a sentence when you've already fixated on one that is dreadful :) Sorry!
> Will you be at the GNU Tools Cauldron in Montréal, Canada next week?
Alas, I'm going to LPC instead. I'm spending the next two weeks listening to chamber music in the North York moors...
Posted Aug 7, 2019 3:44 UTC (Wed)
by josh (subscriber, #17465)
[Link] (1 responses)
Posted Aug 7, 2019 10:19 UTC (Wed)
by khim (subscriber, #9252)
[Link]
Posted Aug 7, 2019 15:53 UTC (Wed)
by luto (guest, #39314)
[Link] (16 responses)
I would love for the kernel to be able to drop something in /usr/lib/debuginfo.d/find_vdso_debuginfo.sh, for example.
Posted Aug 7, 2019 15:59 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (15 responses)
https://sourceware.org/git/?p=elfutils.git;a=tree;f=dbgse...
Posted Aug 7, 2019 16:34 UTC (Wed)
by luto (guest, #39314)
[Link] (3 responses)
Posted Aug 7, 2019 16:42 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (2 responses)
% dbgserver -F /path/to/base/directory
should find executables / debuginfo / corresponding sources
Posted Aug 7, 2019 16:53 UTC (Wed)
by luto (guest, #39314)
[Link] (1 responses)
As an admin, I would much rather *not* have a systemwide daemon for this, since that implies a path by which one user can attempt to attack another user or the system as a whole. I'd rather if each user could, on demand, start up their own instance of whatever libraries and programs are needed to make debugging work. Nothing here should require any form of privilege.
Posted Aug 7, 2019 17:47 UTC (Wed)
by fuhchee (guest, #40059)
[Link]
Any of the above.
>As an admin, I would much rather *not* have a systemwide daemon for this, since that implies a path by which one user can attempt to attack another user or the system as a whole.
Fair enough, though DoS is perhaps the worst of the possible attacks.
> Nothing here should require any form of privilege.
Right.
Posted Aug 7, 2019 18:05 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (6 responses)
If it were possible to easily "register" debuginfo files created through Jenkins or some other build service without having to turn them into distro packages, then allow tools (i.e., GDB) to download them more or less invisibly when needed, that would be really nice.
Also is this limited to just debuginfo files?
The other big problem we have is cores being generated on remote systems which are using system libraries other than the local ones: in this situation we need to obtain the remote system's libc.so and other necessary system libraries. It would be really, really nice if we could register shared libraries from different systems, perhaps indexed via a hash of some kind, then have GDB automatically download them as well.
Of course, before this can be done we need to ensure that the core file contains enough information about the shared libraries to perform the lookup, which I doubt it does today, so this requires more work in other places... however it would be good if the design of this service was able to be extended in this way in the future if/when it becomes feasible. For example in our system we use Google coredumper library rather than the kernel to dump cores and this allows us to add "notes" into the generated core file, so we could take advantage of this without Linux kernel support.
Posted Aug 7, 2019 19:18 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (5 responses)
It seems like we're all thinking roughly alike. Exciting times!
Posted Aug 7, 2019 19:41 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (4 responses)
I compile a program on my build system (I use a sysroot to ensure that it links against sufficiently older system libraries that it can run "anywhere"). I send my program out to run tests on some other system running some random distribution completely different than the one it was built on, which is using a different GNU libc, etc. Maybe Travis, or AWS, or just a local test farm.
It fails and a core is generated. To debug that core I need my program, the debuginfo from my program (if the program is stripped), the core file, and the system libraries from the system it was running on when the core is generated.
I can't see any way that a buildid compiled into my binary can be sufficient to retrieve the runtime system libraries.
Posted Aug 7, 2019 19:52 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (3 responses)
Posted Aug 7, 2019 20:22 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (2 responses)
I think it would also be helpful if the client interface provided separate lookup and download methods rather than forcing them to both be a single method (there can be a simplified "do both" method as well if wanted). I can easily imagine situations where we want to know whether a given buildid exists on the server without actually downloading it.
For example, suppose I have a suite of test servers running random environments; during test runs a core is generated. I want to know if the program under test and/or system libraries for this system already exist in the debug server or not: I just want to look them up but not download them. If they don't exist perhaps I'll include them along with the core file when I bundle up the build results. If they do exist I don't need to add them.
Or perhaps I have an automated way for the test system to upload binaries and/or system libraries that aren't already on the debug server (I understand that upload is not in scope for this project and would need some other process) but I don't want to bother uploading things that I already have so I need to be able to check.
A simple program that uses the client interface to look up and/or download files would be very useful, as an example if nothing else (and probably for people who would like to add scripting to systems where it's not so simple to recode them to use it).
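As a sketch of the split I have in mind, here is a toy model with separate lookup and download operations. Everything in it is an assumption made for illustration: `DebugStore`, `exists`, `fetch`, `upload` and `bundle_for_core` are invented names, the store is just an in-memory dictionary, and none of this is the actual dbgserver client interface.

```python
class DebugStore:
    """Toy in-memory stand-in for a debuginfo server (hypothetical API)."""

    def __init__(self):
        self._files = {}  # build-id (hex string) -> file contents

    def exists(self, build_id):
        """Lookup only: report presence without transferring any data."""
        return build_id in self._files

    def fetch(self, build_id):
        """Download: return the stored bytes, or None if unknown."""
        return self._files.get(build_id)

    def upload(self, build_id, data):
        """Test helper only; upload is out of scope for the real service."""
        self._files[build_id] = data


def bundle_for_core(store, local_files):
    """Return only the files the server lacks, to ship with a core dump.

    local_files maps build-id -> contents for the program under test and
    the system libraries found on the machine that produced the core.
    """
    return {bid: data for bid, data in local_files.items()
            if not store.exists(bid)}


if __name__ == "__main__":
    store = DebugStore()
    store.upload("aa11", b"libc from an earlier run")
    local = {"aa11": b"libc from an earlier run", "bb22": b"my program"}
    print(sorted(bundle_for_core(store, local)))
```

With a lookup that does not transfer data, the test client can decide what to bundle before anything large crosses the network.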
Cheers!
Posted Aug 7, 2019 21:09 UTC (Wed)
by fuhchee (guest, #40059)
[Link] (1 responses)
Yes.
> I think it would also be helpful if the client interface provided separate lookup and download methods
Will consider that ... though there may be better ways to service the needs you outline. Deduplication at upload time should be easy too. Re. optimizing packaging of core dumps ... not sure how much sense that makes. The core dump recipient could consult the same debuginfo servers too; or you could preemptively package all the files. Will think on it more.
> A simple program that uses the client interface to look up and/or download files would be very useful
It just appeared in the repo! We employ only the most talented psychics and keyboard monks.
Posted Aug 7, 2019 21:29 UTC (Wed)
by madscientist (subscriber, #16861)
[Link]
If you mean deduplication by the server that's probably helpful but it's a lot of wasted effort to upload 10's or 100's of MB of libraries, binaries, etc., only to have it tossed on the floor as duplicate. Consider a build farm with 200 systems, which are upgraded via apt-get update or whatever at random intervals so they have different system libraries, different program instances, etc... having every system upload all its files for every core even though the system libraries might only change once every few weeks or less seems like overkill.
> Re. optimizing packaging of core dumps ... not sure how much sense that makes. The core dump recipient could consult the same debuginfo servers too; or you could preemptively package all the files.
For this I wasn't thinking that the dbgserver code would do that, I was thinking about scripting that users are using with their test clients to bundle results of failures so they can be uploaded to a test server for further investigation. Our current scripting already preemptively packages all the files: what I'd like to be able to do is detect when some/all of these items are not needed and skip that to reduce the size of uploaded artifacts.
When you're talking about moving content into/out of AWS or other cloud providers, the amount of data sent over the network directly equates to $$ spent and reducing it is always welcome.
Thanks for working on this, it'll be very cool!
Posted Aug 7, 2019 22:01 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
I wonder if it can handle more than just debuginfo... musing about having libctf automatically launch dbgserver queries for missing CTF sections now -- so people can have separated CTF if they really want *and* it is as if it were always present. Best of all worlds! For that matter they can also do both -- perhaps an option at CTF generation time which automatically emits separated CTF *if* its size passes some threshold, or some percentage threshold of the total binary size, or the .text size, or something. Of course then you'd have to arrange for the dbgserver to see it, but presumably whatever method is used for separated debuginfo would work for this too.
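A rough sketch of that threshold idea follows. The function name, its parameters and the choice of thresholds are all hypothetical; no such option exists in the toolchain as far as this thread says. The sizes in the usage lines are the GNU ld figures quoted elsewhere in this thread.

```python
def should_separate_ctf(ctf_size, text_size, total_size,
                        max_bytes=None,
                        max_fraction_of_total=None,
                        max_fraction_of_text=None):
    """Return True if the CTF exceeds any threshold that was configured.

    Thresholds left as None are ignored, so with no thresholds at all
    the CTF always stays inside the binary.
    """
    if max_bytes is not None and ctf_size > max_bytes:
        return True
    if (max_fraction_of_total is not None
            and ctf_size > max_fraction_of_total * total_size):
        return True
    if (max_fraction_of_text is not None
            and ctf_size > max_fraction_of_text * text_size):
        return True
    return False


if __name__ == "__main__":
    # GNU ld figures from this thread: .ctf, .text, and the total.
    ctf, text, total = 180737, 289042, 4571264
    print(should_separate_ctf(ctf, text, total, max_fraction_of_total=0.01))
    print(should_separate_ctf(ctf, text, total, max_fraction_of_total=0.05))
```

Whichever measure is used, the decision is cheap to make at CTF generation time because all three sizes are known by then.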
Posted Aug 8, 2019 12:28 UTC (Thu)
by fuhchee (guest, #40059)
[Link]
It should be a small amount of extra effort to extend it sideways to a 'ctf' sibling to 'debuginfo'.
Posted Aug 7, 2019 22:26 UTC (Wed)
by thoughtpolice (subscriber, #87455)
[Link] (1 responses)
Essentially, every version of every package has a unique hash. We build a reverse mapping from the buildids of the binaries in a package to its unique hash, and upload that metadata along with the package to the package server. We then patch GDB (and elfutils) to look in a specific directory for debug info. This directory is a FUSE filesystem, and when any tool tries to look in `.build-id/...` for the debug info -- it does a query to the package server, obtaining the unique package ID containing the symbols, and transparently installs them through the package manager. It is effectively a version of Microsoft Symbol Server, which is basically what people want, from what I can tell...
Perhaps we could replace dwarffs with something like dbgserver, however. Or integrate them so there's a single UX. We could for instance, perhaps replace the client tooling with a separate "backend" for our case, and the tools can all just work around that instead...
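To illustrate the lookup described above, here is a minimal sketch of the two pieces: translating between a build-id and the conventional `.build-id/xx/rest.debug` path layout, and the reverse index from build-id to package hash. The function names and the sample identifiers are hypothetical, and the real FUSE filesystem and package-server query are not modelled.

```python
def debug_path_for(build_id):
    """Map a hex build-id to the conventional .build-id/xx/rest.debug path."""
    if len(build_id) < 3:
        raise ValueError("build-id too short")
    return f".build-id/{build_id[:2]}/{build_id[2:]}.debug"


def build_id_from_path(path):
    """Invert debug_path_for; return None if the path does not match."""
    parts = path.split("/")
    if (len(parts) != 3 or parts[0] != ".build-id"
            or not parts[2].endswith(".debug")):
        return None
    return parts[1] + parts[2][:-len(".debug")]


def build_reverse_index(packages):
    """packages maps package hash -> iterable of build-ids it contains."""
    index = {}
    for pkg_hash, build_ids in packages.items():
        for bid in build_ids:
            index[bid] = pkg_hash
    return index


def package_for_lookup(path, index):
    """Resolve a path a debugger asked for to the package that has it."""
    bid = build_id_from_path(path)
    return index.get(bid) if bid is not None else None


if __name__ == "__main__":
    # Made-up package hashes and build-ids, for illustration only.
    index = build_reverse_index({"pkg-hash-1": ["abcdef01"],
                                 "pkg-hash-2": ["12345678"]})
    print(package_for_lookup(debug_path_for("abcdef01"), index))
```

The reverse index is what makes the filesystem lazy: nothing is installed until a tool actually opens a path under `.build-id/`.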
Posted Aug 7, 2019 23:25 UTC (Wed)
by fuhchee (guest, #40059)
[Link]
If CTF annotations are built into Debian and RPM binary packages by default
They are explicitly not marked as debugging sections in the linker and are even kept in under --strip-unneeded for that exact reason :) If CTF is stripped out, it's useless. You might as well use DWARF in that case.
section                 size      addr
.interp                   28   4195040
.note.gnu.build-id        36   4195068
.note.ABI-tag             32   4195104
.gnu.hash                224   4195136
.dynsym                 6384   4195360
.dynstr                 3914   4201744
.gnu.version             532   4205658
.gnu.version_r           112   4206192
.rela.dyn                408   4206304
.rela.plt               5784   4206712
.init                     26   4214784
.plt                    3872   4214816
.plt.got                   8   4218688
.text                 289042   4218704
.fini                      9   4507748
.rodata              1229488   4509696
.eh_frame_hdr           6804   5739184
.eh_frame              43248   5745992
.tbss                      8   5794512
.init_array                8   5794512
.fini_array                8   5794520
.data.rel.ro             768   5794528
.dynamic                 496   5795296
.got                      32   5795792
.got.plt                1952   5795840
.data                   4604   5797792
.bss                    6416   5802400
.comment                  91         0
.debug_aranges          2320         0
.debug_info          1732570         0
.debug_abbrev          57088         0
.debug_line           241926         0
.debug_str            138861         0
.debug_loc            563108         0
.debug_ranges          50320         0
.ctf                  180737   5817008
Total                4571264
Run-time Introspection capability using the Compact C Type Format in the GNU toolchain
> I think it's much more important for your goals to convince distro vendors to cooperate with you than to play tricks with header flags pretending CTF is not really debug info.
It's not. It's type introspection info. It's no more debug info than C++ RTTI is. Programs can perfectly well introspect their own types without being debuggers in any sense.
[**] https://gnu.wildebeest.org/blog/mjw/2017/06/30/fedora-rpm...
[***] https://fedoraproject.org/wiki/Features/MiniDebugInfo
> In general any non-allocated section can be stripped away (or put into a separate .debug file). Because that simply means that the section isn't needed at runtime.
Well, it means it isn't needed by the executable loader. I was *forced* to make libctf non-loadable by internal constraints in ld (roughly, that you cannot simultaneously have an allocated section whose size is not known before bfd_elf_final_link() and that the symtab and strtab are not laid out until halfway through that function and that CTF needs to know the offsets of all strings in the strtab and the order of symbols, *and* it's compressed so its content affects its size: so by extension the section may not be allocated). That doesn't mean it's not going to be used by programs at runtime. It is. (Well, assuming anyone uses it at all. :) ). They load it out of the binary as needed using BFD.
> Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
I guess that means CTF will always be stripped out of RPMs, and libctf and the CTF format by extension will be useless on RPM systems. This seems unfortunate. Is it really so hard to mark .ctf sections as not stripped? If it takes more than a couple of lines, something seems to me to be wrong.
> But I do like CTF and I do hope it will become the default one day. Not to replace DWARF (it should be a companion to that), but to replace .gnu_debugdata [***]. Which is used by various tools now to have an "extra symbol table".
That won't work, I'm afraid. CTF does not contain a symbol table, since that would be a waste of space since ELF already has one. Instead, it relies on the ELF symtab. Its function and data object sections are 1:1 ordered in the same order as the ELF symtab (basically, you traverse the ELF symtab and whenever you pass another function symbol, you match it to a function info section entry: whenever you pass another data symbol, you match it to another data object section entry). This saves quite a lot of space: data object section entries in particular are only four bytes each (one type ID).
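The ordering scheme described above can be sketched as follows. This is a deliberately simplified model, not the libctf implementation: the symbol table is reduced to (name, kind) pairs, the CTF entries are plain strings, and any rules the real format has about which symbols are skipped are ignored. `match_symbols` and the sample names are hypothetical.

```python
def match_symbols(symtab, function_info, data_objects):
    """Pair symbols with CTF entries by walking the symtab in order.

    symtab: list of (name, kind), kind being "func", "object" or other.
    function_info / data_objects: CTF entries, stored in symtab order,
    with no names of their own.
    """
    funcs = iter(function_info)
    objs = iter(data_objects)
    matched = {}
    for name, kind in symtab:
        if kind == "func":
            source = funcs
        elif kind == "object":
            source = objs
        else:
            continue  # other symbol kinds have no entry in this model
        try:
            matched[name] = next(source)
        except StopIteration:
            raise ValueError(f"no CTF entry left for symbol {name!r}")
    return matched


if __name__ == "__main__":
    symtab = [("main", "func"), ("counter", "object"),
              ("crt.c", "file"), ("helper", "func")]
    print(match_symbols(symtab,
                        ["int (void)", "void (int)"],
                        ["unsigned long"]))
```

Because position alone ties an entry to its symbol, the CTF never has to repeat a name or an index, which is where the space saving comes from. It is also why the scheme depends on the ELF symtab still being there.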
For example rust packages do something like:
%global _find_debuginfo_opts --keep-section .rustc
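The effect of such a keep option on the "any non-allocated section can be stripped" rule can be modelled in a few lines. This is a simplified sketch, not how eu-strip or find-debuginfo.sh are implemented: it only looks at the SHF_ALLOC flag and a keep list, and ignores the other sections a real stripper has to preserve. `sections_to_strip` is a hypothetical name.

```python
SHF_ALLOC = 0x2  # ELF section flag: section occupies memory at run time


def sections_to_strip(sections, keep=()):
    """Return the names of sections this simplified model would remove.

    sections: list of (name, flags) pairs.
    keep: names to preserve even though they are not allocated.
    """
    keep = set(keep)
    return [name for name, flags in sections
            if not flags & SHF_ALLOC and name not in keep]


if __name__ == "__main__":
    sections = [
        (".text", 0x6),       # SHF_ALLOC | SHF_EXECINSTR
        (".debug_info", 0x0),
        (".rustc", 0x0),
        (".ctf", 0x0),
    ]
    print(sections_to_strip(sections))
    print(sections_to_strip(sections, keep=[".rustc", ".ctf"]))
```

With an empty keep list the non-allocated .ctf goes the same way as .debug_info; naming it in the keep list is all it takes to retain it.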
>
> (To deal with the problems of dynamic symbol tables getting stripped out of binaries, Solaris defined .ldynsym, which appears to be much what .gnu_debugdata is, only it's just a symbol table rather than a whole LZMA-compressed ELF object containing a symbol table.)
Would it be an idea to adopt the .ldynsym from Solaris?
.gnu_debugdata was defined before we had compressed ELF sections in the standard.
Now that we have it maybe we should make .symtab a compressed section?
https://gnu.wildebeest.org/blog/mjw/2016/01/13/elf-libelf...
It might be easier to talk some ideas over in person.
https://gcc.gnu.org/wiki/cauldron2019
> Sorry, RPM uses elfutils eu-strip, which will not have special magic to treat .ctf sections specially.
This can be read as meaning "no version of eu-strip will ever have the special magic", rather than what I believe you meant: "any eu-strip you find in the world right now will not have the necessary special magic".
Hm... we must be talking about something different. Let me be more clear.
I see. So dbgserver_find_executable() is intended to be used with shared libs as well? Or is this part not quite complete?