An update on pahole
Pahole (originally "Poke-a-hole") is a Swiss Army knife for exploring and editing debug information. Pahole is also currently involved in the kernel's build process to rearrange the information produced by various compilers into a form useful to the BPF verifier, although there are plans to render it unnecessary. Pahole maintainer Arnaldo Carvalho de Melo shared some status updates about the project at the 2025 Linux Storage, Filesystem, Memory-Management, and BPF Summit. Interested readers can find his slides here.
![Arnaldo Carvalho de Melo](https://static.lwn.net/images/2025/arnaldo-carvalho-de-melo.png)
Pahole has several uses in kernel development, including inspecting the layout of kernel structures, finding cache-line misalignment problems, and collecting statistics from the kernel's debugging information. The reason that Melo was presenting in the BPF track, however, is that pahole is also currently responsible for taking the debugging information generated during a build of the kernel and (if the user has enabled BPF) converting it into BTF for the verifier to make use of when loading a BPF program. Because the verifier, GCC, and Clang have all been evolving rapidly in this area, the BPF developers have turned to pahole to paper over the gaps in expectation between the different tools. [Melo later clarified that the compilers had not supported emitting BTF for non-BPF targets for many years, which was the initial motivation for including pahole in the build.]
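To make the structure-layout use case concrete, here is a minimal sketch of the kind of hole hunting that pahole performs. It does not use pahole or read any debugging information; it relies on Python's ctypes module to lay out two hypothetical structures (`Padded` and `Reordered`, which are illustrative and not taken from the kernel) and reports the padding gaps between their members.

```python
import ctypes


def find_holes(struct_type):
    """Return (offset, size) pairs for the padding gaps in a ctypes Structure.

    Only plain (non-bitfield) members are handled in this sketch.
    """
    holes = []
    cursor = 0  # first byte not yet claimed by a member
    for entry in struct_type._fields_:
        field = getattr(struct_type, entry[0])
        if field.offset > cursor:
            holes.append((cursor, field.offset - cursor))
        cursor = field.offset + field.size
    total = ctypes.sizeof(struct_type)
    if total > cursor:
        holes.append((cursor, total - cursor))  # trailing padding
    return holes


class Padded(ctypes.Structure):
    # Hypothetical structure: a 4-byte member sandwiched between two bytes.
    _fields_ = [("flag", ctypes.c_uint8),
                ("count", ctypes.c_uint32),
                ("state", ctypes.c_uint8)]


class Reordered(ctypes.Structure):
    # Same members, with the widest one first.
    _fields_ = [("count", ctypes.c_uint32),
                ("flag", ctypes.c_uint8),
                ("state", ctypes.c_uint8)]


if __name__ == "__main__":
    for struct_type in (Padded, Reordered):
        print(struct_type.__name__, ctypes.sizeof(struct_type),
              find_holes(struct_type))
```

On ABIs where a 32-bit integer is aligned to four bytes, the first layout occupies 12 bytes with two three-byte holes, while the reordered one fits in 8 bytes with only two bytes of trailing padding. Pahole reports the same sort of information for real kernel structures, working from debugging information rather than from a hand-written description.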
Melo began with the announcement that pahole has a new co-maintainer: Alan Maguire, who will be helping Melo process patches. Melo hopes to use the resulting free time to be able to get pahole's release cadence sped up to match the kernel's; this would let a particular kernel be associated with a particular version of pahole, which would simplify toolchain management for many kernel developers. He would also like to set up continuous-integration testing for pahole.
Melo said that a lot of work had been contributed to the project since last year, including about 140 patches. Having testing infrastructure to catch problems with proposed patches automatically would be helpful in dealing with that volume of changes, he said. He intends to start by adapting the design of libbpf's continuous-integration setup, although he was warned by the libbpf maintainers "not to repeat the same mistakes".
The pahole test suite is in reasonably good shape, but Melo wanted to encourage people to add more tests, especially ones comparing the BTF generated by different compilers.
Various updates
At that point Melo quickly went through a long list of in-progress pahole features, with relatively little discussion of each one. The pahole developers have been working on adding support for flexible arrays to pahole [Correction: the work was based on Kees Cook's ideas, but not done by him]; the program now has all of the information needed to calculate sizes for structures that contain them. Improvements to BTF handling include new metadata, support for bpf_fastcall, and resilient split BTF (which greatly improves the quality of BTF in loadable modules). Rust support has been "fixed" in the sense that pahole now ignores most Rust debug information and will no longer crash upon encountering some. It will attempt to reconstruct as much information as it can, based on the existing C++ support, but Melo called the resulting information "pretty useless". Pahole will also show what was skipped because it could not be understood, and will issue a warning if the user attempts to encode BTF from a kernel object written in Rust.
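The size calculation for structures ending in a flexible array, mentioned above, comes down to simple arithmetic: the size of the fixed part plus the element size multiplied by the element count. The sketch below illustrates that arithmetic with ctypes; the `Header` structure and the `flex_struct_size()` helper are hypothetical names chosen for the example, and nothing here reflects how pahole itself implements the feature.

```python
import ctypes


class Header(ctypes.Structure):
    # Hypothetical fixed part of a structure that ends in a flexible array,
    # standing in for: struct msg { uint32_t len; uint16_t data[]; };
    _fields_ = [("len", ctypes.c_uint32)]


def flex_struct_size(header_type, element_type, count):
    """Bytes needed for the fixed header plus `count` trailing elements."""
    return ctypes.sizeof(header_type) + count * ctypes.sizeof(element_type)


if __name__ == "__main__":
    for count in (0, 1, 8):
        print(count, flex_struct_size(Header, ctypes.c_uint16, count))
```

The sketch assumes that the flexible array starts exactly at the end of the fixed part, which holds here because the element type needs no more alignment than the header provides.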
Pahole can also now transform information related to global variables (although this support is off by default), which makes the kernel's debugging information about 30% larger and covers around 76,000 variables. The most common variables are associated with tracepoints (~3,000 variables), trace events (~2,000 variables), or static keys (~1,000 variables).
There are still large numbers of variables that are "uninteresting" and therefore not included in pahole's output, Melo said. Currently, pahole has a hard-coded set of filters to determine whether a variable is worth including, but he would like to move that to a separate configuration file. Perhaps such a configuration file could even live in the kernel sources, so that kernel developers could tweak it for their needs.
Tweaking BTF
The current role of pahole in the kernel's build system is to read in DWARF debug information and output BTF. But that is far from the only workflow it supports. The tool can also ingest or output C Type Format information (a format with the goal of being a BTF superset suitable for use with programs other than the kernel, although the current version is not quite a superset). As of recently, pahole can parse BTF as well. This means that the tool can be used to modify BTF, Melo said: for example, to deduplicate it or to correct the output of buggy compilers.
Melo proposed that, in the future, the kernel should be compiled with GCC's -gbtf option, which causes it to emit debugging information in BTF format. Then binutils will handle deduplicating the BTF while linking kernel objects together, before pahole performs final fixups. In this way, the conversion of DWARF to BTF will eventually be removed, which will speed up the process of building the kernel.
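The idea behind deduplicating type information at link time can be shown with a toy model. In the sketch below, each object file contributes a list of type descriptions, and the merge step keeps one copy of each distinct description while recording how each object's local type indexes map into the merged table. The tuple representation and the `dedup_types()` function are assumptions made purely for illustration; real BTF records refer to one another by numeric ID, and neither pahole nor binutils works on data shaped like this.

```python
def dedup_types(units):
    """Merge per-object type lists, keeping one copy of each distinct type.

    `units` is a list of compilation units; each unit is a list of hashable
    type descriptions, modeled here as (kind, name, members) tuples.
    Returns the merged table and, for each unit, a map from the old local
    index to the new index in the merged table.
    """
    table = []   # merged, deduplicated type list
    seen = {}    # type description -> index in `table`
    remaps = []
    for unit in units:
        remap = {}
        for local_id, type_desc in enumerate(unit):
            if type_desc not in seen:
                seen[type_desc] = len(table)
                table.append(type_desc)
            remap[local_id] = seen[type_desc]
        remaps.append(remap)
    return table, remaps


# Hypothetical type descriptions shared between two object files.
INT = ("int", "int", ())
LIST_HEAD = ("struct", "list_head", (("next", "ptr"), ("prev", "ptr")))
FOO = ("struct", "foo", (("x", "int"),))

if __name__ == "__main__":
    table, remaps = dedup_types([[INT, LIST_HEAD], [LIST_HEAD, INT, FOO]])
    print(len(table), remaps)
```

Because members in this toy model name their types as strings, identical descriptions compare equal directly. A real deduplicator has to do more work, since two records are equivalent only if the records they reference are equivalent as well.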
Alexei Starovoitov asked how far away GCC was from being able to build the kernel with -gbtf; David Faust said that compiling was possible right now and that the process would only fail at the linking step because ld doesn't yet know how to link BTF. Starovoitov asked whether anyone was working on that and Elena Zannoni confirmed that someone was. She expected that work to be complete in about a week.
That answer seemed to please Starovoitov, who thought that having compilers generate BTF natively, without using pahole to convert debugging information from DWARF format, would be a substantial speedup. The rest of the attendees agreed, although José Marchesi said that the real benefit was not tying BTF to information that is representable in DWARF. Melo agreed, saying that the conversion step using pahole was always a temporary measure. Starovoitov asked what would happen to the other fixes that pahole applies. Melo answered that pahole would still be part of the kernel build process; it would just have less to do.
[When originally published, this article referred to Arnaldo Carvalho de Melo as "Carvalho" after the first appearance of his name. In response to a reader question, I reached out to Melo, who informed me that his last name should be "Melo". The article has been updated accordingly.]
Index entries for this article | |
---|---|
Kernel | Development tools |
Conference | Storage, Filesystem, Memory-Management and BPF Summit/2025 |
Posted Jun 4, 2025 15:01 UTC (Wed)
by nix (subscriber, #2304)
(I expect there to be bugs. I will fix them. But... it does work.)
Posted Jun 17, 2025 14:57 UTC (Tue)
by nix (subscriber, #2304)
(More normal use cases like enterprise distro kernel configs and the like still have pahole taking longer and using more memory than ctf-dedup, but not anything like *that* much. Still, I half-expected libctf to be slower than pahole given how much fairly-low-hanging optimization fruit is left in ctf-dedup, so that was a happy surprise for me :) )
Proof of concept BTF toolchain deduplication is out there