DebugFS on Rust
DebugFS is the kernel's anything-goes, no-rules interface: whenever a kernel developer needs quick access to internal details of the kernel to debug a problem, or to implement an experimental control interface, they can expose them via DebugFS. This is possible because DebugFS is not subject to the normal rules for user-space-interface stability, nor to the rules about exposing sensitive kernel information. Supporting DebugFS in Rust drivers is an important step toward being able to debug real drivers on real hardware. Matthew Maurer spoke at Kangrejos 2025 about his recently merged DebugFS bindings for Rust.
Maurer began with an overview of DebugFS, including the things that make
implementing a Rust API tricky. DebugFS files should outlive the private data
that they allow access to, in case someone holds a file descriptor open after
the underlying object has gone away. Also, DebugFS directory entries can be removed
at any time, or will be automatically removed when
the parent directory entry is destroyed. "That will come
back to haunt us." Finally, DebugFS directories have to be manually
torn down; they aren't scoped to an individual kernel module.
All of this comes together to make a set of lifetime constraints that's difficult to faithfully model in Rust. At first, Maurer thought to implement a DebugFS file as a weak reference-counted pointer to a Rust trait object. That doesn't work for several reasons, including the fact that DebugFS files don't have a destruction callback. Also, DebugFS gives files one word of private data — normally used as a pointer to the object they are concerned with — but Rust pointers to trait objects are two words wide (one pointer to the object, and one to its virtual method table).
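The width mismatch is easy to check in ordinary user-space Rust. The following is a minimal sketch rather than kernel code; the helper names are invented for illustration. It also shows why an extra indirection would have worked: a box holding a boxed trait object is a thin, one-word pointer.

```rust
use std::fmt::Display;
use std::mem::size_of;

/// Size of the pointer-like type `P`, measured in machine words.
/// (Hypothetical helper, used only for this illustration.)
pub fn pointer_words<P>() -> usize {
    size_of::<P>() / size_of::<usize>()
}

/// A raw pointer to a trait object carries a data pointer plus a
/// pointer to the virtual method table.
pub fn trait_object_pointer_words() -> usize {
    pointer_words::<*const dyn Display>()
}

/// Boxing the trait object a second time hides the wide pointer behind
/// a thin one, at the cost of an extra allocation and indirection.
pub fn double_boxed_pointer_words() -> usize {
    pointer_words::<Box<Box<dyn Display>>>()
}
```

A pointer to a sized type such as u32 is one word, so only the double-boxed form would fit into the single word of private data that DebugFS provides.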
These problems aren't insurmountable — Maurer could have just added an additional pointer indirection — but that wouldn't be elegant. He wanted to find a solution that naturally fits with the lifecycle of a DebugFS directory entry, while only having one word of private data and minimal overhead. The design that Maurer ended up proposing was to have the directory entries reference-counted such that they are not destroyed until all of their child objects have been dropped, and the directory itself has been dropped. To accomplish this, two different interfaces would be exposed to Rust: a simple one for DebugFS directories with simple lifetimes, as well as a more complex, general one.
The simpler API, which Maurer called the "File API", has the DebugFS file actually own its associated data. Exposing some existing Rust data is as simple as wrapping it in a debugfs::File<T>; by default, the read and write operations for the file will convert the value to or from a string and read or update it as appropriate. The programmer can attach their own callbacks, instead, to implement custom behaviors. The downside is that there is no way to have multiple files reference the same data (without some internal reference-counted pointer), and it's not possible to conditionally provide a file based on whether some run-time value is true or false.
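The ownership model can be illustrated with a small user-space analogy. This is not the kernel's debugfs::File<T>; the OwnedFile type and its methods are hypothetical, and the sketch only assumes that the default operations behave roughly like a Display-based read and a FromStr-based write.

```rust
use std::fmt::Display;
use std::str::FromStr;

/// Hypothetical user-space model of a file that owns its data.
/// It is NOT the kernel API, only an analogy for the ownership model.
pub struct OwnedFile<T> {
    data: T,
}

impl<T: Display + FromStr> OwnedFile<T> {
    pub fn new(data: T) -> Self {
        OwnedFile { data }
    }

    /// Default read: format the owned value as a line of text.
    pub fn read(&self) -> String {
        format!("{}\n", self.data)
    }

    /// Default write: parse the input and replace the owned value.
    /// On a parse error the old value is left untouched.
    pub fn write(&mut self, input: &str) -> Result<(), T::Err> {
        self.data = input.trim().parse()?;
        Ok(())
    }
}
```

Because the wrapper owns its value, a second file exposing the same value would need something like a shared reference-counted pointer inside, which matches the limitation described above.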
The more complex API, the "Scope API", allows multiple files to refer to the same data, to refer to multiple separate structures in any combination, to create files conditionally, etc. In turn, it can't delete individual subdirectories or files — the whole DebugFS directory needs to be released at once.
Maurer went through examples of how to use each API; while a bit complex, the use of the File API could be substantially simplified if Rust gains built-in in-place initialization. Neither API was terribly surprising, but the obscure contortions (read: cool hacks) required to make them work efficiently were considerably more interesting.
Pointer smuggling
As previously mentioned, DebugFS provides only a single word of private data for file structures, which is ordinarily a pointer to the underlying data for the DebugFS file, a property that Maurer wanted to preserve. But part of the utility of DebugFS is that the developer can override the file operations with arbitrary functions; that makes it easy to trigger actions in a driver in response to reads or writes to a DebugFS file. It would be possible to do this by making the user of DebugFS fill out a struct file_operations, but Maurer wanted a less verbose API. The ergonomic way to encode this in the Rust APIs is to allow the programmer to attach a function or closure to the debugfs::File object. Somehow, those function pointers need to make their way into the file_operations structure used by DebugFS. But Maurer also didn't want the API to need to allocate space for the structure at run time — he wanted the appropriate structure to be generated statically, at compile time, making the entire Rust DebugFS interface allocation-free.
Maurer's solution relies on the fact that, in Rust, every function and closure
has its own unique type at compile time. This is done because it makes it easier
for the Rust compiler to apply certain optimizations: a call through a Rust function pointer can often be
lowered to a direct jump or a jump through a dispatch table, instead of a call
through an actual pointer. This makes Rust function types unique zero-sized
types: there is no actual data associated with them, because the type is enough
for the compiler to determine the address of the function.
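This property can be observed in plain Rust. In the sketch below (the function names are made up for the example), a generic helper reports the size of whatever callable it is given along with the result of calling it.

```rust
use std::mem::size_of;

fn double(x: u32) -> u32 {
    x * 2
}

fn triple(x: u32) -> u32 {
    x * 3
}

/// Calls `f` and also reports how many bytes a value of its type occupies.
/// Each distinct function or closure produces a separate instantiation.
pub fn call_with<F: Fn(u32) -> u32>(f: F, x: u32) -> (usize, u32) {
    (size_of::<F>(), f(x))
}
```

Passing double, triple, or a non-capturing closure gives a size of zero, while explicitly casting to the fn(u32) -> u32 pointer type erases the unique type and yields a pointer-sized value.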
The *_callback_file() functions in his new API, which take callbacks to implement the read and write operations on a file, don't actually store the provided function pointers anywhere. Instead, the type of the callback is passed as a generic argument to the code that fills out instances of the file_operations structure. When the Rust code is monomorphized during compilation, a different file_operations structure is generated for each file that uses a different set of callbacks. The generic code turns the type of the function back into a pointer to the actual function itself, and calls it. Since the conversion is done at compile time, the pointer to the callback never actually has to be stored anywhere outside the file_operations structure at run time. This trick effectively "smuggles" the function pointer through the type system, which lets Maurer pass off the work of constructing all of the needed file_operations structures to the compiler's monomorphization implementation and avoid allocating.
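A much-reduced user-space sketch of the idea might look like the following. It is not Maurer's code: FileOps, Adapter, ops_for(), and materialize_zst() as written here are illustrative assumptions, and the callback signature is simplified to a function from &u32 to String. The point is that each callback type gets its own statically generated operations table, with no allocation and nothing stored at run time.

```rust
use std::marker::PhantomData;
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical, much-reduced stand-in for `struct file_operations`.
pub struct FileOps {
    pub read: fn(&u32) -> String,
}

/// Conjures a reference to a zero-sized callback from its type alone.
///
/// # Safety
/// `F` must be inhabited and must have no invariants tied to its address.
unsafe fn materialize_zst<F: 'static>() -> &'static F {
    assert_eq!(size_of::<F>(), 0, "callback must not capture any state");
    // SAFETY: a dangling but aligned pointer is valid for a zero-sized type.
    unsafe { NonNull::<F>::dangling().as_ref() }
}

/// Carries the callback only as a type parameter; never instantiated.
struct Adapter<F>(PhantomData<F>);

impl<F: Fn(&u32) -> String + 'static> Adapter<F> {
    /// Ordinary function whose address can be stored in the table.
    fn trampoline(data: &u32) -> String {
        // SAFETY: assumed to hold for function items and closures
        // that capture nothing, which is all this sketch supports.
        let callback: &F = unsafe { materialize_zst::<F>() };
        callback(data)
    }

    /// One table per callback type, produced by monomorphization.
    const OPS: FileOps = FileOps { read: Self::trampoline };
}

/// The callback value is dropped; only its type is used.
pub fn ops_for<F: Fn(&u32) -> String + 'static>(_callback: F) -> &'static FileOps {
    &Adapter::<F>::OPS
}

pub fn show_decimal(v: &u32) -> String {
    format!("{}\n", v)
}

pub fn show_hex(v: &u32) -> String {
    format!("{:#x}\n", v)
}
```

Calling ops_for(show_decimal) and ops_for(show_hex) returns two different static tables, each of which dispatches to the right function even though neither function was ever stored by the caller. A closure that captures state is not zero-sized, so in this sketch it would trip the assertion the first time the table is used.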
The reaction to this explanation was mixed. While everyone present agreed that it was clever, and permitted writing a nice API, there was some sentiment that it might be too clever. Gary Guo pointed out one potential problem with the (unsafe) code that Maurer wrote to turn a function type back into an actual function pointer: while it was correct for function types, attempting to use it with other zero-sized types could cause undefined behavior, because it didn't ensure that internal invariants of the type are checked.
There are some zero-sized types where the actual address of the value is important, Guo explained. For example, a programmer could create a zero-sized type representing that the data at a particular address is readable. Alice Ryhl suggested restricting the function to only operate on types that implement the Copy trait, since they can't have invariants that rely on having a stable address. Maurer replied that he wasn't worried in this case, because the function was intended as an internal implementation detail of the DebugFS interface, but agreed that in the general case requiring the type to implement the Copy trait would make sense. One of the assembled developers asked Pierre-Emmanuel Patry whether he anticipated supporting code like this to be a problem for gccrs; he did not think that it would impose any additional burden, since some parts of the standard library already rely on the behavior of function types.
Andreas Hindborg asked for more details on why smuggling a pointer through the type system like this was permitted — specifically, why Maurer had claimed that the type needed to be "inhabited" for the trick to work. Zero-sized types can either have one valid value (the typical case), or no valid values, Maurer explained. So, if someone tried to use his trick to create a pointer to a type that exists, but where constructing a value of the type is impossible, they could break Rust's type system — which is why the helper function is unsafe.
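The distinction can be seen in a few lines of ordinary Rust; the type names below are invented for the illustration. Both types are zero-sized, so a size check alone cannot tell them apart, but only one of them has a value that could legitimately be conjured.

```rust
use std::mem::size_of;

/// Uninhabited: with no variants, no value of this type can ever exist.
pub enum Never {}

/// Inhabited: exactly one value, `Unit`, and no data to store.
pub struct Unit;

/// Both sizes are zero, even though only `Unit` has a valid value.
pub fn sizes() -> (usize, usize) {
    (size_of::<Never>(), size_of::<Unit>())
}
```

Producing a reference to a Never out of thin air would claim that an impossible value exists, which is undefined behavior; that is the kind of misuse the unsafe marker on the helper is meant to flag.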
Hindborg asked whether the pointer-smuggling trick was documented anywhere.
Maurer replied: "It's well documented in the code", to general
laughter. Guo asked whether they could just change the DebugFS C structure to
have two pointers, and avoid this whole workaround. Maurer passed the question
off to Greg Kroah-Hartman, who answered that he didn't think they could, because
it would impact the layout of the inode structure, which is
widely used outside DebugFS. In his opinion, this was a case of "you
optimized for fun"; the equivalent C code just allocates and eats the cost
of an additional pointer indirection. But he didn't think there was anything
wrong with odd techniques being used here; in many ways, it's what DebugFS is
there for.
Ultimately, the pointer-smuggling solution did remain in the final patch set that was merged for the 6.18 kernel. The trick is unlikely to be adapted for use in wider contexts in the kernel's Rust bindings, though.
