RVKMS and Rust KMS bindings
At the 2024 X.Org Developers Conference (XDC), Lyude Paul gave a talk on the work she has been doing as part of the Nova project, which is an effort to build an NVIDIA GPU driver in Rust. She wanted to provide an introduction to RVKMS, which is being used to develop Rust kernel mode setting (KMS) bindings; RVKMS is a port of the virtual KMS (VKMS) driver to Rust. In addition, she wanted to give her opinion on Rust, and why she thinks it is a "game-changer for the kernel", noting that the reasons are not related to the oft-mentioned, "headline" feature of the language: memory safety.
The Nova driver is written in Rust in part because of the lack of a stable firmware ABI for NVIDIA GPU system processors (GSPs). Handling that in C is difficult, Paul said. The inspiration came from the Asahi driver for Apple GPUs, which uses a similar approach to handle unstable firmware ABIs. In addition, the Nova project can help prove Rust's readiness for the kernel by getting its drivers upstream, which will help make it easier for projects like Asahi to get their work upstream as well.
Writing a kernel driver for a new device is challenging and takes time. For Nova, there is also a need to develop the Rust bindings for a kernel graphics driver. "Luckily, a lot of this has already been done in Asahi". There are already lots of bindings available, though they are not yet upstream; getting them there entails figuring out whether changes are needed in those bindings and getting them accepted into the kernel.
The Asahi bindings do not cover kernel mode setting, however, which is surprising; KMS is one of the only parts of that driver that is written in C. So there are no KMS bindings to use for Nova and it is still too early in Nova development to add KMS support to it. On the other hand, though, "KMS is a large enough surface that we wanted to be able to work on this sooner than later, and ideally in parallel to the rest of Nova".
RVKMS
So, while Nova was working toward needing KMS, the team decided that Paul would port a KMS driver to Rust in order to create the necessary bindings. VKMS was chosen because "it's a pretty simple driver, it doesn't require any specific hardware". VKMS "pretends to be a display device"; it also supports CRC generation and writeback connectors, which can be used for testing.
![Lyude Paul [Lyude Paul]](https://static.lwn.net/images/2024/xdc-paul-sm.png)
For the Rust port, RVKMS, "it's very early in development, driver-wise; it doesn't do a whole ton yet". At this point it can basically just "register a KMS driver and set up VBLANK emulation using high-resolution timers". Eventually, she hopes that the driver will have CRC generation and connector writeback, as well.
Even though it is still early in RVKMS development, it has already proved "very useful in making progress with these bindings". Paul said that she tried to anticipate the needs of other KMS drivers, such as i915 and nouveau, and not just focus on RVKMS, when designing the API. Most of her time has been spent on the bindings, rather than RVKMS itself, which is still quite small.
There are several goals for the KMS bindings; one is to prevent undefined behavior by using safe code. Another is to make incorrect implementations of the KMS API nearly impossible; "Rust gives us a lot of tools to actually be able to prove that the way things are implemented are correct at compile time." The API should be ergonomic, as well; preventing mistakes should not make for code that is messier or more difficult to write. The intention is to mostly only support atomic mode setting, though there will "probably be some basic support for the various legacy helpers".
KMS bindings
The KMS bindings are currently working on top of the Direct Rendering Manager (DRM) bindings from Asahi and Nova. Unlike the KMS API in C, the Rust KMS bindings "are mostly in control of the order of operations during device registration". In order to support KMS in a Rust driver, it is only necessary to implement the kernel::drm::kms::Kms trait, which "handles calling things in the right order, registering the device, and that sort of thing".
Paul then went into a fair amount of detail on the KMS bindings, which I will try to relay, though my graphics and Rust knowledge may not be fully up to the task. The YouTube video of the talk and her slides will be of interest to those seeking more information. Background material on the Linux graphics stack can be found in part one of our two-part series looking at it; for this talk, part two may be the most relevant piece. The Wikipedia article on DRM and its section on the KMS device model may also be useful, especially for some of the terminology.
There are two main parts to the Kms trait, she said. mode_config_info() is used for static information, like minimum and maximum resolution, various cursor capabilities, and others. create_objects() provides "access to a special UnregisteredKmsDevice type" that can be used to create both static (e.g. "cathode-ray-tube controller" (CRTC), plane) and non-static (e.g. connectors) objects. In the future, hooks for customizing the initial mode setting will likely be added, but those are not needed for the virtual display provided by RVKMS.
"One of the neat things" with the bindings is that drivers implementing the Kms trait get a KmsDriver trait implemented automatically. That allows KMS-dependent methods to only be available to drivers that actually implement Kms. So all of the bindings can just assume that KMS is always present and set up, instead of having run-time checking and adding error paths.
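The automatic KmsDriver implementation described above resembles what Rust calls a blanket implementation. The following is a minimal sketch of that idea only; apart from the trait names Kms and KmsDriver and the mode_config_info() method, every type, field, and method here is hypothetical and is not taken from the actual bindings.

```rust
// Hypothetical sketch; this is not the real kernel::drm::kms API.

/// Static mode-configuration data supplied by a driver (fields are assumptions).
pub struct ModeConfigInfo {
    pub max_width: u32,
    pub max_height: u32,
}

/// The trait that a driver implements by hand.
pub trait Kms {
    fn mode_config_info() -> ModeConfigInfo;
}

/// Carries KMS-only methods; drivers never implement it directly.
pub trait KmsDriver: Kms {
    /// Hypothetical helper. It is only callable for drivers that implement
    /// `Kms`, so no run-time "is KMS set up?" check or error path is needed.
    fn max_pixels() -> u64 {
        let info = Self::mode_config_info();
        u64::from(info.max_width) * u64::from(info.max_height)
    }
}

/// Blanket implementation: every `Kms` implementer gets `KmsDriver` for free.
impl<T: Kms> KmsDriver for T {}

/// A hypothetical virtual display driver.
pub struct VirtualDriver;

impl Kms for VirtualDriver {
    fn mode_config_info() -> ModeConfigInfo {
        ModeConfigInfo { max_width: 4096, max_height: 2160 }
    }
}
```

A type that does not implement Kms simply has no max_pixels() function, so a call such as SomeRenderOnlyDriver::max_pixels() would be rejected by the compiler rather than failing at run time.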
Mode objects
DRM has the concept of a "mode object" that is exposed to user space through an object ID. Mode objects can have a reference count and be created at any time, or not have a reference count, but those can only be created before driver registration. The ModeObject trait is used to represent them. Reference-counted objects fit in nicely with Rust lifetime requirements; an RcModeObject trait is used for those to reduce the reference-counting boilerplate needed.
Static objects, such as CRTCs and planes, typically share the lifetime of a device and are more challenging to handle because that does not easily map to Rust lifetimes. The StaticModeObject and KmsRef traits are used for those types of objects; KmsRef acts as a reference count on the parent device, while allowing access to the static object, which allows owned references to the static objects.
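One way to picture an owned reference to a static object is a handle that keeps the parent device alive and remembers which object it points to. The sketch below uses the standard library's Arc for the device reference count; the real bindings use the kernel's own reference counting, and all of the names here (Device, Plane, KmsRefSketch, plane_ref()) are hypothetical.

```rust
use std::sync::Arc;

/// Hypothetical static object that lives exactly as long as its device.
pub struct Plane {
    pub id: u32,
}

/// Hypothetical device owning its static objects.
pub struct Device {
    planes: Vec<Plane>,
}

/// Owned handle: a reference count on the device plus the index of one plane.
pub struct KmsRefSketch {
    device: Arc<Device>,
    index: usize,
}

impl Device {
    /// Creates a device with one plane per ID.
    pub fn new(ids: &[u32]) -> Arc<Device> {
        let planes = ids.iter().map(|&id| Plane { id }).collect();
        Arc::new(Device { planes })
    }

    /// Hands out an owned handle, or `None` if the index is out of range.
    pub fn plane_ref(device: &Arc<Device>, index: usize) -> Option<KmsRefSketch> {
        if index < device.planes.len() {
            Some(KmsRefSketch { device: Arc::clone(device), index })
        } else {
            None
        }
    }
}

impl KmsRefSketch {
    /// Access to the static object; valid because the handle keeps the device alive.
    pub fn plane(&self) -> &Plane {
        &self.device.planes[self.index]
    }
}
```

Because the handle owns a count on the device, it can outlive the variable that originally held the device without needing a Rust lifetime parameter.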
Implementing CRTCs, planes, and other components of that sort turned out to be "a bit more complicated than one might expect", she said. Most drivers do not use the DRM structures unmodified, and instead embed them into driver-private structures; for example, in VKMS, the vkms_crtc structure embeds drm_crtc. They contain and track driver-private information, including display state and static information. Drivers often have multiple subclasses of these types of objects; for example, both i915 and nouveau have multiple types of connectors, encoders, and others.
It turns out that "this is not the first time we've had to do something like this"; Asahi had to do something similar for its Graphics Execution Manager (GEM) support. In GEM infrastructure, this type of subclassing, where driver-private data is maintained with the object, is common. The needs for KMS subclassing are more variable than for GEM, because the technique is used more widely, but the Asahi work provided a good starting point, she said.
In the KMS bindings, there are traits for the different object types, such as DriverCrtc and DriverEncoder; drivers can have multiple implementations of them as needed. Driver data can be stored in the objects either by passing immutable data to the constructor or at any other point using Send and Sync containers. KMS drivers typically switch between the common representation (e.g. drm_crtc) and the driver-specific one (vkms_crtc), which is also possible with the KMS Rust bindings. There are some operations that should apply to all instances of the class and others that are only for the specific subclass. So there is a "fully-typed interface" that provides access to the private data and the common DRM methods and an opaque interface that only provides access to the common methods.
The same mechanism is used for atomic states, with fully-typed and opaque interfaces, which can be switched between at run time. If access to the private data is needed, objects can be fallibly converted to fully-typed. That required support for consistent vtable memory locations, "which is not something that Rust has by default", since constants are normally inlined, rather than stored as static data. A Rust macro (#[unique]) was added to make that work.
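The split between an opaque interface and a fallible conversion to a fully-typed one can be illustrated in ordinary Rust. Note that this sketch swaps in a different technique: it uses downcasting via std::any::Any, whereas the bindings, as described above, rely on comparing vtable locations. All type and method names here are hypothetical.

```rust
use std::any::Any;

/// Data shared by every CRTC, loosely analogous to `drm_crtc`.
pub struct CrtcCommon {
    pub index: u32,
}

/// Opaque view: common data plus type-erased driver-private data.
pub struct OpaqueCrtc {
    pub common: CrtcCommon,
    private: Box<dyn Any>,
}

/// Fully-typed view: common data and the driver's own private type.
pub struct TypedCrtc<'a, T> {
    pub common: &'a CrtcCommon,
    pub private: &'a T,
}

impl OpaqueCrtc {
    /// Wraps driver-private data of any type behind the opaque interface.
    pub fn new<T: Any>(index: u32, private: T) -> Self {
        OpaqueCrtc { common: CrtcCommon { index }, private: Box::new(private) }
    }

    /// Fallible conversion: succeeds only if the private data really is a `T`.
    pub fn try_typed<T: Any>(&self) -> Option<TypedCrtc<'_, T>> {
        let private = self.private.downcast_ref::<T>()?;
        Some(TypedCrtc { common: &self.common, private })
    }
}

/// Hypothetical private data for a virtual CRTC.
pub struct VirtualCrtcData {
    pub vblank_period_ns: u64,
}

/// Hypothetical private data belonging to some other CRTC subclass.
pub struct OtherCrtcData;
```

Code that only needs the common fields can work with OpaqueCrtc directly; code that needs the private data asks for a typed view and has to handle the case where the object belongs to a different subclass.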
Atomic commits
"Things diverge a bit" for atomic commits due to Rust's requirements. The Rust data-aliasing rules allow having an infinite number of immutable references to an object or a single mutable reference at any given time. If the atomic callbacks for checking, updating, and the like only affected the object they were associated with, it would be easy to handle, but that is not the case. The callbacks often iterate through the state of other objects, not just the one that the callback belongs to.
She originally started implementing the callbacks using just references, but that did not really work at all. Instead, she took inspiration from RefCell, which is a "Rust API for handling situations where the data-aliasing rules aren't exactly ideal". Mutable and immutable borrows still exist, but they are checked at run time rather than compile time.
When working with the atomic state, most of the code will use the AtomicStateMutator container object, which is a wrapper around an AtomicState object. There are always immutable references to the container available, and it manages handing out borrows for callbacks that want to examine or change the state. There can only be a single borrow for each state, but a callback can hold borrows for multiple states. Borrowing is fallible, but the interface is meant to be ergonomic; for example, callbacks are made with a pre-borrowed state, so that the callback does not need to obtain it.
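The run-time borrow checking that inspired this design can be seen with the standard library's RefCell itself. The sketch below is not the AtomicStateMutator API; StateSet, CrtcState, and demo_borrows() are hypothetical names, and only RefCell and its try_borrow_mut() and borrow() methods are real.

```rust
use std::cell::RefCell;

/// Hypothetical per-object state.
#[derive(Debug, Default)]
pub struct CrtcState {
    pub active: bool,
}

/// Hypothetical container that hands out borrows of several states.
#[derive(Default)]
pub struct StateSet {
    pub crtc_a: RefCell<CrtcState>,
    pub crtc_b: RefCell<CrtcState>,
}

/// Returns (second borrow of A was refused, borrow of B succeeded, A is active).
pub fn demo_borrows() -> (bool, bool, bool) {
    let set = StateSet::default();
    // The first mutable borrow of A succeeds.
    let mut a = set.crtc_a.try_borrow_mut().expect("A is not borrowed yet");
    // A second borrow of the same state is refused at run time, not compile time.
    let second_a_refused = set.crtc_a.try_borrow_mut().is_err();
    // A different state can still be borrowed while A is held.
    let b_ok = set.crtc_b.try_borrow_mut().is_ok();
    a.active = true;
    // Release the borrow so that A can be borrowed again.
    drop(a);
    let a_active = set.crtc_a.borrow().active;
    (second_a_refused, b_ok, a_active)
}
```

Only a shared reference to the container is needed throughout, which matches the description of immutable references to the container always being available while borrows of individual states are handed out fallibly.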
In order to enforce the order of operations and protect states from mutation once they are made visible outside of the atomic commit, the bindings use the typestate pattern. This is a feature that is not unique to Rust, but is not common in other languages; "Rust generally makes it a lot easier to work with than other languages". It allows the bindings to "encode the run-time state of something into compile-time state"; the idea is that the object is represented by a different type at every stage of its lifetime. It provides "a very powerful tool to actually enforce API correctness", Paul said.
For example, AtomicCommitTail is an AtomicState wrapper that lets the driver developer control the order in which commits are executed. It does so mostly by using tokens for each step of the process; the tokens prove that a certain prerequisite has been done. The checking is done at compile time and "it lets you make it impossible to write an incomplete atomic_commit_tail() [callback] that actually compiles". The code has to "perform every step and you have to perform them in the correct order, otherwise the code just doesn't compile".
KMS drivers have lots of optional features, she said; for example, VBLANK is used everywhere to some extent, but some hardware does not have a VBLANK interrupt, so it must be emulated in the DRM core. The Rust bindings can use traits to only allow drivers that implement VBLANK to access the appropriate methods; other drivers will not be able to call those methods. If it implements the DriverCrtcVblank trait, it will have access to the VBLANK-exclusive methods; that pattern can be extended for other optional pieces of functionality.
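Both compile-time techniques described above, step tokens for commit ordering and trait-gated optional methods, can be sketched in a few dozen lines. Only the trait names DriverCrtc and DriverCrtcVblank come from the talk; the token types, the step names (loosely modeled on the C atomic helpers), and every method below are hypothetical.

```rust
/// Tokens proving that a commit step has run. The private `()` field means
/// that code outside the defining module cannot construct one directly.
pub struct DisablesDone(());
pub struct PlanesDone(());

/// Result that, outside this module, only the final step can produce.
pub struct Committed {
    steps: Vec<&'static str>,
}

impl Committed {
    pub fn steps(&self) -> &[&'static str] {
        &self.steps
    }
}

/// Stand-in for a commit-tail wrapper that records which steps ran.
pub struct CommitTail {
    steps: Vec<&'static str>,
}

impl CommitTail {
    pub fn new() -> Self {
        CommitTail { steps: Vec::new() }
    }

    /// Step 1 needs no proof and hands back a token.
    pub fn commit_modeset_disables(&mut self) -> DisablesDone {
        self.steps.push("disables");
        DisablesDone(())
    }

    /// Step 2 consumes the step-1 token, so it cannot run before step 1.
    pub fn commit_planes(&mut self, _proof: DisablesDone) -> PlanesDone {
        self.steps.push("planes");
        PlanesDone(())
    }

    /// The final step consumes the wrapper and the step-2 token.
    pub fn commit_modeset_enables(mut self, _proof: PlanesDone) -> Committed {
        self.steps.push("enables");
        Committed { steps: self.steps }
    }
}

/// A complete callback; skipping or reordering a step is a type error.
pub fn atomic_commit_tail(mut tail: CommitTail) -> Committed {
    let disables = tail.commit_modeset_disables();
    let planes = tail.commit_planes(disables);
    tail.commit_modeset_enables(planes)
}

/// Base trait that every CRTC implementation provides.
pub trait DriverCrtc {
    fn name(&self) -> &'static str;
}

/// Optional capability, implemented only by drivers with VBLANK support.
pub trait DriverCrtcVblank: DriverCrtc {
    fn enable_vblank(&mut self) -> bool;
}

/// Generic wrapper that the bindings would hand to a driver.
pub struct Crtc<T: DriverCrtc>(pub T);

impl<T: DriverCrtc> Crtc<T> {
    pub fn name(&self) -> &'static str {
        self.0.name()
    }
}

/// These methods exist only when `T` also implements `DriverCrtcVblank`.
impl<T: DriverCrtcVblank> Crtc<T> {
    pub fn vblank_on(&mut self) -> bool {
        self.0.enable_vblank()
    }
}

/// Hypothetical CRTC with VBLANK support.
pub struct TimerCrtc {
    pub vblank_enabled: bool,
}

impl DriverCrtc for TimerCrtc {
    fn name(&self) -> &'static str {
        "timer"
    }
}

impl DriverCrtcVblank for TimerCrtc {
    fn enable_vblank(&mut self) -> bool {
        self.vblank_enabled = true;
        true
    }
}

/// Hypothetical CRTC without VBLANK support.
pub struct PlainCrtc;

impl DriverCrtc for PlainCrtc {
    fn name(&self) -> &'static str {
        "plain"
    }
}
```

In this sketch, calling commit_planes() without a DisablesDone token, or calling vblank_on() on a Crtc<PlainCrtc>, fails to compile; neither mistake needs a run-time check or an error path.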
Paul closed the first part of her talk with thanks to various people and groups who have helped make RVKMS and the KMS bindings possible: the Asahi project, Maíra Canal, and her co-workers at Red Hat working on Nova. From there, she moved on to talk about her experience with Rust.
Rust experiences
"I won't be talking about memory safety", she said; one of the big mistakes made when people are trying to advocate for Rust is to entirely focus on memory safety. Kernel developers already know that C is unsafe, so pushing hard on the memory-safety point often sounds like the Rust advocates are talking down to the kernel developers. That is one of the reasons that she avoided looking at Rust for years. Instead, she believes that there are more compelling arguments for bringing Rust to the kernel.
"Rust can be a kernel maintainer"; a huge part of being a maintainer is to stop bad patterns in the code. That is time-consuming, and requires constantly re-explaining problems, while hoping nothing important was missed. "It can make you snippy; it can burn through your motivation".
Rust can help with that, because it provides a lot of tools to enforce code patterns that would have needed to be corrected over email. It is "a lot more capable than anything we were really ever able to do in C". The uses of the typestate pattern are a good example of that; they have little, usually no, run-time cost. There is an upfront cost to Rust, in learning the language and in rethinking how code is written to fit into the new model, but "the potential for saving time long term is kind of astounding".
People often wonder about how to work with unsafe code, but its presence does not really change much in her experience. For one thing, unsafe code also acts as an enforcement tool; a "safety contract" must be present in the comments for unsafe code or the compiler will complain. That requires those writing unsafe code to think about and document why and how they are violating the language invariants, which gives reviewers and maintainers something to verify. Unsafe acts as a marker for a place where more scrutiny is needed.
"It's sort of wild what the end result of this is"; when writing RVKMS, she spent almost no time debugging: around 24 hours over a few months of development. Writing drivers in C has always been a loop of adding a bunch of code, then spending a day or more debugging problems of various sorts (missed null checks, forgotten initialization, thread-safety issues, etc.), and going back to adding code. That is not how things go with Rust; "if things compile, a lot of times it will actually work, which is a very weird concept and is almost unbelievable until you've actually dealt with it yourself".
Before Paul started working with Rust, she was put off by a lot of the patterns used, such as a lack of null, having to always handle option returns, and "tons of types, that sounds kind of like a nightmare". It turns out that "Rust is ergonomic enough" that you end up not really thinking about those things once a set of bindings has been developed. Much of the time, it also "almost feels obvious what the right design is". Most of the Rust constructs have lots of shortcuts for making them "as legible and simple as possible". Once you get past the design stage, you rarely need to think about all of the different types; "a lot of the time, the language just sort of handles it for you".
She is not a fan of comparisons to C++, in part because "Rust is kind of a shockingly small language". It is definitely complicated and difficult to "wrap your head around at first", but its scope is limited, unlike C++ and other languages, which feel more like a framework than a language, she said. The Rust standard library is built around the "keep it simple, stupid" (KISS) philosophy, but it is also constantly being iterated on to make it easier to use, while not sacrificing compatibility. Once you get used to the Rust way of doing things, the correct way to do something generally feels like the obvious way to do it as well.
She concluded her talk with a question: "would you rather repeat yourself on the mailing list a million times" to stop the same mistakes, "or would you rather just have the compiler do it?" She suggested: "Give Rust a try".
Q&A
An audience member asked about how the Rust code would fare in the face of changes to the DRM API in the kernel. Paul said that refactoring Rust code "tends to be very easy, even with a lot of subtly more complicated changes than you might have to work around in C". It is not free, of course, but refactoring in Rust is not any harder than it is for C.
Another question was about Rust development finding problems in the existing C APIs and code; Paul said that has happened and she thinks Rust is helpful in that regard because it forces people to clearly think things through. DRM, though, has been pretty well thought-out, she said, so most of what she has seen has been elsewhere in the kernel; in the response to a separate question, she reiterated that DRM was never really an impediment to the Rust work, in part because it is so well designed and documented.
Adding functionality to DRM using Rust was also asked about; does it make sense to do so? Paul said that it would make sense because Rust forces the developer to think about things up front, rather than to just get something working quickly and deal with locking or other problems as they arise. That leads to the "if it compiles, it will likely work" nature of Rust code. But, calling Rust from C is difficult, at least for now, so that would limit the ability to use any new Rust features from existing C drivers and other code.
Another question was about getting started today on a KMS driver; would she suggest doing that in C or in Rust? For now, she would recommend C, though that may change eventually. The problem is that there are a lot of missing bindings at this point and whenever she adds functionality to RVKMS, she ends up adding more bindings. Designing bindings requires more overall knowledge of DRM and other KMS drivers in addition to Rust itself. Once most of the bindings are available, though, starting out with Rust will be a reasonable approach.
The last question was about compile time, which is often a problem for larger Rust projects. Paul said that she was "actually surprisingly happy" with the compile time at this point, but it is probably too early to make that determination. Once more Rust code is added into the mix, that will be when the compile-time problem pops up.
[ I would like to thank LWN's travel sponsor, the Linux Foundation, for travel assistance to Montreal for XDC. ]
Index entries for this article | |
---|---|
Kernel | Development tools/Rust |
Kernel | Device drivers/Graphics |
Conference | X.Org Developers Conference/2024 |
Posted Nov 21, 2024 8:13 UTC (Thu)
by MKesper (subscriber, #38539)
Other recent mentions of Nova and its role for Rust and kernel drivers