
How to write Rust in the kernel: part 2

By Daroc Alden
June 27, 2025

Rust in the kernel

In 2023, Fujita Tomonori wrote a Rust version of the existing driver for the Asix AX88796B embedded Ethernet controller. At slightly more than 100 lines, it's about as simple as a driver can be, and therefore is a useful touchstone for the differences between writing Rust and C in the kernel. Looking at the Rust syntax, types, and APIs used by the driver and contrasting them with the C version will help illustrate those differences.

Readers who are already conversant with Rust may find this article retreads some basics, but it is my hope that it can still serve as a useful reference for implementing simple drivers in Rust. The C version and the Rust version of the AX88796B driver are remarkably similar, but there are still some important differences that could trip up a developer performing a naive rewrite from one to the other.

The setup

The least-different thing between the two versions is the legalities. The Rust driver starts with an SPDX comment asserting that the file is covered by the GPL, as many files in the kernel do. Below that is a documentation comment:

    //! Rust Asix PHYs driver
    //!
    //! C version of this driver: [`drivers/net/phy/ax88796b.c`](./ax88796b.c)

As mentioned in the previous article, comments starting with //! contain documentation that applies to the entire file. The next few lines are a use statement, the Rust analogue of #include:

    use kernel::{
        c_str,
        net::phy::{self, reg::C22, DeviceId, Driver},
        prelude::*,
        uapi,
    };

Like C, Rust modules are located starting from a search path and then continuing down a directory tree. Unlike C, a use statement can selectively import only some items defined in a module. For example, DeviceId is not a separate module, but rather a specific item inside the kernel::net::phy module. By importing both kernel::net::phy::DeviceId and kernel::net::phy as a whole, the Rust module can refer to DeviceId directly, and anything else from the PHY module as phy::name. These items can always be referred to by their full paths; a use statement just introduces a shorter local alias. If a name would be ambiguous, the compiler will complain.
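As a rough user-space illustration of these import rules, the sketch below builds a small module tree by hand and imports one item directly alongside the module itself. The names hw, phy, Device, and describe() are made up for this example and are not the kernel's API.

```rust
// Hypothetical module tree standing in for kernel::net::phy.
mod hw {
    pub mod phy {
        #[derive(Debug, PartialEq)]
        pub struct DeviceId(pub u32);

        pub struct Device {
            pub id: DeviceId,
        }

        pub fn describe(dev: &Device) -> String {
            format!("PHY {:#010x}", dev.id.0)
        }
    }
}

// Import the module itself (`self`) and one item from it.
use hw::phy::{self, DeviceId};

fn main() {
    // `DeviceId` was imported directly; `Device` is reached through `phy::`.
    let dev = phy::Device { id: DeviceId(0x003b1841) };
    // The full path always works too; `use` only adds a shorter alias.
    let same = hw::phy::DeviceId(0x003b1841);
    assert_eq!(dev.id, same);
    println!("{}", phy::describe(&dev));
}
```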

All of these imported items come from the kernel crate (Rust library), which contains the bindings between the main kernel and Rust code. In a user-space Rust project, a program would usually also have some imports from std, Rust's standard library, but that isn't possible in the kernel, since the kernel needs more precise control over allocation and other details that the standard library abstracts away. Kernel C developers can't use functions from libc in the kernel for much the same reason. The kernel::prelude module contains kernel replacements for many common standard-library functions; the remainder can be found in core, the subset of std that doesn't allocate.

In the C version of the driver, the next step is to define some constants representing the three different, but related, devices this driver supports: the AX88772A, the AX88772C, and the AX88796B. In Rust, items do not have to be declared before use — the entire file is considered at once. Therefore, Fujita chose to reorder things slightly to keep the code for each board in its own section; the types for each board (PhyAX88772A and so on) are defined later. The next part of the Rust driver is a macro invocation that sets up the necessary symbols for a PHY driver:

    kernel::module_phy_driver! {
        drivers: [PhyAX88772A, PhyAX88772C, PhyAX88796B],
        device_table: [
            DeviceId::new_with_driver::<PhyAX88772A>(),
            DeviceId::new_with_driver::<PhyAX88772C>(),
            DeviceId::new_with_driver::<PhyAX88796B>()
        ],
        name: "rust_asix_phy",
        authors: ["FUJITA Tomonori <fujita.tomonori@gmail.com>"],
        description: "Rust Asix PHYs driver",
        license: "GPL",
    }

Rust macros come in two general kinds: attribute macros, which are written #[macro_name] and modify the item that they appear before, and normal macros, which are written macro_name!(). There is also a less common variant of attribute macros written #![macro_name] which applies to the definition that they appear within. Normal macros can use any matching set of braces to enclose their arguments, but can always be recognized by the mandatory exclamation mark between the name and the braces. The convention is to use parentheses for macros that return a value and braces for macros that are invoked to define a structure (as is the case here), but that is not actually required. Invoking the macro with parentheses would have the same result, but it would make it less obvious to other Rust programmers what is happening.
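To make the delimiter convention concrete, here is a small sketch using a made-up declarative macro, define_board!, which is not part of the kernel. It is invoked once with braces and once with parentheses, and both forms expand to the same kind of definition.

```rust
// A hypothetical macro that defines a unit struct with a `name()` function.
macro_rules! define_board {
    ($ty:ident, $name:expr) => {
        struct $ty;

        impl $ty {
            fn name() -> &'static str {
                $name
            }
        }
    };
}

// Braces, by convention, for a macro that defines items ...
define_board! { BoardA, "board A" }
// ... but parentheses (followed by a semicolon) are accepted as well.
define_board!(BoardB, "board B");

// An attribute written before an item applies to that item; here,
// `derive` generates trait implementations for `Speed`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Speed(u32);

fn main() {
    // `vec!` produces a value, so brackets or parentheses are usual.
    let speeds = vec![Speed(10), Speed(100)];
    println!("{} and {}: {:?}", BoardA::name(), BoardB::name(), speeds);
}
```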

The drivers argument to the macro contains the names of the three board types this driver covers. Each driver has to be associated with information such as the name of the device and the PHY device ID that it should be active for. In the C version of the driver, this is handled by a separate table:

    static struct phy_driver asix_driver[] = { ... };

In the Rust code, this information is stored in the code for each board (see below), since all PHY drivers need to provide it. Overall, the kernel::module_phy_driver!{} macro serves the same role as the module_phy_driver() macro in C.

Next, the Rust driver defines two constants that the code uses later:

    const BMCR_SPEED100: u16 = uapi::BMCR_SPEED100 as u16;
    const BMCR_FULLDPLX: u16 = uapi::BMCR_FULLDPLX as u16;

Every declaration of a value (as opposed to a data structure) in Rust starts with either const or let. The former are compile-time constants — like a simple #define in C. Types are mandatory for const definitions, but optional for let ones. In either case, the type always appears separated from the name by a colon. So, in this case, both constants are u16 values, Rust's unsigned 16-bit integer type. The as u16 part at the end is a cast, since the original uapi::BMCR_* constants being referenced are defined in C and assumed to be 32 or 64 bits by default, depending on the platform.
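The following user-space sketch shows the same pattern outside the kernel. The uapi module here is a hand-written stand-in: its u32 type is an assumption about what the generated bindings look like, while the values are the standard MII register bits.

```rust
// Stand-in for the generated bindings; the `u32` type is an assumption.
mod uapi {
    pub const BMCR_SPEED100: u32 = 0x2000;
    pub const BMCR_FULLDPLX: u32 = 0x0100;
}

// `const` requires a type annotation; `as` performs the narrowing cast.
const BMCR_SPEED100: u16 = uapi::BMCR_SPEED100 as u16;
const BMCR_FULLDPLX: u16 = uapi::BMCR_FULLDPLX as u16;

fn main() {
    // `let` may omit the type; it is inferred as u16 from the operands.
    let both = BMCR_SPEED100 | BMCR_FULLDPLX;
    // The annotation can still be spelled out after a colon.
    let explicit: u16 = both;
    println!("{:#06x}", explicit);
}
```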

An actual function

The final piece of code before the actual drivers is a shared function for performing a soft reset on Asix PHYs:

    // Performs a software PHY reset using the standard
    // BMCR_RESET bit and poll for the reset bit to be cleared.
    // Toggle BMCR_RESET bit off to accommodate broken AX8796B
    // PHY implementation such as used on the Individual
    // Computers' X-Surf 100 Zorro card.
    fn asix_soft_reset(dev: &mut phy::Device) -> Result {
        dev.write(C22::BMCR, 0)?;
        dev.genphy_soft_reset()
    }

There are a few things to notice about this function. First of all, the comment above it is not a documentation comment. This isn't a problem because this function is also private: since it was declared with fn instead of pub fn, it's not visible outside this one module. The C equivalent would be a static function. In Rust, the default is the opposite way around, with functions being private (static) unless declared otherwise.

The argument to the function is an &mut phy::Device called dev. References (written with an &) are in many ways Rust's most prominent feature; they are like pointers, but with compile-time guarantees that certain classes of bugs (such as concurrent mutable access without synchronization) can't happen. In this case, asix_soft_reset() takes a mutable reference (&mut). The compiler guarantees that no other function can have a reference to the same phy::Device at the same time. This means that the body of the function can clear the BMCR register and trigger a soft reset without worrying about concurrent interference.
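A minimal user-space sketch of that exclusivity rule follows. The Device structure here is a made-up stand-in with a single fake register, not the kernel's phy::Device.

```rust
// Hypothetical stand-in for phy::Device: a single fake register.
struct Device {
    bmcr: u16,
}

// Taking `&mut Device` means the caller must hand over exclusive access.
fn clear_bmcr(dev: &mut Device) {
    dev.bmcr = 0;
}

// A shared reference (`&`) only allows reading.
fn read_bmcr(dev: &Device) -> u16 {
    dev.bmcr
}

fn main() {
    let mut dev = Device { bmcr: 0x2100 };

    let exclusive = &mut dev;
    // Uncommenting the next line fails to compile (error E0502): `dev`
    // cannot be borrowed again while `exclusive` is still in use below.
    // let peek = read_bmcr(&dev);
    clear_bmcr(exclusive);

    // The mutable borrow has ended, so shared access is allowed again.
    println!("{:#06x}", read_bmcr(&dev));
}
```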

The last part of the function to understand is the return type, Result, and the "try" operator, ?. In C, a function that could fail often indicates this by returning a special sentinel value, typically a negative number. In Rust, the same thing is true, but the sentinel value is called Err instead, and is one possible value of the Result enumeration. The other value is Ok, which indicates success. Both Err and Ok can carry additional information, but the default in the kernel is for Err to carry an error number, and for Ok to have no additional information.

The pattern of checking for an error and then immediately propagating it to a function's caller is so common that Rust introduced the try operator as a shortcut. Consider the same function from the C version of the driver:

    static int asix_soft_reset(struct phy_device *phydev)
    {
	    int ret;

	    /* Asix PHY won't reset unless reset bit toggles */
	    ret = phy_write(phydev, MII_BMCR, 0);
	    if (ret < 0)
		    return ret;

	    return genphy_soft_reset(phydev);
    }

It performs the same two potentially fallible library function calls, but needs an extra statement to propagate the potential error. In the Rust version, if the first call returns an Err, the try operator automatically returns it. For the second call, note how the line does not end with a semicolon; this means the value of the function call is also the return value of the function as a whole, and therefore any errors will also be returned to the caller. Adding the semicolon by mistake is hard to overlook, however, because the compiler will then complain that the function does not return a Result.
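The same control flow can be tried outside the kernel with a sketch like the one below. The Error type, the Device structure, and its broken flag are all invented for illustration; only the shape of asix_soft_reset() mirrors the driver.

```rust
// Hypothetical error type; the kernel's `Result` carries an errno-style
// error instead.
#[derive(Debug, PartialEq)]
struct Error(i32);

// An alias with a default, so that plain `Result` means `Result<()>`.
type Result<T = ()> = core::result::Result<T, Error>;

// Made-up device with one register and a switch to simulate I/O failure.
struct Device {
    bmcr: u16,
    broken: bool,
}

impl Device {
    fn write_bmcr(&mut self, val: u16) -> Result {
        if self.broken {
            return Err(Error(-5)); // pretend this is -EIO
        }
        self.bmcr = val;
        Ok(())
    }

    fn soft_reset(&mut self) -> Result {
        self.write_bmcr(0x8000) // set the reset bit
    }
}

fn asix_soft_reset(dev: &mut Device) -> Result {
    // `?` returns early with the Err if the write failed.
    dev.write_bmcr(0)?;
    // No semicolon: this call's Result is the function's return value.
    dev.soft_reset()
}

fn main() {
    let mut ok = Device { bmcr: 0x2100, broken: false };
    let mut bad = Device { bmcr: 0x2100, broken: true };
    println!("{:?} {:?}", asix_soft_reset(&mut ok), asix_soft_reset(&mut bad));
    println!("{:#06x} {:#06x}", ok.bmcr, bad.bmcr);
}
```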

The main driver

The actual driver code differs slightly for the three different boards. The simplest is the AX88796B, the implementation of which starts on line 124:

    struct PhyAX88796B;

This is an empty structure. An actual instance of this type has no storage associated with it — it doesn't take up space in other structures, size_of() reports 0, and it has no padding — but there can still be global data for the type as a whole (such as debugging information). In this case, an empty structure is used to implement the Driver abstraction, in order to bundle all of the needed data and functions for a PHY driver together. When the compiler is asked to produce functions that apply to a PhyAX88796B (which the module_phy_driver!{} macro does), it will use this definition:
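The zero-size claim is easy to check in user space. In this sketch, Wrapper and the NAME constant are additions made for illustration; only the unit struct itself matches the driver.

```rust
use std::mem::size_of;

// A unit struct like the driver's: no fields, so no storage.
struct PhyAX88796B;

// Hypothetical container, to show the field adds nothing to its size.
struct Wrapper {
    _driver: PhyAX88796B,
    _id: u32,
}

// Data can still be attached to the type as a whole.
impl PhyAX88796B {
    const NAME: &'static str = "Asix Electronics AX88796B";
}

fn main() {
    println!(
        "{} {} {}",
        size_of::<PhyAX88796B>(),
        size_of::<Wrapper>(),
        PhyAX88796B::NAME
    );
}
```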

    #[vtable]
    impl Driver for PhyAX88796B {
        const NAME: &'static CStr = c_str!("Asix Electronics AX88796B");
        const PHY_DEVICE_ID: DeviceId =
            DeviceId::new_with_model_mask(0x003b1841);

        fn soft_reset(dev: &mut phy::Device) -> Result {
            asix_soft_reset(dev)
        }
    }

The constant and function definitions work in the same way as above. The type of NAME uses a static reference ("&'static CStr"), which is a reference that is valid for the entire lifetime of the program. The C equivalent is a const pointer to the data section of the executable: it is never allocated, freed, or modified, and is therefore fine to dereference anywhere in the program.

The new Rust feature in this part of the driver is the impl block, which is used to implement a trait. Often, a program will have multiple different parts that conform to the same interface. For example, all PHY drivers need to provide a name, associated device ID, and some functions implementing driver operations. In Rust, this kind of common interface is represented by a trait, which lets the compiler perform static type dispatch to select the right implementation based on how the trait functions are called.
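As a sketch of static dispatch, the following user-space example uses a cut-down, hypothetical Driver trait. The real kernel trait has more items and different types, and the boards and device IDs here are invented.

```rust
// A cut-down, hypothetical driver trait.
trait Driver {
    const NAME: &'static str;
    const PHY_DEVICE_ID: u32;

    fn soft_reset(bmcr: &mut u16);
}

struct BoardA;
struct BoardB;

impl Driver for BoardA {
    const NAME: &'static str = "board A";
    const PHY_DEVICE_ID: u32 = 0x0000_0001; // made-up ID

    fn soft_reset(bmcr: &mut u16) {
        *bmcr = 0;
    }
}

impl Driver for BoardB {
    const NAME: &'static str = "board B";
    const PHY_DEVICE_ID: u32 = 0x0000_0002; // made-up ID

    fn soft_reset(bmcr: &mut u16) {
        *bmcr = 0x8000;
    }
}

// Generic over any `Driver`; the compiler generates one copy per type
// and picks the implementation at compile time.
fn reset_with<D: Driver>(bmcr: &mut u16) -> String {
    D::soft_reset(bmcr);
    format!("{} ({:#010x}) -> {:#06x}", D::NAME, D::PHY_DEVICE_ID, *bmcr)
}

fn main() {
    let mut bmcr = 0x2100;
    println!("{}", reset_with::<BoardA>(&mut bmcr));
    println!("{}", reset_with::<BoardB>(&mut bmcr));
}
```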

C, of course, does not work like this (although _Generic can sometimes be used to implement type dispatch manually). In the kernel's C code, PHY drivers are represented by a structure that contains data and function pointers. The #[vtable] macro converts a Rust trait into a singular C structure full of function pointers. Up above, in the call to module_phy_driver!{}, the reference to the PhyAX88796B type lets the compiler find the right Driver implementation, and from there produce the correct C structure to integrate with the C PHY driver infrastructure.

There are obviously more functions involved in implementing a complete PHY driver. Luckily, these functions are often the same between different devices, because there is a standard interface for PHY devices. The C PHY driver code will fall back to a generic implementation if a more specific function isn't present in the driver's definition, so the AX88796B code can leave them out. The other two devices supported in this driver specify more custom functions to work around hardware quirks, but those functions are not much more complicated than what has already been shown.
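The fallback idea can be sketched by hand in user-space Rust. This is not what #[vtable] generates, and DriverOps, link_up, and the generic helper are all hypothetical names; the sketch only shows how a table of optional function pointers lets the core substitute a generic implementation when a driver leaves an entry empty.

```rust
// Hypothetical stand-in for a C table of function pointers such as
// `struct phy_driver`; `None` plays the role of a NULL pointer.
struct DriverOps {
    name: &'static str,
    soft_reset: Option<fn(&mut u16) -> i32>,
    link_up: Option<fn(u16) -> bool>,
}

// Stand-in for the shared PHY code: test the link-status bit of BMSR.
fn generic_link_up(bmsr: u16) -> bool {
    bmsr & 0x0004 != 0
}

// Device-specific quirk: clear the register before setting the reset bit.
fn asix_soft_reset(bmcr: &mut u16) -> i32 {
    *bmcr = 0;
    *bmcr |= 0x8000;
    0
}

// Only the quirky operation is filled in; `link_up` is left out.
static AX88796B_OPS: DriverOps = DriverOps {
    name: "Asix Electronics AX88796B",
    soft_reset: Some(asix_soft_reset),
    link_up: None,
};

// The core calls the driver's function if present, else the generic one.
fn link_up(ops: &DriverOps, bmsr: u16) -> bool {
    match ops.link_up {
        Some(f) => f(bmsr),
        None => generic_link_up(bmsr),
    }
}

fn main() {
    let mut bmcr = 0x2100;
    if let Some(reset) = AX88796B_OPS.soft_reset {
        reset(&mut bmcr);
    }
    let up = link_up(&AX88796B_OPS, 0x0004);
    println!("{}: bmcr={:#06x} link={}", AX88796B_OPS.name, bmcr, up);
}
```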

Summary

Steps to implement a PHY driver ...

... in C:

1. Write module boilerplate (licensing and authorship information, #include statements, etc.).
2. Implement the needed functions for the driver, skipping functions that can use the generic PHY code.
3. Bundle the functions along with a name, optional flags, and PHY device ID into a struct phy_driver and register it with the PHY subsystem.

... in Rust:

1. Write module boilerplate (licensing and authorship information, use statements, a call to module_phy_driver!{}).
2. Implement the needed functions for the driver, skipping functions that can use the generic PHY code.
3. Bundle the functions along with a name, optional flags, and PHY device ID into a trait; the #[vtable] macro converts it into the right form for the PHY subsystem.

Of course, many drivers have specific hardware concerns or other complications; kernel software is distinguished by its complexity and concern with low-level details. The next article in this series will look at the design of the interface between the C and Rust code in the kernel, as well as the process of adding new bindings when necessary.




Not a big fan of #vtable

Posted Jun 29, 2025 0:18 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

Hm. I'm not a big fan of #vtable and the autoregistration. The C code with its manual tables was easier to understand.

Not a big fan of #vtable

Posted Jun 29, 2025 3:20 UTC (Sun) by iabervon (subscriber, #722) [Link] (4 responses)

I don't think the registration is more automatic than in the C code; the call to module_phy_driver is just at the top of the file instead of at the bottom.

The "impl Driver for PhyAX88772C" thing is the usual way to associate a bunch of functions that are related together in Rust with what they're used for, and the odd "#[vtable]" thing is just making the idiomatic Rust code produce what the C code needs.

Not a big fan of #vtable

Posted Jun 29, 2025 3:23 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

What will happen if I mis-spell `resume` as `resum`? I haven't checked the macro source, but I think it'll just silently ignore this function?

Not a big fan of #vtable

Posted Jun 29, 2025 3:56 UTC (Sun) by iabervon (subscriber, #722) [Link] (2 responses)

That'll produce an error even before you get to the macro; you can't put anything in "impl Driver for Something" that isn't part of the Driver trait.

Not a big fan of #vtable

Posted Jun 29, 2025 7:31 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

Indeed there's an opportunity (at least as of the nightly on Compiler Explorer) for a small diagnostic improvement

error[E0407]: method `resum` is not a member of trait `Driver`

... but it has no suggestions for how to fix this, whereas in some other contexts it looks for similar symbols and would have suggested we meant `resume` instead, meaning it spells out what we got wrong.

In my experience improving error diagnostics in rustc is very achievable for a person who has some Rust but maybe isn't yet comfortable writing scary bit-banging code with it, and it's a huge confidence boost when the tests pass (rustc has lots of tests for this stuff) and the reviewers send your work to be integrated into a future compiler version. Unlike fiddling with a hardware driver, no reboots are required and people don't need identical hardware to test it.

Not a big fan of #vtable

Posted Jun 29, 2025 9:02 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

Before anyone asks: You also cannot omit any item that the trait declares, unless the trait provides a (default) definition. Otherwise, the implementation is considered incomplete and you get a compile error.
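A small user-space sketch of both rules, using a made-up Driver trait rather than the kernel's:

```rust
// Hypothetical trait: one required item and two with default definitions.
trait Driver {
    const NAME: &'static str; // no default: every impl must supply it

    // Default definitions: an impl may leave these out.
    fn suspend() -> &'static str {
        "generic suspend"
    }

    fn resume() -> &'static str {
        "generic resume"
    }
}

struct Board;

impl Driver for Board {
    // Removing this line is rejected with error E0046 (missing item).
    const NAME: &'static str = "board";

    fn resume() -> &'static str {
        "custom resume"
    }

    // Uncommenting this is rejected with error E0407, since `resum`
    // is not a member of the trait:
    // fn resum() -> &'static str { "typo" }
}

fn main() {
    println!("{}: {}, {}", Board::NAME, Board::suspend(), Board::resume());
}
```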

Are casts encouraged in Rust?

Posted Jun 29, 2025 19:37 UTC (Sun) by alx.manpages (subscriber, #145117) [Link] (17 responses)

const BMCR_SPEED100: u16 = uapi::BMCR_SPEED100 as u16;

sounds like C's

constexpr int16_t BMCR_SPEED100 = (int16_t) BMCR_SPEED100;

Casts are strongly discouraged in C because they silence compiler diagnostics about values changing during conversion (they can drop bits). It is common to aim for zero casts in a program when possible (and often it is possible). Are casts not a problem in Rust?

Are casts encouraged in Rust?

Posted Jun 29, 2025 19:39 UTC (Sun) by alx.manpages (subscriber, #145117) [Link]

Actually, should have written something like

constexpr int16_t BMCR_SPEED100 = (int16_t) uapi_BMCR_SPEED100

Are casts encouraged in Rust?

Posted Jun 29, 2025 20:28 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (10 responses)

Rust has no implicit integer promotions, so all integer conversions must be written explicitly, usually in one of three forms:

* uapi::BMCR_SPEED100.into() - If the value is of a type that can be losslessly converted into the desired type, this is done. Otherwise, it is a compile error. The compiler attempts to infer the type we're converting into from context, but it can be spelled out explicitly (and this must be done if it is ambiguous). This can be equivalently written as u16::from(uapi::BMCR_SPEED100) (which is always unambiguous if we assume the argument's type is known).
* uapi::BMCR_SPEED100.try_into() - Returns a Result<u16, E>, an enum that is either Ok(some value) or Err(some error); for integer conversions, the error type E is TryFromIntError. You can then handle the possible error value in various ways. For numeric conversions, you get an error if it would overflow or lose precision. Again, you can equivalently write u16::try_from(uapi::BMCR_SPEED100).
* uapi::BMCR_SPEED100 as T - A type cast. For numeric types, overflow is handled by truncation (for more details, see [1]). You may only do this for a relatively short list of specific fundamental types, and generally only in cases where it's "obvious" how the compiler should handle it. More importantly, all of these operations are safe and cannot produce UB by themselves (but you can cast between unrelated raw pointer types, and the compiler will just let it happen because raw pointers have no safety properties until you try to dereference them).

Generally, "as" casting is dispreferred compared to the other two options. Despite being safe (in the sense of no UB), it can lose data as you point out.

Unfortunately, neither into() nor try_into() are allowed in const contexts, because those are general conversion traits for converting between arbitrary types (i.e. there's no guarantee they can be evaluated at compile time for all types, so you're not allowed to use them for any types). This is a limitation that the Rust folks are interested in lifting, but it has (apparently) been through multiple rounds of bikeshedding and is still under discussion (see [2]), so I imagine that the Rust-for-Linux folks will not have enabled the relevant unstable feature gate (I did not actually check).

Technical nitpick: It is possible to define into() or try_into() without defining the corresponding from(), and then they are not actually equivalent. This is discouraged in the Into<T> and From<T> docs, which clearly spell out that you should define from() and then the into() will be defined automatically.

[1]: https://doc.rust-lang.org/reference/expressions/operator-...
[2]: https://github.com/rust-lang/rust/issues/67792
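For illustration, the three forms can be compared in a short user-space sketch. The function names are made up, and the values are chosen so that the narrowing conversions actually lose data.

```rust
use std::convert::TryFrom;

// `as` is allowed in a const initializer; out-of-range values are
// truncated rather than rejected.
const TRUNCATED: u16 = 70_000u32 as u16;

// Lossless widening: compiles only because every u8 fits in a u16.
fn widen(v: u8) -> u16 {
    u16::from(v) // the same conversion as `v.into()`
}

// Checked narrowing: out-of-range values become None instead of wrapping.
fn narrow_checked(v: u32) -> Option<u16> {
    u16::try_from(v).ok()
}

// Unchecked narrowing: keeps only the low 16 bits.
fn narrow_truncating(v: u32) -> u16 {
    v as u16
}

fn main() {
    println!("{}", widen(200));
    println!("{:?} {:?}", narrow_checked(1_000), narrow_checked(70_000));
    println!("{} {}", narrow_truncating(70_000), TRUNCATED);
}
```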

Are casts encouraged in Rust?

Posted Jun 30, 2025 2:13 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (9 responses)

Thanks! If they're trying to fix it I guess that's good news.

> Despite being safe (in the sense of no UB), it can lose data as you point out.

Which amounts to being unsafe (regardless of it being memory-safe). Logic bugs can be security bugs.

Are casts encouraged in Rust?

Posted Jun 30, 2025 2:45 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

Rust uses the word "safe" in a highly specific way. If an operation cannot cause UB, either alone or in conjunction with other safe operations, then it is considered safe. The word "safe" does not mean correct, valid, reasonable, or a good idea.

The reason for this framing is so that unsafe operations can be protected by the unsafe keyword. Determining whether an operation is "correct" is clearly beyond the capabilities of a compiler to prove in full generality, so "no UB" is considered an acceptable substitute. When correctness is desired, the usual approach is to make incorrect states impossible to represent, by constructing a type in such a way that all of its valid instances represent valid states or operations. Since producing an invalid instance of a type (e.g. an enum instance which is not any of the enum's variants, a bool with a value other than true or false, or any uninitialized value that occupies nonzero space and isn't MaybeUninit<T>) is considered UB in Rust, this has the practical effect of tying correctness to safety for the purpose of that specific type. If you find a way to do something like this over an entire program, then in principle you can use the Curry-Howard isomorphism to convert that type construction into a proof of correctness, which in turn could be used to formally verify the program. But that kind of construction can get very complicated, and may not be worth it in all situations, hence the existence of unsafe as an escape hatch.

Are casts encouraged in Rust?

Posted Jun 30, 2025 7:38 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

Lots of useful ideas, perhaps beyond what a novice Rust programmer would realise, are trapped behind the problem that trait implementations in Rust aren't today able to be marked constant.

An example beyond what you might expect is that for-each loops can't be constant. Rust's for-each loops always de-sugar into use of the trait call IntoIterator::into_iter to make whatever you've got into an iterator to begin with. This happens even if what you've provided is already an iterator, such conversion is just the identity function so your optimiser will elide the work - so until that trait implementation can be constant itself, the entire for-each loop feature isn't available in constants. You can write a while loop, for example, and that works fine, but not a for-each loop.
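A minimal sketch of the workaround, assuming a stable compiler in which for loops are still rejected in constant evaluation (sum_to is a made-up example function):

```rust
// A `for` loop here would be rejected, because it desugars into a call
// through the `IntoIterator` trait; a `while` loop is accepted.
const fn sum_to(n: u32) -> u32 {
    let mut total = 0;
    let mut i = 1;
    while i <= n {
        total += i;
        i += 1;
    }
    total
}

// Evaluated entirely at compile time.
const SUM: u32 = sum_to(10);

fn main() {
    // The same function can also be called at run time.
    println!("{} {}", SUM, sum_to(4));
}
```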

Are casts encouraged in Rust?

Posted Jun 30, 2025 19:29 UTC (Mon) by iabervon (subscriber, #722) [Link] (6 responses)

I think there's a different set of considerations for compile-time constants than for runtime casts; having "const x: u16 = 65536 as u16;" (or some other syntax for it) shouldn't result in an executable that unconditionally halts immediately with an error message, it should result in a compile error because that's not a sensible way to make a constant 0, but it also can't do anything else.

Are casts encouraged in Rust?

Posted Jun 30, 2025 19:53 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (5 responses)

In C both compile-time (constexpr) and run-time get a diagnostic, but they're different categories of diagnostics, so you can decide which to turn on and/or off.
alx@debian:~/tmp$ cat c.c | grep -nT ^
                  1:	#include <stdint.h>
                  2:	#include <stdlib.h>
                  3:
                  4:	constexpr uint16_t  X = 65536;
                  5:	constexpr uint16_t  Y = (uint16_t) 65536;
                  6:
                  7:	int
                  8:	main(void)
                  9:	{
                 10:		uint32_t  zz = rand(); // generate a run-time u32 for line 14
                 11:
                 12:		const uint16_t  x = 65536;
                 13:		const uint16_t  y = (uint16_t) 65536;
                 14:		const uint16_t  z = zz;
                 15:	}
alx@debian:~/tmp$ clang -Weverything -Wno-unused -Wno-pre-c23-compat -Wno-c++98-compat -std=c23 c.c 
c.c:4:25: error: constexpr initializer evaluates to 65536 which is not exactly representable in type 'const uint16_t' (aka 'const unsigned short')
    4 | constexpr uint16_t  X = 65536;
      |                         ^
c.c:4:25: warning: implicit conversion from 'int' to 'uint16_t' (aka 'unsigned short') changes value from 65536 to 0 [-Wconstant-conversion]
    4 | constexpr uint16_t  X = 65536;
      |                     ~   ^~~~~
c.c:10:17: warning: implicit conversion changes signedness: 'int' to 'uint32_t' (aka 'unsigned int') [-Wsign-conversion]
   10 |         uint32_t  zz = rand(); // generate a run-time u32 for line 14
      |                   ~~   ^~~~~~
c.c:14:22: warning: implicit conversion loses integer precision: 'uint32_t' (aka 'unsigned int') to 'uint16_t' (aka 'unsigned short') [-Wimplicit-int-conversion]
   14 |         const uint16_t  z = zz;
      |                         ~   ^~
c.c:12:22: warning: implicit conversion from 'int' to 'uint16_t' (aka 'unsigned short') changes value from 65536 to 0 [-Wconstant-conversion]
   12 |         const uint16_t  x = 65536;
      |                         ~   ^~~~~
4 warnings and 1 error generated.
I think this is more sensible than the Rust approach.

Are casts encouraged in Rust?

Posted Jun 30, 2025 20:51 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (2 responses)

> I think this is more sensible than the Rust approach.

I disagree.

1. The first error ("constexpr initializer evaluates to 65536...") is also a hard error in Rust by default (see https://doc.rust-lang.org/rustc/lints/listing/deny-by-def...). You can turn it into a warning if you really want to, but I've never heard of anyone choosing to do that.
2. Any warning that mentions an "implicit conversion" in C is a hard error in Rust, and can't be turned into a warning, because Rust simply does not implement those implicit conversions in the first place. You have to write explicit casts or into()/try_into().

Since I can't imagine you are disagreeing with Rust's handling of (1), that leaves us with (2). But I have to say, I have never encountered a situation where C's implicit conversions were anything other than a headache to deal with. I do not want the language magicking my data into a different type without telling me, especially in contexts where I never even asked for a conversion, such as the following:

uint16_t x = 1; // 1 converted from int to uint16_t
uint16_t y = x + 1; // Both operands converted to int, added, then converted back to uint16_t.
// Sure, *this* case is trivial and safe, but is that true every time you write something like this?

In Rust, the literal 1 is ambiguous, but would be interpreted as 1u16 in context (i.e. "a u16 with the value 1"), which is exactly what a reasonable person would expect it to mean.
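The Rust counterpart of that snippet might look like the sketch below; the function names are made up for illustration.

```rust
fn next(x: u16) -> u16 {
    // The literal `1` has no suffix; it is inferred to be a u16 here,
    // and the addition is carried out in u16.
    x + 1
}

fn next_checked(x: u16) -> Option<u16> {
    // Overflow can be made explicit rather than relying on promotion.
    x.checked_add(1)
}

fn widen_and_add(x: u16, y: u32) -> u32 {
    // `x + y` would not compile (mismatched types); the conversion
    // has to be written out.
    u32::from(x) + y
}

fn main() {
    println!("{} {:?} {}", next(1), next_checked(u16::MAX), widen_and_add(1, 70_000));
}
```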

Are casts encouraged in Rust?

Posted Jun 30, 2025 22:56 UTC (Mon) by alx.manpages (subscriber, #145117) [Link] (1 responses)

This is an interesting discussion. You have a point, which I also made with someone else recently.

You're right in calling C's implicit conversions messy, but it's not true of all of them.

C has three types of implicit conversions:

- Integer promotions.

These trigger for any fundamental type narrower than an int. I've called them a "cancer" myself recently. They are there because of historical reasons. I would remove them from the language if I could, but of course we can't at this time.

It is bad that a uint16_t is promoted to an int in almost every situation, which even changes its signedness.

The good thing is that few people actually use narrow integers like short, int16_t, or uint8_t.

The better thing is that the new _BitInt(N) integers added in C23 don't have integer promotions: a _BitInt(16) will not be promoted to an int.

So, I'd say we've partially solved this issue in C, although we're not done yet. We also need to be able to specify literals of such types. I've written a proposal for the C Committee (and an extension request to both GCC and Clang) for that:

<https://github.com/llvm/llvm-project/issues/129256>

- Usual arithmetic conversions.

When adding, comparing or otherwise using two different types of integers in an operator that takes two operands, these trigger.

So, if you have
int   a = 42;
long  b = 7;

if (a < b)
    return a;
you'll get the usual arithmetic conversions to turn that int into a long. Since both retain the original signedness, this conversion is harmless, and doesn't trigger any diagnostics at all. This is a good conversion.

If you had that comparison between integers of different signedness, you could get a warning with -Wsign-compare or -Wsign-conversion (depending if you're comparing them or adding/multiplying/... them, but they're both essentially the same thing).
alx@debian:~/tmp$ cat c.c 
int
main(void)
{
	int            a = 42;
	unsigned long  b = 7;

	if (a < b)
		return 0;
}
alx@debian:~/tmp$ clang -Weverything c.c 
c.c:7:8: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
    7 |         if (a < b)
      |             ~ ^ ~
1 warning generated.
and
alx@debian:~/tmp$ cat c.c
int
main(void)
{
	unsigned int  a = 42;
	int           b = 7;

	if (a < b)
		return 0;
}
alx@debian:~/tmp$ clang -Weverything c.c 
c.c:7:8: warning: comparison of integers of different signs: 'unsigned int' and 'int' [-Wsign-compare]
    7 |         if (a < b)
      |             ~ ^ ~
1 warning generated.
This is sadly not turned on by -Wall -Wextra, but this is one diagnostic that you'd usually want, and most of the time it uncovers subtle bugs.

I said "could", because that diagnostic is not always triggered. It triggers if there can be information loss. There's a case where there can't be information loss: the unsigned integer is turned into a wider signed integer type that can represent all of the values that the unsigned integer can hold:
alx@debian:~/tmp$ cat c.c 
int
main(void)
{
	unsigned int  a = 42;
	long          b = 7;

	if (a < b)
		return 0;
}
alx@debian:~/tmp$ clang -Weverything c.c 
alx@debian:~/tmp$ 


This is another good conversion you want to happen. It's good that we don't diagnose it.

- And then there are implicit conversions as if by assignment.

The C standard describes all implicit conversions as if by simple assignment. These happen, for example, when you assign some integer to a variable of another integer type.

This can be a narrowing conversion, in which case you'll get a very explicit diagnostic:
alx@debian:~/tmp$ cat c.c 
int
main(void)
{
	long l = 42;
	int i = l;
}
alx@debian:~/tmp$ clang -Weverything -Wno-unused c.c 
c.c:5:10: warning: implicit conversion loses integer precision: 'long' to 'int' [-Wshorten-64-to-32]
    5 |         int i = l;
      |             ~   ^
1 warning generated.
Again, this is not in -Wall -Wextra, but you probably want to turn on -Wshorten-64-to-32 (and similar ones) for your projects, and disable it only when you know those conversions are good.

I personally disable it selectively with
#pragma clang diagnostic ignored "-Wshorten-64-to-32"
in a few places in a library where I know that's exactly what I want.

It can also be a sign-changing conversion:
alx@debian:~/tmp$ cat c.c 
int
main(void)
{
	unsigned int l = 42;
	int i = l;
}
alx@debian:~/tmp$ clang -Weverything -Wno-unused c.c 
c.c:5:10: warning: implicit conversion changes signedness: 'unsigned int' to 'int' [-Wsign-conversion]
    5 |         int i = l;
      |             ~   ^
1 warning generated.


which is covered by the same -Wsign-conversion I mentioned earlier, which you also want on all the time, with a few exceptions maybe.

---

So, the -Wall -Wextra compiler diagnostics are a bit lacking, but if you turn on all available diagnostics, they're quite safe. Rust's .into() seems like C's behavior when the diagnostics are on, except that it doesn't allow the few conversions that don't produce any diagnostic in C, and which are actually Good Conversions. Also, Rust's .into() is just typographic noise, because good conversions is what I want all the time.

---

Then there's the issue that Rust is unable to do .into() with constant expressions, which forces you to use casts. That's worse than not allowing the good conversions without noise; this is plain dangerous.
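A rough sketch of the problem and one possible workaround, assuming stable Rust as of mid-2025, where trait methods such as into() and try_into() cannot be called in a const context. The constant name and its i32 type are stand-ins for a generated binding, and checked_u16 is a hypothetical helper, not something from the kernel crate.

```rust
// Stand-in for a mechanically generated constant; the i32 type is an assumption.
const UAPI_BMCR_SPEED100: i32 = 0x2000;

// The cast compiles, and happens to be correct for this value.
const CAST: u16 = UAPI_BMCR_SPEED100 as u16;

// The same cast silently truncates when the value does not fit.
const SILENT: u16 = 65536_i32 as u16;

/// Hypothetical helper: a `const fn` that checks the range before casting.
/// When used to initialize a const, a failed assertion is a compile error.
const fn checked_u16(v: i32) -> u16 {
    assert!(v >= 0 && v <= 0xffff, "constant does not fit in u16");
    v as u16
}

const CHECKED: u16 = checked_u16(UAPI_BMCR_SPEED100);

// Uncommenting this line should fail the build instead of yielding 0:
// const TOO_BIG: u16 = checked_u16(65536);
```

The helper still contains a cast, but only in one place, and it is guarded by a check that runs at compile time.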

Are casts encouraged in Rust?

Posted Jul 1, 2025 1:03 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

> Rust's .into() seems like C's behavior when the diagnostics are on, except that it doesn't allow the few conversions that don't produce any diagnostic in C, and which are actually Good Conversions. Also, Rust's .into() is just typographic noise, because good conversions is what I want all the time.

This is a matter of opinion. I want to do all my conversions at the system boundaries, and then use the proper types throughout the program (preferably full-blown structs and enums, not just raw i32 or whatnot), with minimal or no further conversions after data has been ingested.

Anyway, I think I figured out why Rust does not allow that. Unlike all the other binary operators, the shift operators do support arbitrary mixing of integer types. However, their documentation pages point out that Rust applies a special rule when doing type inference: In the expression a << b (or a >> b), if Rust knows that a and b are both integers (of some possibly-unknown type), then the type inference system is special-cased to infer that the shift expression has the same type as a.

That actually tells me a lot. First of all, if you don't have that rule, it must cause issues, probably because whenever b is ambiguous, type inference can't figure out which overload to use, and just gives up rather than trying each in turn. Trying each in turn would produce the same result in this case (type of the output is the same as type of the left operand), but Rust is not always willing to do that if it can't prove that there's a unique solution (Rust is not C++ and does not want to reinvent SFINAE etc.). But that in turn means that the special case has to be really simplistic, and in particular, it must be possible to apply with partial information - if you only know the type of a and not the type of b, the rule allows you to make progress, because it only depends on the type of a. If you only know the type of b, then the rule doesn't help at all, but at least it does something in the other case. Finally, if you know neither type, then you can at least try to unify the output type with the left operand's type, and maybe that will tell you something.

So, if we wanted to allow a + b with mixed types, and make the output type be the *wider* of the two (instead of the left operand's type), then we'd probably have to give up on this idea of special-casing the type inference machinery (there's no rule you can come up with that will allow making forward progress when one and only one of the types is known). That in turn would lead to whatever problems they were originally having with a << b (i.e. probably the compiler asks for way too many type annotations).
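To make the difference concrete, here is a small sketch with made-up function names. The shift compiles with mixed operand types and the result has the left operand's type. The addition needs the narrower operand widened explicitly first.

```rust
/// Shifts accept mixed integer types; the result has the left operand's type.
fn shift_mixed(a: u8, b: u32) -> u8 {
    a << b
}

/// Addition does not accept mixed types: `a + b` would be rejected here,
/// so the narrower operand is widened explicitly.
fn add_widened(a: u8, b: u32) -> u32 {
    u32::from(a) + b
}
```

For example, shift_mixed(1, 3) is 8 as a u8, and add_widened(200, 100) is 300 as a u32.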

Are casts encouraged in Rust?

Posted Jul 1, 2025 1:00 UTC (Tue) by iabervon (subscriber, #722) [Link] (1 responses)

I'd actually say that the C compilers are suboptimal here because they don't give a warning about line 13: if you want to get 0 directly, you should just use 0, and, in the more likely event that 65536 is coming from a macro or other constant and you want to just get the low bits into this constant (with the high bits going elsewhere), you should probably write it as VALUE & 0xffff, and I don't see any reason that writing (uint16_t) VALUE shouldn't give you a warning about the explicit cast resulting in a different constant value.

Are casts encouraged in Rust?

Posted Jul 1, 2025 2:14 UTC (Tue) by alx.manpages (subscriber, #145117) [Link]

> I'd actually say that the C compilers are suboptimal here because they don't give a warning about line 13

Line 13 has a cast, which precisely means: "compiler, please shut up".

If you want a diagnostic, remove the cast. As I've said, the appropriate number of casts in almost any given program is 0.

Are casts encouraged in Rust?

Posted Jun 29, 2025 20:29 UTC (Sun) by epilys (subscriber, #153643) [Link] (3 responses)

They make sense in a lot of use cases, plus they are opt-in/explicit, so they are neither encouraged nor discouraged in my (humble) opinion. If you make a mistake with casts, it's basically yet another logic bug.

The good news is that if you wish to not allow them in your codebase, you can use lints like `clippy::as_conversions` and deny them globally: https://rust-lang.github.io/rust-clippy/master/#as_conversions
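As a sketch of what that looks like in practice (the function is invented for illustration): the lint can be attached to a single item, or to a whole crate with the #![deny(...)] form. Plain rustc ignores the attribute; only a clippy run enforces it.

```rust
/// With this attribute, `cargo clippy` rejects any `as` cast inside the function.
#[deny(clippy::as_conversions)]
fn low_half(value: u32) -> u16 {
    // `value as u16` would trip the lint, so mask first and convert fallibly.
    u16::try_from(value & 0xffff).expect("masked value always fits in u16")
}
```

Here low_half(0x1_2345) returns 0x2345 without a single cast.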

Are casts encouraged in Rust?

Posted Jul 2, 2025 17:09 UTC (Wed) by pbonzini (subscriber, #60935) [Link]

... Or even filter out which "as" conversions you want to allow and which you want to deny.
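One way to do that filtering, sketched with clippy's finer-grained cast lints; which ones a project denies is a policy choice, and the function below is only an illustration. This version rejects truncating and sign-losing casts while still tolerating a lossless widening cast.

```rust
/// Deny the dangerous kinds of `as` cast, but allow lossless ones.
#[deny(clippy::cast_possible_truncation, clippy::cast_sign_loss)]
#[allow(clippy::cast_lossless)]
fn widen_with_cast(v: u16) -> u32 {
    // u16 to u32 can neither truncate nor lose a sign, so clippy accepts it.
    v as u32
}
```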

Are casts encouraged in Rust?

Posted Jul 2, 2025 20:32 UTC (Wed) by alx.manpages (subscriber, #145117) [Link] (1 responses)

But if there's no other way than using a cast (in constant expressions), we have a problem.

Casts are a known problem

Posted Jul 3, 2025 11:54 UTC (Thu) by farnz (subscriber, #17727) [Link]

Casts are a known problem, and (AIUI) people are working to make the "as conversions" clippy lint a warn-by-default lint. There are some issues to resolve before we get there:
  • To do a good job, we need a way to say "you can implement a trait with either const fn or fn, and if you use const fn, we can use this in a const context". We can't, however, say that all conversions happen in a const context, partly for backwards compatibility, and partly because we still want to support conversions that can't be done in a const context. But this would let us support let x: u16 = y.into(); type of safe conversion, or const z: u16 = y.try_into().unwrap_or(0);.
  • There needs to be a carefully considered plan for handling fallible conversions sensibly, and allowing you to choose behaviours. For example, converting u32::MAX to f32 could be a bug, and a reason for a good compile-time error (because f32 has 24 bits of precision, while u32 has 32), or you could want to round down to the first f32 below u32::MAX, or even round up to the next f32 above u32::MAX.
  • You don't want to make the obviously correct code unclear or hard to write. If a conversion is, in general, fallible, but in this specific instance is known to be infallible, I should be able to use an infallible conversion operator. For example, const MAX_SIZE: usize = 0xfe; const ERROR_INDICATOR: u8 = MAX_SIZE.into(); should work, since it's obviously possible to convert 0xfe to a u8, and you don't want to force people to write const ERROR_INDICATOR: u8 = MAX_SIZE.try_into().expect("Too big for u8"); when it's obviously infallible in context.

There's thus a lot of design work in getting this right; it's OK if the initial solution only solves for some of these problems, as long as it doesn't block off useful bits of the design space for other solutions. I'd thus expect that we'll have a decent solution to the first problem - traits usable in const and non-const contexts if they implement a function as const fn, but only in non-const contexts if they implement it as fn - long before the other 2 get solved, because it's the lynchpin on which the other two stand - and indeed, work on this is actively happening, leading to an early design that's being pushed around until people are confident that it's a good design.
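To illustrate the second point with today's tools, here is a sketch of a hypothetical helper that only accepts u32 values an f32 can represent exactly. An as cast, by contrast, rounds to the nearest representable value, so u32::MAX as f32 comes out as 4294967296.0.

```rust
/// Hypothetical helper: convert only when the value survives the round trip.
fn u32_to_f32_exact(v: u32) -> Option<f32> {
    let f = v as f32;
    // Compare as u64: a float-to-u32 cast saturates at u32::MAX, which would
    // hide the fact that u32::MAX was rounded up to 2^32.
    if f as u64 == u64::from(v) {
        Some(f)
    } else {
        None
    }
}
```

With this, u32_to_f32_exact(1 << 24) succeeds, while (1 << 24) + 1 and u32::MAX both return None because neither fits in 24 bits of precision.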

Are casts encouraged in Rust?

Posted Jun 29, 2025 20:44 UTC (Sun) by excors (subscriber, #95769) [Link]

Casts are still bad. But I think the reason it doesn't need casts in C is that phy_read() returns an int which contains either an unsigned 16-bit value or an error that's usually represented as a negative error code (but sometimes as 0xffff). Once the caller has checked for negative error codes, the value is semantically a u16 but still stored in an int (because C makes it hard to use the correct type here), and gets bit-masked with constants that are semantically u16 but actually an int literal (via the uapi #define, which doesn't specify a type so C defaults to int). That doesn't seem ideal either.

The Rust driver's dev.read() returns a semantically-appropriate Result<u16>, which becomes u16 after handling the error case. The uapi module is mechanically generated from the C code, and I think it emits the constants as (probably?) i32 because it doesn't know how the constants are going to be used so it'll pick a default. So you need a cast from i32 to u16, which is ugly. The nice solution would be to get uapi to define the constants as u16, but those definitions need to be shared between C and Rust, and the C code doesn't use the correct types.
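A self-contained sketch of that contrast, not the real kernel API: c_style_read and read_bmcr are invented stand-ins, the register value is made up, and the i32 type of the constant is only a guess at what the bindings emit.

```rust
// Stand-in for the generated binding; its i32 type is an assumption.
const UAPI_BMCR_SPEED100: i32 = 0x2000;

/// C convention: a negative return is an error code, anything else is a
/// 16-bit register value carried in an int.
fn c_style_read(simulate_error: bool) -> i32 {
    if simulate_error { -5 } else { 0x3100 }
}

/// Rust convention: the error and the value get separate types.
fn read_bmcr(simulate_error: bool) -> Result<u16, i32> {
    let raw = c_style_read(simulate_error);
    if raw < 0 {
        return Err(raw);
    }
    // Keep only the low 16 bits, which is all a PHY register holds.
    Ok((raw & 0xffff) as u16)
}

fn link_is_100mbit(simulate_error: bool) -> Result<bool, i32> {
    let bmcr = read_bmcr(simulate_error)?;
    // The value is a u16 but the constant is not, hence the cast.
    Ok((bmcr & (UAPI_BMCR_SPEED100 as u16)) != 0)
}
```

After the ? operator, bmcr is a plain u16 with no error state left in it, and the remaining cast is the one on the constant.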

const BMCR_SPEED100 := uapi::BMCR_SPEED100;

Posted Jun 30, 2025 19:57 UTC (Mon) by adobriyan (subscriber, #30858) [Link] (2 responses)

> Types are mandatory for const definitions, but optional for let ones.

Maybe they'll change that. The moment a C programmer gets a taste of type inference, they strike back!

I, for one, think that compile time constants should be in ℤ for as long as possible and silently converted if there is no loss of information.

const BMCR_SPEED100 := uapi::BMCR_SPEED100;

Posted Jul 1, 2025 2:34 UTC (Tue) by neilbrown (subscriber, #359) [Link] (1 responses)

> I, for one, think that compile time constants should be in ℤ for as long as possible

Or ℚ. go-lang claims to keep full precision for constants. I think it does for integers, but rationals become floats too soon (last I checked).

const BMCR_SPEED100 := uapi::BMCR_SPEED100;

Posted Jul 4, 2025 11:30 UTC (Fri) by massimiliano (subscriber, #3048) [Link]

What go-lang does is nice in theory, but it has the side effect of having a sort of "language inside the language" for const values, which has the same syntax as the more general go-lang but different semantics. That "language" is used to compute const-expression values and nothing else.

What I do not like about the go-lang approach is that this "const-expression language" is very specific and cannot apply to user-defined types with const values. It is OK for go-lang because of its poor type system, but IMHO it would make no sense in Rust.

In Rust there's this concept of "const context", where one can use a subset of the regular Rust language and know that it will be evaluated at compile time, but (crucially) this works for any type, not just primitive ones.

Over time, the subset of Rust usable in const context gets larger. Still, the language itself (and its semantics) remains the same, and I deeply appreciate the cleanliness of this approach and the fact that I can write entire functions that work on my own types and have them evaluated at compile time.
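A small sketch of what that looks like, with an invented RegisterMask type and constant names. The same methods work at run time and in const items, and the last line is an assertion that is checked during compilation.

```rust
/// Invented user-defined type; every method is usable in const context.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RegisterMask {
    bits: u16,
}

impl RegisterMask {
    const fn new(bits: u16) -> Self {
        Self { bits }
    }

    const fn union(self, other: Self) -> Self {
        Self { bits: self.bits | other.bits }
    }

    const fn contains(self, other: Self) -> bool {
        self.bits & other.bits == other.bits
    }
}

const SPEED100: RegisterMask = RegisterMask::new(0x2000);
const FULL_DUPLEX: RegisterMask = RegisterMask::new(0x0100);

// Computed entirely at compile time from a user-defined type.
const FAST_FULL: RegisterMask = SPEED100.union(FULL_DUPLEX);

// A compile-time assertion: the build fails if this is ever false.
const _: () = assert!(FAST_FULL.contains(SPEED100));
```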

Now, this is not Zig's "comptime"... for that, there are macros, but this would open a different can of worms!


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds