
Compiling Rust to readable C with Eurydice

By Daroc Alden
January 30, 2026

A few years ago, the only way to compile Rust code was using the rustc compiler with LLVM as a backend. Since then, several projects, including Mutabah's Rust Compiler (mrustc), GCC's Rust support (gccrs), rust_codegen_gcc, and Cranelift have made enormous progress on diversifying Rust's compiler implementations. The most recent such project, Eurydice, has a more ambitious goal: converting Rust code to clean C code. This is especially useful in high-assurance software, where existing verification and compliance tools expect C. Until such tools can be updated to work with Rust, Eurydice could provide a smoother transition for these projects, as well as a stepping-stone for environments that have a C compiler but no working Rust compiler. Eurydice has been used to compile some post-quantum-cryptography routines from Rust to C, for example.

Eurydice was started in 2023, and includes some code under the MIT license and some under the Apache-2.0 license. It's part of the Aeneas project, which develops several tools for applying formal verification to Rust code. The various Aeneas projects are maintained by a group of people employed by Inria (France's national computer-science-research institution) and Microsoft, but they do accept outside contributions.

Eurydice follows the same general structure as many compilers: take a Rust program, convert it into an intermediate representation (IR), modify the IR with a series of passes, and then output it as code in a lower-level language (in this case, C). Jonathan Protzenko, the most prolific contributor to Eurydice, has a blog post where he explains the project's approach. Unlike other compilers, however, Eurydice is concerned with preserving the overall structure of the code while removing constructs that exist in Rust but not in C. For example, consider this Rust function that calculates the least common multiple of two numbers using their greatest common divisor:

    fn gcd(a: u64, b: u64) -> u64 {
        if b == 0 {
            a
        } else {
            gcd(b, a%b)
        }
    }

    fn lcm(a: u64, b: u64) -> u64 {
        (a * b) / gcd(a, b)
    }

Here's how Eurydice compiles those functions to C:

    uint64_t example_gcd(uint64_t a, uint64_t b)
    {
        uint64_t uu____0;
        if (b == 0ULL)
        {
            uu____0 = a;
        }
        else
        {
            uu____0 = example_gcd(b, a % b);
        }
        return uu____0;
    }

    uint64_t example_lcm(uint64_t a, uint64_t b)
    {
        uint64_t uu____0 = a * b;
        return uu____0 / example_gcd(a, b);
    }

Whether this C code counts as "readable" is probably a matter of individual taste. It does, however, preserve the structure of the code. Even the evaluation order of the original is preserved by adding extra temporary variables (uu____0 in example_lcm()) where necessary to define an order. (Rust guarantees that if the multiplication overflows and causes a panic, that will happen before any side effects caused by calling example_gcd(), but C only guarantees that if the multiplication is performed in a separate statement.) Compiling the same functions with rustc results in a pair of entangled loops filled with bit-twiddling operations, instead — which is appropriate for machine-code output, but much less readable.
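The ordering guarantee on the Rust side can be observed directly. The following is a minimal sketch, not taken from Eurydice or its test suite: traced_mul(), traced_gcd(), and lcm_with_trace() are hypothetical helpers that record when each operand of the division is evaluated, relying on Rust's rule that operands are evaluated left to right.

```rust
use std::cell::RefCell;

// Hypothetical helper: performs the multiplication and records that it ran.
fn traced_mul(a: u64, b: u64, log: &RefCell<Vec<&'static str>>) -> u64 {
    log.borrow_mut().push("mul");
    a * b
}

// Same algorithm as gcd() in the article.
fn gcd(a: u64, b: u64) -> u64 {
    if b == 0 {
        a
    } else {
        gcd(b, a % b)
    }
}

// Hypothetical helper: computes the GCD and records that it ran.
fn traced_gcd(a: u64, b: u64, log: &RefCell<Vec<&'static str>>) -> u64 {
    log.borrow_mut().push("gcd");
    gcd(a, b)
}

// Same shape as lcm() in the article; returns the result together with
// the order in which the two operands of the division were evaluated.
fn lcm_with_trace(a: u64, b: u64) -> (u64, Vec<&'static str>) {
    let log = RefCell::new(Vec::new());
    let result = traced_mul(a, b, &log) / traced_gcd(a, b, &log);
    (result, log.into_inner())
}

fn main() {
    let (result, order) = lcm_with_trace(4, 6);
    println!("lcm = {result}, evaluation order = {order:?}");
}
```

In C, the equivalent single expression would leave the relative order of the two calls unspecified, which is why the generated code hoists the multiplication into its own statement.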

Of course, not all Rust programs can be faithfully represented in C. For example, for loops that use an iterator instead of a range need to be compiled to while loops that call into some of Eurydice's support code to manage the state of the iterator. More importantly, C has no concept of generics, so Rust code needs to be monomorphized during conversion. This can result in several different implementations of a function that differ only by type — often, the more idiomatic C approach would be to use macros or void * arguments.
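Both transformations can be illustrated in Rust itself. This is a hand-written sketch, not Eurydice output: the function names (sum_for, sum_while, max_of, max_of_u8, max_of_u32) are made up for illustration, and the names and support code that Eurydice actually generates will differ.

```rust
// A `for` loop over an iterator...
fn sum_for(items: &[u32]) -> u32 {
    let mut total = 0;
    for item in items.iter() {
        total += *item;
    }
    total
}

// ...behaves like a `while` loop that advances the iterator state by hand,
// which is the shape that has to be expressed in C.
fn sum_while(items: &[u32]) -> u32 {
    let mut total = 0;
    let mut iter = items.iter();
    while let Some(item) = iter.next() {
        total += *item;
    }
    total
}

// Generic source: one definition that works for any ordered type.
fn max_of<T: PartialOrd>(a: T, b: T) -> T {
    if a > b {
        a
    } else {
        b
    }
}

// Hand-written stand-ins for the monomorphized copies: one function per
// concrete type that the program actually uses.
fn max_of_u8(a: u8, b: u8) -> u8 {
    if a > b {
        a
    } else {
        b
    }
}

fn max_of_u32(a: u32, b: u32) -> u32 {
    if a > b {
        a
    } else {
        b
    }
}

fn main() {
    println!("{} {}", sum_for(&[1, 2, 3]), sum_while(&[1, 2, 3]));
    println!("{} {}", max_of(3u8, 9u8), max_of_u8(3, 9));
    println!("{} {}", max_of(7u32, 2u32), max_of_u32(7, 2));
}
```

The two monomorphized functions have identical bodies and differ only in their signatures, which is the duplication the paragraph above describes.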

The implementation of dynamically sized types also poses certain challenges. In Rust, a structure can be defined where one of its fields does not have a fixed size — like flexible array members in C:

    struct DynamicallySized<U: ?Sized> {
        header: usize,
        my_data: U, // The compiler does not know the size of U, here
    }

But if that structure is generic, and one of the generic users of the type gives the flexibly sized field a type with a known size, the compiler can take advantage of that knowledge to elide bounds checks where appropriate.

    let foo: DynamicallySized<[u8; 4]> = ...;
    // No bounds check emitted, since the array size is known to be 4:
    let bar = foo.my_data[2];

This kind of separation, where some parts of the code may know the size of a type and some may not, is an important semantic detail to preserve in C because of how it interacts with the possibility of formal verification. If Eurydice compiled DynamicallySized to use a flexible array member everywhere, analysis of the C code might point out "missing" bounds checks that were not required in Rust. Conversely, if Eurydice added extra bounds checks, it would need to manufacture extra error paths that don't appear in the Rust source and that should be completely unused.

So, Eurydice emits two different types: a version of the dynamically sized type that has a flexible array member, and one that has a known-length array member. Converting between the two representations is a no-op at run time, but it technically violates C's strict-aliasing rule. Therefore Protzenko recommends compiling Eurydice-generated code with -fno-strict-aliasing.
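The two views that Eurydice has to mirror already exist on the Rust side, where a reference to the sized version can be coerced to a reference to the unsized one. The sketch below is illustrative only; third_byte_sized(), third_byte_unsized(), and read_both_ways() are hypothetical names, and nothing here shows the C that Eurydice generates.

```rust
// Same shape as the struct in the article.
struct DynamicallySized<U: ?Sized> {
    header: usize,
    my_data: U,
}

// Sized view: the array length is part of the type, so a constant index
// can be validated at compile time.
fn third_byte_sized(foo: &DynamicallySized<[u8; 4]>) -> u8 {
    foo.my_data[2]
}

// Unsized view: the length is only known at run time, so the access
// has to be checked.
fn third_byte_unsized(foo: &DynamicallySized<[u8]>) -> Option<u8> {
    foo.my_data.get(2).copied()
}

// Hypothetical driver: builds one value and reads it through both views.
fn read_both_ways() -> (usize, u8, Option<u8>) {
    let foo = DynamicallySized {
        header: 4,
        my_data: [10u8, 20, 30, 40],
    };
    // Unsizing coercion: no data is copied, only the reference's type changes.
    let view: &DynamicallySized<[u8]> = &foo;
    (view.header, third_byte_sized(&foo), third_byte_unsized(view))
}

fn main() {
    println!("{:?}", read_both_ways());
}
```

Both functions read the same memory; only the type through which it is viewed differs, which corresponds to the no-op conversion between the two C representations.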

Associated tooling

This approach, of compiling a more abstract language to C in a way that preserves the structure of the code, is not new. The KaRaMeL project, upon which Eurydice is based, does the same thing for the F* programming language. F* is a dependently typed functional programming language used to develop cryptographic libraries. Compiling provably correct F* programs to equivalent C lets those libraries be used in programs where performance is a concern.

Unfortunately, Eurydice doesn't currently scale much beyond small examples. Rather than implement its own parser and typechecker for Rust code, Eurydice uses another Aeneas tool — Charon — to extract the parsed and preprocessed program from rustc. When I tested Charon on a variety of Rust packages, it was routinely foiled by more recent Rust features such as const generics.

When Charon does work, however, it dumps rustc's medium-level intermediate representation (MIR) as JSON, along with any compiler flags necessary to understand the compilation. Eurydice reads this JSON representation and converts it to KaRaMeL's intermediate representation. Then it uses a series of small passes over the KaRaMeL code to eliminate some Rust-specific details, before handing things over to the same code-generation logic that KaRaMeL uses for F*.

In its current form, Eurydice works best for small, self-contained programs that avoid complex Rust features. Within that niche, however, it works well. The generated code maintains the same structure as the original Rust code, except for places where Eurydice emits extra intermediate variables or needs some glue code to implement a more complicated feature. On the other hand, small self-contained code is also the easiest to rewrite by hand, so bringing in Eurydice is probably only worthwhile if the original Rust code is going to be updated and one wants an automatic way to keep the C version in sync with it. In any case, Eurydice is only the newest tool in a rapidly expanding collection of ways to fold, spindle, and mutilate Rust code to fit into more environments.

[ Thanks to Henri Sivonen for the topic suggestion. ]




Rustc?

Posted Jan 30, 2026 19:15 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

How close is it to being able to compile rustc? This would neatly solve most of the bootstrap questions.

Rustc?

Posted Jan 30, 2026 19:32 UTC (Fri) by comex (subscriber, #71521) [Link] (7 responses)

If you mean bootstrapping by transpiling rustc to C and using that generated C code as the trusted seed, then I don't think that would be a good solution. Even if the C code is somewhat readable, it's not going to be as readable as the original source for the reasons described in the article. And even if it were completely readable, that doesn't mean anyone is going to actually read it. It would still be easier to hide a backdoor there than in a codebase maintained by hand.

If you mean bootstrapping by using Eurydice itself as the trusted seed, then that's theoretically possible (since Eurydice is written in OCaml) but not particularly different from mrustc.

Regardless, if Eurydice is "routinely foiled by more recent Rust features" then it's nowhere near being able to compile rustc, which uses ~all the features, including many unstable ones.

Rustc?

Posted Jan 30, 2026 20:07 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

I don't really think that a backdoor in the compiler is really a problematic case? As long as we can prove that the Eurydice output is deterministic, we can be sure that it's not backdoored.

And more practically, it creates a reasonably static bootstrap chain that allows us to start with just a C compiler.

Rustc?

Posted Jan 30, 2026 23:04 UTC (Fri) by notriddle (subscriber, #130608) [Link] (5 responses)

The problem with using Eurydice itself as your bootstrap seed is that it requires OCaml, which is self-hosting.

If your goal is to minimize the number of binary dependencies, then you're better off using mrustc, which only requires a C++ compiler (which you probably already have as part of your bootstrap chain).

Rustc?

Posted Jan 30, 2026 23:14 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

D'Oh. For some reason, I assumed that Eurydice is written in Rust.

mrustc is indeed another way, but it's a large complicated C++ code base, with unclear long-term prospects.

Rustc?

Posted Jan 31, 2026 1:11 UTC (Sat) by josh (subscriber, #17465) [Link]

Even if mrustc stopped development tomorrow, the source would continue to exist as a project capable of compiling rustc 1.74.0. That would be sufficient to substantially shorten the bootstrap chain.

Rustc?

Posted Jan 31, 2026 12:17 UTC (Sat) by ballombe (subscriber, #9523) [Link] (1 responses)

> The problem with using Eurydice itself as your bootstrap seed is that it requires OCaml, which is self-hosting.

You can run Eurydice on a different machine. Then you only need a C compiler.

Rustc?

Posted Feb 1, 2026 2:02 UTC (Sun) by pabs (subscriber, #43278) [Link]

But you can't run Eurydice without building it first, which requires running OCaml, which requires building OCaml, which requires running OCaml etc. camlboot from above might help bootstrapping OCaml though.

See also https://bootstrappable.org/ https://lwn.net/Articles/983340/ for some of the philosophy behind this, in short, don't trust any binaries, even of the Linux kernel etc, and see if you can get a Linux binary starting from scratch. In the bootstrappable.org/live-bootstrap case, they start from a small seed of commented machine code in hex encoding, that you can manually understand and write to a disk, then start the bootstrap.

Rustc?

Posted Feb 1, 2026 1:57 UTC (Sun) by pabs (subscriber, #43278) [Link]

There are folks working on OCaml bootstrap too btw:

https://github.com/Ekdohibs/camlboot/

example

Posted Jan 30, 2026 21:58 UTC (Fri) by ballombe (subscriber, #9523) [Link] (3 responses)

The example code is not quite good, the lcm should be computed as
(a / gcd(a, b) ) * b
so that overflows only occur when the lcm is too large to fit in a u64.
Also lcm(0,0) is not handled correctly (it should be 0, not 0/0)

example

Posted Jan 30, 2026 22:54 UTC (Fri) by neggles (subscriber, #153254) [Link]

Valid, but that's the point - it's not about the quality of the input Rust code, it's about Eurydice maintaining the semantics and ordering of the original Rust code through the conversion. Arguably, using a sub-optimal order of operations (and having the output code take steps to ensure that order is respected) does a better job of demonstrating that than using optimal code, IMO

example

Posted Jan 30, 2026 23:11 UTC (Fri) by jepsis (subscriber, #130218) [Link] (1 responses)

Yup. It should be something like this:


    fn lcm(a: u64, b: u64) -> Option<u64> {
        if a == 0 || b == 0 {
            Some(0)
        } else {
            (a / gcd(a, b)).checked_mul(b)
        }
    }

Or with overflow checks:


    fn lcm(a: u64, b: u64) -> Option<u64> {
        if a == 0 || b == 0 {
            Some(0)
        } else {
            let g = gcd(a, b);
            (a / g).checked_mul(b)
        }
    }

example

Posted Feb 3, 2026 17:03 UTC (Tue) by ttuttle (subscriber, #51118) [Link]

Did you perhaps mean not to include the checked_mul in your first example? (If not, it looks like the only difference between the two is that the second example stores the gcd in a local variable before using it in the final expression.)

Safety critical C

Posted Jan 31, 2026 19:23 UTC (Sat) by SLi (subscriber, #53131) [Link] (5 responses)

> More importantly, C has no concept of generics, so Rust code needs to be monomorphized during conversion. This can result in several different implementations of a function that differ only by type — often, the more idiomatic C approach would be to use macros or void * arguments.

If Eurydice is targeting safety critical domains, I'd say this is both true, from a "normal world developer" perspective, and quite irrelevant. C engineered using safety critical standards doesn't look like "idiomatic C", and my feeling is that at least the safety standards I've dealt with would prefer the way Eurydice does it.

Safety critical processes would likely treat void * as poison to be avoided (likely also heavily restricted) and view macros with *extreme* suspicion. Avoiding code duplication by what would be seen as "a clever void * trick" would be considered living dangerously and risking lives. If you can have static typing, take static typing, even if it means writing the same routine multiple times. And if you cannot, be prepared to explain why not.

I think people would even argue that monomorphic code is better because it makes it explicit that you have a uint8 path and a uint32 path and allows you to ask if you tested both.

Safety critical C

Posted Feb 1, 2026 13:12 UTC (Sun) by mb (subscriber, #50428) [Link]

void* and uintptr_t are handled by Misra-C a bit like Rust handles raw pointers.
At any time it is permitted to create void* and uintptr_t from any pointer.
But if you want to convert it back into a usable pointer type, the Misra-C checker throws a very clear warning.
Now it's up to the development process whether these warnings are allowed and how they are handled.
Typically these things would be avoided by the developers, if there is a good alternative implementation.
But sometimes there isn't a good alternative.

Safety critical C

Posted Feb 2, 2026 21:22 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (3 responses)

It will be very difficult to translate something like the following without monomorphization, so I believe that Eurydice made the right choice regardless of what the safety folks prefer.

    use std::convert::Infallible;

    struct Cat;
    struct Dog;
    enum CatOrDog<A: Animal, B, C> {
        CatData(B, A::CatCertificate),
        DogData(C, A::DogCertificate),
    }
    trait Animal: Sized {
        type CatCertificate: Copy + Sized + 'static;
        type DogCertificate: Copy + Sized + 'static;
        fn unwrap_cat<B, C>(s: CatOrDog<Self, B, C>, c: Self::CatCertificate) -> B;
        fn unwrap_dog<B, C>(s: CatOrDog<Self, B, C>, c: Self::DogCertificate) -> C;
    }
    impl Animal for Cat {
        type CatCertificate = ();
        type DogCertificate = Infallible;
        fn unwrap_cat<B, C>(s: CatOrDog<Self, B, C>, _: Self::CatCertificate) -> B {
            let CatOrDog::CatData(x, _) = s;
            x
        }
        fn unwrap_dog<B, C>(_: CatOrDog<Self, B, C>, c: Self::DogCertificate) -> C {
            match c {}
        }
    }
    impl Animal for Dog {
        type CatCertificate = Infallible;
        type DogCertificate = ();
        fn unwrap_cat<B, C>(_: CatOrDog<Self, B, C>, c: Self::CatCertificate) -> B {
            match c {}
        }
        fn unwrap_dog<B, C>(s: CatOrDog<Self, B, C>, _: Self::DogCertificate) -> C {
            let CatOrDog::DogData(x, _) = s;
            x
        }
    }

CatOrDog is not representable in the C data model. It is, essentially, a compile-time tagged union. That is, it looks and quacks like a tagged union, but the compiler knows which variant is active at compile time, so all other variants disappear during monomorphization, and we emit no code to handle them (the certificates also vanish because they are ZSTs). The post-monomorphization version is trivial to translate into C, because you can just interpret the enum as a newtype around one or the other of its second and third type arguments (as applicable). But the pre-monomorphization version does not live behind a pointer and cannot be trivially represented with void* and such hacks. Note that B and C don't even need to be the same size or alignment, and Rust can avoid allocating unnecessary space for the larger type if it knows that it is dealing with the smaller type. I don't believe that there is any reasonable way to accomplish such a thing in C.

Technically, we also need a bit more supporting infrastructure, such as a way to generically construct these enums when the concrete type of A is not visible. Those methods can go into the Animal trait, because its impls can always "see" the concrete type of Self (even when the caller cannot). The unwrap methods demonstrate this visibility. I have elided the other methods for brevity. And for those confused by match c {}, it translates into English roughly as follows: "There are no possible values of c, so this must be dead code." The compiler will check that this is true, and then allow the match expression to coerce into whatever type we like (in this case, the return type of the function), because type correctness only matters for reachable code.
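The empty match can be tried in isolation. This is a small sketch separate from the CatOrDog example above; absurd() and unwrap_infallible() are hypothetical names chosen for illustration.

```rust
use std::convert::Infallible;

// `Infallible` has no values, so this function can never actually be called.
// The empty match is accepted as producing any type `T`.
fn absurd<T>(never: Infallible) -> T {
    match never {}
}

// A `Result` whose error type is uninhabited can be unwrapped without
// any panic path: the `Err` arm is statically dead code.
fn unwrap_infallible<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(value) => value,
        Err(never) => absurd(never),
    }
}

fn main() {
    let r: Result<u32, Infallible> = Ok(7);
    println!("{}", unwrap_infallible(r));
}
```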

Safety critical C

Posted Feb 3, 2026 16:31 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

Having thought about it overnight, I realized that you probably can make something roughly similar to this using some very nasty preprocessor hacks, but nothing so clean and elegant as void* with casting. I have not thought through a full solution, but it would involve getting a macro to directly emit concrete type names inline everywhere these generics are used (and probably glue those type names into identifier tokens with the ## operator, similar to C++ name mangling). In other words, we would need to use the preprocessor as a full monomorphization engine and build a minimal clone of C++ atop C. Which, uh... yuck, but I don't see why you *can't* do that.

Safety critical C

Posted Feb 3, 2026 16:42 UTC (Tue) by Wol (subscriber, #4433) [Link]

> Which, uh... yuck, but I don't see why you *can't* do that.

Turing complete ... but just because you *can* doesn't mean you *should* :-)

Cheers,
Wol

Safety critical C

Posted Feb 4, 2026 10:55 UTC (Wed) by paulj (subscriber, #341) [Link]

If Eurydice is a transpiler, and it is outputting something to be compiled into a self-standing executable (i.e., it is not having to re-create an API in C), it doesn't need ## hacks, it can just emit a C type for each original boxed type.

If it's trying to re-create an API with generics in C, that is pretty much impossible in any nice way. Best I can find is the API requires the caller to set a #define for the generic type and then include the 'generic' header, e.g.:

#define TYPE foo_t
#include "generic_container.h"

<generic container header can ## together TYPE with its own container box type to create generic types, e.g. generic_container_foo_t, with helper macros to minimise need for caller to know about this, caller calls, say, generic_container_foo_t_whatever(....) and macro takes care of appropriate casts >

#define TYPE bar_t
#include "generic_container.h"

<etc>

The generic_container.h has to have 2 definitions for each 'method' though. The unvarnished one, using some top-level type (e.g. 'void *' or some library specific 'object_t' or whatever). And then macro level definitions to provide the 'generic' version, with the sugar to do the casting to and from the concatenated container-generic type and the unvarnished/top-level 'real' definition. Ugly.

I write this having written a little polymorphism library, with compile-time checked extension of types and with polymorphic interfaces for C (so you could have a "iterator" interface, and various containers could implement that and be useable transparently to a caller). A lot like libcymbal, except managing to do the checking at compile-time rather than runtime (and not as polished of course). I spent a lot of time trying to figure out a way to provide for generics, to get rid of the need for 'void *' or 'foo_container *' equivalent for my containers, and above strategy is the only not-utterly-fugly-way, but still pretty ugly.

At that point, you're better off just sticking with Zig or Rust really.

"recent"

Posted Feb 1, 2026 2:02 UTC (Sun) by tialaramex (subscriber, #21167) [Link] (2 responses)

> it was routinely foiled by more recent Rust features such as const generics.

For anybody new to Rust or indeed largely ignorant of it, the const generics MVP landed in March 2021. So "recent" here means almost five years since it was stabilized.

"recent"

Posted Feb 1, 2026 23:08 UTC (Sun) by riking (subscriber, #95706) [Link] (1 responses)

The MVP, sure. And it's been slowly expanded over time and hasn't reached feature completeness yet. So it's not exactly a stable target to implement.

"recent"

Posted Feb 2, 2026 1:15 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

It's true that in principle if you implemented the full const generics that would naturally also support the MVP, but the exact mechanism for const generics is not, as I understand it, settled.

In C++ the analogous mechanism, "Non-type template parameters" is entirely open-ended and so that's not acceptable in Rust because you can trivially write unsound nonsense that way e.g. is Foo<NaN> the same type as Foo<NaN> ? After all NaN != NaN ...

A complete const generics needs to guarantee that we only produce sound types - the MVP just says OK integers are definitely fine, so we can guarantee the primitive integer types can be used and come back for user defined types later. Thus you'd expect Eurydice could make the same choice by now.

"C has no concept of generics"

Posted Feb 5, 2026 16:15 UTC (Thu) by sethkush (subscriber, #107552) [Link] (2 responses)

"C has no concept of generics"

Doesn't C11? C11 is probably a bit new to require, but could it use C11 _Generic to replace Rust generics as an option?

"C has no concept of generics"

Posted Feb 5, 2026 16:34 UTC (Thu) by daroc (editor, #160859) [Link] (1 responses)

C's _Generic is not a full replacement for Rust generics, because it can't be used to change the argument or return types of a function. On the other hand, I do think it could be used to select between monomorphized functions in the generated C code, if you wanted to.

"C has no concept of generics"

Posted Feb 5, 2026 19:09 UTC (Thu) by sethkush (subscriber, #107552) [Link]

Fascinating. Thanks!


Copyright © 2026, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds