LWN: Comments on "Rust's incremental compiler architecture" https://lwn.net/Articles/997784/ This is a special feed containing comments posted to the individual LWN article titled "Rust's incremental compiler architecture". en-us Sun, 31 Aug 2025 20:26:12 +0000 Sun, 31 Aug 2025 20:26:12 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Cross-crate dead code elimination https://lwn.net/Articles/1002114/ https://lwn.net/Articles/1002114/ saethlin <div class="FormattedComment"> In general, we try rather hard to make sure that compile errors are surfaced, even in dead code. And dead code detection is really hard on generic code. So we pretty much need to run all the code through the compiler frontend (type-checking and borrow-checking) regardless of whether the code is reachable.<br> <p> But apart from that, you can do a crude version of this right now by setting the flags -Zmir-opt-level=0 and -Zcross-crate-inline-threshold=always in your dependencies. Cargo profile overrides can do this unstably, for example:<br> <p> [profile.release.package.aws_sdk_ec2]<br> rustflags = ["-Zmir-opt-level=0", "-Zcross-crate-inline-threshold=always"]<br> <p> This basically delays all codegen to dependencies of the named crate, and then you get dead code elimination through the mechanism that rustc already uses to only lower required functions through the codegen backend. 
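Spelled out as a complete manifest, that override might look like this (a sketch only: per-package `rustflags` come from the unstable `profile-rustflags` cargo feature, so this needs a nightly toolchain, and `aws_sdk_ec2` is just the example dependency named above):

```toml
# Sketch: profile rustflags are unstable, so the cargo feature must be
# enabled explicitly and the build run with nightly cargo.
cargo-features = ["profile-rustflags"]

[package]
name = "example"
version = "0.1.0"
edition = "2021"

# Delay all codegen for this dependency so only functions actually
# reachable from the final binary get lowered through the backend.
[profile.release.package.aws_sdk_ec2]
rustflags = ["-Zmir-opt-level=0", "-Zcross-crate-inline-threshold=always"]
```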
But those flags also may or may not ruin the incrementality of your builds, and may also make the entire compilation single-threaded, because the whole program is a single codegen unit based on your main.<br> <p> I've worked a fair bit on this part of the compiler recently.<br> <p> </div> Sat, 14 Dec 2024 02:32:31 +0000 Cross-crate dead code elimination https://lwn.net/Articles/1001058/ https://lwn.net/Articles/1001058/ farnz <p>There's already parallelism between crates that could be extended to support this; the compiler prioritises finishing the <a href="https://rustc-dev-guide.rust-lang.org/backend/libs-and-metadata.html#rmeta"><tt>.rmeta</tt> component of the build process</a> and signals when the <tt>.rmeta</tt> file exists so that you can start building downstream crates before codegen takes place (but you can't start linking the final output until all the <tt>.rlib</tt> files are built). <p>At least in theory, since the <tt>.rmeta</tt> file contains everything that's needed to build downstream crates bar the actual code you link, you could extend the query system all the way to the linker, so that a "build" just produces the <tt>.rmeta</tt> files, and the link phase uses queries to get everything it needs to create the final output. Fri, 06 Dec 2024 11:24:08 +0000 Clever system ! https://lwn.net/Articles/1001037/ https://lwn.net/Articles/1001037/ donald.buczek <div class="FormattedComment"> <span class="QuotedText">&gt; * Rust doesn't have a stable ABI.</span><br> <p> It could be added, though, that Rust can both offer and use the C API and ABI. 
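As a minimal sketch of what offering a C-compatible ABI looks like (the function name here is made up for illustration):

```rust
// Sketch: exporting a Rust function with a C-compatible ABI.
// `extern "C"` selects the C calling convention; `#[no_mangle]`
// keeps the symbol name unmangled so a C caller can link against it.
#[no_mangle]
pub extern "C" fn add(a: i32, b: i32) -> i32 {
    a + b
}

fn main() {
    // A C caller would declare: int32_t add(int32_t a, int32_t b);
    println!("{}", add(2, 3)); // prints 5
}
```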
So it's not that Rust is less capable than C or C++, just that you wouldn't normally restrict Rust libraries for Rust users in that way.<br> </div> Fri, 06 Dec 2024 06:23:04 +0000 Cross-crate dead code elimination https://lwn.net/Articles/1000942/ https://lwn.net/Articles/1000942/ intgr <div class="FormattedComment"> <span class="QuotedText">&gt; When they run the compiler, it will evaluate a top-level query to produce an output binary. That query will in turn query for the final versions of each function in the crate (after type-checking and optimization).</span><br> <p> <span class="QuotedText">&gt; In Rust, an entire crate is a single translation unit</span><br> <p> I figure this means that dependency crates are still built in their entirety?<br> <p> If my top-level crate includes a large dependency crate, but I only use 1 function out of that dependency, ideally this query approach would be able to skip compiling everything in the dependency except for that function. And maybe some transitive dependency can be skipped entirely.<br> <p> But that would require breaking down the crate boundary and making it transparent to the query system.<br> <p> I guess fast incremental compilation is the initial goal of this work.<br> But hopefully it's on the roadmap to enable such cross-crate dead code elimination ahead of compilation.<br> <p> </div> Thu, 05 Dec 2024 18:54:44 +0000 Release vs development builds https://lwn.net/Articles/1000808/ https://lwn.net/Articles/1000808/ intelfx <div class="FormattedComment"> <span class="QuotedText">&gt; Does this mean "when rustc is doing a development build" or "in development builds of rustc"? 
</span><br> <p> The former.<br> <p> The default settings are cited here: <a href="https://doc.rust-lang.org/cargo/reference/profiles.html#dev">https://doc.rust-lang.org/cargo/reference/profiles.html#dev</a><br> </div> Thu, 05 Dec 2024 12:54:02 +0000 Release vs development builds https://lwn.net/Articles/1000807/ https://lwn.net/Articles/1000807/ excors <div class="FormattedComment"> <span class="QuotedText">&gt; (default doesn't seem to be documented there)</span><br> <p> It looks like that's documented in the "Default profiles" section: the `dev` profile (default for `cargo build` etc) has `incremental = true`, and `release` has `incremental = false`.<br> </div> Thu, 05 Dec 2024 12:53:26 +0000 Release vs development builds https://lwn.net/Articles/1000806/ https://lwn.net/Articles/1000806/ bjackman <div class="FormattedComment"> <span class="QuotedText">&gt; incremental compilation has been stable enough to become the default for development builds. Release builds still avoid incremental compilation by default</span><br> <p> Does this mean "when rustc is doing a development build" or "in development builds of rustc"? <br> <p> I found <a href="https://doc.rust-lang.org/cargo/reference/profiles.html">https://doc.rust-lang.org/cargo/reference/profiles.html</a> which says how to enable it but I wonder if my dev builds already do so (default doesn't seem to be documented there).<br> </div> Thu, 05 Dec 2024 12:46:30 +0000 Clever system ! 
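Those defaults can be overridden per profile; a minimal sketch of opting release builds into incremental compilation in `Cargo.toml`:

```toml
# Sketch: the `release` profile defaults to incremental = false;
# this override turns incremental compilation back on for it.
[profile.release]
incremental = true
```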
https://lwn.net/Articles/1000793/ https://lwn.net/Articles/1000793/ samlh <blockquote>[...]what is the difference between a crate and a library?</blockquote> To keep it short: <ul> <li>A crate is the rust unit of compilation, composed of 1 or more rust source files <li>Crates can be compiled as libraries or executables <li>Library crates get compiled to a <code>.rlib</code> file which contains compiled binary code along with metadata for: <ul> <li>exported apis, static variables, constants, etc <li>generic function implementations (in a partially compiled form) <li>other stuff :) [for example, when LTO is involved, the rlib can contain LLVM bitcode] </ul> <li>Think of .rlibs as equivalent to a C++ static library + header files + linking instructions. <li>The .rlib format (and the Rust abi more generally) is not stable across compiler releases, so .rlib files are usually treated as temporary artifacts (similar to .o files). <li>When compiling the final executable, the binary code from the rlibs is statically linked together, along with instantiations of the imported generic functions (as needed). <li>It is also possible to compile crates as C-style static libraries (libfoo.a) or shared objects (libfoo.so). This is often done in order to <a href="https://docs.rust-embedded.org/book/interoperability/rust-with-c.html">expose a stable C-compatible abi</a> to be used externally. </ul> <p>See <a href="https://doc.rust-lang.org/rust-by-example/crates.html">Rust by example</a> and <a href="https://doc.rust-lang.org/book/ch07-01-packages-and-crates.html">the Rust book</a> for more info. Thu, 05 Dec 2024 07:31:24 +0000 Clever system ! 
https://lwn.net/Articles/1000753/ https://lwn.net/Articles/1000753/ mathstuf <div class="FormattedComment"> <span class="QuotedText">&gt; For C++ templates, if you export a template from a module without providing the source code, it would also need to end up in a binary representation for it to be monomorphized by any user of that library.</span><br> <p> No one has written anything about what this looks like in the Itanium ABI (or any other for that matter). AFAIK, modules still need to ship the module sources (just like headers before). Basically, PIMPL is still relevant and no, modules don't change the decl/impl file split. I don't see any appetite for no-module-interface library deployment from implementations, but maybe I'm just not listening in the right places.<br> <p> <span class="QuotedText">&gt; There is an experimental draft of a cross-compiler ABI for this</span><br> <p> Are you talking about Microsoft's IFC format? That still doesn't resolve the "need to ship module source files" problem.<br> <p> <span class="QuotedText">&gt; C++ does NOT define a stable ABI</span><br> <p> Nor will the language (in any likelihood). What is stable is the implementations' guarantees about how they generate ABIs when compiling. That has been sufficient so far.<br> <p> (FD, should have mentioned in the prior comment, but I was rushing out the door: I implemented C++20 module support in CMake, co-chair WG21's SG15 Tooling, and am very involved in "how to build modules" processes as I hear about them.)<br> </div> Wed, 04 Dec 2024 16:30:30 +0000 Clever system ! 
https://lwn.net/Articles/1000675/ https://lwn.net/Articles/1000675/ ebee_matteo <div class="FormattedComment"> For C++ templates, if you export a template from a module without providing the source code, it would also need to end up in a binary representation for it to be monomorphized by any user of that library.<br> <p> There is an experimental draft of a cross-compiler ABI for this, but to my knowledge this is not part of any C++ standard, just some kind of gentlemen's agreement among compiler writers. People are still working on this, and it is nowhere near complete. ABI stability is not even guaranteed across different versions of the same compiler (same as Rust, actually).<br> <p> C++ does NOT define a stable ABI; it just kinda happened to settle down after decades of use. And for modules, which are a new feature, there is nothing uniform.<br> </div> Wed, 04 Dec 2024 12:48:16 +0000 Clever system ! https://lwn.net/Articles/1000674/ https://lwn.net/Articles/1000674/ mathstuf <div class="FormattedComment"> <span class="QuotedText">&gt; Incidentally, C++20 modules introduce a lot of the same issues also to that language :-).</span><br> <p> Hmm. I'm not seeing the connections to your bullet points. C++20 modules certainly support distributing ABI-stable libraries (and roughly on the same order of magnitude of difficulty as pre-modules) better than Rust does. Care to elaborate?<br> </div> Wed, 04 Dec 2024 12:28:40 +0000 Clever system ! https://lwn.net/Articles/1000671/ https://lwn.net/Articles/1000671/ ebee_matteo <div class="FormattedComment"> There are some things to keep in mind (I am boiling the following down a bit to keep it simple, so sorry if it is imprecise in places):<br> <p> * Rust doesn't have a stable ABI.<br> <p> * in Rust, everything is typically statically linked together; a manifest can specify different features which can cause dependent libraries to be recompiled when feature selection changes. 
In this sense, a library crate (a crate can also be binary-only, of course) is not too far away from a static library, but since there is no ABI stability guarantee for Rust, everything needs to be recompiled each time the compiler / toolchain changes.<br> <p> * A lot of types use generic parameters. These are monomorphized at compile time. A library crate cannot possibly know all the ways a parameterized type is instantiated by its users. Type information hence needs to be encoded in the library, e.g. by emitting an intermediate representation for it from the AST, or the source code for crates needs to be available at compilation time for all their users.<br> <p> * it gets worse with stuff such as proc macros :-)<br> <p> So, a library crate is conceptually a library, but not a static or dynamic library in the C/C++ sense, which I think is what the OP's comment was asking about.<br> <p> Incidentally, C++20 modules introduce a lot of the same issues also to that language :-).<br> <p> </div> Wed, 04 Dec 2024 11:03:28 +0000 Clever system ! https://lwn.net/Articles/1000670/ https://lwn.net/Articles/1000670/ Karellen <div class="FormattedComment"> As someone who's not yet got into Rust (any day now, promise!) what is the difference between a crate and a library?<br> </div> Wed, 04 Dec 2024 10:43:13 +0000 Clever system ! https://lwn.net/Articles/1000668/ https://lwn.net/Articles/1000668/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; I bet compiling files one by one was already a (painful) "optimization" when C was invented: maybe because computers were simply too limited at the time to compile the whole program at once?</span><br> <p> Okay, I'm talking ten years later, but we made extensive use of libraries. Compute time was expensive and pre-compiled libraries saved a lot of that. 
Plus, as various people keep going on about modern optimisations and how the "simple model" of how a CPU works no longer fits reality, back then your typical C or Fortran or whatever code really was pretty much "portable assembler" so a decent programmer could optimise the hell out of code at a source level.<br> <p> So compiling individual files to object code and stashing them in a library just made obvious sense. And even back then (as virtual memory was becoming the norm) you would have a "text" and "data" portion so the library code would only exist as one copy, even when many people were using it. In Pr1mos each subroutine/function call would create a data frame on the user's stack (or allocate space for a data-frame in the executable at link time - this obviously couldn't recurse ...).<br> <p> Our company computer back then, for example, only had 256KB of RAM (or was it KW, i.e. 512KB? I never really knew). Compare that to your typical PC of the era, which had 4KB and maxed out at 64KB. My Jupiter Ace had a massive 19KB.<br> <p> Cheers,<br> Wol<br> </div> Wed, 04 Dec 2024 08:52:23 +0000 Clever system ! https://lwn.net/Articles/1000663/ https://lwn.net/Articles/1000663/ marcH <div class="FormattedComment"> It looks crazy at first, then it just looks unusual and finally it all makes sense:<br> <p> <span class="QuotedText">&gt; This approach is reminiscent of a build system — and that's deliberate. Build systems, while not always perfect, usually do a good job of tracking the dependencies between components and correctly caching partial results. A compiler that has the architecture of a build system can take advantage of the research that has been done on build systems to do the same.</span><br> <p> This is how most optimizations are achieved: don't repeatedly recompute/refetch what you already did in the past; cache it instead. 
The actually "crazy inefficient" thing has been using, for decades, a pipeline approach to compilation instead of a graph of dependencies.<br> <p> I bet compiling files one by one was already a (painful) "optimization" when C was invented: maybe because computers were simply too limited at the time to compile the whole program at once?<br> </div> Wed, 04 Dec 2024 07:58:56 +0000 Clever system ! https://lwn.net/Articles/1000643/ https://lwn.net/Articles/1000643/ matp75 <div class="FormattedComment"> This is impressive, and a bit crazy, to do and achieve with a compiler! Completely agree that the rebuild-on-change case totally makes sense to optimize.<br> </div> Tue, 03 Dec 2024 20:52:36 +0000