
Compiling Rust with GCC: an update

By Jonathan Corbet
September 9, 2022

Kangrejos
While the Rust language has appeal for kernel development, many developers are concerned by the fact that there is only one compiler available; there are many reasons why a second implementation would be desirable. At the 2022 Kangrejos gathering, three developers described projects to build Rust programs with GCC in two different ways. A fully featured, GCC-based Rust implementation is still going to take some time, but rapid progress is being made.

gccrs

Philip Herron and Arthur Cohen started off with a discussion of gccrs, which is a Rust front-end for the GCC compiler. This project, Herron said, was initially started in 2014, though it subsequently stalled. It was restarted in 2019 and moved slowly until funding for two developers arrived thanks to Open Source Security and Embecosm. Since then, development has been happening much more quickly.

Currently, the developers are targeting Rust 1.49, which is a bit behind the state of the art — it was released at the end of 2020. In some ways the developers are still playing catch-up with even that older version; there are a number of intrinsics missing, for example. In other ways they are trying to get ahead of the game; there has been some work done on const generics that, so far, is really only a parser, "but it's a start". An experimental release of gccrs as part of GCC can be expected in May or June 2023.

Cohen talked about building the Rust for Linux project (the integration of Rust with the Linux kernel) specifically. That project is currently targeting Rust 1.62, which is rather more recent than the 1.49 that gccrs is aiming at; there is thus a fair amount of ground yet to cover even once gccrs hits its target. There are not many differences in the language itself, he said, but there are more in the libraries. Even with the official compiler, Rust for Linux has to set the RUSTC_BOOTSTRAP variable to gain access to unstable features; gccrs is trying to implement the ones that are needed for the kernel. Generic associated types are also needed. Eventually, the goal is for gccrs to be able to compile Rust for Linux.
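
For readers unfamiliar with the term, a generic associated type is an associated type that takes parameters of its own. The following is a minimal sketch only; the `LendingIterator` trait and the `Countdown` type are hypothetical names chosen for illustration and are not taken from the kernel or from gccrs. The feature was still unstable in Rust 1.62 and was stabilized in Rust 1.65, so this sketch needs a newer compiler than either version discussed above.

```rust
/// An iterator whose items may borrow from the iterator itself.
/// `Item<'a>` is the generic associated type: it carries its own lifetime parameter.
trait LendingIterator {
    type Item<'a>
    where
        Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>>;
}

/// Hypothetical example type: lends out a mutable borrow of its own counter.
struct Countdown {
    remaining: u32,
}

impl LendingIterator for Countdown {
    type Item<'a> = &'a mut u32 where Self: 'a;

    fn next<'a>(&'a mut self) -> Option<Self::Item<'a>> {
        if self.remaining == 0 {
            None
        } else {
            self.remaining -= 1;
            Some(&mut self.remaining)
        }
    }
}

/// Drains the countdown and records every value it lent out.
fn drain(mut countdown: Countdown) -> Vec<u32> {
    let mut seen = Vec::new();
    while let Some(value) = countdown.next() {
        seen.push(*value);
    }
    seen
}
```

Calling `drain(Countdown { remaining: 3 })` collects the values lent out on each step. The ordinary `Iterator` trait cannot express this pattern, because its `Item` type has no way to name the lifetime of the borrow taken by `next()`.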

One thing he pointed out is that gccrs is making no attempt to implement the same compiler flags that the existing rustc compiler uses. That would be a difficult task and those options are "not GCCish". A separate wrapper is being implemented for the Cargo build system to allow it to invoke gccrs rather than rustc.

An important component for the kernel (and just about everything else) is the libcore library. It includes fundamental types like Option and Result, without which little can be done. The liballoc library, which implements the Vec and Box types among others, is also needed. Cohen noted that this library has been customized for the kernel, but Rust for Linux developer Miguel Ojeda said that the changes are minimal.
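
As a rough illustration of why so little can be done without these libraries, the sketch below uses all four of the types named above. The function names are hypothetical, and the code uses the standard library for convenience; `std` simply re-exports these types from libcore and liballoc.

```rust
/// `Result` (from libcore) carries either the parsed value or an error.
fn parse_port(text: &str) -> Result<u16, std::num::ParseIntError> {
    text.trim().parse::<u16>()
}

/// `Option` (from libcore) models a value that may be absent.
fn first_even(values: &[u32]) -> Option<u32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// `Vec` and `Box` (from liballoc) own heap-allocated data.
fn boxed_squares(n: u32) -> Box<[u32]> {
    let squares: Vec<u32> = (1..=n).map(|i| i * i).collect();
    squares.into_boxed_slice()
}
```

Even these few lines lean on both libraries, which is why a compiler that cannot build libcore and liballoc cannot build much else.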

Testing is currently done by compiling various projects with gccrs; these include the Blake3 crypto library and libcore 1.49. The rustc test suite is also being used. Plans are to add building Rust for Linux to the testing regime as well.

What else is missing at this point? Herron said that borrow checking is a big missing feature in current gccrs. Opaque types are not yet implemented. Plus, of course, there are a lot of bugs. Cohen added that the test suite needs work. A lot of the tests are intended to fail, so gccrs "passes" them, but for the wrong reason. He is working on adding the proper use of error codes so that only the right kinds of failures are seen as the correct behavior.
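
The problem with expected-failure tests can be sketched as follows. This is a hypothetical model, not the gccrs test suite: the `Outcome` and `Expectation` types and both checking functions are invented for illustration, and the error-code strings merely follow the style of rustc's diagnostic codes.

```rust
/// What a compiler did with one test input (hypothetical model).
#[derive(Debug, Clone, PartialEq, Eq)]
enum Outcome {
    Compiled,
    /// Rejected with a diagnostic carrying this error code.
    Rejected(String),
    /// Failed for an unrelated reason, such as an unimplemented feature.
    Crashed,
}

/// What the test author expects to happen.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Expectation {
    MustCompile,
    MustFailWith(String),
}

/// Naive check: any failure at all satisfies a "must fail" test.
fn naive_pass(expected: &Expectation, actual: &Outcome) -> bool {
    match expected {
        Expectation::MustCompile => *actual == Outcome::Compiled,
        Expectation::MustFailWith(_) => *actual != Outcome::Compiled,
    }
}

/// Stricter check: a "must fail" test passes only on the expected error code.
fn strict_pass(expected: &Expectation, actual: &Outcome) -> bool {
    match (expected, actual) {
        (Expectation::MustCompile, Outcome::Compiled) => true,
        (Expectation::MustFailWith(want), Outcome::Rejected(got)) => want == got,
        _ => false,
    }
}
```

Under the naive check, a compiler that crashes on a test meant to fail is counted as passing; the strict check accepts only a rejection carrying the expected code.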

Future plans include a lot of cross-compiler testing. Eventually it would be good to start testing with Crater, which attempts to compile all of the crates found on crates.io, but that will take longer. With regard to borrow checking, Cohen added, they are not even trying to come up with their own implementation; instead they will be integrating Polonius. This, it is hoped, is a harbinger of more code sharing to be done with the Rust community in the future.

The code repository can be found on GitHub.

rust_codegen_gcc

Antoni Boucher then gave an update on his project, which is called rust_codegen_gcc. The rustc compiler is built on LLVM, he began, but it includes an API that allows the substitution of a different backend for code generation. That API can be used to hook libgccjit into rustc, enabling code generation with GCC. The biggest motivation for this work is to support architectures that LLVM cannot compile for. That is needed for Rust for Linux support across all of the architectures the kernel can be built for, and should be useful for other embedded targets as well.
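
The general idea of a substitutable code-generation backend can be sketched with a trait. This is a toy model under stated assumptions: `Ir`, `Backend`, `LlvmLike`, and `GccJitLike` are hypothetical names, and the real interface inside rustc is far richer than a single `emit()` method.

```rust
/// A toy intermediate representation: a function that returns a constant.
struct Ir {
    name: String,
    returns: i64,
}

/// The front end depends only on this trait, not on a specific code generator.
trait Backend {
    fn name(&self) -> &'static str;
    fn emit(&self, ir: &Ir) -> String;
}

struct LlvmLike;
struct GccJitLike;

impl Backend for LlvmLike {
    fn name(&self) -> &'static str {
        "llvm-like"
    }

    fn emit(&self, ir: &Ir) -> String {
        format!("define i64 @{}() {{ ret i64 {} }}", ir.name, ir.returns)
    }
}

impl Backend for GccJitLike {
    fn name(&self) -> &'static str {
        "gccjit-like"
    }

    fn emit(&self, ir: &Ir) -> String {
        format!("long {}(void) {{ return {}; }}", ir.name, ir.returns)
    }
}

/// The "driver": the same front-end output, handed to whichever backend is selected.
fn compile(ir: &Ir, backend: &dyn Backend) -> String {
    format!("[{}] {}", backend.name(), backend.emit(ir))
}
```

Because the driver only sees `&dyn Backend`, parsing, type checking, and borrow checking stay shared; only the last step changes. That sharing is what allows this approach to track upstream rustc closely.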

Over the last year, rust_codegen_gcc has been merged into the rustc repository. It has gained support for global variables and 128-bit integers (though not yet in big-endian format). Support for SIMD operations has improved; it can compile the tests for x86-64 and pass most of them. It is also now possible to bootstrap rustc with rust_codegen_gcc, which is a big milestone — but some programs don't compile yet. Alignment support has improved, packed structs are supported, inline assembly support is getting better, and some function and variable attributes are supported. Boucher has added many intrinsics that GCC lacks and fixed a lot of crashes.

Almost all of the rustc tests pass now, and the situation is getting better; most of the failures are with SIMD or with unimplemented features like link-time optimization (LTO). With regard to SIMD, the necessary additions to libgccjit are done, as is the "vector shuffle" instruction. About 99% of the LLVM SIMD intrinsics and half of the Rust SIMD intrinsics have been implemented. A lot of fixes and improvements (128-bit integers, for example) have gone into GCC from this work.

There is, of course, a lot still to be done. Outstanding tasks include unwinding for panic() support, proper debuginfo, LTO, and big-endian 128-bit integers. Support for the new architectures has to be added to the libraries, and SIMD support needs to be expanded beyond x86. There are function and variable attributes, including inline, that are yet to be implemented. Supporting distribution via rustup is also on the list.

There are a number of things that could be improved in the rustc API. For example, GCC distinguishes between lvalues (assignment targets) and rvalues, while LLVM just has "values"; that mismatch creates difficulties in places. LLVM has a concept of "landing pads" for exceptions, while GCC uses a try/catch mechanism. The handling of basic blocks is implicit in the API, but needs to be explicit in places. More fundamentally, GCC's API is based on the abstract syntax tree, while LLVM's is based on instructions, leading to confusion at times. LLVM uses the same aggregate operations for structs and arrays, while they are different in GCC.

On the libgccjit side, there is room for improvement in type introspection, including attributes. The time required for code generation and the resulting binary size are both worse than with LLVM. There are optimizations that are missed by libgccjit as well.

Boucher demonstrated the compilation of some basic Rust kernel modules with the GCC-backed compiler. Wedson Almeida Filho asked whether any testing had been done with an architecture that is not supported by LLVM, but that has not yet happened. There will probably be "details to deal with" when that test is done, Boucher said.

There are some potential complications for Rust for Linux. The ABI used by the generated code differs on some platforms. There is also the question of whether backports should be done to support older versions of the compiler. That is complicated by the fact that patches to GCC are needed to make rust_codegen_gcc work now.

Herron and Cohen joined Boucher at the end of the session, where they were asked about their timelines. Herron answered that it will require most of a year for gccrs to get to the point where rust_codegen_gcc is now. When asked about compilation times, Cohen said that benchmarks would not be meaningful now, when the focus is still on just getting something to work. He expects that gccrs is "probably very slow" at this point. Boucher said that rust_codegen_gcc is slower than rustc, but that there are optimizations yet to be done to improve the situation.

[Thanks to LWN subscribers for supporting my travel to this event.]

Index entries for this article
Conference: Kangrejos/2022



Compiling Rust with GCC: an update

Posted Sep 9, 2022 16:24 UTC (Fri) by developer122 (guest, #152928) [Link] (32 responses)

It seems to me the latter approach would be much preferable for targeting additional architectures and keeping all of the rust implementations in step with one another through upstream rustc. That said, people with old toolchains will probably want an all-in-one GNU implementation that doesn't have additional external dependencies.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 18:42 UTC (Fri) by calumapplepie (guest, #143655) [Link] (31 responses)

There are more reasons to want another toolchain for rustc.

Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative. There's no real way to know if the undocumented behavior you're optimizing around is undocumented because nobody bothered to, or undocumented because it'll change tomorrow.

Further, rust is EXTREMELY vulnerable to a trusting-trust attack right now. If, at some point, someone backdoored a rust compiler to add their malicious code to any rust compiler it compiles, then it's very possible that said backdoor has propagated across a chunk of the ecosystem. If someone buried the backdoor as a bug in an old version of rustc, then all rust compilers will have that backdoor, because the only way to get a rust compiler is to bootstrap it from a very old one written in C, using it to compile a slightly newer compiler, and then using that compiler to keep compiling. The defense against trusting-trust attacks is to have multiple compilers; you can make it much harder to do a trusting-trust if it needs to detect when it compiles either gcc-rs or LLVM rustc.

The irony of a trusting-trust style backdoor or bug in rust is that it would be official behavior. It's possible that, right now, rustc only compiles due to a self-propagating 'bug' in an old version of rustc. The behavior of your rustc compiler may differ from what the source code of said compiler says it should be, and *that hidden behavior would be the official rust behavior*.

Having 2 implementations means we can actually look at rust as a language, rather than as a binary program.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 21:08 UTC (Fri) by Gaelan (guest, #145108) [Link] (12 responses)

The latter bit isn’t quite true: people can, and have, bootstrapped rustc from mrustc, a separate C++ implementation. (Specifically, this involves compiling rustc 1.54 with mrustc, then bootstrapping forward to the latest version from there.)

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:23 UTC (Sat) by calumapplepie (guest, #143655) [Link] (11 responses)

I know: I mentioned mrustc, though not by name.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:24 UTC (Sun) by himi (subscriber, #340) [Link] (10 responses)

Yes, I think the point was that mrustc /isn't/ that old - it's actively maintained (supporting up to 1.54 at the moment, that changes periodically) and specifically intended for this bootstrap chain. A rust compiler written in rust but bootstrapped via mrustc can only be "infected" if the infection comes via mrustc, which isn't written in rust, so the trusting trust attack would need to be bootstrapped through the C compiler and into mrustc - unrealistic as the trusting-trust attack is, you'd have to stretch it even further to make it work here.

Now, mrustc specifically /isn't/ a production ready rust compiler - it's minimal, has no borrow checker, and lots of functionality is missing; that's why no one ever talks about it as an alternative implementation. The fact that it wasn't picked as the starting point for any serious attempts at an alternative implementation is telling, too. But it does still break the chain of trust you're concerned about.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 1:04 UTC (Sun) by mjg59 (subscriber, #23239) [Link] (9 responses)

And it's sufficient for David A. Wheeler's Diverse Double-Compiling (DDC): if you build rustc with rustc and also build rustc with mrustc, and then rebuild rustc with both of those outputs, identical output means there's no backdoor (or, alternatively, that both are backdoored, but that would mean whatever you built mrustc with is backdoored, and you can verify that through another round of DDC).
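
The comparison described in this comment can be illustrated with a toy model. It is only a sketch under strong assumptions: builds are taken to be fully deterministic, a "binary" is reduced to two fields, and the `Binary` type and both functions are hypothetical names rather than any real tooling.

```rust
/// A toy "compiler binary": the source it was built from, plus whether it
/// carries a self-propagating backdoor that is invisible in that source.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Binary {
    built_from: &'static str,
    backdoored: bool,
}

/// Toy model of compilation: the output faithfully implements `source`, but a
/// backdoored compiler re-inserts its backdoor whenever it builds "rustc".
fn compile(compiler: &Binary, source: &'static str) -> Binary {
    Binary {
        built_from: source,
        backdoored: compiler.backdoored && source == "rustc",
    }
}

/// Builds the rustc source with both compilers, rebuilds it with each result,
/// and reports whether the two final binaries are identical.
fn ddc_outputs_match(suspect_rustc: &Binary, independent: &Binary) -> bool {
    let stage1_a = compile(suspect_rustc, "rustc");
    let stage1_b = compile(independent, "rustc");
    let stage2_a = compile(&stage1_a, "rustc");
    let stage2_b = compile(&stage1_b, "rustc");
    stage2_a == stage2_b
}
```

In this model, a clean rustc and a clean mrustc yield matching outputs, while a backdoored rustc yields a mismatch. If both starting compilers are subverted in the same way, the outputs match again, which is the caveat the comment itself notes.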

Compiling Rust with GCC: an update

Posted Sep 12, 2022 11:54 UTC (Mon) by paulj (subscriber, #341) [Link] (5 responses)

That statement isn't true though. The risk is reduced, but not eliminated.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:08 UTC (Mon) by mjg59 (subscriber, #23239) [Link] (4 responses)

In what way is it not eliminated?

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:29 UTC (Mon) by paulj (subscriber, #341) [Link] (3 responses)

You're making an assumption that a subversion of a compiler binary can only be carried out by a binary of the same compiler.

There is no good reason to make this assumption. Particularly given the existence of file formats that aggregate different executable blobs together, along with hooks to allow execution to occur on loading. Even without those formats, there is simply no good reason to think the attacker who can (originally) cause a distributed binary of compiler A to be subverted must be limited to targeting the further subversion of /only/ compiler A source and binary.

Hell, even Thompson's original PoC targeted /two/ sets of sources for subversion of output.

The chances that a binary subversion targets your mrustc compiler AND your C++ compiler to compile mrustc may be lower than a subversion targeting just one, but that's kind of assuming you and your work-flow are not specially interesting to a skilled and sufficiently capable attacker. And such assumptions are not a basis to state "it is eliminated".

There are other assumptions in the DDC paper, e.g., that we could dig up some old compiler that existed before our potential-target. But then... we're still trusting a number of things, including the MAC algorithm. And MACs have a finite shelf-life - they weaken over time. Maybe the probability is low, but that depends on the juiciness of the target and the threat-model - and "lower probability" is different to "eliminated", unless you're into hand-waving.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 2:37 UTC (Tue) by mjg59 (subscriber, #23239) [Link] (2 responses)

No, you can generate a trusted compiler via a directly introspectable process. Pick an architecture. Write a trivial assembler directly in machine code. Use that to bootstrap a more competent assembler. Write a trivial C compiler. Use that to build an extremely old version of gcc. Use that to build a modern version of gcc. Use that to build a cross-compiler for whatever architecture you actually care about. You now have a trusted compiler, and the rest of Diverse Double Compilation falls out of that.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 4:01 UTC (Tue) by pabs (subscriber, #43278) [Link]

The Bootstrappable Builds project is an example of doing just that, although they aim to bootstrap all architectures from that initial machine code step, without going through the cross-compiler stage:

https://bootstrappable.org/

Compiling Rust with GCC: an update

Posted Sep 13, 2022 9:54 UTC (Tue) by paulj (subscriber, #341) [Link]

Building your own trusted tool chain is following Thompson's advice on what you need to do to build trust. So... not DDC.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:12 UTC (Mon) by himi (subscriber, #340) [Link] (2 responses)

I'd have to actually check the bootstrap process for rust (it's probably been five years since I last tried), but I have a feeling you have a second step beyond just building rustc with mrustc, though that may be more of a verification step. mrustc doesn't have a lot of the memory safety features of rustc, and only implements the minimal features needed to compile rustc, so I don't think a compiler built by it will behave the same as the compiler built by rustc, even if it's the same version (i.e. mrustc building rustc 1.54 compared with rustc 1.54 building rustc 1.54).

It does still break (or dilute greatly) the chain of trust, but you'd need a more careful verification to make sure nothing hinky was going on.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 2:39 UTC (Tue) by mjg59 (subscriber, #23239) [Link]

I'd expect rustc built with mrustc to behave differently in various ways (such as plausibly not being correctly memory safe in the face of malformed input), but I'd expect the actual compilation process to be the same? It doesn't really matter, though, since you can just do another round of building rustc and you should have the same assertions around it being trustworthy.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 8:30 UTC (Tue) by farnz (subscriber, #17727) [Link]

Assuming mrustc is bug-free against its own spec, the rule is that it'll compile all code that rustc does with the same behaviour in the final compiled binary, but it will also compile programs that rustc rejects as invalid. For this comparison purpose, that's good enough - you can't trust an mrustc-built compiler if you don't also have a rustc-built compiler from the same source.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 0:40 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

> there are no bugs in rustc

This overstates the situation. There are a bunch of promises Rust makes which, if it turns out rustc doesn't keep that promise in some corner case, they'll fix rustc. This is especially the case in safe Rust.

There are though - as you imply - a lot of crucial things which are not specified, particularly in unsafe Rust. For example the atomics model in Rust is basically "You see what your C++ compiler actually does? That". But then the reality in C++ is similar, your C++ compiler's behaviour, even in C++ 11 "mode" is a lot more like the documented C++ 20 atomics than the C++ 11 atomics. That's not because the compiler is doing it wrong in C++ 11 mode, it's just because the C++ 11 documentation was (more) wrong.

We're at the edge of our understanding is the problem. I'm comfortable with this remaining true in unsafe Rust, because the compiler internals and ultimately even CPU internals we're dealing with actually are this murky - but yes this needs to be resolved for safe Rust.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 11:29 UTC (Sat) by developer122 (guest, #152928) [Link] (2 responses)

While trusting trust attacks are hypothetically possible, they're of little to no concern to the vast majority of people.

I can think of only one urban legend where one was successfully deployed. That was in a software environment where all source code for every system component was standardized and provided by AT&T, with every single computer on earth having byte-identical source code and system software.

Unlike GNU, where the tools and libraries are ossified in place and haven't changed in perhaps 30 years, the rust ecosystem is still quite new. Code is frequently being tossed out and rewritten as styles and standards change or needs evolve, making a trusting trust attack particularly hard to pull off.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:51 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

> While trusting trust attacks are hypothetically possible, they're of little to no concern to the vast majority of people.

We shouldn't be discounting attacks like that on the basis of what is popular. Otherwise, we run the risk of repeating the giant mess from the slew of side channel attacks that went from "hypothetically possible" to demonstrable but hard to repeat to causing industry wide changes within a few years. Good news for Rust is that there are multiple implementations already and only likely going to mature with time.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 7:23 UTC (Tue) by kleptog (subscriber, #1183) [Link]

> We shouldn't be discounting attacks like that on the basis of what is popular.

Sure, but on the other hand it's sufficient if only a small group are working on solving the problem. Once it's worked out, we can automate it and roll it out everywhere. It's also relevant that there hasn't been an example found in the wild, which means it's judged very low risk.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:52 UTC (Sat) by rvolgers (guest, #63218) [Link] (6 responses)

> Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative. There's no real way to know if the undocumented behavior you're optimizing around is undocumented because nobody bothered to, or undocumented because it'll change tomorrow.

In a multi-party standard you have to look at *multiple* code bases and then go through a mediation process involving multiple parties if you discover an underspecified area, whereas on Rust you can just go through the normal change process to update the documentation, the code, or both.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 20:33 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (5 responses)

There are cases where we can look at what people did, which broke, and say "That was obviously not supported, you keep both halves". For example there was a case recently where some crates were like "Hey I bet Rust's networking related data structures are literally identical to the C structures, and so I can just point Linux C code doing low-level networking at the Rust structures and it'll work" and one day that stopped working. That was never guaranteed to work, nobody should have done that.

But on the other hand there are cases where people are obliged to guess Rust does something, that there's some behaviour, and yet Rust's docs are basically just a shrug emoji. No behaviour is specified. Suppose I have some 64 byte aligned structures. Lots of them actually. I can make pointers to them, Rust is OK with that. Now, Rust doesn't have pointer arithmetic like C, but what if I turn a pointer into an integer. (unsafe) Rust is OK with this too. Surely the bottom four bits of that integer (at least) are zero, right? That's how aligned pointers work. Well, Rust doesn't formally say so, but it feels reasonable. Now, what if I mask these bits off, and use them to store 4 flags. Now I have a pointer-sized value with a pointer *and* my four flags, hooray. To get the pointer back, surely I mask the bits off back to zero, and turn my integer back into a pointer. No harm done. Does that work? Historically Rust said well, we do not promise this is OK, but it's the only thing we offer that seems appropriate here, and it did work.

Today you have to be more careful, nobody warned you about this, beyond the general warning that what you were doing was "unsafe" but what you were doing might stop working. On some platforms. Or maybe not. You have to either obey Strict Provenance, or you need to say OK, I can't meet these requirements, I opt out of strict provenance and I'll take my chances with this PNVI exposure stuff, and in both cases that has consequences I won't summarise here and it could change.
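
The flag-packing trick described in this comment might look roughly like the sketch below. The `Node` type, the helper names, and the choice of a heap allocation are assumptions made for illustration. The code deliberately uses plain pointer/integer casts, which is exactly the pattern whose guarantees are under discussion, so treat it as an illustration of the idea rather than as recommended practice.

```rust
/// Hypothetical 64-byte-aligned structure, as in the comment's example.
#[repr(align(64))]
struct Node {
    value: u32,
}

/// The low four bits of a 64-byte-aligned address are always zero.
const FLAG_MASK: usize = 0b1111;

/// Stores four flag bits in the unused low bits of the pointer's address.
fn pack(ptr: *mut Node, flags: usize) -> usize {
    let addr = ptr as usize; // pointer-to-integer cast
    debug_assert_eq!(addr & FLAG_MASK, 0, "pointer is not sufficiently aligned");
    addr | (flags & FLAG_MASK)
}

/// Splits a packed word back into the original pointer and the flags.
fn unpack(packed: usize) -> (*mut Node, usize) {
    let ptr = (packed & !FLAG_MASK) as *mut Node; // integer-to-pointer cast
    (ptr, packed & FLAG_MASK)
}

/// Round-trips a heap-allocated node through the packed representation.
fn round_trip(value: u32, flags: usize) -> (u32, usize) {
    let raw = Box::into_raw(Box::new(Node { value }));
    let packed = pack(raw, flags);
    let (ptr, flags_out) = unpack(packed);
    // SAFETY: `ptr` has the same address as `raw`, which came from
    // `Box::into_raw` and has not been freed; rebuilding the Box frees it.
    let node = unsafe { Box::from_raw(ptr) };
    (node.value, flags_out)
}
```

The round trip recovers both the pointed-to value and the flags. Whether the pointer rebuilt from an integer is formally allowed to access the original allocation is the question that the provenance work is trying to settle.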

Compiling Rust with GCC: an update

Posted Sep 10, 2022 21:12 UTC (Sat) by stephen.pollei (subscriber, #125364) [Link] (2 responses)

I watched a YouTube video, "RustConf 2022 - WHAT IF WE PRETENDED UNSAFE CODE WAS NICE, AND THEN IT WAS?" by Aria Beingessner, that talked a little bit about "provenance". Seems like some have reasons why they want to make things in this area a bit more strict.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 22:22 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Yes, that's the work I'm talking about. Just watched it and I felt like Aria's analogy in that video was very strange, while her blog entry, https://faultlore.com/blah/tower-of-weakenings/, was clearer to me, but that might just be different preferences.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:56 UTC (Sun) by khim (subscriber, #9252) [Link]

> Seems like some have reasons on why they want to make things in this area a bit more strict.

It's a simple yet pretty useful psychological trick: if you ask people to follow needlessly strict rules but promise to relax them later, 90% (if not 99%) of people will be happy with them, no matter what rules you invent. And you can talk to the tiny subset of people who want more relaxed rules and try to make them happy, too. If you try to make the rules precise then you can never reach acceptance from everyone.

Heck, the whole Rust building is built on top of that approach: that's what separation of safe and unsafe Rust does.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:33 UTC (Sun) by khim (subscriber, #9252) [Link] (1 responses)

> Today you have to be more careful, nobody warned you about this, beyond the general warning that what you were doing was "unsafe" but what you were doing might stop working. On some platforms. Or maybe not. You have to either obey Strict Provenance, or you need to say OK, I can't meet these requirements, I opt out of strict provenance and I'll take my chances with this PNVI exposure stuff, and in both cases that has consequences I won't summarise here and it could change.

But how can multiple implementations help there? C and C++ do have multiple implementations, they do have an ISO Standard (many of them, actually), yet to this very day nobody knows what can or cannot be done with pointers.

I think this is the last attempt which tried to clarify the issue (and proposal to, you know, make compilers which actually obey the standard as published was explicitly rejected).

At least Rust developers never claimed that they have a normative documentation which explains how unsafe is supposed to work.

C and C++ pretend that they do have such documentation and there are even people who claim that Rust is deficient because of that!

IMNSHO informative documentation is better than something which claims to be normative documentation but which you couldn't use as such.

At least if documentation is informative you know you couldn't use it as a guide.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 9:20 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

https://open-std.org/JTC1/SC22/WG14/www/docs/n3005.pdf is the current state of TS 6010, the draft Technical Specification which is what happened to N2676

So, modulo crazy ISO problems, the C23 standard per se won't mandate this roughly PNVI-address exposed model, but there will be an ISO document separately specifying how this would work. The standard is... rough. But there is limited enthusiasm for figuring out all the fine details while it remains unclear if everybody will even implement it. This only starts to make sense once at least two major compilers (e.g. MSVC and GCC) implement it.

With TS6010 you get most of the optimisations people expect in a modern compiler, and which of course Rust is doing, but you can do a XOR doubly linked list in C, as one example of stunt pointer manipulation that some people still think is a good idea. Of course some optimisations are given up in your doubly linked list code to make this work, but you don't feel that loss in unrelated code.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 14:04 UTC (Tue) by hunger (subscriber, #36242) [Link] (5 responses)

> Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative.

While totally true: How will adding a gcc-based rust compiler change this?

The current process to improve Rust is to write an RFC and provide an implementation for rustc. This is then extensively tested, feedback on the implementation is collected, all available Rust projects will be built to see what breaks, etc. With this process gcc-based Rust will always have to catch up to the "real" rust and will never be a serious alternative! The only way to avoid this is to change the process and make that less code- and more paper-based. The process would need to produce a specification for the compiler teams to implement later. This sounds like a huge step backwards to me! Just look at C++ to see how poorly that works: Compilers are always behind the specification and it is a huge pain for projects to agree on the exact features they can use in their code (without losing too many users that need to stick with older compilers).

But in practice this will probably not be necessary: The gcc-based rust compiler plans to reuse parts of the original rust compiler as those are factored out and become available. E.g. Polonius, the library that should eventually contain the entire borrow checker. This will significantly reduce the costs to maintain the gcc-rust project and is thus obviously a good thing. So in the long-run all possible programs that work with rust code will converge towards a shared frontend, incl. rustc, rust-analyzer, a stand-alone gcc-based rust compiler and more. Many of the Rust features will be in this shared front-end code eventually, at which point we are back at having an implementation defined language. This also re-introduces the trusting-trust issues you bring up: If the shared front-end code produces malicious high-level code, then all non-malicious backend-implementations will faithfully produce malicious binaries from their inputs.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 17:39 UTC (Tue) by atnot (subscriber, #124910) [Link] (4 responses)

My question is: if all this comes to pass as written, then what exactly was the purpose of gccrs over rust-codegen-gcc and mrustc?

It won't be a fully independent implementation that can be used to find ambiguities, as nebulous as that idea was. Folks who insist that Rust must have multiple compilers because that's what C has have seemingly moved the goalposts elsewhere already anyway. It doesn't help bootstrapping because it won't be pure C++ like mrustc. The people who have a weirdly selective worry about dependence on permissively licensed software won't be happy either. Who exactly is left then? GCC developers? People whose mouths froth at the sight of a Code of Conduct? People who can install a modern gcc version and all of the other rust development tools, but not rustc? Are there really enough of those?

We've seen this play out a few times before, even: People demanded gcc frontends for D and Go, which are mostly abandoned and nobody uses, because why would you. I don't see how this vision of gccrs would be any different.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 17:53 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

> People demanded gcc frontends for D and Go, which are mostly abandoned and nobody uses, because why would you

I have seen this claim made in several places but this isn't what happened, at least for Go. It wasn't based on any demand from anyone. The Go team themselves decided to do it.

https://go.dev/blog/gccgo-in-gcc-471

"The Go language has always been defined by a spec, not an implementation. The Go team has written two different compilers that implement that spec: gc and gccgo. Having two different implementations helps ensure that the spec is complete and correct: when the compilers disagree, we fix the spec, and change one or both compilers accordingly. Gc is the original compiler, and the go tool uses it by default. Gccgo is a different implementation with a different focus, and in this post we’ll take a closer look at it."

Compiling Rust with GCC: an update

Posted Sep 13, 2022 19:16 UTC (Tue) by atnot (subscriber, #124910) [Link] (1 responses)

I did indeed not know that, thanks for pointing that out! I'm not sure if it really makes things better though: A frontend for a relatively simple language with the full support of a team that earnestly believed in the specification with multiple implementations approach is today mostly irrelevant, has a single active contributor and appears to be at least half a year behind an already frozen spec. I don't hate the idea of multiple implementations, but it really gives reason to temper your expectations of what gccrs will deliver.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 23:09 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link]

> I did indeed not know that, thanks for pointing that out! I'm not sure if it really makes things better though

It may not be better but it is more accurate. Even if one were to stick to the same conclusion they had originally, it's helpful to validate the data points. The common narrative appears to be that GCC support for various languages including Rust is being added based on unreasonable demands for some unknown but definitely odd reasons, that these efforts will inevitably fail or splinter the language, and that the spec-based approach is an old relic of the past that must be inherently doomed one way or the other. It doesn't leave much room for acknowledging that multiple languages do have several successfully used implementations, or even for a mild curiosity about why things like gccrs are funded by commercial organizations in the first place.

> I don't hate the idea of multiple implementations, but it really gives reason to temper your expectations of what gccrs will deliver.

This is something readily acknowledged in https://github.com/Rust-GCC/gccrs/wiki/Frequently-Asked-Q... but this is to some extent mitigated by funding and the possibility of code sharing via things like Polonius.

Compiling Rust with GCC: an update

Posted Sep 23, 2022 14:42 UTC (Fri) by njs (subscriber, #40338) [Link]

Given that gccrs is funded by the grsecurity folks, and their big product is "the kernel, but compiled with proprietary gcc plugins", I always assumed the motivation for gccrs is that they want to get rust into the gcc pipeline early enough that it's before their plugins.

Compiling Rust with GCC: an update

Posted Sep 15, 2022 3:18 UTC (Thu) by firstyear (subscriber, #89081) [Link]

> Further, rust is EXTREMELY vulnerable to a trusting-trust attack right now. If, at some point, someone backdoored a rust compiler to add their malicious code to any rust compiler it compiles, then it's very possible that said backdoor has propagated across a chunk of the ecosystem.

These attacks just don't happen in reality though. It's "simple to grasp" but "almost impossible to fix" which makes it extremely attractive to a broad audience to spend huge amounts of time writing think pieces about it. When in reality attacks are "complex and difficult to grasp" and "require a lot of smaller broad, annoying fixes".

No one is pulling off these backdoor compiler attacks today. And why would they? Attackers don't attack "ideologically shiny targets" they attack the lowest hanging fruit. Things like lack of mfa, typo-squatting popular libraries, uploading malicious source directly into a library, and more. But I don't see people being willing to acknowledge the broad complex social and technical systems that would actually need to be improved to resolve this.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 17:01 UTC (Fri) by developer122 (guest, #152928) [Link] (22 responses)

Out of curiosity: what widely used, kernel-supported architectures are supported by GCC and not LLVM?

Compiling Rust with GCC: an update

Posted Sep 9, 2022 17:33 UTC (Fri) by atnot (subscriber, #124910) [Link] (11 responses)

Comparing the list between the linux and rustc docs:
- ARC (Synopsys embedded core mostly used for MCUs)
- Xtensa (although there is a third party fork)
- SuperH (only really notably used in SEGA game consoles)
- PA-RISC (out of support since 2013)
- OpenRISC (still finds itself on various embedded boards)
- Nios II (Altera/Intel hard core on FPGAs)
- Itanium (no comment)

I have to say, I'm a bit surprised by this, considering the amount of noise about GCC's greater architecture support. I expected some hard hitters, but none of these seem particularly relevant. Even m68k, which is mostly kept alive for recreational purposes, is already supported upstream. All of these are going to be primitive enough that I don't think the lack of rust support will be relevant for a long long while. Unless it becomes impossible to compile Linux without rust at all.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 20:16 UTC (Fri) by developer122 (guest, #152928) [Link] (1 responses)

Looking at https://github.com/fishinabarrel/linux-kernel-module-rust... it seems that a lot of architectures not currently supported have in fact been removed from LLVM at one point or another. Perhaps as old architectures are dropped from linux the gap will close.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 20:46 UTC (Fri) by atnot (subscriber, #124910) [Link]

That's what I think too. Of these ARC (maybe Xtensa) seem like the only one that is likely to be still considered for current designs. This will only accelerate as ARM and RISC-V continue to displace custom cores.

Assuming a very optimistic timeline of two years until widespread Rust adoption in the Kernel and another two years until the stale distros actually pick up those kernels, I think most of these will be long gone from the Linux tree by the time this is relevant. Or have LLVM backends. Cadence and Synopsys certainly don't lack the resources to make that happen if they want to.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 0:15 UTC (Sat) by ndesaulniers (subscriber, #110768) [Link] (7 responses)

>> widely used

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:31 UTC (Sat) by WolfWings (subscriber, #56790) [Link] (6 responses)

I mean there's still a financially self-sustaining scene of international indie game devs making entirely new games on the Dreamcast, big release day parties, kickstarters making ~25k to fund the titles, etc. SuperH is more widely used than folks realize, just not on desktops or servers, just by/for gamers. :)

Compiling Rust with GCC: an update

Posted Sep 11, 2022 4:39 UTC (Sun) by k8to (guest, #15413) [Link]

When I was at Wind River in the late 90s SuperH was an alive and well dev board type that people were actively developing on. I'm sure the arch made its way into various devices. I have no idea which of them are potentially still alive, of course, but I suspect it's not a trivial amount.

At the same time, it was dwarfed by powerpc, x86, mips, arm, and even sparc.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 7:16 UTC (Sun) by joib (subscriber, #8541) [Link] (4 responses)

Well, the Dreamcast retro gaming scene doesn't run on Linux, does it, so not particularly relevant to the topic of which architectures Linux supports? Still kinda cool, I would have expected Dreamcast to be long gone and forgotten.

(Likewise the above-mentioned xtensa is ubiquitous in the 'maker scene' thanks to the ESP8266/ESP32 family of chips, but those don't run Linux so again not particularly relevant to this discussion. And as a side-note, it seems Espressif is transitioning to RISC-V cores.)

Compiling Rust with GCC: an update

Posted Sep 11, 2022 14:02 UTC (Sun) by mathstuf (subscriber, #69389) [Link] (3 responses)

> Well, the Dreamcast retro gaming scene doesn't run on Linux,

No, but Linux does run on the Dreamcast hardware due to a community effort (with recent updates!): http://linuxdc.sourceforge.net/

Compiling Rust with GCC: an update

Posted Sep 18, 2022 15:45 UTC (Sun) by flussence (guest, #85566) [Link]

Game console homebrew is a weird place. I've noticed substantial *N64* support patches going into mainline during 5.x. Not sure what they were going for, but that's pretty cool.

Compiling Rust with GCC: an update

Posted Sep 23, 2022 11:10 UTC (Fri) by Tobu (subscriber, #24111) [Link] (1 responses)

> with recent updates!

The date in that page seems to be generated on every request. Following the links, updates are from 2012 or so.

Compiling Rust with GCC: an update

Posted Sep 23, 2022 11:35 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Ah…indeed. Oh well :( .

Compiling Rust with GCC: an update

Posted Sep 12, 2022 12:04 UTC (Mon) by moltonel (guest, #45207) [Link]

You might also find it surprising that gcc still doesn't support compiling for Apple M1 on MacOS. Gcc does support M1 on Linux, so this missing target triplet doesn't concern the kernel, but it's still an interesting counterpoint to the "Gcc supports more targets than LLVM" popular wisdom. aarch64-apple-darwin has orders of magnitude more users than all the Gcc-exclusive targets combined.

Compiling Rust with GCC: an update

Posted Sep 9, 2022 17:37 UTC (Fri) by mfuzzey (subscriber, #57966) [Link] (9 responses)

The list of kernel supported architectures not supported by LLVM seems to be

* alpha
* arc
* m68k
* microblaze
* nios2
* openrisc
* parisc
* s390
* sh
* um
* xtensa

alpha and nios2 used to be supported by LLVM but have since been dropped.

As for "widely used" I guess that depends on your perspective

Compiling Rust with GCC: an update

Posted Sep 9, 2022 20:16 UTC (Fri) by developer122 (guest, #152928) [Link]

There is in fact a port of rust to m68k, but only on linux.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 0:14 UTC (Sat) by ndesaulniers (subscriber, #110768) [Link] (3 responses)

S390 and UM (usermode x86) are supported.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 0:17 UTC (Sat) by ndesaulniers (subscriber, #110768) [Link]

In fact we test many different builds of Linux kernel for s390 and um with clang, and boot test them in CI.

s390x vs s390

Posted Sep 10, 2022 0:54 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (1 responses)

I guess Rust only cares about s390x ("modern" 64-bit IBM) whereas I believe the Linux kernel technically builds on (32-bit) s390 despite there presumably not being many (any?) actual 20th century System/390 machines running it in practical use ?

If it ever came down to "Should we support decades old System/390 mainframes or Rust?" that seems like a no brainer but for now at least the ambition isn't anywhere close to that.

s390x vs s390

Posted Sep 10, 2022 12:16 UTC (Sat) by willy (subscriber, #9762) [Link]

Compiling Rust with GCC: an update

Posted Sep 10, 2022 11:18 UTC (Sat) by glaubitz (subscriber, #96452) [Link]

m68k is actually supported by LLVM as an experimental target thanks to efforts of the community ;-).

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:09 UTC (Sat) by linusw (subscriber, #40300) [Link] (2 responses)

There are some further "holes" in that list.

The ARM ISA is not universally supported, specifically not ARMv4 (not even in LLVM in general last time I checked) and I am even uncertain about ARMv5 for rust; both have substantial deployment and aren't going away from the kernel anytime soon.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 10:39 UTC (Sun) by tialaramex (subscriber, #21167) [Link] (1 responses)

Rust's tier support list says armv4t-none-eabi has tier 3. That platform is described as ARMv4 with Thumb and exists particularly to make the Nintendo Gameboy Advance work.

Tier 3 means Rust's CI checks this compiles, but they don't check it works, and it is only supplied with the core library.

Obviously the kernel is comfortable in that world, you can't just TcpStream::connect() from inside the kernel either, although it won't fit on a GBA as I understand it, presumably if you've got a big enough ARMv4 system to run Linux, Rust isn't a difficult problem.
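As a rough illustration of what "only the core library" means in practice, here is a minimal sketch of code that restricts itself to `core`. The function name `checksum16` is hypothetical and is not taken from the kernel or from any real crate; the point is only that nothing in it allocates, does I/O, or needs an operating system, so the same function could also be built in a `#![no_std]` crate.

```rust
// Everything used here comes from `core`, so nothing depends on an allocator,
// threads, files, or sockets.
use core::num::Wrapping;

/// Hypothetical helper: a 16-bit additive checksum over a byte slice.
/// The sum wraps around on overflow instead of panicking.
pub fn checksum16(data: &[u8]) -> u16 {
    let mut sum = Wrapping(0u16);
    for &byte in data {
        sum += Wrapping(u16::from(byte));
    }
    sum.0
}
```

Something like `std::net::TcpStream`, by contrast, lives in `std` and is simply not available to code built this way.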

Compiling Rust with GCC: an update

Posted Sep 11, 2022 12:18 UTC (Sun) by josh (subscriber, #17465) [Link]

> Tier 3 means Rust's CI checks this compiles, but they don't check it works, and it is only supplied with the core library.

That's tier 2 (compiles, but isn't tested). Tier 3 is "this exists in the codebase, and might work, but you'll need to build it yourself".

https://doc.rust-lang.org/nightly/rustc/platform-support....

Compiling Rust with GCC: an update

Posted Sep 9, 2022 21:36 UTC (Fri) by scientes (guest, #83068) [Link] (1 responses)

> About 99% of the LLVM SIMD intrinsics and half of the Rust SIMD intrinsics have been implemented.

So it appears they are compiling the SIMD themselves, and bypassing LLVM and GCC's SIMD support, both of which were pretty good except that no languages could really use them (I gave a talk about this at the 2019 LLVM conference, and wrote a patch series for Zig of which only about half ever got merged, but I would still recommend Zig over C for this, as C's extensions have some problems that cannot be fixed except by starting over from C11.)

> many developers are concerned by the fact that there is only one compiler available;

I don't think 10 independent C++ compilers would be enough to convince Linus to allow C++ in Linux.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 1:38 UTC (Sat) by developer122 (guest, #152928) [Link]

Hell, for many years GCC was pretty much the only game in town.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 12:29 UTC (Mon) by moltonel (guest, #45207) [Link] (8 responses)

It's great to see the progress in both projects, kudos to all involved.

I'm a bit worried reading that gccrs is cherry-picking features (const generics and various kernel-needed features) far beyond their announced target (1.49). I don't want a scenario where I declare an MSRV of 1.49 but unknowingly use a 1.50 feature. Or a gccrs that announces 1.60 compatibility but ignores some "minor" 1.59 feature. Gccrs got a lot of backlash from users worried about language/ecosystem split, it should be very careful with its features roadmap.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 14:37 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (7 responses)

> I don't want a scenario where I declare an MSRV of 1.49 but unknowingly use a 1.50 feature.

You do have CI set up for your MSRV, right? If you have an MSRV but only test/develop with stable, this is already a problem.

> Or a gccrs that announces 1.60 compatibility but ignores some "minor" 1.59 feature.

I think the gcc-rs developers are aware of such things and are unlikely to declare "supports Rust 1.X" without supporting every (compiler feature) that X implies. Or at least as far as the rustc test suite of 1.X is able to diagnose.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 17:10 UTC (Mon) by moltonel (guest, #45207) [Link] (6 responses)

> You do have CI set up for your MSRV, right?

*I* do, but I'm pretty sure that's not universally the case. The point is not that the issue can be tested for, but that there is a new issue to be aware of. Also, I'm willing to bet that some developers will only set up their CI with gccrs, for philosophical reasons or to reduce resource usage.

Gccrs could probably prevent the issue by adopting rustc's #[feature(...)] system, but feature names are a bit implementation-specific, it's not clear how to make that work well with multiple implementations.

> unlikely to declare "supports Rust 1.X" without supporting every (compiler feature) that X implies

That was my understanding so far. But the implementation of const generics before they reached the 1.49 milestone casts a doubt. How will gccrs describe its current feature set ? "1.49 plus const generics as of 1.60 plus features X and Y" ? "1.60 except feature Z" ? Users will simplify that in their head to a single version number, and possibly write an optimistic MSRV to Cargo.toml.

To be fair, these feature differences are surmountable (maybe sites like caniuse.rs could start tracking gccrs and cg_gcc), and it's normal for gccrs to want to release features without exactly following rustc's release history, but as a user I yearn for a simple "gccrs version X matches the features of rustc version Y".
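To make the concern concrete, here is a minimal sketch of how a "base version plus extra features" description differs from a single version number. All types, field names, and feature names below are hypothetical illustrations; neither gccrs nor rustc exposes anything like this.

```rust
use std::collections::BTreeSet;

/// A (major, minor) Rust version such as (1, 49).
pub type Version = (u32, u32);

/// Hypothetical description of a compiler: everything up to `base`,
/// plus individually named features from later releases.
pub struct CompilerSupport {
    pub base: Version,
    pub extra_features: BTreeSet<&'static str>,
}

/// Hypothetical description of a crate: each feature it relies on, paired
/// with the release in which the reference compiler first offered it.
pub struct CrateNeeds {
    pub features: Vec<(&'static str, Version)>,
}

impl CompilerSupport {
    /// Returns the features the crate uses that this compiler lacks.
    /// A feature is covered if it is no newer than `base` or is listed
    /// explicitly in `extra_features`.
    pub fn missing(&self, needs: &CrateNeeds) -> Vec<&'static str> {
        needs
            .features
            .iter()
            .filter(|(name, since)| {
                *since > self.base && !self.extra_features.contains(name)
            })
            .map(|(name, _)| *name)
            .collect()
    }
}
```

With a base of (1, 49) and one extra feature, a crate that needs that feature plus an unrelated one from a later release would still be reported as missing the second. That is the gap a reader loses when "1.49 plus X" gets rounded up to a single, optimistic MSRV.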

Compiling Rust with GCC: an update

Posted Sep 12, 2022 18:38 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (1 responses)

> How will gccrs describe its current feature set ?

If I were the gccrs developers, I'd probably describe it as "1.49 with additional implemented features". Or, more succinctly, "see our issue tracking for specifics because we are still under heavy development".

> Users will simplify that in their head to a single version number, and possibly write an optimistic MSRV to Cargo.toml.

Users will do awful things. What do you suppose anyone can do about it?

> it's normal for gccrs to want to release features without exactly following rustc's release history,

Yeah, railroading any implementation to limitations such as "you cannot make stdlib function X const before you implement $complicated_compiler_feature that took rustc developers 5 years to do" is…really strict and basically setting "none may pass"-style goalposts along this road.

> but as a user I yearn for a simple "gccrs version X matches the features of rustc version Y".

I sympathize, but I've also been in this field long enough to know that such things are usually hiding Lovecraftian contortions to make such a facade.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:41 UTC (Mon) by moltonel (guest, #45207) [Link]

Fair enough, you're right, some amount of messiness is unavoidable. But keeping the feature diff small is a good ideal to aim for.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:24 UTC (Mon) by himi (subscriber, #340) [Link]

>That was my understanding so far. But the implementation of const generics before they reached the 1.49 milestone casts a doubt. How will gccrs describe its current feature set ? "1.49 plus const generics as of 1.60 plus features X and Y" ? "1.60 except feature Z" ? Users will simplify that in their head to a single version number, and possibly write an optimistic MSRV to Cargo.toml.

Don't they just mean that during development they're targeting 1.49 for the moment, and in order to be able to support testing with the kernel (and satisfying the preference from a lot of kernel devs for having multiple compilers) they're also trying to get a move on with some more advanced features the kernel project needs? As in, they haven't set their supported version in stone, they're just using some of them as goal posts, with the kernel stuff as an additional goal post.

Presumably before they announce they're production ready for kernel builds they'll reach something a lot closer to feature parity. And presumably they'll only announce "supports version 1.foo" when they have full feature parity with version 1.foo, for exactly the reasons you've noted.

They're still in fairly early development - maybe wait until they're closer to actual release candidates before dinging them for how they're talking about feature/version support . . .

Compiling Rust with GCC: an update

Posted Sep 13, 2022 11:19 UTC (Tue) by milesrout (subscriber, #126894) [Link] (2 responses)

What's the point of multiple implementations if it's insisted that they all operate completely in lockstep, a few versions behind rustc? What's the point of multiple implementations if the "rustc" version is considered the same thing as the Rust version?

The best thing about having a separate Rust implementation would be that long-needed features could be implemented and used that have been languishing in RFCs for years being bikeshedded to death. How long did it take for const generics to be added? And they're still highly limited. Where are generic associated types? Tracking issue open for FIVE YEARS. Where are DST coercions? The DST coercion tracking issue has been open for SEVEN YEARS, almost as long as Rust has existed, and they're going nowhere.

Compiling Rust with GCC: an update

Posted Oct 1, 2022 12:44 UTC (Sat) by sammythesnake (guest, #17693) [Link] (1 responses)

That sounds like a very risky prospect to me.

If those bikeshedded features do eventually escape the shed and appear in rust (as defined by rustc or by some future official spec) they may not do so unchanged from whatever variant was implemented in whatever codebase jumped the gun. Then what we have is two implementations that are incompatible and codebases designed for each one.

At the very least, it would be necessary to have some way to indicate (analogously with editions?) which version of which features are implemented (e.g. rust 1.55 + shiny-new-feature-1-from-draft-spec-NNNN-x.yy.zzz + shiny-new-feature-2-from-draft-spec-MMMM-a.bb.ccc + ...) that code could indicate it's coded for.

To be able to compile such code, rustc (or any third implementation) would also have to provide compatible support for these incompatible variants. Variants that they explicitly declined to implement pending conclusion of bikeshedding.

I suppose they could be handled similarly to unstable features, though.

If the additions were limited to adding selected features from a future edition (e.g. rust 1.55 + feature-AAAA-from-rust-1.56) then you could reasonably hope that the more involved edition-plus-modifiers thing might at least be *forward* compatible, but still horribly convoluted and developers would likely much prefer to keep the number of modifiers specified to a minimum by simply upping the edition they specify at the first opportunity.

In a worse scenario, though, the features caught in bikeshed limbo might be decisively NACKed leaving two permanently diverged forks of the language. That can't possibly be a good thing.

Perhaps lessons can be learned from "browser feature detection" in various JavaScript libraries which has been a thing for quite some time. Is it reliably non-nightmarish...?

Compiling Rust with GCC: an update

Posted Oct 1, 2022 14:05 UTC (Sat) by Wol (subscriber, #4433) [Link]

I think anything like that is promptly going to fall foul of different people running different rustc.

If there are two major compilers, stuff will be NACK'd if it doesn't run on both. And even if it's bikeshedded over, the "editions" stuff will make sure it's okay.

(And from the efforts the rust people are going to for linux, and also given Linus' very pragmatic not purist attitude, stuff that linux wants is not likely to suffer too much bikeshedding. Even if it does (quite likely) end up as "this only exists in unsafe code".)

Cheers,
Wol

Compiling Rust with GCC: an update

Posted Sep 21, 2022 2:30 UTC (Wed) by basix (guest, #156492) [Link]

Small nitpick: It's RUSTC_BOOTSTRAP, not RUST_BOOTSTRAP.


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds