Compiling Rust with GCC: an update

Compiling Rust with GCC: an update

Posted Sep 9, 2022 18:42 UTC (Fri) by calumapplepie (guest, #143655)
In reply to: Compiling Rust with GCC: an update by developer122
Parent article: Compiling Rust with GCC: an update

There are more reasons to want another toolchain for rustc.

Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative. There's no real way to know if the undocumented behavior you're optimizing around is undocumented because nobody bothered to, or undocumented because it'll change tomorrow.

Further, rust is EXTREMELY vulnerable to a trusting-trust attack right now. If, at some point, someone backdoored a rust compiler to add their malicious code to any rust compiler it compiles, then it's very possible that said backdoor has propagated across a chunk of the ecosystem. If someone buried the backdoor as a bug in an old version of rustc, then all rust compilers will have that backdoor, because the only way to get a rust compiler is to bootstrap it from a very old one written in C, using it to compile a slightly newer compiler, and then using that compiler to keep compiling. The defense against trusting-trust attacks is to have multiple compilers; you can make it much harder to do a trusting-trust if it needs to detect when it compiles either gcc-rs or LLVM rustc.

The irony of a trusting-trust style backdoor or bug in rust is that it would be official behavior. It's possible that, right now, rustc only compiles due to a self-propagating 'bug' in an old version of rustc. The behavior of your rustc compiler may differ from what the source code of said compiler says it should be, and *that hidden behavior would be the official rust behavior*.

Having 2 implementations means we can actually look at rust as a language, rather than as a binary program.



Compiling Rust with GCC: an update

Posted Sep 9, 2022 21:08 UTC (Fri) by Gaelan (guest, #145108) [Link] (12 responses)

The latter bit isn’t quite true: people can, and have, bootstrapped rustc from mrustc, a separate C++ implementation. (Specifically, this involves compiling rustc 1.54 with mrustc, then bootstrapping forward to the latest version from there.)

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:23 UTC (Sat) by calumapplepie (guest, #143655) [Link] (11 responses)

I know: I mentioned mrustc, though not by name.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:24 UTC (Sun) by himi (subscriber, #340) [Link] (10 responses)

Yes, I think the point was that mrustc /isn't/ that old - it's actively maintained (supporting up to 1.54 at the moment, that changes periodically) and specifically intended for this bootstrap chain. A rust compiler written in rust but bootstrapped via mrustc can only be "infected" if the infection comes via mrustc, which isn't written in rust, so the trusting trust attack would need to be bootstrapped through the C compiler and into mrustc - unrealistic as the trusting-trust attack is, you'd have to stretch it even further to make it work here.

Now, mrustc specifically /isn't/ a production ready rust compiler - it's minimal, has no borrow checker, and lots of functionality is missing; that's why no one ever talks about it as an alternative implementation. The fact that it wasn't picked as the starting point for any serious attempts at an alternative implementation is telling, too. But it does still break the chain of trust you're concerned about.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 1:04 UTC (Sun) by mjg59 (subscriber, #23239) [Link] (9 responses)

And it's sufficient for David A Wheeler's Diverse Double-Compiling (DDC) - if you build rustc with rustc and also build rustc with mrustc, and then rebuild rustc with both of those outputs, if you get identical output then there's no backdoor (or alternatively both are backdoored, but that would mean whatever you built mrustc with is backdoored, and you can verify that through another round of DDC).
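The comparison described above can be sketched as a toy model. This is only an illustration, not Wheeler's procedure or any real tooling: the `Binary` struct, its fields, `compile`, and the source labels are all hypothetical, and a three-field "fingerprint" stands in for the actual bytes of an executable. It also assumes builds are fully deterministic and that mrustc itself is clean.

```rust
/// Toy stand-in for a compiler executable (hypothetical, for illustration).
#[derive(Clone)]
struct Binary {
    /// Source code whose behaviour this executable implements.
    implements: &'static str,
    /// Source of the compiler whose code generator emitted this executable.
    emitted_by: &'static str,
    /// Whether a self-propagating backdoor is hidden inside.
    trojan: bool,
}

impl Binary {
    /// Stands in for hashing the real bytes of the executable.
    fn fingerprint(&self) -> String {
        format!("{}|{}|{}", self.implements, self.emitted_by, self.trojan)
    }
}

const RUSTC_SRC: &str = "rustc-src";
const MRUSTC_SRC: &str = "mrustc-src";

/// Deterministic toy compile step. A trojaned compiler re-inserts its
/// payload only when it recognises rustc's source.
fn compile(compiler: &Binary, source: &'static str) -> Binary {
    Binary {
        implements: source,
        emitted_by: compiler.implements,
        trojan: compiler.trojan && source == RUSTC_SRC,
    }
}

/// Returns true when the two second-stage rustc builds are identical.
fn ddc_outputs_match(official_rustc_is_trojaned: bool) -> bool {
    let official_rustc = Binary {
        implements: RUSTC_SRC,
        emitted_by: RUSTC_SRC,
        trojan: official_rustc_is_trojaned,
    };
    // Assumption: mrustc was built by a clean C++ compiler.
    let mrustc = Binary {
        implements: MRUSTC_SRC,
        emitted_by: "cxx-compiler-src",
        trojan: false,
    };

    // Stage 1: build rustc's source with each starting compiler.
    let stage1_a = compile(&official_rustc, RUSTC_SRC);
    let stage1_b = compile(&mrustc, RUSTC_SRC);
    // Stage 2: rebuild rustc with each stage-1 output, then compare.
    let stage2_a = compile(&stage1_a, RUSTC_SRC);
    let stage2_b = compile(&stage1_b, RUSTC_SRC);
    stage2_a.fingerprint() == stage2_b.fingerprint()
}

fn main() {
    println!("clean toolchain matches: {}", ddc_outputs_match(false));
    println!("trojaned toolchain matches: {}", ddc_outputs_match(true));
}
```

In this model the stage-1 outputs legitimately differ (different code generators emitted them), which is why the comparison only happens after the second round.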

Compiling Rust with GCC: an update

Posted Sep 12, 2022 11:54 UTC (Mon) by paulj (subscriber, #341) [Link] (5 responses)

That statement isn't true though. The risk is reduced, but not eliminated.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:08 UTC (Mon) by mjg59 (subscriber, #23239) [Link] (4 responses)

In what way is it not eliminated?

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:29 UTC (Mon) by paulj (subscriber, #341) [Link] (3 responses)

You're making an assumption that a subversion of a compiler binary can only be carried out by a binary of the same compiler.

There is no good reason to make this assumption. Particularly given the existence of file formats that aggregate different executable blobs together, along with hooks to allow execution to occur on loading. Even without those formats, there is simply no good reason to think the attacker who can (originally) cause a distributed binary of compiler A to be subverted must be limited to targeting the further subversion of /only/ compiler A source and binary.

Hell, even Thompson's original PoC targeted /two/ sets of sources for subversion of output.

The chances that a binary subversion targets your mrustc compiler AND your C++ compiler to compile mrustc may be lower than a subversion targeting just one, but that's kind of assuming you and your work-flow are not specially interesting to a skilled and sufficiently capable attacker. And such assumptions are not a basis to state "it is eliminated".

There are other assumptions in the DDC paper, e.g., that we could dig up some old compiler that existed before our potential-target. But then... we're still trusting a number of things, including the MAC algorithm. And MACs have a finite shelf-life - they weaken over time. Maybe the probability is low, but that depends on the juiciness of the target and the threat-model - and "lower probability" is different to "eliminated", unless you're into hand-waving.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 2:37 UTC (Tue) by mjg59 (subscriber, #23239) [Link] (2 responses)

No, you can generate a trusted compiler via a directly introspectable process. Pick an architecture. Write a trivial assembler directly in machine code. Use that to bootstrap a more competent assembler. Write a trivial C compiler. Use that to build an extremely old version of gcc. Use that to build a modern version of gcc. Use that to build a cross-compiler for whatever architecture you actually care about. You now have a trusted compiler, and the rest of Diverse Double-Compiling falls out of that.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 4:01 UTC (Tue) by pabs (subscriber, #43278) [Link]

The Bootstrappable Builds project is an example of doing just that, although they aim to bootstrap all architectures from that initial machine code step, without going through the cross-compiler stage:

https://bootstrappable.org/

Compiling Rust with GCC: an update

Posted Sep 13, 2022 9:54 UTC (Tue) by paulj (subscriber, #341) [Link]

Building your own trusted tool chain is following Thompson's advice on what you need to do to build trust. So... not DDC.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:12 UTC (Mon) by himi (subscriber, #340) [Link] (2 responses)

I'd have to actually check the bootstrap process for rust (it's probably been five years since I last tried), but I have a feeling you have a second step beyond just building rustc with mrustc, though that may be more of a verification step. mrustc doesn't have a lot of the memory safety features of rustc, and only implements the minimal features needed to compile rustc, so I don't think a compiler built by it will behave the same as the compiler built by rustc, even if it's the same version (i.e. mrustc building rustc 1.54 compared with rustc 1.54 building rustc 1.54).

It does still break (or dilute greatly) the chain of trust, but you'd need a more careful verification to make sure nothing hinky was going on.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 2:39 UTC (Tue) by mjg59 (subscriber, #23239) [Link]

I'd expect rustc built with mrustc to behave differently in various ways (such as plausibly not being correctly memory safe in the face of malformed input), but I'd expect the actual compilation process to be the same? It doesn't really matter, though, since you can just do another round of building rustc and you should have the same assertions around it being trustworthy.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 8:30 UTC (Tue) by farnz (subscriber, #17727) [Link]

Assuming mrustc is bug-free against its own spec, the rule is that it'll compile all code that rustc does with the same behaviour in the final compiled binary, but it will also compile programs that rustc rejects as invalid. For this comparison purpose, that's good enough - you can't trust an mrustc-built compiler if you don't also have a rustc-built compiler from the same source.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 0:40 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

> there are no bugs in rustc

This overstates the situation. There are a bunch of promises Rust makes which, if it turns out rustc doesn't keep that promise in some corner case, they'll fix rustc. This is especially the case in safe Rust.

There are though - as you imply - a lot of crucial things which are not specified, particularly in unsafe Rust. For example the atomics model in Rust is basically "You see what your C++ compiler actually does? That". But then the reality in C++ is similar, your C++ compiler's behaviour, even in C++ 11 "mode" is a lot more like the documented C++ 20 atomics than the C++ 11 atomics. That's not because the compiler is doing it wrong in C++ 11 mode, it's just because the C++ 11 documentation was (more) wrong.
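As a small illustration of how closely Rust's atomics follow the C++ ones: the `Ordering` variants in `std::sync::atomic` (`Relaxed`, `Acquire`, `Release`, `AcqRel`, `SeqCst`) carry the same names and roles as the C++ memory orders. The sketch below is a standard release/acquire message-passing pattern; the function name `message_passing` is made up for the example.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Publishes a value from one thread to another using a release store
/// paired with an acquire load.
fn message_passing() -> u32 {
    let data = Arc::new(AtomicU32::new(0));
    let ready = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, ready) = (Arc::clone(&data), Arc::clone(&ready));
        thread::spawn(move || {
            data.store(42, Ordering::Relaxed);
            // Release: the store to `data` above is visible to any thread
            // that observes `ready == true` through an Acquire load.
            ready.store(true, Ordering::Release);
        })
    };

    // Acquire: pairs with the Release store in the producer.
    while !ready.load(Ordering::Acquire) {
        std::hint::spin_loop();
    }
    let seen = data.load(Ordering::Relaxed);
    producer.join().unwrap();
    seen
}

fn main() {
    println!("consumer saw {}", message_passing());
}
```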

We're at the edge of our understanding is the problem. I'm comfortable with this remaining true in unsafe Rust, because the compiler internals and ultimately even CPU internals we're dealing with actually are this murky - but yes this needs to be resolved for safe Rust.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 11:29 UTC (Sat) by developer122 (guest, #152928) [Link] (2 responses)

While trusting trust attacks are hypothetically possible, they're of little to no concern to the vast majority of people.

I can think of only one urban legend where one was successfully deployed. That was in a software environment where all source code for every system component was standardized and provided by AT&T, with every single computer on earth having byte-identical source code and system software.

Unlike GNU, where the tools and libraries are ossified in place and haven't changed in perhaps 30 years, the rust ecosystem is still quite new. Code is frequently being tossed out and rewritten as styles and standards change or needs evolve, making a trusting trust attack particularly hard to pull off.

Compiling Rust with GCC: an update

Posted Sep 12, 2022 22:51 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

> While trusting trust attacks are hypothetically possible, they're of little to no concern to the vast majority of people.

We shouldn't be discounting attacks like that on the basis of what is popular. Otherwise, we run the risk of repeating the giant mess from the slew of side channel attacks that went from "hypothetically possible" to demonstrable but hard to repeat to causing industry wide changes within a few years. The good news for Rust is that there are multiple implementations already, and they are only likely to mature with time.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 7:23 UTC (Tue) by kleptog (subscriber, #1183) [Link]

> We shouldn't be discounting attacks like that on the basis of what is popular.

Sure, but on the other hand it's sufficient if only a small group are working on solving the problem. Once it's worked out, we can automate it and roll it out everywhere. It's also relevant that there hasn't been an example found in the wild, which means it's judged very low risk.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 18:52 UTC (Sat) by rvolgers (guest, #63218) [Link] (6 responses)

> Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative. There's no real way to know if the undocumented behavior you're optimizing around is undocumented because nobody bothered to, or undocumented because it'll change tomorrow.

In a multi-party standard you have to look at *multiple* code bases and then go through a mediation process involving multiple parties if you discover an underspecified area, whereas on Rust you can just go through the normal change process to update the documentation, the code, or both.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 20:33 UTC (Sat) by tialaramex (subscriber, #21167) [Link] (5 responses)

There are cases where we can look at what people did, which broke, and say "That was obviously not supported, you keep both halves". For example there was a case recently where some crates were like "Hey I bet Rust's networking related data structures are literally identical to the C structures, and so I can just point Linux C code doing low-level networking at the Rust structures and it'll work" and one day that stopped working. That was never guaranteed to work, nobody should have done that.
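A hedged sketch of the safer alternative to assuming that layout: define your own `#[repr(C)]` mirror of the C structure and convert into it field by field, using only the public accessors of the standard library type. The struct `SockAddrIn`, the function `to_c_sockaddr`, and the `AF_INET` value (2, as on Linux) are assumptions made for this example, not taken from any particular crate.

```rust
use std::net::{Ipv4Addr, SocketAddrV4};

/// Hypothetical mirror of C's `struct sockaddr_in` as found on Linux.
/// `#[repr(C)]` pins field order and padding; the default Rust layout
/// makes no such promise.
#[repr(C)]
#[derive(Clone, Copy)]
struct SockAddrIn {
    sin_family: u16,
    sin_port: u16,     // stored in network byte order
    sin_addr: u32,     // stored in network byte order
    sin_zero: [u8; 8], // padding required by the C definition
}

/// Assumed value of AF_INET (2 on Linux).
const AF_INET: u16 = 2;

/// Converts through public accessors instead of reinterpreting memory.
fn to_c_sockaddr(addr: &SocketAddrV4) -> SockAddrIn {
    SockAddrIn {
        sin_family: AF_INET,
        sin_port: addr.port().to_be(),
        // `octets()` is already in network order; keep the bytes as they are.
        sin_addr: u32::from_ne_bytes(addr.ip().octets()),
        sin_zero: [0; 8],
    }
}

fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(127, 0, 0, 1), 8080);
    let c = to_c_sockaddr(&addr);
    println!(
        "family={} port bytes={:?} addr bytes={:?} zero={:?}",
        c.sin_family,
        c.sin_port.to_ne_bytes(),
        c.sin_addr.to_ne_bytes(),
        c.sin_zero
    );
}
```

The conversion costs a few copies, but it keeps working whatever the standard library does with its private representation.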

But on the other hand there are cases where people are obliged to guess Rust does something, that there's some behaviour, and yet Rust's docs are basically just a shrug emoji. No behaviour is specified. Suppose I have some 64 byte aligned structures. Lots of them actually. I can make pointers to them, Rust is OK with that. Now, Rust doesn't have pointer arithmetic like C, but what if I turn a pointer into an integer. (unsafe) Rust is OK with this too. Surely the bottom four bits of that integer (at least) are zero, right? That's how aligned pointers work. Well, Rust doesn't formally say so, but it feels reasonable. Now, what if I mask these bits off, and use them to store 4 flags. Now I have a pointer-sized value with a pointer *and* my four flags, hooray. To get the pointer back, surely I mask the bits off back to zero, and turn my integer back into a pointer. No harm done. Does that work? Historically Rust said well, we do not promise this is OK, but it's the only thing we offer that seems appropriate here, and it did work.

Today you have to be more careful, nobody warned you about this, beyond the general warning that what you were doing was "unsafe" but what you were doing might stop working. On some platforms. Or maybe not. You have to either obey Strict Provenance, or you need to say OK, I can't meet these requirements, I opt out of strict provenance and I'll take my chances with this PNVI exposure stuff, and in both cases that has consequences I won't summarise here and it could change.
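For concreteness, the flag-packing trick described above might look like the sketch below. It takes the "opt out" route, using plain `as` casts between pointer and integer; the `Node` type, `FLAG_MASK`, `pack`, and `unpack` are names invented for the example. Code following Strict Provenance would avoid the integer round trip (pointer methods such as `map_addr` exist for that in newer toolchains), so treat this as an illustration of the historical idiom rather than a recommendation.

```rust
/// Hypothetical 64-byte-aligned structure.
#[repr(align(64))]
struct Node {
    value: u32,
}

/// Four low bits; they are always zero in the address of an aligned `Node`.
const FLAG_MASK: usize = 0b1111;

/// Packs a pointer and up to four flag bits into one pointer-sized integer.
fn pack(ptr: *mut Node, flags: usize) -> usize {
    let addr = ptr as usize;
    assert_eq!(addr & FLAG_MASK, 0, "pointer is not sufficiently aligned");
    assert!(flags <= FLAG_MASK, "only four flag bits are available");
    addr | flags
}

/// Splits the packed integer back into the pointer and the flags.
fn unpack(packed: usize) -> (*mut Node, usize) {
    ((packed & !FLAG_MASK) as *mut Node, packed & FLAG_MASK)
}

fn main() {
    let raw = Box::into_raw(Box::new(Node { value: 7 }));
    let packed = pack(raw, 0b1010);
    let (ptr, flags) = unpack(packed);
    assert_eq!(ptr, raw);
    // SAFETY: `ptr` has the address returned by `Box::into_raw` above and
    // the allocation has not been freed.
    let node = unsafe { Box::from_raw(ptr) };
    println!("value={} flags={:#b}", node.value, flags);
}
```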

Compiling Rust with GCC: an update

Posted Sep 10, 2022 21:12 UTC (Sat) by stephen.pollei (subscriber, #125364) [Link] (2 responses)

I watched a YouTube video, "RustConf 2022 - WHAT IF WE PRETENDED UNSAFE CODE WAS NICE, AND THEN IT WAS?" by Aria Beingessner, that talked a little bit about "provenance". It seems some people have reasons for wanting to make things in this area a bit more strict.

Compiling Rust with GCC: an update

Posted Sep 10, 2022 22:22 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Yes, that's the work I'm talking about. Just watched it and I felt like Aria's analogy in that video was very strange, while https://faultlore.com/blah/tower-of-weakenings/ her blog entry was clearer to me, but that might just be different preferences.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:56 UTC (Sun) by khim (subscriber, #9252) [Link]

> Seems like some have reasons on why they want to make things in this area a bit more strict.

It's a simple yet pretty useful psychological trick: if you ask people to follow needlessly strict rules but promise to relax them later, 90% (if not 99%) of people will be happy with them, no matter what rules you invent. And you can talk to the tiny subset of people who want more relaxed rules and try to make them happy, too. If you try to make the rules precise then you can never reach acceptance from everyone.

Heck, the whole Rust building is built on top of that approach: that's what separation of safe and unsafe Rust does.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 0:33 UTC (Sun) by khim (subscriber, #9252) [Link] (1 responses)

> Today you have to be more careful, nobody warned you about this, beyond the general warning that what you were doing was "unsafe" but what you were doing might stop working. On some platforms. Or maybe not. You have to either obey Strict Provenance, or you need to say OK, I can't meet these requirements, I opt out of strict provenance and I'll take my chances with this PNVI exposure stuff, and in both cases that has consequences I won't summarise here and it could change.

But how can multiple implementations help there? C and C++ do have multiple implementations, and they do have an ISO standard (many, actually), yet to this very day nobody knows what can or cannot be done with pointers.

I think this is the last attempt which tried to clarify the issue (and the proposal to, you know, make compilers actually obey the standard as published was explicitly rejected).

At least Rust developers never claimed that they have a normative documentation which explains how unsafe is supposed to work.

C and C++ pretend that they do have such documentation and there are even people who claim that Rust is deficient because of that!

IMNSHO informative documentation is better than something which claims to be normative documentation but which you couldn't use as such.

At least if documentation is informative you know you couldn't use it as a guide.

Compiling Rust with GCC: an update

Posted Sep 11, 2022 9:20 UTC (Sun) by tialaramex (subscriber, #21167) [Link]

https://open-std.org/JTC1/SC22/WG14/www/docs/n3005.pdf is the current state of TS 6010, the draft Technical Specification which is what happened to N2676

So, modulo crazy ISO problems, the C23 standard per se won't mandate this roughly PNVI-address exposed model, but there will be an ISO document separately specifying how this would work. The standard is... rough. But there is limited enthusiasm for figuring out all the fine details while it remains unclear if everybody will even implement it. This only starts to make sense once at least two major compilers (e.g. MSVC and GCC) implement it.

With TS 6010 you get most of the optimisations people expect in a modern compiler, and which of course Rust is doing, but you can do an XOR doubly linked list in C, as one example of stunt pointer manipulation that some people still think is a good idea. Of course some optimisations are given up in your doubly linked list code to make this work, but you don't feel that loss in unrelated code.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 14:04 UTC (Tue) by hunger (subscriber, #36242) [Link] (5 responses)

> Part of the idea is to stabilize rust: right now, there are no bugs in rustc, because the language is defined as what rustc says is valid. There's no standard for how rust should behave; the documentation is informative, not normative.

While totally true: How will adding a gcc-based rust compiler change this?

The current process to improve Rust is to write an RFC and provide an implementation for rustc. This is then extensively tested, feedback on the implementation is collected, all available Rust projects will be built to see what breaks, etc. With this process gcc-based Rust will always have to catch up to the "real" rust and will never be a serious alternative! The only way to avoid this is to change the process and make that less code- and more paper-based. The process would need to produce a specification for the compiler teams to implement later. This sounds like a huge step backwards to me! Just look at C++ to see how poorly that works: Compilers are always behind the specification and it is a huge pain for projects to agree on the exact features they can use in their code (without losing too many users that need to stick with older compilers).

But in practice this will probably not be necessary: The gcc-based rust compiler plans to reuse parts of the original rust compiler as those are factored out and become available. E.g. Polonius, the library that should eventually contain the entire borrow checker. This will significantly reduce the costs to maintain the gcc-rust project and is thus obviously a good thing. So in the long-run all possible programs that work with rust code will converge towards a shared frontend, incl. rustc, rust-analyzer, a stand-alone gcc-based rust compiler and more. Many of the Rust features will be in this shared front-end code eventually, at which point we are back at having an implementation defined language. This also re-introduces the trusting-trust issues you bring up: If the shared front-end code produces malicious high-level code, then all non-malicious backend-implementations will faithfully produce malicious binaries from their inputs.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 17:39 UTC (Tue) by atnot (guest, #124910) [Link] (4 responses)

My question is: if all this comes to pass as written, then what exactly was the purpose of gccrs over rust-codegen-gcc and mrustc?

It won't be a fully independent implementation that can be used to find ambiguities, as nebulous as that idea was. Folks who insist that Rust must have multiple compilers because that's what C has have seemingly moved the goalposts elsewhere already anyway. It doesn't help bootstrapping because it won't be pure C++ like mrustc. The people who have a weirdly selective worry about dependence on permissively licensed software won't be happy either. Who exactly is left then? GCC developers? People whose mouths froth at the sight of a Code of Conduct? People who can install a modern gcc version and all of the other rust development tools, but not rustc? Are there really enough of those?

We've seen this play out a few times before, even: People demanded gcc frontends for D and Go, which are mostly abandoned and nobody uses, because why would you. I don't see how this vision of gccrs would be any different.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 17:53 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

> People demanded gcc frontends for D and Go, which are mostly abandoned and nobody uses, because why would you

I have seen this claim made in several places but this isn't what happened, at least for Go. It wasn't based on any demand from anyone. The Go team themselves decided to do it.

https://go.dev/blog/gccgo-in-gcc-471

"The Go language has always been defined by a spec, not an implementation. The Go team has written two different compilers that implement that spec: gc and gccgo. Having two different implementations helps ensure that the spec is complete and correct: when the compilers disagree, we fix the spec, and change one or both compilers accordingly. Gc is the original compiler, and the go tool uses it by default. Gccgo is a different implementation with a different focus, and in this post we’ll take a closer look at it."

Compiling Rust with GCC: an update

Posted Sep 13, 2022 19:16 UTC (Tue) by atnot (guest, #124910) [Link] (1 responses)

I did indeed not know that, thanks for pointing that out! I'm not sure if it really makes things better though: A frontend for a relatively simple language with the full support of a team that earnestly believed in the specification with multiple implementations approach is today mostly irrelevant, has a single active contributor and appears to be at least half a year behind an already frozen spec. I don't hate the idea of multiple implementations, but it really gives reason to temper your expectations of what gccrs will deliver.

Compiling Rust with GCC: an update

Posted Sep 13, 2022 23:09 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link]

> I did indeed not know that, thanks for pointing that out! I'm not sure if it really makes things better though

It may not be better but it is more accurate. Even if one were to stick to the same conclusion they had originally, it's helpful to validate the data points. The common narrative appears to be that GCC support for various languages including Rust is being added based on unreasonable demands for some unknown but definitely odd reasons, that these efforts will inevitably fail or splinter the language, and that the spec based approach is an old relic of the past that must be inherently doomed one way or the other. It doesn't leave much room for acknowledging that multiple languages do have several successfully used implementations, or even for a mild curiosity about why things like gccrs are funded by commercial organizations in the first place.

> I don't hate the idea of multiple implementations, but it really gives reason to temper your expectations of what gccrs will deliver.

This is something readily acknowledged in https://github.com/Rust-GCC/gccrs/wiki/Frequently-Asked-Q... but this is to some extent mitigated by funding and the possibility of code sharing via things like polonius.

Compiling Rust with GCC: an update

Posted Sep 23, 2022 14:42 UTC (Fri) by njs (subscriber, #40338) [Link]

Given that gccrs is funded by the grsecurity folks, and their big product is "the kernel, but compiled with proprietary gcc plugins", I always assumed the motivation for gccrs is that they want to get rust into the gcc pipeline early enough that it's before their plugins.

Compiling Rust with GCC: an update

Posted Sep 15, 2022 3:18 UTC (Thu) by firstyear (subscriber, #89081) [Link]

> Further, rust is EXTREMELY vulnerable to a trusting-trust attack right now. If, at some point, someone backdoored a rust compiler to add their malicious code to any rust compiler it compiles, then it's very possible that said backdoor has propagated across a chunk of the ecosystem.

These attacks just don't happen in reality though. It's "simple to grasp" but "almost impossible to fix" which makes it extremely attractive to a broad audience to spend huge amounts of time writing think pieces about it. When in reality attacks are "complex and difficult to grasp" and "require a lot of smaller broad, annoying fixes".

No one is pulling off these backdoor compiler attacks today. And why would they? Attackers don't attack "ideologically shiny targets" they attack the lowest hanging fruit. Things like lack of mfa, typo-squatting popular libraries, uploading malicious source directly into a library, and more. But I don't see people being willing to acknowledge the broad complex social and technical systems that would actually need to be improved to resolve this.

