Bos: Do we need a "Rust Standard"?

[Posted October 27, 2022 by corbet]

Mara Bos has written a lengthy blog post on whether the Rust language needs to be standardized. The answer is "no" — but she draws a distinction between a "standard" (maintained by some distant standards body) and a "specification".

While no official decision has been made yet, there does seem to be a general agreement that we should indeed work towards having and maintaining an official complete Rust specification from within the Rust project. It’s just a lot of work, so I’m afraid we won’t get there with just some enthusiastic volunteers, even if we can use the Ferrocene specification as a start. We’ll need support and funding from the Rust Foundation and interested companies.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 15:34 UTC (Thu) by JoeBuck (subscriber, #2330) [Link] (3 responses)

A specification that is contributed to by the authors of multiple independent implementers of Rust, and that is as rigorously defined as one of the higher quality ISO standards, is as good as a standard. In fact, it's better, because you don't have a standards organization that demands payment for each copy of the standard.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 15:53 UTC (Thu) by khim (subscriber, #9252) [Link] (1 responses)

ECMA allows one to download standards for free. Here's one for Windows API.

Kinda joke, but not really: it appears as if ECMA is lax enough that you can more-or-less continue doing what you are doing while getting the right to call the result “standard”. Here's the one for ECMAScript (more commonly known under it's maiden name, JavaScript).

In that particular case low requirements of ECMA is blessing in a disguise: I don't think it would be too hard to convince them to give Rust a number and agree to publish official documents from time to time.

And if you consider that Rust Foundation is more serious about following the rules then many “real” standard organizations… why not?

P.S. Of course before that happens someone need to fund actual development of rigorous specs. Even if ECMA is willing to rubberstamp, basically, anything, it wouldn't do anyone any good if specs would rubberstamped yet useless (like aforementioned Windows API specs).

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 21:10 UTC (Thu) by iabervon (subscriber, #722) [Link]

I think the appropriate purpose for a standards body is mainly to check that Rust (or whoever) actually followed their process, particularly when it comes to getting the authors of RFCs and a specification to review that they agree on what they mean, which would become more important if they adopt the Ferrocene specification as something that periodically gets signed by Rust as officially an accurate description of Rust.

After that, it's archival and hosting: here's what everyone agreed on in March 2022, here's what everyone agreed in December 2022 were typos back in March, here's what they said was new in October 2022. Properly hosting a historical document is a specialized job separate from the decision-making that Rust has a process for.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 15:21 UTC (Fri) by seanyoung (subscriber, #28711) [Link]

I think what Mara Bos is trying to say: a specification is that is descriptive, rather prescriptive. In other words, a specification which describes how the rust language works in the existing rust compiler, rather than a standard which the rust compiler must follow.

I think this is the right approach. It allows the language to change to be enhanced quickly, and rust is much better for it.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 16:09 UTC (Thu) by philbert (subscriber, #153223) [Link] (9 responses)

I support the idea of a spec very much so, I think people hate the idea of losing control of the direction of the language but I am not sure this is the case.

I would love to see if there is a way to help out in the year with my experience leading gccrs effort; I feel quite strongly that the 'core' language of rust is not very clear. So much of the language relies on definitions in libcore to do pretty much anything (adding numbers or making slices), but I believe the limiting factor on a spec is the borrow checker which is part of the language which makes it a much more difficult job to discuss lifetimes which is more complex.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 18:38 UTC (Thu) by matthias (subscriber, #94967) [Link] (8 responses)

The borrow checker might be hard, but this should be doable. Of course it might be very challenging to document precisely which program is accepted by the borrow checker and which is not. But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

The upper bound could be specified as: The invariants regarding references must be satisfied. Or even: The program should not exhibit undefined behavior. Of course, this property is undecidable, but there is no need to decide this property. The lower bound needs to be much more conservative. Especially it cannot allow any program that is currently rejected by the borrow checker. The specification would then describe the behavior for every program that is below the upper bound (i.e., which does not exhibit UB). For a program that is between the lower and the upper bound, the semantics would actually be: if it compiles then the semantics are like this. But for these programs it is allowed that the compiler (or borrow checker in the compiler) refuses compilation.

The borrow checker is there to ensure that you do not actually produce unsound code. But it does not do any harm if the specification also specifies behavior for code that is sound but rejected by the (imperfect) borrow checker. This way, the borrow checker can be improved without invalidating the formal specification. In a newer version of the spec, the lower bound can be changed to reflect the fact that for newer rust versions the set of programs that are accepted is larger.

The real problem is, that up to now it is quite unclear what is UB in unsafe rust. A formal specification has to include unsafe rust, as it is quite unlikely that the code that needs a formal verification and formal semantics will be absolutely free of unsafe code. In embedded environments there will always be some unsafe at the bottom. And of course if the safety of some nuclear plant relies on the behavior of unsafe code, this code should be verified against the specification. At least I would expect this for all nuclear power plants in my neighborhood.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 20:36 UTC (Thu) by khim (subscriber, #9252) [Link] (4 responses)

> But is this really necessary for a specification?

If you couldn't explain how that system works then it's kinda hard to ensure that code which one compiler accepts another one would accept it.

> But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

It's funny because the blog post which we are discussing extolls the virtue of Rust by noting that this is different than what you might experience when programming in a language like C, where you might need to pay extra attention not to accidentally rely on any compiler-specific extensions that are not part of the language specification.

Such extensions would be precisely non-portability pain points not dissimilar to C.

Note that it's just just lifetimes. Issues like that one: not only trait bounds may cause compilation failure if you replace where T: X + Y with where T: Y + X, both cases may be valid yet give different results!

This may turn out to be similar (or even worse) than difference between signed and unsigned char in different C compilers!

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 23:26 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (2 responses)

As I understand it we know how to explain what the borrow checker should _ideally_ do but it's a semantic constraint on a computation and so Rice's theorem says it's Undecidable. This doesn't prevent us from writing it down in natural language, but it does mean nobody can actually implement that ideal checker in software since that's what "Undecidable" means (see also Halting Problem).

So, Rust's actual real world compiler has provided checkers which approach that ideal from one side only. That is, all the programs which rustc will compile do satisfy this ideal constraint, but some of them ones which do NOT compile would also satisfy the ideal constraint, which is annoying for the programmer but usually it's pretty trivial to fix. The cases which are non-trivial have been increasingly esoteric, at some point you are going out of your way to make the checking difficult, maybe just don't?

The Non-Lexical Lifetimes improvement to Rust basically improved the borrow checker to allow some of those programs which satisfied the ideal constraint but wouldn't previously compile because lexically they didn't satisfy previous checkers. This brought many of the "Duh, stupid Rust, this is obviously correct" programs from the "Doesn't compile" to "Does compile" categories. I guess we could provide some examples like that and say "Your implementation of the borrow checker should allow at least these examples" but at some point it's a Quality of Implementation issue, like if a toy Unix only allows 16 open file descriptors. It's clearly Unix it just isn't very good.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 4:42 UTC (Fri) by cjwatson (subscriber, #7322) [Link]

Not disagreeing, but I was curious so looked it up: I don't think that example would be POSIX, since POSIX requires OPEN_MAX to be at least 20.

(Asking for POSIX conformance from a toy Unix would be asking a lot, though!)

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 11:09 UTC (Fri) by Karellen (subscriber, #67644) [Link]

it's a semantic constraint on a computation and so Rice's theorem says it's Undecidable.

Surely it's only undecidable for it to determine whether any arbitrary program either definitely satisfies the constraint, or definitely does not.

But that's not what Rust's borrow checker needs to do. Rust's borrow checker only needs to determine that a subset of programs definitely satisfy the constraint. If it cannot prove that an arbitrary program can satisfy the constraint, it can reject it. It doesn't matter if the reason the program cannot be proven to satisfy the constraint is because it does not, or because the checker is not powerful enough, or because that program is actually an undecidable case.

If the checker, rigorously implemented to the spec, can prove that an acceptably useful subset of valid programs pass it, it (and, by extension, Rust) are valuable additions to our programming language toolbox.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 11:02 UTC (Fri) by josh (subscriber, #17465) [Link]

> It's funny because the blog post which we are discussing extolls the virtue of Rust by noting that this is different than what you might experience when programming in a language like C, where you might need to pay extra attention not to accidentally rely on any compiler-specific extensions that are not part of the language specification.

Fortunately that's one problem we don't have in Rust. The Rust compiler won't have any extensions (we don't add things to stable Rust that aren't first added to the language), and the other compiler that might get used in production settings has said it doesn't intend to have any extensions.

But as you noted, that's a key reason why the specification shouldn't have implementation-specific leeway, in borrow checking or anywhere else.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 10:58 UTC (Fri) by josh (subscriber, #17465) [Link] (2 responses)

> Of course it might be very challenging to document precisely which program is accepted by the borrow checker and which is not. But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

No, that's not sufficient, because then different implementations could accept different programs. To ensure that doesn't happen, you need a *precise* specification of exactly what's accepted and what isn't, with no gaps.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 12:20 UTC (Fri) by rds (subscriber, #19403) [Link] (1 responses)

Or do what Java does and have a compliance test suite, where anything that passes this test is compliant.

Java has a specification

Posted Oct 29, 2022 2:04 UTC (Sat) by spigot (subscriber, #50709) [Link]

Are you referring to the JCK (a.k.a. Java Compatibility Kit, a.k.a. TCK for Java SE, etc.)?

My understanding is that passing it is a necessary, but not sufficient, requirement for something to claim it's a compliant Java implementation. Among other things, it must also conform to the Java Language Specification. (There's also a Java Virtual Machine Specification.)

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 17:50 UTC (Fri) by rvolgers (guest, #63218) [Link]

Begging people to actually read this article, as it's excellent and in-depth.

I would even go so far as to say little more needs to be said about this topic. Of course, others can and will disagree...