Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 16:09 UTC (Thu) by philbert (subscriber, #153223)
Parent article: Bos: Do we need a "Rust Standard"?

I support the idea of a spec very much so, I think people hate the idea of losing control of the direction of the language but I am not sure this is the case.

I would love to see if there is a way to help out in the year with my experience leading gccrs effort; I feel quite strongly that the 'core' language of rust is not very clear. So much of the language relies on definitions in libcore to do pretty much anything (adding numbers or making slices), but I believe the limiting factor on a spec is the borrow checker which is part of the language which makes it a much more difficult job to discuss lifetimes which is more complex.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 18:38 UTC (Thu) by matthias (subscriber, #94967) [Link] (8 responses)

The borrow checker might be hard, but this should be doable. Of course it might be very challenging to document precisely which program is accepted by the borrow checker and which is not. But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

The upper bound could be specified as: The invariants regarding references must be satisfied. Or even: The program should not exhibit undefined behavior. Of course, this property is undecidable, but there is no need to decide this property. The lower bound needs to be much more conservative. Especially it cannot allow any program that is currently rejected by the borrow checker. The specification would then describe the behavior for every program that is below the upper bound (i.e., which does not exhibit UB). For a program that is between the lower and the upper bound, the semantics would actually be: if it compiles then the semantics are like this. But for these programs it is allowed that the compiler (or borrow checker in the compiler) refuses compilation.

The borrow checker is there to ensure that you do not actually produce unsound code. But it does not do any harm if the specification also specifies behavior for code that is sound but rejected by the (imperfect) borrow checker. This way, the borrow checker can be improved without invalidating the formal specification. In a newer version of the spec, the lower bound can be changed to reflect the fact that for newer rust versions the set of programs that are accepted is larger.

The real problem is, that up to now it is quite unclear what is UB in unsafe rust. A formal specification has to include unsafe rust, as it is quite unlikely that the code that needs a formal verification and formal semantics will be absolutely free of unsafe code. In embedded environments there will always be some unsafe at the bottom. And of course if the safety of some nuclear plant relies on the behavior of unsafe code, this code should be verified against the specification. At least I would expect this for all nuclear power plants in my neighborhood.

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 20:36 UTC (Thu) by khim (subscriber, #9252) [Link] (4 responses)

> But is this really necessary for a specification?

If you couldn't explain how that system works then it's kinda hard to ensure that code which one compiler accepts another one would accept it.

> But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

It's funny because the blog post which we are discussing extolls the virtue of Rust by noting that this is different than what you might experience when programming in a language like C, where you might need to pay extra attention not to accidentally rely on any compiler-specific extensions that are not part of the language specification.

Such extensions would be precisely non-portability pain points not dissimilar to C.

Note that it's just just lifetimes. Issues like that one: not only trait bounds may cause compilation failure if you replace where T: X + Y with where T: Y + X, both cases may be valid yet give different results!

This may turn out to be similar (or even worse) than difference between signed and unsigned char in different C compilers!

Bos: Do we need a "Rust Standard"?

Posted Oct 27, 2022 23:26 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (2 responses)

As I understand it we know how to explain what the borrow checker should _ideally_ do but it's a semantic constraint on a computation and so Rice's theorem says it's Undecidable. This doesn't prevent us from writing it down in natural language, but it does mean nobody can actually implement that ideal checker in software since that's what "Undecidable" means (see also Halting Problem).

So, Rust's actual real world compiler has provided checkers which approach that ideal from one side only. That is, all the programs which rustc will compile do satisfy this ideal constraint, but some of them ones which do NOT compile would also satisfy the ideal constraint, which is annoying for the programmer but usually it's pretty trivial to fix. The cases which are non-trivial have been increasingly esoteric, at some point you are going out of your way to make the checking difficult, maybe just don't?

The Non-Lexical Lifetimes improvement to Rust basically improved the borrow checker to allow some of those programs which satisfied the ideal constraint but wouldn't previously compile because lexically they didn't satisfy previous checkers. This brought many of the "Duh, stupid Rust, this is obviously correct" programs from the "Doesn't compile" to "Does compile" categories. I guess we could provide some examples like that and say "Your implementation of the borrow checker should allow at least these examples" but at some point it's a Quality of Implementation issue, like if a toy Unix only allows 16 open file descriptors. It's clearly Unix it just isn't very good.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 4:42 UTC (Fri) by cjwatson (subscriber, #7322) [Link]

Not disagreeing, but I was curious so looked it up: I don't think that example would be POSIX, since POSIX requires OPEN_MAX to be at least 20.

(Asking for POSIX conformance from a toy Unix would be asking a lot, though!)

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 11:09 UTC (Fri) by Karellen (subscriber, #67644) [Link]

it's a semantic constraint on a computation and so Rice's theorem says it's Undecidable.

Surely it's only undecidable for it to determine whether any arbitrary program either definitely satisfies the constraint, or definitely does not.

But that's not what Rust's borrow checker needs to do. Rust's borrow checker only needs to determine that a subset of programs definitely satisfy the constraint. If it cannot prove that an arbitrary program can satisfy the constraint, it can reject it. It doesn't matter if the reason the program cannot be proven to satisfy the constraint is because it does not, or because the checker is not powerful enough, or because that program is actually an undecidable case.

If the checker, rigorously implemented to the spec, can prove that an acceptably useful subset of valid programs pass it, it (and, by extension, Rust) are valuable additions to our programming language toolbox.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 11:02 UTC (Fri) by josh (subscriber, #17465) [Link]

> It's funny because the blog post which we are discussing extolls the virtue of Rust by noting that this is different than what you might experience when programming in a language like C, where you might need to pay extra attention not to accidentally rely on any compiler-specific extensions that are not part of the language specification.

Fortunately that's one problem we don't have in Rust. The Rust compiler won't have any extensions (we don't add things to stable Rust that aren't first added to the language), and the other compiler that might get used in production settings has said it doesn't intend to have any extensions.

But as you noted, that's a key reason why the specification shouldn't have implementation-specific leeway, in borrow checking or anywhere else.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 10:58 UTC (Fri) by josh (subscriber, #17465) [Link] (2 responses)

> Of course it might be very challenging to document precisely which program is accepted by the borrow checker and which is not. But is this really necessary for a specification? It should be enough to have a lower bound (the set of programs that must be accepted) and an upper bound (the set of programs that might be accepted). The upper bound can also be defined by the complement: The set of programs that must be rejected.

No, that's not sufficient, because then different implementations could accept different programs. To ensure that doesn't happen, you need a *precise* specification of exactly what's accepted and what isn't, with no gaps.

Bos: Do we need a "Rust Standard"?

Posted Oct 28, 2022 12:20 UTC (Fri) by rds (subscriber, #19403) [Link] (1 responses)

Or do what Java does and have a compliance test suite, where anything that passes this test is compliant.

Java has a specification

Posted Oct 29, 2022 2:04 UTC (Sat) by spigot (subscriber, #50709) [Link]

Are you referring to the JCK (a.k.a. Java Compatibility Kit, a.k.a. TCK for Java SE, etc.)?

My understanding is that passing it is a necessary, but not sufficient, requirement for something to claim it's a compliant Java implementation. Among other things, it must also conform to the Java Language Specification. (There's also a Java Virtual Machine Specification.)