Standardization - two independent implementations are good.

Posted Sep 2, 2024 15:12 UTC (Mon) by jjs (guest, #10315)
In reply to: GCC and Rust by ralfj
Parent article: Rust-for-Linux developer Wedson Almeida Filho drops out

gccrs being a second, independent implementation is good. It ensures that the specification is, in fact, clear. I've seen many projects (HW & SW) where the specifications seemed clear to the writers who did an implementation, but someone else followed the specifications, made something that matched the specifications, yet it was not interoperable with the original version. This is the nature of language - there are many places where a word has no single, universally agreed meaning (check any dictionary).

Law dictionaries exist to help ensure legal language is precise & unambiguous. There's a reason IETF requires two independent implementations before declaring something an Internet Standard - https://www.ietf.org/participate/runningcode/implementati.... If two implementations don't produce the same product, it's time to go back and fine-tune the specification to clarify the ambiguities that arise. And the only way to check for ambiguities is via an independent implementation.
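As a rough illustration of the point above, here is a minimal sketch in C of two teams implementing the same loosely worded requirement, plus a differential test that finds the input where they diverge. Everything in it is hypothetical: the spec sentence, the function names `round_team_a`, `round_team_b` and `first_disagreement`, and the choice of rounding as the example are assumptions made for illustration, not anything taken from Rust, gccrs, or an IETF document.

```c
#include <assert.h>

/* Hypothetical spec sentence: "round a length given in tenths to the
 * nearest whole unit". It is silent about exact halves (5, 15, 25, ...). */

/* Team A reads "nearest" as round-half-up. Input must be non-negative. */
int round_team_a(int tenths)
{
    assert(tenths >= 0);
    return (tenths + 5) / 10;
}

/* Team B reads "nearest" as round-half-to-even. Input must be non-negative. */
int round_team_b(int tenths)
{
    assert(tenths >= 0);
    int whole = tenths / 10;
    int rest = tenths % 10;
    if (rest > 5 || (rest == 5 && whole % 2 != 0))
        whole++;
    return whole;
}

/* Differential test: return the first input in [0, limit) on which the
 * two implementations disagree, or -1 if they agree everywhere. */
int first_disagreement(int limit)
{
    for (int t = 0; t < limit; t++) {
        if (round_team_a(t) != round_team_b(t))
            return t;
    }
    return -1;
}
```

Both functions arguably satisfy the sentence as written, yet `first_disagreement(100)` reports an input where they differ. Neither implementation is wrong here; the report points at a gap in the spec text, which then has to be settled one way or the other.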

Standardization - two independent implementations are good.

Posted Sep 2, 2024 18:43 UTC (Mon) by ralfj (subscriber, #172874)

> gccrs being a second, independent implementation is good. It ensures that the specification is, in fact, clear.

It may do that. Or it may cause endless issues due to differences in behavior between implementations, as is the case in C. One reason why the standard leaves so many things as "Undefined Behavior" is that implementations happened to implement different behavior, and none of them wanted to change. It's easy for them to agree to make things UB; the consequences are borne by programmers... just look at the entire debacle with realloc-of-size-0 now being UB: https://queue.acm.org/detail.cfm?id=3588242

I don't deny that multiple independent implementations have advantages. But they also have serious disadvantages. And given the resources required to build and maintain them, I am not convinced that it's worth it overall. The fact that language implementations are typically open-source these days has removed one of the biggest arguments in favor of multiple implementations.

Standardization - two independent implementations are good.

Posted Sep 2, 2024 22:09 UTC (Mon) by jjs (guest, #10315)

Yes, they can define the behavior as UB, which means they've changed the spec. If you have a spec with defined behavior, and two implementations have different behavior, but meet the spec, you really have two choices, IMO:
1. Follow what appears to be the C way - declare it UB in the spec. From this article & other things I've read about Rust, I understand this is what the Rust community is trying to avoid.
2. Clarify the spec. Choose which behavior is correct (or a third way), and rewrite the spec to clarify it.

In either case, the spec is changed. I suppose a 3rd way is to ignore the problem, but, IMO that's worse.

"The fact that language implementations are typically open-source these days has removed one of the biggest arguments in favor of multiple implementations."

I'll argue the opposite - language implementations being open source is one of the biggest arguments in favor of multiple implementations. Look at what went on with Linux and GCC/LLVM as LLVM began to work to compile the kernel. More defined behavior, from what I can tell. And a huge advantage of open source is everyone can contribute.

Standardization - two independent implementations are good.

Posted Sep 5, 2024 11:40 UTC (Thu) by taladar (subscriber, #68407)

The C way was to declare it undefined behavior in the spec because the committee of representatives from multiple implementations that already implemented things differently failed to reach a consensus on whose code should change, not because there is ever any advantage at all in having undefined parts in a spec.

Standardization - two independent implementations are good.

Posted Sep 5, 2024 14:28 UTC (Thu) by ralfj (subscriber, #172874)

> If you have a spec with defined behavior, and two implementations have different behavior, but meet the spec, you really have two choices, IMO

That's not what happened here. In this case, the C standard was unambiguous since at least C89: "If size is zero and ptr is not a null pointer, the object it points to is freed". Some implementations violated the standard, and somehow it was deemed better to introduce UB into tons of existing code than to fix the buggy implementations.
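For code that has to live with this today, one possible defensive pattern is to never hand a zero size to realloc at all. The sketch below is only an illustration: the helper name `resize_or_free` is hypothetical and not part of any standard or library; it simply applies the "freed" reading quoted above using nothing but `free` and `realloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize a heap block without ever passing size 0
 * to realloc, so the result does not depend on how a given libc (or a
 * given revision of the C standard) treats that case. */
void *resize_or_free(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr);      /* free(NULL) is a no-op, so ptr may be NULL */
        return NULL;    /* here NULL means "released", not "failed" */
    }
    /* For size > 0, a NULL result means allocation failure and the
     * original block is still valid and still owned by the caller. */
    return realloc(ptr, size);
}
```

A caller would use it in place of realloc, for example `buf = resize_or_free(buf, 0);` to release a buffer. The cost of this design is that a NULL return has two meanings, which the caller tells apart by the size it passed in.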

Such a hypothetical case could of course happen, though. IMO in that case you have a buggy (unintentionally underdefined) standard -- which happens and which needs to be dealt with reasonably well. If you have multiple different implementations of the standard, such bugs are very hard to fix (other than by making the standard so weak that it encompasses all implementations), and that explains some (but not all) of the oddities in C. If you only have a single implementation, it is a lot easier to fix such bugs in the standard/specification by adjusting either the spec (to still have a *defined* behavior! just maybe not the one that we'd ideally have liked to see) or the implementation. These kinds of things happen in Rust fairly regularly. A big part of what makes this possible is that we have the ability to add "future compatibility" lints to Rust so that there are many months or even years of advance notice to all code that might be affected by a compiler change. I worry that with multiple implementations, this kind of language evolution will become even harder than it already is due to the added friction of having to coordinate this across implementations.

Standardization - two independent implementations are good.

Posted Sep 2, 2024 22:40 UTC (Mon) by viro (subscriber, #7872)

When the specification is "whatever the interpreter actually does", you get wonders like sh(1). Which is _not_ a good language to write in...

Standardization - two independent implementations are good.

Posted Sep 5, 2024 11:41 UTC (Thu) by taladar (subscriber, #68407)

Which is why there is an effort to develop a Rust spec, but that still doesn't require a second implementation, just a test suite that checks if the one implementation conforms to the spec.

2nd Implementation tests the meaning of the specification

Posted Sep 7, 2024 17:10 UTC (Sat) by jjs (guest, #10315)

That test suite can determine if one implementation meets what the spec writers interpret the spec to mean. It can't detect if the spec always means what the spec writers think it means (the wonders of human language). The purpose of a second implementation is to check that the wording of the spec actually only means what the spec writers think it means. I.e. catch unseen errors in the spec. Again, there's a reason IETF requires two independent implementations of an RFC before they declare it a standard.

2nd Implementation tests the meaning of the specification

Posted Sep 7, 2024 17:40 UTC (Sat) by intelfx (subscriber, #130118)

Or you can simply have one team write the compiler (and maybe the spec) and some other team write the tests using the spec.

2nd Implementation tests the meaning of the specification

Posted Sep 8, 2024 12:02 UTC (Sun) by farnz (subscriber, #17727)

That's where two implementations of the test suite come in handy, since you now have two separate groups of people who've read the specification and agree on what it means; where one test suite fails and the other passes, you need to resolve that by either fixing the specification, or getting the authors of the passing test suite to agree that they had a gap in test coverage.