Real Rust type system usefulness
Real Rust type system usefulness
Posted Nov 6, 2024 9:37 UTC (Wed) by dottedmag (subscriber, #18590)Parent article: Safety in an unsafe world
It would be nicer if constraints encoding was first-class language construct and not a side-effect of a type system, but in absence of such a faculty, an expressive type system is a passable substitute.
Posted Nov 6, 2024 12:20 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (7 responses)
Using the type system has one big advantage. You can encode arbitrary constraints. The constraints are spelled out in the documentation and are enforced by the module that implements the type. The enforcing is done by ensuring that only valid values can be constructed.
If you add first-class features, then you have a syntax for describing constraints. But this syntax limits the nature of constraints that can be encoded. Such a syntax would still be nice for types of constraints that are commonly used. But the type system has to be there to allow to encode everything that is not covered by the syntax for constraint specification.
Rust already allows to specify constraints directly in the language. If I write (bool,bool), then rust enforces that the two bytes of memory may only contain values 0 and 1. Specifying tuples, structs, and enums naturally gives constraints on the represented values. It just happens that until now, nobody has added the feature of defining lock orders as a first-class feature of the language. One could add such a feature, but then the next person comes and wants to encode special properties for MMU handling in the type system ;) So there will always be constraints that one wants to express, but which are not first-class features. Which types of contraints should be first-class features is of course open for debate.
Posted Nov 6, 2024 12:36 UTC (Wed)
by daroc (editor, #160859)
[Link] (6 responses)
However, the line between systems like this and type systems can actually be a bit blurry. Languages with full dependent types, such as Idris, incorporate those constraints directly into the type system. Other languages (Ada SPARK, I think?) keep them separate.
Ultimately, like many other problems in programming language design, how you integrate features for making programs more correct into the rest of the language is a complex tradeoff between consistency, explainability, and what people expect from the language. Whether integrating these things into Rust's type system is ultimately the best approach for the language or not, I cannot say.
Posted Nov 6, 2024 16:37 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link]
Almost any language will reject attempts to call a function which simply doesn't exist, many at compile time.
Fewer but still most will reject a call with the wrong number of parameters
Fewer again, but still a great many, have type constraints, "Lots" is a string, that's probably not OK
And then some will observe finer distinctions, "Lots" is not only a string, it's not even a string like "15" which can be coerced into a number, it's something else. This is rarely something available at compile time and when it is available it's usually magic, not general purpose.
Rust's compiler, faced with a call to std::str::from_utf8([/* Some values */]) for example, will cheerfully look at a constant array you're passing, decide whether it's not actually UTF-8 encoded text, and warn (which can be suppressed if appropriate) that as a result this call always errors. But that's magic, if we make our own type with different rules the same doesn't happen.
Shifting left is valuable. Finding out that C++ operator precedence means your logging doesn't have the desired effect... by reading the useless results in a customer bug report is bad. Finding out when the CI testing the potential new release builds fails six hours after you wrote the code is better but still inconvenient. Finding out five minutes after writing the code that now the unit test cases broke locally is enormously better but not perfect. Finding out because a red squiggly line appeared in the editor when you wrote the logging code is extremely good. Only "It just fucking works" is finally the ideal and unlike the other options that's a language design change and doesn't help with your next bug.
Posted Nov 6, 2024 22:36 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
But they typically are enforced only at runtime, with limited compile-time support.
A good example is Eiffel language.
Posted Nov 6, 2024 22:51 UTC (Wed)
by daroc (editor, #160859)
[Link] (3 responses)
[1]: https://www.idris-lang.org/
Posted Nov 6, 2024 23:06 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
It has something similar ("Type Invariant") that can be checked at compile time, but it's limited with the type itself. It can't express an invariant "there exist a path between this type and the root of the lock hierarchy".
Posted Nov 6, 2024 23:19 UTC (Wed)
by daroc (editor, #160859)
[Link]
I have used the other two, though, and can confirm that they operate entirely at compile time.
Posted Nov 11, 2024 16:30 UTC (Mon)
by zmower (subscriber, #3005)
[Link]
Posted Nov 6, 2024 12:40 UTC (Wed)
by LtWorf (subscriber, #124958)
[Link]
Posted Nov 6, 2024 16:36 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (4 responses)
I have no idea if it's still being worked on or not, but there's no obvious indication that it has been abandoned.
Posted Nov 6, 2024 22:31 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link]
Without this value, these types have a niche, meaning Option<BalancedI8> is a single byte, Option<BalancedI32> is only four bytes, and so on. Pattern Types unlock niches for ordinary stable Rust by third parties.
This delivers the same functionality as making these most negative values sentinels, but with the much improved ergonomics of Option. I think there are a lot of variables in software for which -10, +10 and crucially *zero* (which would be forbidden in the similarly sized but stdlib-blessed NonZeroI8 type) are reasonable values but -128 is not.
Today Rust's niches can be made at will by its own standard library (hence NonZeroI8 for example, but also OwnedFd and numerous others) but for us lay folk writing our own software this feature is permanently unstable compiler only magic (you can of course invoke this magic at your own risk, that's what my unstable "nook" crate does, but it's specifically marked as never-to-be-stabilised). The only lay person accessible new niches are from the sum types (enum). If I make an enum with say two hundred variants that contain no data, Rust will use two hundred of the 256 bit patterns for a byte to represent that enum and, it will mark all the remaining 56 patterns as a niche.
This is how the current era of CompactString works. The type has 24 bytes, the last byte is an enum, with 128 values representing ASCII characters, then 64 values representing non-ASCII last bytes of valid 24 byte UTF-8 strings, then lengths zero through to 23 inclusive signifying that the string didn't use this last byte because it was shorter, then one value signifying the strings was too big and lives on the heap instead and finally one more for static text. This leaves a large niche, so Option<CompactString> is a 24 byte data structure which can be None, or Some(string) where the string only needs a heap allocation for text longer than 24 UTF-8 code units. Fancy.
Posted Nov 8, 2024 0:42 UTC (Fri)
by aszs (subscriber, #50252)
[Link] (2 responses)
Posted Nov 8, 2024 16:53 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
Um. Rust compile times are already NP-hard… would promotion to NP-competeness do that much difference?
Posted Nov 11, 2024 17:22 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link]
But anyway, I think the broader objection to this argument looks roughly like the following:
* It is not typical to match patterns of the size and complexity of the patterns shown in that blog post. They are a pathological edge case that nobody uses in practice.
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
[2]: https://ucsd-progsys.github.io/liquidhaskell/
[3]: https://www.adacore.com/about-spark
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
Real Rust type system usefulness
Pattern Types.
Real Rust type system usefulness
> there's not much appetite to make Rust compile times NP Complete!
Real Rust type system usefulness
Real Rust type system usefulness
* Refutable patterns (if let, match with a _ case, etc.) only need to be checked at runtime (and for structural correctness), which is also fairly straightforward unification logic.
* If you want to start allowing refutable pattern matching at compile time, things get significantly harder, because now every type has to carry around information about which patterns it is known to match, and you have to use that information to more precisely infer which other patterns are refutable, or else it's not doing much good (what is the point of knowing that my type matches 1..5 if I can't write separate match arms for 1, 2, 3..5 without a catchall _ arm?). That means you have to match patterns against each other, rather than merely matching objects or (simple/non-pattern) types against patterns. The unification logic gets a lot more complicated in that case, and it is far more plausible that it has an accidental complexity explosion in some edge case that users might really run into.