Rust Keyword Generics Progress Report: February 2023

The group working on adding keyword generics to the Rust language is foreshadowing what it plans to propose:

A main driver of the keyword generics initiative has been our desire to make the different modifier keywords in Rust feel consistent with one another. Both the const WG and the async WG were thinking about introducing keyword-traits at the same time, and we figured we should probably start talking with each other to make sure that what we were going to introduce felt like it was part of the same language - and could be extended to support more keywords in the future.


Posted Feb 24, 2023 3:09 UTC (Fri) by xi0n (guest, #138144) [Link] (3 responses)

While I appreciate the attempts of bringing a limited form of monadic typeclasses into Rust, I'm disappointed it only focuses on ?async and ?const. It completely ignores a much bigger source of "function coloring" problem that's largely unique to the language: mutability.

Right now, many libraries (including std) rely purely on naming convention to distinguish mutable and immutable methods. Both variants also have to be written separately, leading to proliferation of pairs such as get/get_mut, as_ref/as_mut, as_deref/as_deref_mut, and sometimes also traits (Deref/DerefMut) and various wrapper types (e.g. Ref and Mut in Bevy). Unlike async or even const, this duplication affects basically any Rust codebase. Having something like ?mut so that you can at least write methods that are generic over `&?mut self` would go a long way towards reducing API cruft.

Posted Feb 25, 2023 21:37 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (2 responses)

I think the argument against this is some variation of "mut is syntactic salt."

In other words:

* Immutable (shared) references have weaker constraints than mutable (exclusive) references.
* If we implicitly promote a reference to mut, then we impose stronger constraints on your program, which may not be obvious from the call-site.
* Those additional constraints may prove difficult to explain in compile errors, since you may not realize which of your references are mut. The compiler would have to explain how it monomorphized all potentially relevant &?mut annotations, and there might be a lot of them (e.g. "Z is mutable, but you borrow it again over here. Z is mutable because Y is mutable, and Y is mutable because X is mutable, and...").
* More generally, taking &mut everywhere is probably a Bad Idea in the first place (for the same reason that const-correctness is so important in C++).
* Therefore, we don't want to hand you a mutable reference unless you explicitly ask for one.
* The easiest way to accomplish that is to have two methods with different names. Inventing exceptional syntax for an operation that would still require the call-site to manually select a specialization would be pointless.
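For readers less familiar with the shared/exclusive distinction the list above relies on, here is a minimal sketch. `Counter`, `sum_twice` and `bump` are hypothetical names invented for illustration; only the borrowing rules themselves come from the language.

```rust
// Hypothetical type, used only to illustrate the borrowing rules.
struct Counter {
    value: u32,
}

impl Counter {
    // Shared access: any number of `&Counter` borrows may coexist.
    fn get(&self) -> &u32 {
        &self.value
    }

    // Exclusive access: while the returned `&mut u32` is live,
    // no other borrow of the `Counter` is allowed.
    fn get_mut(&mut self) -> &mut u32 {
        &mut self.value
    }
}

fn sum_twice(c: &Counter) -> u32 {
    let a = c.get();
    let b = c.get(); // a second shared borrow is fine
    *a + *b
}

fn bump(c: &mut Counter) -> u32 {
    let v = c.get_mut();
    // let w = c.get(); // compile error if uncommented: `v` is still used below
    *v += 1;
    *v
}

fn main() {
    let mut c = Counter { value: 20 };
    println!("{}", sum_twice(&c));
    println!("{}", bump(&mut c));
}
```

The caller picks `get` or `get_mut` by name, so the stronger constraint is always visible at the call site.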

But I don't work on Rust, so the above is just my best guess.

Posted Feb 26, 2023 11:19 UTC (Sun) by mb (subscriber, #50428) [Link] (1 responses)

> The easiest way to accomplish that is to have two methods with different names

Yes. I completely agree. We must have separate functions that the caller explicitly chooses from, for mutability.

But I sometimes wish there would be some help from the language to make implementing these functions easier.
For example for simple reference-getter functions we basically just duplicate the function (with added mut).

I'd sometimes like to have something like this:

    fn get{_ref|_mut}(& ?mut self) -> & ?mut Foo {
        & ?mut self.foo
    }

It would still generate two functions with two names, but I would only have to write one.
Yes, I can do that with macros, but support from the language in the form of syntactic sugar would feel much nicer to me.
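As a rough sketch of the macro route mentioned above (this is not the proposed syntax, and `getter_pair`, `Holder` and `Foo` are hypothetical names), a `macro_rules!` macro can expand one invocation into both getters in today's Rust:

```rust
// Hypothetical macro: expands to an immutable and a mutable getter
// for one field, so the pair is declared in a single line.
macro_rules! getter_pair {
    ($get:ident, $get_mut:ident, $field:ident, $ty:ty) => {
        fn $get(&self) -> &$ty {
            &self.$field
        }
        fn $get_mut(&mut self) -> &mut $ty {
            &mut self.$field
        }
    };
}

#[derive(Debug, PartialEq)]
struct Foo(u32);

struct Holder {
    foo: Foo,
}

impl Holder {
    // Generates `get_ref(&self) -> &Foo` and `get_mut(&mut self) -> &mut Foo`.
    getter_pair!(get_ref, get_mut, foo, Foo);
}

fn main() {
    let mut h = Holder { foo: Foo(1) };
    h.get_mut().0 += 41;
    println!("{:?}", h.get_ref());
}
```

This only covers trivial field getters. Anything with a real body would need the body passed into the macro, which is where language-level sugar would presumably read better.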

Posted Feb 28, 2023 16:24 UTC (Tue) by plietar (subscriber, #110706) [Link]

The Pony language has some version of this called "viewpoint adaptation": https://tutorial.ponylang.io/reference-capabilities/arrow-types.html. A function signature can look like "fun get(): this->Foo", which means it returns a Foo "as seen by this": the returned reference is mutable only if the method is called on a mutable receiver.

Posted Feb 24, 2023 3:39 UTC (Fri) by droundy (guest, #4559) [Link] (1 responses)

I really wish they had given as an example how this might enable unification of Iterators and Streams. If they can make Streams redundant by making Iterators keyword generic, that would be lovely. And if they *can't*, then it seems like they need to figure out why not, because writing code without Iterators isn't going to be workable.

Posted Feb 24, 2023 19:54 UTC (Fri) by yoshuawuyts (guest, #123806) [Link]

Hi, post author here. A lot of the work we've been doing here has specifically been motivated by the unification of the existing std Iterator, the async Stream trait, and the ability to use loops in const contexts (e.g. a const version of Iterator). We actually have a version of the async Stream trait on nightly Rust in the async_iter submodule. However, a major downside of it is that in its current form it is scheduled to become a carbon copy of the existing std::iter module, with the only difference being that most functions and closures will be prefixed with the async keyword. That's a lot of duplication which needs to be maintained, just because we wanted an async version of one interface. And it's likely at some point the stdlib will want to provide its own version of async TcpStream, async File, etc. - and that would result in an enormous amount of duplicate APIs which would largely be identical to their non-async counterparts.

And the problems don't just stop with async either. Have you ever wondered why you can't just use the ? operator from a closure? Or have you seen the Rust for Linux project ask whether they could have non-panicking versions of certain APIs? We don't want the answer to those kinds of asks to be: "We'll add an identical version of the same API which differs on exactly one dimension". That would either set us on the path to an exponential blowup of API surface, or needing to make hard choices on which APIs we don't want to support in certain contexts. Instead we believe the best path to solving this is through the type system - like we're already doing with const - and that's why we started the work on the Keyword Generics Initiative.
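To make the duplication concrete, here is a small sketch in stable Rust. `sum_sync`, `sum_async` and the toy `block_on` executor are hypothetical and not taken from std or from the initiative's proposal; the point is only that the two function bodies differ in nothing but `async` and `.await`.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical sync helper: sums the values produced by a callback.
fn sum_sync(n: u32, mut next: impl FnMut(u32) -> u32) -> u32 {
    let mut total = 0;
    for i in 0..n {
        total += next(i);
    }
    total
}

// The async twin: same body, differing only in `async` and `.await`.
async fn sum_async<F, Fut>(n: u32, mut next: F) -> u32
where
    F: FnMut(u32) -> Fut,
    Fut: Future<Output = u32>,
{
    let mut total = 0;
    for i in 0..n {
        total += next(i).await;
    }
    total
}

// Toy executor: a waker that does nothing plus a busy polling loop.
// Only suitable for futures that never actually wait on anything.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut: Pin<Box<F>> = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

fn main() {
    let a = sum_sync(4, |i| i * 2);
    let b = block_on(sum_async(4, |i| async move { i * 2 }));
    println!("{} {}", a, b);
}
```

Multiply this by every adapter in an iterator-style API and the maintenance cost described above becomes easier to picture.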

Posted Feb 24, 2023 10:25 UTC (Fri) by tlamp (subscriber, #108540) [Link] (12 responses)

I found the comment by Graydon Hoare (the initiator of the Rust programming language) in a Reddit thread on this topic quite agreeable (from my POV as a professional Rust programmer):

In addition to the syntax being far too bitter a pill to swallow, I think this adds too much cognitive load for too little gain (and there's much more load ahead as details are worked out). Users reading and writing code are already too close to (or often way beyond) their cognitive limits to add another degree of polymorphism.

Const, fallibility, and async are all very different from one another in Rust; making an analogy over them is misguided. Async implementations use fundamentally different code (separate data structures and system calls) than sync, whereas const is a strict subset of non-const and can always be called transparently from a runtime context. And a different (though livable) solution to fallibility has already spread all through the ecosystem with Result and having maybe-fallible methods define themselves with Result<T, Self::Error>, with actually-infallible traits defining type Error = Infallible. This works today (you could finish stabilizing ! but otherwise .. it works).
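The fallibility pattern described in the quote can be sketched as follows. `Parse`, `Number`, `Text` and `into_ok` are hypothetical names chosen for illustration; `std::convert::Infallible` is the real std type the quote refers to.

```rust
use std::convert::Infallible;

// Hypothetical trait: every implementor is "maybe fallible".
trait Parse: Sized {
    type Error;
    fn parse(input: &str) -> Result<Self, Self::Error>;
}

struct Number(i64);
struct Text(String);

// A genuinely fallible implementation.
impl Parse for Number {
    type Error = std::num::ParseIntError;
    fn parse(input: &str) -> Result<Self, Self::Error> {
        input.trim().parse::<i64>().map(Number)
    }
}

// An implementation that can never fail says so in its error type.
impl Parse for Text {
    type Error = Infallible;
    fn parse(input: &str) -> Result<Self, Self::Error> {
        Ok(Text(input.to_owned()))
    }
}

// Unwraps an infallible result with no panic path: `Infallible` has no
// values, so the empty match proves the `Err` arm is unreachable.
fn into_ok<T>(r: Result<T, Infallible>) -> T {
    match r {
        Ok(v) => v,
        Err(never) => match never {},
    }
}

fn main() {
    let n = Number::parse(" 42 ").unwrap();
    let t = into_ok(Text::parse("hi"));
    println!("{} {}", n.0, t.0);
}
```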

IMO this whole effort, while well-meaning, is an unwise direction. Writing two different copies of things when they are as fundamentally different as sync and async versions of a function is not bad. Trying to avoid the few cases that are just block_on wrappers isn't worth the cost to everyone else of pursuing this sort of genericity. At some point adding more degrees of genericity is worse than partially-duplicated but actually-different bodies of code. This initiative greatly overshoots that point.

Please reflect on how many times Rust usage surveys have come back with "difficulty learning" as a top problem. That is a very clear message: don't add more cognitive load. Really, Rust needs to stop adding cognitive load. It's important. "Being more Haskell like" is not a feature. Haskell's ecosystem of hyper-generic libraries reduces the set of potential users dramatically because of the cognitive load. That's not clever and cool, it's a serious design failure.

— Graydon Hoare, in a comment on reddit

Posted Feb 24, 2023 22:01 UTC (Fri) by roc (subscriber, #30627) [Link]

That's my gut reaction too. I've written a lot of Rust code, including a lot of async code, and grappled with variable fallibility, and while it can be a bit annoying, I think the cure here looks worse than the problem.

Posted Feb 24, 2023 22:26 UTC (Fri) by bartoc (guest, #124262) [Link] (8 responses)

I'm not convinced async code is so different from sync code that the duplication is justified. After all the whole _point_ of adding state machine style async/await into a language is to allow you to write async code in the same way as sync code without all the overhead of needing to keep track of the entire processor and stack state, the compiler can tell you what state is actually important to keep track of and restore upon running completion code. Like ultimately the synchronous system calls in sync code are _also_ "suspend points" it's just the queues are in different places.

Posted Feb 25, 2023 17:50 UTC (Sat) by rrolls (subscriber, #151126) [Link] (7 responses)

I'm not a Rust programmer, but I like to keep an eye on how a number of different languages develop just out of interest, whether I use them or not.

From that perspective-

I've noticed two main ways of doing async code: "the Node.js way", which started out as callback functions, then turned into Promises, then became what we now call "colored functions" - which has been adopted by Python and Rust; and "the Ruby way", aka Fibers, where any function could potentially suspend, which has been adopted by PHP and (IIUC) Zig. Personally, despite Python being the only one of these languages I actually use on a regular basis, I'm massively in favor of "the Ruby way", for the very reason you point out that it allows you to use, say, a third-party library, with both sync and async code and the library doesn't need to care. I do wonder if the only real reason any language still does it "the Node.js way" is that it'd be a massive backward compatibility break to change it.

It seems the Rust team has come up with their own ingenious solution that should allow de-duplicating most code that suffers from the "is it async or not" problem, though perhaps not as cleanly as languages doing things "the Ruby way", which don't have to mark calls which could potentially be async at all: in Rust, even with the proposal being discussed here, you'll still have to write .await? or .do on every potentially-async function call.

Posted Feb 26, 2023 8:34 UTC (Sun) by burki99 (subscriber, #17149) [Link] (1 responses)

Thanks for bringing this up - I found https://journal.stuffwithstuff.com/2015/02/01/what-color-... explaining the details

Posted Feb 27, 2023 8:11 UTC (Mon) by rrolls (subscriber, #151126) [Link]

Good read. I remember coming across that post myself some years ago!

Posted Feb 27, 2023 15:46 UTC (Mon) by jaymell (guest, #106443) [Link] (2 responses)

I have not used it but understand there are some attempts underway to introduce a coroutine-based concurrency implementation to Rust, e.g., May -- https://github.com/Xudong-Huang/may -- similar to "goroutines" in Go and (I presume) the implementation in Ruby you describe.

I enjoyed using Go for the reasons you describe: generally, any code from any lib can be put into a goroutine and interacted with via channels. It does force you to structure your code very differently than async/await syntax does, however. From what I understand, Kotlin also has a pretty mature coroutine implementation at this point, though it also requires a certain amount of function "coloring".

I'm not sure how this will ultimately play out in Rust, but it will be interesting if we ultimately have multiple options for approaching concurrency.

Posted Mar 6, 2023 13:12 UTC (Mon) by ssokolow (guest, #94568) [Link] (1 responses)

The big problem is that the fibers/stackful coroutines approach Go uses plays poorly with FFI and FFI is Rust's bread and butter.

Give Fibers under the magnifying glass by Gor Nishanov a look.

Posted Mar 8, 2023 21:46 UTC (Wed) by bartoc (guest, #124262) [Link]

The other problem is that it's motivated by performance considerations that no longer apply to modern operating systems (esp if we get io_uring clone/exec)

Posted Mar 8, 2023 21:45 UTC (Wed) by bartoc (guest, #124262) [Link] (1 responses)

The problem with "the ruby way" (fibres) is that you still need to rewrite the whole runtime to support them (since IO routines need to be taught how to switch tasks) and you don't really save any resources over just making a normal thread. At best you can stop allocating stacks (both the kernel stack and the user stack) for each task, but usually you just save the kernel stack. And if you _can_ eliminate both stacks that means your language / runtime heap allocates basically everything. The only other option is to get very, very, very clever often at the expense of some safety or adding limitations on the depth of coroutine invocations (I think zig takes this approach).

These sorts of runtimes also tend to be bug-prone because tasks can call out to libraries that are unenlightened and use things like thread local storage and get surprised when the values change out from under them as a task gets resumed on another "real" thread. This isn't a problem if you only have one "real" thread, but these sorts of systems usually want to use one thread per CPU.

Also, the performance advantages of fibre-like schemes over "just using a real thread" are not that pronounced anymore. They became popular in the days when most operating systems had "one big lock" around the whole scheduler. That's no longer true, so normal OS schedulers now scale much better with large numbers of threads and cores, making these sorts of N:M fibre schemes a little pointless.

Posted Mar 10, 2023 2:05 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> At best you can stop allocating stacks (both the kernel stack and the user stack) for each task, but usually you just save the kernel stack. And if you _can_ eliminate both stacks that means your language / runtime heap allocates basically everything.

Rust's async (or JavaScript's, or Python's) is basically isomorphic to segmented stacks. You save your true stack in a linked list of heap-allocated objects, and the system/kernel stack is only borrowed to run coroutines. You only need a handful of real threads, and there is no problem with having millions of coroutines.

The problem is the speed. Go tried essentially this approach earlier in its life, and segmented stacks failed because they can cause unpredictable and horrible slowdowns when a tight loop crosses over the segmentation threshold.

Instead, now Go uses moveable and resizable stacks, which provide the best of both worlds. This is possible because Go can maintain an invariant that no pointer on the heap can point to an object on the stack. So the runtime can just use contiguous stacks, without any penalty for normal functions. At the same time, the minimum stack size can be very small (2kb for Go, it can go down further, but apparently this is the best compromise).

This kind of design is probably the best overall, but it's very hard to do without rather intrusive runtime support.

Posted Feb 25, 2023 17:20 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

Sounds like me banging on about "emergent complexity".

Don't try and pull concepts from different layers together. By all means try and re-use a similar api for a similar function (or, for eg sync/async provide the same api that has something where you can choose sync/async/don't care).

I don't see why you can't use the same language for hardware-level, system-level and application-level programming, but what you should NOT be doing is taking a task at one level, and making the programmer care about a different level. Things like maybe have the same API for thread-safe and not-thread-safe, but something tells the compiler and it either forces the thread-safe implementation or blocks threads if the not-thread-safe implementation is used.

Cheers,
Wol

Posted Feb 27, 2023 20:18 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

> Things like maybe have the same API for thread-safe and not-thread-safe, but something tells the compiler and it either forces the thread-safe implementation or blocks threads if the not-thread-safe implementation is used.

Rust already does that with the Send and Sync traits. If an object is not Send and/or Sync, then the compiler will know that it is thread-unsafe and refuse to let you share it between threads (the two traits refer to two slightly different interpretations of the word "share"). Rust also automatically deduces these traits where applicable, so most* things are thread-safe by default, and the language prevents you from using threads with thread-unsafe objects.

By the time you get to something that Rust considers thread-unsafe, there is rarely much opportunity to "just" fix it by switching implementations, as the problem likely involves some sort of inherently thread-unfriendly design (e.g. Cell or RefCell). RwLock is a bit like a thread-safe version of Cell/RefCell, but not really, because RwLock has stronger constraints on the caller (the caller must take care to avoid deadlock), and regardless, the API differs, so automatic deduction would be inappropriate. The exception is Rc/Arc, because the latter really is "just" a thread-safe implementation of the former, but a whole language rule for one special case probably wouldn't be worth it, especially since you might still need to wrap the refcounted object in a Mutex or something anyway (Arc doesn't protect the inner value from threads, it just protects its own reference count).

* The borrow checker already provides a measure of thread safety for "simple" objects, and the automatic deduction takes this into account, so a thread-unsafe object is generally going to be rather more complex than it would be in other languages. It's not like C where any static variable or shared pointer is automatically a data race waiting to happen. Rust lets you hand out immutable references like Halloween candy if you so choose, without putting thread safety at risk.
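A small sketch of the Rc/Arc point, using only std (`parallel_count` is a hypothetical helper name): the compiler accepts this because `Arc<Mutex<usize>>` is Send, and the Mutex is what actually protects the inner value.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `workers` threads that each add 1 to a shared counter.
fn parallel_count(workers: usize) -> usize {
    // `Arc` makes the reference count thread-safe; the `Mutex`
    // guards the value itself, which `Arc` alone would not do.
    let counter = Arc::new(Mutex::new(0usize));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            // `thread::spawn` requires the closure to be `Send`.
            // Swapping `Arc` for `Rc` here is rejected at compile
            // time, because `Rc` is not `Send`.
            thread::spawn(move || {
                *counter.lock().unwrap() += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", parallel_count(8));
}
```

Note that the fix for the Rc case is a different type with the same API shape, chosen explicitly by the programmer rather than deduced by the compiler.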

Posted Feb 25, 2023 0:34 UTC (Sat) by gmgod (guest, #143864) [Link]

I'm having a hard time understanding how this kind of polymorphism actually solves anything beyond superficial code duplication.

The distinction between red and blue functions doesn't disappear. If anything I'd prefer async functions being automatically callable from a sync context instead of adding purple functions (however compatible with the rest) to do exactly that (the same way const fn can be seamlessly called at runtime).
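For comparison, this is what the const case looks like in stable Rust today (`square`, `AREA` and `runtime_square` are hypothetical names): one definition serves both compile-time and runtime callers, with no marker at the call site.

```rust
// Hypothetical helper; `const fn` makes it usable at compile time too.
const fn square(x: u32) -> u32 {
    x * x
}

// Evaluated by the compiler while building the program.
const AREA: u32 = square(12);

// The same function called at runtime on a value not known in advance.
fn runtime_square(input: &str) -> u32 {
    let n: u32 = input.trim().parse().unwrap_or(0);
    square(n)
}

fn main() {
    println!("{} {}", AREA, runtime_square("7"));
}
```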

All I see here is a syntactic distinction atop async code... Maybe, just maybe, don't add the distinction... That way you don't get colors, just inflexion depending on the context...


Copyright © 2023, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds