
Unix sockets

Posted Dec 31, 2024 11:13 UTC (Tue) by smurf (subscriber, #17840)
In reply to: Unix sockets by khim
Parent article: Systemd improves image features and adds varlink API

> Go's “green threads” are designed as yet-another-async/await system

Huh. For me async/await implies the two-color problem, which Go doesn't have.

In fact whether the threads underlying the Go runtime are "green" and built on some select call, or 1:1 on top of Fibers, shouldn't really matter from a high-level perspective.

> and are Google's solution to the fact that not everyone have Google Fibers.

Sure, but if basic kernel support for them is out there (is it?) then I'd assume that promoting the Fibers solution to the problem might be in their interest.



Unix sockets

Posted Dec 31, 2024 11:33 UTC (Tue) by khim (subscriber, #9252) [Link] (22 responses)

> Huh. For me async/await implies the two-color problem, which Go doesn't have.

Go's approach to the two-color problem is to eliminate sync entirely. Except it's still there if you are trying to use foreign code. Go's solution to that is to try to implement everything natively.

I'm not sure how much Google embraced Go, but from what I've heard even Google Borg is still not rewritten in Go (even if both Docker and Kubernetes, which are used outside of Google, are written in Go).

> In fact whether the threads underlying the Go runtime are "green" and built on some select call, or 1:1 on top of Fibers, shouldn't really matter from a high-level perspective.

Yes and no. The whole point of Google Fibers is to reuse synchronous code in an async context. Go doesn't have any synchronous code to reuse, thus the whole Google Fibers machinery is pretty much irrelevant for Go: if you have already committed yourself to a rewrite of the whole world to solve the two-color problem in that way, then why would it matter whether you can use old, synchronous code or not?

> Sure, but if basic kernel support for them is out there (is it?) then I'd assume that promoting the Fibers solution to the problem might be in their interest.

Kernel support is “out there” in the sense that Google shared the code and LWN reviewed it.

Google never pushed for its inclusion in the mainline. From what I understand, when mainline people looked at it and rewrote it… Google lost interest.

Because if the whole thing were entirely rebuilt and redesigned… it would be the same story as with Google Borg and Docker/Kubernetes: the outside world gets its new toy, which would help it, but Google wouldn't get a reprieve from the “we are an isolated island” issue.

Instead Google works on async/await… slowly (I wonder if they have even allowed the use of async/await in Rust by now). Microsoft Windows style (where work on NT started in 1989, but most users only got a home edition in 2001, after 12 years of development).

Unix sockets

Posted Dec 31, 2024 19:41 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (21 responses)

I don't believe that these patches involve massive kernel thread concurrency. They merely add a primitive to forcibly suspend/schedule the user-space threads. If it's otherwise, can you please point out where it allows the _kernel_ threads to be lightweight enough for millions of parallel threads?

Unix sockets

Posted Dec 31, 2024 20:16 UTC (Tue) by khim (subscriber, #9252) [Link] (20 responses)

> They merely add a primitive to forcibly suspend/schedule the user-space threads.

Not “userspace threads”. The patches allow one to suspend/resume kernel threads. That's the whole point.

And yes, that's the only thing that you need in the kernel; everything else can be done in userspace.

> If it's otherwise, can you please point out where it allows the _kernel_ threads to be lightweight enough for millions of parallel threads?

Um. Haven't you just written about that? The reason it's not efficient to have millions of threads on normal Linux is not that the Linux scheduler becomes overwhelmed, but that, to implement the async/await story with “green threads” that communicate via channels, one needs a mechanism to wake up “the other side of the channel” when you pass work to it.

This patch adds precisely such a mechanism. Everything else is done in userspace.

Unix sockets

Posted Dec 31, 2024 23:33 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (13 responses)

> Not “userspace threads”. The patches allow one to suspend/resume kernel threads. That's the whole point.

I still don't see it. I looked over the patchset: https://lwn.net/ml/linux-kernel/20210520183614.1227046-1-... and it's just another N:M scheduling system, with the only concession being that the kernel can kick the userspace scheduler when it blocks a thread.

> The reason it's not efficient to have millions of threads on normal Linux is not that the Linux scheduler becomes overwhelmed, but that, to implement the async/await story with “green threads” that communicate via channels, one needs a mechanism to wake up “the other side of the channel” when you pass work to it.

No, that's not the case. A million threads is inefficient because the underlying kernel overhead is unavoidable. Golang has about ~500 bytes overhead for each thread (because it can rescale the stack as needed), Rust can get down to a hundred bytes for each coroutine (because it essentially flattens the entire state machine). The kernel is at least 16kB.

This is fundamental. That's why kernel threads can never be cheap, and the userspace needs some way to multiplex onto them. UMCG is just another mechanism to help with that multiplexing, but it's neither essential, nor does it obviate the need for io_uring/epoll/...
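The Rust figure is easy to check, by the way: the compiler flattens an async call tree into a single value whose size can be printed. A minimal sketch (exact numbers vary by compiler version):

    use std::mem::size_of_val;

    async fn some_io() {}

    async fn handler() {
        // `buf` lives across the await, so it becomes part of the
        // compiler-generated state machine instead of a stack frame.
        let buf = [0u8; 256];
        some_io().await;
        assert_eq!(buf[0], 0);
    }

    fn main() {
        let fut = handler();
        // The whole call tree is flattened into one value; there is no
        // per-task kernel stack behind it.
        println!("future size: {} bytes", size_of_val(&fut));
    }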

Unix sockets

Posted Jan 1, 2025 0:05 UTC (Wed) by khim (subscriber, #9252) [Link] (12 responses)

> and it's just another N:M scheduling system

Where does that idiotic conviction come from? Google even made a video where it explains how and why they decided not to pursue the M:N model – and you still go on and on about how that could never work.

> Golang has about ~500 bytes overhead for each thread (because it can rescale the stack as needed), Rust can get down to a hundred bytes for each coroutine (because it essentially flattens the entire state machine). The kernel is at least 16kB.

That doesn't mean one couldn't have millions of threads. A million times 16kB is only 16GB. Not too much by modern standards. Many desktops have more, servers have a hundred times more.

> That's why kernel threads can never be cheap, and the userspace needs some way to multiplex onto them.

Well, Google Fibers don't do that. You say that as if 16kB per request is a big deal. And compared to 500 bytes it even sounds like one.

But all these fancy tracing GCs take memory (most garbage collectors start behaving abysmally slowly if you don't give them at least 2x “breathing” room), io_uring pools take memory, everything takes memory.

Even the management of an M:N thread system takes memory.

You need a very peculiar and special kind of code to squeeze all that complexity into 15.5kB per request.

Unix sockets

Posted Jan 1, 2025 18:02 UTC (Wed) by ibukanov (subscriber, #3942) [Link] (11 responses)

First, thanks for a very interesting discussion!

> A million times 16kB is only 16GB

Apple M-series uses 16K pages and it's only a matter of time before other CPUs follow. So kernel threads will take more memory.

And the cost of context switching between the kernel and the user space will continue to grow, so io_uring or similar is necessary to achieve optimal performance.

> But all these fancy tracing GCs take memory

Neither Rust nor C++ use GC. Granted, in C++ code like Chromium that implements asynchronous IO using callbacks, there are more heap allocations, but the overhead is minimal. And Rust can do async without heap allocations at all.

So I suspect, after reading this discussion, that Google Fibers are just too specialized. They work nicely in their intended niche (servers behind google.com on present AMD/Intel hardware). But outside of that their advantages are not strong enough to compensate for the drawbacks.

On the other hand, explicitly 2-colored code can target wide areas with better performance while hiding the complexity behind libraries like Rust Tokio's io_uring support.

Unix sockets

Posted Jan 1, 2025 18:49 UTC (Wed) by khim (subscriber, #9252) [Link] (10 responses)

> Apple M-series uses 16K pages and it's only a matter of time before other CPUs follow. So kernel threads will take more memory.

How is that related? The kernel used 8KiB per thread for years. It was expanded to 16KiB ten years ago – but that happened because the kernel stack was overflowing, not because pages had become larger. And since it's already 16KiB… why should a switch to 16KiB pages make it larger? It would just use one page instead of four…

> And the cost of context switching between the kernel and the user space will continue to grow, so io_uring or similar is necessary to achieve optimal performance.

Again: why should context switching have become more expensive? It has been more-or-less constant for the last ~20 years.

The biggest problem that io_uring solves is not the high cost of context switching (a context switch is still needed to actually perform the desired operation, anyway), but green threads (aka async/await) blocking. With the “millions of threads” model, blocking is not an issue.

> Neither Rust nor C++ use GC.

Yes, but async/await wasn't born in C++ or Rust. Both were pressured into adopting async/await by people who were using it in C#, JavaScript, Go and other languages that “waste” memory and CPU cycles in a much worse fashion than Google Fibers.

And the last time I heard, there was no pressure to abandon tracing-GC languages because they waste memory… why is that a problem for the “millions of threads” model, then?

The fact that languages that actually care about every last bit of performance resisted async/await as much as they could, and were literally pressured into it by people who were used to it on their bloated, tracing-GC-based, memory- and CPU-hungry runtimes, tells us everything we need to know about the real reason for async/await… no, it's not efficiency.

> Granted, in C++ code like Chromium that implements asynchronous IO using callbacks, there are more heap allocations, but the overhead is minimal.

Maybe, but the reason Chromium uses that model is certainly not to save some memory: Chromium is one of the most memory-wasteful programs in existence. Chromium has many advantages, but “frugal about memory” is not on the list of its fortes…

Rather, the impetus for Chromium is the need to work well on Windows… so we are back to the “pile of kludges” justification.

> And Rust can do async without heap allocations at all.

Sure. And maybe, on bare metal, it would even go back to where we started and implement the interface with kernel code properly. But everywhere else people are using Tokio and similar executors, which efficiently convert async/await into a set of green threads. Heck, C++ even baked green threads into their version of async/await at the language level, and Rust hasn't done that only because of the incredible resistance of the Rust language developers; the people who clamored for async/await wanted the same thing they got in C++ (and most other languages), and they still grumble about the “useless complexity” that Rust's attempt to avoid allocations brings.

> So I suspect, after reading this discussion, that Google Fibers are just too specialized. … But outside of that their advantages are not strong enough to compensate for the drawbacks.

Sure. But “specialized” here just means “needs a special API in the OS foundations to be usable”. And most apps (including Chromium) need to work on iOS, Windows, macOS… they couldn't adopt Google Fibers.

> On the other hand, explicitly 2-colored code can target wide areas with better performance while hiding the complexity behind libraries like Rust Tokio's io_uring support.

Except it's not “hiding” anything. One couldn't just magically “flip the switch” and make Tokio use io_uring. tokio-uring is a separate and different runtime. You have to support it explicitly in your code.
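To make that concrete, here is roughly what the tokio-uring side looks like (a sketch based on tokio-uring's published examples; treat the details as illustrative). It brings its own runtime entry point, and its IO calls pass buffers by ownership because the kernel owns them while an operation is in flight, so existing Tokio code can't simply be switched over:

    fn main() -> Result<(), Box<dyn std::error::Error>> {
        // tokio-uring has its own runtime entry point...
        tokio_uring::start(async {
            let file = tokio_uring::fs::File::open("hello.txt").await?;
            // ...and its reads take ownership of the buffer and hand it
            // back, instead of borrowing it the way AsyncReadExt::read
            // does in plain Tokio code.
            let (res, buf) = file.read_at(vec![0u8; 4096], 0).await;
            let n = res?;
            println!("read {} bytes", n);
            let _ = buf;
            Ok(())
        })
    }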

The decisive advantage async/await has over Google Fibers is that it leaves the hundreds of layers of kludges that we have accumulated by now undisturbed. And buzzword compliance, of course.

It's just ridiculous to claim that async/await is used because it's “more efficient” in a world where people are using it in Node.js and/or Electron-based apps that come with their own copy of everything, including the kitchen sink (well… a userland USB driver for Xbox 360 controllers…).

P.S. And, again, I'm not saying that C++ or Rust designers did bad work with async/await. On the contrary, they did superb work (Rust guys, especially). But it's important to understand why that work was even needed… and that's to not touch the accumulated pile of kludges… not to save 16KiB of kernel memory. You may save a few kernel threads by going with an Electron app and an async/await JavaScript framework, but you sure as heck don't actually save either memory or CPU cycles. What you do save is investment in the “pile of kludges”… and that's not a bad thing; I'm not saying that everyone should drop the things they use and embrace Google Fibers… but it's important to understand the real reason why one thing or the other is used.

Unix sockets

Posted Jan 1, 2025 19:59 UTC (Wed) by ibukanov (subscriber, #3942) [Link] (8 responses)

> Why should a switch to 16KiB pages make it larger? It would just use one page instead of four…

The user space stack will grow from 4K to 16K.

> Again: why should context switching have become more expensive? It has been more-or-less constant for the last ~20 years.

Protection against Spectre and other hardware bugs is expensive.

> The fact that languages that actually care about every last bit of performance resisted async/await as much as they could

If async/await had existed 30 years ago in C, it would have made programming in C GUI frameworks so much easier, and would have allowed GUI apps that did not freeze when writing to slow disks on the single-core processors of that age, without using threads or mutexes.

Which is another advantage of the async/await model: it allows mixing events from different sources without much complexity. For example, consider the problem of adding a timeout to code that is doing a blocking IO operation in traditional C++. One is forced to use a second thread that just calls sleep() and then sets an atomic flag, which the IO thread checks after the IO operation. Alternatively, instead of a flag, one can try to close the file descriptor to force the IO operation to return. But doing that properly is hard and requires a non-trivial synchronization on the file descriptor access.

Yet an await implementation allows awaiting the results of several IO operations with simpler and more efficient code. Heck, even in plain C on Linux I would rather replace the IO operation with a non-blocking one and do a poll loop with the timeout than use threads, if I have a choice. Then I can just close the file descriptor with no need to worry about any synchronization.
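In async/await terms the same shape is a few lines. A minimal sketch, assuming Tokio; request() is a hypothetical stand-in for a real IO call:

    use std::time::Duration;
    use tokio::time::{sleep, timeout};

    // Hypothetical stand-in for a real IO operation.
    async fn request(id: u32) -> u32 {
        sleep(Duration::from_millis(50 * id as u64)).await;
        id
    }

    #[tokio::main]
    async fn main() {
        // Await three operations concurrently, with one timeout over the
        // whole group; on timeout the pending futures are simply dropped.
        match timeout(Duration::from_secs(30), async {
            tokio::join!(request(1), request(2), request(3))
        })
        .await
        {
            Ok((a, b, c)) => println!("got {a}, {b}, {c}"),
            Err(_) => println!("timed out, everything cancelled"),
        }
    }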

Even Golang is plagued by that problem of having to create extra threads when the code needs to react to a new event. Of course the threads are green and their creation is cheap and ergonomic, but that still brings a lot of complexity. Golang's solution, of course, was to add the color in the form of a Context argument and gradually change libraries to respect that Context's cancellation. They are far from there, as file/pipe IO still has no support for that. Plus cancel is rather limited and still does not allow mixing arbitrary events.

As I understand it, on Google servers there is simply no need to react to extra events. If necessary the OS just kills the whole process or starts a new one.

So async/await is just simply more universal and performant when implemented right, like in Rust, even if the code can be less straightforward than it would be in the cases where Fibers are used at Google.

Unix sockets

Posted Jan 1, 2025 21:08 UTC (Wed) by khim (subscriber, #9252) [Link] (7 responses)

> The user space stack will grow from 4K to 16K.

The userspace stack can always be swapped out. Kernel threads are the limiting factor, not userspace.

> If async/await had existed 30 years ago in C, it would have made programming in C GUI frameworks so much easier

30 years ago every single platform that was embracing GUI was already implementing async/await. Only await was called PostMessage back then. It worked, but poorly.

> and would have allowed GUI apps that did not freeze when writing to slow disks on the single-core processors of that age, without using threads or mutexes

Nope. To make GUI apps that don't freeze you don't need async/await and, in fact, adding async/await doesn't help.

What you need are preemptively scheduled threads. And they were embraced as soon as they became feasible.

Except for one platform: the Web. JavaScript and the DOM were designed as fundamentally single-threaded things and thus couldn't benefit from preemptively scheduled threads. And on Windows these threads were so heavy that they even tried to add green threads back at the OS level in Windows NT 3.51 SP3.

And in these circumstances, on platforms where threads were too heavy to even contemplate the “millions of threads” approach, async/await was added to F#, C#, Haskell, Python, TypeScript, JavaScript (in that order), and only after all that – to C++ and Rust.

Please don't try to play the “this was done for efficiency” tune: it doesn't sound even remotely plausible when the languages that should care about efficiency the most added async/await last.

No, that was a kludge for inefficient runtimes. And then it was brought to C++ and Rust “from above”… but no one ever compared it to the alternatives. Except Google – but since Google embraced the “millions of threads” approach decades before async/await was added to C++, it's biased in the other direction.

> But doing that properly is hard and requires a non-trivial synchronization on the file descriptor access.

And doing that with async/await is not easier and requires the exact same dance. Read about how CancellationTokens work in C#.

You can do the exact same thing with Google Fibers and even normal threads.

The only platform where such cancellation is “easy” (for some definition of “easy”) is Rust – and that's because, when heavily pressured to embrace the async/await paradigm, Rust adopted it in a radically different fashion from all other languages.

But that was just a side effect of how it was implemented (Rust doesn't have a runtime and was unwilling to add one back just for async/await) – and many still think it was a mistake: in practice people don't understand that any async function in Rust can be stopped and then dropped at any await point, and there are attempts to fix that problem!

IOW: what you portray as the “shining advantage” of async/await is only available in one implementation that arrived late, and is considered a problem there by many developers!
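For reference, the drop-at-await behavior looks like this; a minimal sketch, assuming Tokio:

    use std::time::Duration;
    use tokio::time::sleep;

    async fn slow_work() {
        println!("started");
        sleep(Duration::from_secs(60)).await; // may be cancelled right here
        println!("finished"); // never runs if the future is dropped
    }

    #[tokio::main]
    async fn main() {
        tokio::select! {
            _ = slow_work() => println!("work won"),
            _ = sleep(Duration::from_millis(10)) => {
                // The losing future is dropped mid-await: no unwind, no
                // exception, the code after its await simply never runs.
                println!("timeout won; slow_work was dropped");
            }
        }
    }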

> Yet an await implementation allows awaiting the results of several IO operations with simpler and more efficient code.

Why does it even matter? Yes, something that the “millions of threads” approach doesn't even need can be easier and faster with async/await… so what?

You have created serious problems for yourself and then heroically managed to overcome them… congrats… but isn't it yet another example of how we build kludges on top of kludges?

> Golang's solution, of course, was to add the color in the form of a Context argument and gradually change libraries to respect that Context's cancellation.

And that's what every other language does, too. Even Rust. Only C# starts with non-cancellable green threads and Rust with infinitely cancellable async/await, but they both realize that you need to bake cancellation into a context and only cancel in places where it's safe to cancel.
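In the Rust ecosystem that baked-in pattern is tokio-util's CancellationToken, the same shape as C#'s tokens or Go's Context. A minimal sketch:

    use std::time::Duration;
    use tokio::time::sleep;
    use tokio_util::sync::CancellationToken;

    async fn worker(token: CancellationToken) {
        loop {
            tokio::select! {
                // Cancellation is observed only here, at a point the
                // worker chose, not at some arbitrary instruction.
                _ = token.cancelled() => {
                    println!("cleaning up, then stopping");
                    return;
                }
                _ = sleep(Duration::from_millis(100)) => {
                    println!("unit of work done");
                }
            }
        }
    }

    #[tokio::main]
    async fn main() {
        let token = CancellationToken::new();
        let handle = tokio::spawn(worker(token.clone()));
        sleep(Duration::from_millis(350)).await;
        token.cancel();
        handle.await.unwrap();
    }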

Thread-based word also have it's own solution, BTW. And it doesn't work for the exact same reason.

Arbitrary cancellation in an arbitrary place is just not safe!

> So async/await is just simply more universal and performant when implemented right, like in Rust, even if the code can be less straightforward than it would be in the cases where Fibers are used at Google.

Maybe; this remains to be seen. But that's not even the point we are discussing here. If you would say that the design of async/await that Rust adopted in 2019 (and that was designed in 2016-2018, you can find the references here) influenced the decisions of Microsoft in 2007 (when F# got async) or even in 2012 (when C# got it in version 5.0), then I would recommend that you study how causality works: “after” does not mean “because”, sure… but “before” does mean “not because”!

It would be interesting to see what Rust manages to do with the kludges that it embraced in async/await; it's really very exciting work… only it's brought to life by the set of kludges that Rust had when it embraced async/await, and not by the fact that Google Fibers are inefficient.

And Rust had to adopt async/await not because they wanted to build something new and unique; nope, Rust embraced async/await because it was a popular paradigm in other languages, most of them either not very efficient or highly inefficient – and developers wanted the same paradigm in Rust.

Nowhere in that journey was an attempt considered to solve concurrency problems in a different fashion, without “green threads” – and even Rust (which did an interesting experiment by embracing async/await while trying to reject the “green threads”) got all that it got not because the “millions of threads” approach was not considered efficient enough. Also, in practice, only the embedded world has tried to embrace async/await without “green threads” in Rust.

An even more exciting experiment, but very much a result of the set of kludges and design decisions Rust had when it tried to adopt async/await, and not a result of someone's decision to go redesign the foundations.

Unix sockets

Posted Jan 2, 2025 10:13 UTC (Thu) by smurf (subscriber, #17840) [Link] (1 responses)

> The only platform where such cancellation is “easy” (for some definition of “easy”) is Rust

Another platform where cancellation Just Works is Python. I've been using the Trio async runtime for ages now (or what feels like them, given that the library is ~7 years old), and there's even a way to side-step the two-color problem if necessary ("greenback").

> Thread-based word also have it's own solution

Umm, might you be persuaded to not mix up "its" and "it's"? Thanks.

> Arbitrary cancellation in an arbitrary place is just not safe!

which is one reason why the two-color problem sometimes is not a problem and async/await is actually helpful: at every place you can possibly get cancelled (or rescheduled) there's a glaringly obvious "async" or "await" keyword.

Unix sockets

Posted Jan 2, 2025 12:38 UTC (Thu) by Wol (subscriber, #4433) [Link]

> > Thread-based word also have it's own solution

> Umm, might you be persuaded to not mix up "its" and "it's"? Thanks.

Mind you, he's in very good company - the number of NATIVE speakers who mess it up ...

Herewith a Grammar Lesson - the apostrophe should *only* be used to indicate an elision (aka missing letters).

In "ye olde English" (and that's a thorn, not a y), possessives were created by adding "es". In time, people dropped the "e", and the standard ending became " 's ". Except there's a special rule for words that naturally end in "s", for example "James", which may have passed through "James's" before the official standard of " James' " took hold. Mind you, people are happy to use either version.

Now to the thorny question of "its". Here the possessive has always been "its", with no "e". Hence it does NOT have an apostrophe, because there was no letter there to elide. "It's" always is an elision of "it is".

Cheers,
Wol

async/await

Posted Jan 2, 2025 15:10 UTC (Thu) by kleptog (subscriber, #1183) [Link] (4 responses)

> 30 years ago every single platform that was embracing GUI was already implementing async/await. Only await was called PostMessage back then.

No, async/await is a language feature that allows programmers to write asynchronous code in a natural way. I don't see the relevance of PostMessage: it doesn't sleep, so it is not useful in implementing await. The magic of await is not the triggering of other code, but the fact that the response comes back at exactly the same point in the calling code.

GUI apps use an event loop, which is not what async/await is about. It may work that way under the hood; the point is *the programmer doesn't need to know or care*. And indeed, Rust apparently does it without an event loop (a neat trick, I may add).

The reason async/await became popular is that it makes the Reactor pattern usable. Creating several new threads for every request is tedious overhead. You may say that 16GB for millions of kernel stacks is cheap, but it's still 16GB you could be doing more useful things with. Most of those threads won't do any syscalls other than memory allocation, so they're wasted space.

(I guess it depends on the system, but in my experience, the I/O related threads form a minority of actual threads.)

I don't know if anyone else here used the Twisted framework before yield/send (the precursor to async/await) were usable. That was definitely the definition of unreadable, hard-to-reason-about code, especially the exception handling. Async/await makes sense to anyone with minimal training.

(Interestingly, Twisted got inlineCallbacks also around 2007, right about the time F# got async/await... Something in the air I suppose.)

> Arbitrary cancellation in an arbitrary place is just not safe!

Just like arbitrary preemption in arbitrary places is unsafe. Which is why it's helpful if it only happens when you use await. And cancelling async tasks is easy: just have await throw a cancellation exception, like Python does.

(I just looked at C# and the cancellation tokens and like, WTF? Why don't they just arrange for the await to throw an exception and unwind the stack the normal way? If the answer is: it was hard to change the runtime to support this they should be shot. Cancellation tokens are an ergonomic nightmare.)

> > Yet an await implementation allows awaiting the results of several IO operations with simpler and more efficient code.

> Why does it even matter? Yes, something that the “millions of threads” approach doesn't even need can be easier and faster with async/await… so what?

Umm, you receive a request, which requires doing three other requests in parallel and aggregating the responses. Creating three threads just for that is annoying, but with the right framework you can make it look the same I suppose. But creating a bunch of kernel stacks just so they can spend their lifetime waiting on a socket and exiting seems like overkill.

results = await asyncio.wait([Request1(), Request2(), Request3()], timeout=30)

Is quite readable and does what it says on the tin. And even supports cancellation!

Despite this thread going on for a while, it still appears the only real argument for Google Fibres is "Google uses them internally a lot but no-one else is interested". If there were even one other large business saying they used them and had published sources we'd actually be able to compare results.

async/await

Posted Jan 2, 2025 16:19 UTC (Thu) by khim (subscriber, #9252) [Link] (3 responses)

> I don't see the relevance of PostMessage: it doesn't sleep, so it is not useful in implementing await.

It doesn't sleep today, which means it was useless as await 30 years ago? What kind of logic is that? Besides, .await is not about sleeping (in fact it returns right back if there is no other work to do anywhere in the system).

30 years ago was 1995, before Windows 95, and on Win16 PostMessage gave other GUI window procedures the chance to run and process their GUI changes – and that's the essence of await.

That's how multitasking worked in an era before preemptive multitasking… almost like async/await works today.

And it was a nightmare: the GUI wasn't “responsive”, it was incredibly laggy, because any app could hog the whole system… just like may happen today with single-threaded executors (mostly used in tests).

> The magic of await is not the triggering of other code, but the fact that the response comes back at exactly the same point in the calling code.

That's, indeed, some kind of magic which can be achieved by the use of some kinds of heavy drugs, because it doesn't match reality. It's just not how async/await works.

Sure, there are lots of developers who naïvely believe that's how async/await works, and it may even be the reason for async/await's popularity… but it's not how it works.

> GUI apps use an event loop, which is not what async/await is about.

What's the difference? Conceptually, “an event loop” does the exact same thing as an executor in the async/await world. Sure, most languages hide these details from the developer, but so do application frameworks (MFC, OWL, etc.).

> It may work that way under the hood; the point is *the programmer doesn't need to know or care*.

When the programmer doesn't know or care, the end result is an incredibly laggy and unpleasant mess. Abstractions are leaky and async/await is not an exception.

And if you are OK with laggy and unpleasant programs, then you need neither async/await nor “millions of threads”: producing a mess is easy with any paradigm.

> And indeed, Rust apparently does it without an event loop (a neat trick, I may add).

Rust doesn't do it “without event loop”. It just makes it possible to write your own event loop. That was possible to do in Windows 1.0 about 40 years ago, too.

> You may say that 16GB for millions of kernel stacks is cheap, but it's still 16GB you could be doing more useful things with.

So you don't want to lose 5% of memory because you need it… for what exactly?

> I don't know if anyone else here used the Twisted framework before yield/send (the precursor to async/await) were usable.

Ah, right. You need those 5% of memory and CPU cycles saved so you can waste 95% of the computer's power on the highly inefficient runtime of a highly inefficient language with a GIL. Makes perfect sense. Not.

> (Interestingly, Twisted got inlineCallbacks also around 2007, right about the time F# got async/await... Something in the air I suppose.)

Oh, absolutely. But you just have to understand and accept what problems are solved with these tools.

And these problems are not problems of efficiency or kernel memory waste (after all, wasting 16KiB of kernel memory is still less painful than wasting 16MiB of userspace memory).

No, async/await was solving an entirely different problem: on OSes and languages that had no efficient threading (or, in some cases, no threading at all) it made it possible to write some kind of concurrent processing that actually worked.

That was the real justification; everything was developed in the expected “we would put another layer of kludges on top of the pile of kludges that we already have” fashion.

And Google Fibers were developed for that, too… they were just developed for a different pile of kludges – but at least they are somewhat relevant to the discussion, because they do show that changes in the foundation are possible, even if in a limited scope.

> If the answer is: it was hard to change the runtime to support this they should be shot.

Why? They were developing a solution for yet another pile of kludges. And invented something that worked best for that particular pile.

> But creating a bunch of kernel stacks just so they can spend their lifetime waiting on a socket and exiting seems like overkill.

But using a language that spends 95% of resources just on its own bookkeeping, which has nothing to do with the task at hand, is not overkill? Get real, please.

> Despite this thread going on for a while, it still appears the only real argument for Google Fibres is "Google uses them internally a lot but no-one else is interested".

And the only counterargument is “we shouldn't waste 5% of resources on Google Fibers after we already wasted 95% of resources on our language runtime”.

Which even sounds funny, and couldn't be real in any sane world. The real argument (and a pretty sensible one) is “Google Fibers do solve the problem they claim to solve, but we couldn't use them because of the pile of kludges that we have, while async/await is more complicated but fits into said pile of kludges better”. Because:

> results = await asyncio.wait([Request1(), Request2(), Request3()], timeout=30)

> Is quite readable and does what it says on the tin. And even supports cancellation!

Sure, but nothing prevents you from implementing such an API with Google Fibers. Except if you have a language with heavy threads (like C#) or a GIL (like Python).

But if you have committed yourself to such a language then you have already wasted more resources than Google Fibers may ever waste!

It's stupid and pointless to talk about async/await as if it is better in general, when the only reason it exists is deficiencies in our OSes and language runtimes.

P.S. Again, if someone had gone to the foundations and fixed that, by replacing syscalls with something better, then complaints about the wasteful nature of Google Fibers would have sounded sensible. Google Fibers are not perfect. But the fact remains that all current async/await alternatives are even more wasteful! Except Rust, maybe, but then Rust was very late to the party, and Rust developers also accepted the “pile of kludges” reasoning; they just kinda said “if we need to adopt async/await then we may as well use that opportunity to add something to the language that we wanted to have since day one”.

async/await

Posted Jan 3, 2025 9:54 UTC (Fri) by smurf (subscriber, #17840) [Link] (2 responses)

> > The magic of await is not the triggering of other code, but the fact that the response comes back at exactly the same point in the calling code.

> That's, indeed, some kind of magic which can be achieved by the use of some kinds of heavy drugs, because it doesn't match reality. It's just not how async/await works.

The point is that this is how async/await looks to the programmer – just another method call, with an additional keyword.

The fact that there's a stack unwind/rewind or promises (plus closures) or some other deep magic underneath, along with an event loop or an io_uring or whatever, is entirely irrelevant to a high-level programmer, for much the same reason that most high-level languages no longer have a "goto" statement even though that's all the underlying hardware is capable of.

async/await

Posted Jan 3, 2025 13:28 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> The point is that this is how async/await looks to the programmer – just another method call, with an additional keyword.

And what's the difference from a call to PostMessage, which also looks like “just a function call” but has many subtle implications, including a possible switch of context to handle other GUI objects?

You can't have it both ways: either we do care about the implementation (and then async/await is just a set of green threads with a funky interface) or we don't care about the implementation (and then it's just a new edition of GetMessage/PostMessage).

> The fact that there's a stack unwind/rewind or promises (plus closures) or some other deep magic underneath, along with an event loop or an io_uring or whatever, is entirely irrelevant to a high-level programmer, for much the same reason that most high-level languages no longer have a "goto" statement even though that's all the underlying hardware is capable of.

Sure, but then we should stop pretending that Google Fibers were rejected because they weren't efficient enough. After wasting 50%+ of memory, complaining about 5% is just strange. And compared to the overhead of dynamic typing, the cost of syscalls is laughable, too.

I'm not arguing about the “goodness” of the fact that async/await was adopted and Google Fibers ignored. I'm just showing that in both cases the choice was made not because one thing is actually better for something – but because it was a better fit for the pile of kludges that had accumulated already. Or maybe it's better to say not even “kludges” but “horse's asses”: in many cases it's not even kludges that dictate our choices but some past decisions that “made sense at the time”. Like the use of NUL as a string terminator. How many things has that “clever” decision brought? From changes to languages (like PChar in Pascal/Delphi) straight to the design of our CPUs (e.g. RISC-V's Fault-Only-First load may as well be called a “strlen load”). Lots of crazy kludges, but can we reverse those decisions? Very unlikely.

async/await

Posted Jan 3, 2025 20:59 UTC (Fri) by kleptog (subscriber, #1183) [Link]

> And what's the difference from a call to PostMessage, which also looks like “just a function call”,

Well, await returns a value, and PostMessage() returns void, so they're apples and oranges really. Sure, there's GetMessage(), but that returns *a* message, which is useless from the programmer's perspective. You want the results of the async function you just called, not just any response.

Sure, it's a point at which your code can be preempted to run other code, but in a multithreaded program that's everywhere, so not really a new problem.

Unix sockets

Posted Jan 12, 2025 19:10 UTC (Sun) by mrugiero (guest, #153040) [Link]

> P.S. And, again, I'm not saying that C++ or Rust designers did bad work with async/await. On the contrary, they did superb work (Rust guys, especially). But it's important to understand why that work was even needed… and that's to not touch the accumulated pile of kludges… not to save 16KiB of kernel memory.

This is simply historically inaccurate, given Rust had green threads at the beginning and only later on decided to ditch them and go for async.

Unix sockets

Posted Jan 1, 2025 11:05 UTC (Wed) by kleptog (subscriber, #1183) [Link] (5 responses)

I think you're missing why async/await are so popular: because they're easy to reason about.

There are basically two forms of multiprocessing that are easy to reason about:

* Share-nothing threading, essentially multiprocessing à la Unix processes. This is used in the Actor model, like Erlang, but also (I think) Go and Rust essentially do this too, as the compiler ensures no shared data. Here locking is unnecessary because data races are impossible.

* Multi-threading with programmer-defined preemption points, aka async/await. Because you know exactly where you can be rescheduled, you can minimise locking requirements and easily reason about the parts that do need it.

The share-everything that is C(++) threading, with rescheduling possible at any point, is nearly impossible for mere humans to reason about. Which is why it's not popular (although still common).

Btw, async/await is not done using green threads, that would be stupid. The compiler transparently turns your program into a state machine, and your program consists of: pull job from queue, run job, repeat. You could call the tiny fragments of your program that get scheduled "threads", but nobody does that.
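For the curious, here is roughly what that state machine looks like when written out by hand in Rust (a sketch; the real compiler-generated type is anonymous and handles pinning more carefully, and block_on from the futures crate is used only to keep it runnable):

    use std::future::{ready, Future, Ready};
    use std::pin::Pin;
    use std::task::{Context, Poll};

    // Roughly the shape the compiler generates for
    //     async fn two_steps() -> u32 { step_a().await; step_b().await }
    // with one enum variant per suspension point.
    enum TwoSteps<A, B> {
        StepA(A, fn() -> B),
        StepB(B),
        Done,
    }

    impl<A, B> Future for TwoSteps<A, B>
    where
        A: Future<Output = ()> + Unpin,
        B: Future<Output = u32> + Unpin,
    {
        type Output = u32;

        fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
            loop {
                match &mut *self {
                    TwoSteps::StepA(a, make_b) => match Pin::new(a).poll(cx) {
                        Poll::Ready(()) => {
                            // First await finished: advance the state machine.
                            let make_b = *make_b; // fn pointers are Copy
                            *self = TwoSteps::StepB(make_b());
                        }
                        Poll::Pending => return Poll::Pending,
                    },
                    TwoSteps::StepB(b) => match Pin::new(b).poll(cx) {
                        Poll::Ready(v) => {
                            *self = TwoSteps::Done;
                            return Poll::Ready(v);
                        }
                        Poll::Pending => return Poll::Pending,
                    },
                    TwoSteps::Done => panic!("polled after completion"),
                }
            }
        }
    }

    fn step_b() -> Ready<u32> {
        ready(42)
    }

    fn main() {
        let fut = TwoSteps::StepA(ready(()), step_b);
        // Any executor can drive this value to completion.
        assert_eq!(futures::executor::block_on(fut), 42);
    }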

The two-colour problem is overstated. Fundamentally you have functions that can sleep and functions that can't. Obviously you can call one way, but not the other. Even the Linux kernel has this, except the compiler doesn't check for you that you did it correctly. This is essential complexity which cannot be removed, only made more ergonomic by language design.

I think this patch set is being sold wrong. You don't want the kernel to manage millions of threads. But if you look at the Erlang VM, which schedules millions of user-space processes, it uses one thread per CPU; it can ensure that syscalls are run in separate IO threads, or use io_uring to parallelise the IO calls. But some sleeps are unexpected, like page faults, and being able to wake up another thread in that case to continue working is useful. So you'd have maybe 2 threads per CPU with only one running at a time.

Unix sockets

Posted Jan 1, 2025 12:09 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (3 responses)

> This is used in the Actor model, like Erlang, but also (I think) Go and Rust essentially do this too, as the compiler ensures no shared data.

I don't think Go does shared-nothing and, while there are actor frameworks for Rust, they are by no means typical. The borrow checker is Rust's no-race solution, not shared-nothing.

Unix sockets

Posted Jan 1, 2025 12:31 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

> The borrow checker is Rust's no-race solution, not shared-nothing.

The borrow checker works quite poorly with async because, very often, you can't pass borrows across an await point (and rightfully so), thus you end up with a bunch of Arcs and correctness is no longer checked by the compiler.
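A minimal sketch of that effect, assuming Tokio (it's the 'static bound on spawned tasks that forces the Arc):

    use std::sync::Arc;

    #[tokio::main]
    async fn main() {
        let config = Arc::new(String::from("shared state"));

        // tokio::spawn requires a 'static future, so a plain &config
        // borrow can't cross into the task; the idiomatic escape hatch
        // is an Arc clone, and from then on the compiler no longer
        // tracks who touches the data when.
        let task = tokio::spawn({
            let config = Arc::clone(&config);
            async move {
                println!("task sees: {}", config);
            }
        });

        task.await.unwrap();
        println!("main still sees: {}", config);
    }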

I'm not blaming Rust developers for that design: it's hard to imagine how they could have made something better.

But if not for the need to pile that work on top of many layers of kludges, they could have either removed sync entirely (but then you need to do syscalls differently) or, alternatively, adopted the “millions of threads” Google Fibers model (and the borrow checker would have been much more usable).

That wasn't done, and for an obvious reason: adding yet another layer of kludges on top of kludges is just simply easier than changing the foundations.

Unix sockets

Posted Jan 1, 2025 15:19 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

> The borrow checker works quite poorly with async

Note that I claimed nothing of the sort relating the borrow checker and async. I was pointing out that Rust does not use anything resembling Erlang's actor model.

> The borrow checker works quite poorly with async because, very often, you can't pass borrows across an await point (and rightfully so), thus you end up with a bunch of Arcs and correctness is no longer checked by the compiler.

I believe there are efforts in the works so that variables which do *not* live across an `.await` point do not affect things like `impl Send` for `async fn` synthesized types.

> That wasn't done, and for an obvious reason: adding yet another layer of kludges on top of kludges is just simply easier than changing the foundations.

Note that Rust *is* tackling "changing the foundations" problems, but "require all new kernel functionality" isn't one of them (Rust also aims for practical use with existing platforms after all).

Unix sockets

Posted Jan 1, 2025 17:40 UTC (Wed) by khim (subscriber, #9252) [Link]

> Rust also aims for practical use with existing platforms after all

And that's exactly what we are talking about here. Remember where the whole discussion started: for me the problem is not how things were done initially, but rather why they stayed that way even after the original reasoning was no longer relevant.

> Note that Rust *is* tackling "changing the foundations" problems

Only in the sense that it's one of the [very few] modern languages that are not sitting on top of a massive runtime written in C or C++.

And yes, Rust is very good at what it does; only the scope of what it does is still limited by the fact that it has to deal with all the accumulated piles of kludges everywhere.

In particular, the attempt to embrace “modern green threads” (which are called async/await today) makes Rust's safety story weaker…

Technically Google Fibers are a much smaller and simpler thing than Rust, but because they change the foundations, people reject them. Often that rejection even happens on a spiritual level that's fig-leafed by the reasoning that a kernel thread has to use 16KiB, which means we shouldn't even try to improve the 1-to-1 model but have to go with green threads (which are called async/await today). I wonder why these 16KiB have suddenly become so critical today… they weren't critical 20 years ago, so what has changed now?

Unix sockets

Posted Jan 1, 2025 12:24 UTC (Wed) by khim (subscriber, #9252) [Link]

> I think you're missing why async/await are so popular

No, I know why they are popular.

> because they're easy to reason about.

No, they are pretty hard to reason about. Much harder than when you work with threads and channels. And they are pretty hard to debug.

But on the web they are the only game in town (because JavaScript is a single-threaded language), and on Windows you have to use them because fibers can efficiently support the async/await model but couldn't handle millions of threads.

They are popular because you can add them as another layer of kludges on top of what you already have – and that's precisely my original point that started the whole discussion.

> Because you know exactly where you can be rescheduled, you can minimise locking requirements and easily reason about the parts that do need it.

You only “know the locking requirements” if you are very careful about avoiding sync functions. Otherwise they “clog the pipes” and you spend a crazy amount of time looking for the root cause. Especially if those sync functions are not “always slow” but “sometimes slow”.
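That failure mode is easy to reproduce; a minimal sketch, assuming Tokio with a single-threaded runtime:

    use std::time::{Duration, Instant};

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        let start = Instant::now();

        // A "sometimes slow" sync call hidden inside an async task:
        // it holds the executor thread, so no other task can run.
        let clogger = tokio::spawn(async {
            std::thread::sleep(Duration::from_millis(500));
        });

        let timer = tokio::spawn(async move {
            tokio::time::sleep(Duration::from_millis(10)).await;
            // Prints well after 10ms: the sync sleep clogged the pipe.
            println!("timer fired after {:?}", start.elapsed());
        });

        let _ = tokio::join!(clogger, timer);
        // The usual fix is tokio::task::spawn_blocking for the sync call.
    }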

> The share-everything that is C(++) threading, with rescheduling possible at any point, is nearly impossible for mere humans to reason about. Which is why it's not popular (although still common).

It's the exact same model that C# or Java uses. It's hard to say that something that's used by 90% of apps (if not 99%) “is not popular”.

> Btw, async/await is not done using green threads, that would be stupid.

Sure. There is no need to implement async/await with green threads, because async/await already is green threads.

> You could call the tiny fragments of your program that get scheduled "threads", but nobody does that.

Of course. But that's marketing. If you admit that a Box<Future<Foo>> (which includes frames for all the functions that your function may need to call, directly or indirectly) is indistinguishable from a Novell NetWare-style green thread, then you wouldn't be understood… even if technically there is very little difference (only in the case of Novell NetWare it was a human who calculated the size needed for that construct, and recursion was allowed; in Rust the compiler does the same job, and recursion is not allowed… both have pluses and minuses, but essentially they are one and the same… oh, right, await was called something like PostMessage 30 years ago – and you had to use it at preemption points even if you had nothing to “post”).

> This is essential complexity which cannot be removed, only made more ergonomic by language design.

It can be removed, and it's not even conceptually hard, but it has to be done at the root. Just make sure there are no sync functions. Like in one well-known OS.

But of course that won't be done. We like our pile of kludges way too much to do something about it.

> I think this patch set is being sold wrong.

I don't think it's “sold” anymore. Google wanted to reduce the difference between their kernel and the mainstream one. After looking at the reaction… I think at this point they just decided to carry that patch instead of upstreaming it.

> So you'd have maybe 2 threads per CPU with only one running at a time.

That would be an interesting project, but it would be an entirely different project. The Google Fibers design was a solution for the two-color problem. And with kernel threads used as the basis, you no longer have it. You don't need special mutexes (like Tokio's mutex), your Fiber-aware code can include communication with regular threads, etc.

That was the goal. And it was achieved; that's how Google Fibers have been used for more than a decade (and will probably be used for the foreseeable future).

Every single reviewer tried to bend that patchset into an Erlang-style model or an M:N model… which made it entirely uninteresting for Google: sure, someone else may benefit from all that… but Google would still need to carry the patch for Google Fibers… what's the point of pushing it, then?

