Systemd improves image features and adds varlink API
The systemd v257 release brings a number of incremental enhancements to various components and utilities for working with Linux systems. This includes more support for varlink, automated downloading of disk images at boot time, and a number of improvements to the secure-boot process for unified kernel images (UKIs), which we have covered in a separate article.
The release was announced on December 10 to the systemd-devel mailing list, and Lennart Poettering followed up with a blog post that linked to his Mastodon threads about new features in v257.
varlink
One of the interesting changes in systemd v257 is the project leaning into varlink. Poettering has been championing the varlink inter-process communication (IPC) protocol as an alternative to D-Bus for some time. (LWN covered varlink in 2018, shortly after its announcement.) At the All Systems Go! (ASG) conference in September, he delivered a "Varlink Now!" presentation to explain why systemd is adopting it heavily, and to make the case for other software to adopt it as well. (A video of the talk is available, as are the slides.)
One of the primary problems with D-Bus, at least from the systemd perspective, is that it is not available until late in the boot process—too late for many things systemd needs to do. Poettering also said that it is easier to write varlink services than D-Bus services, because a varlink service can handle each connection in a separate process:
To give one example, "bootctl" is a small tool that installs the systemd-boot boot loader into the ESP for you. It's a command line tool that synchronously copies a bunch of files into the target mount. We always had the plan to turn that into a D-Bus service, but never actually did it, because doing that is pain: we'd have to turn it into an event loop driven thing, which is just nasty for something so simple that just copies some files.
In the slides from ASG he also refers to D-Bus's security model as "garbage", and says that D-Bus cannot be used for many basic IPC services since D-Bus itself uses them (which would create a cyclic dependency). On Mastodon he provided the example of journald, which supplies logging for the D-Bus broker and so cannot itself provide APIs via D-Bus.
He said in November that systemd had tended to use varlink "where D-Bus was just too bad to use, and only internally". That has changed in recent releases of systemd—as of v257 it has 19 varlink interfaces or services, but only 11 D-Bus API services.

In v257 the varlink API has been added to libsystemd as sd-varlink, a public API to help in implementing varlink clients and services. (The API existed in systemd prior to v257, but was not exposed as a public API.) It also brings along a "companion API" called sd-json, which is a C library for JSON. There are many of those, so why another? Poettering said:
My answer to that is that sd-json is much nicer to use than most, since it doesn't just allow you deal with JSON objects, it also helps you with building them from C data structures, and to map JSON objects back to C data structures.
Poettering noted that one might say that documentation for sd-json and sd-varlink is "barely existing", but there are examples of using them within the systemd source tree.
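In the meantime, varlink services can also be explored interactively with varlinkctl. As a quick sketch, assuming systemd-userdbd is running and its multiplexer socket is in the usual location (paths may vary between distributions):

$ varlinkctl info /run/systemd/userdb/io.systemd.Multiplexer
$ varlinkctl introspect /run/systemd/userdb/io.systemd.Multiplexer

The first command reports which interfaces the service implements; the second dumps the interface definitions in the varlink IDL.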
The systemd-machined service for registering and tracking virtual machines has also gained a set of varlink APIs, as an alternative to the D-Bus interface. The project has also beefed up the varlinkctl utility that is used to introspect and invoke varlink services.
For example, varlinkctl has gained a --timeout= option to allow the user to specify how long it should wait for a call to complete. Users can now also invoke varlink services on remote hosts over SSH by using the ssh-exec: prefix to specify the host to connect to and the binary to run there. This command, adapted from the varlinkctl man page example, uses ssh-exec: to ask the copy of systemd-creds on a remote host to report information about the varlink service it provides:
$ varlinkctl call ssh-exec:host.example.com:/usr/bin/systemd-creds \
      org.varlink.service.GetInfo '{}'
This could be useful for administrators who want to make a quick change on a remote system without the need to log in first.
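The new --timeout= option works for local calls as well; a small sketch that queries a local varlink service (again assuming the systemd-userdbd socket is present at this path) and gives up after ten seconds:

$ varlinkctl call --timeout=10 /run/systemd/userdb/io.systemd.Multiplexer \
      org.varlink.service.GetInfo '{}'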
Working with system images
Image-based Linux systems are increasingly popular. In the last year or so we've looked at quite a few image-based distributions, including Aeon, Bluefin, Endless OS, and Vanilla OS. As various projects look to image-based deployment as a way to simplify building, deploying, and updating Linux systems, systemd adds a few tools especially for these types of distributions.
Systemd has been able to use system extension (sysext) images to add software to so-called immutable systems by layering on an image at runtime without making permanent changes to the base image. LWN's report from the 2024 Image-Based Linux Summit discusses recent developments with Flatcar Linux and sysext. Even though sysext is primarily aimed at image-based distributions, it can be used with traditional Linux distributions as well. For example, early testing instructions for the COSMIC Desktop recommended using a sysext to try out COSMIC without making permanent changes to a system.
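As a rough sketch of how that works in practice, with a hypothetical extension image named cosmic.raw (the file name and contents here are placeholders, not an official COSMIC artifact):

# cp cosmic.raw /var/lib/extensions/
# systemd-sysext merge
# systemd-sysext list
# systemd-sysext unmerge

The merge operation overlays the image's /usr (and optionally /opt) contents on top of the running system; unmerge removes the overlay again without leaving a trace in the base image.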
A new tool, systemd-import-generator, makes it possible to specify an image URL on the kernel command line, so that systemd will automatically download the requested images at boot using the systemd-importd service. It is not limited to sysexts or image-based systems, however. It can also be used to download configuration extension (confext) images, container images to be run with systemd-nspawn, or virtual machine images for systemd-vmspawn.
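As an illustration, a boot entry might carry an option along these lines to have a sysext image fetched before the system comes up; the image name and URL are invented, and the exact argument format for systemd.pull= is the one documented in the systemd-import-generator man page rather than what is shown here:

# Illustrative sketch only; consult systemd-import-generator(8) for the real argument syntax.
systemd.pull=raw,sysext:myext:https://example.com/myext.raw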
The systemd-sysupdate tools have also been improved in v257. Poettering said that the Sovereign Tech Fund (STF) had sponsored work as part of the GNOME STF grant to improve user-controllable update functionality. The result of that work is an experimental D-Bus API for systemd-sysupdate that allows unprivileged clients to update the system using the updatectl command-line tool. For example, "updatectl list host" would show the user which updates are available, while "updatectl update host --reboot" would perform the update and then reboot after the image is staged. The --offline option will tell it to only list image versions on the local system, not those available via the network.
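Put together, an unprivileged session might look something like this (assuming the experimental sysupdate D-Bus service is enabled on the system):

$ updatectl list host --offline
$ updatectl list host
$ updatectl update host --reboot

The first call shows only the versions already present locally, the second also checks what is available via the network, and the third stages the new version and then reboots into it.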
Administration and user improvements
One of the perpetual problems in system configuration is when two (or more) tools make changes to the same settings. For instance, if parameters for network devices are set in /etc/sysctl.d to be configured by systemd-sysctl and then later tweaked by systemd-networkd, that may result in conflicting settings with the most recent change winning.
To help avoid this, systemd-networkd now loads an eBPF program into the kernel to report any changes to sysctls on devices that it is managing. It does not prevent or revert those changes, but running "networkctl status" will display a log of the changes to help users and administrators troubleshoot conflicts.
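As a sketch, assuming systemd-networkd manages a link called eth0, a change made behind its back should later show up in the status output:

# sysctl -w net.ipv6.conf.eth0.disable_ipv6=1
# networkctl status eth0

The status output for the link should then include a log entry noting that the sysctl was changed by something other than systemd-networkd.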
The systemd-inhibit command is used to prevent the system from being shut down or going to sleep while a command is running. This can be useful, for example, to prevent a system going to sleep while running an operation to copy files to an external disk or remote server. It can be frustrating to start a long-running task, step away from the computer for the night, and return to find that it took a nap while it was supposed to be working. In v257, the utility gained a new feature to use polkit to perform interactive authentication for privileged operations like inhibiting shutdown, though that can be disabled with the --no-ask-password option.
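For example, a long copy job can be wrapped so that sleep and shutdown stay blocked for its duration; the paths here are placeholders, and with v257 running this as an unprivileged user may now prompt for authentication via polkit unless --no-ask-password is passed:

$ systemd-inhibit --what=sleep:shutdown --who="backup" \
      --why="copying files to the external disk" \
      rsync -a /home/ /mnt/backup/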
Systemd can restart services it manages when they terminate unexpectedly if the service file has the Restart= directive configured. Usually, if a service starts terminating unexpectedly, it means that there's debugging to be done. The RestartMode= setting now has a new value, debug, to make that easier.
If RestartMode=debug is set, systemd will restart the service and raise its log level to "debug" so that it's easier to troubleshoot. Systemd will also set an environment variable, DEBUG_INVOCATION=1, for the service so that it can be aware it was restarted in debug mode after a failure. Poettering encourages system-service developers to consider supporting the new debug variable.
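A minimal sketch of opting a service into the new behavior with a drop-in; mydaemon.service is a hypothetical unit name:

# mkdir -p /etc/systemd/system/mydaemon.service.d
# cat > /etc/systemd/system/mydaemon.service.d/50-debug.conf <<'EOF'
[Service]
Restart=on-failure
RestartMode=debug
EOF
# systemctl daemon-reload

A service that checks for DEBUG_INVOCATION=1 in its environment can then enable extra diagnostics only for these automatic post-failure restarts.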
/etc/os-release
Many Linux users are familiar with the /etc/os-release file that provides information about the distribution and version that their system is running. Fewer are aware that this is, in Poettering's words, a "systemd-ism". In v257, the project is adding three additional fields to os-release: RELEASE_TYPE, EXPERIMENT, and EXPERIMENT_URL.
The RELEASE_TYPE field describes whether a release is stable, in development (e.g., a beta release), a long-term-support (LTS) release, or an experimental release. The experimental type is for releases without an expected stable release, such as distribution builds produced to test specific components. The EXPERIMENT field can contain a description of what makes the build experimental, and the EXPERIMENT_URL field provides a link to more information about the experiment.
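As an illustration only (the values below are invented, and the exact set of accepted RELEASE_TYPE values is documented in os-release(5)), an experimental build might describe itself like this:

RELEASE_TYPE=experimental
EXPERIMENT="Throwaway build for testing a new installer stack"
EXPERIMENT_URL="https://example.com/experiments/installer"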
Readers may remember that RELEASE_TYPE popped up in a recent Debian disagreement. Systemd contributor Luca Boccassi wanted to use RELEASE_TYPE=pre-release for Debian testing releases, but did not succeed in persuading the stakeholders or the Debian Technical Committee to make the change.
Removed or deprecated
The next systemd release will completely remove support for version 1 control groups (cgroups). This may sound familiar to those who have been reading the systemd release notes—systemd v256 introduced a requirement to use a kernel command-line option to re-enable cgroup v1 support. In systemd v258 support for "legacy" and "hybrid" cgroup hierarchies will be entirely removed.
The systemd project has set its recommended baseline kernel to Linux 5.4, which is an LTS release from 2019. The 5.4 kernel has a projected end-of-life date of December 2025. This doesn't mean that systemd will not work on older kernels, but users should not expect systemd releases to be tested on anything older than that.
In bad news for users still depending on System V (SysV) service scripts, support for those scripts is deprecated and will be removed in v258. Poettering published a blog post in 2010 on how to convert SysV init scripts into service files.
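As a sketch of what such a conversion looks like for a hypothetical daemon (names and paths are placeholders):

# cat > /etc/systemd/system/mydaemon.service <<'EOF'
[Unit]
Description=Example daemon converted from a SysV init script
After=network.target

[Service]
ExecStart=/usr/sbin/mydaemon --foreground
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
# systemctl daemon-reload
# systemctl enable --now mydaemon.service

The main difference from an init script is that there is no start/stop/status boilerplate or PID-file handling to write; systemd supervises the process directly.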
systemctl summary
One thing to watch is whether systemd's emphasis on varlink in this release influences other projects to adopt it as an alternative or in addition to D-Bus. It may be that systemd's pain points are unique to its use case. Overall, this release is a solid update. It is full of minor enhancements across all of the parts of a Linux system that systemd touches—which is most of them, at this point.
The milestones tracker for v258 shows fixes and features that may be coming in the next release. At the moment, most of the issues tracked seem to be minor enhancements and bug fixes, but a few exciting features may sneak in before the next release.
Posted Dec 24, 2024 15:34 UTC (Tue)
by bluca (subscriber, #118303)
[Link] (1 responses)
That's actually done by a bot, that acts when the tag is pushed to Github: https://lists.freedesktop.org/archives/systemd-devel/2024...
Posted Dec 24, 2024 15:41 UTC (Tue)
by jzb (editor, #7867)
[Link]
Posted Dec 25, 2024 11:42 UTC (Wed)
by smurf (subscriber, #17840)
[Link] (4 responses)
… a change which is rather likely to be (temporarily … we can only hope) reverted by some distributions.
Posted Dec 25, 2024 19:51 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted Dec 27, 2024 8:04 UTC (Fri)
by pabs (subscriber, #43278)
[Link] (2 responses)
Posted Dec 27, 2024 13:56 UTC (Fri)
by anselm (subscriber, #2796)
[Link] (1 responses)
In a distribution like Debian GNU/Linux, if there are packages that even now only offer SysV init scripts, their maintainers could get together and set up a new package containing only the systemd SysV init script unit generator, and the packages in question could (pre-)depend on that package. Not that huge a deal, in the grand scheme of things. If that doesn't happen, that should also tell us something – possibly that finally coming up with native systemd support is the easier option after all.
Posted Dec 27, 2024 14:34 UTC (Fri)
by bluca (subscriber, #118303)
[Link]
Posted Dec 25, 2024 14:23 UTC (Wed)
by ibukanov (subscriber, #3942)
[Link] (73 responses)
I always wondered why Unix sockets were not used earlier to implement various system services. For example, why does su need to be setuid rather than talking to a service over a socket? Or why do we have /etc/resolv.conf pointing to 127.0.0.1 rather than having a Unix socket from the start? As all such services could be started by inetd, there is no overhead of a persistently running process.
Posted Dec 25, 2024 16:06 UTC (Wed)
by khim (subscriber, #9252)
[Link] (63 responses)
The short answer: because the PDP-7 where Unix was initially developed only had 16KB of RAM. Lots of decisions in Unix were really pretty horrible kludges, but without them it was impossible to create something usable for the pitifully underpowered hardware that was the norm back then. Some were later reverted by Linux, but many are still with us.

Because, initially, resolv.conf would point to another machine. Again, because running a DNS resolver was too much for tiny systems, so a dedicated device was used for that role.

The big question is always not “how to design something perfect” but “how to go from the mess that we have now to a slightly smaller mess, and whether it's even worth doing that”. Computer systems are almost never designed from scratch (and the ones that are designed from scratch are failures in 99% of cases). They evolve from existing systems into other ones. We have learned perfectly well how to grow them from small hairballs into larger hairballs by adding stuff to them, but changing or, even worse, cutting them down… that's not yet a well-established art. And it's not clear if we even need it: maybe the approach that the kernel uses (thinking long and hard before adding a new interface instead of delivering a series of replacements that would inevitably tire users and developers, who would then leave for something more stable) is better? I don't know; the area of long-term API development still looks like some kind of dark art rather than science so far. Very hard to predict and control.
Posted Dec 25, 2024 16:42 UTC (Wed)
by ibukanov (subscriber, #3942)
[Link] (62 responses)
I.e. it is one thing to stay with things that are proven to work instead of trying new-and-shiny-and-broken, but staying with things that are known to be harmful is another.
Posted Dec 25, 2024 17:08 UTC (Wed)
by khim (subscriber, #9252)
[Link]
“Things that are proven to work” and “things that are known to be harmful” are not mutually exclusive conditions. Worse: for the vast majority of people, “things that are known to be harmful” are “things that are proven to work”. That's why we only have rapid development of something when its adoption is very low. When new people arrive who don't know what these “things that are proven to work” are, rapid development is possible; when it's the same people year after year, they tend to cling to their pile of kludges till it really, really, REALLY starts to hurt them. Just recall how people clung to their sysvinit systems! Because you don't need a justification to keep something, but you need an enormous one to change anything. “It was done like this yesterday” is a very powerful justification by itself! QWERTY is still with us even in spite of the fact that the decisions that were used to justify it are long gone. Or, heck, backslashes as delimiters in Windows. The only way to change something quickly is when you have a monopoly and people just don't have a choice… but that approach has its own nasty side-effects, too. Otherwise you are stuck with decisions (any decisions) that “made sense at the time” because of ossification.
Posted Dec 25, 2024 20:04 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (49 responses)
A cascade of problems within the UNIX ideology. In UNIX, there is no standardized way to do services with service discovery, supervision, and ironclad guarantees that they are available at the early boot. Everything is designed so that a theoretical `init=/bin/bash` can result in a workable system that can download files from the Internet, and get user information from an LDAP directory.
So this forces you to implement things like DNS via in-process shared libraries. But then you need plugins, so you get the horrifying NSS/PAM infrastructure.
Now that Linux basically killed all other UNIXes, other than Darwin, we might finally start getting better infrastructure.
Posted Dec 25, 2024 20:18 UTC (Wed)
by khim (subscriber, #9252)
[Link] (48 responses)
Why? In my experience infrastructure is only ever improved when it's not just failing, but when you are running out of your stock of baling wire and scotch tape, and no one wants to ever touch it with a ten-foot pole while wearing a hazmat suit. That's how we end up with web apps that emulate a terminal and pass the data via an emulation of the IBM 3270 to connect to COBOL programs, or something equally crazy. People always grumble about infrastructure, but only when losses start to be measured in trillions does some movement start. Do you have any evidence that the pile of kludges that we inherited from the XX century is failing so badly that we cannot go from losses measured in trillions to losses measured in hundreds of billions? I'm pretty sure that's where “the powers that be” would stop and declare the process of overhaul finished.
Posted Dec 25, 2024 20:48 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (47 responses)
Because people are no longer held by the compatibility restrictions. Getting anything done across multiple UNIX versions was close to impossible (see: kqueue vs epoll), so the overall infrastructure had ossified at around the level of functionality in 1996.
Now that this is no longer the case, innovation is much easier. So we're getting reliable infrastructure (systemd and friends) because companies (including hyperscalers) need reliable basic infrastructure.
Posted Dec 25, 2024 21:22 UTC (Wed)
by khim (subscriber, #9252)
[Link] (46 responses)
Systemd was introduced more than a decade ago and it's based on work that Google did more than two decades ago. Around a decade ago Google made another attempt to improve infrastructure, but there was no interest in adopting that work. Instead we are continuing to pile kludges on top of kludges (this time with C++ and Rust) – including, from what I understand, in Google, too.

Sorry, couldn't see that. I would say that innovation in the usual fashion of XKCD 2347 is easier, while changes in the foundation are slower than ever. Heck, even an obvious, “totally no-brainer” move that I (and others) expected to happen around a decade ago took more than a decade! And let's not forget about the elephant in the room.

I'm not saying that foundations would never change… it would just take many decades… like it does with cars, clothes or any other such industry. In fact clothes are the best analogy: changes “in the foundation” of how we make clothes are happening, but they occupy maybe 1% of the effort related to “innovation in clothes”; 99% goes into a “fashion industry” that doesn't really change the “foundations” but finds ways to sell the exact same thing again and again… why should software be different?
Posted Dec 26, 2024 4:21 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (45 responses)
Sure. And that's the speed of progress in complicated ecosystems. If anything, it's actually blazing fast compared to many other areas (e.g. ICD-10 adoption).
> Instead we are continuing to pile kludges on top of kludges (this time with C++ and Rust) – including, from what I understand, in Google, too.
I'm not sure how an experimental patchset with unclear long-term support considerations is a relevant example. Migration to systemd fixed a glaring issue: lack of reliable process supervision system. User-controlled concurrency does not appear to be fixing stuff that is relevant for most systems.
If anything, io_uring is a better example. It fixes the abysmal state of async APIs in Linux, and it's being adopted at a lightning speed. eBPF is yet another example, there's a very real need to be able to instrument the running kernel, so it's also being adopted quickly (much to my personal distaste).
Like it or not, the relative monoculture of Linux makes changes possible. They still take years to propagate, but they are possible. On the other hand, diverse ecosystems just ossify in place, actively resisting any attempts to change them (see: IPv6 adoption).
Posted Dec 26, 2024 11:22 UTC (Thu)
by khim (subscriber, #9252)
[Link] (44 responses)
It's an extremely relevant example because at the time of its introduction it was underpinning literally all services in Google, from search to maps and everything in between. And it made it possible to not create a pile of per-language ad-hoc solutions. Same with Google Fibers: it made it possible to create, literally, millions of userspace threads – which essentially made the complicated schemes around async/await and io_uring efforts superfluous. But the industry has picked the other approach.

Only it's part of the exact same example: the same problem, different solution. One may say that “millions of threads” is worse than io_uring, but I'm not 100% convinced. But if you have “millions of threads” then do you even need an async API?

Yup. Because it's “something new and shiny” and not a “change to the foundations”. And, again, that's “something new and shiny” and not a “change to the foundations”. It makes it possible to pile kludges on top of kludges much faster, that's true. As for changes… they are happening, but in cases where every attempt to add yet another kludge to the pile of already collected ones threatens to unravel the whole thing.

IPv6 adoption is around 50% and I'm writing this from an IPv6-only system. Some Java-related things don't work (but that's expected, it's “enterprise”, it would adopt IPv6 somewhere near the XXV century), but LWN works fine, without a proxy. Thus I don't see the critical difference: the switch to Wayland took approximately the same time and is in approximately the same state: half-way done.
Posted Dec 26, 2024 12:19 UTC (Thu)
by smurf (subscriber, #17840)
[Link] (4 responses)
If your threads are not kernel-level threads, which applies to Fibers, then you need something like io_uring under the hood.
> But if you have “millions of threads” then do you even need “an async API”?
You do not want millions of kernel level threads. The context switching overhead is an order of magnitude or two higher than at user level. So, yes, the kernel needs to afford an async API for you to implement your Fibers library on top of.
Posted Dec 26, 2024 14:28 UTC (Thu)
by khim (subscriber, #9252)
[Link] (3 responses)
Care to explain? Google Fibers are kernel-level threads. That's the whole point. They are scheduled with the help of userspace, but each fiber is a separate, full, kernel-level thread. Watch the video.

Why not? Google did it. It works. From what I understand they are “following the industry” now to “not be an island”, but for more than a decade that's how all Google services worked. “Following the industry” to “not be an island” is precisely an admission that one couldn't fix the foundations and has to pile kludges on top of kludges like everyone else.

But you avoid lots of complexity related to all that async/await machinery. And lots of memory allocations/deallocations. It's not obvious that the overhead of switching to the kernel would be onerous enough to dwarf the overhead from all that complexity. Google served billions of users using that approach; it's not some kind of “low-scale experiment”.

Only if you are unwilling to change the foundations and just want to pile kludges on top of kludges.
Posted Dec 27, 2024 21:36 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (2 responses)
Owch. My bad. Searching for that term apparently gave me a rather nonsensical idea of the concept.
Anyway, what happened to it? that video is 11 years old.
Posted Dec 27, 2024 22:27 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
And what happened to Google Borg? Videos about it are approximately as old as videos about Google Fibers. Nothing much happened, from what I understand. Google attempted to open-source the kernel-side part, but without the library that was designed to use that part people weren't interested, because the io_uring craze had already started. And since it's closer to what others are using, people embraced the iOS/macOS/Windows-friendly approach instead of the Linux-centric one. So much for “Linux is the monopoly now, so it could innovate”, lol.

If I understand what my friends in Google say, currently the idea is to embrace yet another flavor of green threads and, maybe, eventually, even deprecate Google Fibers, but I strongly suspect that with literally everything built on top of them for more than a decade (as you noted) this wouldn't end well: Google would still stay in the “its own little island” position for the foreseeable future with their own Google Fibers, but would now be burdened with another thing that is supposed to deprecate their internal solution “eventually”.

Technically the whole concept can be summarized in a couple of paragraphs, really: And then you can do things that are now built on top of The only differences are: Guess which part was important for Google and which part was important outside of Google?
Posted Jan 12, 2025 18:56 UTC (Sun)
by mrugiero (guest, #153040)
[Link]
The regular policy in Linux has long been "no kernel-side only patches", so I don't get why you blame the io_uring craze. If there's no open source user space using the feature, maintainers can't test the functionality nor debug it properly, so they can't claim responsibility for it working.
Besides, I'd say fibers are much less a change to the foundations than io_uring is and much more about "keeping our kludge working", given the design prioritizes keeping the threading model unchanged over efficiency.
Posted Dec 26, 2024 22:29 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (38 responses)
Good for them. Now make it useful for other people and this solution might get adopted.
> Same with Google Fibers: it made it possible to create, literally, millions of userspace threads – which essentially made complicated schemes around async/await and io_uring efforts superfluous.
I don't believe that lightweight userspace threads in Google's proposal create 1-to-1 kernel threads. So something like epoll/io_uring is still needed to multiplex the scarce kernel threads.
> IPv6 adoption is around 50% and I'm writing that from IPv6 only system.
I have yet to see a hotel with IPv6 WiFi. And I check every time, out of curiosity.
Posted Dec 26, 2024 22:52 UTC (Thu)
by khim (subscriber, #9252)
[Link] (37 responses)
Do you have any reason to believe that Google publicly tried to deceive everyone outside of Google and also everyone inside Google? We can ask any Googler (NYKevin is one, I think) to go look at http://go/fibers#fibers-vs-threads and check whether the text Fibers are threads (bold-emphasized in the original document!) is still there. I'm pretty sure it is; it was definitely there 10 years ago… precisely to ensure that this wouldn't be true.

The whole point of Google Fibers was to avoid the need for a new kernel API: with userspace-scheduled threads each having their own kernel context, everything “just works” without any special efforts. You don't need special “async”-aware mutexes, you don't need complicated “async” languages and runtimes… things “just work”™ – but only on a platform where fibers are available. Which means it doesn't work on iOS or Windows – and that meant that non-Google developers are not interested.

They tried. People are not interested, really. They would rather pile up kludges on top of kludges than change anything in the foundation. I think these days Google has decided to “stop being an island” and follow the async/await craze, but from what I understand that's not because “millions of threads” haven't worked for them, but rather because people outside of Google weren't interested, and that meant that new hires were not prepared for an entirely alien world totally different from what they were taught in college.
Posted Dec 26, 2024 23:05 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (29 responses)
I just don't believe that you're describing the patchset correctly. I'm fully aware that fibers are way better than async/await, I'm just skeptical that the Google's patchset is a replacement for io_uring and async kernel-level APIs.
Otherwise, it would be easy to demonstrate something like an Apache server with synchronous API operating at performance levels similar to nginx.
Posted Dec 26, 2024 23:12 UTC (Thu)
by khim (subscriber, #9252)
[Link] (28 responses)
The patchset only included the kernel part. You still need to build the userspace library that would manage the pool of kernel threads, etc.

How could it be a replacement for io_uring and async kernel-level APIs if it predates them by about 10 years (maybe more)? It's io_uring, async kernel-level APIs and all that mess with C++ coroutines and JavaScript and Rust's counterparts that are replacements for a small and much simpler library.

Why do you think an Apache server should beat nginx? Does every Node.JS async/await server beat nginx? Yet they are using all these new and shiny interfaces. The code on top of your API matters, too: the API limits the maximum speed, it doesn't guarantee the absence of waste.
Posted Dec 30, 2024 18:16 UTC (Mon)
by bronson (subscriber, #4806)
[Link] (27 responses)
It's not Apache vs nginx, it's fibers vs the incumbent/async. Fibers need to show their merit somehow, otherwise nobody will adopt them.
From your description, it sounds like Fibers is kind of like a demoscene experiment? Just mind-blowing in its little context, but nobody did the work to demonstrate real world success and make it broadly relevant?
Posted Dec 30, 2024 19:12 UTC (Mon)
by khim (subscriber, #9252)
[Link] (26 responses)
Except the incumbent is not nginx here, but Apache. Nginx is a very specialized codebase; you couldn't plug Apache modules into it. Why doesn't the same logic apply to

Well… if you want to call something that drives google.com and most other Google services “a demoscene experiment” because it's not used outside of Google – then sure. It's more of “no one wanted it badly enough outside of Google for that to happen”. On Google servers it's the technology that drives almost everything. At least that was the case at the time the video was created (and NYKevin's answer pretty much tells us it's still the case, for if Google Fibers had gone from “basis for everything” to “failed experiment” by now, his answer would have been quite different). Outside of Google… we have the kernel part and that's it. From what I understand the idea wasn't “to open-source Google Fibers”, but more of “to reduce the difference between the upstream kernel and the Google kernel”.
Posted Dec 31, 2024 9:58 UTC (Tue)
by smurf (subscriber, #17840)
[Link] (25 responses)
which seems to be the problem.
Isn't Go supposed to have exactly the sort of light-weight threads that should be able to take advantage of this?
Posted Dec 31, 2024 10:42 UTC (Tue)
by khim (subscriber, #9252)
[Link] (24 responses)
It's the exact opposite: Go's “green threads” are designed as yet another async/await-style scheme, and are Google's solution to the fact that not everyone has Google Fibers. It's not atypical for Google to develop more than one solution and then see which one sticks. And since we know Google is adopting Rust and doing lots of C++… we know that Go hasn't won. In fact that phenomenon was even described by Go's authors: Although we expected C++ programmers to see Go as an alternative, instead most Go programmers come from languages like Python and Ruby. Very few come from C++.
Posted Dec 31, 2024 11:13 UTC (Tue)
by smurf (subscriber, #17840)
[Link] (23 responses)
Huh. For me async/await implies the two-color problem, which Go doesn't have.
In fact whether the threads underlying the Go runtime are "green" and built on some select call, or 1:1 on top of Fibers, shouldn't really matter from a high-level perspective.
> and are Google's solution to the fact that not everyone have Google Fibers.
Sure, but if basic kernel support for them is out there (is it?) then I'd assume that promoting the Fibers solution to the problem might be in their interest.
Posted Dec 31, 2024 11:33 UTC (Tue)
by khim (subscriber, #9252)
[Link] (22 responses)
Go's approach to the two-color problem is to eliminate sync entirely. Except it's still there if you are trying to use foreign code. Go's solution to that is to try to implement everything natively. I'm not sure how much Google embraced Go, but from what I've heard even Google Borg is still not rewritten in Go (even if both Docker and Kubernetes, used outside of Google, are written in Go).

Yes and no. The whole point of Google Fibers is to reuse synchronous code in an async context. Go doesn't have any synchronous code to reuse, thus the whole Google Fibers machinery is pretty much irrelevant for Go: if you have already committed yourself to a rewrite of the whole world to solve the two-color problem in that way, then why would it matter whether you can use old, synchronous code or not?

Kernel support is “out there” in the sense that Google shared the code and LWN reviewed it. Google never pushed for its inclusion in the mainline. From what I understand, when the mainline people looked at it and rewrote it… Google lost interest. Because if the whole thing were entirely rebuilt and redesigned… it would be the same story as with Google Borg and Docker/Kubernetes: the outside world gets its new toy, which would help it, but Google wouldn't get a reprieve from the “we are an isolated island” issue. Instead Google works on
Posted Dec 31, 2024 19:41 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (21 responses)
Posted Dec 31, 2024 20:16 UTC (Tue)
by khim (subscriber, #9252)
[Link] (20 responses)
Not “userspace threads”. The patches allow one to suspend/resume kernel threads. That's the whole point. And yes, that's the only thing that you need in the kernel; everything else can be done in userspace.

Um. Haven't you just written about that? The reason it's not efficient to have millions of threads in normal Linux is not the fact that the Linux scheduler becomes overwhelmed, but because to implement an async/await story with “green threads” that are communicating via channels one needs a mechanism to wake up “the other side of the channel” when you pass the work to it. This patch adds precisely such a mechanism. Everything else is done in userspace.
Posted Dec 31, 2024 23:33 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (13 responses)
I still don't see it. I looked over the patchset: https://lwn.net/ml/linux-kernel/20210520183614.1227046-1-... and it's just another N:M scheduling system, with only concession being that the kernel can kick the userspace scheduler when it blocks a thread.
> The reason it's not efficient to have millions of threads in normal Linux is not the fact that Linux scheduler becomes overwhelmed, but because to implement async/await story with “green threads” that are communicating via channels one needs a mechanism to wake up “the other side of the channel” when you pass the work to it.
No, it's not the case. A million threads is inefficient because the underlying kernel overhead is unavoidable. Golang has about ~500 bytes of overhead for each thread (because it can rescale the stack as needed), Rust can get down to a hundred bytes for each coroutine (because it essentially flattens the entire state machine). The kernel is at least 16kB.
This is fundamental. That's why kernel threads can never be cheap, and the userspace needs some way to multiplex onto them. UMCG is just another mechanism to help with that multiplexing, but it's neither essential, nor does it obviate the need for io_uring/epoll/...
Posted Jan 1, 2025 0:05 UTC (Wed)
by khim (subscriber, #9252)
[Link] (12 responses)
Where does that idiotic conviction come from? Google even made a video where it explains how and why it decided not to pursue an M:N model – and you still go on and on about how that could never work.

That doesn't mean one couldn't have millions of threads. Million times by 16kB is only 16GB. Not too much by modern standards. Many desktops have more, servers have a hundred times more.

Well, Google Fibers don't do that.

You say that as if 16kB per request is a big deal. And compared to 500 bytes it even sounds like that. But all these fancy tracing GCs take memory (most garbage collectors start behaving abysmally slowly if you don't give them at least 2x “breathing” room), and even the management of an M:N thread system takes memory. You need a very peculiar and special kind of code to squeeze all that complexity into 15.5kB per request.
Posted Jan 1, 2025 18:02 UTC (Wed)
by ibukanov (subscriber, #3942)
[Link] (11 responses)
> Million times by 16kB is only 16GB
Apple M-series uses 16K pages and it's only a matter of time before other CPUs follow. So kernel threads will take more memory.
And the cost of context switching between the kernel and the user space will continue to grow, so io_uring or something similar is necessary to achieve optimal performance.
> But all these fancy tracing GCs take memory
Neither Rust nor C++ use GC. Granted in C++ code like Chromium that implements asynchronous IO using callbacks there are more heap allocations, but the overhead is minimal. And Rust can do async without heap allocations at all.
So I suspect after reading this discussion that Google Fibers are just too specialized. It works nicely in its intended niche (servers behind google.com on present AMD/Intel hardware). But outside of that its advantages are not that strong to compensate for drawbacks.
On the other hand explicitly 2-colored code can target wide areas with better performance while hiding the complexity behind libraries like Rust Tokio io_uring support.
Posted Jan 1, 2025 18:49 UTC (Wed)
by khim (subscriber, #9252)
[Link] (10 responses)
How is that related? The kernel used 8KiB per thread for years. It was expanded to 16KiB ten years ago – but that happened because the kernel stack was overflowing, not because pages had become larger. And since it's already 16KiB… why should a switch to 16KiB pages make it larger? It would just use one page instead of four…

Again: why should context switching have become more expensive? It's been more-or-less constant for the last ~20 years. The biggest problem that

Yes, but

And the last time I heard, there was no pressure to abandon tracing-GC languages because they waste memory… why is it a problem for the “million threads” model, then? The fact that languages that actually care about every last bit of performance resisted async/await as much as they could

Maybe, but the reason Chromium uses that model is not to save some memory, for sure: Chromium is one of the most memory-wasteful programs that exist. Chromium has many advantages, but “frugal about memory” is not in the list of its fortes… Rather, the impetus for Chromium is the need to work well on Windows… so we are back to the “pile of kludges” justification.

Sure. And maybe, on bare metal, it would even go back to where we started and would implement the interface with kernel code properly. But everywhere else people are using Tokio and similar executors, which efficiently convert

Sure. But “specialized” here just means “needs a special API in the OS foundations to be usable”. And most apps (including Chromium) need to work on iOS, Windows, macOS… they couldn't adopt Google Fibers.

Except it is not “hiding anything”. One couldn't just magically “flip the switch” and make Tokio use

The decisive advantage

It's just ridiculous to claim that

P.S. And, again, I'm not saying that C++ or Rust designers did bad work with
Posted Jan 1, 2025 19:59 UTC (Wed)
by ibukanov (subscriber, #3942)
[Link] (8 responses)
The user space stack will grow from 4K to 16K.
> Again: why should context switching have become more expensive? It's more-or-less constant for last ~20 years.
Protection against Spectre and other hardware bugs is expensive.
> The fact that languages that actually care about every last bit of performance resisted async/await as much as they could
If async/await had existed 30 years ago in C, it would have made programming GUI frameworks in C so much easier and allowed making GUI apps that do not freeze when writing to slow disks on the single-core processors of that age, without using threads or mutexes.
Which is another advantage of the async/await model: it allows mixing events from different sources without much complexity. For example, consider the problem of adding a timeout to code that is doing a blocking IO operation in traditional C++ code. One is forced to use a second thread that just calls sleep(), then sets an atomic flag, and the IO thread then checks for the cancel after the IO operation. Alternatively, instead of a flag one can try to close a file descriptor to force the IO operation to return. But doing that properly is hard and requires non-trivial synchronization on the file descriptor access.
Yet an await implementation allows awaiting the results of several IO operations with simpler and more efficient code. Heck, even in plain C on Linux I would rather replace the IO operation with a non-blocking one and do a poll loop with the timeout than use threads, if I have a choice. Then I just close the file descriptor with no need to worry about any synchronization.
Even Golang is plagued by that problem of having to create extra threads when the code needs to react to a new event. Of course the threads are green and their creation is cheap and ergonomic, but that still brings a lot of complexity. Golang's solution, of course, was to add the color in the form of a Context argument and gradually change libraries to respect that Context's cancellation. They are far from there, as file/pipe IO still has no support for that. Plus cancel is rather limited and still does not allow mixing arbitrary events.
As I understand on Google servers there is simply no need to react to extra events. If necessary the OS just kills the whole process or starts a new one.
So async/await is just simply more universal and performant when implemented right like in Rust even if the code can be less straightforward than would be in cases when Fibers are used at Google.
Posted Jan 1, 2025 21:08 UTC (Wed)
by khim (subscriber, #9252)
[Link] (7 responses)
Usespace stack can always be swapped out. Kernel threads are limiting factor, not userspace. 30 years ago every single platform that was embracing GUI was already implementing async/await. Only Nope. To make GUI apps that don't freeze you don't need What you need are preemptively scheduled threads. And they were embraced as soon as thay became feasible. Except for one platform: Web. JavaScript and DOM were designed as fundamentally single-threaded thing and thus couldn't benefit from the preemptively scheduled threads. And on Windows these threads were so heavy they even tried to add green threads back on the OS level in WindowsNT 3.51SP3. And in these circumstances, on platforms where threads were too heavy to even contemplate “mullions of threads” approach Please don't try to play “this was done for the efficiency” tune: this doesn't sound even remotely plausible when languages that should care about efficiency most added No, that was a kludge for inefficient runtimes. And then it was brought to C++ and Rust “from above”… but no one ever compared it to alternatives. Except Google – but since Google embraced “millions of threads” approach decades before And doing that with You can do the exact same thing with Google Fibers and even normal threads. The only platform where such cancellation is “easy” (for some definition of “easy”) is Rust – and that's because when heavily pressured to embrace But that was just a side effect from how was implemented (Rust doesn't have a runtime and was unwilling to readd it back just for IOW: what you portray as “shining advantage” of Why does it even matter? Yes, something that “millions of threads” approach doesn't even need can be easier and faster with You have created serious problems for yourself and then heroically managed to overcome them… congrats… but isn't it yet another example of how we build kludges on top of kludges? And that's what every other language does, too. Even Rust. Only C# starts with non-cancellable green threads and Rust with infinitely cancellable Thread-based word also have it's own solution, BTW. And it doesn't work for the exact same reason. Arbitrary cancellation in arbitrary place is just not safe! Maybe, this remains to be seen. But that's not even the point that we are discussing here. If you would say that design of It would be interesting to see what Rust would manage to do with kludges that it embraced in And Rust had to adapt Nowhere in that journey an attempt to solve concurrency problems in a different fashion, without “green threads”, was considered – and even Rust (that did an interesting experiment by embracing Even more exciting experiment, but very much result of the set of kludges and design decisions Rust had when it tried to adopt
Posted Jan 2, 2025 10:13 UTC (Thu)
by smurf (subscriber, #17840)
[Link] (1 responses)
Another platform where cancellation Just Works is Python. I've been using the Trio async runtime for ages now (or what feels like them, given that the library is ~7 years old), and there's even a way to side-step the two-color problem if necessary ("greenback").
> Thread-based word also have it's own solution
Umm, might you be persuaded to not mix up "its" and "it's"? Thanks.
> Arbitrary cancellation in arbitrary place is just not safe!
which is one reason why the two-color problem sometimes is not a problem and async/await is actually helpful: at every place you can possibly get cancelled (or rescheduled) there's a glaringly obvious "async" or "await" keyword.
Posted Jan 2, 2025 12:38 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
> Umm, might you be persuaded to not mix up "its" and "it's"? Thanks.
Mind you, he's in very good company - the number of NATIVE speakers who mess it up ...
Herewith a Grammar Lesson - the apostrophe should *only* be used to indicate an elision (aka missing letters).
In "ye olde English" (and that's a thorn, not a y), possessives were created by adding "es". In time, people dropped the "e", and the standard ending became " 's ". Except there's a special rule for words that naturally end in "s", for example "James", which may have passed through "James's" before the official standard of " James' " took hold. Mind you, people are happy to use either version.
Now to the thorny question of "its". Here the possessive has always been "its", with no "e". Hence it does NOT have an apostrophe, because there was no letter there to elide. "It's" always is an elision of "it is".
Cheers,
Posted Jan 2, 2025 15:10 UTC (Thu)
by kleptog (subscriber, #1183)
[Link] (4 responses)
No, async/await is a language feature that allows programmers to write asynchronous code in a natural way. I don't see the relevance of PostMessage, it doesn't sleep so is not useful in implementing await. The magic of await is not the triggering of other code, but the fact that the response comes back at the exactly the same point in the calling code.
GUI apps use an event loop, which is not what async/await is about. It may work that way under the hood, the point is *the programmer doesn't need to know or care*. And indeed, Rust apparently does it without event loop (neat trick I may add).
The reason async/await became popular is because it makes the Reactor pattern usable. Creating several new threads for every request is tedious overhead. You may say that 16GB for millions of kernel stacks is cheap, but it's still 16GB you could be doing more useful things with. Most of those threads won't do any syscalls other than memory allocation, so they're wasted space.
(I guess it depends on the system, but in my experience, the I/O related threads form a minority of actual threads.)
I don't know if anyone else here used the Twisted framework before yield/send (the precursor to async/await) were usable. That was definitely the definition of unreadable/hard to reason about code, especially the exception handling. Async/await make sense to anyone with minimal training.
(Interestingly, Twisted got inlineCallbacks also around 2007, right about the time F# got async/await... Something in the air I suppose.)
>Arbitrary cancellation in arbitrary place is just not safe!
Just like arbitrary preemption in arbitrary places is unsafe. Which is why it's helpful if it only happens when you use await. Just like cancelling async tasks is easy, just have await throw a cancellation exception, like Python does.
(I just looked at C# and the cancellation tokens and like, WTF? Why don't they just arrange for the await to throw an exception and unwind the stack the normal way? If the answer is: it was hard to change the runtime to support this they should be shot. Cancellation tokens are an ergonomic nightmare.)
> > Yet await implementation allows to await for results of several IO operations with simpler and more efficient code.
> Why does it even matter? Yes, something that “millions of threads” approach doesn't even need can be easier and faster with async/await… so what?
Umm, you receive a request, which requires doing three other requests in parallel and aggregating the responses. Creating three threads just for that is annoying, but with the right framework you can make it look the same I suppose. But creating a bunch of kernel stacks just so they can spend their lifetime waiting on a socket and exiting seems like overkill.
results = await asyncio.wait([Request1(), Request2(), Request3()], timeout=30)
Is quite readable and does what it says on the tin. And even supports cancellation!
Despite this thread going on for a while, it still appears the only real argument for Google Fibres is "Google uses them internally a lot but no-one else is interested". If there were even one other large business saying they used them and had published sources we'd actually be able to compare results.
Posted Jan 2, 2025 16:19 UTC (Thu)
by khim (subscriber, #9252)
[Link] (3 responses)
It doesn't sleep today which means it was useless as 30 years ago is year 1995, that's before Windows 95 and on Win16 That's how multitasking worked in an era before preemptive multitasking… almost like And it was a nightmare: the GUI wasn't “responsive”, it was incredibly laggy because any app could hog the whole system… just like what may happen today with single-thread executors (mostly used in tests).

That's, indeed, some kind of magic which can be achieved by the use of some kinds of heavy drugs, because it doesn't match reality. It's just not how async/await works. Sure, there are lots of developers who naïvely believe that's how

What's the difference? Conceptually “an event loop” does the exact same thing as an executor in When the programmer doesn't know or care, the end result is an incredibly laggy and unpleasant mess. Abstractions are leaky and And if you are OK with laggy and unpleasant programs then you don't need neither Rust doesn't do it “without an event loop”. It just makes it possible to write your own event loop. That was possible to do in Windows 1.0 about 40 years ago, too.

So you don't want to lose 5% of memory because you need it… for what exactly? Ah, right. You need these 5% of memory and CPU cycles saved to waste 95% of the computer's power on a highly inefficient runtime of a highly inefficient language with a GIL. Makes perfect sense. Not.

Oh, absolutely. But you just have to understand and accept what problems are solved with these tools. And these problems are not problems of efficiency or kernel memory waste (after all, wasting 16KiB of kernel memory is still less painful than wasting 16MiB of userspace memory). No, That was the real justification; everything was developed in the expected “we would put another layer of kludges on top of the pile of kludges that we already have” fashion. And Google Fibers were developed for that, too… just they were developed for a different pile of kludges – but at least they are somewhat relevant to the discussion because they do show that changes in the foundation are possible, even if in limited scope.

Why? They were developing a solution for yet another pile of kludges. And invented something that worked best for that particular pile. But using a language that uses 95% of resources just for it's own bookkeeping that has nothing to do with the task on hand is not overkill? Get real, please. And the only counterargument is “we shouldn't waste 5% of resources on Google Fibers after we already wasted 95% of resources on our language runtime”. Which even sounds funny. And couldn't be real in any sane world.

The real argument (and a pretty sensible one) is “Google Fibers do solve the problem they claim to solve but we couldn't use them because of the pile of kludges that we have while Sure, but nothing prevents you from implementing such an API with Google Fibers. Except if you have a language with heavy threads (like C#) or a GIL (like Python). But if you have committed yourself to such a language then you have already wasted more resources than Google Fibers may ever waste! It's stupid and pointless to talk about

P.S. Again, if someone had gone to the foundations and fixed that, by replacing syscalls with something better, then complaints about the wasteful nature of Google Fibers would have sounded sensible. Google Fibers are not perfect. But the fact remains that all current
Posted Jan 3, 2025 9:54 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (2 responses)
> That's, ideed, some kind of magic which can be achieved by the use of some kinds of heavy drugs because it doesn't match the reality. It's just not how async/await works.
The point is that this is how async/await looks like to the programmer – just another method call, with an additional keyword.
The fact that there's a stack unwind/rewind or promises (plus closures) or some other deep magic underneath, along with an event loop or an io_uring or whatever, is entirely irrelevant to a high-level programmer, for much the same reason that most high-level languages no longer have a "goto" statement even though that's all the underlying hardware is capable of.
Posted Jan 3, 2025 13:28 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
And what's the difference from a call to

You couldn't have it both ways: either we do care about implementation (and then

Sure, but then we should stop pretending that Google Fibers were rejected because they weren't efficient enough. After wasting 50%+ of memory, complaining about 5% is just strange. And compared to the overhead of dynamic typing, the cost of syscalls is laughable, too. I'm not arguing about the “goodness” of the fact that
Posted Jan 3, 2025 20:59 UTC (Fri)
by kleptog (subscriber, #1183)
[Link]
Well, await returns a value, and PostMessage() returns void, so they're apples and oranges really. Sure, there's GetMessage(), but that returns *a* message, which is useless from the programmer's perspective. You want the results of the async function you just called, not just any response.
Sure, it's a point at which your code can be preempted to run other code, but in a multithreaded program that's everywhere, so not really a new problem.
Posted Jan 12, 2025 19:10 UTC (Sun)
by mrugiero (guest, #153040)
[Link]
This is simply historically inaccurate, given Rust had green threads at the beginning and only later on decided to ditch them and go for async.
Posted Jan 1, 2025 11:05 UTC (Wed)
by kleptog (subscriber, #1183)
[Link] (5 responses)
There are basically two forms of multiprocessing that are easy to reason about:
* Share-nothing threading, essentially multiprocessing a la Unix processes. This is used in the Actor model, like Erlang, but (I think) Go and Rust essentially do this too, as the compiler ensures no shared data. Here locking is unnecessary because data races are impossible.
* Multi threading with programmer defined preemption points, aka async/await. Because you know exactly where you can be rescheduled, you can minimise locking requirements and easily reason about the parts that do need it.
The share-everything that is C(++) threading, with rescheduling possible at any point, is nearly impossible to reason about by mere humans. Which is why it's not popular (although still common).
Btw, async/await is not done using green threads, that would be stupid. The compiler turns your program transparently into a state-machine and your program consists of: pull job from queue, run job, repeat. You could call the tiny fragments of your program that get scheduled "threads" but nobody does that.
The two-colour problem is overstated. Fundamentally you have functions that can sleep and functions that can't. Obviously you can call one way, but not the other. Even the Linux kernel has this, except the compiler doesn't check for you that you did it correctly. This is essential complexity which cannot be removed, only made more ergonomic by language design.
I think this patch set is being sold wrong. You don't want the kernel to manage millions of threads. But if you look at the Erlang VM that schedules millions of user-space processes, it uses one thread per CPU, it can ensure that syscalls are run in separate IO threads, or use io_uring to parallelise the IO calls. But some sleeps are unexpected, like page faults and being able to wake up another thread in that case to continue working is useful. So you'd have maybe 2 threads per CPU with only one running at a time.
Posted Jan 1, 2025 12:09 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (3 responses)
I don't think Go does shared-nothing and, while there are actor frameworks for Rust, they are by no means typical. The borrow checker is Rust's no-race solution, not shared-nothing.
Posted Jan 1, 2025 12:31 UTC (Wed)
by khim (subscriber, #9252)
[Link] (2 responses)
> The borrow checker is Rust's no-race solution, not shared-nothing.

Borrow-checker works quite poorly with async because you, very often, couldn't pass borrows through an await point (and rightfully so), thus you end up with a bunch of Arc's and correctness is no longer checked by the compiler.

I'm not blaming Rust developers for that design: it's hard to imagine how they could have made something better. But if not for the need to pile that work on top of many layers of kludges then they could have either removed sync entirely (but then you need to do syscalls differently) or, alternatively, could have adopted the "millions of threads" Google Fibers model (and the borrow checker would have been much more usable).

That wasn't done and for an obvious reason: adding yet another layer of kludges on top of kludges is just simply easier than changing the foundations.
Posted Jan 1, 2025 15:19 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Note that I claimed nothing of the sort related to the borrow checker and async. I was pointing out that Rust does not use anything resembling Erlang's actor model.
> Borrow-checker works quite poorly with async because you, very often, couldn't pass borrows through an await point (and rightfully so), thus you end up with a bunch of Arc's and correctness is no longer checked by the compiler.
I believe there are efforts in the works so that variables which do *not* live across an `.await` point do not affect things like `impl Send` for `async fn` synthesized types.
> That wasn't done and for obvious reason: adding yet another layer of kludges on top of kludges is just simply easier than changing the foundations.
Note that Rust *is* tackling "changing the foundations" problems, but "require all new kernel functionality" isn't one of them (Rust also aims for practical use with existing platforms after all).
Posted Jan 1, 2025 17:40 UTC (Wed)
by khim (subscriber, #9252)
[Link]
And that's exactly what we are talking about here. Remember where the whole discussion started: for me the problem is not how the things were done initially, but rather why they stayed that way even after the original reasoning was no longer relevant.

> Rust also aims for practical use with existing platforms after all

Only in a sense that it's one of the [very few] modern languages that are not sitting on top of a massive runtime written in C or C++. And yes, Rust is very good at what it does, only the scope of what it does is still limited by the fact that it has to deal with all the accumulated piles of kludges everywhere. In particular, an attempt to embrace "modern green threads" (that are called async/await today) makes Rust's safety story weaker…

Technically Google Fibers are a much smaller and simpler thing than Rust, but because they change the foundations people reject them. Often that rejection is even happening on a spiritual level that's fig-leafed by the reasoning that a kernel thread has to use 16KiB, which means we shouldn't even try to improve the 1-to-1 model but have to go with green threads (that are called async/await today). I wonder why, suddenly, these 16KiB have become so critical today… they weren't critical 20 years ago, so what has changed now?
Posted Jan 1, 2025 12:24 UTC (Wed)
by khim (subscriber, #9252)
[Link]
> I think you're missing why async/await are so popular

No, I know why they are popular.

No, they are pretty hard to reason about. Much harder than when you work with threads and channels. And they are pretty hard to bugfix. But on the web they are the only game in town (because JavaScript is a single-threaded language) and on Windows you have to use them because fibers can efficiently support the async/await model, but couldn't handle millions of threads. They are popular because you can add them as another layer of kludges on top of what you already have – and that's precisely my original point that started the whole discussion.

You only "know the locking requirements" if you are very careful at avoiding sync functions. Otherwise they "clog the pipes" and you spend a crazy amount of time looking for the root cause. Especially if these sync functions are not "always slow", but "sometimes slow".

It's the exact same model that C# or Java uses. It's hard to say that something that's used by 90% of apps (if not 99%) "is not popular".

Sure. There is no need to implement async/await with green threads because async/await are already green threads. A Box<Future<Foo>> (which includes frames for all functions that your function may need to call directly or indirectly) is indistinguishable from a Novell Netware style green thread, yet say that and you wouldn't be understood… even if technically there is very little difference (only in the case of Novell Netware it was a human who was calculating the size needed for that construct and recursion was allowed, while in Rust the compiler does the same job and recursion is not allowed… both have pluses and minuses, but essentially they are one and the same… oh, right, await was called something like PostMessage 30 years ago – and you had to use it in preemption points even if you had nothing to "post").

Of course. But that's marketing. If you admit that…

It can be removed, it's not even conceptually hard, but it has to be done at the root. Just make sure there are no sync functions. Like in one well-known OS. But of course that wouldn't be done. We like our pile of kludges way too much to do something about it.

I don't think it's "sold" anymore. Google wanted to reduce the difference between their kernel and the mainstream one. After looking at the reaction… I think at this point they just decided to carry that patch instead of upstreaming it.

That would be an interesting project, but that would be an entirely different project. Google Fibers' design was the solution for the two-colors problem. And with kernel threads used as the basis you no longer have it. You don't need special mutexes (like the Tokio mutex), your Fiber-aware code can include communications with regular threads, etc. That was the goal. And it's achieved; that's how Google Fibers were used for more than a decade (and would probably be used for the foreseeable future). Every single reviewer tried to bend that patchset into an Erlang-style model or an M:N model… which made it entirely uninteresting for Google: sure, someone else may benefit from all that… but Google would still need to carry a patch for Google Fibers… what's the point of pushing it, then?
Posted Dec 27, 2024 21:51 UTC (Fri)
by kleptog (subscriber, #1183)
[Link] (5 responses)
Honestly, this feels like a terminology problem. Both threads and fibres have well-known meanings, so when someone writes something like "Fibres are threads" without context, it's just nonsense. The talk doesn't use the word "fibres" at all, thankfully.
From my understanding of that talk, it looks like Google was trying to promote an Actor-style threading model like in Erlang, but then transparently for C(++), Python, etc. And indeed, in Erlang it's normal to have thousands of threads all communicating with each other, threads per request that create more threads. But at the end of the day the number of running threads is always equal to the number of CPUs in the system. The main issue is what to do about blocking syscalls.
AIUI they essentially have the concept of threads the normal scheduler will not schedule, but instead userland has an extra system call "sleep and wake thread X on this CPU" making task switches very quick. It's an interesting idea.
The flaw is that the Actor model is dependent on a shared-nothing threading style, and that's just not how most developers have been taught to code. It takes a bit of getting used to, but it scales amazingly. However, since I don't think C(++) developers are going to write shared-nothing threaded programs anytime soon, I really don't see how Google's ideas are going to gain any traction.
Posted Dec 27, 2024 22:34 UTC (Fri)
by ibukanov (subscriber, #3942)
[Link]
Posted Dec 27, 2024 22:43 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
> The flaw is that the Actor model is dependent on a shared-nothing threading style, and that's just not how most developers have been taught to code.

Not even remotely close. The idea of Google Fibers is to have all your typical async/await machinery (channels and select are provided, among other things) while simultaneously using code that uses the normal synchronous API, the normal synchronous Mutex, etc. Completely sidestepping the two-color function problem, but working only on kernels with the appropriate syscalls.

They already are gaining traction… except they use an entirely different foundation. The one that requires a rewrite of literally everything… but doesn't require any changes deep in the foundations of your OS. That's why I brought them into a discussion of why things have stayed that way even after the original reasoning was no longer relevant: simply because development of software works very similarly to evolution. This approach even has a [pretty cryptic] name: the open-closed principle – software entities should be open for extension, but closed for modification. 99% of the time we only add kludges on top of the kludges, with only [very rare] situations where something ever changes deep in the foundations of the existing design.
Posted Dec 28, 2024 13:53 UTC (Sat)
by kleptog (subscriber, #1183)
[Link]
That's disappointing. The Actor model doesn't require async/await (Erlang doesn't have it) so I was hoping you could avoid that altogether and do everything synchronously.
(Internally Erlang of course uses something like select(2), but the language doesn't need it.)
Posted Dec 28, 2024 16:55 UTC (Sat)
by khim (subscriber, #9252)
[Link] (1 responses)
> The main issue is what to do about blocking syscalls.

Just noticed that this part has been skipped. No, there are no such issues. Precisely because fibers are threads, too! It would schedule them! Precisely when it makes sense to schedule them: when a blocking syscall gave the kernel a chance to engage with the hardware – and freed the CPU for something else! Not "instead", but "in addition".

When you have lots of fibers (aka "working tasks", etc) working in your system (maybe even millions of them) there are two reasons for them to go to sleep:

1. They are waiting for data that some other fiber ("working task", etc) is supposed to produce.
2. They are waiting for the hardware (or the outside world) to provide something.

In case of #1 userspace scheduling is the useful and obvious choice: userspace knows which fiber ("working task", etc) is supposed to produce data, while the kernel has no clue. In case of #2 userspace has no idea when the hardware (or outside world) would decide to provide something… while the kernel may have some idea (and if not it may try to guess like it does on a normal system without any fibers). Of course balancing #1 and #2 would need some fine-tuning… but that's essential complexity, you cannot eliminate it, it's just the nature of the whole system with "millions of threads" that are, nonetheless, interacting with thousands (or maybe hundreds of thousands) of outside actors (aka humans visiting google.com domains)… you couldn't just ignore that complexity, it's there from the beginning, it's part of the problem space!
Posted Dec 31, 2024 21:55 UTC (Tue)
by foom (subscriber, #14868)
[Link]
While the kernel scheduler does handle scheduling of UMCG-managed worker threads ("fibers") to CPUs, it only schedules those in the "RUNNING" state. It will _not_ resume a UMCG-managed thread which the userspace scheduler has marked as "IDLE". All those threads are effectively invisible to the kernel scheduler until explicitly resumed. The intent is that if you have 100 CPUs available for running a particular workload, the userspace scheduler will only mark approximately 100 worker threads as "RUNNING" at a time. So, while the kernel scheduler is still there, it has basically no work to do (unless there's other processes on the machine competing for the CPU time, of course).
If any of the 100 threads block inside the kernel (be it from IO, needing to page in memory from disk, etc), the kernel will automatically wake the associated userspace "server" thread, which is responsible for nominating a new worker to transition into the "RUNNING" state, and then switch to that. The userspace scheduler can also decide to preempt a running thread (e.g. after some amount of time), force it to go into "IDLE" state, and switch another thread to "RUNNING".
This allows application-specific (and thus, hopefully, better) choices to be made as to which thread to run: yes, in the case where the application knows exactly which thread must run next to make progress, but also when a thread blocks on external I/O and you just need to choose "any" other thread to run, and even when there are more potentially-runnable "fiber" threads available than CPUs and you need to switch between them periodically.
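For what it's worth, the bookkeeping described above can be sketched without any kernel support at all. The following is not the real UMCG API (all names are invented, and the kernel's block/wake notifications are simulated with a channel); it only illustrates the userspace scheduler's side of the protocol: a RUNNING set capped at roughly the CPU count and an IDLE queue that workers are parked on.

    use std::collections::VecDeque;
    use std::sync::mpsc;

    // What the kernel would tell the userspace "server" thread; simulated here.
    #[derive(Debug)]
    enum Event {
        Blocked(u32),   // a RUNNING worker went to sleep inside the kernel
        Unblocked(u32), // a previously blocked worker could run again
    }

    fn main() {
        const CPUS: usize = 2;

        // Userspace scheduler state: roughly CPUS workers RUNNING, the rest IDLE.
        let mut running: Vec<u32> = (0..CPUS as u32).collect();
        let mut idle: VecDeque<u32> = (CPUS as u32..6).collect();

        // Simulated kernel notifications (the real thing arrives via UMCG).
        let (tx, rx) = mpsc::channel();
        tx.send(Event::Blocked(0)).unwrap();
        tx.send(Event::Unblocked(0)).unwrap();
        tx.send(Event::Blocked(1)).unwrap();
        drop(tx);

        for ev in rx {
            match &ev {
                Event::Blocked(w) => {
                    // Keep the CPU busy: replace the blocked worker with an IDLE one.
                    running.retain(|r| r != w);
                    if let Some(next) = idle.pop_front() {
                        running.push(next); // real code would resume `next` here
                    }
                }
                Event::Unblocked(w) => {
                    // Runnable again, but only ~CPUS workers stay RUNNING,
                    // so park it on the IDLE queue until a slot frees up.
                    idle.push_back(*w);
                }
            }
            println!("{ev:?} -> running={running:?} idle={idle:?}");
        }
    }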
Posted Dec 30, 2024 16:05 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link]
Just to clarify: We're generally not supposed to do that. When I discuss Google internals, I try very hard to only talk about things that are public. For example, in a recent thread about precise timekeeping, I talked a lot about Spanner, but that's because there is a public whitepaper that goes into great detail about how it works internally, so I was just repeating things that were already public. I cannot just look up some internal document and tell you what it says right now - if someone who gets paid a lot more than I do has decided that a document should be internal, I don't get to second guess that decision (regardless of whether I may agree with it personally).
Posted Dec 26, 2024 8:45 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (10 responses)
For the same reason that people are (unnecessarily) vulnerable to choking - which kills a lot of people, mostly the young and elderly. When we lived in water, the relative position of the trachea and oesophagus didn't matter. When we moved onto land, it turned out fish had made the "wrong" choice, but the cost of changing it was too high.
Cheers,
Wol
Posted Dec 26, 2024 23:07 UTC (Thu)
by ibukanov (subscriber, #3942)
[Link] (8 responses)
And again, for an individual land animal the cost of switching is prohibitive, as it means death. Yet in IT one can implement the solution, try it, and show that it works, yet it is extremely hard to make others use it even if it is beneficial.
Posted Dec 26, 2024 23:54 UTC (Thu)
by khim (subscriber, #9252)
[Link] (7 responses)
> The difference between the fish and, say, setuid is that evolution is blind and proceeds by killing things that do not work while randomly mutating things.

Software development works in a very similar way. The fact that in software development mutations are not random is compensated by the fact that we may only try maybe a dozen or two dozen ways in a very short time, while evolution can try billions of combinations over millions of years.

Same with software. Almost all attempts to "radically redesign" something lead to a stillborn creation like Workplace OS or BB10. The reason is the exact same as with animals: intermediate stages where two solutions to the same problem are implemented (like in the lungfish) need a very special and "strange" environment to survive. And without such environments it's not possible to replace things piecemeal.

Essentially, after software passed the threshold of development times measured in thousands of man-years we entered the exact same situation as with evolution: any small, self-contained element can be rewritten… pretty radically… feathers are very different from fish scales. But a change that affects a bazillion other pieces? Fuhgeddaboudit. The only way to fix something is to build a replacement from scratch… and if we are talking about something "big" that usually fails.
Posted Dec 27, 2024 10:05 UTC (Fri)
by ibukanov (subscriber, #3942)
[Link] (6 responses)
Those were not stillborn. They worked for a small group of people and were beneficial for them. Yet others were not interested. Or consider Google Fibers described in another thread. They were beneficial for Google and used on a massive scale there.
An evolutionary analogy would be if beneficial mutations spanning a huge number of genes could regularly happen, but then disappear without a trace. This just does not happen.
In general, according to our understanding, the observed evolution in Nature is a random process *without memory*, in the sense that the rate of mutations and their direction does not depend on the previous mutation history. Software is mostly designed, and clearly its process depends very much on the success or failure of the previous steps.
Posted Dec 27, 2024 11:15 UTC (Fri)
by khim (subscriber, #9252)
[Link]
> They worked for a small group of people and were beneficial for them.

Who benefited from them? Workplace OS was dead before anything was released on its base. To anyone! The development team was disbanded even before the only product based on it was released.

Just like Trilobites once dominated the world. They were apex predators, back then. And then disappeared without leaving any descendants behind.

So 99% like biological evolution, then? Why do you think the remaining 1% is so important?

They don't depend on the "mutation history" but they definitely depend on the result of that process. Same with software development. We may debate about how ABCDE…Z mutated into QWERTY, but that's only of interest to some hobbyists and historians. The others are confronted with the fact that most keyboards put keys in the QWERTY order and not in ABCDE…Z (or, in some cases, the opposite: the TI Nspire, a TI-92 descendant, specifically doesn't use QWERTY to avoid being classified as a "computer").

Evolution of biological species and software development are closer than you think. Even the situation where someone pours billions into the development of something that would, most likely, never go anywhere (like Google currently does with Fuchsia) has a close analogue in the biological world.
Posted Dec 27, 2024 11:49 UTC (Fri)
by magfr (subscriber, #16052)
[Link]
Trilobites were wildly successful - killing them off required multiple mass extinction events.
As for specialist environments with runaway adaptations, that is usually known as island endemic species like pygmy mammoths, Komodo dragons and lots of other examples.
Posted Dec 27, 2024 21:34 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (3 responses)
You're using the past tense. Are they still used?
Posted Dec 27, 2024 21:58 UTC (Fri)
by khim (subscriber, #9252)
[Link] (2 responses)
The last time I heard about them from reliable sources they were still massively used for, more or less, everything, but people were speaking about them in the past tense because Google was talking about "stopping being a little island" and embracing what everyone else is embracing. And I'm pretty sure the end result would be the reverse of the outcome in other places: they would still continue to power the majority of Google services for decades to come, while people would start pretending that Google is now "aligned with the rest of the world" and only needs to change a "few vestigial places", while in reality it would still be 90% Google Fibers and 10% async/await for the foreseeable future.
Posted Dec 30, 2024 13:37 UTC (Mon)
by surajm (subscriber, #135863)
[Link] (1 responses)
One of the reasons to pursue async/await in C++ is that it will make future interop with a C++ successor easier as well. For example, it's likely going to be possible to get Rust and C++ code to share async machinery rather than needing two language-specific runtimes running in parallel.
Lastly it's worth mentioning that the fiber model is not perfect. Microsoft abandoned something similar in Windows and anyone who has used go understands some of the sharp edges it can have.
Posted Dec 30, 2024 14:11 UTC (Mon)
by khim (subscriber, #9252)
[Link]
> Microsoft abandoned something similar in Windows

How can it abandon something that it never had? Microsoft never had anything resembling Google Fibers. Even from the description it's completely obvious: fibers are not preemptively scheduled… if a fiber accesses thread local storage (TLS), it is accessing the thread local storage of the thread that is running it… if a fiber calls the ExitThread function, the thread that is running it exits. Microsoft Fibers do have the issues that spooked my opponents: they do need non-blocking syscalls, they couldn't use synchronous code, etc. They are closer to async/await machinery than to Google Fibers.

Sure. Because Go is yet another machinery that's similar to async/await and different from Google Fibers. Goroutines also need non-blocking syscalls and couldn't use regular thread-pinned Mutexes and other such constructs. Which is the main selling point of Google Fibers: fibers are threads, patterns that work for threads such as Mutex-based synchronization "just work" with Fibers as well, and new constructs introduced with Fibers "just work" when used by a non-Fiber thread. Have they removed that from go/fibers? From what I understand these were the main selling points when Google Fibers were developed – and they were never present in Microsoft's Windows Fibers and/or goroutines.

Yeah. The "let's not be an island" mantra that Google started promoting a few years ago. We'll see how well it would work. So far I'm skeptical, but then I was skeptical about the crazy C++ idea of generating tons of useless code that the compiler would then be eliminating – and that one worked pretty decently (in fact, without that work by LLVM developers Rust wouldn't be viable).

P.S. My biggest grief with the async/await model is that instead of "going back to the root cause" and dropping POSIX for something better it tries to add yet more lipstick on the pig. Rust gives hope, though: its growing popularity in the embedded space, where POSIX limitations don't exist, means that someone may finally build something usable and working out of the async/await model. That would actually be better than Google Fibers. And then it could be adopted by Linux… but the chances of that happening are very slim. That's a lot of work and most developers don't even know that such an alternative may exist; they are too accustomed to what POSIX does.
Posted Jan 20, 2025 10:08 UTC (Mon)
by sammythesnake (guest, #17693)
[Link]
[0] That was supposed to say "amusingly" but I quite liked what autocucumber did on this occasion!
[1] See, for example, https://www.npr.org/2010/08/11/129083762/from-grunting-to...
[2] Birds are an interesting case - they have very different throat anatomy[3] capable of some startlingly complex sounds, including in some cases remarkably faithful reproductions of not only speech, but random other noises. While they don't often choke on their food, most (all?) can't swallow against gravity, so another compromise was involved with their complex sounds production...
[3] See https://en.m.wikipedia.org/wiki/Syrinx_(bird_anatomy)
Posted Dec 25, 2024 16:34 UTC (Wed)
by Sesse (subscriber, #53779)
[Link]
I guess part of it is because they came late; more than 10 years into Unix (4.2 BSD), and not POSIX-standardized before 2000. I don't know if that first version even included SO_PASSCRED (which you'd need for authentication).
Posted Dec 25, 2024 17:20 UTC (Wed)
by bluca (subscriber, #118303)
[Link] (5 responses)
That's because D-Bus _is_ better as an IPC to control system services, if not for the one fatal flaw of not being available in early boot, which is essentially unsolvable without having the IPC primitives in the kernel, which is what kdbus first and bus1 later were trying to do. Unfortunately "the kernel is the wrong place to implement IPC primitives" (or so they said, before merging Binder, to which this reasoning doesn't apply, for some reason), so here we are.
The reason Varlink usage can be expanded now is thanks to a particular recent kernel feature, PID FDs. With that, it is now possible to reliably identify processes for the purpose of interactive authentication. It just wasn't possible to do so earlier, given PIDs and all other per-process metadata can be trivially spoofed by an unprivileged attacker.
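As a sketch of why PID FDs help here (assuming Linux and the libc crate; this is not systemd's code): a pidfd is a file descriptor that keeps referring to one specific process, so an authorization decision made against it cannot be redirected by PID reuse the way a raw numeric PID can. Here we just open one for our own process to show the primitive; a service would instead obtain one for its client (for example via the SO_PEERPIDFD socket option on newer kernels) before consulting polkit.

    use std::io;

    fn pidfd_open(pid: libc::pid_t) -> io::Result<libc::c_int> {
        // pidfd_open(2) has no wrapper in every libc, so call it directly.
        let fd = unsafe { libc::syscall(libc::SYS_pidfd_open, pid, 0) };
        if fd < 0 {
            Err(io::Error::last_os_error())
        } else {
            Ok(fd as libc::c_int)
        }
    }

    fn main() -> io::Result<()> {
        let pid = std::process::id() as libc::pid_t; // our own PID, just for the demo
        let fd = pidfd_open(pid)?;
        println!("pidfd {fd} refers to this exact process, even if PID {pid} is later reused");
        unsafe { libc::close(fd) };
        Ok(())
    }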
Posted Dec 25, 2024 18:39 UTC (Wed)
by lunaryorn (subscriber, #111088)
[Link] (1 responses)
I can call a varlink service with the Python or Demo standard libraries, or even with nc, jq, and bash, but scripting or gluing anything with DBus is still cumbersome even after all these years.
Perhaps things would be different now if we had gotten kdbus, but as things stand today I'm more than happy to see a well-specified DBus alternative emerge which seems to hit the sweet spot between simplicity for simple use cases while still allowing for incremental complexity for more complicated situations.
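That simplicity is easy to show even without a varlink library: a call is a single NUL-terminated JSON object written to a Unix socket, and the reply comes back the same way. A minimal Rust sketch, assuming a systemd varlink socket at /run/systemd/io.systemd.Hostname (substitute any varlink service socket present on your system); every varlink service is expected to implement org.varlink.service.GetInfo:

    use std::io::{Read, Write};
    use std::os::unix::net::UnixStream;

    fn main() -> std::io::Result<()> {
        // The socket path is just an example; pick one that exists on your system.
        let mut s = UnixStream::connect("/run/systemd/io.systemd.Hostname")?;

        // One JSON object per message, delimited by a single NUL byte.
        s.write_all(br#"{"method":"org.varlink.service.GetInfo"}"#)?;
        s.write_all(&[0])?;

        // Read the reply up to its terminating NUL.
        let mut reply = Vec::new();
        let mut byte = [0u8; 1];
        loop {
            s.read_exact(&mut byte)?;
            if byte[0] == 0 {
                break;
            }
            reply.push(byte[0]);
        }
        println!("{}", String::from_utf8_lossy(&reply));
        Ok(())
    }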
Posted Dec 25, 2024 18:40 UTC (Wed)
by lunaryorn (subscriber, #111088)
[Link]
Posted Jan 3, 2025 15:05 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link] (1 responses)
Well then, as quotemstr pointed out, the correct thing would be to build upon Binder now that it's merged. I don't pretend to understand the problem space at your level, but I cannot parse your logic. If D-Bus is the best solution as an IPC to control system services except for the lack of IPC primitives in the kernel, why shouldn't it be preferred to varlink now that there are IPC primitives in the kernel? Unless Binder is fundamentally flawed as a D-Bus underlay?
Posted Jan 3, 2025 15:20 UTC (Fri)
by bluca (subscriber, #118303)
[Link]
Posted Jan 12, 2025 19:52 UTC (Sun)
by mrugiero (guest, #153040)
[Link]
I seem to recall Binder was there before Bus1 and was simpler in some sense than kdbus? I also remember Torvalds saying if Greg thought it was a good idea then he trusted it or something along those lines? It's quite likely I'm misremembering, but you being one of the interested parties I think you might share some more data about whether I remember right and what happened after that caused Bus1 to stagnate.
Posted Dec 25, 2024 19:52 UTC (Wed)
by quotemstr (subscriber, #45331)
[Link] (1 responses)
This churn is unfortunate. Binder is *already* in the kernel and completely solves IPC. It gives Linux an object-capability system (https://wiki.c2.com/?ObjectCapabilityModel), which is "inherently modular and secure". Object-capability systems (like COM and Binder) are the pinnacle of IPC technology and Linux should adopt them instead of churning through "simple" but ultimately incapable JSON protocols every decade or so. DBus's lack of real object handle support makes Linux IPC much more awkward than it should be.
Posted Jan 12, 2025 19:57 UTC (Sun)
by mrugiero (guest, #153040)
[Link]
Posted Dec 26, 2024 9:24 UTC (Thu)
by gray_-_wolf (subscriber, #131074)
[Link] (2 responses)
Posted Dec 26, 2024 11:15 UTC (Thu)
by bluca (subscriber, #118303)
[Link] (1 responses)
> "stable" is for normal releases of the system, suitable for production use. Generally, stable releases become end-of-life soon after the next major stable release is out, although this might not be the case if, for example, a distribution adopts a rolling release model and still be production ready. Examples include Fedora 40, Ubuntu 23.10, OpenSUSE Tumbleweed, and Arch Linux.
Posted Dec 27, 2024 23:21 UTC (Fri)
by gray_-_wolf (subscriber, #131074)
[Link]