ABI stability funding

Posted Nov 24, 2025 21:29 UTC (Mon) by DemiMarie (subscriber, #164188)
In reply to: Shared libraries by willy
Parent article: APT Rust requirement raises questions

The problem is real. The funding to solve it is missing.

Server software is often shipped as containers nowadays, and containers don’t benefit much from dynamic linking. In fact, static linking is often considered a benefit in the server world due to ease of deployment.

Embedded systems do benefit from dynamic linking, and Android uses dynamic linking for its Rust crates. However, updates for embedded devices are usually complete images, so ABI stability is of very little value. The only advantage would be allowing binary dependencies to use Rust APIs.

The systems that benefit greatly from ABI stability are “traditional” distros with mutable root filesystems. However, none of them have been willing to fund the needed improvements. Furthermore, many of these distros are run by volunteers.

Like fishface60, I hope that Canonical, SUSE, Red Hat, or Valve steps up and funds a solution.


ABI stability funding

Posted Nov 24, 2025 23:17 UTC (Mon) by bluca (subscriber, #118303) [Link] (12 responses)

> Server software is often shipped as containers nowadays, and containers don’t benefit much from dynamic linking.

Except of course that's not really true, as proven by companies like Red Hat spending tons of dev time to implement very, very complex solutions to post-facto deduplicate said containers, because that whole docker mess doesn't really scale beyond a handful of instances. Storage, memory and loading time costs are through the roof because of the intense duplication.

ABI stability funding

Posted Nov 25, 2025 20:37 UTC (Tue) by jhoblitt (subscriber, #77733) [Link] (11 responses)

If this were true, kubernetes would have been dead on arrival. Even an extreme case of needing a terabyte of OCI layers is an incremental cost per server on the order of $100, which is probably less than 1% of the acquisition cost of a new 1U server.

At my $day_job, I have increased the k8s per node pod limit on most clusters up to 250 from the default of 110 as nodes were routinely hitting the pod limit but could easily handle more load.

ABI stability funding

Posted Nov 25, 2025 20:49 UTC (Tue) by bluca (subscriber, #118303) [Link] (10 responses)

So with a million servers one has to spend an extra $100 million for no particular reason other than because kubernetes is crap? Sounds about right

ABI stability funding

Posted Nov 25, 2025 21:22 UTC (Tue) by khim (subscriber, #9252) [Link] (9 responses)

If you have a million servers then you would spend way more than $100 million on stuff not even remotely related to what you would pay for these servers. I wouldn't be surprised to find out that just the building permits would cost more.

Heck, with a million servers a single outage caused by problems with shared library compatibility may cost you more than $100 million!

An attempt to inflate the price of something by shouting “but what if there are a thousand, ten thousand, a million servers” would never work, because not only do expenses grow linearly, the cost of potential problems grows linearly too. The time when dynamic linking was feasible is long past for this very reason: in a world where human labor is cheap and hardware is expensive, saving one byte made sense. In today's world… not so much.

You may win some, in rare cases, if you have some component that's shared between hundreds or thousands of different programs (maybe some core OS library), but everything above that is cheaper not to share.

ABI stability funding

Posted Nov 25, 2025 21:48 UTC (Tue) by bluca (subscriber, #118303) [Link] (8 responses)

...and yet, tons of money is being spent developing various solutions to this very problem. How curious!

ABI stability funding

Posted Nov 25, 2025 21:52 UTC (Tue) by khim (subscriber, #9252) [Link] (7 responses)

Nothing curious, really. It's just a matter of priorities: upgrading one shared library on a server farm with a million servers, in a way that brings down your whole datacenter, may cost you a lot more than $100 million, thus you install Kubernetes and don't do that. But, of course, $100 million is still $100 million: if you can somehow save it without exposing yourself to the instability caused by distro quicksand, then you will do that.

It's a matter of priorities.

ABI stability funding

Posted Nov 26, 2025 1:06 UTC (Wed) by bluca (subscriber, #118303) [Link] (6 responses)

Yeah because famously kubernetes runs on thin air, it most definitely doesn't run on "distros quicksand". Also it never needs to be updated, and never, ever breaks

ABI stability funding

Posted Nov 26, 2025 2:53 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

> Yeah because famously kubernetes runs on thin air

That's actually close to the truth. There was even a project to run K8s as PID1. Most installations don't go _that_ far, and just limit themselves to something minimalistic like Alpine.

ABI stability funding

Posted Nov 26, 2025 3:18 UTC (Wed) by jhoblitt (subscriber, #77733) [Link] (1 responses)

There are compelling reasons to let kubelet use systemd to manage slices, so Alpine is probably not a popular host OS. However, it is incredibly popular as an OCI base layer.

ABI stability funding

Posted Nov 26, 2025 5:29 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

I was thinking more about the host for the control plane (kube-apiserver and such). Worker nodes are more diverse (there are even Windows nodes).

My desktop Docker host uses a cut-down Debian version without systemd running (there are just 4 processes: /initd services, /usr/bin/containerd-shim-runc-v2, /usr/bin/containerd, /usr/bin/rosetta-mount).

ABI stability funding

Posted Nov 26, 2025 11:15 UTC (Wed) by bluca (subscriber, #118303) [Link] (2 responses)

> That's actually close to truth.

In your fantasy land perhaps - back down here in the real world, everyone and their dog deploys on Ubuntu as the host, with RHEL and its derivatives distant contenders

> something minimalistic like Alpine

...which is also famously not a "distribution" but consists entirely of aether, right

ABI stability funding

Posted Nov 27, 2025 17:12 UTC (Thu) by ssmith32 (subscriber, #72404) [Link] (1 responses)

Er, actually, if we're going to go down the "well, back in the real world" route:

1) In the real world, most folks are going to use whatever OS their cloud provider uses for their k8s solution. So for EKS, Amazon Linux or Bottlerocket or something. And if it's GCP, they're probably rebuilding the whole world for funsies even if they use Ubuntu, because Monorepos Are (not) Awesome.

2) In the real world, people don't choose one or the other. Both have trade-offs, and most end up using both, depending on the situation. A base image of Ubuntu, with the occasional application installed via flatpak, etc., is often the best solution for home use. k8s is the choice when deploying a large number of services at scale for commercial use.

ABI stability funding

Posted Nov 27, 2025 18:50 UTC (Thu) by bluca (subscriber, #118303) [Link]

For managed solutions sure, but the topic at hand here was custom hosts that one chooses and sets up to run their containers or VMs or whatevers.

ABI stability funding

Posted Nov 25, 2025 8:58 UTC (Tue) by taladar (subscriber, #68407) [Link] (10 responses)

I would argue that the funding isn't there because you can only lose in terms of performance and language capability when you remove any inlining and use of generics across crates. Most likely you would end up with the same "hope and pray it works" approach that C++ is using but it would work less reliably in languages like Rust that use generics and optimizations (e.g. struct field reordering) even more than C++ uses templates.
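The struct field reordering mentioned above can be sketched as follows. This is a minimal illustration, not anything from the thread: the struct names `Reorderable` and `Fixed` and both helper functions are hypothetical. It shows that only the `repr(C)` layout is something an external binary can depend on, while the default layout is left to the compiler.

```rust
use std::mem::{align_of, size_of};

/// Default (`repr(Rust)`) layout: the compiler is free to reorder fields,
/// so size and field offsets are not a contract other binaries can rely on.
#[allow(dead_code)]
struct Reorderable {
    a: u8,
    b: u32,
    c: u8,
}

/// `repr(C)` layout: fields stay in declaration order with C-style padding.
/// Layout here: `a` at offset 0, 3 bytes of padding, `b` at 4, `c` at 8,
/// then 3 bytes of tail padding.
#[allow(dead_code)]
#[repr(C)]
struct Fixed {
    a: u8,
    b: u32,
    c: u8,
}

/// Returns (size, alignment) of the `repr(C)` struct.
fn fixed_layout() -> (usize, usize) {
    (size_of::<Fixed>(), align_of::<Fixed>())
}

/// Returns the size of the default-layout struct. The exact value depends
/// on the compiler, which is precisely why it cannot be part of an ABI.
fn reorderable_size() -> usize {
    size_of::<Reorderable>()
}
```

On common targets where `u32` is 4-byte aligned, `fixed_layout()` yields a size of 12 and an alignment of 4; `reorderable_size()` is often smaller, but nothing guarantees that.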

ABI stability funding

Posted Nov 25, 2025 13:35 UTC (Tue) by khim (subscriber, #9252) [Link] (8 responses)

The funding is not there because there is no actor who would benefit from that work and has money to spare.

Google and Microsoft don't have an incentive to fund anything like that because they are not providing Rust ABIs (at least not yet), and distros are not in a position to develop anything and don't even feel it's their responsibility to do so.

The story about “awful inlining” is an entirely moot point: you have the same thing with dyn Trait already. What this would do, in terms of the language, is bring dyn Trait to parity with impl Trait. If you want inlining, then simply don't use dyn Trait and you are done.
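For readers less familiar with the distinction being drawn here, a minimal sketch of the two dispatch styles in today's Rust follows. The `Shape` trait, `Square` type, and function names are hypothetical and exist only for illustration.

```rust
trait Shape {
    fn area(&self) -> f64;
}

struct Square(f64);

impl Shape for Square {
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

/// Static dispatch: a separate copy is generated for each concrete type,
/// so the call can be inlined, but the function has no single compiled
/// body that a shared library could export.
fn area_static(shape: &impl Shape) -> f64 {
    shape.area()
}

/// Dynamic dispatch: one compiled body; the call goes through the vtable
/// carried by the `&dyn Shape` fat pointer, so it is normally not inlined.
fn area_dynamic(shape: &dyn Shape) -> f64 {
    shape.area()
}
```

Both functions return the same result for the same value; the difference is where the concrete type is resolved, at compile time or at runtime.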

ABI stability funding

Posted Nov 30, 2025 17:11 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (7 responses)

dyn Trait can never reach parity with impl Trait. impl Trait is Sized, and that implies a lot of API flexibility that dyn Trait is physically incapable of expressing under Rust's current data model.

Alternative data models, and why they do not solve this problem:

* dyn Trait in a Sized context becomes syntactic sugar for Box<dyn Trait> (autoboxing) - There is no guarantee that we even have a heap in the first place. It would create a bizarre inconsistency where T is not the pointee of &T. And there are several problems (see below) that it doesn't actually solve.
* Methods with a self receiver implicitly specialize to accept Box<dyn Trait> as the receiver - Does nothing for non-receiver Self arguments, since we cannot prove that they have the same concrete type as the receiver. In the general case, it is possible to impl Trait for Box<dyn Trait> (or write an inherent impl for dyn Trait), and then this is ambiguous. And it does nothing for associated types and non-method associated functions, which would remain dyn-incompatible anyway.
* Introduce &owned references that "confer ownership" and can be moved out of (unlike &mut, which can be swapped but not moved-from) - Similar to the previous bullet, this fixes the self receiver and does not help with much else. On top of that, it is yet another type of reference we would need to think about.
* Steal dynamic_cast from C++ - This has already been done, see std::any::Any (or you could reinvent your own with unsafe code, if for some reason Any is too inflexible for you). But that forces these APIs to become either fallible or unsafe when dispatched from dyn Trait, and that's much less ergonomic than impl Trait.
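The last bullet, and the non-receiver Self problem from the second one, can be sketched together. This is an illustrative assumption about how such an API might look, not code from the thread: the `Token` trait, its `try_eq` method, and the `Id` and `Label` types are all hypothetical. With generics one could write `fn eq(&self, other: &Self)`; behind `dyn` the other operand's concrete type is unknown, so the comparison has to go through `std::any::Any` and becomes fallible.

```rust
use std::any::Any;

trait Token {
    /// Escape hatch so that a holder of `&dyn Token` can attempt a downcast.
    fn as_any(&self) -> &dyn Any;

    /// Returns `None` when `other` is not the same concrete type as `self`.
    fn try_eq(&self, other: &dyn Token) -> Option<bool>;
}

#[derive(PartialEq)]
struct Id(u32);

#[derive(PartialEq)]
struct Label(&'static str);

impl Token for Id {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn try_eq(&self, other: &dyn Token) -> Option<bool> {
        // The downcast fails (yielding `None`) if `other` is not an `Id`.
        other.as_any().downcast_ref::<Id>().map(|o| self == o)
    }
}

impl Token for Label {
    fn as_any(&self) -> &dyn Any {
        self
    }

    fn try_eq(&self, other: &dyn Token) -> Option<bool> {
        other.as_any().downcast_ref::<Label>().map(|o| self == o)
    }
}
```

Comparing an `Id` with a `Label` returns `None` rather than failing to compile, which is the ergonomic cost the bullet describes.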

ABI stability funding

Posted Nov 30, 2025 17:32 UTC (Sun) by khim (subscriber, #9252) [Link] (6 responses)

> impl Trait is Sized, and that implies a lot of API flexibility that dyn Trait is physically incapable of expressing under Rust's current data model

Where have you read the idea that it wouldn't change the “current data model”? It would, of course. The same way Swift did: by permitting !Sized types on stack and so on. Large work, sure, but nothing impossible.

> Alternative data models, and why they do not solve this problem:

Have you excluded the not just obvious, but already implemented (in Swift) solution on purpose? Introduce dynamic, parametrised types, and that's it. Yes, it would change the language in subtle ways: str would no longer be one single type, but would become a parametrised type with its length fixed at runtime… so a pointer or reference would need to include both the address and the length… oh, right, that's how things already work, isn't it? That would actually simplify the language instead of making it more complex. It would remove lots of corner cases related to !Sized types.

The big question is not whether that can be done, but how hard would it be. Probably lots of simple, tedious work… but nothing really crazy.

Backward compatibility could be problematic, though.

The worst thing that would happen: some panics that currently happen at compile-time would start happening at runtime… consider function that removes one element from [u8] (slice, not array!) and returns the result… what should it do if slice is already empty?

Currently all that zoo is kept out of stable, but there are lots of struggles in attempting to make it all work… struggles that, ironically enough, become trivial if you introduce runtime-parametrised types and make dyn Trait identical to impl Trait.

It's more of a political decision than a technical one: no one would fund such work without a promise of it being included in the compiler, and no one would give such a promise if only ideas on paper exist and there is no working code. But maybe someone can do that as a Rust fork?

The biggest question is with functions that may return an unknown type (known type, unknown parameters: think about a function that removes duplicates from a slice). As a first step these may be forbidden outright: every type that leaves a function should be describable in terms of its inputs (but then the aforementioned function that removes one element from a slice is also impossible), a bit like how lifetimes are handled today.
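The slice example from this comment (remove one element from a [u8], and decide what happens when it is already empty) can be written in today's Rust in two ways. This is only a sketch of the existing options, not of the proposed runtime-parametrised types; both function names are hypothetical.

```rust
/// Fallible version: the empty case is pushed into the return type,
/// so the caller is forced to handle it.
fn drop_first(bytes: &[u8]) -> Option<&[u8]> {
    bytes.split_first().map(|(_first, rest)| rest)
}

/// Panicking version: the length is only known at runtime, so an empty
/// input is detected there and triggers a panic on the range index.
#[allow(dead_code)]
fn drop_first_or_panic(bytes: &[u8]) -> &[u8] {
    &bytes[1..]
}
```

Either way the check happens at runtime, because the slice length is not part of the type; that is the behaviour the comment expects runtime-parametrised types to inherit.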

ABI stability funding

Posted Dec 1, 2025 23:12 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (5 responses)

> The same way Swift did: by permitting !Sized types on stack and so on. Large work, sure, but nothing impossible.

I refer the honorable gentleperson to the answer I gave some moments ago:

> Similar to the previous bullet, this fixes the self receiver and does not help with [associated types, associated functions, and various other dyn-incompatible things besides the self receiver].

ABI stability funding

Posted Dec 1, 2025 23:14 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (4 responses)

(And yes, I can see that you wrote a bunch of stuff that boils down to "let's make method dispatch fallible." But that is so obviously absurd that it is undeserving of a serious response.)

ABI stability funding

Posted Dec 1, 2025 23:41 UTC (Mon) by khim (subscriber, #9252) [Link] (3 responses)

One may face a random compilation error today with impl even if all formal trait-level checks succeed. Naturally, if we want parity between impl and dyn, we can't skip that quirk either. And because with dyn these checks can't be done at compile time, they have to be done at runtime.

As I have said: that's not a big deal in practice. The majority of existing software is written in languages that have that “defect”, and it's rarely a problem. Whether the Rust team wants to deliver something usable in a realistic timeframe or would prefer to punish users with what they have now is ultimately up to them: using extern "C" doesn't solve that problem; in fact, it makes it worse.

ABI stability funding

Posted Dec 2, 2025 8:44 UTC (Tue) by taladar (subscriber, #68407) [Link] (2 responses)

As a Rust user anything that causes more failures at runtime that could instead be compile time failures would be something I would definitely strongly oppose.

ABI stability funding

Posted Dec 2, 2025 10:21 UTC (Tue) by khim (subscriber, #9252) [Link] (1 responses)

We are talking about Rust, not Haskell, here. Uses of extern "C" definitely generate more problems at runtime than a higher-level interface ever could, only they generate random crashes, not predictable panics.

In the past, for things like integer overflow or array access, Rust usually picked panics over UB, but with a stable ABI the approach is the opposite.

That looks a bit illogical to me, but then, people are not always rational.

ABI stability funding

Posted Dec 9, 2025 0:37 UTC (Tue) by gmatht (subscriber, #58961) [Link]

The current convention of static linking avoids this. If dynamic linking became more common in Rust, making a missing library a compile-time error may not be possible, but it would be nice for it to at least be a link-time error.

ABI stability funding

Posted Nov 25, 2025 16:52 UTC (Tue) by Wol (subscriber, #4433) [Link]

> but it would work less reliably in languages like Rust that use generics and optimizations (e.g. struct field reordering) even more than C++ uses templates.

And there's no possibility to declare an interface as "extern", which means that anything crossing that interface cannot be optimised in a way that would break an external app that doesn't know about the changes?

Of course, that then means a strict separation of declarations, inline definitions, and generics, but might that not be a good thing?

I can see that trying to turn generics into concretes might be a little tricky, but a dummy call for every generic you want to concrete, over an extern definition, would do it?

And just like with "unsafe", you could offload the responsibility to the programmer to make sure the use of the definition files is consistent. With automated traps as far as possible.

Cheers,
Wol
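The "dummy call for every generic you want to concrete, over an extern definition" idea above is roughly what can be done by hand today. A minimal sketch, assuming a hypothetical generic `largest` and one hand-written concrete wrapper `largest_u32`; neither name comes from the thread. A real export would also need a no-mangle attribute, which is omitted here because its spelling differs between Rust editions.

```rust
/// Generic implementation: a separate copy is compiled for every `T`,
/// so there is no single symbol with a stable ABI to export.
fn largest<T: Copy + Ord>(items: &[T]) -> Option<T> {
    items.iter().copied().max()
}

/// Concrete instantiation for `u32`, exposed with the C calling convention.
/// Only pointer, length, and integer types cross the boundary, so the
/// compiler cannot reorder or otherwise optimise away the interface.
///
/// # Safety
/// If `ptr` is non-null it must point to `len` initialized `u32` values
/// that stay valid for the duration of the call.
pub unsafe extern "C" fn largest_u32(ptr: *const u32, len: usize, fallback: u32) -> u32 {
    if ptr.is_null() || len == 0 {
        return fallback;
    }
    // SAFETY: the caller guarantees `ptr` and `len` describe a valid slice.
    let items = unsafe { std::slice::from_raw_parts(ptr, len) };
    largest(items).unwrap_or(fallback)
}
```

As with "unsafe", the responsibility for keeping both sides of the boundary consistent falls on the programmer: nothing checks that the caller's idea of `largest_u32` matches this definition.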


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds