Defining the Rust 2024 edition

Posted Feb 3, 2024 9:19 UTC (Sat) by mb (subscriber, #50428)
In reply to: Defining the Rust 2024 edition by jschrod
Parent article: Defining the Rust 2024 edition

Guys, can we please all educate ourselves a bit? Please?

It is possible to have dynamic libraries with Rust.

It is just not possible to have a stable ABI for *arbitrary* programs. That is just as impossible for Rust as it is impossible for C++ and even C.
It doesn't make any sense to link generic libraries dynamically. It doesn't make sense in Rust and it also doesn't make sense in C++ (templates, preprocessor) or C (preprocessor generics). These kinds of libraries always have to be statically linked.
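
To make the "preprocessor generics" point concrete, here is a minimal sketch in C. The macro DEFINE_STACK and every name it generates are invented for illustration. The function bodies are produced in whichever translation unit expands the macro, so there is nothing a shared library could export ahead of time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical header-only "generic" stack. Expanding the macro generates a
 * new struct and new functions for element type T in the *caller's*
 * translation unit, so the code ends up inside every program that uses it. */
#define DEFINE_STACK(T, NAME, CAP)                                    \
    struct NAME { T items[CAP]; size_t len; };                        \
    static inline int NAME##_push(struct NAME *s, T v) {              \
        if (s->len == (CAP)) return -1; /* stack is full */           \
        s->items[s->len++] = v;                                       \
        return 0;                                                     \
    }                                                                 \
    static inline int NAME##_pop(struct NAME *s, T *out) {            \
        if (s->len == 0) return -1; /* stack is empty */              \
        *out = s->items[--s->len];                                    \
        return 0;                                                     \
    }

/* Two instantiations: two separate structs and two sets of functions. */
DEFINE_STACK(int, int_stack, 4)
DEFINE_STACK(double, dbl_stack, 2)

/* Push 1..n onto an int_stack (capacity 4) and sum whatever can be popped. */
static inline int sum_of_pushed(int n) {
    assert(n >= 0);
    struct int_stack s = { .len = 0 };
    int total = 0, v;
    for (int i = 1; i <= n; i++)
        (void)int_stack_push(&s, i); /* pushes beyond capacity are rejected */
    while (int_stack_pop(&s, &v) == 0)
        total += v;
    return total;
}
```

Changing the capacity or the element type changes code that lives in the application, not in any library, which is why such a header cannot be swapped out underneath an already built binary.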

But if you fear world rebuilds as an issue for your library, you *can* avoid that by carefully making it a shared library. No different from C++ or C.

The reality is that these world rebuilds rarely are an issue.
I don't know of a single case of stdlib bug that made a world rebuild mandatory. There probably are a couple of these incidents, though.
And that is simply because certain kinds of bugs cannot happen in Rust: either the compiler outright refuses to compile the code, or the programmer's mindset when writing unsafe Rust is completely different from programming C.

Even if a CVE is assigned to a certain bug in a central library, it rarely affects any *actual* program. In Rust programs CVEs are assigned for all unsoundness bugs. An unsoundness bug *could* result in an invalid program iff the program exploits the unsoundness. But in practice that rarely happens, because usually such code would not be idiomatic Rust code.

Lots of really bright people who wrote great software are commenting here. I really appreciate that and it makes LWN kind of unique.
But if you have no clue about Rust, please try it. Please educate yourself. There is no shame in that. It will bring you forward. And it doesn't mean that you will have to like Rust. But you will *know* it and you will not look like a fool in discussions like this anymore.
(And no, I also don't know everything about Rust and I might also have commented incorrectly at times.)
Rust is very different from C. You cannot simply apply your existing C knowledge to Rust. It will automatically result in such ridiculous discussions as seen here.


Defining the Rust 2024 edition

Posted Feb 3, 2024 12:22 UTC (Sat) by pizza (subscriber, #46) [Link] (13 responses)

> It is just not possible to have a stable ABI for *arbitrary* programs. That is just as impossible for Rust as it is impossible for C++ and even C.

The entire C world disagrees with you, using the fact that it _exists_ as evidence.

Defining the Rust 2024 edition

Posted Feb 5, 2024 10:07 UTC (Mon) by farnz (subscriber, #17727) [Link] (12 responses)

I don't believe that the C world has a stable ABI for arbitrary programs; rather, the C language is sufficiently limited that it's easy to avoid the sorts of things that result in an ABI change.

Otherwise, you're claiming that if I reduce MAX_BYTES in the below header file and rebuild the dynamic library that used it, any program that was built against the old definition (and hence the old ABI) will automatically adjust to match:


#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint8_t */

/* A macro rather than `const size_t`: in C a const object is not a constant
   expression, so it cannot be used to size an array member. */
#define MAX_BYTES 65535

struct priv_context;
struct context {
    uint8_t buffer[MAX_BYTES];
    struct priv_context *private;
};

/// Allocate a new context. Note that priv_context has entries sized based on MAX_BYTES
struct context *allocate_context(void);

/// Supply new data to the context; if the input is more than MAX_BYTES, then the behaviour of this function is undefined
void new_data(struct context *context, uint8_t *bytes, size_t size);

This does not match the C standard I'm used to working with.
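
A minimal sketch of why the rebuilt library and an old binary disagree, assuming MAX_BYTES is reduced from 65535 to 4096 (the new value and the _old/_new names are made up for illustration). Both views of the struct are placed side by side in one file so their layouts can be compared directly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values: the one in the header the application was built against,
 * and the reduced one used when the library was rebuilt. */
#define OLD_MAX_BYTES 65535
#define NEW_MAX_BYTES 4096
static_assert(OLD_MAX_BYTES > NEW_MAX_BYTES, "example assumes the limit was reduced");

struct priv_context; /* never defined here; only used through a pointer */

/* struct context as an application compiled against the old header sees it */
struct context_old {
    uint8_t buffer[OLD_MAX_BYTES];
    struct priv_context *private;
};

/* struct context as the rebuilt library sees it */
struct context_new {
    uint8_t buffer[NEW_MAX_BYTES];
    struct priv_context *private;
};

/* Byte offset at which each side expects to find the `private` pointer. */
static inline size_t private_offset_old(void) {
    return offsetof(struct context_old, private);
}
static inline size_t private_offset_new(void) {
    return offsetof(struct context_new, private);
}

/* Returns 1 if both builds agree on the size and layout of struct context. */
static inline int layouts_agree(void) {
    return sizeof(struct context_old) == sizeof(struct context_new)
        && private_offset_old() == private_offset_new();
}
```

The source text of the header is unchanged apart from one number, yet the old binary looks for `private` at an offset that lies past the end of the object the new library allocates.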

Defining the Rust 2024 edition

Posted Feb 5, 2024 13:17 UTC (Mon) by pizza (subscriber, #46) [Link] (11 responses)

> Otherwise, you're claiming that if I reduce MAX_BYTES in the below header file and rebuild the dynamic library that used it, any program that was built against the old definition (and hence the old ABI) will automatically adjust to match:

You changed the *API*, of course you're going to have issues. This is true of _any_ language, static or dynamic; computer or human.

Meanwhile, in the C world, I can take a shared library that is literally over two decades old (and its accompanying header file, which describes the API) and expect it to JustWork(tm) with my brand-new software and the latest compiler and libc for my platform [1]. That's what "Stable" refers to in this context.

But going back to your example *API*, it's not well thought out. But defining "good" APIs is a discipline all of its own.

[1] To be fair, some of that credit is due to glibc's commitment to backwards compatibility.

Defining the Rust 2024 edition

Posted Feb 5, 2024 14:19 UTC (Mon) by farnz (subscriber, #17727) [Link] (5 responses)

I did not change the API at all - and in a statically linked world, everything works just fine. The API remains (both before and after the change) "you must not supply data in chunks larger than MAX_BYTES"; if I supply you a .a and matching .h for that library, you can statically link, and a simple recompile will fix things if you obey the API as documented (e.g. by reading in at most MAX_BYTES at a time via read(2), then supplying them to this library).

If I was committing to a stable ABI, I'd use a version script (as glibc does) to provide both old and new versions of the symbols, and to do whatever it takes to handle both old and new versions with the same underlying algorithms. This is a lot of work, and it's to the glibc maintainers' credit that they do this work, so that even if they change APIs (not just ABIs), things Just Work.

Part of the problem here is that you've internalised a whole pile of rules around ABI stability, and you're assuming that they're part of the C language - but they're not, they're things that you have to do to have a stable ABI even in C. It's just that in Rust (and C++), the things you have to do to have a stable ABI are much more visibly painful than they are in C, because there aren't many constructs in C that don't translate directly to the ELF psABI for your platform (const is one, the preprocessor is another, I can't think of a third off the top of my head). This, in turn, is because the C language simply doesn't have useful features (like generics) which aren't directly representable in ELF psABIs (in part because the ELF psABIs themselves aim to fully define the ABI for a C-like language, and not for something more capable).

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:23 UTC (Mon) by pizza (subscriber, #46) [Link] (4 responses)

> The API remains (both before and after the change) "you must not supply data in chunks larger than MAX_BYTES"

You changed the definition of MAX_BYTES, which is a change to the API -- No ifs, buts, or hand-waveys.

That said, not every change to the API necessarily represents a change to the ABI (for existing stuff), and there are ways to design the APIs to make your example (ie changing the definition of MAX_BYTES) have no effect on the ABI. Just off the top of my head:

1) Give the structure an explicit data length field and use a variable-length structure. This also means one doesn't get to use sizeof(structure) or static allocations.
1a) Instead of a variable-length structure, use a pointer to an arbitrarily-sized blob.
2) Have the application query/be told, at runtime, the maximum size of a given structure/field.
3) Make the structure opaque, with the API providing constructors/destructors and access/manipulation functions.
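
Options (2) and (3) from the list above could look roughly like the following sketch. All function names and the 4096-byte limit are hypothetical; the header part and the library part are shown in one file only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* ---- What the public header would contain (all names hypothetical) ---- */
struct context;                                       /* opaque: layout stays private */
struct context *context_create(void);                 /* option 3: constructor */
void context_destroy(struct context *ctx);            /* option 3: destructor */
size_t context_max_bytes(const struct context *ctx);  /* option 2: ask at runtime */
int context_new_data(struct context *ctx, const uint8_t *bytes, size_t size);
size_t context_size(const struct context *ctx);

/* ---- What lives only inside the library ---- */
#define LIB_MAX_BYTES 4096 /* assumed value; free to change between releases */

struct context {
    size_t used;
    uint8_t buffer[LIB_MAX_BYTES];
};

struct context *context_create(void) {
    return calloc(1, sizeof(struct context));
}

void context_destroy(struct context *ctx) {
    free(ctx);
}

size_t context_max_bytes(const struct context *ctx) {
    assert(ctx != NULL);
    return sizeof ctx->buffer;
}

/* Returns 0 on success, -1 if the chunk exceeds the library's current limit. */
int context_new_data(struct context *ctx, const uint8_t *bytes, size_t size) {
    assert(ctx != NULL && bytes != NULL);
    if (size > sizeof ctx->buffer)
        return -1;
    memcpy(ctx->buffer, bytes, size);
    ctx->used = size;
    return 0;
}

size_t context_size(const struct context *ctx) {
    assert(ctx != NULL);
    return ctx->used;
}
```

Because the application never sees sizeof(struct context) or the numeric limit at compile time, rebuilding the library with a different LIB_MAX_BYTES changes no offsets or sizes baked into existing binaries. The price is a heap allocation and a function call for every access.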

For example, the decades-old BSD socket API uses a combination of (1) and (2). I can take a binary that was only ever aware of IPv4 addressing (32-bit addresses) and use it with a library that is aware of IPv6 (128-bit addresses), and it will work. Not because of glibc's fancy symbol versioning, but because the API was crafted with care to ensure that additions wouldn't change the ABI for the older stuff.
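
A rough illustration of the socket-address convention described above: every address starts with a family tag and is passed together with an explicit length, so the callee can tell which concrete layout it was handed. The helper port_of is hypothetical and not part of any real library; the types and constants are the standard POSIX ones.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical library-side helper: returns the port in host byte order, or
 * -1 if the address family is unknown or the caller passed too few bytes. */
int port_of(const struct sockaddr *sa, socklen_t len) {
    assert(sa != NULL);
    /* Make sure the family tag itself is within the bytes we were given. */
    if ((size_t)len < offsetof(struct sockaddr, sa_data))
        return -1;
    switch (sa->sa_family) {
    case AF_INET:   /* the layout an IPv4-only caller knows about */
        if ((size_t)len < sizeof(struct sockaddr_in))
            return -1;
        return ntohs(((const struct sockaddr_in *)sa)->sin_port);
    case AF_INET6:  /* larger layout added later; older callers never send it */
        if ((size_t)len < sizeof(struct sockaddr_in6))
            return -1;
        return ntohs(((const struct sockaddr_in6 *)sa)->sin6_port);
    default:
        return -1;
    }
}
```

An application that only ever builds a struct sockaddr_in keeps working unchanged, because adding the AF_INET6 case did not move or resize anything in the layout it uses.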

Another way of looking at this is that the shared library boundary is akin to a network protocol; a change in MAX_BYTES in your strawman example means the over-the-wire data format will also change, meaning you have to update both sides in lockstep for nearly any change, but with care (eg adding an explicit length field) you can check for a size greater than [your idea of] MAX_BYTES and handle/fail it gracefully.
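
The "explicit length field" idea can be sketched as follows. The struct names, fields and the apply_options function are all invented for illustration; the point is only that the library inspects the size the caller claims before touching anything else, and fails cleanly when it does not recognise it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Version 1 of a hypothetical options struct, as old applications were built. */
struct options_v1 {
    uint32_t size;       /* caller sets this to sizeof its own struct */
    uint32_t max_chunk;
};

/* Version 2 appends a field; the existing fields keep their offsets. */
struct options_v2 {
    uint32_t size;
    uint32_t max_chunk;
    uint32_t flags;      /* new in v2 */
};

#define DEFAULT_FLAGS 0u /* used when an old caller cannot supply flags */

/* Library entry point built against v2. It takes an untyped pointer because
 * the caller may have been compiled against either version.
 * Returns 0 on success, -1 if the claimed size matches no known version. */
int apply_options(const void *opts, uint32_t *max_chunk_out, uint32_t *flags_out) {
    assert(opts != NULL && max_chunk_out != NULL && flags_out != NULL);

    uint32_t size;
    memcpy(&size, opts, sizeof size);  /* the size field is always first */
    if (size != sizeof(struct options_v1) && size != sizeof(struct options_v2))
        return -1;                     /* unknown layout: fail gracefully */

    struct options_v2 local = {0};
    local.flags = DEFAULT_FLAGS;
    memcpy(&local, opts, size);        /* copy only what the caller really has */

    *max_chunk_out = local.max_chunk;
    *flags_out = local.flags;
    return 0;
}
```

This is the same discipline a wire protocol needs: the receiver trusts the declared length rather than its own compile-time idea of the layout.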

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:46 UTC (Mon) by farnz (subscriber, #17727) [Link] (2 responses)

No, MAX_BYTES's value is not part of the API; it's part only of the ABI. The fact that you don't give reasoning for why the value of a constant (as opposed to the constant's name) is part of the API implies that you have an internalised view of API that you define as "the things that affect the C ABI", at which point your whole argument is circular.

As an application programmer, following the published Application Programming Interface, I don't care what value MAX_BYTES takes; I just know that I must use the named constant MAX_BYTES to refer to it, and my application will do the right thing across the interface. It's not until you build a binary that you refer to a concrete value; and, indeed, if C was a more capable language, my header file equivalent would only tell you that there is a constant MAX_BYTES, but you'd not get the value of that constant until you combined that with an implementation. It's just that C lacks encapsulation, so when I tell you in my API that there is a named constant, I also have to tell you what value to use for it - I can't hide that from you.

And your "network protocol" example is exactly why C does not have a stable ABI by default - you have to carefully define your ABI in order to avoid problems, and the only "advantage" C has over Rust in this respect is that the things that you use to define your C ABI are also the full power of the C language, whereas in Rust, they're what you get when you remove significant features (like monomorphized generics) from the language.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:57 UTC (Mon) by pizza (subscriber, #46) [Link] (1 responses)

> As an application programmer, following the published Application Programming Interface, I don't care what value MAX_BYTES takes; I just know that I must use the named constant MAX_BYTES to refer to it, and my application will do the right thing across the interface.

Sure. And then you went and arbitrarily switched from the library-provided MAX_BYTES symbol to one of your own creation/definition, and complained that it caused problems.

You don't get to arbitrarily change the local definition of something and expect to successfully interoperate/communicate/whatever with something else.

Defining the Rust 2024 edition

Posted Feb 5, 2024 18:05 UTC (Mon) by farnz (subscriber, #17727) [Link]

No I did not - I changed the library-provided MAX_BYTES, and rebuilt the library with a smaller version. If the library ABI was stable (as would be the case if C provided a stable ABI), then I'd find that applications built for the older version of the API would work with the newer library. As it is, by rebuilding the library with an ABI change (but not an API change), I've broken all applications that were written against the old DSO, but now dynamically link against the new DSO.

And this is the point - I've done a change that's local to the library (the library's API does not change), that breaks the library ABI, and yet the C language doesn't do anything to stabilize that ABI. You're trying to declare this as somehow "out of bounds" because it shows up that C's ABI is also unstable, unless you take care at the library level to also keep a stable ABI.

Once you're carefully doing a stable ABI for your library, then Rust and C provide similar tools - you can define your ABI in terms of the things that the ELF psABIs (and other platform equivalents) provide, and you know that it's a special module that you need to be careful with. The only difference is that most of C is stuff that lives in the psABI for your OS, but most of Rust does not, and thus when you write your ABI module in Rust, you find yourself feeling much more restricted than you would be if you were writing C.

But this feeling is not an advantage of C - it's that C is sufficiently impoverished as a language (AFAICT, there's nothing in C that wasn't invented by 1960) that you don't have much that isn't already in the psABI for your platform. And that goes double for ELF platforms, since the ELF psABI was defined in terms of the (already well-understood) needs of ld for C and FORTRAN 77, not for anything more modern.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:51 UTC (Mon) by mb (subscriber, #50428) [Link]

You see? You have to restrict yourself to a subset of C's possibilities to get a stable ABI.
There is no stable ABI for arbitrary C programs.
Yet, you demand a stable ABI for arbitrary Rust programs.
Not going to happen.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:11 UTC (Mon) by mb (subscriber, #50428) [Link] (4 responses)

>Meanwhile, in the C world,

Rust does have a stable C interface.
It's there. You can use it to have stable ABIs.

Please compare apples to apples.
It's not correct to compare the toy-ABI of C to something as complex as Rust or even C++ and demand that if C has a semi-stable ABI, then C++ and Rust should also have them with support for the whole type system. They should not. And they can't.

>But going back to your example *API*, it's not well thought out.

Yep. That's the thing here. To have a stable ABI you have to constrain your API. That is true for Rust, that is true for C++ (d-pointer, anyone?) and that is also true for C (#define). Thou shalt not modify the #defined values is one of the rules that you applied.

It will probably happen that a non-generic subset of Rust will be defined as stable ABI in the future.
Just like we can have a subset of C++ being stable.
Simple C-like Rust functions with simple Rust types going in and out can probably be made stable.

But I think today is not the time to do that. We should stabilize the APIs a bit more, before we stabilize a subset of the ABIs. Due to the inter-crate compatibility guarantee between editions, we already have a big constraint on what editions can change. I think it's not the time to constrain that further, yet. Just use a C ABI with simple types, if you need a stable ABI.

Defining a stable ABI for the whole language probably is impossible.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:32 UTC (Mon) by pizza (subscriber, #46) [Link] (3 responses)

> Rust does have a stable C interface.
> It's there. You can use it to have stable ABIs.

Well, duh. That's the entire point; to get a stable ABI in Rust today, you have to essentially demote it to what C provides, on both the library provider and library user, even if both are written in Rust.

> It will probably happen that a non-generic subset of Rust will be defined as stable ABI in the future.
> Defining a stable ABI for the whole language probably is impossible.

"probably in the future" is not something that someone [considering] using Rust today should ever plan on happening.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:44 UTC (Mon) by mb (subscriber, #50428) [Link] (1 responses)

>That's the entire point; to get a stable ABI in Rust today, you have to essentially demote it to what C provides,
>on both the library provider and library user, even if both are written in Rust.

Which makes it no less stable or usable for stable ABIs than C.
I don't really see the point.

We need stable Rust ABIs, because C has a stable ABI?
Rust *has* a stable C ABI.
What's your point?

> "probably in the future" is not something that someone [considering] using Rust today should ever plan on happening.

What should they do instead? Use C? How does that make any sense? Rust has a C ABI.

Defining the Rust 2024 edition

Posted Feb 5, 2024 17:10 UTC (Mon) by pizza (subscriber, #46) [Link]

> What should they do instead? Use C? How does that make any sense? Rust has a C ABI.

If you want/need a stable ABI with Rust, you have to present (and/or consume) a C ABI, because Rust is unlikely to gain a "Stable Native Rust" ABI in the foreseeable future.

Defining the Rust 2024 edition

Posted Feb 5, 2024 16:52 UTC (Mon) by farnz (subscriber, #17727) [Link]

We have to demote to less than what C provides, actually - we have to demote down to the things that are provided by the platform ABIs (Win32, Win64, ELF psABIs, Mach-O), and we have to do that in both cases, since there are things in C that are not represented in the platform ABIs, either.

The difference is that much, much more of useful Rust is not directly represented in the platform ABIs, because there's a much bigger chunk of Rust that's more powerful than the platform ABIs directly provide (not least because the platform ABIs originally intended to cover most useful C and FORTRAN code, so were built to provide most of what C needs). But if you want a stable ABI with Rust, you can do it; it's just more obvious how much you're throwing away to get a stable ABI.

Defining the Rust 2024 edition

Posted Feb 3, 2024 12:24 UTC (Sat) by pizza (subscriber, #46) [Link] (1 responses)

> The reality is that these world rebuilds rarely are an issue.

They are if you don't have the complete source code to _everything_.

Which, when you are consuming commercial/proprietary libraries, you often (if not usually) don't.

Defining the Rust 2024 edition

Posted Feb 20, 2024 15:05 UTC (Tue) by natkr (guest, #123377) [Link]

And that's the beauty of it! Making rebuilding the default makes proprietary libraries largely unviable, while helping exercise the build process of F/OSS ones. Win/win!


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds