
GNOME's new versioning scheme

Posted Sep 18, 2020 19:48 UTC (Fri) by nix (subscriber, #2304)
In reply to: GNOME's new versioning scheme by ebassi
Parent article: GNOME's new versioning scheme

> Releasing GTK4 will require API bumps in other libraries, indeed; but we do have strong parallel installability guarantees for a reason, and we can phase out the GTK3 reverse dependencies in time. GTK3 is still supported, and so will be the libraries that depend on it; once all the libraries used by an application have been updated to support GTK4, porting your application should not require a major step.

Oh dear. Unless you've changed every public symbol name in Gtk 4, this will fail: the moment one library is ported to use Gtk 4, even if it bumps soname and gains parallel installability at the same time, any process that depends directly or transitively on both that library and Gtk 3 will suffer symbol name clashes between the two Gtks, a pile of unintended interposition, and likely disaster (so this also happens if you port any application to Gtk 4, if any of its libs still uses Gtk 3, and vice versa). You can hack around this to some degree using -Bsymbolic and DT_GROUP, but this is a) ELF-specific and b) won't help with data structures passed between the incompatible Gtks: so you probably have to change the names of all datatypes in the public API as well. Worse yet, this also applies to non-GNOME users of Gtk, so you can never be sure this won't trip you up.

This sort of thing caused a number of problems for distros in the Gtk 2 transition period and is probably one reason why packagers and distributors groaned at the new plan of breaking the Gtk ABI more frequently. ABI breakage hurts. I think glibc and the X libraries have the right of it here: widely-used libraries should never, ever break ABI at all, for any reason. Yes, this makes library design really hard: blame ELF :(

(You probably know all this already and have planned for it, though it's a hell of a lot of work and honestly I'm not sure how you can make it less than horrendously painful for everyone: I'm just posting this in case you overlooked this particular pile of festering nightmares and the degree to which it hurts all your direct and indirect library users. It's amazing how many major projects overlook it: GhostScript is one, which forked libcms into a multithread-capable version for internal gs use without changing any of the symbol names, so now any libcms users which also include libgs are likely to dump core. The only solution to that is to not use the gs fork of libcms, which is explicitly unsupported and planned to go away. I guess that means libgs users that want to do colour management are all screwed when that happens. Sigh.)


GNOME's new versioning scheme

Posted Sep 19, 2020 1:15 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (6 responses)

I think, but may be wrong, that other formats (Mach-O and PE mainly) actually have facilities to bind to symbols in given libraries, so collisions there are actually "okay" since you also look up symbols through a library name filter. I do wish ELF didn't do the "one big bag of symbols" approach, but we're stuck with that I guess. I suppose you could mitigate it with symbol versioning to differentiate between libgtk3 and libgtk4, but that's not always easier and the entire ecosystem would probably need to do it.

And yes, embedding libraries is a real pain. We do it because we have Windows and macOS deployments and who is going to have a viable HDF5 SDK just lying around there? No one, that's who. But I make sure that every library name, header file, and exported symbol is mangled to avoid conflicts. Sometimes static globals still bite us, but those errors usually have the symbol name in them so we can go and fix it. We don't mangle data types because no one should be including our mangled library and a real one in the same TU anyway. Alas, Python module paths/names are also nigh impossible to mangle in practice, so we do end up colliding there, but AFAIK, there's nothing we can do about that.

GNOME's new versioning scheme

Posted Sep 19, 2020 11:40 UTC (Sat) by nix (subscriber, #2304) [Link] (5 responses)

This is what the combination of -Bsymbolic (to bind to the locally-visible definitions at link time and avoid interposition: -z now might help too) and DT_GROUP (to prevent the locally-used Gtk from "leaking out" to supervening libraries) is meant to do. I'm not entirely sure it works 100%: I know glibc has historically had painful-to-even-describe bugs in this area. Still, in the absence of a working dlmopen (which is an even *bigger* pile of pain to use even when it does work), or renaming all symbols and publicly-visible types (even worse!) this is the best we can do. It is not a very good best.

GNOME's new versioning scheme

Posted Sep 19, 2020 18:46 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (4 responses)

Hmm, my reading of the flags is as follows:

- -Bsymbolic: don't look for symbols we know are in our own library anywhere else

So this doesn't seem like it would help with other code using my symbols from getting confused. Do you have documentation links for DT_GROUP? My manpages have nothing about it. From your descriptions, it sounds like it restricts symbol lookups to those found via DT_NEEDED for the library itself? So basically, a better `RTLD_LOCAL`. How does this interact with weak symbols (like, say, Anaconda where libpython.so symbols are meant to be provided by the loading executable)?

GNOME's new versioning scheme

Posted Sep 19, 2020 19:24 UTC (Sat) by nix (subscriber, #2304) [Link] (3 responses)

-Bsymbolic applied to (say) the link of Gtk 4 itself stops Gtk's own internal calls to its own symbols from being interposed by some other Gtk that happened to have been loaded earlier.

DF_GROUP (as usual I got this one wrong, it's a flag, not a tag) constrains symbol searches from within this library to happen only within the library and its transitive DT_NEEDED libraries: it stops global symbols or symbols in other branches of the search tree from interposing. The relevant ld flag to look for is '-Bgroup'. It is *not* the same as --start-group/--end-group or the linker script GROUP command: fairly confusing naming really. Weak symbols still work, but only if they're resolved by objects loaded by this shared library or things it loaded. (So the old pthreads trick wouldn't work, not that it's a good idea anyway in glibc 2.32+.)
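
The difference between the default lookup and the group-restricted lookup described above can be sketched with a toy model. This is not how ld.so is implemented: the `struct object` layout, the `demo` load list, and the library names in it are all made up for illustration, and the group search here is depth-first over an acyclic dependency graph, which is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 4
#define MAX_DEPS 4

/* Toy stand-in for one loaded object (hypothetical layout). */
struct object {
    const char *name;
    const char *syms[MAX_SYMS]; /* symbols it defines, NULL-terminated */
    int deps[MAX_DEPS];         /* indices of its DT_NEEDED entries, -1-terminated */
};

/* Hypothetical process: an app that still links Gtk 3 directly and
 * also uses a library that has already been ported to Gtk 4. */
static const struct object demo[] = {
    { "app",       { "main", NULL },                { 1, 2, -1 } },
    { "libgtk-3",  { "gtk_widget_activate", NULL }, { -1 } },
    { "libported", { "ported_init", NULL },         { 3, -1 } },
    { "libgtk-4",  { "gtk_widget_activate", NULL }, { -1 } },
};

static int defines(const struct object *o, const char *sym)
{
    for (size_t i = 0; i < MAX_SYMS && o->syms[i] != NULL; i++)
        if (strcmp(o->syms[i], sym) == 0)
            return 1;
    return 0;
}

/* Default behaviour: walk the whole load list in order; the first
 * definition wins, whoever is asking. */
static const char *lookup_global(const struct object *objs, size_t n,
                                 const char *sym)
{
    for (size_t i = 0; i < n; i++)
        if (defines(&objs[i], sym))
            return objs[i].name;
    return NULL;
}

/* Group behaviour: search only the requesting object and its
 * transitive dependencies. */
static const char *lookup_group(const struct object *objs, int from,
                                const char *sym)
{
    assert(from >= 0);
    if (defines(&objs[from], sym))
        return objs[from].name;
    for (size_t i = 0; i < MAX_DEPS && objs[from].deps[i] >= 0; i++) {
        const char *hit = lookup_group(objs, objs[from].deps[i], sym);
        if (hit != NULL)
            return hit;
    }
    return NULL;
}
```

In this model `lookup_global(demo, 4, "gtk_widget_activate")` yields "libgtk-3" even when libported is the caller, which is the interposition problem; `lookup_group(demo, 2, "gtk_widget_activate")` yields "libgtk-4", and `lookup_group(demo, 2, "main")` finds nothing, which is the plugin concern raised in the replies.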

GNOME's new versioning scheme

Posted Sep 22, 2020 16:33 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (2 responses)

Hrm. The fact that it is a library-wide flag works most of the time, but really fails for plugins. I'd like them to load their novel dependencies directly, but those from the loading application usually should just be expected to be there. Maybe `--warn-unresolved-symbols` can help to alleviate that though.

I have a PR (that's likely fallen on deaf ears) for macOS' linker to have a "any symbol you find in this library should be looked up at runtime" behavior which is, I think, exactly the kind of behavior I'd like. https://github.com/apple-opensource/ld64/pull/1

For reference, the search term to use for the ELF flag is `DF_1_GROUP`.

GNOME's new versioning scheme

Posted Sep 22, 2020 21:38 UTC (Tue) by nix (subscriber, #2304) [Link] (1 responses)

I don't know what "looked up at runtime" means: all symbols in dynamically-linked programs are looked up at runtime, that's what dynamic linking *is*. Perhaps you mean "look for these symbols in the main program"? Because if so, that's the ELF default: the search scope for all symbols starts with the main program and goes on from there. (Well, OK, actually it starts with ld.so itself, and the main program comes right after that, usually followed by libc.)

GNOME's new versioning scheme

Posted Sep 22, 2020 21:54 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

As opposed to getting "this symbol was not found" errors at link time. So if I use `-Bgroup`, symbols will be searched for in the loading chain and then *only* the DT_NEEDED libraries? Or does the library/executable providing a plugin API also need to be in the DT_NEEDED section for DF_1_GROUP to work appropriately?

GNOME's new versioning scheme

Posted Sep 19, 2020 3:04 UTC (Sat) by pabs (subscriber, #43278) [Link] (4 responses)

Does GTK not use symbol versioning?

GNOME's new versioning scheme

Posted Sep 19, 2020 11:32 UTC (Sat) by smcv (subscriber, #53363) [Link] (3 responses)

No, GTK doesn't have versioned symbols.

Versioned symbols have two main uses: disambiguating between incompatible SONAMEs of the same library (which is the use you have in mind here), and keeping track of which version is required.

In the GObject world, versioned symbols are not useful for the first of those uses, because the GObject type registry is a flat global namespace that is basically a hash table in libgobject-2.0.so.0, a library that hasn't broken ABI for something like 20 years. So you'd still have the same collisions with versioned symbols, but instead of being over which version of gtk_widget_activate() is the right one to find in ELF symbol lookup, they'd be over which version of g_type_from_name("GtkWidget") is the right one to find in GObject type lookup.
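
To make the "flat global namespace" point concrete, here is a toy registry keyed only by type name. It deliberately does not use GLib; `toy_type_register()` and `toy_type_owner()` are hypothetical names, and the fixed-size table is just a stand-in for the real hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 16

struct type_entry {
    const char *type_name; /* e.g. "GtkWidget" */
    const char *owner;     /* library that registered it */
};

/* One process-wide table: there is no per-library namespace. */
static struct type_entry registry[MAX_TYPES];
static size_t registry_len;

/* Returns a 1-based id on success, or 0 if the name is already taken
 * (or the table is full). */
static size_t toy_type_register(const char *type_name, const char *owner)
{
    assert(type_name != NULL && owner != NULL);
    for (size_t i = 0; i < registry_len; i++)
        if (strcmp(registry[i].type_name, type_name) == 0)
            return 0; /* name collision: the second registrant loses */
    if (registry_len == MAX_TYPES)
        return 0;
    registry[registry_len].type_name = type_name;
    registry[registry_len].owner = owner;
    return ++registry_len;
}

/* Looks a type up by name alone, which is all a flat namespace allows. */
static const char *toy_type_owner(const char *type_name)
{
    for (size_t i = 0; i < registry_len; i++)
        if (strcmp(registry[i].type_name, type_name) == 0)
            return registry[i].owner;
    return NULL;
}
```

If a hypothetical libgtk-3 registers "GtkWidget" first, a later registration of "GtkWidget" by libgtk-4 fails, and every lookup by name returns the Gtk 3 entry. ELF symbol versioning does nothing for this, because the clash is in the strings, not in the symbol table.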

GObject could in principle gain a concept of versioned type-names analogous to versioned symbols, and people have experimented with doing just that, but that would likely lead to an API and ABI break throughout the GLib-based stack, which conflicts with the fact that GLib/GObject is effectively following the glibc model for ABI compatibility (i.e. don't break ABI, ever). A SONAME bump in GObject would force every GObject-dependent library to also change its SONAME (or crash a lot, I suppose, but let's not go there). You might think moving from GTK 2 to 3 (and in future to 4) has already been a disruptive transition, but moving from GObject 2 to 3 would also involve non-GUI libraries, and lower-level-than-GTK GUI libraries like GDK-Pixbuf and Pango. That's a major reason why GLib/GObject didn't break ABI in GNOME 3.

Versioned symbols can still be useful in GObject-based libraries whose namespace conventions collide with a non-GObject library (for example json-glib, json-c and jansson, which unfortunately all use json_*), and they can still be useful to track when symbols were introduced (for example in telepathy-glib), but neither of those is immediately relevant here.

GNOME's new versioning scheme

Posted Sep 19, 2020 11:44 UTC (Sat) by nix (subscriber, #2304) [Link] (2 responses)

> GObject could in principle gain a concept of versioned type-names analogous to versioned symbols

I'm wondering if you could do that by adding a g_mangle_name() with which you could say g_mangle_name("GtkWidget", 2, 0) and get back GtkWidget@2 (the 0 is a minor number): equally you could unmangle it to find that the version of GtkWidget@2 was 2, etc. As long as version 1 had an unmangled name, it seems to me you could bring this in without disrupting existing users of (necessarily unmangled) GObject names at all, and *certainly* without the horror show which would be breaking glib's ABI.

I didn't sleep well last night so probably there is some really obvious reason why this wouldn't work which I just can't identify right now.

(... and now I'm wondering if CTF could help fix the type side of this. Alas, probably not, not unless we augment CTF with version info *and* ld.so starts using it somehow. Which is a possibility for the distant future, hmmm... another reason for me to come up with the do-less-mallocs version of libctf, since ld.so's early-operation malloc is so crude it's best to avoid doing complex patterns of temporary allocations in it.)
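
A minimal sketch of the mangling idea floated above, for illustration only. No such function exists in GLib; `toy_mangle_name()` and `toy_mangled_major()` are hypothetical, and the handling of the minor number (appended as ".minor" only when nonzero) is an assumption, since the comment only shows the "GtkWidget@2" form.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Writes the versioned type name into buf. Version 1.0 (or lower)
 * keeps the plain name, so existing users are unaffected.
 * Returns 0 on success, -1 if buf is too small. */
static int toy_mangle_name(char *buf, size_t len, const char *name,
                           unsigned major, unsigned minor)
{
    int n;

    assert(buf != NULL && name != NULL);
    if (major <= 1 && minor == 0)
        n = snprintf(buf, len, "%s", name);
    else if (minor == 0)
        n = snprintf(buf, len, "%s@%u", name, major);
    else
        n = snprintf(buf, len, "%s@%u.%u", name, major, minor);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Recovers the major version from a possibly mangled name.
 * A name without '@' is treated as version 1. */
static unsigned toy_mangled_major(const char *mangled)
{
    const char *at = strrchr(mangled, '@');

    if (at == NULL)
        return 1;
    return (unsigned)strtoul(at + 1, NULL, 10);
}
```

With this, `toy_mangle_name(buf, sizeof buf, "GtkWidget", 2, 0)` produces "GtkWidget@2", and `toy_mangled_major("GtkWidget")` reports 1. Whether '@' is a safe separator for type names is a separate question that this sketch does not settle.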

GNOME's new versioning scheme

Posted Sep 21, 2020 14:02 UTC (Mon) by smcv (subscriber, #53363) [Link] (1 responses)

> I'm wondering if you could do that by adding a g_mangle_name() with which you could say g_mangle_name("GtkWidget", 2, 0) and get back GtkWidget@2 (the 0 is a minor number)

This is more or less how the experiments that were done in the past worked (I think they might have just hard-coded the name-mangling as a proof of concept, but that's an implementation detail).

> As long as version 1 had an unmangled name, it seems to me you could bring this in without disrupting existing users of (necessarily unmangled) GObject names

For *some* users, yes. The experiments I saw were on GTK 2/3 compatibility (back when GTK 2 was still maintained) so they assumed that significant changes to GTK 2 and GTK 3 source code wouldn't be allowed, and they also made the simplifying assumption that the versioned naming would have to apply globally, to all GObject libraries; but you're right to say that continuing to use the simple names for existing libraries, and only bringing in the versioned names on an opt-in basis for new SONAMEs, would be a migration strategy less likely to cause regressions.

It might already be too late for that in GTK 4, but perhaps it's viable for GTK 5, or for the next time Evolution breaks ABI.

For GObject users that have to be able to deal with unknown types generically, I could see the mangled names still causing breakage: for example if code generation tools like Vala assume that the type named "GtkWidget" is a "GtkWidget *" in C source code, that assumption will no longer hold. (But it already wasn't true for some particularly creative/evil/broken GObject libraries, like dbus-glib with its parameterized types.)

GNOME's new versioning scheme

Posted Sep 21, 2020 14:30 UTC (Mon) by nix (subscriber, #2304) [Link]

> For GObject users that have to be able to deal with unknown types generically, I could see the mangled names still causing breakage: for example if code generation tools like Vala assume that the type named "GtkWidget" is a "GtkWidget *" in C source code, that assumption will no longer hold. (But it already wasn't true for some particularly creative/evil/broken GObject libraries, like dbus-glib with its parameterized types.)

Yeah, some things would definitely need adjusting to know that mangling was *possible*. (This was, of course, true when ELF gained symbol versioning, too: some things needed to know what the @ and @@ in symbol names meant. But most things didn't need to know at all.)

GNOME's new versioning scheme

Posted Sep 19, 2020 13:00 UTC (Sat) by ebassi (subscriber, #54855) [Link] (1 responses)

> any process that depends directly or transitively on both that library and Gtk 3 will suffer symbol name clashes between the two Gtks, a pile of unintended interposition, and likely disaster (so this also happens if you port any application to Gtk 4, if any of its libs still uses Gtk 3, and vice versa)

Yes, we know. That's why you cannot load different versions of GTK in the same process; as soon as you do, you will hit an assertion failure, and since initialising GTK must be the first thing that you do, it'll happen pretty much instantaneously.

It's not our first rodeo.
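
One way such a startup check could work is to ask the dynamic loader whether a symbol that only the other major version exports is already visible in the process. The sketch below is not GTK's code; `symbol_is_loaded()` and `other_gtk_present()` are hypothetical helpers, and the choice of `gtk_progress_get_type` as a Gtk-2-only marker symbol is an assumption. On older glibc this needs linking with -ldl.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if `symbol` can be resolved from the main program's
 * global scope, i.e. some already-loaded object exports it. */
static bool symbol_is_loaded(const char *symbol)
{
    void *self;
    bool found;

    assert(symbol != NULL);
    self = dlopen(NULL, RTLD_LAZY); /* handle for the main program */
    if (self == NULL)
        return false;
    found = dlsym(self, symbol) != NULL;
    dlclose(self);
    return found;
}

/* Assumption: this symbol is exported by Gtk 2 and not by later
 * versions, so finding it means a second Gtk is in the process. */
static bool other_gtk_present(void)
{
    return symbol_is_loaded("gtk_progress_get_type");
}
```

An initialisation function would call something like `other_gtk_present()` first and abort with a clear message if it returns true, which turns a subtle interposition bug into an immediate, explainable failure.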

GNOME's new versioning scheme

Posted Sep 19, 2020 14:15 UTC (Sat) by nix (subscriber, #2304) [Link]

Oh good! In hindsight I have dim memories of hitting this, but it was so long ago I'd forgotten.

