
OT: dependency and ABI mismanagement

Posted Sep 25, 2010 12:52 UTC (Sat) by tialaramex (subscriber, #21167)
In reply to: Carrez: The real problem with Java in Linux distros by rgmoore
Parent article: Carrez: The real problem with Java in Linux distros

The soname changes. That has nothing to do with packaging, sloppy or otherwise. The runtime linker assumes that libfoo.so.18 and libfoo.so.19 have different ABIs since the _whole point_ of the soname versioning is to manage this ABI compatibility.

Someone in the GNOME project (and most likely, specifically Evolution) had something so vitally important they wanted to change, that it was worth throwing away compatibility.

There's a good chance it didn't really touch compatibility. In my experience, the majority of small library owners work on a mixture of superstition, urban legend and outright guesswork to manage their ABI. Some may believe re-ordering public structures to be prettier is harmless (yes, that's why some libpng versions were incompatible despite claiming the same soname...) while others imagine that renaming a structure member needs an ABI bump. Nobody like that _should_ be managing code in your out-of-box GNOME install, but we rely on volunteers, and since I'm not volunteering to go fix this I can't complain (well, clearly I do, but arguably it's not fair to).
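As a rough illustration of the difference between those two cases, here is a minimal C sketch. The struct names and members are invented for this example and are not taken from libpng or any real library. Reordering members moves them to different offsets, so already-compiled callers read the wrong bytes; renaming a member changes nothing that a compiled binary can observe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical public struct as shipped in release 1. */
struct image_v1 {
    int  width;
    int  height;
    char depth;
};

/* Release 2 "tidies up" by moving depth first: same members, new layout. */
struct image_reordered {
    char depth;
    int  width;
    int  height;
};

/* Alternative release 2: only a member name changes (depth -> bit_depth). */
struct image_renamed {
    int  width;
    int  height;
    char bit_depth;
};

/* Returns 1 if an old binary would look for `width` at the wrong offset. */
int reorder_breaks_abi(void)
{
    return offsetof(struct image_v1, width)
        != offsetof(struct image_reordered, width);
}

/* Returns 1 if the rename left the size and every member offset unchanged. */
int rename_keeps_abi(void)
{
    return sizeof(struct image_v1) == sizeof(struct image_renamed)
        && offsetof(struct image_v1, width)  == offsetof(struct image_renamed, width)
        && offsetof(struct image_v1, height) == offsetof(struct image_renamed, height)
        && offsetof(struct image_v1, depth)  == offsetof(struct image_renamed, bit_depth);
}

/* A check of this kind could sit in a library's test suite before a release. */
void check_rename_is_safe(void)
{
    assert(rename_keeps_abi());
}
```

The rename still breaks source compatibility for anyone who touches the member by name, which is a separate problem from the soname.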



OT: dependency and ABI mismanagement

Posted Sep 27, 2010 3:15 UTC (Mon) by cmccabe (guest, #60281) [Link] (9 responses)

> The soname changes. That has nothing to do with packaging, sloppy or
> otherwise. The runtime linker assumes that libfoo.so.18 and libfoo.so.19
> have different ABIs since the _whole point_ of the soname versioning is to
> manage this ABI compatibility.

Copied from stackoverflow.com (I couldn't find a HOWTO for some reason):

> The way you're supposed to form the x.y.z version is like this:
>
> 1. The first number (x) is the interface version of the library.
> Whenever you change the public interface, this number goes up.
> 2. The second number (y) is the revision number of the current
> interface. Whenever you make an internal change without changing the
> public interface, this number goes up.
> 3. The third number (z) is not a build number, it is the
> backwards-compatibility count. This tells you how many previous interfaces
> are supported. So for example if interface version 4 is strictly a
> superset of interfaces 3 and 2, but totally incompatible with 1, then z=2
> (4-2 = 2, the lowest interface number supported)

So if the developer bumped the major version number from libfoo.so.18.0.0 to libfoo.so.19.0.0, he basically waved a big red flag saying "ABI change!" In theory, at least.
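The arithmetic in the quoted scheme can be sketched in a few lines of C. This only models the x.y.z rule as the quote describes it; the struct and function names are made up for the example, and it says nothing about how any particular build tool encodes these numbers in a file name.

```c
#include <assert.h>

/* x.y.z as described in the quoted answer. */
struct lib_version {
    int iface;     /* x: interface version */
    int revision;  /* y: internal revision of that interface */
    int compat;    /* z: how many previous interfaces are still supported */
};

/* Lowest interface number this library still supports: x - z. */
int lowest_supported(struct lib_version v)
{
    assert(v.compat >= 0 && v.compat <= v.iface);
    return v.iface - v.compat;
}

/* Returns 1 if a program built against interface `wanted` can use `v`. */
int supports_interface(struct lib_version v, int wanted)
{
    return wanted >= lowest_supported(v) && wanted <= v.iface;
}
```

With the quote's own numbers (x=4, z=2) interfaces 2, 3 and 4 are accepted and 1 is rejected. A library at 19.0.0 has z=0, so a program built against interface 18 is refused, which is the red flag described above.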

OT: dependency and ABI mismanagement

Posted Sep 27, 2010 10:35 UTC (Mon) by tialaramex (subscriber, #21167) [Link]

I agree that's what it means when you bump this number.

But riddle me this: if the change was so major as to need this ABI change, why doesn't it deliver even a single new feature worth telling the world about in the release notes?

Imagine if libc took this approach: "download all new apps, we changed the order of the structure members in struct sigaction because we think the mask should be first", and then next week "sorry, download fresh again, this time we decided stat should put the inode number before the device ID...", and the week after "the arguments to recvmsg are re-ordered, and it was renamed recvmessage, we supply a macro so that your code will still build, but existing binaries no longer work".

OT: dependency and ABI mismanagement

Posted Sep 27, 2010 13:25 UTC (Mon) by paulj (subscriber, #341) [Link] (7 responses)

Soname versioning is pretty crude.

ELF now has more fine-grained versioning systems that work at the symbol level, and even allow one library to support multiple *incompatible* versions of a symbol at the same time. This is used, e.g., by Glibc. The symbol versions are specified in a linker map. There are very few reasons to break compatibility with a previous, stable interface once you use symbol versioning.

It's "just" a question of spending a little extra time on paying attention to the compatibility issues.

OT: dependency and ABI mismanagement

Posted Sep 29, 2010 6:22 UTC (Wed) by jamesh (guest, #1159) [Link] (6 responses)

ELF symbol versioning only really works well when the entry points you are versioning use simple types as inputs and outputs.

As the types get more complex, it becomes harder to support multiple versions of those data structures within the same library. And object-oriented designs that make use of inheritance are probably the most difficult (as found in most C++ projects and GLib/GObject-based projects).

It is possible to design the data structures so they can be extended without breaking compatibility (e.g. GTK has maintained ABI compatibility for quite a long time, despite extensive changes to some widgets), but people don't always follow those guidelines, or get things wrong the first time. If the library is high enough in the stack with few users, the developers might not even feel it worthwhile to plan for future changes and just bump the soname when needed.
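One common way to leave room for extension is to ship spare slots at the end of a public structure and give them a purpose later. The sketch below is a generic C illustration with invented names; it is not GTK's actual class layout.

```c
#include <assert.h>
#include <stddef.h>

/* Release 1: hypothetical class struct with two spare slots at the end. */
struct widget_class_v1 {
    void (*draw)(void);
    void (*resize)(void);
    void (*reserved1)(void);
    void (*reserved2)(void);
};

/* Release 2: one spare slot gets a name; nothing else moves. */
struct widget_class_v2 {
    void (*draw)(void);
    void (*resize)(void);
    void (*focus)(void);      /* was reserved1 */
    void (*reserved2)(void);
};

/* Returns 1 if size and all member offsets are identical across releases. */
int layout_unchanged(void)
{
    return sizeof(struct widget_class_v1) == sizeof(struct widget_class_v2)
        && offsetof(struct widget_class_v1, draw)
           == offsetof(struct widget_class_v2, draw)
        && offsetof(struct widget_class_v1, resize)
           == offsetof(struct widget_class_v2, resize)
        && offsetof(struct widget_class_v1, reserved1)
           == offsetof(struct widget_class_v2, focus)
        && offsetof(struct widget_class_v1, reserved2)
           == offsetof(struct widget_class_v2, reserved2);
}

/* A release check along these lines catches accidental layout changes. */
void check_reserved_slot_reuse(void)
{
    assert(layout_unchanged());
}
```

The cost is that the number of spare slots has to be guessed up front; once they run out, the next addition changes the size of the structure.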

OT: dependency and ABI mismanagement

Posted Sep 29, 2010 11:01 UTC (Wed) by paulj (subscriber, #341) [Link] (5 responses)

Yeah, data type compatibility takes a little bit of care.

Even if you must introduce a new, incompatible data type, there's still no reason why your library cannot support the same (runtime) call using both old and new data types. The old symbol, to which old binaries bind, simply expects the old data type - and the new symbol the new data type.

Compile time backward compatibility may require a little extra work again, of course, but it's not rocket science.

Have a read of the Solaris and GNU linker documentation on symbol version scripts/maps. It's a pretty powerful mechanism. Solaris makes heavy use of them, given Sun's strong desire to have binary compatibility as a feature (also requires carefully documenting what guarantees you make for the stability of interfaces, and testing). It's a pretty old feature too...

The trouble is, this is effort and work that benefits unknown users - it doesn't immediately benefit the developer much and it's not much fun. So it usually simply doesn't get done in the free software world, other than exceptions like, e.g., projects where there's a corporate sponsor to provide a focus on customer experience.
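A minimal C sketch of that idea follows, with invented type and function names. The two entry points carry distinct C names here so the example builds as ordinary code. In a real shared library built with the GNU toolchain, both would typically be exported under one public name at different version nodes, using `.symver` directives plus a linker version script; that part is left out because it needs a shared-library build.

```c
#include <assert.h>
#include <stddef.h>

/* Old public type: what binaries built against release 1 pass in. */
struct point_v1 { int x, y; };

/* New, incompatible type: release 2 switched to doubles and added z. */
struct point_v2 { double x, y, z; };

/* New entry point: new binaries would bind to this one. */
double point_sum_v2(const struct point_v2 *p)
{
    assert(p != NULL);
    return p->x + p->y + p->z;
}

/* Old entry point: keeps the old signature, converts, and forwards. */
int point_sum_v1(const struct point_v1 *p)
{
    struct point_v2 wide;

    assert(p != NULL);
    wide.x = p->x;
    wide.y = p->y;
    wide.z = 0.0;   /* the old type had no z */
    return (int)point_sum_v2(&wide);
}
```

Only the thin old wrapper has to know about the old layout; the real logic lives in one place.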

From a quick look with readelf at my local GTK+ library, it doesn't look like GTK+ uses symbol versioning.

OT: dependency and ABI mismanagement

Posted Oct 5, 2010 22:00 UTC (Tue) by nix (subscriber, #2304) [Link]

Another exception: projects where ABI breaks would be hell to fix, and where everyone knows that. (e.g. glibc.)

I suspect the X libraries don't use it because *introducing* symbol versioning would itself break the ABI, and the current ABI of libX11 et al predates symbol versioning by years.

OT: dependency and ABI mismanagement

Posted Oct 6, 2010 9:31 UTC (Wed) by jamesh (guest, #1159) [Link] (3 responses)

While providing two versions of the function is certainly possible, I think you are misjudging the effort to do so.

Using GTK as an example, if we made an incompatible change to the GtkWidget structure, there are 179 gtk_widget_* symbols that we'd need two versions for.

Now every widget in the library (and every library built on top of GTK) embeds the GtkWidget structure, so we would need two versions in order to support both the old and new API. There are more than 3800 symbols in GTK alone, so this is not a small job. If my application uses any libraries built on top of GTK, they will need to be updated in a similar way to support the new GtkWidget data type if I am to use the new version of the API.
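The knock-on effect of embedding can be shown with a small C sketch. The structs are invented stand-ins, not the real GtkWidget: when the embedded parent grows, every member of every derived struct shifts, even though the derived code itself did not change.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical base type, before and after an incompatible change. */
struct base_v1 { int id; };
struct base_v2 { int id; int flags; void *extra; };

/* A derived type embeds its parent as the first member. */
struct button_v1 { struct base_v1 parent; int pressed; };
struct button_v2 { struct base_v2 parent; int pressed; };

/* Returns 1 if `pressed` sits at a different offset after the base grew. */
int derived_layout_shifted(void)
{
    /* The parent stays at offset 0, so a button pointer is still a base
       pointer; it is everything after the parent that moves. */
    assert(offsetof(struct button_v1, parent) == 0);
    assert(offsetof(struct button_v2, parent) == 0);
    return offsetof(struct button_v1, pressed)
        != offsetof(struct button_v2, pressed);
}
```

Any function that touches `pressed` directly, in this library or one built on top of it, would need a second version to serve old binaries.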

Granted the problems are smaller if the incompatible change is made further down the class hierarchy, but I hope this explains why symbol versioning isn't the first tool developers reach for in these cases.

OT: dependency and ABI mismanagement

Posted Oct 6, 2010 19:32 UTC (Wed) by paulj (subscriber, #341) [Link] (2 responses)

For that kind of wide-ranging change, I agree it might be classed as one of the "few legit reasons to break compatibility". Though, you could take the approach of making a new library; compiling old and new separately; then combining 2 objects for each library into one appropriately symboled one (I think you might have to write your own ELF tool to remap the symbols of one library though, if such doesn't already exist).

However, you're mistaken that the applications must be updated. You can retain *source* compatibility even if binary compatibility is broken in some way. I.e. you're assuming the old GtkWidget definition retains that name and the new one gets a new name. However, you can also rename the _old_ definition (GtkWidgetOld or GtkWidget2_2) and have the new definition use the well-known source-level name, presuming it is still source compatible. With linker maps you can direct old apps (compiled with the old GtkWidget definition, i.e. GtkWidgetOld when it was still called GtkWidget) to functions that expect GtkWidgetOld. There is no requirement at all that the name of the structure be the same in the caller and the function, it's not part of the ABI.

Solaris made heavy use of this kind of stuff to preserve runtime compatibility even as data types could be changed incompatibly without changing source-level name (be it changed by default, or changed in the presence of whatever feature selection defines). Glibc probably does too.

OT: dependency and ABI mismanagement

Posted Oct 7, 2010 9:58 UTC (Thu) by jamesh (guest, #1159) [Link]

My point about applications was when they make use of multiple libraries providing GTK widgets.

Since this thread started on evolution-data-server, consider an application using one of the widgets from the libedataserverui library. If GTK broke the ABI of GtkWidget, you would need a new version of the libedataserverui widgets to use with the new GtkWidget ABI. If that was not available, then your app would need to use the old GTK ABI.

As I said previously, these sorts of ABI breakages are quite painful, so effort is made to avoid them. For GTK itself we've maintained compatibility for 8 years, so it certainly is possible (although it is a bit painful at times).

Would it be nice if evolution-data-server went through fewer ABI breakages? Sure, but I don't think symbol versioning would solve the problem.

OT: dependency and ABI mismanagement

Posted Oct 9, 2010 22:14 UTC (Sat) by nix (subscriber, #2304) [Link]

glibc can do it *better* than Solaris, as the GNU linkers have default symbol versions, which Solaris does not (or didn't last I checked, but that was way back in the Solaris 9 days, maybe they added it in 10 or OpenSolaris?)

