C library system-call wrappers, or the lack thereof

Posted Nov 13, 2018 19:16 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
In reply to: C library system-call wrappers, or the lack thereof by rweikusat2
Parent article: C library system-call wrappers, or the lack thereof

> That's a policy you may consider sensible but the people who work "on the freaking libc" apparently don't. Eg, bugs are user-visible behaviour. Some people want them fixed.
That's because glibc is written by very, very misguided people. Seriously.

There are non-idiotic libcs out there and somehow they work just fine without breaking stuff. Then we have Linux itself, which somehow manages to preserve an even more complicated ABI than glibc's.

> This is not "my model" as I'm not the GNU C library but that's how the model of the GNU C library is supposed to work: A program linked against version a.b.c of symbol frxblz[*] will use version a.b.c of symbol frxblz even if a newer version d.e.f is also available. Which wouldn't be possible if the soname had been changed instead.
You can add new symbols if you must. Windows does this (WriteFile, WriteFileEx, etc). But nothing requires you to change the user-visible behavior of existing symbols.

C library system-call wrappers, or the lack thereof

Posted Nov 13, 2018 20:06 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (27 responses)

> You can add new symbols if you must. Windows does this (WriteFile, WriteFileEx, etc).

That's just ad-hoc symbol versioning. Programs which use the new symbols still won't run on older systems. It's strictly worse than what glibc does.

As for your "simple program that uses ages-old epoll/read/print functions", the symbol versions for these APIs in glibc 2.27 are as follows:

epoll_create, epoll_wait, epoll_ctl - GLIBC_2.3.2
read, printf - GLIBC_2.2.5

That means any application using *only* these symbols would be able to run without modification on RHEL-3.9 (glibc 2.3.2). RHEL6 should not be a problem.
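
As a rough illustration of the version arithmetic above, here is a minimal sketch in C. It only compares version numbers the way a developer might when deciding whether a target system is new enough; it is not how the dynamic linker matches version nodes. The helper name `glibc_at_least` is hypothetical. On a glibc system the version string could be obtained from `gnu_get_libc_version()` in `<gnu/libc-version.h>`.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if a "major.minor[.patch]" version string
 * is at least need_major.need_minor, and 0 otherwise (including when the
 * string cannot be parsed). */
int glibc_at_least(const char *version, int need_major, int need_minor)
{
    int major = 0, minor = 0;

    if (version == NULL || sscanf(version, "%d.%d", &major, &minor) != 2)
        return 0;
    if (major != need_major)
        return major > need_major;
    return minor >= need_minor;
}
```

RHEL 6 ships glibc 2.12, so a binary whose newest required symbol version is GLIBC_2.3.2 passes a check like `glibc_at_least("2.12", 2, 3)`.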

C library system-call wrappers, or the lack thereof

Posted Nov 13, 2018 21:18 UTC (Tue) by ldarby (guest, #41318) [Link] (26 responses)

The common problem that I suspect Cyberax is actually moaning about arises when software uses other calls such as memcpy(), which on CentOS 7 picks up the symbol version GLIBC_2.14:

readelf -a foo | grep memcpy
000000601020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 memcpy@GLIBC_2.14 + 0
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@GLIBC_2.14 (3)
55: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@@GLIBC_2.14

and this doesn't work on CentOS 6:

ldd ./foo
./foo: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./foo)
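
One commonly used workaround, sketched here under the assumption of 64-bit x86 glibc, is to bind the reference to the older `memcpy@GLIBC_2.2.5` with a `.symver` directive so that the binary no longer requires `GLIBC_2.14`. The wrapper name `copy_bytes` is made up for illustration, and the guard macros are only a guess at a reasonable way to keep other platforms on their default binding.

```c
#include <string.h>

/* Assumption: 64-bit x86 glibc, where the pre-2.14 memcpy is still exported
 * under the version GLIBC_2.2.5. Other platforms keep the default binding. */
#if defined(__GLIBC__) && defined(__x86_64__) && defined(__LP64__)
__asm__(".symver memcpy,memcpy@GLIBC_2.2.5");
#endif

/* Hypothetical wrapper; every memcpy call in this translation unit
 * is affected by the directive above, not just this one. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}
```

If the directive took effect, running readelf on the resulting binary should show memcpy@GLIBC_2.2.5 rather than memcpy@GLIBC_2.14.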

C library system-call wrappers, or the lack thereof

Posted Nov 13, 2018 22:12 UTC (Tue) by rweikusat2 (subscriber, #117920) [Link] (22 responses)

That was eight years ago.

It's also unquestionable that increasing the symbol version was justified as this was a change which did cause "really existing software" to break.

But this is really a problem with no good solution, only different tradeoffs which all end up being detrimental to someone. Either no code which was released must ever be changed in a new version, as applications will depend on undocumented and even on unintentional properties of it. Or loads of existing binaries suddenly break in interesting ways and there's the very real possibility that "recompile the code" is not an option. Or people developing new code, who can recompile it because they have both the code and the necessary tools/environment available, must restrict themselves to actually using systems said to be supported.

As to the change in question: My opinion on this is that this here

#include <stddef.h>

void *memcpy(void *dst, const void *src, size_t n)
{
    char *d;
    const char *s;

    d = dst;
    s = src;
    while (n) --n, d[n] = s[n];    /* copies from the last byte down to the first */

    return dst;
}

is a nice, simple and portable implementation of memcpy and instead of terror-optimizing this algorithm using whatever "latest and greatest" CPU support for absurdly large block memory copies happens to be available, one should stick to something like this as default implementation and leave it to people to whom absurdly large block memory copies are a real performance problem to figure out how they can either avoid these or speed them up.

But that's just my opinion and certainly not a universal consensus.

C library system-call wrappers, or the lack thereof

Posted Nov 13, 2018 22:42 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (16 responses)

> It's also unquestionable that increasing the symbol version was justified as this was a change which did cause "really existing software" to break.
This just means you need to create another function - "memcpy_fast" or whatever.

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 0:19 UTC (Wed) by nybble41 (subscriber, #55106) [Link] (15 responses)

> This just means you need to create another function - "memcpy_fast" or whatever.

There is already a function with the semantics of "memcpy_fast" included in the C standards. It's called "memcpy". There is approximately zero chance that a new, non-portable alternative to memcpy() would have been introduced just to maintain compatibility with applications abusing memcpy() in situations where they should have employed memmove().

Even if they did take that approach programs would need to be modified to use the new API, which would be a colossal undertaking as well as a major step backward in terms of portability. The main performance benefit of eliminating the extra branch required for memmove() semantics comes from the myriad *small* memory copies performed throughout any non-trivial application. The change would be mostly pointless if it didn't automatically encompass all existing memcpy() users.

Without symbol versioning the most likely resolution would have been to simply let the non-compliant programs break.
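
To make the "extra branch" concrete, here is a minimal byte-wise sketch in C of memmove-style copying. It is illustrative only and not glibc's implementation; the name `move_bytes` is hypothetical. The direction test at the top is what a plain memcpy is allowed to skip.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative only: copy n bytes correctly even when the regions overlap. */
void *move_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* Forward copy: safe when dst starts below src. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* Backward copy: safe when dst starts at or above src. */
        while (n--)
            d[n] = s[n];
    }
    return dst;
}
```

A memcpy that always copies in one direction gives the right answer for only one of these two overlap cases, which is why a change of copy direction can break programs that had been working by accident.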

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 0:33 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (14 responses)

> There is already a function with the semantics of "memcpy_fast" included in the C standards. It's called "memcpy". There is approximately zero chance that a new, non-portable alternative to memcpy() would have been introduced just to maintain compatibility with applications abusing memcpy() in situations where they should have employed memmove().
This is EXACTLY why a new function should have been introduced. Handling of overlapping became an implicit part of the memcpy interface. It shouldn't have but it did.

So the ONLY correct choice is to stick with it and improve the language standard to include a new function that explicitly defines the handling of overlaps.

> Even if they did take that approach programs would need to be modified to use the new API,
And that's fine. Nobody died because one optimization of memcpy became impossible.

> which would be a colossal undertaking as well as a major step backward in terms of portability.
Nonsense. Other systems either have the implicit overlap handling guarantees of memcpy or the software is already non-portable to them.

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 12:51 UTC (Wed) by nix (subscriber, #2304) [Link] (13 responses)

> This is EXACTLY why a new function should have been introduced. Handling of overlapping became an implicit part of the memcpy interface. It shouldn't have but it did.

Ah, but introducing new functions is *also* a compatibility problem, because there is only one namespace for symbols, and now you will collide, either at compile time or runtime, with other programs already using the function you chose. (When getline() was introduced, it broke compilation of a *lot*. But only compilation, because glibc was not written by idiots: at runtime it still worked, where your proposals would fail utterly. It appears you don't know the ELF symbol resolution rules.)

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 19:00 UTC (Wed) by quotemstr (subscriber, #45331) [Link] (12 responses)

We should move to a two-level namespace like macOS and Windows use. Instead of importing symbol X, you import symbol X *from some specific library* Y. This way, you only need (X, Y) pairs to be unique, not X generally. This approach resolves a lot of weird symbol conflict issues. It breaks LD_PRELOAD-style symbol interposition, but I think that's a misfeature anyway.

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 21:07 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (10 responses)

Note that for things like plugins that can be loaded by multiple application binaries (e.g., Python modules), you should not directly link to an implementation of the plugin API. In the case of Python, the application provides the core Python API symbols and the module should just expect them to exist at runtime. Otherwise you have the problem of a Python module being compiled against the macOS libPython.dylib and then it's useless for macports' Python since then you're mixing Python interpreters. Fun for everyone involved when you have to recompile Python modules to use a different application.

The problem is spelled out here:

https://blog.tim-smith.us/2015/09/python-extension-module...

where it is hacked around by just using `-undefined dynamic_lookup` and not linking libPython.dylib so that it's just found at runtime. But, this also means that all missing symbols are ignored until runtime. I have this patch:

https://github.com/mathstuf/ld64/commit/4eebe0c07e8ab706e...

which is a better fix, but I don't know how to convince Xcode to build the damn project and I need to figure out how/where to send the patch to get it reviewed by Apple.

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 10:51 UTC (Thu) by lgerbarg (guest, #57988) [Link] (9 responses)

It is nice to see someone working on ld64 patches, but ld64 can already build a bundle with the semantics you are trying to achieve. From the ld(1) man page on macOS:

-bundle_loader executable
This specifies the executable that will be loading the bundle output file being linked. Undefined symbols from the bundle are checked
against the specified executable like it was one of the dynamic libraries the bundle was linked with.

This will find exported symbols in the main executable and encode the binds such that dyld will apply them to whatever the main executable in the current process is. This has the same runtime semantics that the linked patch is trying to achieve, except in the error case that occurs if a newer version of the interpreter accidentally removes an exported symbol that your bundle needs. Using -bundle_loader, it will fail immediately after searching the main executable, instead of continuing to search other images in the process. Given that you know the symbols are supposed to be provided by the interpreter, that is probably the preferable behavior.

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 18:15 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (8 responses)

Hmm. I'll have to experiment with that. Thanks for the pointer.

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 19:07 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (7 responses)

Using `-bundle_loader` doesn't work (at least as you've said):

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -Dmod_EXPORTS -I/System/Library/Frameworks/Python.framework/Versions/2.7/Headers -fPIC -MD -MT CMakeFiles/mod.dir/mod.c.o -MF CMakeFiles/mod.dir/mod.c.o.d -o CMakeFiles/mod.dir/mod.c.o -c ../mod.c

makes this:

$ otool -L mod.so
mod.so:
/System/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.10)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)

which still references a specific Python and is wrong anyways:

$ /opt/local/bin/python2.7 -m mod # Use a MacPorts Python
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python: No code object available for mod

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 23:25 UTC (Thu) by lgerbarg (guest, #57988) [Link] (6 responses)

It definitely works the way I described, and has since 10.1; it is how plugins for apps like Photoshop are built and remain usable across multiple revisions of those apps. I can't tell you exactly what is going on in this case, because the line you pasted is not the linker invocation but a compiler invocation (it creates a .o from a .c file, does not create the final MH_BUNDLE, and does not actually have the -bundle_loader flag).

If you can find the actual linker invocation (either the driver with -Wl,-bundle_loader, or the actual call through to ld64) I can probably tell you what's going wrong with the invocation, though off the top of my head I have no idea how to get CMake to pass through the correct flags.

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 23:33 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (5 responses)

Oops, sorry. I did mean to grab the link line. Neither of these work (both come back with the same "no code object" error message):

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -bundle -Wl,-headerpad_max_install_names -o mod.so CMakeFiles/mod.dir/mod.c.o -bundle_loader /System/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.dylib

/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -bundle -Wl,-headerpad_max_install_names -o mod.so CMakeFiles/mod.dir/mod.c.o -Wl,-bundle_loader,/System/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.dylib

C library system-call wrappers, or the lack thereof

Posted Nov 16, 2018 21:53 UTC (Fri) by lgerbarg (guest, #57988) [Link] (4 responses)

No problem. You are pointing -bundle_loader at a dylib; it should be pointed at an MH_EXECUTABLE. Maybe I misunderstood, but I thought you said the symbols were exported from the python executable itself?

C library system-call wrappers, or the lack thereof

Posted Nov 16, 2018 22:06 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)

The symbols are provided to the module by loading a libpython.dylib first. The executable has barely any symbols in it at all. When I pass the executable instead, the linker complains that the symbols aren't defined.

C library system-call wrappers, or the lack thereof

Posted Nov 16, 2018 22:08 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (2 responses)

And it is libpython.dylib that does the dlopen, not the executable. I guess the flag is meant for static compilation?

C library system-call wrappers, or the lack thereof

Posted Nov 18, 2018 21:12 UTC (Sun) by lgerbarg (guest, #57988) [Link] (1 responses)

The flag is meant to allow plugins to refer back to symbols exported by an application in order to support a plugin API. Where the dlopen() happens doesn't really impact it, but the fact that the API is exported by a dylib rather than the interpreter itself is what complicates the situation. I could go into a bunch of historical reasons for why it behaves that way, but the short answer is that it dates back to classic Mac OS and the plugin mechanisms used there.

Now that I understand what you are trying to do a bit more clearly (sorry about the confusion) -bundle_loader is not appropriate unless you can re-export the symbols from the main executable. I think that would be the best option from a technical perspective, but it is probably unreasonable since it would require changes and back ports to all of the python interpreters.

Assuming that you need to get this to work without making changes to python itself, I think you should pass ld "-undefined dynamic_lookup". It is kind of gross, but it should work for your use case. If there is ever a desire to improve this behavior for future pythons, there are two changes to the interpreter that would make modules work better:

1) Reexporting the symbols from libpython out of the python interpreter itself and using -bundle_loader
or
2) Renaming or symlinking the library exporting the API to have an unversioned name (like "libpython.dylib") and then having python set an LC_RPATH to the directory containing the dylib/symlink. Then modules could link to them via "@rpath/libpython.dylib" .

C library system-call wrappers, or the lack thereof

Posted Nov 20, 2018 15:16 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Hmm. OK. So how is this meant to work with something like ParaView which supports different applications (for different use cases) and uses shared libraries to hold the actual API (and I mean that there can be 100+ of them)? What's the best way to reexport symbols of all linked libraries from an executable?

> I think you should pass ld "-undefined dynamic_lookup." It is kind of gross

That works for today. And projects which don't only target macOS are almost never going to do what seems like a bad legacy behavior. The "gross" part of this solution is what led to the patch to ld64.

> Renaming or symlinking the library exporting the API to have an unversioned name (like "libpython.dylib") and then having python set an LC_RPATH to the directory containing the dylib/symlink. Then modules could link to them via "@rpath/libpython.dylib" .

The name of the file doesn't matter. The binary must be edited using `install_name_tool -id` to change what gets embedded in the linking binary. And RPATH on macOS is awful. It only applies if explicitly requested and if something says "use me via rpath", no tools (I know of) add the paths automatically and they must be manually added.

C library system-call wrappers, or the lack thereof

Posted Nov 15, 2018 16:13 UTC (Thu) by nix (subscriber, #2304) [Link]

You can do that already with DT_GROUP -- but that wouldn't help in this case, because the conflict was often at *compile-time*: two C identifiers named getline() with different semantics (sometimes differing prototypes, sometimes one of them wasn't a function at all but a variable or something).

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 0:29 UTC (Wed) by ldarby (guest, #41318) [Link] (3 responses)

> It's also unquestionable that increasing the symbol version was justified as this was a change which did cause "really existing software" to break.

Not sure I agree with that. If Linus "We don't break userspace" Torvalds had won the argument over Ulrich "buggy programs cannot be allowed to prevent a correct implementation" Drepper (sorry, paraphrasing), they would have just made a 2nd change of aliasing memcpy to memmove, un-breaking the buggy programs and not had any of this version hassle to deal with now.

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 16:38 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link] (2 responses)

Again, a solution to this is surprisingly simple: Create a file with the following content:

#include <stdio.h>
#include <string.h>

/* Interposed via LD_PRELOAD: forward every memcpy call to memmove. */
void *memcpy(void *d, const void *s, size_t n)
{
    puts("honk");
    return memmove(d, s, n);
}

compile it with

gcc -fpic -shared -o x.so x.c

and use it together with another compiled program calling memcpy via LD_PRELOAD. Voila --- memcpy aliased to memmove without penalizing code not relying on undefined behaviour (the puts obviously just exists to demonstrate that the function is indeed being called).

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 19:54 UTC (Wed) by ldarby (guest, #41318) [Link] (1 responses)

Not sure what you're trying to achieve here. That fixes neither the "`GLIBC_2.14' not found" error nor the general case of random buggy software that's used by non-technical users, who wouldn't have a damn clue what this "surprisingly simple" solution even means.

C library system-call wrappers, or the lack thereof

Posted Nov 14, 2018 20:01 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link]

This would transparently cause such a "random buggy binary" to call memmove instead of memcpy, hence sidestepping the issue when it exists instead of forcing all users of memcpy to call memmove instead.

The "non-technical users" also "don't have a damn clue" how the other software they're using works. But as they're just using it and not developing it, this should be ok, don't you think so?

C library system-call wrappers, or the lack thereof

Posted Nov 18, 2018 10:07 UTC (Sun) by roblucid (guest, #48964) [Link]

But the idea of a simple, universal default implementation in C defeats the purpose of a library providing added value, whether through a superior implementation or simply because it can be compiled and installed knowing exactly which hardware and OS it is running on.

Applications were in the past often shared over a network, so the executables ran on different CPUs and OS versions. Even Linux distros would install differing versions of glibc, to allow the library to use Pentium 2 features whilst the installed packages were universal i386.

From my experience, the problem raised is moot: the error is simply due to an improper build environment, which needs to produce executables linked for the oldest system to be supported. On openSUSE there was a way of installing LCD library versions to build packages against. Later, a build service could build native packages for various systems, making it unnecessary to bother with that locally.

If an application stops working with a new version of a library, the cause is typically a bug: reliance on undocumented features. The mentioned example of applications unportably calling memcpy rather than memmove is such a case. Where libraries change their ABI, their names change, e.g. KDE3 to KDE4 or the big changes to GTK. Calling functions not present in older target systems is again a portability error.

memcpy re-versioning

Posted Nov 14, 2018 12:41 UTC (Wed) by smurf (subscriber, #17840) [Link] (2 responses)

Happiness.

IMHO, any program that depends on *documented* nonsense like overlapping ranges in memcpy() (the manpage says "must not", dammit) deserves to die a fiery death and should suffer a segfault (triggered by a check in memcpy), instead of saddling everybody with a symbol version uptick that blocks backwards compatibility.

memcpy re-versioning

Posted Nov 14, 2018 16:48 UTC (Wed) by nybble41 (subscriber, #55106) [Link]

> should suffer a segfault (triggered by a check in memcpy)

That check would incur just as much overhead as aliasing memcpy to memmove, thus eliminating the undefined behavior altogether. The performance advantage of memcpy comes from *not* checking for overlapping regions.

It could be a useful debug aid, though—perhaps a validating memcpy implementation could be substituted during testing with LD_PRELOAD.
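
A minimal sketch in C of what such a validating copy might look like. The names `regions_overlap` and `checked_copy` are hypothetical; a real LD_PRELOAD shim would have to export the function under the name memcpy, in the manner of the x.so example elsewhere in this thread.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if [a, a+n) and [b, b+n) share at least one byte. */
int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;

    if (n == 0)
        return 0;
    return pa < pb ? pb - pa < n : pa - pb < n;
}

/* Hypothetical debug-only copy: abort loudly instead of silently
 * relying on undefined behaviour. */
void *checked_copy(void *dst, const void *src, size_t n)
{
    if (regions_overlap(dst, src, n))
        abort();
    return memcpy(dst, src, n);
}
```

The comparison in `regions_overlap` is the per-call cost being discussed: it is roughly the same test memmove needs in order to pick a copy direction.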

memcpy re-versioning

Posted Nov 15, 2018 16:03 UTC (Thu) by nix (subscriber, #2304) [Link]

The dependency is almost always unintentional: the author of some function that calls memcpy() doesn't realise that sometimes it accepts pointers that are in aliased ranges, and the caller doesn't realise that the function they're calling calls memcpy().
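
A small C sketch of how this tends to happen, using a made-up helper `strip_prefix`. A function like this is easy to write with memcpy without noticing that source and destination lie in the same buffer; the version below uses memmove, which is the correct call here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: drop the first k characters of s in place.
 * Source and destination are in the same buffer and overlap whenever
 * more than k bytes (counting the NUL) remain to be copied, so memmove
 * is required; memcpy here would be undefined behaviour. */
char *strip_prefix(char *s, size_t k)
{
    size_t len = strlen(s);

    assert(k <= len);
    memmove(s, s + k, len - k + 1);   /* +1 copies the terminating NUL */
    return s;
}
```

A caller of `strip_prefix` sees only a string helper, and nothing in its signature hints that a memory copy with aliased ranges is involved.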