C library system-call wrappers, or the lack thereof
Posted Nov 13, 2018 18:49 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
In reply to: C library system-call wrappers, or the lack thereof by rweikusat2
Parent article: C library system-call wrappers, or the lack thereof
This is a fucking libc. The functions that I'm using have not materially changed for 15 years or so.
> A program compiled on the newer system which doesn't use symbols not available on the older one can run on the older system. This wouldn't be possible if the soname had changed.
You don't change the user-visible behavior of functions in THE FREAKING LIBC.
You can add NEW functions, and in this case software compiled on newer libcs will just fail to link with older libcs lacking the new functions.
> All programs compiled on the older system can run on the newer system without being recompiled. This wouldn't be possible if the soname had changed.
Wrong. Even in your model.
Posted Nov 13, 2018 19:09 UTC (Tue)
by rweikusat2 (subscriber, #117920)
[Link] (29 responses)
If you think so, you should determine which symbol was wrongly flagged as different and file a bug against glibc.
> You don't change the user-visible behavior of functions in THE FREAKING LIBC.
That's a policy you may consider sensible, but the people who work "on the freaking libc" apparently don't. E.g., bugs are user-visible behaviour. Some people want them fixed.
Again, each individual change may or may not be appropriate and/or people might have different opinions about this. Without a specific example, this cannot be assessed.
>> All programs compiled on the older system can run on the newer system without being recompiled. This wouldn't be possible if the soname had changed.
This is not "my model" as I'm not the GNU C library but that's how the model of the GNU C library is supposed to work: A program linked against version a.b.c of symbol frxblz[*] will use version a.b.c of symbol frxblz even if a newer version d.e.f is also available. Which wouldn't be possible if the soname had been changed instead.
Whether or not this works in a specific case is a different question. But that's what it's supposed to provide.
[*] coming up with a combination of letters which isn't some sort of possibly offensive insult in American English is surprisingly difficult :->
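To make that mechanism concrete, here is a minimal sketch (library name, symbol name and version nodes all invented) of how a GNU-style shared library carries two versions of one symbol, which is essentially what glibc does:

/* libfoo.c -- build with:
 *   gcc -fpic -shared -Wl,--version-script=foo.map -o libfoo.so libfoo.c
 *
 * foo.map:
 *   LIBFOO_1.0 { global: frobnicate; local: *; };
 *   LIBFOO_2.0 { global: frobnicate; } LIBFOO_1.0;
 */

/* The old implementation stays in the library forever; binaries linked
 * before the change keep binding to frobnicate@LIBFOO_1.0. */
int frobnicate_old(int x) { return x + 1; }
__asm__(".symver frobnicate_old, frobnicate@LIBFOO_1.0");

/* The new implementation carries the default version (note the double
 * '@'), so newly linked programs pick it up automatically. */
int frobnicate_new(int x) { return x + 2; }
__asm__(".symver frobnicate_new, frobnicate@@LIBFOO_2.0");

A program linked against the old library recorded frobnicate@LIBFOO_1.0 and keeps getting the old behaviour even after the new library is installed; a program linked today records frobnicate@LIBFOO_2.0 and will refuse to start against a libfoo.so that lacks that version node, which is the same failure mode discussed below for memcpy@GLIBC_2.14.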
Posted Nov 13, 2018 19:16 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (28 responses)
There are non-idiotic libcs out there and somehow they work just fine without breaking stuff. Then we have Linux itself, which somehow manages to preserve an even more complicated ABI than glibc's.
> This is not "my model" as I'm not the GNU C library but that's how the model of the GNU C library is supposed to work: A program linked against version a.b.c of symbol frxblz[*] will use version a.b.c of symbol frxblz even if a newer version d.e.f is also available. Which wouldn't be possible if the soname had been changed instead.
Posted Nov 13, 2018 20:06 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link] (27 responses)
That's just ad-hoc symbol versioning. Programs which use the new symbols still won't run on older systems. It's strictly worse than what glibc does.
As for your "simple program that uses ages-old epoll/read/print functions", the symbol versions for these APIs in glibc 2.27 are as follows:
epoll_create, epoll_wait, epoll_ctl - GLIBC_2.3.2
That means any application using *only* these symbols would be able to run without modification on RHEL-3.9 (glibc 2.3.2). RHEL6 should not be a problem.
Posted Nov 13, 2018 21:18 UTC (Tue)
by ldarby (guest, #41318)
[Link] (26 responses)
readelf -a foo | grep memcpy
and this doesn't work on centos 6:
ldd ./foo
Posted Nov 13, 2018 22:12 UTC (Tue)
by rweikusat2 (subscriber, #117920)
[Link] (22 responses)
It's also unquestionable that increasing the symbol version was justified, as this was a change which did cause "really existing software" to break.
But this is really a problem with no good solution, only different tradeoffs which all end up being detrimental to someone. Either no code which was released may ever be changed in a new version, because applications will depend on undocumented and even unintentional properties of it. Or loads of existing binaries suddenly break in interesting ways, and there's the very real possibility that "recompile the code" is not an option. Or people developing new code, who can recompile it because they have both the code and the necessary tools/environment available, must restrict themselves to actually building on the systems they claim to support.
As to the change in question: my opinion is that the simple byte-copy implementation of memcpy quoted further down in this thread is the sort of default implementation one should stick to.
But that's just my opinion and certainly not a universal consensus.
Posted Nov 13, 2018 22:42 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (16 responses)
Posted Nov 14, 2018 0:19 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link] (15 responses)
There is already a function with the semantics of "memcpy_fast" included in the C standards. It's called "memcpy". There is approximately zero chance that a new, non-portable alternative to memcpy() would have been introduced just to maintain compatibility with applications abusing memcpy() in situations where they should have employed memmove().
Even if they did take that approach programs would need to be modified to use the new API, which would be a colossal undertaking as well as a major step backward in terms of portability. The main performance benefit of eliminating the extra branch required for memmove() semantics comes from the myriad *small* memory copies performed throughout any non-trivial application. The change would be mostly pointless if it didn't automatically encompass all existing memcpy() users.
Without symbol versioning the most likely resolution would have been to simply let the non-compliant programs break.
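For a sense of what that extra branch looks like, here is a toy sketch (invented name, not glibc's implementation, which compares addresses and copies words at a time) of the direction check memmove() needs and memcpy() is allowed to omit:

#include <stddef.h>

/* Toy memmove: choose the copy direction so that overlapping ranges
 * still end up as if the data had gone through a temporary buffer. */
void *toy_memmove(void *dst, const void *src, size_t n)
{
        char *d = dst;
        const char *s = src;

        if (d < s) {
                for (size_t i = 0; i < n; i++)  /* forward copy is safe */
                        d[i] = s[i];
        } else if (d > s) {
                while (n) {                     /* backward copy is safe */
                        --n;
                        d[n] = s[n];
                }
        }
        return dst;
}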
Posted Nov 14, 2018 0:33 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (14 responses)
So the ONLY correct choice is to stick with it and improve the language standard to include a new function that explicitly defines the handling of overlaps.
> Even if they did take that approach programs would need to be modified to use the new API,
> which would be a colossal undertaking as well as a major step backward in terms of portability.
Posted Nov 14, 2018 12:51 UTC (Wed)
by nix (subscriber, #2304)
[Link] (13 responses)
Ah, but introducing new functions is *also* a compatibility problem, because there is only one namespace for symbols, and now you will collide, either at compile time or at runtime, with other programs already using the name you chose. (When getline() was introduced, it broke compilation of a *lot* of code. But only compilation, because glibc was not written by idiots: at runtime it still worked, where your proposals would fail utterly. It appears you don't know the ELF symbol resolution rules.)
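To illustrate the runtime half of that (the file name and the pre-POSIX signature here are invented): a program that predates glibc's getline() and defines its own keeps running unmodified, because under the ELF resolution rules its references bind to the definition in the executable; the breakage only appears when the code is recompiled against headers that now declare the POSIX getline().

/* oldprog.c -- a program written before getline() existed in libc.
 * <stdio.h> is deliberately not included here; with a modern glibc and
 * the right feature-test macros it would declare the POSIX getline()
 * and this file would no longer compile without renaming. */
#include <string.h>

/* deliberately NOT the POSIX prototype */
int getline(char *buf, int size)
{
        strncpy(buf, "stub input line", (size_t)(size - 1));
        buf[size - 1] = '\0';
        return (int)strlen(buf);
}

int main(void)
{
        char line[64];
        return getline(line, sizeof line) > 0 ? 0 : 1;
}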
Posted Nov 14, 2018 19:00 UTC (Wed)
by quotemstr (subscriber, #45331)
[Link] (12 responses)
Posted Nov 14, 2018 21:07 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (10 responses)
The problem is spelled out here:
https://blog.tim-smith.us/2015/09/python-extension-module...
where it is hacked around by just using `-undefined dynamic_lookup` and not linking libPython.dylib, so that it's found at runtime. But this also means that all missing symbols are ignored until runtime. I have this patch:
https://github.com/mathstuf/ld64/commit/4eebe0c07e8ab706e...
which is a better fix, but I don't know how to convince Xcode to build the damn project and I need to figure out how/where to send the patch to get it reviewed by Apple.
Posted Nov 15, 2018 10:51 UTC (Thu)
by lgerbarg (guest, #57988)
[Link] (9 responses)
-bundle_loader executable
This will find exported symbols in the main executable and encode the binds such that dyld will apply them to whatever the main executable in the current process is. This has the same runtime semantics that the linked patch is trying to achieve, except in the error case that occurs if a newer version of the interpreter accidentally removes an exported symbol that your bundle needs. Using -bundle_loader, it will fail immediately after searching the main executable, instead of continuing to search other images in the process. Given that you know the symbols are supposed to be provided by the interpreter, that is probably preferable behavior.
Posted Nov 15, 2018 18:15 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (8 responses)
Posted Nov 15, 2018 19:07 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (7 responses)
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -Dmod_EXPORTS -I/System/Library/Frameworks/Python.framework/Versions/2.7/Headers -fPIC -MD -MT CMakeFiles/mod.dir/mod.c.o -MF CMakeFiles/mod.dir/mod.c.o.d -o CMakeFiles/mod.dir/mod.c.o -c ../mod.c
makes this:
$ otool -L mod.so
which still references a specific Python and is wrong anyways:
$ /opt/local/bin/python2.7 -m mod # Use a MacPorts Python
Posted Nov 15, 2018 23:25 UTC (Thu)
by lgerbarg (guest, #57988)
[Link] (6 responses)
If you can find the actual linker invocation (either the driver with -Wl,-bundle_loader, or the actual call through to ld64) I can probably tell you what's going wrong with the invocation, though off the top of my head I have no idea how to get CMake to pass through the correct flags.
Posted Nov 15, 2018 23:33 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (5 responses)
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -bundle -Wl,-headerpad_max_install_names -o mod.so CMakeFiles/mod.dir/mod.c.o -bundle_loader /System/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.dylib
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc -bundle -Wl,-headerpad_max_install_names -o mod.so CMakeFiles/mod.dir/mod.c.o -Wl,-bundle_loader,/System/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.dylib
Posted Nov 16, 2018 21:53 UTC (Fri)
by lgerbarg (guest, #57988)
[Link] (4 responses)
Posted Nov 16, 2018 22:06 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (3 responses)
Posted Nov 16, 2018 22:08 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Nov 18, 2018 21:12 UTC (Sun)
by lgerbarg (guest, #57988)
[Link] (1 responses)
Now that I understand what you are trying to do a bit more clearly (sorry about the confusion), -bundle_loader is not appropriate unless you can re-export the symbols from the main executable. I think that would be the best option from a technical perspective, but it is probably unreasonable since it would require changes and backports to all of the python interpreters.
Assuming that you need to get this to work without making changes to python itself, I think you should pass ld "-undefined dynamic_lookup." It is kind of gross, but it should work for your use case. If there is ever a desire to improve this behavior for future pythons, there are changes to the interpreter that would make modules work better:
1) Reexporting the symbols from libpython out of the python interpreter itself and using -bundle_loader
Posted Nov 20, 2018 15:16 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
> I think you should pass ld "-undefined dynamic_lookup." It is kind of gross
That works for today. And projects which don't target only macOS are almost never going to rely on what seems like a bad legacy behavior. The "gross" part of this solution is what led to the patch to ld64.
> Renaming or symlinking the library exporting the API to have an unversioned name (like "libpython.dylib") and then having python set an LC_RPATH to the directory containing the dylib/symlink. Then modules could link to them via "@rpath/libpython.dylib" .
The name of the file doesn't matter. The binary must be edited using `install_name_tool -id` to change what gets embedded in the linking binary. And RPATH on macOS is awful: it only applies if explicitly requested, and if something says "use me via rpath", no tools (that I know of) add the paths automatically; they must be added manually.
Posted Nov 15, 2018 16:13 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Nov 14, 2018 0:29 UTC (Wed)
by ldarby (guest, #41318)
[Link] (3 responses)
Not sure I agree with that. If Linus "we don't break userspace" Torvalds had won the argument over Ulrich "buggy programs cannot be allowed to prevent a correct implementation" Drepper (sorry, paraphrasing), they would simply have made a second change aliasing memcpy to memmove, un-breaking the buggy programs, and there would be none of this version hassle to deal with now.
Posted Nov 14, 2018 16:38 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (2 responses)
Posted Nov 14, 2018 19:54 UTC (Wed)
by ldarby (guest, #41318)
[Link] (1 responses)
Posted Nov 14, 2018 20:01 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link]
The "non-technical users" also "don't have a damn clue" how the other software they're using works. But as they're just using it and not developing it, this should be ok, don't you think so?
Posted Nov 18, 2018 10:07 UTC (Sun)
by roblucid (guest, #48964)
[Link]
Applications were in the past often shared over a network, so the executables ran on different CPUs and OS versions. Even on Linux, distros would install differing versions of glibc to allow the library to use Pentium 2 features whilst the packages installed were universal i386.
From my experience, the problem raised here is moot; the error experienced is simply due to an improper build environment, which needs to produce executables linked for the oldest supported system. On OpenSuSE there was a way of installing lowest-common-denominator library versions to build packages against. Later, a build service could build native packages for the various systems, making it unnecessary to bother with that locally.
If an application stops working with a new version of a library, it's typically a bug caused by relying on undocumented behaviour. The example mentioned of applications unportably calling memcpy rather than memmove is such a case. Where libraries change the ABI, their names change, e.g. KDE3 to KDE4 or the big changes to GTK. Calling functions not present in older target systems is again a portability error.
Posted Nov 14, 2018 12:41 UTC (Wed)
by smurf (subscriber, #17840)
[Link] (2 responses)
IMHO, any program that depends on *documented* nonsense like overlapping ranges in memcpy() (the manpage says "must not", dammit) deserves to die a fiery death and should suffer a segfault (triggered by a check in memcpy), instead of saddling everybody with a symbol version uptick that blocks backwards compatibility.
Posted Nov 14, 2018 16:48 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link]
That check would incur just as much overhead as simply aliasing memcpy to memmove, which would eliminate the undefined behavior altogether. The performance advantage of memcpy comes from *not* checking for overlapping regions.
It could be a useful debug aid, though—perhaps a validating memcpy implementation could be substituted during testing with LD_PRELOAD.
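A minimal sketch of such a shim (file name invented; it interposes on memcpy for the whole process, so it is only suitable for testing):

/* memcpy_check.c
 * build: gcc -fpic -shared -o memcpy_check.so memcpy_check.c
 * use:   LD_PRELOAD=./memcpy_check.so ./program-under-test
 *
 * memmove is declared by hand instead of via <string.h> so that the
 * definition of memcpy below doesn't clash with the header. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

void *memmove(void *, const void *, size_t);

void *memcpy(void *dst, const void *src, size_t n)
{
        uintptr_t d = (uintptr_t)dst, s = (uintptr_t)src;

        /* [d, d+n) and [s, s+n) overlap iff d < s+n and s < d+n */
        if (n && d < s + n && s < d + n) {
                fputs("memcpy called with overlapping ranges\n", stderr);
                abort();
        }
        return memmove(dst, src, n);    /* do the copy the safe way */
}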
Posted Nov 15, 2018 16:03 UTC (Thu)
by nix (subscriber, #2304)
[Link]
C library system-call wrappers, or the lack thereof
>Wrong. Even in your model.
C library system-call wrappers, or the lack thereof
That's because glibc is written by very, very misguided people. Seriously.
You can add new symbols if you must. Windows does this (WriteFile, WriteFileEx, etc). But nothing requires you to change the user-visible behavior of existing symbols.
C library system-call wrappers, or the lack thereof
read, printf - GLIBC_2.2.5
C library system-call wrappers, or the lack thereof
000000601020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 memcpy@GLIBC_2.14 + 0
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@GLIBC_2.14 (3)
55: 0000000000000000 0 FUNC GLOBAL DEFAULT UND memcpy@@GLIBC_2.14
./foo: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by ./foo)
That was eight years ago.
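A widely used workaround for exactly this situation is to pin the reference to the older version explicitly (a sketch; x86-64 only, where the pre-2.14 version of memcpy is GLIBC_2.2.5), so that a binary built on the new system still runs on the old one:

/* Force undefined references to memcpy from this translation unit to
 * bind to the old version instead of memcpy@GLIBC_2.14.
 * Compile with -fno-builtin-memcpy so the call isn't inlined away. */
__asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

#include <string.h>
#include <stdio.h>

int main(void)
{
        char buf[16];

        memcpy(buf, "hello", 6);        /* resolved as memcpy@GLIBC_2.2.5 */
        puts(buf);
        return 0;
}

The more robust approach, as noted elsewhere in the thread, is simply to build on (or link against) the oldest glibc you intend to support.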
C library system-call wrappers, or the lack thereof
#include <stddef.h>

void *memcpy(void *dst, const void *src, size_t n)
{
        char *d = dst;
        const char *s = src;

        /* copy one byte at a time, from the end towards the start */
        while (n)
                --n, d[n] = s[n];
        return dst;
}
is a nice, simple and portable implementation of memcpy. Instead of terror-optimizing this algorithm using whatever "latest and greatest" CPU support for absurdly large block memory copies happens to be available, one should stick to something like this as the default implementation and leave it to the people for whom absurdly large block copies are a real performance problem to figure out how to either avoid them or speed them up.
C library system-call wrappers, or the lack thereof
This just means you need to create another function - "memcpy_fast" or whatever.
C library system-call wrappers, or the lack thereof
This is EXACTLY why a new function should have been introduced. Handling of overlapping regions became an implicit part of the memcpy interface. It shouldn't have, but it did.
And that's fine. Nobody died because one optimization of memcpy became impossible.
Nonsense. Other systems either have the implicit overlap handling guarantees of memcpy or the software is already non-portable to them.
C library system-call wrappers, or the lack thereof
This specifies the executable that will be loading the bundle output file being linked. Undefined symbols from the bundle are checked
against the specified executable like it was one of the dynamic libraries the bundle was linked with.
C library system-call wrappers, or the lack thereof
mod.so:
/System/Library/Frameworks/Python.framework/Versions/2.7/Python (compatibility version 2.7.0, current version 2.7.10)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1213.0.0)
/opt/local/Library/Frameworks/Python.framework/Versions/2.7/Resources/Python.app/Contents/MacOS/Python: No code object available for mod
C library system-call wrappers, or the lack thereof
or
2) Renaming or symlinking the library exporting the API to have an unversioned name (like "libpython.dylib") and then having python set an LC_RPATH to the directory containing the dylib/symlink. Then modules could link to them via "@rpath/libpython.dylib" .
C library system-call wrappers, or the lack thereof
Again, a solution to this is surprisingly simple: Create a file with the following content:
C library system-call wrappers, or the lack thereof
/* x.c: the prototypes are declared by hand (rather than via string.h
   and stdio.h) so the definition of memcpy below doesn't clash with
   the headers' own declaration of memcpy. */
#include <sys/types.h>

void *memmove(void *, const void *, size_t);
int puts(const char *);

void *memcpy(void *d, const void *s, size_t n)
{
        puts("honk");
        return memmove(d, s, n);
}
compile it with
gcc -fpic -shared -o x.so x.c
and use it via LD_PRELOAD together with another compiled program that calls memcpy, e.g. LD_PRELOAD=./x.so ./foo. Voila --- memcpy aliased to memmove without penalizing code that doesn't rely on undefined behaviour (the puts obviously just exists to demonstrate that the function is indeed being called).