More hygienic linking
More hygienic linking
Posted Mar 30, 2024 16:04 UTC (Sat) by epa (subscriber, #39769)Parent article: A backdoor in xz
The backdoor initially intercepts execution by replacing the ifunc resolvers crc32_resolve(), crc64_resolve() with different code,So… how does it get to do that? When linking in a library, by default it can not only add symbols, but override them?
I think the default should be “safe linking” where each .so file can only define its own symbols and not override those from elsewhere. If it needs to override, that could be specified explicitly on the linker command line (and if necessary encoded in the resulting executable, if it’s dynamically linked). Even if you’re not concerned about security, a general respect for “what can go wrong will go wrong” and for minimizing accidental strange behaviour would suggest not allowing overrides without explicit say-so.
I dare say getting to there from here might be awkward, with the same symbol defined multiple times but nobody has had to care about it until now.
Posted Mar 30, 2024 16:48 UTC (Sat)
by smurf (subscriber, #17840)
[Link] (1 responses)
(Disclaimer: the actual details are a bit more hairy than that.)
The code in that section is typically used to set up variables and global objects and whatnot, but it can and in this case indeed does do whatever it damn well pleases. Restricting that capability is … difficult.
Posted Mar 30, 2024 17:02 UTC (Sat)
by epa (subscriber, #39769)
[Link]
Posted Mar 31, 2024 3:53 UTC (Sun)
by danobi (subscriber, #102249)
[Link] (2 responses)
((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o
Looks like the binary payload just replaces one of the ifunc implementations at build time. At least that's my guess.
Posted Apr 1, 2024 11:24 UTC (Mon)
by excors (subscriber, #95769)
[Link] (1 responses)
V='[...]extern int _get_cpuid(int, void*, void*, void*, void*, void*);\nstatic inline bool _is_arch_extension_supported(void) { int success = 1; uint32_t r[4]; success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], [...]'
In the upstream code, is_arch_extension_supported calls __get_cpuid (provided by GCC). The backdoored build script modifies crc64_fast.c so it calls _get_cpuid (one underscore) instead. _get_cpuid is defined in liblzma_la-crc64-fast.o, which is a binary hidden in the test files.
If I understand correctly, the loader executes the ifunc resolver function (crc64_resolve) when liblzma is loaded. crc64_resolve calls [_]is_arch_extension_supported to determine if it can use the optimised version with the CLMUL instruction. But now that actually calls _get_cpuid which initialises the runtime backdoor.
So, there's nothing funny happening with the linker and there's no symbol overriding. The functions are "replaced" using sed before compiling.
(I guess they could have simply had the build script insert an __attribute__((constructor)) function, but anyone debugging or profiling a liblzma-using application might have been surprised to see liblzma functions getting executed at startup. Using ifuncs makes it less suspicious - liblzma has a good reason to run crc64_resolve and ...get_cpuid, clearly visible in the upstream code (modulo some underscores), so they probably hoped anyone investigating would stop at that point.)
Posted Apr 1, 2024 15:28 UTC (Mon)
by andresfreund (subscriber, #69562)
[Link]
I *think* init functions get executed a bit later, when the .got section for all the libraries already have been remapped read-only. So they couldn't have pulled the trick of redirecting a function despite -z now -z relro being used.
More hygienic linking
More hygienic linking
More hygienic linking
More hygienic linking
[...]
sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()" $top_srcdir/src/liblzma/check/crc64_fast.c
More hygienic linking