|
|
Subscribe / Log in / New account

More hygienic linking

More hygienic linking

Posted Mar 30, 2024 16:04 UTC (Sat) by epa (subscriber, #39769)
Parent article: A backdoor in xz

The backdoor initially intercepts execution by replacing the ifunc resolvers crc32_resolve(), crc64_resolve() with different code,
So… how does it get to do that? When linking in a library, by default it can not only add symbols, but override them?

I think the default should be “safe linking” where each .so file can only define its own symbols and not override those from elsewhere. If it needs to override, that could be specified explicitly on the linker command line (and if necessary encoded in the resulting executable, if it’s dynamically linked). Even if you’re not concerned about security, a general respect for “what can go wrong will go wrong” and for minimizing accidental strange behaviour would suggest not allowing overrides without explicit say-so.

I dare say getting to there from here might be awkward, with the same symbol defined multiple times but nobody has had to care about it until now.


to post comments

More hygienic linking

Posted Mar 30, 2024 16:48 UTC (Sat) by smurf (subscriber, #17840) [Link] (1 responses)

When you load a library, its .init section gets executed. Likewise .fini when the library is unloaded / program ends regularly.

(Disclaimer: the actual details are a bit more hairy than that.)

The code in that section is typically used to set up variables and global objects and whatnot, but it can and in this case indeed does do whatever it damn well pleases. Restricting that capability is … difficult.

More hygienic linking

Posted Mar 30, 2024 17:02 UTC (Sat) by epa (subscriber, #39769) [Link]

Right. Libraries would have to be written in a restricted style where there isn’t global data to initialize, and if there’s some kind of setup work that must be done by an explicit call from the application code. Which I suggest is all good practice anyway. But likely to meet pushback from those who don’t see a need to change.

More hygienic linking

Posted Mar 31, 2024 3:53 UTC (Sun) by danobi (subscriber, #102249) [Link] (2 responses)

I was wondering that as well and tried to get ifunc to behave like that. Couldn't get it to work. But then I read https://gynvael.coldwind.pl/?lang=en&id=782 and saw this bit:

((head -c +$N > /dev/null 2>&1) && head -c +$W) > liblzma_la-crc64-fast.o

Looks like the binary payload just replaces one of the ifunc implementations at build time. At least that's my guess.

More hygienic linking

Posted Apr 1, 2024 11:24 UTC (Mon) by excors (subscriber, #95769) [Link] (1 responses)

I think there's a bit more than that. The injected.txt script in Andres Freund's email includes:

V='[...]extern int _get_cpuid(int, void*, void*, void*, void*, void*);\nstatic inline bool _is_arch_extension_supported(void) { int success = 1; uint32_t r[4]; success = _get_cpuid(1, &r[0], &r[1], &r[2], &r[3], [...]'
[...]
sed "/return is_arch_extension_supported()/ c\return _is_arch_extension_supported()" $top_srcdir/src/liblzma/check/crc64_fast.c

In the upstream code, is_arch_extension_supported calls __get_cpuid (provided by GCC). The backdoored build script modifies crc64_fast.c so it calls _get_cpuid (one underscore) instead. _get_cpuid is defined in liblzma_la-crc64-fast.o, which is a binary hidden in the test files.

If I understand correctly, the loader executes the ifunc resolver function (crc64_resolve) when liblzma is loaded. crc64_resolve calls [_]is_arch_extension_supported to determine if it can use the optimised version with the CLMUL instruction. But now that actually calls _get_cpuid which initialises the runtime backdoor.

So, there's nothing funny happening with the linker and there's no symbol overriding. The functions are "replaced" using sed before compiling.

(I guess they could have simply had the build script insert an __attribute__((constructor)) function, but anyone debugging or profiling a liblzma-using application might have been surprised to see liblzma functions getting executed at startup. Using ifuncs makes it less suspicious - liblzma has a good reason to run crc64_resolve and ...get_cpuid, clearly visible in the upstream code (modulo some underscores), so they probably hoped anyone investigating would stop at that point.)

More hygienic linking

Posted Apr 1, 2024 15:28 UTC (Mon) by andresfreund (subscriber, #69562) [Link]

> (I guess they could have simply had the build script insert an __attribute__((constructor)) function, but anyone debugging or profiling a liblzma-using application might have been surprised to see liblzma functions getting executed at startup. Using ifuncs makes it less suspicious - liblzma has a good reason to run crc64_resolve and ...get_cpuid, clearly visible in the upstream code (modulo some underscores), so they probably hoped anyone investigating would stop at that point.)

I *think* init functions get executed a bit later, when the .got section for all the libraries already have been remapped read-only. So they couldn't have pulled the trick of redirecting a function despite -z now -z relro being used.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds