Free software's not-so-eXZellent adventure

Posted Apr 2, 2024 21:59 UTC (Tue) by khim (subscriber, #9252)
In reply to: Free software's not-so-eXZellent adventure by bredelings
Parent article: Free software's not-so-eXZellent adventure

> If any library that is transitively linked into an executable can override any function in the executable

Which is, of course, possible on the majority of OSes (I'm not 100% sure about OpenBSD; there are some mitigations in place that would require a certain amount of non-trivial dancing to get around, but I think it's possible even there).

> then the intuition of most programmers is going to be wrong.

Maybe we should fix the intuition, then?

Libraries are loaded into the address space of the same process. They inherently may do “everything” in that process, including, but not limited to, rewriting the code of other libraries (or of the binary). You may play a cat-and-mouse game trying to plug these holes, but that protection will never be watertight: there is no security boundary inside a normal process. By design.

Sure, we may recall the ideas of iAPX 432/e2k/CHERI, but I wouldn't expect these to become mainstream any time soon, and until that happens changing the intuition is the only possible interim solution.

Free software's not-so-eXZellent adventure

Posted Apr 3, 2024 6:16 UTC (Wed) by tux3 (subscriber, #101245)

But we should play the cat and mouse game anyways. The more we push attackers to do the noisy, annoying, heavy-handed thing, the more chance we have of just detecting it instead.

We don't even need to close all the holes. Just the ones that are cheap and hard to detect.

A CI system somewhere sending an email because sshd in Debian sid is seen doing a weird .text overwrite at load time? This also catches the mouse.

If you can't outrun them, force 'em to do crazy things that no normal free software package does, so that we have a chance to detect it.

Without ifuncs, I think they'd have had a much harder time making this stealthy. And for a backdoor that was detected because it disturbed a microbenchmark, stealth is everything!

Free software's not-so-eXZellent adventure

Posted Apr 3, 2024 11:22 UTC (Wed) by khim (subscriber, #9252)

> Without ifuncs, I think they'd have had a much harder time making this stealthy.

Without ifuncs they would just need to play with internal sshd data structures to find any suitable function pointer they may overwrite.

And without ifuncs there would be many more such pointers, because the need for an extra-fast `memcpy` wouldn't go away. Only, instead of sitting in the mostly-protected GOT, these redirectors would be placed in a regular data segment.
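For readers who haven't used the mechanism, here is a minimal sketch of a GNU ifunc, assuming GCC or Clang on an ELF/glibc target. The names `sum_bytes`, `resolve_sum_bytes` and both implementations are hypothetical, and the "unrolled" variant is only a stand-in for a real CPU-specific routine. On other platforms the sketch falls back to a plain wrapper so that it still compiles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using sum_fn = std::uint32_t(const unsigned char*, std::size_t);

// Two hypothetical implementations of one routine; both return the same value.
std::uint32_t sum_bytes_generic(const unsigned char* p, std::size_t n) {
    std::uint32_t s = 0;
    for (std::size_t i = 0; i < n; ++i) s += p[i];
    return s;
}

std::uint32_t sum_bytes_unrolled(const unsigned char* p, std::size_t n) {
    std::uint32_t s = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) s += p[i] + p[i + 1] + p[i + 2] + p[i + 3];
    for (; i < n; ++i) s += p[i];
    return s;
}

#if defined(__ELF__) && defined(__GLIBC__)
// The resolver runs while relocations are being processed, before main()
// and before most of libc is usable, so it should stay minimal.
extern "C" sum_fn* resolve_sum_bytes() {
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();  // needed before __builtin_cpu_supports in a resolver
    if (__builtin_cpu_supports("sse4.2")) return sum_bytes_unrolled;
#endif
    return sum_bytes_generic;
}

// Calls to sum_bytes() go through a loader-filled slot that holds whatever
// the resolver returned.
extern "C" std::uint32_t sum_bytes(const unsigned char* p, std::size_t n)
    __attribute__((ifunc("resolve_sum_bytes")));
#else
// Fallback for targets without ifunc support: a plain wrapper.
extern "C" std::uint32_t sum_bytes(const unsigned char* p, std::size_t n) {
    return sum_bytes_generic(p, n);
}
#endif

// Usage: callers never see which implementation was picked.
void ifunc_demo() {
    const unsigned char buf[] = {1, 2, 3, 4, 5};
    assert(sum_bytes(buf, sizeof buf) == 15);
}
```

The point relevant to this thread is where the selected pointer lives: in a loader-managed slot that can be made read-only after startup (full RELRO), rather than in a variable the program owns.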

> And for a backdoor that was detected because it disturbed a microbenchmark, stealth is everything!

I'm not at all sure the end result would be a reduction in the attack surface; I suspect it would actually increase it.

If you remove the central CPUID-support mechanism then you force people to invent their own ad-hoc ones. History says that this usually has a negative impact on security, not a positive one: instead of one central mechanism that can be monitored for doing something unusual, you get hundreds of ad-hoc implementations.

You can find an example here (it's bionic, yes, but this is a part that existed before they added support for ifuncs). It's a bit harder to [ab]use today; it used to have a ready-to-use GOT replacement for most malloc-related functions.

And because crypto routines are very speed-conscious (different CPUs offer different ways to speed them up), chances are high that without ifuncs there would be tons of easy-to-rewrite function pointer tables.
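To make that concrete, here is a rough sketch of the kind of ad-hoc dispatcher a project might write if ifuncs were unavailable. Everything here is hypothetical (`hash_bytes`, `g_hash_impl`, `cpu_has_fast_path`, the "accelerated" variant); it is not taken from any real library. The selected pointer sits in an ordinary writable global, so any code in the process can replace it with a single store.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using hash_fn = std::uint32_t (*)(const unsigned char*, std::size_t);

// Portable implementation (32-bit FNV-1a).
std::uint32_t hash_portable(const unsigned char* p, std::size_t n) {
    std::uint32_t h = 2166136261u;
    for (std::size_t i = 0; i < n; ++i) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

// Hypothetical stand-in for a CPU-specific version; same result here.
std::uint32_t hash_accelerated(const unsigned char* p, std::size_t n) {
    return hash_portable(p, n);
}

// Hypothetical feature probe; a real one would query the CPU.
bool cpu_has_fast_path() { return false; }

// The home-grown "GOT": a plain global in writable data.
// Nothing makes it read-only after startup.
hash_fn g_hash_impl = nullptr;

std::uint32_t hash_bytes(const unsigned char* p, std::size_t n) {
    if (!g_hash_impl)  // lazy one-time selection (not thread-safe; sketch only)
        g_hash_impl = cpu_has_fast_path() ? hash_accelerated : hash_portable;
    return g_hash_impl(p, n);
}

// What any other code in the same process could do.
std::uint32_t hash_hijacked(const unsigned char*, std::size_t) {
    return 0xDEADBEEFu;
}

void hijack_dispatch() { g_hash_impl = hash_hijacked; }

// Usage: the caller's code is unchanged, yet the result is attacker-chosen.
void dispatch_demo() {
    const unsigned char msg[] = {'a', 'b', 'c'};
    assert(hash_bytes(msg, sizeof msg) == hash_portable(msg, sizeof msg));
    hijack_dispatch();
    assert(hash_bytes(msg, sizeof msg) == 0xDEADBEEFu);
}
```

Multiply this by every speed-sensitive routine in every library and you get many such pointers, each with its own layout and none of them centrally monitored.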

> If you can't outrun them, force 'em to do crazy things that no normal free software package does, so that we have a chance to detect it.

Doesn't work if you also force free software packages to do crazy things that add more entry points for malicious actors. Your attempt to neuter ifuncs would do just that.

Free software's not-so-eXZellent adventure

Posted Apr 7, 2024 17:28 UTC (Sun) by mathstuf (subscriber, #69389)

I'm shooting into the dark, but couldn't `ifunc` support be limited to touching symbols for the shared object itself and not others? Breaks down for fully static builds though…

Free software's not-so-eXZellent adventure

Posted Apr 7, 2024 17:36 UTC (Sun) by mathstuf (subscriber, #69389)

Seems like this is already the case: https://lwn.net/Articles/968256/

Free software's not-so-eXZellent adventure

Posted Apr 3, 2024 7:53 UTC (Wed) by epa (subscriber, #39769)

But the sshd code does not make any calls into the xz library. Without the ifunc hooks the malicious code would never get a chance to run. What’s surprising and dangerous is not that library code has full access to the process’s address space (we all knew that) but that by merely linking a library to satisfy some transitive dependency you can end up running arbitrary code, even though you never call any of the library’s functions.

Free software's not-so-eXZellent adventure

Posted Apr 3, 2024 10:48 UTC (Wed) by farnz (subscriber, #17727)

Actually fixing that edge case is going to be extremely hard. ELF has chosen to enable compilers to implement C++ dynamic initialization (for statics) as running code before main starts (via the .init_array section), which means that loading a library is enough to run code from that library.

While it is, per the standard, permissible to reimplement this so that dynamic initialization is deferred until first use of a symbol, that's a significant change to current C++ compilers (including GCC and Clang), and is likely (for efficiency reasons) to need changes to ELF so that the dynamic linker can determine whether this is the "first use" of a symbol and run the initializer at this point, rather than at program startup.

This is a behaviour change that will surprise quite a few people - there will be C++ programs that expect side-effects of globals to be run before main - but it's a major blocker to preventing a malicious library from running code just because it's linked.
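As an illustration of the current behaviour, here is a small sketch assuming GCC or Clang on an ELF target. The names `startup_log`, `LoudGlobal` and `plain_hook` are made up for the example. Both hooks run before main without the program ever calling anything they define.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical log. A function-local static is used so that the vector is
// guaranteed to exist when the startup hooks below touch it.
std::vector<std::string>& startup_log() {
    static std::vector<std::string> log;
    return log;
}

struct LoudGlobal {
    LoudGlobal() { startup_log().push_back("LoudGlobal constructed"); }
};

// Namespace-scope object with a dynamic initializer. On ELF the compiler
// typically records its initialization routine in .init_array, and the
// loader calls that routine before main().
LoudGlobal g_loud;

// GCC/Clang extension that gets the same treatment without needing an object.
__attribute__((constructor)) static void plain_hook() {
    startup_log().push_back("constructor attribute ran");
}

// Usage: call this from main(); both entries are already present.
void startup_demo() {
    assert(startup_log().size() == 2);
}
```

If this translation unit lived in a shared library, the same two hooks would run when the library is loaded, whether or not the program ever used one of its symbols.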

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 8:15 UTC (Thu) by epa (subscriber, #39769)

I imagine that when adding a library you would have two choices. “Safe linking” would not allow any initialization code to run on startup, and would be the default. If you have a library with C++ initializers, or other funky stuff, you’d need to allow that on the linker command line. It would have been an obvious red flag if the xz project started telling users they now needed to enable unsafe linking.

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 11:22 UTC (Thu) by farnz (subscriber, #17727)

The problem with "safe linking" as a concept is indirect dependencies. I might turn off safe linking because I know that one of my dependencies uses a C++ static global; that then means that I have to (at the very least) trust it to not have dependencies that should have used safe linking, but didn't, and depending on the implementation, I might have to trust all my dependencies to not have unwanted startup code.

It'd be better to take advantage of the fact that, at its strongest, the C++ standard has required statics to be initialized before the first reference by the program to a symbol from the translation unit the static is defined in. You can, under C++ rules, delay running the initializer code until the dynamic linker has determined that a given library actually provides the desired symbol (which avoids shenanigans with weak symbols where the final link is done against a different library).

Doing this consistently (including ensuring that IFUNCs only come into play once you've resolved the IFUNC as the definition of this symbol, avoiding games with weak symbols there, too) would ensure that "merely linking" a library is harmless up until you use a symbol from that library; there's no specification that prevents this, it's just a lot of changes to make to ELF and to compilers.
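The language already has one form of initialize-on-first-use: block-scope statics. The sketch below is only an in-language analogue of the idea, not the ELF or compiler change described above, and the names `Codec`, `codec` and `g_init_count` are hypothetical.

```cpp
#include <cassert>

int g_init_count = 0;  // counts how many times the setup code ran

struct Codec {
    Codec() { ++g_init_count; }  // stands in for expensive or risky setup
    int level = 6;
};

// Block-scope static: constructed the first time control passes through the
// declaration, not at load time. Initialization is thread-safe since C++11.
Codec& codec() {
    static Codec instance;
    return instance;
}

// Usage: nothing runs until the first call, and it runs only once.
void first_use_demo() {
    assert(g_init_count == 0);
    codec();
    assert(g_init_count == 1);
    codec();
    assert(g_init_count == 1);
}
```

Getting the same effect for namespace-scope statics across a whole library is the part that would need toolchain and dynamic linker support.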

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 12:07 UTC (Thu) by khim (subscriber, #9252)

> there's no specification that prevents this

There absolutely is a specification that prevents it. It's called the System V Application Binary Interface.

And the C++ standard is pretty explicit about it, too: “It is implementation-defined whether the dynamic initialization of a non-block non-inline variable with static storage duration is sequenced before the first statement of main or is deferred.”

For better or for worse, ELF has picked the “sequenced before the first statement of main” option.

You couldn't just go and change it without defining a new platform.

And if you are defining a new platform and asking everyone to adopt it anyway, then you may as well declare that this particular part of the standard doesn't apply to that new platform.

But that wouldn't help apps that are written for the existing platform.

It's incredibly common for C++ programs to rely on a registry that is filled in by global constructors (think Abseil Flags), and you may expect significant pushback to such a proposal.
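For anyone who hasn't met the pattern, here is a minimal sketch of such a self-registering registry. It is written in the general style of flag libraries but is not Abseil's actual API; `flag_registry`, `FlagRegistrar`, `DEFINE_DEMO_FLAG` and `lookup_flag` are all hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>

// The registry is built on first use, so registrars in any translation unit
// can safely add to it during static initialization.
std::map<std::string, std::string>& flag_registry() {
    static std::map<std::string, std::string> registry;
    return registry;
}

// Constructing a registrar has the side effect of registering a flag.
struct FlagRegistrar {
    FlagRegistrar(const char* name, const char* default_value) {
        flag_registry().emplace(name, default_value);
    }
};

// Each use defines a global object whose constructor runs before main().
#define DEFINE_DEMO_FLAG(name, default_value) \
    static FlagRegistrar registrar_for_##name(#name, default_value)

DEFINE_DEMO_FLAG(verbose, "false");
DEFINE_DEMO_FLAG(threads, "4");

// Returns the registered default, or an empty string for unknown flags.
std::string lookup_flag(const std::string& name) {
    auto it = flag_registry().find(name);
    return it == flag_registry().end() ? std::string() : it->second;
}

// Usage: main() can parse the command line against a registry that is
// already complete, although nothing ever called a registration function.
void registry_demo() {
    assert(flag_registry().size() == 2);
    assert(lookup_flag("threads") == "4");
}
```

With deferred initialization, a flag defined in a translation unit that main never otherwise references would simply be missing from the registry, which is why programs built this way would break.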

OpenBSD may do that, they don't care about being usable by Joe Average and they may just patch all the programs they care about. Linux couldn't.

> it's just a lot of changes to make to ELF and to compilers

It's the definition of a new platform, first and foremost. It's not normal GNU/Linux anymore.

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 17:22 UTC (Thu) by epa (subscriber, #39769)

I was envisaging that you’d also have to specify all the libraries you link against. So if libsystemd requires libxz, you’d need -lsystemd -lxz on the command line when building something that uses libsystemd. No hidden transitive dependencies.

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 17:27 UTC (Thu) by farnz (subscriber, #17727)

That opens up a different problem; if only the main binary can indicate dependencies (so no indirect dependencies), how do you remove your liblzma dependency when libsystemd stops depending on it? How do you cope when a future version of libsystemd adds a dependency on libzstd2?

Remember that we're talking about the dynamic linker at this point, not the static linker - so you can't rely on the version of libsystemd you used at compile time being the one used at runtime.

Free software's not-so-eXZellent adventure

Posted Apr 4, 2024 18:54 UTC (Thu) by epa (subscriber, #39769)

You’re right. Removing the dependency would be a bit tardy. But having to add it explicitly is kind of a feature and what I intended. So if your program starts to link in libxz, that’s because you explicitly okayed it and not just a hidden transitive dependency of some other project.

In C programming, there are those who argue that header files should not include other header files. Rather, if a header needs something then each user should #include its dependencies explicitly. I don’t have a view on the wisdom of that for C development but it’s a similar idea to what I am suggesting at the linking step.

Perhaps given a difference between “safe linking”, which can only define new symbols and not replace them or execute arbitrary code at startup, and the full-fat kind that can do things merely by including a library, transitive dependencies could be included by default if they are “safe”, but if you have a dependency of a dependency which wants to have ifunc hooks or startup code, you must explicitly ask for it or the linker will complain.

Free software's not-so-eXZellent adventure

Posted Apr 5, 2024 9:38 UTC (Fri) by smurf (subscriber, #17840)

> In C programming, there are those who argue that header files should not include other header files. Rather, if a header needs something then each user should #include its dependencies explicitly.

There's a reason for this attitude, which is that C is somewhat brain dead and doesn't have namespaces, an explicit list of things included files export, or anything else along these lines. Thus any random macros which X exports to Y are visible in Z, and the programmer working on Z should be aware of that.

Fortunately many public include files of non-system headers use prefixed symbols, and the linker has gotten better at complaining when globals collide. Thus this is not much of an issue.

On the other hand, this practice severely hampers upwards compatibility. When A includes B which includes C which requires a new system header D, should we force A to add some random include file? Not really, IMHO.

Free software's not-so-eXZellent adventure

Posted Apr 7, 2024 17:19 UTC (Sun) by mathstuf (subscriber, #69389)

> It'd be better to take advantage of the fact that, at its strongest, the C++ standard has required statics to be initialized before the first reference by the program to a symbol from the translation unit the static is defined in.

FWIW, macOS does this, so while it is not compatible with the SysV ABI, as khim points out, anything cross-platform that needs reliable static initialization needs to handle this case already. I'd be fine with an opt-in compile flag to make this doable in ELF…