A backdoor in xz

Posted Mar 30, 2024 18:23 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
In reply to: A backdoor in xz by bluca
Parent article: A backdoor in xz

This is an EXTREMELY bad move from systemd. A dlopen() is a much more worrying signal of exploitation, because it's so rarely used. And libsystemd will make it normal.

It also won't close off all avenues of attack. A malicious library can patch the code, ptrace() its process, modify the environment, etc.


A backdoor in xz

Posted Mar 30, 2024 19:12 UTC (Sat) by andresfreund (subscriber, #69562) [Link] (16 responses)

There already are dlopens in things like sshd, via e.g. PAM.

A backdoor in xz

Posted Mar 30, 2024 19:36 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (14 responses)

Yeah, and I also forgot about the horror of nsswitch.

Still, we should start cutting back on this kind of nonsense.

A backdoor in xz

Posted Mar 31, 2024 13:46 UTC (Sun) by nix (subscriber, #2304) [Link] (13 responses)

So to you, dlopen is a signal of exploitation and should be avoided because it's so rare, until it is pointed out that it's not rare and is already used in a wide variety of processes, whereupon you switch to calling unclear things 'this kind of nonsense', cite nsswitch (which is not relevant, given that PAM is at issue here), and suggest, what? Removing PAM and nsswitch?

That's going to work really well given how many sites use both to fold in new hostname lookup mechanisms, new user lookup mechanisms, and new and fairly complex authentication patterns on the fly.

Anyway, dropping nsswitch and PAM wouldn't even really help, despite being immensely disruptive. dlopen does have its problems[1] and it is reasonable to prefer to avoid it when possible, but it is not rare even in the absence of nsswitch and PAM. Try adding reporting to glibc to see how often it's invoked on real running systems. (It's a *lot*. Even syslog daemons make extensive use of it these days, so you can't even say "perhaps daemons running as root can't dlopen".)

You cannot use 'this uses dlopen' as a signal of suspiciousness, or of anything really, any more than 'this is dynamically linked' is such a signal. "This has IFUNC resolvers that redirect symbols in other libraries" is definitely an actual sign of badness that I've never heard of anything legitimate doing, and I'm wondering if glibc could detect and block that somehow without too much cost (it would at least involve stack frame walks, but the resolver has to mess with the stack frame anyway...)

[1] now that prelink is dead, mostly that you can't use ldd to statically determine what the shared library dep tree is and what things might be potentially impacted by ABI changes, which *is* actually problematic on real systems

A backdoor in xz

Posted Mar 31, 2024 16:12 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (12 responses)

> So to you, dlopen is a signal of exploitation and should be avoided because it's so rare, until it is pointed out that it's not rare and is already used in a wide variety of processes, whereupon you switch to calling unclear things 'this kind of nonsense', cite nsswitch (which is not relevant, given that PAM is at issue here), and suggest, what? Removing PAM and nsswitch?

Yeah, exactly. Remove dlopen() calls by refactoring the relevant systems. For example, musl libc does not have nsswitch (and has a built-in NSCD). PAM is already optional.

A backdoor in xz

Posted Mar 31, 2024 17:02 UTC (Sun) by nix (subscriber, #2304) [Link] (11 responses)

So... that this wouldn't actually help solve this problem is not important to you, then? (You clipped that out of my original reply without comment.)

That's a sign of someone on a hobby-horse if I ever heard of one.

(As someone who needs PAM to even log on -- on account of wanting to use YubiKey OTP to do so -- and who uses nsswitch for a variety of homebrewed lookups, I would obviously not be willing to drop either.)

A backdoor in xz

Posted Apr 1, 2024 12:19 UTC (Mon) by foom (subscriber, #14868) [Link] (10 responses)

Nsswitch has an obvious replacement for dlopen: sockets. They're already used in many interesting scenarios, e.g. host lookup is via DNS to localhost, user database often comes from libnss_ldapd or sssd — both of which simply implement a private socket protocol in their nsswitch library to talk to their corresponding service on localhost.

Then of course there's nscd, as already mentioned: a socket protocol for nsswitch lookups already implemented by glibc and musl. Someone could implement a different nscd server-side which doesn't use dlopen — without even modifying glibc. Yet, as far as I know, nobody actually has done so.
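
As a rough illustration of the socket approach (this is not the actual nscd wire protocol), a lookup service can run outside the client and answer queries over a socket, so the client never loads lookup code into its own address space. The line-based protocol, the `USERS` table, and every name below are assumptions made up for this sketch; a thread and `socketpair()` stand in for a separate daemon.

```python
import socket
import threading

# Hypothetical user database; a real service would consult LDAP, files, etc.
USERS = {"alice": 1000, "bob": 1001}


def serve(conn):
    """Answer 'getpwnam <name>' lines until the client closes the socket."""
    with conn:
        reader = conn.makefile("r", encoding="utf-8")
        writer = conn.makefile("w", encoding="utf-8")
        for line in reader:
            parts = line.split()
            if len(parts) == 2 and parts[0] == "getpwnam" and parts[1] in USERS:
                writer.write(f"ok {USERS[parts[1]]}\n")
            else:
                writer.write("notfound\n")
            writer.flush()


class LookupClient:
    """Client side: only socket I/O, no plugin code loaded in-process."""

    def __init__(self, sock):
        self._sock = sock
        self._reader = sock.makefile("r", encoding="utf-8")

    def getpwnam(self, name):
        # One request line out, one reply line back.
        self._sock.sendall(f"getpwnam {name}\n".encode("utf-8"))
        reply = self._reader.readline().split()
        return int(reply[1]) if reply[:1] == ["ok"] else None


def start_demo_client():
    # socketpair() plus a thread stands in for a daemon on a Unix socket.
    client_sock, server_sock = socket.socketpair()
    threading.Thread(target=serve, args=(server_sock,), daemon=True).start()
    return LookupClient(client_sock)


if __name__ == "__main__":
    client = start_demo_client()
    print(client.getpwnam("alice"))
    print(client.getpwnam("mallory"))
```

The trade-off is the one raised elsewhere in this thread: every lookup becomes a round trip to another process, which costs more than an in-process call.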

On the PAM side there's no similarly easy replacement, though one could investigate OpenBSD's BSD Auth system, which is extensible via spawning subprocesses to handle auth tasks.

In any case, that nobody seems to actually be working on any of this probably shows just how unimportant avoiding dlopen is for most people...

A backdoor in xz

Posted Apr 1, 2024 16:48 UTC (Mon) by nix (subscriber, #2304) [Link] (9 responses)

Avoiding the need to dlopen in statically linked binaries, while not losing nsswitch for such binaries, actually *does* matter to upstream (it would simplify ld.so a whole hell of a lot). So switching to a socket-based protocol is definitely on the cards.

The problem, as ever, is doing that compatibly -- but I suppose if glibc itself provided the 'nss server' that loaded existing nss modules and did everything else nsswitch did, and glibc called into it using the sort of thing you describe, this sort of thing might be practical: it would probably make nscd less of a horror show, too. With a lot of work (how many nss modules depend on being in the same address space as the running process, for starters? I bet it's not zero. And I bet this would slaughter performance for simpler cases, so maybe nss_files still needs to be built in. And so forth...)

A backdoor in xz

Posted Apr 1, 2024 18:04 UTC (Mon) by foom (subscriber, #14868) [Link] (1 responses)

An "nss server" is literally what nscd _already is_!

If you run the nscd service, then glibc sends nss lookups to nscd over a socket, instead of running them inside other binaries.

Nscd comes with a caching layer (unsurprisingly given its name), but you can mostly disable that if you only want the nss-server functionality.

A backdoor in xz

Posted Apr 2, 2024 17:10 UTC (Tue) by nix (subscriber, #2304) [Link]

Oh, of course it is. I am clearly missing the obvious right now :(

A backdoor in xz

Posted Apr 1, 2024 18:23 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> Avoiding the need to dlopen in statically linked binaries, while not losing nsswitch for such binaries, actually *does* matter to upstream (it would simplify ld.so a whole hell of a lot). So switching to a socket-based protocol is definitely on the cards.

glibc is the worst library in existence, so no wonder.

On the other hand, musl libc simply uses the nscd protocol to provide the NSS functionality and even allows wrapping legacy NSS modules: https://github.com/pikhq/musl-nscd

Additionally, with musl I can _already_ get a fully static system with zero dlopen()s or dynamic libraries. There are even several experimental distros that are fully statically linked. E.g.: https://framagit.org/Ypnose/solyste

A backdoor in xz

Posted Apr 2, 2024 17:12 UTC (Tue) by nix (subscriber, #2304) [Link] (5 responses)

> glibc is the worst library in existence, so no wonder.

At this point I'm wondering if you're just being intentionally unpleasant. glibc navigates a frankly horrifying pile of tradeoffs and does the job fairly well given that. If it were "the worst library in existence" it would not be *remotely* so widely used, nor work as well as it does.

A backdoor in xz

Posted Apr 2, 2024 22:54 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

I've been holding this opinion about glibc for decades now. I understand the difficulty of developing glibc, and it excuses at least some warts. But then we have musl which is so much nicer, while being standards-compliant.

I believe we should take at least _some_ of that experience and apply it to the rest of the system. Being static-friendly and not dlopen()-ing stuff is definitely a part of that.

BTW, does dlopen() in libsystemd preclude its static linking?

A backdoor in xz

Posted Apr 3, 2024 11:17 UTC (Wed) by nix (subscriber, #2304) [Link] (3 responses)

> BTW, does dlopen() in libsystemd preclude its static linking?

In the future in glibc, yes. In all other libcs I'm aware of, yes, even now. (Or, rather, you can *try* to call it in statically-linked binaries, but the call will always fail.)

This is of course one of many reasons why just statically linking everything is not the panacea some seem to think -- plugins really *are* a thing and sometimes loadable shared code in the same address space is a convenient way to implement them... there's not a chance you'll ever get KDE to work statically linked, for instance.

A backdoor in xz

Posted Apr 3, 2024 16:30 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> This is of course one of many reasons why just statically linking everything is not the panacea some seem to think -- plugins really *are* a thing

Plugins are a thing that has no business being in the foundational parts of the runtime. And it's not like we don't have a real-world example of a system without them: Alpine Linux exists. And it's significantly nicer to work with than the glibc-based systems.

A backdoor in xz

Posted Apr 4, 2024 12:49 UTC (Thu) by nix (subscriber, #2304) [Link] (1 responses)

> Plugins are a thing that has no business being in the foundational parts of the runtime.

I am not convinced, and since as usual you didn't bother to give any reasons, relying instead on pure assertion, I'm not sure why you think this not-an-argument would ever convince anyone who didn't already agree with you.

Why on earth would you consider name lookup or authentication, both things that have had numerous wildly divergent implementations over time and which obviously have different site-by-site requirements, hence the *existence* of pluggable systems to implement them, to be things that "have no business" existing, based on the pure assertion that they are "in the foundational parts of the runtime"? People are *using* nss and PAM's extensibility, you know. They're not just there to annoy you. This is not a moribund module system with a half-dozen stale modules that have hardly changed in the last twenty years. People are plugging other things into that pluggability. (Not that this attack even *relied* on that pluggability, or NSS, or PAM, so why you think ripping them out will help here is quite beyond me.)

For that matter, what on earth even is a "foundational part of the runtime"? Is it the toolchain? Surely that counts if anything does! Better rip out LTO from GCC and clang then, since both rely on linker plugins that run the entire compiler! (Also, how many linker plugins are there? I can hardly name any but LTO. That's gotta be moribund, rip it out!) Is it the kernel? Better rip out kernel modules then, in-tree or not, since if they're not dynamically loaded plugins, nothing is... is it glibc? Surely not, since you can replace it with any other libc you like and keep the kernel and most userspace the same after a recompile: it could hardly be foundational! So I guess NSS can stay. Not sure about PAM, the idea came from Solaris and you have in the past expressed a liking for that sort of thing so maybe that's good now too?

That's the problem with arguing by pure assertion: since you give no reasons, define none of your terms, and provide no grounds to agree with you, there's no reason to accept your premises: and even if I do, it's easy to argue in the exact opposite direction since the premises are so vague, which makes your argument nothing more than a statement of personal preferences and an assertion that of course *your* personal preferences are more important than anyone else's.

Is the real definition "something Cyberax asserts without argument or rationale is foundational"? Or just "Cyberax is right"?

A backdoor in xz

Posted Apr 4, 2024 17:24 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

I thought that the reasons for NOT doing plugins were obvious. They add a huge amount of complexity, preclude useful mitigations (mseal/mimmutable), and make the system harder to analyze. You can't statically determine the dependency closure anymore.
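
To make the "dependency closure" point concrete, here is a minimal sketch that computes the transitive closure of a dependency graph twice: once from declared (DT_NEEDED-style) edges only, which is roughly what an ldd-style analysis sees, and once with runtime dlopen() edges added. Both graphs are written by hand for illustration and model the post-change libsystemd arrangement discussed in this thread; they are not extracted from any real system.

```python
# Illustrative, hand-written graphs (assumptions, not real ELF data).
DT_NEEDED = {
    "sshd": ["libsystemd", "libc"],
    "libsystemd": ["libc"],
    "liblzma": ["libc"],
    "libc": [],
}
# Edges that only exist at run time, via dlopen().
DLOPEN = {"libsystemd": ["liblzma"]}


def closure(root, *edge_maps):
    """Return every library reachable from root through the given edge maps."""
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        for edges in edge_maps:
            for dep in edges.get(node, []):
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
    return seen


static_view = closure("sshd", DT_NEEDED)           # what static analysis sees
runtime_view = closure("sshd", DT_NEEDED, DLOPEN)  # what may actually load

if __name__ == "__main__":
    print(sorted(static_view))
    print(sorted(runtime_view - static_view))
```

The difference between the two sets is exactly the part of the closure that static tooling cannot report.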

Plugins inherently face a complicated environment that they don't control and should not perturb too much. And a crashed plugin will take down the entire application. This was reasonable 30 years ago, but it's not anymore. These days, we actually have a good architectural pattern for this: split modules into a separate daemon that is activated by systemd as needed.

> People are *using* nss and PAM's extensibility, you know.

NSS is actually hardly used these days; NIS/NIS+ have mostly died out. The only major surviving service is LDAP (usually via SSSD). It can simply be incorporated into glibc (it's 43kb), or it can be split into a daemon that talks to glibc via the NSCD protocol.

If we're talking about PAM in particular, then it's nothing but a stack of bad design decisions. In the case of SSH, it can be replaced by ephemeral SSH certificates for most scenarios (e.g. a shared machine in a university, or management access to a production cluster on AWS EC2).

These two items will make most non-interactive systems completely dlopen()-free.

A backdoor in xz

Posted Apr 1, 2024 14:41 UTC (Mon) by job (guest, #670) [Link]

In retrospect, I think most people would agree that the design of PAM was a mistake. It was hugely controversial at the time and many fought against its inclusion in distributions.

In the end it was included because it made possible some use cases where no one else stepped up to make a practical alternative.

I don't think that is something we want to emulate. It is certainly possible to satisfy the necessary use cases without resorting to dlopen().

A backdoor in xz

Posted Mar 31, 2024 12:12 UTC (Sun) by bluca (subscriber, #118303) [Link] (6 responses)

The main reason this is done and will happen is to reduce mandatory dependencies. If ELF on Linux had better support for optional dependencies that are loaded only when needed, there wouldn't be any need to call dlopen() manually. I believe OSX's shared object format implements this. But we are where we are, and hence that's the only mechanism we've got.

A backdoor in xz

Posted Mar 31, 2024 13:49 UTC (Sun) by nix (subscriber, #2304) [Link]

Hmm. That's interesting! This is kind of a DT_NEEDED which kicks in (and loads dependent libs, runs constructors etc) only when the first symbol in it is called, kind of like lazy binding but doing a lot more than just a symbol resolution?

That's tricky to implement (because doing things in the resolver is *always* a bit tricky) but I can't immediately think of any reason why it's *impossible*. It would need a new dynamic tag of course, DT_LAZY_NEEDED? DT_NEEDED_OPTIONAL?

You couldn't use the simpleminded approach above for everything (good luck making this work for things like data symbols where the GOT is needed before the PLT or in general anywhere you couldn't have used lazy binding before, or where you need the shared library's ELF constructors to run early, or where TLS inadequacies would prevent dlopen from working happily -- and it has the same security implications as using lazy binding) but it should work in a fairly large proportion of cases.

A backdoor in xz

Posted Mar 31, 2024 16:13 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> The main reason this is done and will happen is to reduce mandatory dependencies

No, it's not. It's done to _paper_ over dependencies, making them harder to discover statically and creating wonderful race conditions if mimmutable() is used at an inopportune moment. It's an all-around bad decision.

A backdoor in xz

Posted Mar 31, 2024 17:06 UTC (Sun) by nix (subscriber, #2304) [Link] (3 responses)

Since mimmutable() does not exist on Linux, making changes in Linux-only software like systemd to allow for it seems deeply bizarre, particularly when those changes *reduce* security (like, say, increasing the set of always-loaded libraries to include some which have just been seen to launch attacks when loaded, rather than loading as many as possible of them only as needed).

What next? Shall we make changes to allow for Windows's per-libc malloc(), or for Linux's not-at-all-planned upcoming transition to Mach-O?

A backdoor in xz

Posted Mar 31, 2024 18:54 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> Since mimmutable() does not exist on Linux

This is subject to change: https://lwn.net/Articles/958438/

> particularly when those changes *reduce* security

They don't. libsystemd will _still_ depend on xz, it will just be hidden from cursory analysis.

> What next? Shall we make changes to allow for Windows's per-libc malloc(),

That's actually a pretty good idea; it would make several classes of vulnerabilities more difficult to exploit.

> or for Linux's not-at-all-planned upcoming transition to Mach-O?

I'd take PE: https://blog.hiler.eu/win32-the-only-stable-abi/

A backdoor in xz

Posted Mar 31, 2024 19:36 UTC (Sun) by nix (subscriber, #2304) [Link] (1 responses)

> They don't. libsystemd will _still_ depend on xz, it just will be hidden from cursory analysis.

I honestly wonder if you're even reading this thread. This attack depended on liblzma being loaded into sshd's memory because it was loaded by virtue of DT_NEEDED: after this commit, it would not be loaded at all, because libsystemd would only have loaded it if compressed journal reading was attempted, which sshd never attempts.

So it *would* in fact solve the problem.

But I'm tired of arguing with a brick wall with prejudged opinions, I think. Good night.

A backdoor in xz

Posted Mar 31, 2024 20:16 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

> This attack

Reread your words. THIS attack. As in, this _particular_ one. Sure, having the library dlopen()-ed prevents it. I can think of several ways I can backdoor liblzma to work around it.

Making the system usable with mimmutable/mseal would prevent whole categories of exploits. And promoting the dlopen() craze will make this kind of mitigation impossible.

And yeah, I absolutely hate the braindead design of nsswitch, PAM, and now libsystemd.

