A backdoor in xz
Posted Mar 29, 2024 20:01 UTC (Fri) by cjwatson (subscriber, #7322) In reply to: A backdoor in xz by bkw1a
Parent article: A backdoor in xz
That said, it's rather tempting to amend that patch to talk to NOTIFY_SOCKET directly rather than by linking against libsystemd, just to reduce exposure to gadgets like this.
Posted Mar 29, 2024 20:10 UTC (Fri)
by cjwatson (subscriber, #7322)
[Link] (18 responses)
Apparently unreleased versions of systemd dlopen liblzma instead, which would have meant it wasn't in sshd's process space.
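As a rough illustration of why loading on demand keeps a library out of a process that never uses it, here is a small Python analogy. It is not systemd's code, and `read_journal_record` is a hypothetical name. Python's `lzma` module, and with it the liblzma binding, is only loaded when a compressed record is actually read:

```python
def read_journal_record(blob: bytes, compressed: bool) -> bytes:
    """Return the payload of a (hypothetical) journal record."""
    if not compressed:
        # Callers that never see compressed records never load lzma.
        return blob
    # Deferred import: the extension module that wraps liblzma is
    # loaded only once this branch runs, similar in spirit to dlopen().
    import lzma
    return lzma.decompress(blob)
```

A caller that only ever passes `compressed=False` behaves like sshd in this scenario: the decompression library is never brought into its address space.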
Posted Mar 29, 2024 20:28 UTC (Fri)
by intelfx (subscriber, #130118)
[Link] (3 responses)
I don't think any of that code is needed. OpenSSH as patched only needs sd_listen_fds() and plain sd_notify() which _as used_ can be implemented in about 5-10 lines of C code each.
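For illustration, the socket-activation half of that could look roughly like the Python sketch below. It assumes the documented convention that systemd sets `LISTEN_PID` and `LISTEN_FDS` and passes descriptors starting at 3. The function name and its parameters are made up for this example; it ignores `LISTEN_FDNAMES` and does not unset the variables afterwards.

```python
import os

SD_LISTEN_FDS_START = 3  # first passed descriptor, per the systemd convention


def listen_fds(environ=os.environ, pid=None):
    """Return the list of inherited listening fds, or [] if none apply."""
    pid = os.getpid() if pid is None else pid
    try:
        # The variables only count if they were meant for this process.
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        # Missing or malformed variables: not socket-activated.
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + max(count, 0)))
```

Passing `environ` and `pid` explicitly is only there to make the sketch easy to try without being started by systemd.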
Posted Mar 29, 2024 20:35 UTC (Fri)
by cjwatson (subscriber, #7322)
[Link] (2 responses)
Posted Mar 30, 2024 1:12 UTC (Sat)
by zdzichu (subscriber, #17118)
[Link]
Posted Mar 30, 2024 6:50 UTC (Sat)
by intelfx (subscriber, #130118)
[Link]
Yep, that's why I tried to emphasize "as used". The implementation you see is shared between several mostly-disjoint users (e. g. it is also used to communicate with hypervisors via vsock) and also implements other features of this ad-hoc protocol (such as fd passing) which are not used in openssh.
The usage in openssh (to signal readiness) is covered by writing a fixed, static text string into an AF_UNIX datagram socket pointed to by the $NOTIFY_SOCKET variable.
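A minimal sketch of that readiness message in Python, assuming the usual `READY=1` payload and the `@` prefix convention for abstract socket names. `notify_ready` is a hypothetical helper, not an API from libsystemd or OpenSSH:

```python
import os
import socket


def notify_ready(state: bytes = b"READY=1") -> bool:
    """Send a readiness datagram to $NOTIFY_SOCKET; return False if unset."""
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not running under a notify-aware supervisor
    if path.startswith("@"):
        # Abstract-namespace socket: leading NUL instead of '@'.
        path = "\0" + path[1:]
    elif not path.startswith("/"):
        return False  # only absolute paths or abstract names are handled here
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state, path)
    return True
```

No fd passing, credentials, or vsock support is needed for this use, which is the point being made above.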
Posted Mar 29, 2024 21:05 UTC (Fri)
by judas_iscariote (guest, #47386)
[Link] (13 responses)
Posted Mar 30, 2024 11:04 UTC (Sat)
by fenncruz (subscriber, #81417)
[Link] (12 responses)
Posted Mar 30, 2024 12:02 UTC (Sat)
by bluca (subscriber, #118303)
[Link] (7 responses)
Posted Mar 30, 2024 14:12 UTC (Sat)
by smurf (subscriber, #17840)
[Link]
Posted Mar 30, 2024 15:27 UTC (Sat)
by dskoll (subscriber, #1630)
[Link] (1 responses)
I understand the advantages of the dlopen approach, but it still leaves me feeling uneasy. You might get shared libraries that you don't expect dlopened just by making an innocent API call.
It seems to me that the supervisor notification protocol is likely to be used by many programs, and also quite likely that they might not want anything else from libsystemd. Wouldn't it make sense to put the notification client code in its own shared library that has no external dependencies and won't dlopen anything else ever?
Posted Mar 30, 2024 15:52 UTC (Sat)
by zdzichu (subscriber, #17118)
[Link]
Posted Mar 30, 2024 18:36 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted Mar 30, 2024 19:14 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link] (2 responses)
Posted Mar 30, 2024 19:41 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
> Dlopen() doesn't change any of that?
Indeed it doesn't (right now), but expanding its usage will make it harder to enable something like mseal() later.
Posted Mar 31, 2024 13:13 UTC (Sun)
by bluca (subscriber, #118303)
[Link]
Posted Mar 30, 2024 16:53 UTC (Sat)
by judas_iscariote (guest, #47386)
[Link] (3 responses)
Posted Mar 30, 2024 19:05 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
I'm somewhat surprised that nobody called for glibc's rtld-audit infrastructure to be removed. That's really what made this attack possible despite relro. As far as I know, it's not used widely.
Posted Mar 31, 2024 13:30 UTC (Sun)
by nix (subscriber, #2304)
[Link]
Posted Mar 31, 2024 6:37 UTC (Sun)
by epa (subscriber, #39769)
[Link]
Posted Mar 29, 2024 20:15 UTC (Fri)
by bkw1a (subscriber, #4101)
[Link] (4 responses)
Posted Mar 29, 2024 20:23 UTC (Fri)
by cjwatson (subscriber, #7322)
[Link] (2 responses)
Posted Mar 29, 2024 22:02 UTC (Fri)
by dilinger (subscriber, #2867)
[Link] (1 responses)
On a lot of desktops, sshd isn't even installed. Is it critical security infrastructure because it's installed on some servers you consider important? What about the other daemons installed on important servers, like nginx/apache (and often the whole lamp stack)?
If you actually look at attack vectors, you start realizing pretty quickly that A LOT of software could (or should) be considered critical security infrastructure, and it's pretty unrealistic to not have to patch all of those bits of software to work on Debian's many desktop/server environments and hardware architectures. That also assumes that we can trust upstreams to not backdoor their code, which, as this example shows us, we clearly cannot.
Posted Apr 3, 2024 5:44 UTC (Wed)
by Lennie (subscriber, #49641)
[Link]
Posted Mar 29, 2024 23:58 UTC (Fri)
by mcatanzaro (subscriber, #93033)
[Link]
Posted Mar 29, 2024 23:38 UTC (Fri)
by cjwatson (subscriber, #7322)
[Link]
Posted Mar 30, 2024 1:11 UTC (Sat)
by DimeCadmium (subscriber, #157243)
[Link] (48 responses)
Posted Mar 30, 2024 1:40 UTC (Sat)
by bluca (subscriber, #118303)
[Link] (47 responses)
Posted Mar 30, 2024 5:30 UTC (Sat)
by wtarreau (subscriber, #51152)
[Link] (11 responses)
Posted Mar 30, 2024 5:48 UTC (Sat)
by rra (subscriber, #99804)
[Link] (10 responses)
Posted Mar 30, 2024 8:12 UTC (Sat)
by DimeCadmium (subscriber, #157243)
[Link] (9 responses)
Posted Mar 30, 2024 8:23 UTC (Sat)
by mb (subscriber, #50428)
[Link] (4 responses)
The real problem is that patches that have not been understood/reviewed have been applied.
Posted Mar 30, 2024 12:46 UTC (Sat)
by stef70 (guest, #14813)
[Link] (1 responses)
On my Debian system, liblzma.so is linked in several programs and libraries. A lot are unrelated to systemd: grub, insmod, lvm, reboot, gimp, imagemagick, runlevel, ...
All of them are potential targets for that xz backdoor. For now, we have to wait for the full analysis. I am pretty optimistic that sshd was the main target because installing another backdoor on the system or calling "home" would significantly increase the probability of detection.
Posted Mar 30, 2024 23:33 UTC (Sat)
by brooksmoses (guest, #88422)
[Link]
[Reference: https://github.com/Midar/xz-backdoor-documentation/wiki#s... as of the time of this comment.]
Posted Mar 31, 2024 1:25 UTC (Sun)
by DimeCadmium (subscriber, #157243)
[Link] (1 responses)
Ah, okay. And how exactly do you believe that one method of notification is any more reliable at this than any other? They all rely on the software developer picking a good time to say "started".
> But let's not distract from the discussion: systemd is *not* why this backdoor was possible
It absolutely is.
> It could have been any other library
But it wasn't. "Don't worry about our vulnerabilities, other people have vulnerabilities too!" "Don't worry about our bad design, other people have bad design too!"
Posted Mar 31, 2024 9:22 UTC (Sun)
by smurf (subscriber, #17840)
[Link]
They all rely on picking a good time that happens to *work*.
There are plenty of situations where, once you're *really* started, it's no longer possible to signal "OK I'm alive now" by double-forking.
Writing a PID file has its own class of race conditions, the handling of which I can guarantee most users of that method get fatally wrong.
And so on.
> "Don't worry about our vulnerabilities, other people have vulnerabilities too!" "Don't worry about our bad design, other people have bad design too!"
Don't blame the messenger. If linking to a library you don't strictly need *in your particular situation* is a "vulnerability" or "bad design" I can guarantee that 90+% of programs out there suffer from it.
Posted Mar 30, 2024 9:53 UTC (Sat)
by motk (guest, #51120)
[Link]
This whole thing has nothing to do with service management, and everything to do with large corporations relying on volunteers writing critical software apparently just for something to do.
Posted Mar 30, 2024 16:58 UTC (Sat)
by rra (subscriber, #99804)
[Link] (2 responses)
I have run UNIX systems throughout those literal decades that you are talking about, and your faith in this half-assed, failure-prone mechanism is badly misplaced. I cannot count the number of ways I have seen this fail: the process does not actually start listening to the network until after the fork, the process starts listening before the fork but isn't really ready to accept connections because there is setup that has to be done after the fork, the process forks but doesn't fork twice and thus isn't properly reparented, the process didn't write a PID file and now you have no idea which process is actually running the service, the process did write a PID file and wrote the wrong PID to that file, you end up with multiple backgrounded copies of the same service running and interfering weirdly with each other... the list goes on.
We figured out that this was a bad way to run services by at least the early 2000s, when support for a foreground model with none of this self-daemonization nonsense badly copied into every service became widely available (and as someone who was managing UNIX systems all through that period, that was a delightful revelation). But you do not want to assume that the service is ready simply because the process has started. You need some mechanism for signaling that the service really has fully started, has allocated all of its resources, and is listening to network connections (if that is its job). Otherwise, you risk starting services that depend on it too soon.
Even upstart (the alternative preferred by some of the folks who disliked systemd) had a mechanism for doing this. (It was worse than systemd's, at least in my opinion.)
Posted Mar 31, 2024 1:25 UTC (Sun)
by DimeCadmium (subscriber, #157243)
[Link] (1 responses)
Posted Mar 31, 2024 1:45 UTC (Sun)
by corbet (editor, #1)
[Link]
Posted Mar 30, 2024 7:09 UTC (Sat)
by epa (subscriber, #39769)
[Link] (34 responses)
If the answer is “because it links as a C library and you get the transitive dependencies of everything”, that’s something to improve.
Posted Mar 30, 2024 7:32 UTC (Sat)
by mb (subscriber, #50428)
[Link] (1 responses)
So, statically link with LTO?
Posted Mar 31, 2024 14:23 UTC (Sun)
by dskoll (subscriber, #1630)
[Link]
No, static linking isn't needed. Just split the large libsystemd into smaller libraries where each smaller library contains a set of closely-related APIs and minimal other dependencies. There's no reason to pull code in to do log compression if all you need is code for the sd_notify protocol.
Posted Mar 30, 2024 7:50 UTC (Sat)
by cjwatson (subscriber, #7322)
[Link] (31 responses)
Posted Mar 30, 2024 10:24 UTC (Sat)
by job (guest, #670)
[Link] (30 responses)
Posted Mar 30, 2024 12:07 UTC (Sat)
by bluca (subscriber, #118303)
[Link] (25 responses)
Dependency chain of a full-feature build of libsystemd from main (plus a PR under review):
build/libsystemd.so.0 (interpreter => None)
We want to remove the need for libcap too, but that's a bit more complex.
Posted Mar 30, 2024 18:23 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (24 responses)
It also won't close off all avenues of attack. A malicious library can patch the code, ptrace() its process, modify the environment, etc.
Posted Mar 30, 2024 19:12 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link] (16 responses)
Posted Mar 30, 2024 19:36 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (14 responses)
Still, we should start cutting back on this kind of nonsense.
Posted Mar 31, 2024 13:46 UTC (Sun)
by nix (subscriber, #2304)
[Link] (13 responses)
That's going to work really well given how many sites use both to fold in new hostname lookup mechanisms, new user lookup mechanisms, and new and fairly complex authentication patterns on the fly.
Anyway, dropping nsswitch and PAM wouldn't even really help, despite being immensely disruptive. dlopen does have its problems[1] and it is reasonable to prefer to avoid it when possible, but it is not rare even in the absence of nsswitch and PAM. Try adding reporting to glibc to see how often it's invoked on real running systems. (It's a *lot*. Even syslog daemons make extensive use of it these days, so you can't even say "perhaps daemons running as root can't dlopen".)
You cannot use 'this uses dlopen' as a signal of suspiciousness, or of anything really, any more than 'this is dynamically linked' is such a signal. "This has IFUNC resolvers that redirect symbols in other libraries" is definitely an actual sign of badness that I've never heard of anything legitimate doing, and I'm wondering if glibc could detect and block that somehow without too much cost (it would at least involve stack frame walks, but the resolver has to mess with the stack frame anyway...)
[1] now that prelink is dead, mostly that you can't use ldd to statically determine what the shared library dep tree is and what things might be potentially impacted by ABI changes, which *is* actually problematic on real systems
Posted Mar 31, 2024 16:12 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (12 responses)
Yeah, exactly. Remove dlopen() calls by refactoring the relevant systems. For example, musl libc does not have nsswitch (and has a built-in NSCD). PAM is already optional.
Posted Mar 31, 2024 17:02 UTC (Sun)
by nix (subscriber, #2304)
[Link] (11 responses)
That's a sign of someone on a hobby-horse if I ever heard of one.
(As someone who needs PAM to even log on -- on account of wanting to use YubiKey OTP to do so -- and who uses nsswitch for a variety of homebrewed lookups, I would obviously not be willing to drop either.)
Posted Apr 1, 2024 12:19 UTC (Mon)
by foom (subscriber, #14868)
[Link] (10 responses)
Then of course there's nscd, as already mentioned: a socket protocol for nsswitch lookups already implemented by glibc and musl. Someone could implement a different nscd server-side which doesn't use dlopen — without even modifying glibc. Yet, as far as I know, nobody actually has done so.
On the PAM side there's no similarly easy replacement, though one could investigate OpenBSD's BSD Auth system, which is extensible via spawning subprocesses to handle auth tasks.
In any case, that nobody seems to actually be working on any of this probably shows just how unimportant avoiding dlopen is for most people...
Posted Apr 1, 2024 16:48 UTC (Mon)
by nix (subscriber, #2304)
[Link] (9 responses)
The problem, as ever, is doing that compatibly -- but I suppose if glibc itself provided the 'nss server' that loaded existing nss modules and did everything else nsswitch did, and glibc called into it using the sort of thing you describe, this sort of thing might be practical: it would probably make nscd less of a horror show, too. With a lot of work (how many nss modules depend on being in the same address space as the running process, for starters? I bet it's not zero. And I bet this would slaughter performance for simpler cases, so maybe nss_files still needs to be built in. And so forth...)
Posted Apr 1, 2024 18:04 UTC (Mon)
by foom (subscriber, #14868)
[Link] (1 responses)
If you run the nscd service, then glibc sends nss lookups to nscd over a socket, instead of running them inside other binaries.
Nscd comes with a caching layer (unsurprisingly given its name), but you can mostly disable that if you only want the nss-server functionality.
Posted Apr 2, 2024 17:10 UTC (Tue)
by nix (subscriber, #2304)
[Link]
Posted Apr 1, 2024 18:23 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (6 responses)
glibc is the worst library in existence, so no wonder.
On the other hand, musl libc simply uses the nscd protocol to provide the NSS functionality and even allows wrapping legacy NSS modules: https://github.com/pikhq/musl-nscd
Additionally, with musl I can _already_ get a fully static system with zero dlopen()s or dynamic libraries. There are even several experimental distros that are fully statically linked. E.g.: https://framagit.org/Ypnose/solyste
Posted Apr 2, 2024 17:12 UTC (Tue)
by nix (subscriber, #2304)
[Link] (5 responses)
At this point I'm wondering if you're just being intentionally unpleasant. glibc navigates a frankly horrifying pile of tradeoffs and does the job fairly well given that. If it was "the worst library in existence" it would not be *remotely* so widely used, nor work as well as it does.
Posted Apr 2, 2024 22:54 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
I believe we should take at least _some_ of that experience and apply it to the rest of the system. Being static-friendly and not dlopen()-ing stuff is definitely a part of that.
BTW, does dlopen() in libsystemd preclude its static linking?
Posted Apr 3, 2024 11:17 UTC (Wed)
by nix (subscriber, #2304)
[Link] (3 responses)
In the future in glibc, yes. In all other libcs I'm aware of, yes, even now. (Or, rather, you can *try* to call it in statically-linked binaries, but the call will always fail.)
This is of course one of many reasons why just statically linking everything is not the panacea some seem to think -- plugins really *are* a thing and sometimes loadable shared code in the same address space is a convenient way to implement them... there's not a chance you'll ever get KDE to work statically linked, for instance.
Posted Apr 3, 2024 16:30 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
Plugins are a thing that has no business being in the foundational parts of the runtime. And it's not like we don't have a real-world example of a system without them, Alpine Linux exists. And it's significantly nicer to work with than the glibc-based systems.
Posted Apr 4, 2024 12:49 UTC (Thu)
by nix (subscriber, #2304)
[Link] (1 responses)
I am not convinced, and since as usual you didn't bother to give any reasons, relying instead on pure assertion, I'm not sure why you think this not-an-argument would ever convince anyone who didn't already agree with you.
Why on earth would you consider name lookup or authentication, both things that have had numerous wildly divergent implementations over time and which obviously have different site-by-site requirements, hence the *existence* of pluggable systems to implement them, to be things that "have no business" existing, based on the pure assertion that they are "in the foundational parts of the runtime"? People are *using* nss and PAM's extensibility, you know. They're not just there to annoy you. This is not a moribund module system with a half-dozen stale modules that have hardly changed in the last twenty years. People are plugging other things into that pluggability. (Not that this attack even *relied* on that pluggability, or NSS, or PAM, so why you think ripping them out will help here is quite beyond me.)
For that matter, what on earth even is a "foundational part of the runtime"? Is it the toolchain? Surely that counts if anything does! Better rip out LTO from GCC and clang then, since both rely on linker plugins that run the entire compiler! (Also, how many linker plugins are there? I can hardly name any but LTO. That's gotta be moribund, rip it out!) Is it the kernel? Better rip out kernel modules then, in-tree or not, since if they're not dynamically loaded plugins, nothing is... is it glibc? Surely not, since you can replace it with any other libc you like and keep the kernel and most userspace the same after a recompile: it could hardly be foundational! So I guess NSS can stay. Not sure about PAM, the idea came from Solaris and you have in the past expressed a liking for that sort of thing so maybe that's good now too?
That's the problem with arguing by pure assertion: since you give no reasons, define none of your terms, and provide no grounds to agree with you, there's no reason to accept your premises: and even if I do, it's easy to argue in the exact opposite direction since the premises are so vague, which makes your argument nothing more than a statement of personal preferences and an assertion that of course *your* personal preferences are more important than anyone else's.
Is the real definition "something Cyberax asserts without argument or rationale is foundational"? Or just "Cyberax is right"?
Posted Apr 4, 2024 17:24 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Plugins inherently face a complicated environment that they don't control and should not perturb too much. And a crashed plugin will take down the entire application. This was reasonable 30 years ago, but it's not anymore. These days, we actually have a good architectural pattern for this: split modules into a separate daemon that is activated by systemd as needed.
> People are *using* nss and PAM's extensibility, you know.
NSS is actually hardly used these days, NIS/NIS+ have mostly died out. The only major surviving service is LDAP (usually via SSSD). It can simply be incorporated into the glibc (it's 43kb), or it can be split into a daemon that talks to glibc via the NSCD protocol.
If we're talking about PAM in particular, then it's nothing but a stack of bad design decisions. In the case of SSH, it can be replaced by ephemeral SSH certificates for most scenarios (e.g. a shared machine in a university or for management access to the production cluster on AWS EC2).
These two items will make most non-interactive systems completely dlopen()-free.
Posted Apr 1, 2024 14:41 UTC (Mon)
by job (guest, #670)
[Link]
In the end it was included because it made possible some use cases where no one else stepped up to make a practical alternative.
I don't think that is something we want to emulate. It is certainly possible to satisfy the necessary use cases without resorting to dlopen().
Posted Mar 31, 2024 12:12 UTC (Sun)
by bluca (subscriber, #118303)
[Link] (6 responses)
Posted Mar 31, 2024 13:49 UTC (Sun)
by nix (subscriber, #2304)
[Link]
That's tricky to implement (because doing things in the resolver is *always* a bit tricky) but I can't immediately think of any reason why it's *impossible*. It would need a new dynamic tag of course, DT_LAZY_NEEDED? DT_NEEDED_OPTIONAL?
You couldn't use the simpleminded approach above for everything (good luck making this work for things like data symbols where the GOT is needed before the PLT or in general anywhere you couldn't have used lazy binding before, or where you need the shared library's ELF constructors to run early, or where TLS inadequacies would prevent dlopen from working happily -- and it has the same security implications as using lazy binding) but it should work in a fairly large proportion of cases.
Posted Mar 31, 2024 16:13 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
No, it's not. It's done to _paper_ over dependencies, making them harder to discover statically and creating wonderful race conditions if mimmutable() is used at an inopportune moment. It's an all-around bad decision.
Posted Mar 31, 2024 17:06 UTC (Sun)
by nix (subscriber, #2304)
[Link] (3 responses)
What next? Shall we make changes to allow for Windows's per-libc malloc(), or for Linux's not-at-all-planned upcoming transition to Mach-O?
Posted Mar 31, 2024 18:54 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
This is subject to change: https://lwn.net/Articles/958438/
> particularly when those changes *reduce* security
They don't. libsystemd will _still_ depend on xz, it just will be hidden from cursory analysis.
> What next? Shall we make changes to allow for Windows's per-libc malloc(),
That's actually a pretty good idea, that will make several classes of vulnerabilities more difficult to exploit.
> or for Linux's not-at-all-planned upcoming transition to Mach-O?
I'd take PE: https://blog.hiler.eu/win32-the-only-stable-abi/
Posted Mar 31, 2024 19:36 UTC (Sun)
by nix (subscriber, #2304)
[Link] (1 responses)
I honestly wonder if you're even reading this thread. This attack depended on liblzma being loaded into sshd's memory because it was loaded by virtue of DT_NEEDED: after this commit, it would not be loaded at all, because libsystemd would only have loaded it if compressed journal reading was attempted, which sshd never attempts.
So it *would* in fact solve the problem.
But I'm tired of arguing with a brick wall with prejudged opinions, I think. Good night.
Posted Mar 31, 2024 20:16 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Reread your words. THIS attack. As in, this _particular_ one. Sure, having the library dlopen()-ed prevents it. I can think of several ways I can backdoor liblzma to work around it.
Making the system usable with mimmutable/mseal would prevent whole categories of exploits. And promoting the dlopen() craze will make this kind of mitigation impossible.
And yeah, I absolutely hate the braindead design of nsswitch, PAM, and now libsystemd.
Posted Mar 30, 2024 17:04 UTC (Sat)
by rra (subscriber, #99804)
[Link] (3 responses)
Posted Mar 30, 2024 17:18 UTC (Sat)
by nix (subscriber, #2304)
[Link] (2 responses)
IFUNCs are not really the villain here. It is perfectly possible for liblzma to have done the same sort of evil using only perfectly normal symbol interposition, dlsym(..., RTLD_NEXT) and ELF constructors.
Posted Mar 30, 2024 19:08 UTC (Sat)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
Posted Mar 30, 2024 23:21 UTC (Sat)
by nix (subscriber, #2304)
[Link]
Posted Mar 30, 2024 6:16 UTC (Sat)
by mchehab (subscriber, #41156)
[Link] (5 responses)
Why would systemd possibly require any integration with sshd? Originally, it started as a replacement for init, meant to make system init faster. See https://0pointer.de/blog/projects/systemd.html:
> For a fast and efficient boot-up two things are crucial:
In practice, system init is now a lot heavier and takes a lot more time to start a system than what it used to be with sysV init.
It also is now not only a PID 1 replacement, but it does lots of integration and interaction with almost everything needed for a system to run, including audit trails/logs.
With that, it became a component that can be compromised indirectly via changes in dozens (or hundreds?) of different components that are not directly related to systemd itself. That opened a window like what just happened, where malicious code introduced into xz is capable of compromising systems that carry out-of-tree systemd integration patches.
IMO, systemd should return to its roots and stop requiring interactions with other packages unrelated to PID 1's task.
Posted Mar 30, 2024 6:55 UTC (Sat)
by intelfx (subscriber, #130118)
[Link] (4 responses)
To signal (and receive) the readiness state of the daemon in question. Not more, not less.
> IMO, systemd should return to its roots and stop requiring interactions with other packages unrelated to PID 1's task.
I'd say that "reliably determining whether the supervised process has successfully started up" (i. e. loaded and parsed its configuration, bound all the necessary sockets, did not encounter any other failures) is very much within the definition of the PID 1's task.
Posted Mar 30, 2024 22:57 UTC (Sat)
by mchehab (subscriber, #41156)
[Link] (1 responses)
> To signal (and receive) the readiness state of the daemon in question. Not more, not less.
System V init never needed that, as there are simple generic solutions to monitor that. Basically, when a process forks a child and that child dies, the parent is notified. This is a well-defined POSIX-defined behavior.
> > IMO, systemd should return to its roots and stop requiring interactions with other packages unrelated to PID 1's task.
It shall be up to sshd process - and to all other system daemons - to die if it failed to parse configuration and/or bind necessary sockets. The task of PID 1 is to monitor if the process is dying too fast, and, in such cases, to take some action.
There's absolutely no need to modify system daemons, implementing non-POSIX out-of-tree hacks just for PID 1 to be aware that a process is up and running.
Posted Mar 31, 2024 2:07 UTC (Sun)
by intelfx (subscriber, #130118)
[Link]
Yes, and it sucked.
> This is a well-defined POSIX-defined behavior
The fact that it is well-defined or POSIX-defined does not automatically mean that it's _good_. I hate to break it to you, but POSIX is not a pinnacle of system design.
> It shall be up to sshd process - and to all other system daemons - to die if it failed to parse configuration and/or bind necessary sockets
Setting up a proper readiness notification by double-forking is approximately tenfold more complicated and requires exponentially more moving parts than the sd_notify mechanism.
In fact, many daemons (including openssh) do not complete their initialization until after the fork, so the only correct implementation of the interface you describe entails the immediate child _waiting_ for the grandchild to finish its setup, and only then exiting. Which means that there has to be a temporary pipe or socket between the child and the grandchild.
So now we are choosing between a socket notification mechanism implemented _once_ in a well-audited, well-maintained project (systemd) and **the same socket notification mechanism** plus a bunch of historical nonsense implemented _all over again_ in each daemon.
I trust the choice is obvious.
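To make the comparison concrete, here is a rough Python sketch of the double-fork handshake described above. The launched process only learns the outcome once the daemon (the grandchild) has finished its setup and reported over a temporary pipe. `daemonize_with_readiness` and `setup` are hypothetical names, and a real daemon would enter its main loop where this sketch simply exits.

```python
import os


def daemonize_with_readiness(setup) -> bool:
    """Double-fork; return True in the original process if setup succeeded."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid > 0:
        # Original process: block until the daemon reports its status.
        os.close(write_fd)
        status = os.read(read_fd, 1)  # b"" means the daemon died early
        os.close(read_fd)
        os.waitpid(pid, 0)  # reap the intermediate child
        return status == b"1"
    # Intermediate child: detach from the session, fork again, and exit.
    os.close(read_fd)
    os.setsid()
    if os.fork() > 0:
        os._exit(0)
    # Grandchild (the daemon): do the real setup, then report over the pipe.
    try:
        setup()
        os.write(write_fd, b"1")
    except Exception:
        os.write(write_fd, b"0")
        os._exit(1)
    os.close(write_fd)
    os._exit(0)  # placeholder for the daemon's main loop
```

Even this stripped-down version needs a pipe, two forks, and careful descriptor handling in three processes, and it still leaves out PID files, signal handling, and stdio redirection.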
Posted Apr 4, 2024 15:34 UTC (Thu)
by koh (subscriber, #101482)
[Link] (1 responses)
Why would liblzma be needed for that?
Posted Apr 4, 2024 15:44 UTC (Thu)
by cjwatson (subscriber, #7322)
[Link]
It is more like the corporations' fault for not paying people to work on things they profit from.
So that other services that depend on the ssh server being started know when to start.
So that when you ask the service manager what services failed, you'll know that the ssh server failed.
So that you have an actual service manager, not a bunch of YOLO shell scripts with no error handling.
But let's not distract from the discussion: systemd is *not* why this backdoor was possible. It could have been any other library. It could even have been any other server application. It's not restricted to sshd.
This is a social problem. Not a technical one.
We have managed to keep this conversation relatively free of systemd bashing, which is really not relevant to the discussion. Please don't do any more of it here.
Please stop here
libcap.so.2 => /lib/x86_64-linux-gnu/libcap.so.2
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
ld-linux-x86-64.so.2 => /lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
>
> - To start less.
> - And to start more in parallel.
>
> I'd say that "reliably determining whether the supervised process has successfully started up" (i. e. loaded and parsed its configuration, bound all the necessary sockets, did not encounter any other failures) is very much within the definition of the PID 1's task.