A backdoor in xz [LWN.net]

A backdoor in xz

Posted Mar 30, 2024 1:44 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (9 responses)

> Yet more reason to de-bloat instead of en-bloat with crap like systemd.

xz is also pulled in by selinux, or some PAM modules (which SSH also uses).

Arguably, both need to go, and SSH needs to be rewritten in a safe language, with authentication handled by something like systemd instead of random dlopen()-ed modules.

A backdoor in xz

Posted Mar 30, 2024 8:18 UTC (Sat) by DimeCadmium (subscriber, #157243) [Link] (8 responses)

Gross, selinux; pretty sure UsePAM is also a patch (and it can at least be disabled in the config). The question though is not "what pulls it in" but rather "what pulls it in without adding value" because that's how you get lists of 100s of deps, any one of which is vulnerable to an attack like this.

A backdoor in xz

Posted Mar 30, 2024 10:54 UTC (Sat) by khim (subscriber, #9252) [Link] (1 responses)

> The question though is not "what pulls it in" but rather "what pulls it in without adding value"

Each patch adds value to someone, or it wouldn't have existed. Sshd without PAM would be 100% useless to me because all the machines that I use ssh with use authentication not supported by stock Debian.

Similarly someone who needs to pass certain certification needs selinux and so on.

That's the flip side of the story which made open source available in the first place: we have millions of users and even if 0.01% of them are developers it's enough to produce software for free.

Remove all that “crap” and suddenly there are not enough developers to drive that thing forward because there are not enough users.

There is no easy solution for that problem, unfortunately.

A backdoor in xz

Posted Mar 31, 2024 1:27 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link]

There's a difference between adding value to 1 person and adding value to everyone who uses some software, for example.

A backdoor in xz

Posted Mar 30, 2024 19:32 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

PAM still has value, it's still very useful for auditing and custom authentication in special environments.

These days, PAM can be mostly replaced by ephemeral SSH certificates for authentication. But it's still useful for auditing.

A backdoor in xz

Posted Mar 30, 2024 21:25 UTC (Sat) by apoelstra (subscriber, #75205) [Link] (3 responses)

I use pam_u2f extensively on my personal computers to use a Yubikey to authenticate my login and screenlocker. This usecase can't be replaced by ephemeral SSH certs because the goal is to talk to a physical U2F key which only speaks U2F.

A backdoor in xz

Posted Mar 31, 2024 0:47 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Is it for interactive logins or for SSH? It's definitely still needed for interactive logins, but they are also much less troublesome. But I don't think SSH needs them.

A backdoor in xz

Posted Mar 31, 2024 16:54 UTC (Sun) by apoelstra (subscriber, #75205) [Link] (1 responses)

Ah, yes, only for interactive logins. For SSH I use GnuPG's ssh-agent emulation support, whose mechanism I don't really understand.

A backdoor in xz

Posted Mar 31, 2024 18:51 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

ssh-agent (or its emulation) is basically just the public key authentication.

PAM was useful for custom authentication, such as LDAP-based auth or something similar. These days a fairly typical workflow is to use some kind of daemon/utility on the developer's machine to get a temporary SSH certificate, and then just use this certificate to log in over SSH.

A backdoor in xz

Posted Mar 31, 2024 1:27 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link]

Leftpad still has value, it's very useful for padding the left side of a string when you're too lazy to write 2 lines of code.

A backdoor in xz

Posted Mar 30, 2024 2:22 UTC (Sat) by dvdeug (guest, #10998) [Link] (9 responses)

So, do you run the kernel, which is 140MB of source compressed (and 100MB of binary)? That's more than Linus's hard drive could hold when he started the project, and 140 times as large as version 1.0. Are you trying to debloat, or are you trying to score points on systemd?

A backdoor in xz

Posted Mar 30, 2024 8:16 UTC (Sat) by DimeCadmium (subscriber, #157243) [Link] (5 responses)

I have 35MB used in /boot, which includes 3 kernels among other things.

A backdoor in xz

Posted Mar 30, 2024 8:24 UTC (Sat) by niner (subscriber, #26151) [Link] (3 responses)

What about kernel modules? They are usually not in /boot.

A backdoor in xz

Posted Mar 30, 2024 21:49 UTC (Sat) by dmoulding (subscriber, #95171) [Link]

As just a random data point, my kernel has all the functionality I need built into it. I don't even enable loadable module support. The compressed bzImage in /boot is 12M. The uncompressed vmlinux is 41M. This is for a desktop with everything that entails (DRM, nouveau, bluetooth, USB, camera/video, audio, etc.)

A backdoor in xz

Posted Mar 31, 2024 1:29 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (1 responses)

/lib/modules $ du -hs `uname -r`
64M 6.6.13-gentoo

I'm not sure if you're aware of `make menuconfig`, but unlike systemd, you actually CAN effectively turn off parts of the kernel that you don't need.

A backdoor in xz

Posted Mar 31, 2024 1:31 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link]

Oh and here's the actual modules BTW:

700K 6.6.13-gentoo/misc/vboxdrv.ko
48K 6.6.13-gentoo/misc/vboxnetflt.ko
20K 6.6.13-gentoo/misc/vboxnetadp.ko
59M 6.6.13-gentoo/video/nvidia.ko
8.0K 6.6.13-gentoo/video/nvidia-peermem.ko
1.7M 6.6.13-gentoo/video/nvidia-modeset.ko
16K 6.6.13-gentoo/video/nvidia-drm.ko
2.5M 6.6.13-gentoo/video/nvidia-uvm.ko

A backdoor in xz

Posted Mar 30, 2024 15:10 UTC (Sat) by dvdeug (guest, #10998) [Link]

To carry on with niner's point, /boot/vmlinuz-6.7.9-amd64 may be nine megabytes, but /lib/modules/6.7.9-amd64/ is over a hundred. You could read through the kernels of Unix v6 (the Lions book), or xv6, or Minix. Even twenty years ago, the kernel was so big that the SCO mess turned up a copy of Unix malloc buried in the kernel that no one had noticed, even though it should have been replaced with standard kernel functions. Should we use a tighter (less functional) kernel that's actually readable? I don't want to give up a lot of the features I use, but there's certainly no one with the headroom to completely understand 140 MB of source code.

A backdoor in xz

Posted Mar 30, 2024 13:24 UTC (Sat) by pawel44 (guest, #162008) [Link] (2 responses)

Kernel is not pulling third party dependencies.

A backdoor in xz

Posted Mar 30, 2024 14:38 UTC (Sat) by smurf (subscriber, #17840) [Link] (1 responses)

Well, not when building it. Running it is another matter, as it pulls in a heap of pre-built binaries (firmware) with poorly-documented provenance.

A backdoor in xz

Posted Mar 30, 2024 15:05 UTC (Sat) by marcH (subscriber, #57642) [Link]

> Running it is another matter, as it pulls in a heap of pre-built binaries (firmware) with poorly-documented provenance.

Even worse: it does not even _log_ what it loaded! I usually carry this hack:

--- a/drivers/base/firmware_loader/main.c
+++ b/drivers/base/firmware_loader/main.c
@@ -562,7 +562,7 @@ fw_get_filesystem_firmware(struct device *device, struct fw_priv *fw_priv,
                size = rc;
                rc = 0;
 
-               dev_dbg(device, "Loading firmware from %s\n", path);
+               dev_warn(device, "XXXX Loading firmware from %s\n", path); 
                if (decompress) {
                        dev_dbg(device, "f/w decompressing %s\n",
                                fw_priv->fw_name);
@@ -924,6 +924,10 @@ _request_firmware(const struct firmware **firmware_p, const char *name,
                fw_log_firmware_info(fw, name, device);
        }
 
+       dev_warn(device, "XXXX request-firmware name=%s, ret=%d\n",   name, ret);

        *firmware_p = fw;
        return ret;
 }

A backdoor in xz

Posted Mar 30, 2024 2:56 UTC (Sat) by himi (subscriber, #340) [Link] (22 responses)

> Yet more reason to de-bloat instead of en-bloat with crap like systemd. xz-utils shouldn't have any relation to sshd, doesn't have any relation to upstream openssh, shouldn't be necessary to tell a service manager "hey I'm running!", telling a service manager "hey I'm running" shouldn't be necessary...

Services telling the service manager "hey, I'm running!" makes it much /much/ easier to have robust systems. I've run into all sorts of problems getting multiple dependent services working together reliably because of timing issues during startup - an early service happens to take a second longer to come up than expected, a dependent service comes up at just the right time that it doesn't get a connection refused and instead has to wait for a timeout, and anything that happens afterwards is just broken (without manual intervention). Having that first service /explicitly/ tell the service manager "I'm ready to accept connections" avoids that kind of thing /reliably/, without needing to throw in random sleeps that avoid 99.9% of the problems but make everything take five times as long as it should, and which still don't address that 0.1% tail.
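For anyone who has not looked at it, the service side of that explicit "I'm ready" message is small. A minimal Python sketch, assuming the supervisor exports a datagram socket path in NOTIFY_SOCKET (as systemd does for Type=notify units); the helper name `notify` is made up for illustration and is not part of any library:

```python
import os
import socket


def notify(state, env=os.environ):
    """Send one state string (e.g. "READY=1") to the supervisor's socket.

    Returns False when no supervisor socket was provided, so the same
    program still runs when started by hand or by another init system.
    """
    path = env.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading "@" denotes a Linux abstract-namespace socket address.
    addr = "\0" + path[1:] if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(state.encode(), addr)
    return True
```

The service would call `notify("READY=1")` only after its listening socket is bound and any startup checks have passed; dependent units are then held back until that point instead of relying on sleeps.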

Sure, I could implement that coordination in the services . . . at least for the ones I've written. Otherwise I'd need to what, wrap dependent services in something that /does/ handle the coordination? And what about cleanly shutting down or restarting - obviously I'd need the wrapper to handle that, too. And I'd need to wrap /everything/ somehow so that starting and stopping the whole stack in the correct order would work, and handle errors in the startup process sensibly, and and and . . .

And guess what - handling dependencies between services sensibly is pretty much the bare minimum for any reasonable service manager. Supporting that in some kind of "let's avoid systemd at all costs", "I'm not a service manager, just a thin coordination wrapper, no, really" bit of code is at best fiddly and difficult to do well, and at worst ends up being a reimplementation of a significant chunk of systemd. And even if you do all that, on a *nix system you'll ultimately end up having to delegate at least /some/ things to pid 1, particularly if you want a robust and reliable system (even if it's just tidying up zombie processes).

There's a minimum level of core complexity in any functional system - for a system based on a modern Linux kernel and the current standard Linux userspace, that minimum level of core complexity is close enough to what systemd provides that desperately trying to avoid all of systemd's "bloat" just means you're making things both less functional /and/ less robust. You /can/ make that kind of trade-off work (Alpine does a decent job of it, since they're targeting fairly constrained use cases), but pretending that there's no trade-off happening is just delusional.

But it's also irrelevant in this case, because lots of things other than systemd pull in liblzma - including pam, which was a critical component of a functional sshd on Linux /long/ before systemd became the default. The minimum level of core complexity back when we were all relying on SysV init was still plenty broad enough for a sufficiently motivated and well resourced attacker to find exploitable weaknesses, and there were far fewer options available to harden systems against that kind of attack.

A backdoor in xz

Posted Mar 30, 2024 8:19 UTC (Sat) by DimeCadmium (subscriber, #157243) [Link] (11 responses)

Til they tell the service manager "I'm running!" just before failing. I've had that happen several times, in fact. The only *actual* solution for that problem is monitoring. Notify-by-socket is precisely equivalent to notify-by-fork in terms of reliability.

A backdoor in xz

Posted Mar 30, 2024 9:57 UTC (Sat) by cesarb (subscriber, #6266) [Link] (4 responses)

Another solution is the watchdog: the service has to tell the service manager "I'm still running!" periodically, otherwise it's treated as failed. Then the main effect of the initial "I'm running!" is that the timeout before it is given by TimeoutStartSec, and the timeout after it is given by WatchdogSec, which can be shorter (allowing for services which are slow to start, like heavy Java-based servers).
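A rough Python sketch of the service side of that watchdog, assuming systemd's convention of exporting the configured WatchdogSec to the service as WATCHDOG_USEC (in microseconds) and accepting "WATCHDOG=1" as the keep-alive message. The function names, the `send` callable, and the `is_healthy` hook are hypothetical, and pinging at half the timeout is just the usual safety margin:

```python
import os
import time


def watchdog_interval(env=os.environ):
    """Return seconds between keep-alive pings, or None if no watchdog is set."""
    usec = env.get("WATCHDOG_USEC")
    if not usec:
        return None
    # Ping at half the configured timeout to leave some margin.
    return int(usec) / 1_000_000 / 2


def run_watchdog(is_healthy, send, interval, sleep=time.sleep, max_pings=None):
    """Ping the supervisor while is_healthy() holds, then go silent.

    `send` is any callable that delivers a state string to the supervisor.
    `max_pings` exists only so the loop can be bounded in a demonstration.
    Returns the number of pings sent.
    """
    sent = 0
    while is_healthy() and (max_pings is None or sent < max_pings):
        send("WATCHDOG=1")
        sent += 1
        sleep(interval)
    return sent
```

How much the ping proves depends entirely on what `is_healthy` checks; a loop that pings unconditionally only shows that the pinging thread is alive.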

A backdoor in xz

Posted Mar 31, 2024 1:33 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (3 responses)

Til it tells the service manager "I'm still running" but never calls accept again. Notify-by-socket is precisely equivalent to notify-by-fork in terms of reliability.

The solution you are looking for is *monitoring*.

A backdoor in xz

Posted Mar 31, 2024 1:45 UTC (Sun) by intelfx (subscriber, #130118) [Link] (2 responses)

> Notify-by-socket is precisely equivalent to notify-by-fork in terms of reliability.

It might be "equivalent" in an information-theoretical sense (everything that can be achieved with one, is also achievable with the other), but it's absolutely not equivalent in _practical reliability_.

Setting up a proper "notifying" double-fork (which, I remind you, means that the immediate child has to wait for the grandchild to initialize and only then exit, because in most cases the initialization must be completed in the grandchild) is tenfold more _complicated_ and _easier to get wrong_ than simply writing a line into a pre-existing socket that the supervisor has prepared for you.

Even more: all known cases of proper notifying double-fork implementation involve creating a temporary pipe or socket between the child and the grandchild, precisely for the reasons described above. As such, we are choosing between a notify-by-socket implemented _once_ and a notify-by-socket implemented _over and over again_ in each daemon. The choice must be obvious, unless you specifically have an irrational axe to grind against systemd.
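To make the comparison concrete, here is roughly what a "notifying" double fork looks like. This is a simplified Python sketch under my own assumptions, not code from any particular daemon: `init` and `serve` are hypothetical callables, and a real daemon would also redirect stdio, change directory, write a pid file and so on.

```python
import os


def daemonize_with_readiness(init, serve):
    """Double-fork; the original process returns only once init() has finished.

    Returns True in the original process if the daemon reported readiness,
    False if it died first. Never returns in the forked processes.
    """
    r, w = os.pipe()
    pid = os.fork()
    if pid > 0:
        # Original process (the one the init script is waiting on).
        os.close(w)
        ready = os.read(r, 1) == b"1"  # blocks until a byte arrives or EOF
        os.close(r)
        os.waitpid(pid, 0)  # reap the intermediate process
        return ready
    # Intermediate process: detach, fork again, and get out of the way.
    os.close(r)
    os.setsid()
    if os.fork() > 0:
        os._exit(0)
    # Final daemon process: initialize first, report readiness second.
    try:
        init()
        os.write(w, b"1")
    except BaseException:
        os._exit(1)  # closing the pipe without writing signals failure
    os.close(w)
    serve()
    os._exit(0)
```

Every daemon that wants this behaviour has to carry its own copy of that pipe dance, and each step (closing the right ends, reaping, handling init failure) is a place to get it wrong.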

A backdoor in xz

Posted Mar 31, 2024 6:22 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (1 responses)

> As such, we are choosing between a notify-by-socket implemented _once_ and a notify-by-socket implemented _over and over again_ in each daemon. The choice must be obvious,

Indeed it must be, considering that we are discussing the result of everyone sharing a single implementation of it.

A backdoor in xz

Posted Mar 31, 2024 12:25 UTC (Sun) by bluca (subscriber, #118303) [Link]

No, most of us are discussing the result of a multi-year-long sophisticated social engineering attack that preyed on underfunded and overworked unpaid maintainers to inject a complex backdoor. Yes, a handful of people are missing the wood for the trees because they are unable or unwilling to run a simple command to check the attack surface gained by backdooring xz:

$ apt-cache rdepends liblzma5 | wc -l
354

If it hadn't been libsystemd in the middle of the dependency chain, it would have been something else. The exploit was primed and ready to add more backdoors for other arbitrary workflows, with pre-prepared and unused "test files" signatures whose intended targets we'll now never know.

A backdoor in xz

Posted Mar 30, 2024 14:29 UTC (Sat) by smurf (subscriber, #17840) [Link] (2 responses)

One does not exclude the other. In fact many of my monitoring scripts actively check that the service in question hasn't legitimately been shut down before complaining (too) loudly.

You cannot do a "has legitimately been shut down" check without systemd. (Well, OK, of course I could use or write some other code that does this job, but why would I want to replace one mostly-coherent, widely-used presumed-safe software package with five less-widely-used and poorly-integrated ones? sysV init scripts are of no help here)

A backdoor in xz

Posted Mar 31, 2024 1:33 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (1 responses)

Yes you can? OpenRC has done it for longer than systemd has existed ffs

A backdoor in xz

Posted Apr 1, 2024 14:06 UTC (Mon) by farnz (subscriber, #17727) [Link]

I used OpenRC before I used systemd; it does not, as far as I can find (and even today) offer a proper lifecycle check; in particular, the only queries you can ask it are "is this service running", "is this service known and shut down", or "has this service crashed", whereas systemd adds "is this service running but in the process of shutting down" and "is this service running but in the process of restarting" to that list, which is essential for automated remediation of faults - you know that if a service is restarting, a fault is OK, while if a service is shutting down, you should determine if that shutdown is expected, or if you need to alert a human.
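As a sketch of the kind of automated remediation I mean, in Python: the state names below are the ones `systemctl is-active` prints, but the policy mapping is purely my own illustration, and the function name and the `shutdown_expected` flag are made up.

```python
def remediation(active_state, shutdown_expected=False):
    """Map a unit's active state to a monitoring action.

    Returns "ok", "wait" (a fault right now is expected, check again later)
    or "alert" (a human should look).
    """
    if active_state == "active":
        return "ok"
    if active_state in ("activating", "reloading"):
        # Starting or restarting: connection failures are expected for now.
        return "wait"
    if active_state == "deactivating":
        # Shutting down: fine only if someone asked for it.
        return "wait" if shutdown_expected else "alert"
    if active_state == "inactive":
        return "ok" if shutdown_expected else "alert"
    # "failed" and anything unrecognised goes to a human.
    return "alert"
```

With only "running", "stopped" and "crashed" available, the two "wait" branches cannot be told apart from a real outage.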

A backdoor in xz

Posted Mar 30, 2024 14:48 UTC (Sat) by dvdeug (guest, #10998) [Link] (2 responses)

> Til they tell the service manager "I'm running!" just before failing.

Which is a bug; they should complete all checks that make them fail before reporting success. Yes, bugs are a reality.

> The only *actual* solution for that problem is monitoring. Notify-by-socket is precisely equivalent to notify-by-fork in terms of reliability.

No, there exist many cases where a fork happens and then the program fails before it would have notified the service manager it was successfully running. By the same logic, monitoring is precisely equivalent to notify-by-fork in terms of reliability; monitoring programs can fail to notice a service no longer working as well, except that they add false positives and can report that a system has failed when it's been properly shut down or had a temporary glitch, as from system overload.

A backdoor in xz

Posted Mar 31, 2024 1:34 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (1 responses)

> Which is a bug; they should complete all checks that make them fail before reporting success. Yes, bugs are a reality.

Indeed. But *IT IS THE SAME BUG WHETHER YOU'RE USING SYSTEMD'S NOTIFICATIONS OR FORKING*

I don't understand why I have to explain that so many times only to hear the EXACT SAME (inane) ARGUMENT again.

A backdoor in xz

Posted Apr 1, 2024 11:29 UTC (Mon) by HenrikH (subscriber, #31152) [Link]

For that particular daemon, yes, but fewer daemons have this bug than have an unreliable double fork (which is 100% of the double-fork cases).

A backdoor in xz

Posted Mar 30, 2024 23:34 UTC (Sat) by mchehab (subscriber, #41156) [Link] (9 responses)

> Services telling the service manager "hey, I'm running!" makes it much /much/ easier to have robust systems.

System V init systems are typically a lot more robust than systemd ones, as:

- the order in which servers start is fixed; no risk of starting a process too early;
- jobs are started in sequence. Things like Network were started before network daemons like sshd, apache, etc;
- no parallel jobs during init/shutdown time;
- critical jobs that should never be stopped could also be added to /etc/inittab. They were respawned if something bad happened and the process ended up dying.

Also, the "hey, I'm running" task is really simple: if a process has problems, it shall die. PID 1 shall detect it and take the appropriate action when this happens. Modifying the daemon's source, especially with OOT patches, sounds like a very bad idea.

So, while systemd offers lots of flexibility, in terms of system's robustness, simpler usually means more stable and more reliable. I'm yet to see a systemd-based system more reliable than a SysV one. At best, it might be equivalent in terms of robustness.

A backdoor in xz

Posted Mar 31, 2024 1:14 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

> - jobs are started in sequence. Things like Network were started before network daemons like sshd, apache, etc;

What about wireless or VPNs?

> Also, the "hey, I'm running" task is really simple: if a process has problems, it shall die.

A Java server that I have for telephony takes 2 minutes to start up. How would you detect that?

There's also a problem with double-forking. The only process that can detect the death of a double-forked server is PID 1, and in classic SysV all it did was to reap the PID. Ditto for inittab - it can't detect the death of double-forked processes.

A backdoor in xz

Posted Mar 31, 2024 1:38 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (2 responses)

> What about wireless or VPNs?

What about it? Both work fine for me, I have 3 VPNs and occasionally wireless (tethering via my phone).

> A Java server that I have for telephony takes 2 minutes to start up. How would you detect that?

Well, for one thing, I wouldn't use Java, and I wouldn't use a server that takes 2 minutes to start up. Other than that, there are plenty of solutions for this that you could easily implement in sysvinit (you can run whatever you want whenever you want, after all, it's just a shell script); OpenRC actually handles it natively (and has since before systemd existed).

> The only process that can detect the death of a double-forked server is PID 1

That's not true (PR_SET_CHILD_SUBREAPER).

> in classic SysV all it did was to reap the PID

How is that an argument for systemd?

> Ditto for inittab - it can't detect the death of double-forked processes

Huh? inittab is a config file.

A backdoor in xz

Posted Mar 31, 2024 3:51 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> What about it? Both work fine for me, I have 3 VPNs and occasionally wireless (tethering via my phone).

Now try to make sure that the daemon does not come up until at least one network interface is up. Or until the VPN connection is established.

> That's not true (PR_SET_CHILD_SUBREAPER).

That's true in classic SysV. The subreaper was introduced only in Linux 3.4.

> Huh? inittab is a config file.

If you're talking about SysV "simplicity", then you should at least learn it. Classic inittab supports respawning processes on death (action=respawn).

A backdoor in xz

Posted Mar 31, 2024 6:18 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link]

> Now try to make sure that the daemon does not come up until at least one network interface is up. Or until the VPN connection is established.

Done and done.

> That's true in classic SysV. The subreaper was introduced only in Linux 3.4

... okay?

> If you're talking about SysV "simplicity", then you should at least learn it. Classic inittab supports respawning processes on death (action=respawn).

I know it, thanks. That's not a separate program. That's part of init. Controlled by the configuration file, inittab. If you're going to act like you know something better than someone else, then you should at least learn it.

A backdoor in xz

Posted Mar 31, 2024 1:58 UTC (Sun) by himi (subscriber, #340) [Link] (4 responses)

If SysV init systems are actually more robust than systemd ones, it's because they're much simpler - your description captures that simplicity really neatly, in fact.

Though I don't actually believe that claim - I've had plenty of SysV init based systems that were massive pains in the neck to deal with, while the majority of the systemd based systems I've managed have been fairly benign and well behaved, despite doing a whole lot more. Also, the issues I have to deal with generally have nothing to do with systemd, and when they /are/ issues with the unit configuration they've been much easier to resolve than similar issues with init scripts ever were.

Oh, and writing a service targeting a SysV init environment is /far/ more of a pain than writing the same thing targeting a systemd environment - in fact, it's ridiculous how easy it is to target systemd with a basic service. No daemonising, no futzing around with logging targets, no pid files, even the kind of basic intra-service coordination you're pooh-poohing is ridiculously simple. And no mess of spaghetti shell code init script! Most of the time the unit file for a basic service is five or ten lines, and they're simple and declarative with no complex logic - infinitely easier to write and debug. Even when you need to set up complex interdependencies it's generally simple to configure and relatively easy to debug - it can be fiddly, but that's mostly inherent to the problem, rather than an artefact of systemd's implementation.

SysV init worked okay in its day, though it was always a bit of a pain. But it's long /long/ past the time when people should have recognised how much of an improvement systemd is, in pretty much every way. Eulogising the past is all well and good, but actual current performance matters a whole lot more in the real world than sepia-tinted memories of past glories.

Here too

Posted Mar 31, 2024 2:05 UTC (Sun) by corbet (editor, #1) [Link]

Fighting the old systemd wars yet again is not going to help us address this kind of attack. Please, let's not do that.

A backdoor in xz

Posted Mar 31, 2024 3:13 UTC (Sun) by DimeCadmium (subscriber, #157243) [Link] (2 responses)

Indeed, simpler is better. Simpler is easier to secure, simpler is easier to develop, simpler is easier to maintain (i.e. run a system with it), simpler is easier to modify, simpler is easier to troubleshoot, simpler is easier to understand.

A backdoor in xz

Posted Mar 31, 2024 13:00 UTC (Sun) by pizza (subscriber, #46) [Link]

> Indeed, simpler is better.

You can have your remberberries, but the rest of us have to deal with the real world.

...The real world isn't simple, hasn't ever been, and has only trended towards higher complexity.

A backdoor in xz

Posted Apr 1, 2024 14:08 UTC (Mon) by farnz (subscriber, #17727) [Link]

FWIW, by porting a system from SysV init to systemd, I was able to close several hard-to-reproduce bugs, since systemd's extra complexity allowed me to remove a ton of complexity from the various scripts that started components of the system, and at least some of that complexity turned out to have race conditions in it that systemd did not have.

There is a huge advantage to one implementation of something done well replacing (in my case) 5 different implementations, all with different bugs.