How the XZ backdoor works
Versions 5.6.0 and 5.6.1 of the XZ compression utility and library were shipped with a backdoor that targeted OpenSSH. Andres Freund discovered the backdoor by noticing that failed SSH logins were taking a lot of CPU time while doing some micro-benchmarking, and tracking down the backdoor from there. It was introduced by XZ co-maintainer "Jia Tan" — a probable alias for person or persons unknown. The backdoor is a sophisticated attack with multiple parts, from the build system, to link time, to run time.
The community response to the attack is just as interesting as the technical aspects. For more information on that, refer to this companion article.
Build time
The backdoor consists of several distinct phases, starting when the package is being built. Gynvael Coldwind wrote an in-depth investigation of the build-time parts of the backdoor. Releases of XZ were provided via GitHub, which has since disabled the maintainers' accounts and taken the releases offline. Like many projects that use GNU Autoconf, XZ made releases that provided several versions of the source for download — an automatically generated tarball containing the source and related files in the repository, along with versions containing the generated build files. Those extra files include the configure script and makefiles for the project. Releasing versions that contain the generated files allows downstream users of the software to build without needing to install Autoconf.
In this case, however, the scripts in the maintainer-provided source tarballs were not generated by Autoconf. Instead, one of the build scripts contained the first stage of the exploit in m4/build-to-host.m4. This script is originally from the Gnulib library; it provides a macro that converts between the style of pathname used by the build environment and the run-time environment of the program. The version in these XZ releases was modified to extract the next stage of the exploit, which is contained in tests/files/bad-3-1corrupt_lzma2.xz.
This file is included in the repository, ostensibly as part of XZ's test suite, though it was never used by those tests. It was committed well before the release of version 5.6.0. The file, supposedly a corrupted XZ file, is actually a valid XZ stream with some bytes swapped — for example, 0x20 is swapped with occurrences of 0x09 and vice versa. When decoded, it yields a shell script that unpacks and executes the next stage of the backdoor.
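The byte-swapping disguise can be pictured with a short sketch. It assumes only the single pair the article gives as an example (0x09 and 0x20); the names `SWAP` and `unswap` are hypothetical and are not taken from the exploit. It shows that the substitution is its own inverse, so one pass over the "corrupted" data restores the original stream.

```python
# Build a 256-entry translation table that exchanges tab (0x09) and
# space (0x20) and leaves every other byte value untouched.
# Assumption: only this one pair is swapped; the article gives it as an example.
_table = bytearray(range(256))
_table[0x09], _table[0x20] = 0x20, 0x09
SWAP = bytes(_table)


def unswap(data: bytes) -> bytes:
    """Apply the swap to every byte; applying it twice is a no-op."""
    return data.translate(SWAP)


disguised = unswap(b"echo\thello world")
restored = unswap(disguised)
```

Because the mapping is symmetric, the same function serves both to disguise the data and to recover it.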
The next stage of the backdoor is located in tests/files/good-large_compressed.lzma. This is the injected.txt file attached to Freund's message. That file contains more than just the next stage of the script — it also contains additional binary data that forms the actual backdoor itself. The final script skips over the header of the file from which it was extracted, and then uses awk to decrypt the remainder of the file. Finally, that decrypted stream is decompressed using the XZ command-line program, in order to extract a pre-compiled file called liblzma_la-crc64-fast.o, which is also attached to Freund's message.
Link time
The extracted file is a 64-bit relocatable ELF object file. The remainder of the build process links it into the final liblzma library, which ends up being loaded into OpenSSH on some distributions. Those distributions patch OpenSSH to use systemd for daemon-readiness notifications; libsystemd in turn depends on liblzma for compressing journal files. Lennart Poettering has since pointed to some example code (written by Luca Boccassi) showing how to let applications use systemd readiness notifications without pulling in the entire library. When the malicious liblzma is used by a dynamically linked process, it uses the indirect function mechanism to involve itself in the linking process.
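To give a sense of how little is needed for readiness notification, here is a minimal sketch of the notification protocol in Python. It is not the example code mentioned above; it assumes the documented protocol, in which the service manager passes a Unix datagram socket path in the NOTIFY_SOCKET environment variable and the service sends it the text "READY=1". The function name `notify_ready` is hypothetical.

```python
import os
import socket


def notify_ready(state: bytes = b"READY=1") -> bool:
    """Send a readiness message to the service manager, if one is listening.

    Returns False when NOTIFY_SOCKET is unset, meaning nobody asked for
    a notification.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    if path.startswith("@"):
        # A leading "@" denotes a Linux abstract-namespace socket, which is
        # addressed with a leading NUL byte instead.
        path = "\0" + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(state)
    return True
```

A daemon would call `notify_ready()` once its listening sockets are set up; nothing here requires linking against libsystemd or, transitively, liblzma.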
Indirect functions are a feature of the GNU C library (glibc) that permits a developer to include several versions of a function and select which version to use at dynamic linking time. Indirect functions are useful for including optimized versions of a function that rely on specific hardware features, for example. In this case, the backdoor provides its own version of the indirect function resolvers crc32_resolve() and crc64_resolve() that select versions of crc32() and crc64() to use, respectively. liblzma does not usually use indirect functions, but using faster functions to calculate checksums does sound like a plausible use of the feature. This plausible deniability is probably why the exploit itself lives in a file called liblzma_la-crc64-fast.o.
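Indirect functions are a C and ELF feature, so the following Python is only a loose analogy, not the glibc mechanism itself. It illustrates the pattern: a resolver runs once, when the name is bound, and whatever it returns becomes the function that callers use from then on. The names `crc32_generic` and `crc32_fast` and the `have_fast_path` parameter are hypothetical; `crc32_resolve` borrows its name from the article.

```python
import zlib


def crc32_generic(data: bytes) -> int:
    """Portable bit-by-bit CRC-32 (reflected polynomial 0xEDB88320)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def crc32_fast(data: bytes) -> int:
    """Stand-in for a hardware-accelerated variant (here: zlib's C code)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def crc32_resolve(have_fast_path: bool):
    """Resolver: ordinary code that decides which implementation to bind."""
    return crc32_fast if have_fast_path else crc32_generic


# Binding happens once; later callers just use crc32().
crc32 = crc32_resolve(have_fast_path=True)
```

The resolver is arbitrary code that runs before any caller uses the function, which is what makes the mechanism attractive as a hook.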
When the dynamic linker finalizes the locations of those functions, it calls the backdoor's resolver functions. At this point, dynamic linking is still in progress, so many of the linker's internal data structures have not yet been made read-only. This would let the backdoor manipulate libraries that had already been loaded by overwriting entries in the procedure linkage table (PLT) or global offset table (GOT). However, liblzma is loaded fairly early in the link order of OpenSSH, which means that the OpenSSL cryptography functions that are the backdoor's ultimate target may not have been loaded yet.
To deal with that, the backdoor adds an audit hook. The dynamic linker calls all the registered audit hooks when it is resolving a symbol. The backdoor uses this to wait until it sees the RSA_public_decrypt@got.plt symbol being resolved. Despite the name, this function is actually part of handling an RSA signature (which is a decryption operation) — OpenSSH calls it while validating an RSA certificate provided by the client during a connection.
Run time
Once the backdoor detects this function being linked, it replaces the function with its own version. What the altered version does is still being investigated, but at least one of its functions is to attempt to extract a command from the public-key field of the provided RSA certificate (which means that certificates that are used in this attack cannot actually be used to authenticate normally). The backdoor checks whether the command is signed by the attacker's private key and has valid formatting. If it does, then the backdoor directly runs the given command as the user running sshd, which is usually root.
Anthony Weems has put together an explanation of the run-time portion of the exploit, including a honeypot to detect attempts to use the exploit, and code to generate command payloads. Using the backdoor involves signing the command to be executed with a private key; since the attacker's key is not available, a backdoored server needs to be patched to use a different key before those payloads can be tried against it. This also means that detecting backdoored servers remotely is nearly impossible, since they will not react any differently to connections that don't use the attacker's private key.
Ultimately, the effect of the backdoor appears to be that a compromised SSH server which receives a connection with a hand-crafted RSA certificate for authentication can be made to run attacker-controlled code.
Anti-analysis
The design of the backdoor makes it difficult to notice without directly inspecting liblzma. For example, the choice to enable remote code execution rather than an authentication bypass means that use of the exploit does not create a login session that could be noticed by traditional administration tools. The backdoor's code also uses several techniques to make discovery more difficult. For example, the string "RSA_public_decrypt@got.plt", which is used by the audit hook, never appears in the binary of the exploit. Instead, it uses a trie to hold various strings. Serge Bazanski posted a list of strings in the malicious liblzma encoded this way.
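A generic trie shows why this defeats a simple search for strings. The sketch below is an assumption-laden illustration, not the backdoor's actual encoding (which the article does not describe): it stores each string one character per node, so no string sits contiguously in memory, and a lookup walks the nodes to recover a numeric identifier. The names `build_trie` and `lookup` and the second sample string are hypothetical.

```python
def build_trie(words):
    """Build a nested-dict trie mapping each word to a 1-based identifier."""
    root = {}
    for ident, word in enumerate(words, start=1):
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[None] = ident  # None marks the end of a stored word
    return root


def lookup(trie, candidate):
    """Return the identifier for candidate, or 0 if it is not stored."""
    node = trie
    for ch in candidate:
        node = node.get(ch)
        if node is None:
            return 0
    return node.get(None, 0)


# "example_symbol" is a made-up second entry for illustration.
trie = build_trie(["RSA_public_decrypt@got.plt", "example_symbol"])
```

Code that needs to recognize a symbol name can then compare identifiers instead of embedding the name as a literal.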
Examining that list shows that RSA_public_decrypt is likely not the only function interfered with; several other cryptography routines are listed. It also shows various functions and strings that are used to interfere with OpenSSH's logging. This is not yet confirmed, but it seems likely that a compromised SSH server would not actually log any connection attempts that use the exploit.
The backdoor also includes many checks to ensure it is running in the expected environment, a standard precaution for modern malware that is intended to make reverse-engineering more difficult. The backdoor is only active under specific circumstances, including: running in a non-graphical environment, in a binary located at /usr/sbin/sshd, with sshd having the expected ELF header, and where none of its functions have had a breakpoint inserted by a debugger. It does not appear to check that it is running as root, as this article originally stated (see this comment from Freund). Despite these obstacles, community efforts to reverse-engineer and explain the remainder of the backdoor's code remain underway.
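Contrary to what this article originally stated, the backdoor does not appear to patch the binary of sshd itself to disable seccomp() and prevent the program from creating a chroot sandbox for its children; that patch was applied by analysts so that they could trace sshd (see this comment).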
In total, the code of the backdoor is 87KB, which is plenty of space for additional unpleasant surprises. Many people have put together their own summaries of the exploit, including this comprehensive FAQ by Sam James, which links to other resources.
Being safe
The exploit was caught promptly, so almost no users were affected. Debian sid, Fedora Rawhide, the Fedora 40 beta, openSUSE Tumbleweed, and Kali Linux all briefly shipped the compromised package. NixOS unstable also shipped the compromised version, but was not vulnerable because it does not patch OpenSSH to link libsystemd. Tan also included some other changes to the XZ code to make detecting and mitigating the backdoor more difficult, such as sabotaging sandboxing measures and making preemptive efforts to redirect security reports. Even though the exploit did not reach their stable versions, several distributions are nonetheless taking steps to move to a version of XZ that does not contain any commits from Tan, so users should expect to see security updates related to that soon. Readers may also wish to refer to the security notice for their distribution for more specific information.
Posted Apr 2, 2024 21:15 UTC (Tue)
by randomguy3 (subscriber, #71063)
[Link] (8 responses)
Posted Apr 2, 2024 21:31 UTC (Tue)
by Sesse (subscriber, #53779)
[Link] (1 responses)
Posted Apr 2, 2024 21:38 UTC (Tue)
by randomguy3 (subscriber, #71063)
[Link]
Posted Apr 2, 2024 23:45 UTC (Tue)
by Trelane (subscriber, #56877)
[Link] (1 responses)
Posted Apr 3, 2024 13:58 UTC (Wed)
by throwaway_901023443 (guest, #170606)
[Link]
"Impact
Posted Apr 2, 2024 23:56 UTC (Tue)
by Heretic_Blacksheep (guest, #169992)
[Link] (3 responses)
I'm glad to see the follow-up article, because a lot of early reporting from various outlets was working from incomplete information and therefore drawing incorrect conclusions, i.e. this is not just an authentication bypass, it's an arbitrary RCE injection, which would have been far worse had it made it to widespread distribution in RHEL 10(?) or Debian 13.
For OpenSSH's role in the chain, there's a feature request submitted to change its behavior for security certificates along with a diff already offered for review. https://bugzilla.mindrot.org/show_bug.cgi?id=3675
For Lasse Collin this has to be a nightmare scenario, and I am sympathetic to his problems. I do wonder if he will assign the xz project and copyright to a well-known open source foundation in the future, one that may be better suited to monitor and audit contributions if he hasn't the time and lacks people he believes he can trust.
(?) Not sure how RHEL does its updates with regards to specific packages as I don't use it, 10 is the next major version anyway.
Posted Apr 3, 2024 8:20 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link]
Posted Apr 3, 2024 15:31 UTC (Wed)
by mario-campos (subscriber, #152845)
[Link]
While I do agree that this speaks to a larger problem for all distros, I disagree that OpenBSD is somehow more susceptible. OpenBSD is, by design, more minimal, which means that if a backdoor exists in a 3rd-party package, one must choose to install it -- secure by default. Whereas, on other distros, such packages might already be included by default -- insecure by default.
Posted Apr 3, 2024 16:53 UTC (Wed)
by emaste (guest, #121005)
[Link]
Posted Apr 2, 2024 22:07 UTC (Tue)
by andresfreund (subscriber, #69562)
[Link] (4 responses)
Minor quibbles:
> However, liblzma is loaded fairly early in the link order of OpenSSH, which means that the OpenSSL cryptography functions that are the backdoor's ultimate target may not have been loaded yet.
I don't think when openssl is loaded matters much. The relevant part is the relocation processing for sshd itself, which will happen after all the libraries are already loaded. The audit hook indeed is only registered for "", the main binary. The GOT entry that gets modified during the execution of dl-audit hook for RSA_public_decrypt is in the GOT for the main binary.
I think the only library that's not yet mapped read-only during the dl-audit hook execution is the current library (i.e. sshd itself in this case), which quite fundamentally can't be mapped read-only yet.
> The backdoor also includes many checks to ensure it is running in the expected environment — a standard precaution for modern malware that is intended to make reverse-engineering more difficult. The backdoor is only active under specific circumstances, including: running in a non-graphical environment, as root, in a binary located at /usr/sbin/sshd, with sshd having the expected ELF header, and where none of its functions have had a breakpoint inserted by a debugger. Despite these obstacles, community efforts to reverse-engineer and explain the remainder of the backdoor's code remain underway.
FWIW, the exploit doesn't check that it's running as root:
$ time env -i LANG=C LD_LIBRARY_PATH=/path/to/xz-5.6.1/b/src/liblzma/.libs/ /usr/sbin/sshd -D -h
Posted Apr 2, 2024 23:33 UTC (Tue)
by ms-tg (subscriber, #89231)
[Link]
Also thank you to LWN, where consistently the subjects of articles arrive to contribute meaningfully in the comments.
Posted Apr 3, 2024 13:56 UTC (Wed)
by daroc (editor, #160859)
[Link] (1 responses)
Thank you for pointing these out!
For your first point, I think you're correct and I was insufficiently precise with my language. liblzma is loaded before the OpenSSL cryptography functions are resolved, so naturally the backdoor needs some way to wait for that to happen if it wants to override them.
For the second point, I was basing that on the fact that it looks like the exploit calls getuid() during its checks — but perhaps it is doing that for some other reason? I admit I haven't traced through all the checks myself. I'll make a correction.
Posted Apr 3, 2024 16:28 UTC (Wed)
by andresfreund (subscriber, #69562)
[Link]
I don't think it does call geteuid() during the "should I be enabled" portion (i.e. within the ifunc resolver for crc64). I think there is a call somewhere after the point the backdoor is actually called. But not 100% certain.
Posted Apr 3, 2024 20:08 UTC (Wed)
by titaniumtown (guest, #163761)
[Link]
Posted Apr 3, 2024 0:07 UTC (Wed)
by flussence (guest, #85566)
[Link] (2 responses)
Posted Apr 3, 2024 0:40 UTC (Wed)
by Heretic_Blacksheep (guest, #169992)
[Link] (1 responses)
Posted Apr 3, 2024 11:12 UTC (Wed)
by dgm (subscriber, #49227)
[Link]
Posted Apr 3, 2024 0:36 UTC (Wed)
by Keith.S.Thompson (subscriber, #133709)
[Link]
I don't know whether either was vulnerable, but both have reverted to version 5.4.6 and 5.4.5, respectively.
Posted Apr 3, 2024 7:44 UTC (Wed)
by pbonzini (subscriber, #60935)
[Link]
They were—the attacker didn't hand-patch the configure script and the tarball was presumably generated as usual with "make dist". However, the build-to-host.m4 file was modified to include the first stage of the attack before the tarball was generated. The file is not part of autotools proper, as it's common to have extra macros in an m4/ directory for use in the configure.ac file.
A hand patched configure script would have been harder to detect, but many distros suggest regenerating the scripts from configure.ac to ensure that they include any bug fixes applied to Autoconf by the distros, and that would have undone any manual changes applied by "Jia Tan".
Posted Apr 3, 2024 7:57 UTC (Wed)
by mezcalero (subscriber, #45103)
[Link] (9 responses)
1. I didn't post that code that shows how to roll your own sd_notify(). Luca Boccassi did that work. Thanks for the laurels, but those are Luca's. I just mentioned it on Mastodon.
2. It's worth mentioning that OpenSSH actually independently implemented something like this: https://bugzilla.mindrot.org/show_bug.cgi?id=2641 – something I very much approve of. Given one of the OpenSSH maintainers has reworked it extensively, I would guess this actually has a chance of entering OpenSSH proper soonishly.
At the very least we now have a clearly defined avenue for getting code that matters for Linux only merged into OpenSSH upstream: just get some nation state actor to use it as a vehicle for an exploit, and bam, there's your window of opportunity to get something merged!
Lennart
Posted Apr 3, 2024 8:00 UTC (Wed)
by mezcalero (subscriber, #45103)
[Link]
Posted Apr 3, 2024 10:56 UTC (Wed)
by dvrabel (subscriber, #9500)
[Link] (5 responses)
Posted Apr 3, 2024 12:29 UTC (Wed)
by cyperpunks (subscriber, #39406)
[Link] (2 responses)
Aka having a standard split between systemd the core server service and "client services", it would make sense for several reasons?
Posted Apr 3, 2024 12:41 UTC (Wed)
by smurf (subscriber, #17840)
[Link]
systemd had that in the past. They switched to one single support library because handling more than one is a major hassle and frankly not well supported by pkg-config and similar tools.
The next release will use "dlopen" instead of direct linking, so you get mostly the same benefit.
Posted Apr 3, 2024 13:30 UTC (Wed)
by bluca (subscriber, #118303)
[Link]
- use libsystemd and you get a maintained implementation for free: when maintainability is the main concern and additional dependencies are not an issue or already present
Most projects that fall in the second category do not want a dependency, period - doesn't matter if it's only build time or also runtime, they just don't want to hear it, and that's what the second option is for, like for openssh. For other cases, just use libsystemd as provided.
Posted Apr 3, 2024 20:43 UTC (Wed)
by himi (subscriber, #340)
[Link] (1 responses)
At the same time, as modern programmers we're all pretty conditioned to look for a library or a "blessed" example of things like this rather than rolling our own - and generally this is a good thing! For something like sd_notify() it would make sense for there to be an officially blessed (i.e. part of the documentation) collection of cut-and-paste ready implementations for a collection of common languages, so that people aren't either pulling from a less-trusted source (stackexchange, etc) or rolling their own. It would probably make sense for other cases as well, though /what/ cases is obviously going to be a topic for much debate.
Posted Apr 4, 2024 10:06 UTC (Thu)
by bluca (subscriber, #118303)
[Link]
Turns out it's your lucky day after all: https://www.freedesktop.org/software/systemd/man/devel/sd...
PRs welcome to add more languages
Posted Apr 3, 2024 14:03 UTC (Wed)
by daroc (editor, #160859)
[Link] (1 responses)
Posted Apr 3, 2024 15:45 UTC (Wed)
by bluca (subscriber, #118303)
[Link]
Posted Apr 3, 2024 8:05 UTC (Wed)
by ms (subscriber, #41272)
[Link]
For those who want even more line-by-line detail, Russ Cox's write up is also excellent, imo. https://research.swtch.com/xz-script
Posted Apr 3, 2024 20:08 UTC (Wed)
by feliperalmeida (guest, #170644)
[Link] (1 responses)
> The backdoor also includes code that patches the binary of sshd itself to disable seccomp() and prevent the program from creating a chroot sandbox for its children.
I don't think that's accurate though. If that is referring to "https://gist.github.com/smx-smx/a6112d54777845d389bd7126d..." - the binary patch was done by the gist authors to be able to trace the `sshd` process using Frida and not by the backdoor. They probably binary-patched it to avoid recompiling.
Posted Apr 3, 2024 21:17 UTC (Wed)
by daroc (editor, #160859)
[Link]
Thank you for pointing out my mistake; I've edited the article with a correction.
Posted Apr 3, 2024 20:44 UTC (Wed)
by khim (subscriber, #9252)
[Link] (3 responses)
I wonder how the temptation to bring XKCD into the article was resisted. I know that Randall Munroe got the year wrong and I suspect Lasse Collin doesn't live in Nebraska, but still… I hadn't expected to see that picture repeated in real life this literally.
Posted Apr 4, 2024 9:54 UTC (Thu)
by paulj (subscriber, #341)
[Link] (2 responses)
Posted Apr 4, 2024 11:01 UTC (Thu)
by khim (subscriber, #9252)
[Link] (1 responses)
Not that literally. Yes, OpenSSL is undermaintained, but here the prophecy was fulfilled literally: a package which is used in a lot of places, with actually just one, single, maintainer (well, before “Jia Tan” started helping… and he was able to build rapport and almost achieve successful penetration because, apparently, there were actual reasons to fix some things, and Lasse Collin was genuinely thankful for the help).
Posted Apr 11, 2024 18:19 UTC (Thu)
by richmoore (guest, #53133)
[Link]
Posted Apr 3, 2024 20:53 UTC (Wed)
by bnorris (subscriber, #92090)
[Link]
It also was in Debian testing, which is much more widely used than sid. And it was live for almost 2 months, so I wouldn't say it was *that* prompt.
Source: https://lists.debian.org/debian-security-announce/2024/ms...
Posted Apr 5, 2024 11:59 UTC (Fri)
by matp75 (subscriber, #45699)
[Link]
Posted Apr 6, 2024 14:17 UTC (Sat)
by jwilk (subscriber, #63328)
[Link]
The official docs for RSA_public_decrypt are here:
https://glsa.gentoo.org/glsa/202403-04
Our current understanding of the backdoor is that is does not affect Gentoo systems, because 1. the backdoor only appears to be included on specific systems and Gentoo does not qualify; 2. the backdoor as it is currently understood targets OpenSSH patched to work with systemd-notify support. Gentoo does not support or include these patches; Analysis is still ongoing, however, and additional vectors may still be identified. For this reason we are still issuing this advisory as if that will be the case."
option requires an argument -- h
...
real 0m0.519s
user 0m0.514s
sys 0m0.005s
- take the MIT-0 example and insert/adapt it in your program: when additional dependencies are the main concern
Debian
https://www.openssl.org/docs/manmaster/man3/RSA_public_de...
