
A backdoor in xz

Posted Mar 31, 2024 6:32 UTC (Sun) by chris_se (subscriber, #99706)
In reply to: A backdoor in xz by MarcB
Parent article: A backdoor in xz

> Did you use the "ultra" settings for zstd or any of the advanced options?

I don't remember. I just redid the same checks for a single file I had lying around (the previous checks were against multiple variants), and I got this:

| Method           | Time (m:s) | RAM during compress | Compressed size |
|------------------|------------|---------------------|-----------------|
| xz               | 2:12       |   95 MiB            | 135 MiB         |
| xz -7            | 2:22       |  187 MiB            | 134 MiB         |
| xz -9            | 2:34       |  675 MiB            | 115 MiB         |
| zstd             | 0:02       |   52 MiB            | 182 MiB         |
| zstd -19         | 2:49       |  250 MiB            | 147 MiB         |
| zstd --ultra -22 | 4:22       | 1328 MiB            | 121 MiB         |

(All done on a single CPU core of an Intel Core i7-8700K, running Debian 12 stable.)

My payload is basically a tar file of a minimized Debian 12 rootfs, plus some additional internal software -- nothing special. (Orig size: 554 MiB)

To summarize my test: even at --ultra -22, zstd does worse than xz -9 on every axis - time, RAM, and compressed size.
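
(For the curious: something along the following lines would reproduce this kind of comparison. This is only a sketch - payload.tar stands in for my actual tarball, and it assumes GNU time is available at /usr/bin/time for the memory figures.)

```
# Pin each run to one core; GNU time's -v output reports wall-clock time
# and "Maximum resident set size" (peak RSS, in KiB).
for cmd in "xz -k" "xz -7 -k" "xz -9 -k" "zstd" "zstd -19" "zstd --ultra -22"; do
    rm -f payload.tar.xz payload.tar.zst              # clean slate per run
    echo "== $cmd =="
    taskset -c 0 /usr/bin/time -v $cmd payload.tar 2>&1 \
        | grep -E 'Elapsed|Maximum resident'
    ls -l payload.tar.xz payload.tar.zst 2>/dev/null  # compressed size
done
```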



A backdoor in xz

Posted Mar 31, 2024 16:06 UTC (Sun) by stefanor (subscriber, #32895)

zstd deliberately trades compression-time resources for decompression speed.

The main promise of zstd over the other options is faster decompression, so I think it would only be fair to include that in the comparison.
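
For instance, something like this would capture it (a sketch; the file names are placeholders for compressed copies of the same payload):

```
# Decompress to stdout so both tools do the same output I/O.
taskset -c 0 /usr/bin/time -v xz -dc payload.tar.xz > /dev/null
taskset -c 0 /usr/bin/time -v zstd -dc payload.tar.zst > /dev/null
```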

A backdoor in xz

Posted Apr 1, 2024 19:33 UTC (Mon) by chris_se (subscriber, #99706)

> The main promise of zstd over the other options is faster decompression, so I think it would only be fair to include that in the comparison.

Sure, one could do that, and zstd would probably come out faster at decompression. But my original point was not to bash zstd - I was replying to the statement that zstd is always better than xz and that there is no reason to use xz at all. My second response, where I posted my measurements, was a little hyperbolic in order to underline that point.

Personally, I do quite like zstd - and if you look at my table, at its default settings it reduces a 554 MiB filesystem image down to 182 MiB (~33% of the original size) within just 2 seconds, which is a lot faster than many other tools can manage (~60 times faster than xz at its default settings). I do think zstd is an excellent algorithm to use as a default when no further constraints apply, because the trade-offs it has chosen are very sensible for many applications.

The only point I'm trying to drive home is that if you have certain constraints - such as needing the compressed output to be as small as reasonably possible - then zstd may not be the algorithm you want in every case (likely depending on the kind of data you are compressing), and rhetoric such as "always use zstd, xz is obsolete" is not helpful.

And while the broader public now knows a lot more about the past hardships of xz maintenance, hindsight is always 20/20, and I don't think the problems there were immediately obvious to most people who merely used xz. After-the-fact statements such as "people should not have been using xz anyway" are extremely unhelpful - not only because they are easy to make after the fact, but also because xz has real advantages in some situations and will remain a good choice where the constraints call for it.

