OCI zstd
Posted May 14, 2025 23:02 UTC (Wed) by tianon (subscriber, #98676)
Parent article: The future of Flatpak
Just to be clear, the OCI has standardized support for zstd in general, but the clever zstd:chunked tricks are a podman-ecosystem specific format (that any zstd implementation should be able to handle reading, due to the way it hides the extra data in the chunking).
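(For the curious: the zstd spec defines "skippable frames" (a magic number in the range 0x184D2A50..0x184D2A5F, a 4-byte little-endian length, then arbitrary data) which conforming decoders must silently ignore; as I understand it, that's where zstd:chunked stashes its extra metadata. A minimal Go sketch of the idea, assuming github.com/klauspost/compress/zstd and a made-up metadata payload:

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

func main() {
	// The "real" payload, as an ordinary zstd frame.
	enc, _ := zstd.NewWriter(nil)
	data := enc.EncodeAll([]byte("tar member bytes..."), nil)

	// A skippable frame carrying side-channel metadata (this payload is
	// made up): 4-byte magic in 0x184D2A50..5F, 4-byte LE length, body.
	meta := []byte(`{"hypothetical":"chunk index"}`)
	var blob bytes.Buffer
	binary.Write(&blob, binary.LittleEndian, uint32(0x184D2A50))
	binary.Write(&blob, binary.LittleEndian, uint32(len(meta)))
	blob.Write(meta)
	blob.Write(data)

	// A stock decoder ignores the skippable frame and sees only the payload.
	dec, _ := zstd.NewReader(nil)
	out, err := dec.DecodeAll(blob.Bytes(), nil)
	fmt.Println(string(out), err)
}
```
)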
Posted May 14, 2025 23:35 UTC (Wed) by vasi (subscriber, #83946)
Posted May 15, 2025 15:39 UTC (Thu) by nliadm (subscriber, #94000)
The "zstd:chunked" and "estargz" schemes don't want stably-blocked output, they want random access to individual tar members. This means each member needs to be a complete output, which plays nicely with zstd and gzip's ability to be concatenated.
Posted May 15, 2025 16:36 UTC (Thu) by vasi (subscriber, #83946)
If it were just fast updates to container images, rsyncable compression (plus something like xdelta) would be sufficient.
If it were just partial fetches (i.e., fast access to individual files), we wouldn't really need to make each member independently compressed, losing much of our compression ratio on small files. You would just need framed compression, so you can jump to the beginning of a _block_, and a file index, so you know which blocks hold which files. This is basically what I built in pixz. It's generally fast enough to just grab the whole block containing a small file, without losing the compression advantages of reasonable block sizes.
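A rough sketch of that layout, with hypothetical names (not pixz's actual on-disk structures):

```go
// Hypothetical index structures for "framed compression plus a file
// index"; this only sketches the lookup path.
package index

type Block struct {
	CompressedOff   int64 // where this block starts in the archive
	UncompressedOff int64 // where its output starts in the tar stream
}

type File struct {
	Path            string
	UncompressedOff int64 // member's offset within the (uncompressed) tar
	Size            int64
}

type Index struct {
	Blocks []Block // sorted by UncompressedOff
	Files  []File  // sorted by Path
}

// To fetch one file: look up its File entry, find the Blocks covering
// [UncompressedOff, UncompressedOff+Size), seek to each block's
// CompressedOff, and decompress only those blocks.
```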
But if we also specifically need deduplication, even across entirely unrelated images, then I guess we really do need to have independent compression of files, like zstd:chunked does.
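For example (a hypothetical sketch, not zstd:chunked's actual scheme): once each file is a self-contained unit, a blob store can be keyed by a digest of the file's uncompressed contents, so a file shared by two otherwise-unrelated images is stored and fetched once:

```go
package dedup

import (
	"crypto/sha256"
	"encoding/hex"
)

// blobKey returns a content-addressed key for one file's contents, so
// identical files from unrelated images map to the same stored blob.
func blobKey(contents []byte) string {
	sum := sha256.Sum256(contents)
	return "sha256:" + hex.EncodeToString(sum[:])
}
```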
It just feels a bit unfortunate to have invented a bespoke ZIP-like archive format, whose only implementation is within `containers/storage`. I think 7zip has zip + zstd working nowadays, which would feel cleaner to me.
Posted May 15, 2025 17:36 UTC (Thu) by excors (subscriber, #95769)
I believe "periodically" means "if (sum of the last 4096 bytes) % 4096 == 0" (rounded up to the end of a string match), which incidentally is a very poor checksum that makes it pretty inefficient at compressing long sequences of a single byte (e.g. 1MB of /dev/zero compresses to 30KB, whereas 1MB of a repeated two-byte pattern compresses to 1KB). Anyway, it means that changing one byte in the middle of the uncompressed input should only affect the next <36KB of compressed output, so rsync's blocks should get back in sync soon afterwards.
Unfortunately, since (I think) the flushing *doesn't* prevent new Deflate blocks from referring to old data in the 32KB window, and a decompressor can only reconstruct that window by decompressing earlier Deflate blocks (which recursively depend on all data back to the start of the file), you can't use this to start decoding from the middle of a gzip --rsyncable file. You can (even without --rsyncable) construct a separate index file containing a subset of the block boundary positions and a copy of the 32KB window at each boundary, and use that to support reasonably efficient seeking to arbitrary positions within the compressed file. I've written some code to do that, but it's a bit awkward compared to a compressed file format with native support for random access.
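The side index described there might look something like this (zlib ships a C implementation of the same idea as examples/zran.c; these Go types are hypothetical):

```go
package zran

// One resume point inside a gzip stream.
type AccessPoint struct {
	CompressedOff   int64       // position of a Deflate block boundary (bit-accurate in practice)
	UncompressedOff int64       // corresponding offset in the output
	Window          [32768]byte // the 32KB of output preceding this point
}

type SeekIndex struct {
	Points []AccessPoint // sorted by UncompressedOff
}

// To read from offset n: take the last AccessPoint with
// UncompressedOff <= n, prime the inflater's history with Window, resume
// inflating at CompressedOff, and discard output until n is reached.
```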
(I'm not sure of the details of 'zstd --rsyncable' but it does look a bit more sensible than gzip's implementation - at least it's got a proper checksum function.)
Posted May 15, 2025 18:50 UTC (Thu) by vasi (subscriber, #83946)
You said you've written code to deal with this before; I'm curious where! I'd love to see how others have dealt with these issues.
Zstd unfortunately works similarly to gzip here: even with --rsyncable, each block depends on the previous window. But it at least has a multi-frame format specification, with multiple independent implementations: zstd's contrib dir, zstd-seekable-format-go, t2sz, and maybe more.
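To illustrate the multi-frame property with a common Go implementation (a sketch assuming github.com/klauspost/compress/zstd): each EncodeAll call emits one complete frame, the concatenation is a valid multi-frame stream, and with a table of frame offsets any frame can be decoded with no history from the others:

```go
package main

import (
	"bytes"
	"fmt"

	"github.com/klauspost/compress/zstd"
)

func main() {
	enc, _ := zstd.NewWriter(nil)
	defer enc.Close()

	// Two independent frames, simply concatenated.
	var blob bytes.Buffer
	for _, chunk := range []string{"block one\n", "block two\n"} {
		blob.Write(enc.EncodeAll([]byte(chunk), nil))
	}

	// The decoder walks all frames transparently; a seekable-format-style
	// index of frame offsets would allow decoding just one of them.
	dec, _ := zstd.NewReader(nil)
	defer dec.Close()
	out, _ := dec.DecodeAll(blob.Bytes(), nil)
	fmt.Print(string(out))
}
```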
Xz is really my favorite here, since in multi-threaded mode (which is on by default nowadays) it creates completely independent blocks. Yes, it gives up a tiny bit of compression ratio, but it enables both random access and parallel *de*compression.
Posted May 15, 2025 19:54 UTC (Thu) by excors (subscriber, #95769)
Posted May 17, 2025 6:20 UTC (Sat) by tianon (subscriber, #98676)
A friend of mine wrote https://github.com/jonjohnsonjr/targz, which is essentially extracted from the code that powers the layer browsing functionality of https://oci.dag.dev/ (https://github.com/jonjohnsonjr/dagdotdev). 👀
My understanding of oci.dag.dev is that he creates an index of the tar inside the stream (without modifying the original compression in any way). Then he gets clever and stores that in a tar.gz so that if the *index* gets too big, he can make a map of the index too and just recurse.
(However, my own understanding of the details is very surface level, so if I've got the details wrong maybe he'll finally make an account just to correct me! ❤️)
