
Integration into file formats.

Posted Jan 15, 2026 2:48 UTC (Thu) by jepsis (subscriber, #130218)
In reply to: Integration into file formats. by jepsis
Parent article: Format-specific compression with OpenZL

Automatic decompression for such a file format is easy; compression is the hard part. To write an efficient representation you need clear intent, i.e. how the data is expected to be used (streaming, random access, read-heavy, write-heavy), what the lifecycle looks like (archival or live data, and whether recompression is expected), and how the data is structured internally (schema, value distributions, chunking, and ordering). Without this information, any attempt to choose compression automatically is mostly guesswork and will likely end up with a suboptimal result.
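To make the point about internal structure concrete, here is a minimal Python sketch. It is not OpenZL and not taken from the article; the record layout (a 32-bit timestamp every 60 seconds plus a noisy 10-bit reading) and every function name are made up for illustration. It feeds the same records to zlib twice: once as they sit in the file, and once after splitting the fields and delta-encoding the timestamps, which is only possible if the writer knows the schema.

```python
import random
import struct
import zlib


def make_records(n=10_000, seed=0):
    """Hypothetical sensor log: one timestamp per minute plus a noisy 10-bit reading."""
    rng = random.Random(seed)
    return [(1_700_000_000 + 60 * i, rng.randint(0, 1023)) for i in range(n)]


def compress_rows(records):
    """Structure-blind: compress the interleaved records exactly as stored."""
    raw = b"".join(struct.pack("<IH", ts, val) for ts, val in records)
    return zlib.compress(raw, 9)


def compress_columns(records):
    """Structure-aware: split the fields and delta-encode the timestamps.

    Assumes a non-empty list with non-decreasing timestamps that fit in 32 bits.
    """
    timestamps = [ts for ts, _ in records]
    # The first entry holds the absolute start time, the rest hold differences.
    deltas = [timestamps[0]] + [b - a for a, b in zip(timestamps, timestamps[1:])]
    raw = struct.pack(f"<{len(deltas)}I", *deltas)
    raw += struct.pack(f"<{len(records)}H", *(val for _, val in records))
    return zlib.compress(raw, 9)


def decompress_columns(blob):
    """Inverse of compress_columns: rebuild the (timestamp, value) pairs."""
    raw = zlib.decompress(blob)
    n = len(raw) // 6  # 4 bytes of delta + 2 bytes of value per record
    deltas = struct.unpack_from(f"<{n}I", raw, 0)
    values = struct.unpack_from(f"<{n}H", raw, 4 * n)
    timestamps, current = [], 0
    for delta in deltas:
        current += delta
        timestamps.append(current)
    return list(zip(timestamps, values))


if __name__ == "__main__":
    records = make_records()
    print("rows:   ", len(compress_rows(records)), "bytes")
    print("columns:", len(compress_columns(records)), "bytes")
```

On this synthetic data the column layout should come out smaller, because the timestamp column collapses into a long run of identical deltas. The transformation is only safe because the layout was known in advance; a generic compressor looking at the raw bytes has to guess.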



Integration into file formats.

Posted Jan 15, 2026 14:46 UTC (Thu) by willy (subscriber, #9762) [Link]

The two of you may be talking past each other a little. Whether building compression into the file format is a good idea depends on whether this is archival data or working set. There's value in "today's data is stored in foo, last year's data is stored in foo.gz". But sometimes we're always dealing with data that needs to be compressed, and then it's worth building it into the file format.
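The "foo today, foo.gz last year" arrangement can be sketched in a few lines of Python using only the standard library. The helper names `archive` and `open_log` are hypothetical, and the sketch assumes UTF-8 text files; it is one way to keep compression outside the file format, not a recommendation.

```python
import gzip
import shutil
from pathlib import Path


def archive(path):
    """Move `path` to `path`.gz once the data has gone cold."""
    plain = Path(path)
    packed = plain.with_name(plain.name + ".gz")
    with open(plain, "rb") as src, gzip.open(packed, "wb") as dst:
        shutil.copyfileobj(src, dst)
    plain.unlink()  # drop the uncompressed copy only after the .gz is written
    return packed


def open_log(path):
    """Open `path` as text, falling back to `path`.gz if only that exists."""
    plain = Path(path)
    packed = plain.with_name(plain.name + ".gz")
    if plain.exists():
        return open(plain, "rt", encoding="utf-8")
    if packed.exists():
        return gzip.open(packed, "rt", encoding="utf-8")
    raise FileNotFoundError(path)
```

Readers call `open_log("foo")` and never need to know whether the file has been archived. The cost is that the compressed copy is a whole-file stream, which is fine for archival data and a poor fit for a working set that needs random access or in-place updates.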


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.