OCI is an antiquated format, not fit for modern security requirements
Posted May 17, 2025 11:16 UTC (Sat) by bluca (subscriber, #118303)
In reply to: OCI is an antiquated format, not fit for modern security requirements by Cyberax
Parent article: The future of Flatpak
Other OSes I really don't care about, and I am pretty sure they are irrelevant for Flatpak too, which is the subject of the article.
Posted May 17, 2025 18:17 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Squashfs is not too hard to support, as it's just barely more complex than tar. But then it also has a lot of tar's problems. EROFS is better, but it's also more complicated. And this means more space for potential issues.
And file formats for something like container images should be as simple as possible.
> Other OSes I really don't care about, and I am pretty sure they are irrelevant for Flatpak too, which is the subject of the article.
Sure, but then it's back to the status quo: Flatpak will remain a unique snowflake with slowly decaying tooling.
Posted May 21, 2025 16:36 UTC (Wed)
by hsiangkao (guest, #123981)
[Link] (2 responses)
I'm tired of writing comments on LWN.net, simply because I don't get where those biased points come from.
- It doesn't have an old-style centralized on-disk inode table as SquashFS does (like extX and minix). In fact, EROFS on-disk inodes can be placed anywhere on disk if needed, as in modern filesystems like XFS, Btrfs, etc., so it's quite easy to do incremental builds (e.g. adding new inodes and data) without expanding and rewriting an entire inode table;
- It doesn't have extra on-disk directory indices (https://dr-emann.github.io/squashfs/squashfs.html#_directory_index) to speed up inode lookup in large directories. Without those indices, a SquashFS directory can only be searched linearly because of its on-disk dirent design. EROFS dirents, by contrast, are simple and strictly sorted in alphabetical order, so binary search works natively. I've tested some AI datasets where each directory contains millions of files, and EROFS random access performance is even better than SOTA EXT4.
- The core on-disk format has just three parts: superblock, 32- or 64-byte inodes (rather than one layout per inode type to save a negligible amount of space) and dirents: https://erofs.docs.kernel.org/en/latest/core_ondisk.html. I have no idea where the "more space for potential issues" would come from, because it just behaves as an fsblock-aligned archive format;
- EROFS uncompressed data is strictly fsblock-based, which means data can be fetched directly via DMA into the page cache without extra post-processing. SquashFS data is unaligned: even though it supports an uncompressed mode, it still needs a memcpy to handle the misalignment. Thus EROFS also natively supports advanced runtime features like FSDAX (XIP), direct I/O, etc.
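To make the directory-lookup point concrete, here is a toy sketch comparing a linear scan with a binary search over name-sorted entries. It is not EROFS or SquashFS code: the function names `linear_lookup` and `sorted_lookup`, the in-memory lists, and the file names are all assumptions made for illustration.

```python
def linear_lookup(names, inodes, name):
    """Scan entries one by one; returns (inode or None, comparisons made)."""
    comparisons = 0
    for i, entry in enumerate(names):
        comparisons += 1
        if entry == name:
            return inodes[i], comparisons
    return None, comparisons


def sorted_lookup(names, inodes, name):
    """Binary search over names sorted in byte order.

    Returns (inode or None, probes made).
    """
    lo, hi = 0, len(names)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if names[mid] == name:
            return inodes[mid], probes
        if names[mid] < name:
            lo = mid + 1
        else:
            hi = mid
    return None, probes


# Hypothetical directory with 100,000 entries; zero-padded names are
# already in sorted order.
names = [f"file{i:06d}".encode() for i in range(100_000)]
inodes = list(range(1000, 1000 + len(names)))

target = b"file099999"
print("linear:", linear_lookup(names, inodes, target))
print("binary:", sorted_lookup(names, inodes, target))
```

The probe count for the sorted case grows with the logarithm of the directory size, which is why no separate index structure is needed once the entries themselves are kept in order.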
Posted May 21, 2025 17:29 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
I'm not arguing that EROFS or Squashfs are bad, they are just more complex, and I want something as simple as possible with the widest amount of tooling available.
Posted May 21, 2025 17:40 UTC (Wed)
by hsiangkao (guest, #123981)
[Link]
How simple? tar consists of `tar header` and `data`. It was designed for tape devices and it doesn't even support random access to metadata (you can never know what the root directory looks like until you reach the last `tar header`, because that last entry might be in the root directory).
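A small sketch of that limitation, using Python's standard `tarfile` module on an in-memory archive. The member names and the helper functions `build_demo_tar` and `list_root` are made up for illustration; the point is only that the root listing is complete after every header has been read, not before.

```python
import io
import tarfile


def build_demo_tar():
    """Build a small in-memory tar whose last member sits in the root."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("usr/bin/tool", "usr/lib/libdemo.so", "late-root-file"):
            data = b"x" * 10
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


def list_root(tar_stream):
    """Return (sorted root-level names, number of headers read)."""
    root = set()
    headers_read = 0
    with tarfile.open(fileobj=tar_stream, mode="r") as tar:
        for member in tar:  # sequential walk, one header after another
            headers_read += 1
            root.add(member.name.split("/", 1)[0])
    return sorted(root), headers_read


root_names, headers_read = list_root(build_demo_tar())
print(root_names, headers_read)
```

Here `late-root-file` only shows up once the final header has been parsed, so stopping early would give an incomplete view of the root directory.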
EROFS core on-disk format can be implemented in ~500 lines (for example, https://github.com/dmcgowan/go-erofs/blob/main/erofs.go) if you don't implement optimized binary search and xattrs.
It's basically just a combination of three basic on-disk parts: superblock + inodes + dirents, as you can see at https://erofs.docs.kernel.org/en/latest/core_ondisk.html. Apart from the on-disk superblock, inodes and dirents can be arranged freely. Dirents are designed for random access, but you could just implement the naive way. I wonder what could be simpler than this form without giving up extensibility?
EROFS does implement many optional advanced features such as ACLs, FSDAX, direct I/O, file-backed mounts, a heavily optimized decompression subsystem with in-place I/O, etc. But that doesn't mean the on-disk format is complex.
> And file formats for something like container images should be as simple as possible.
EROFS core on-disk format (e.g. used for ComposeFS) is much simpler, more flexible and more efficient:
> I'm not arguing that EROFS or Squashfs are bad, they are just more complex, and I want something as simple as possible with the widest amount of tooling available.