An introduction to EROFS

Posted Jun 7, 2023 15:51 UTC (Wed) by hsiangkao (guest, #123981)
Parent article: An introduction to EROFS

Sorry about my spoken English. I would like to add a few clarifications:

> The earlier read-only filesystems had many limitations, such as not supporting compression,

Here I meant that they can work effectively without compression.

ROMFS might be an option, but as far as I understand it has no concept of blocks, so we still need to do an extra memcpy for buffered I/O; see:
romfs_read_folio() -> romfs_dev_read() -> romfs_blk_read().
That also makes direct I/O and FSDAX meaningless. In addition, the ROMFS and CRAMFS on-disk formats are themselves quite limited.

> uses smaller compression block sizes, which reduces the memory amplification that occurs with SquashFS.

EROFS can use a 1 MiB pcluster size, just as Squashfs can, but the scenarios EROFS was originally proposed for work well with smaller pcluster sizes (4/8/16 KiB, for example; EROFS uses a 4 KiB pcluster by default), because we would like to enable compression for users without an extra memory footprint. The previous approach (I mean the indexes) is not very good at these small compression units (you could benchmark with a 4/8/16 KiB compression unit instead of the typical 128 KiB, for example).
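To make the memory-amplification point concrete, here is a rough arithmetic sketch in Python. It assumes, as a simplification rather than a description of any particular filesystem's internals, that serving a read requires fully decompressing every compression unit the read touches. `read_amplification` is a hypothetical helper name.

```python
KIB = 1024

def read_amplification(request_bytes: int, unit_bytes: int) -> float:
    """Decompressed bytes per requested byte for one unit-aligned read.

    Simplifying assumption: every compression unit touched by the read
    has to be decompressed in full.
    """
    units_touched = -(-request_bytes // unit_bytes)  # ceiling division
    return units_touched * unit_bytes / request_bytes

# Compare a 4 KiB read against several compression unit sizes.
for unit_kib in (4, 16, 128, 1024):
    amp = read_amplification(4 * KIB, unit_kib * KIB)
    print(f"{unit_kib:>5} KiB unit: {amp:.0f}x decompressed data for a 4 KiB read")
```

Under this model, the amplification for small random reads grows linearly with the compression unit size, which is why small units matter on memory-constrained systems.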

Finally, I'd like to mention that EROFS now also supports global compressed-data deduplication with a rolling hash, so if there is similar data that is not block-aligned (text such as source code, or similar Wikipedia versions), it might be useful to combine deduplication and compression this way.
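The general idea of rolling-hash deduplication of unaligned data can be sketched as follows. This is only an illustration of content-defined chunking, not the EROFS implementation: the polynomial hash, the `WINDOW`, `BASE`, `MOD` and `MASK` values, and the `cdc_chunks` and `fixed_chunks` helpers are all assumptions made for the example, and it works on plain uncompressed bytes.

```python
import random

WINDOW = 16            # bytes covered by the rolling hash (assumed value)
BASE = 257             # polynomial base (assumed value)
MOD = (1 << 31) - 1    # hash modulus (assumed value)
MASK = 0xFF            # cut when the low 8 bits are zero: ~256-byte chunks

def cdc_chunks(data: bytes) -> list[bytes]:
    """Split data where the hash of the last WINDOW bytes matches MASK."""
    top = pow(BASE, WINDOW - 1, MOD)
    h, start, chunks = 0, 0, []
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # append the newest byte
        # Only cut on a full window, so a boundary depends on content alone.
        if i >= WINDOW - 1 and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def fixed_chunks(data: bytes, size: int = 256) -> list[bytes]:
    """Split data into fixed-size, offset-aligned blocks for comparison."""
    return [data[i:i + size] for i in range(0, len(data), size)]

rng = random.Random(0)
base = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
shifted = b"prefix!" + base  # same content, shifted by 7 bytes

for name, split in (("content-defined", cdc_chunks), ("fixed-size", fixed_chunks)):
    a, b = set(split(base)), set(split(shifted))
    print(f"{name}: {len(a & b)} of {len(a)} chunks shared after the shift")
```

Because each cut point depends only on the bytes inside the window, inserting a few bytes at the front disturbs only the first chunk, whereas fixed-size blocks all change.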



An introduction to EROFS

Posted Jun 8, 2023 9:23 UTC (Thu) by gmgod (guest, #143864) [Link] (4 responses)

Hello, this looks like very exciting work that seems to better fit lots of use cases people currently have (from initramfs to specific-need archiving, to a base for "immutable" OS, VMs and containers).

Two questions as someone who has not followed the advent of EROFS:

1. Do you have strong tampering prevention guarantees built-in (beyond being immutable of course) or is that something people have to figure out outside of EROFS?

2. Is EROFS agnostic of compression methods? Or said otherwise is it modular enough to use different compression/filtering methods? (I am aware that you are covering the two main cases people would want with your current choice: I am not questioning that.)

An introduction to EROFS

Posted Jun 8, 2023 10:08 UTC (Thu) by hsiangkao (guest, #123981) [Link] (1 responses)

> Two questions as someone who has not followed the advent of EROFS:
> 1. Do you have strong tampering prevention guarantees built-in (beyond being immutable of course) or is that something people have to figure out outside of EROFS?

Do you mean resistance to malicious images? We always try our best to deal with fuzzing issues and fix them as quickly as possible, and currently we have no outstanding fuzzing issues at hand. That is the only guarantee I can give here.

> 2. Is EROFS agnostic of compression methods? Or said otherwise is it modular enough to use different compression/filtering methods? (I am aware that you are covering the two main cases people would want with your current choice: I am not questioning that.)

It depends. In principle, any compression method could be added to EROFS directly without modification. However, EROFS data, including compressed data, is block-aligned (IMHO, like btrfs and f2fs compression but unlike squashfs), so if a compression method doesn't support the optimized fit-block approach (a.k.a. fixed-sized output compression; currently only lz4 and lzma have it, and I'm working on deflate now), the last block (usually with a 4k block size) of each pcluster (4k, 8k, ... up to 1m) will not be completely filled with compressed data. That causes some compression ratio loss if the pcluster is small (like 4k or 8k), but I think it can be ignored if the pcluster size itself is large, like 128k or more.
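A small worked example of that ratio loss, as a sketch only: it assumes a 4096-byte block and a compressor without fixed-sized output, whose result is simply rounded up to whole blocks. `padded_size` and `wasted_fraction` are hypothetical helper names, and the compressed sizes are made-up inputs.

```python
BLOCK = 4096  # assumed filesystem block size in bytes

def padded_size(compressed_bytes: int, block: int = BLOCK) -> int:
    """Round a compressed size up to whole blocks (block-aligned storage)."""
    return -(-compressed_bytes // block) * block  # ceiling division

def wasted_fraction(compressed_bytes: int, block: int = BLOCK) -> float:
    """Share of the stored space that is padding in the last block.

    compressed_bytes must be positive.
    """
    stored = padded_size(compressed_bytes, block)
    return (stored - compressed_bytes) / stored

# Hypothetical compressed sizes: one that ends up in an 8k pcluster and
# one that ends up in a 128k pcluster.
for compressed in (5000, 130000):
    stored = padded_size(compressed)
    print(f"{compressed} bytes -> {stored} stored, "
          f"{wasted_fraction(compressed):.1%} padding")
```

The absolute padding is bounded by one block in both cases, so its relative cost shrinks as the pcluster grows.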

In practice, I tend to avoid adding new algorithms to EROFS casually before designing them in carefully, since that could cause compatibility problems and a maintenance burden if I later change to the optimal approach. In short, this year I will land the deflate algorithm to enable deflate hardware accelerators (and maybe more; I'm still planning with the compression algorithm folks).

An introduction to EROFS

Posted Jun 8, 2023 10:39 UTC (Thu) by hsiangkao (guest, #123981) [Link]

> like btrfs and f2fs compression

To add a few words: I just meant that compressed data is block-aligned like in those filesystems, as far as I understand, but EROFS can actually handle arbitrary decompressed offsets/lengths rather than only block-aligned ones, unlike f2fs/btrfs. As a result, EROFS has been able to do block-unaligned rolling-hash compressed data deduplication (also called CDC) since Linux v6.1.

In principle, we could record a byte-granularity decompressed offset/length pair and an arbitrary byte-granularity compressed offset/length pair for each compression unit, but that makes the on-disk indexes inefficient (metadata I/O) and even makes random access to the on-disk index impossible. In addition, unaligned compressed data is unfriendly to caching and in-place I/O strategies.

For more details on the design, you can also refer to the EROFS ATC19 paper and the kernel documentation if needed.

An introduction to EROFS

Posted Jun 9, 2023 15:48 UTC (Fri) by bobolopolis (subscriber, #119051) [Link] (1 responses)

> 1. Do you have strong tampering prevention guarantees built-in (beyond being immutable of course) or is that something people have to figure out outside of EROFS?

dm-verity is probably your best bet for this, which would let you use erofs, squashfs, or whatever other read-only filesystem you want. I've been pretty happy with dm-verity + squashfs in past projects; I'm sure erofs would work great too.

An introduction to EROFS

Posted Jun 9, 2023 16:35 UTC (Fri) by hsiangkao (guest, #123981) [Link]

> dm-verity is probably your best bet for this, which would let you use erofs, squashfs, or whatever other read-only filesystem you want. I've been pretty happy with dm-verity + squashfs in past projects, I'm sure erofs would work great too.

Signed, verified images are fine for this (if users just trust the signature). I think LWN will later post about the following LSF/MM FS track topics; the related matters were discussed several times in several separate topics.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds