Composefs for integrity protection and data sharing

Posted Dec 8, 2022 10:27 UTC (Thu) by hsiangkao (guest, #123981)
In reply to: Composefs for integrity protection and data sharing by diconico07
Parent article: Composefs for integrity protection and data sharing

> I'm not sure Composefs authors will see your comments here, this should be posted in the RFC thread, so that may trigger a discussion and maybe shed light on why they chose to do composefs rather than building over EROFS (maybe they just don't know about the features of EROFS you highlight here).

I could do that, but one reason I haven't yet, despite all of the above, is that I have not seen any useful comments (negative or positive) from experienced filesystem developers or the overlayfs folks about this overlay model, especially about the security concerns when fs-verity is unusable on the underlying filesystem. Also, if a brand-new filesystem is going to be introduced anyway, why insist on per-file, hardlink-like deduplication instead of a finer-grained approach such as the chunk/block-based deduplication used by casync (which already covers per-file deduplication)?

Also, months ago one of the composefs authors asked me on a Slack channel about "building over EROFS", just after I noticed the composefs work. Overall, I think it would not be hard for EROFS to support such a per-file model, but my question is still the one above: does the model really sound reasonable? If it does, I'm quite happy to develop this in a few days if they don't have time. I asked him to raise it with the Linux filesystem community, but it seems they still insist on doing this.

Later I also noticed another filesystem called ostreefs [1], but I haven't had time to look into it closely.

> Concerning the lack of coverage of newest EROFS features, maybe you can write some article about it, it is not unusual to have guest writers on LWN (please note that I'm not part of LWN and don't know their policy for such articles) and I'd love to hear more about these features and how to use them.

Thank you! If LWN's experienced writers keep overlooking the EROFS/Nydus work, even though EROFS has become more powerful and popular in recent years (almost all mainstream Linux distributions now ship EROFS), then, although I'm not a native English speaker, I will try my best to show all the potential use cases. Thank you again!

[1] https://github.com/ostreedev/ostreefs

Posted Dec 8, 2022 16:54 UTC (Thu) by gscrivano (subscriber, #74830)

Let me try to add some more context. I contacted you on Slack because I was curious to see if there was a way to achieve with EROFS what I was experimenting with in composefs. It would have been easier to do it with something already present in the upstream kernel, but EROFS immediately seemed very different from what I had in mind.

I don't see composefs competing with EROFS. They can be used for similar use cases but I think they are very different.

You've started EROFS to improve over squashfs, while I was looking at how we could improve overlayfs, and especially how it is used with OCI containers: overlayfs puts together directories and composefs puts together files.

composefs would have probably been forgotten as another toy project if Alex hadn't added support for fs-verity and fixed the image format to be fully reproducible.

It is much simpler than EROFS and does very few things. Its simplicity is reflected in the source code:

$ wc -l ~/composefs/kernel/*.[hc] | tail -1
2287 total
$ wc -l ~/linux/fs/erofs/*.[hc] | tail -1
9098 total

We are mostly about putting together already existing pieces. We do not implement any deduplication or encryption, we just use what is already in the kernel.

Overall, I think composefs is a big improvement on what we have today. With just a few features, it solves a list of long-standing issues that exist with containers. From more serious ones like the lack of file integrity checks to more mundane ones like why do users have to worry about how they sort their ADD and RUN statements in a Dockerfile to optimize the reusing of layers/files? It is also useful without containers, it extends fs-verity to entire directories!

Another point is that a long-term goal is to be able to use composefs from a user namespace. Would that ever be possible with EROFS and cachefiles?

You've pointed out the lack of a finer granularity for the deduplication. That is a conscious tradeoff: having to work at the file level simplifies the implementation. Now, composefs needs only to open the file from the underlying file system and delegate any operation to it. It doesn't have to worry about how chunks are glued together.
I've nothing against this feature though, we just didn't need it for our use cases. If a finer granularity is useful, then it can be added in the future.
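
To make the granularity tradeoff concrete, here is a minimal Python sketch. It is not composefs, EROFS, or casync code: the file names, contents, and the tiny fixed chunk size are all made up for illustration, and real chunk-based tools such as casync use content-defined rather than fixed-size chunks. It only compares how many bytes must be stored when sharing whole files versus sharing chunks.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical and tiny on purpose, so the effect is visible


def per_file_store(files):
    """Store each distinct file once, keyed by the digest of its full content."""
    store = {}
    for data in files.values():
        store[hashlib.sha256(data).hexdigest()] = data
    return store


def per_chunk_store(files, chunk_size=CHUNK_SIZE):
    """Store each distinct fixed-size chunk once, keyed by the chunk digest."""
    store = {}
    for data in files.values():
        for off in range(0, len(data), chunk_size):
            chunk = data[off:off + chunk_size]
            store[hashlib.sha256(chunk).hexdigest()] = chunk
    return store


def stored_bytes(store):
    """Total payload bytes kept in a store."""
    return sum(len(blob) for blob in store.values())


# Made-up example tree: two identical files and one near-duplicate.
files = {
    "a/libfoo.so": b"AAAABBBBCCCC",
    "b/libfoo.so": b"AAAABBBBCCCC",  # identical copy: both schemes share it
    "c/libfoo.so": b"AAAABBBBDDDD",  # differs only in its last chunk
}

print("per-file bytes: ", stored_bytes(per_file_store(files)))
print("per-chunk bytes:", stored_bytes(per_chunk_store(files)))
```

Per-file sharing only helps for exact copies, while per-chunk sharing also helps for near-duplicates, at the cost of having to track how chunks are glued back into files.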

Posted Dec 8, 2022 17:16 UTC (Thu) by hsiangkao (guest, #123981)

> Let me try to add some more context. I contacted you on Slack because I was curious to see if there was a way to achieve with EROFS what I was experimenting with in composefs. It would have been easier to do it with something already present in the upstream kernel, but EROFS immediately seemed very different from what I had in mind.
> I don't see composefs competing with EROFS. They can be used for similar use cases but I think they are very different.

Please first give more details about the cases in which they behave differently.

> You've started EROFS to improve over squashfs, while I was looking at how we could improve overlayfs, and especially how it is used with OCI containers: overlayfs puts together directories and composefs puts together files.

That is why I'd like to hear what the overlayfs folks think about this overlay model on the mailing list before I start a thread there myself.

> composefs would have probably been forgotten as another toy project if Alex hadn't added support for fs-verity and fixed the image format to be fully reproducible.

No, composefs currently doesn't even take endianness into account, just like the very first version of Squashfs. I'm not sure how it could be _fully_ reproducible.
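
As a small illustration of why byte order matters for a reproducible image format, here is a Python sketch. The two header fields and their values are hypothetical and have nothing to do with the actual composefs or Squashfs layouts; the point is only that packing the same values with the host's native byte order gives different bytes on little-endian and big-endian machines, while a fixed byte order does not.

```python
import struct

# Hypothetical header fields, for illustration only (not a real on-disk format).
MAGIC = 0x12345678
INODE_COUNT = 3

# "=" uses the byte order of the machine running this script.
native = struct.pack("=II", MAGIC, INODE_COUNT)

# "<" and ">" pin the byte order, so the output is the same on every host.
little = struct.pack("<II", MAGIC, INODE_COUNT)
big = struct.pack(">II", MAGIC, INODE_COUNT)

print("little-endian image bytes:", little.hex())
print("big-endian image bytes:   ", big.hex())
print("native matches little:    ", native == little)

# A reader that knows the format is little-endian decodes it on any host.
magic, inode_count = struct.unpack("<II", little)
```

An image writer that dumps native structures would emit the `native` form, which is why two hosts of different endianness could produce different images from the same input.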

As I said above, I don't think the fs-verity protection actually works if the underlying filesystem doesn't support fs-verity, and that includes one of composefs's own examples: FUSE.

> It is much simpler than EROFS and does very few things. Its simplicity is reflected in the source code:
> $ wc -l ~/composefs/kernel/*.[hc] | tail -1
> 2287 total
> $ wc -l ~/linux/fs/erofs/*.[hc] | tail -1
> 9098 total

Please take the initial EROFS version from Linux 4.19 as the baseline (and exclude the EROFS compression code, since composefs doesn't support compression). EROFS today aims to be a generic filesystem for all backends (block, file, and later MTD) with a lot of features.
Also, let's make a wild guess: if composefs is eventually merged and you'd like to add more features, can its codebase stay at the same level?

> Overall, I think composefs is a big improvement on what we have today. With just a few features, it solves a list of long-standing issues that exist with containers. From more serious ones like the lack of file integrity checks to more mundane ones like why do users have to worry about how they sort their ADD and RUN statements in a Dockerfile to optimize the reusing of layers/files? It is also useful without containers, it extends fs-verity to entire directories!

EROFS will support self-contained data integrity later (I assume also by using fs-verity), unlike composefs, which just calls fsverity_get_digest().

> You've pointed out the lack of a finer granularity for the deduplication. That is a conscious tradeoff: having to work at the file level simplifies the implementation. Now, composefs needs only to open the file from the underlying file system and delegate any operation to it. It doesn't have to worry about how chunks are glued together.

I don't know why you think chunk-based indexes are complex; see [2] and [3].

> I've nothing against this feature though, we just didn't need it for our use cases. If a finer granularity is useful, then it can be added in the future.

And then it would be very much like EROFS.

[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...
[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/...

Posted Dec 8, 2022 18:06 UTC (Thu) by gscrivano (subscriber, #74830)

> Please first give more details about the cases in which they behave differently.

Given a directory with a bunch of files, how do you set up an EROFS mount that contains the metadata for the file system but refers to the files for their actual payload, without requiring the image blob to be transformed to a different format first?

This is the use case I am interested in.

> No, composefs currently doesn't even take endianness into account, just like the very first version of Squashfs. I'm not sure how it could be _fully_ reproducible.

What version are you looking at? It does since https://github.com/containers/composefs/pull/24/commits/5...

> Also, let's make a wild guess: if composefs is eventually merged

We posted an RFC to gather feedback, after working on it for quite some time, to see if people find it useful, but you are treating it as if it were an attack on EROFS. It is not.

> if you'd like to add more features, can its codebase stay at the same level?

From the discussion we just had, it seems EROFS still misses page cache sharing and data-integrity check, so it is likely EROFS will grow more as well?

Posted Dec 8, 2022 18:39 UTC (Thu) by hsiangkao (guest, #123981)

> Given a directory with a bunch of files, how do you set up an EROFS mount that contains the metadata for the file system but refers to the files for their actual payload, without requiring the image blob to be transformed to a different format first?
> This is the use case I am interested in.

Why wouldn't EROFS work like this? Consider each EROFS blob as holding the data of a single file (a blob is currently identified by a 16-bit blob ID, but that can be extended if you really need a massive number of per-file blobs, as OSTree does), and let each EROFS file have only _one_ chunk pointing to one blob ID.

Does that behave any differently? You only change the integer blob ID into a string and restrict the layout to one chunk per file.
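
As a rough sketch of that argument in Python (purely a toy data model: the class names, fields, and object names below are hypothetical and are not the EROFS or composefs on-disk structures), a chunk-based index restricted to one chunk per file carries the same information as a per-file entry that names its backing object by a string:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    blob_id: int      # index into a table of blobs
    blob_offset: int  # where the chunk starts inside that blob
    length: int


@dataclass(frozen=True)
class ChunkedFile:
    """General chunk-based model: a file is a sequence of chunks."""
    chunks: tuple


@dataclass(frozen=True)
class PerFileEntry:
    """Per-file model: a file is backed by exactly one named object."""
    backing_name: str
    length: int


def to_per_file(f, blob_names):
    """Translate a chunked file into a per-file entry when the restriction holds."""
    if len(f.chunks) != 1 or f.chunks[0].blob_offset != 0:
        raise ValueError("not a one-file, one-chunk layout")
    chunk = f.chunks[0]
    # The integer blob ID becomes a string naming the backing object.
    return PerFileEntry(blob_names[chunk.blob_id], chunk.length)


# Hypothetical blob table: integer ID -> name of the backing object.
blob_names = {0: "objects/aa", 1: "objects/bb"}

single = ChunkedFile((Chunk(blob_id=1, blob_offset=0, length=4096),))
print(to_per_file(single, blob_names))
```

The translation only fails for files that really use several chunks, which is exactly the case the per-file model gives up.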

The only difference is that EROFS uses fscache to manage its cache but that is partially due to our lazy pulling requirement. I also prefer to manage such blobs with a unified in-kernel framework rather than directly accessing arbitrary files on the underlying filesystem without some permission check. As an example, take a composefs file "/bin/su" whose backing file is suddenly replaced by a malicious root shell: if fs-verity is disabled, how do you prevent this? Overlayfs, on the other hand, doesn't have this issue, since it doesn't keep a separate set of permissions. You could refer to the Incremental FS discussion [1]. Also, EROFS already has an in-house version that accesses files directly for our special uses [2].
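
The "/bin/su" scenario can be mimicked in user space with a short Python sketch. This is only an analogy under stated assumptions: a plain SHA-256 of the whole file stands in for an fs-verity digest (real fs-verity hashes a Merkle tree and enforces it in the kernel), and the file name and the `open_backing_file` helper are hypothetical.

```python
import hashlib
import os
import tempfile


def digest(data):
    # Stand-in for an fs-verity digest; real fs-verity uses a Merkle tree.
    return hashlib.sha256(data).hexdigest()


def open_backing_file(path, expected_digest, verify=True):
    """Return the file content, refusing it if the recorded digest does not match."""
    with open(path, "rb") as f:
        data = f.read()
    if verify and digest(data) != expected_digest:
        raise PermissionError(f"digest mismatch for {path}")
    return data


with tempfile.TemporaryDirectory() as tmp:
    backing = os.path.join(tmp, "su-object")  # hypothetical backing file
    with open(backing, "wb") as f:
        f.write(b"genuine su binary")
    recorded = digest(b"genuine su binary")   # kept in the trusted metadata image

    with open(backing, "wb") as f:            # the backing file gets replaced
        f.write(b"malicious root shell")

    # Without verification the replaced content is served as if it were "su".
    unverified = open_backing_file(backing, recorded, verify=False)

    # With verification the mismatch is caught before the content is used.
    try:
        open_backing_file(backing, recorded, verify=True)
        detected = False
    except PermissionError:
        detected = True

print("served without verification:", unverified)
print("replacement detected with verification:", detected)
```

The sketch shows why the question matters: the metadata image can still carry the file's original mode bits, but without a digest check nothing ties those bits to the content that is actually read.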

> Also, let's make a wild guess: if composefs is eventually merged
> We posted an RFC to gather feedback, after working on it for quite some time, to see if people find it useful, but you are treating it as if it were an attack on EROFS. It is not.

I just want to say that composefs is very similar to EROFS.

> From the discussion we just had, it seems EROFS still misses page cache sharing and data-integrity check, so it is likely EROFS will grow more as well?

Jingbo Xu is working on page cache sharing for Linux 6.3.
Data-integrity checking and encryption for confidential containers will be discussed on the mailing list right after page cache sharing has landed.

[1] https://lore.kernel.org/all/20190502040331.81196-1-ezemts...
[2] https://github.com/alibaba/cloud-kernel/commit/6654d200b4...

Posted Dec 8, 2022 21:24 UTC (Thu) by gscrivano (subscriber, #74830)

Thanks for the useful information.

> The only difference is that EROFS uses fscache to manage its cache but that is partially due to our lazy pulling requirement

So that is a significant difference. If I understand correctly, we would need to either set up fscache and populate its cache or run a separate daemon before we can use this mechanism.

Would it ever be possible to use fscache from a user namespace?

Posted Dec 9, 2022 2:30 UTC (Fri) by hsiangkao (guest, #123981)

> So that is a significant difference. If I understand correctly, we would need to either set up fscache and populate its cache or run a separate daemon before we can use this mechanism.

Sorry, I had gone to sleep. The ByteDance folks have already developed an fscache failover feature and a fully daemonless mode for their production cloud, which would also be useful to all network filesystems. Basically, we have already developed a lot of features for fscache; they just need time to be upstreamed.

Overall, I was just trying to say that composefs is currently very similar to EROFS. Even though there are some differences (such as directly accessing files), those can be accommodated without any difficulty.

> Would it ever be possible to use fscache from a user namespace?

Sorry, I missed this part earlier. I think EROFS has the same security model as any on-disk filesystem with an on-disk permission model (no matter whether it is block-based or file-based). So the question is no different from the one for other on-disk filesystems, including composefs.

Posted Dec 8, 2022 18:45 UTC (Thu) by hsiangkao (guest, #123981)

> What version are you looking at? It does since https://github.com/containers/composefs/pull/24/commits/5...

I'm sorry, I hadn't followed the most recent version. Glad to know this has already been improved.