On-disk format robustness requirements for new filesystems
As suggested by its name (and its acronym), EROFS is a read-only filesystem. It was developed at Huawei, and is intended for use in Android systems. EROFS is meant to differ from existing read-only filesystems in the area of performance; it uses a special compression algorithm that creates fixed-length blocks, which, it is claimed, allow random access to compressed data with a minimum of excess I/O and decompression work. Details can be found in this USENIX paper [PDF] published in July.
Gao has made several requests in recent times to move EROFS out of the staging tree; the latest was posted on August 17. It read:
(EMUI is Huawei's version of Android.)
It would seem that there is little opposition to this move in general. As part of reviewing the code, though, Richard Weinberger noticed that the code generally trusts the data it reads from disks, often failing to check it for reasonableness. He quickly found a way to create a malformed filesystem that would put the kernel into an infinite loop, creating a system that is a bit more read-only than anybody had in mind. The problem was fixed just as quickly, but not before starting a discussion on whether robustness against hostile filesystem images should be a requirement for new filesystems entering the kernel.
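The class of bug described above can be illustrated with a small sketch. It does not use the EROFS format; the block layout, `BLOCK_SIZE`, `HEADER`, `read_chain()`, and `make_block()` are all hypothetical names invented for this example. The idea is that a reader following "next block" pointers from an image must bound-check every field and refuse to revisit a block, or a crafted image can send it around a cycle forever.

```python
import struct

BLOCK_SIZE = 16                    # hypothetical block size, in bytes
HEADER = struct.Struct("<II")      # (next_block, payload_len), little-endian
END_OF_CHAIN = 0xFFFFFFFF          # hypothetical end-of-chain marker


class CorruptImage(Exception):
    """Raised when on-disk metadata fails a sanity check."""


def read_chain(image: bytes, start: int) -> bytes:
    """Follow a chain of blocks, validating every field read from the image."""
    nblocks = len(image) // BLOCK_SIZE
    max_payload = BLOCK_SIZE - HEADER.size
    out = bytearray()
    seen = set()
    block = start
    while block != END_OF_CHAIN:
        # Never index outside the image on the word of the image itself.
        if block >= nblocks:
            raise CorruptImage(f"block {block} is outside the image")
        # A block reached twice means the chain loops; stop instead of spinning.
        if block in seen:
            raise CorruptImage(f"block {block} visited twice (cycle)")
        seen.add(block)
        off = block * BLOCK_SIZE
        next_block, length = HEADER.unpack_from(image, off)
        if length > max_payload:
            raise CorruptImage(f"payload length {length} exceeds block")
        out += image[off + HEADER.size : off + HEADER.size + length]
        block = next_block
    return bytes(out)


def make_block(next_block: int, payload: bytes) -> bytes:
    """Build one zero-padded block; a helper for constructing test images."""
    return (HEADER.pack(next_block, len(payload)) + payload).ljust(BLOCK_SIZE, b"\0")
```

With these definitions, an image built as `make_block(1, b"a") + make_block(0, b"b")` points block 0 at block 1 and block 1 back at block 0; `read_chain()` raises `CorruptImage` on it rather than looping. A reader without the `seen` check would never terminate on the same input.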
Nobody disagrees that it would be a good thing if a filesystem implementation would do the right thing when faced with a hostile (or merely corrupt) filesystem image; that would make it possible to allow unprivileged users to mount filesystems without fear of handing over the keys to the entire system, for example. But, as Ted Ts'o pointed out, heavily used, in-kernel filesystems like ext4 and XFS don't meet that standard now, so requiring new filesystems to reach that level of robustness is presenting them with a higher bar:
In the case of EROFS, as Chao Yu pointed out, the intended use case makes this kind of robustness less important. The Android system images shipped in this filesystem format will be verified with a system like dm-verity, so the filesystem implementation should not be confronted with anything other than signed and verified images. Even so, the EROFS developers agree that this kind of bug should be actively sought out and fixed.
It seems that views about robustness against bad images vary somewhat among filesystem developers. With regard to these bugs in ext4, Ts'o said that "while I try to address them, it is by no means considered a high priority work item". He characterized the approach of the XFS developers as being similar. Christoph Hellwig disagreed strongly with that claim, though, saying that XFS developers work hard to handle corrupt filesystem images, "although there are of course no guarantees". Eric Biggers asserted that dealing with robustness issues should be mandatory, "but I can understand that we don't do a good job at it, so we shouldn't hold a new filesystem to an unfairly high standard relative to other filesystems".
Hellwig arguably took the strongest position with regard to the standards that should be applied to new filesystems:
What those higher standards should be was not spelled out. They probably do not extend to absolute robustness against corrupt filesystem images, but it seems that developers would like to see at least an effort made in that direction. As Biggers put it:
Whether EROFS meets the "looks robust" standard is a bit controversial at the moment. On the other hand, there is little doubt that the EROFS developers are willing and able to fix bugs quickly as they are reported. For the purposes of moving EROFS into the kernel proper, chances are that will be good enough. Unless some other show-stopping issue comes up, this little snag seems unlikely to keep this code from graduating out of the staging tree. Future filesystem developers will want to take notice, though, that reviewers will be paying more attention to robustness against on-disk image corruption than they have in the past.
Posted Aug 20, 2019 3:55 UTC (Tue)
by dvdeug (guest, #10998)
[Link] (8 responses)
Yes, it's a volunteer system. But if you're concerned about Linux filesystem quality, I'm sure they'll take patches or at least bug reports for existing filesystems. You can push for better quality in core features without taking it out on the new features.
Posted Aug 20, 2019 7:52 UTC (Tue)
by rsidd (subscriber, #2582)
[Link] (2 responses)
Posted Aug 20, 2019 8:05 UTC (Tue)
by hsiangkao (guest, #123981)
[Link] (1 responses)
Posted Aug 20, 2019 8:11 UTC (Tue)
by hsiangkao (guest, #123981)
[Link]
and there are some Android commits mentioned about this staging EROFS:
We'd like to upstream to AOSP, and gain wider use of course.
Posted Aug 24, 2019 6:12 UTC (Sat)
by buck (subscriber, #55985)
[Link] (3 responses)
If a user plugs a USB drive in his/her machine and it causes the machine to lock up because it has a broken EROFS filesystem on it, that's not cool. It may not be fair, but there's an argument that can be made for not allowing in additional filesystems that widen the gamut of such problems.
That said, i've never written code for a filesystem or anything else nearly as complex that's supposed to deliver as much functionality, so, yes, i can imagine it may put an unrealistic damper on the possibilities for future awesomeness. I'll trust the LKML arbiters to figure it out.
Posted Aug 24, 2019 17:45 UTC (Sat)
by alonz (subscriber, #815)
[Link] (1 responses)
Alternatively, the filesystem driver may itself verify that the media type is suitable before even reading the superblock.
(Personally I would love it if we could just use lklfuse for all filesystems on removable media… But it looks like nobody supports it.)
Posted Aug 25, 2019 1:04 UTC (Sun)
by pabs (subscriber, #43278)
[Link]
Posted Aug 24, 2019 19:04 UTC (Sat)
by hsiangkao (guest, #123981)
[Link]
We think that's not cool as well, so we are now addressing and will continue actively addressing it.
Again, please give us some time; it will not be long before it resists almost all malformed images (it can already resist more malformed images than it could weeks ago, and we will fix those reports as quickly as we can... that is our attitude on this...)
Posted Sep 4, 2019 6:42 UTC (Wed)
by holgerschurig (guest, #6714)
[Link]
I think you think too short.
Some of the Android things are too specific to Android. But some of the concepts needed there are also things that you need in other problem domains (e.g. embedded). And if EROFS is in Linux, then other projects (often in Embedded) might use it as well. It might actually already be in use today :-)
> or at least bug reports for existing filesystems.
They are. Hellwig said for XFS that they made great strides for the v5 version; Ts'o said that for ext4 they work on this, just not with high priority. So warm up your fuzzer and start submitting bug reports :-)
Posted Aug 20, 2019 6:15 UTC (Tue)
by post-factum (subscriber, #53836)
[Link] (3 responses)
Somewhere out there Hans Reiser cries loudly.
Posted Aug 20, 2019 10:19 UTC (Tue)
by kmeyer (subscriber, #50720)
[Link] (1 responses)
Posted Aug 21, 2019 5:56 UTC (Wed)
by rsidd (subscriber, #2582)
[Link]
Posted Nov 26, 2019 3:31 UTC (Tue)
by Trammael (guest, #101173)
[Link]
California Dept of Corrections, Correctional Training Facility, Soledad Prison Road, Soledad, CA
Posted Aug 20, 2019 9:57 UTC (Tue)
by tao (subscriber, #17563)
[Link] (1 responses)
Pre-existing file-system code can be hard to fix because a.) it might necessitate breaking on-disk format, which is usually a no-go, b.) regressions have a real-world impact.
Merging first and trying to fix robustness afterwards, rather than ensuring a high level of robustness from the get-go, practically guarantees that these issues won't ever get fixed.
TL;DR: I totally agree with Christoph Hellwig.
Posted Aug 21, 2019 6:20 UTC (Wed)
by dvdeug (guest, #10998)
[Link]
Again, for "That's the only way to get the average quality up", this does not seem to be a viable way to get the average quality of Linux up for most users.
Posted Aug 20, 2019 11:06 UTC (Tue)
by Freeaqingme (subscriber, #103259)
[Link] (2 responses)
Posted Aug 20, 2019 21:17 UTC (Tue)
by sitsofe (guest, #104576)
[Link]
Posted Aug 21, 2019 8:18 UTC (Wed)
by dgc (subscriber, #6611)
[Link]
$ git grep fuzz tests/xfs/group |wc -l
156 separate on-disk format fuzzing tests, quite a few (~40) of which also test the ability of the under-development online repair code to fix the fuzzing damage automatically. These fuzzers know the on-disk format, so they defeat all the CRC checking by recalculating the CRC after the structures have been corrupted. That's why we have our own fuzzers - at the time nobody had a fuzzer capable of defeating CRCs, so we extended our own tools to do it....
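The point about defeating CRC checking can be sketched as follows. This is not the XFS tooling: the record layout, `seal()`, `crc_ok()`, `fuzz_naive()`, and `fuzz_format_aware()` are hypothetical names for this example, and Python's `zlib.crc32` stands in for whatever checksum a real format uses. A naive fuzzer that flips bits is stopped at the checksum verifier, so the parsing code behind it is never exercised; a format-aware fuzzer recomputes the checksum after corrupting the structure, so the damaged fields reach the code that has to validate them.

```python
import random
import struct
import zlib

CRC = struct.Struct("<I")  # hypothetical 4-byte checksum trailer


def seal(body: bytes) -> bytes:
    """Append a CRC-32 of the body, producing a checksummed record."""
    return body + CRC.pack(zlib.crc32(body))


def crc_ok(record: bytes) -> bool:
    """Return True if the trailer matches the CRC-32 of the body."""
    body, trailer = record[:-CRC.size], record[-CRC.size:]
    return len(record) >= CRC.size and CRC.unpack(trailer)[0] == zlib.crc32(body)


def fuzz_naive(record: bytes, rng: random.Random) -> bytes:
    """Flip one bit in the body and leave the stale checksum in place."""
    buf = bytearray(record)
    pos = rng.randrange(len(buf) - CRC.size)
    buf[pos] ^= 1 << rng.randrange(8)
    return bytes(buf)


def fuzz_format_aware(record: bytes, rng: random.Random) -> bytes:
    """Flip one bit in the body, then recompute the checksum to match."""
    body = fuzz_naive(record, rng)[:-CRC.size]
    return seal(body)
```

For a record built with `seal()`, the output of `fuzz_naive()` fails `crc_ok()` and would be rejected before any field is parsed, while the output of `fuzz_format_aware()` passes `crc_ok()` with a corrupted body. Only the second kind of input tests the field-level validation.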
So the truth is that XFS developers have a very high standard for on-disk format robustness and we have both the toolchain and runtime verification in place to find and fix bugs and areas we don't validate as well as we should. It's an ongoing process of improvement....
-Dave.
Posted Aug 22, 2019 11:35 UTC (Thu)
by mina86 (guest, #68442)
[Link] (1 responses)
Posted Aug 23, 2019 19:42 UTC (Fri)
by k8to (guest, #15413)
[Link]
But globally it seems burdensome.
Posted Aug 31, 2019 13:17 UTC (Sat)
by eduard.munteanu (guest, #66641)
[Link]
Posted Sep 5, 2019 13:46 UTC (Thu)
by polyp (guest, #53146)
[Link] (1 responses)
Posted Sep 9, 2019 2:08 UTC (Mon)
by hsiangkao (guest, #123981)
[Link]
https://android-review.googlesource.com/c/platform/system...
https://android-review.googlesource.com/c/kernel/configs/...
One could argue that the behavior of mount(8) (automatically trying all filesystem types) is the actual bug, and that its "auto" mode should restrict itself to using filesystems that are actually suitable for use with the current media. (Many filesystem types, EROFS included, could then be defined as supported only on non-removable media by default.)
But that is not absolute standard on this field ---- one hour, two hours, a day, a month, or forever? by some tool? and that is not filesystem-specific issue, but for all on-disk new features...
Tux3's author never submitted it for upstreaming, AFAIK. In 2018 he said "For the time being we will continue to develop out-of-tree". There seem to be no updates after that.
Yes, people have built fuzzers for filesystem images (there are even more if you mean things like syscalls - see fsx, trinity, syzkaller etc). Several years ago an Oracle developer applied afl to a number of different filesystem images and found bugs could be triggered within a few minutes of fuzzing (but I don't know if the code for this was ever released). Going back further, the Month of Kernel Bugs introduced the fsfuzzer back in 2006.
AFL for filesystems, fsfuzzer
I'd prefer the original name Enhanced Read-Only File System though, so change back again...
maybe full name is not important though..