The mini_fo filesystem

[Posted May 10, 2005 by corbet]

Markus Klotzbuecher recently announced the release of mini_fo 0.6.0. Mini_fo provides what other systems have called a "translucent" or "copy on write" filesystem. A read-only base filesystem (possibly from a remote system or CDROM) can be made to appear, via mini_fo, as a local, writable filesystem. This functionality is useful for sharing filesystems with local overrides, live CD systems, sandboxing applications, and more.

At its core, mini_fo performs a simple fan-out operation. Each inode, dentry, and file structure associated with a mini_fo filesystem contains (via its private data) pointers to two other structures of the same type. One of them refers to the file or directory on the base filesystem; the other, instead, is for a local version of the file or directory on a local "storage filesystem." Both are hidden from user space, which thinks it is dealing directly with a file stored in the mini_fo filesystem.
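The fan-out arrangement can be pictured with a small user-space model. This is only an illustrative sketch, not mini_fo's kernel code: the class names, the `effective()` helper, and the example paths are all made up for the illustration, and the rule "prefer the storage copy when one exists" is inferred from the description in this article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LowerObject:
    """Stand-in for an inode, dentry, or file on a real filesystem."""
    fs: str    # "base" or "storage"
    path: str  # hypothetical path, for illustration only


@dataclass
class FanOutObject:
    """What user space sees; the two pointers model the hidden private data."""
    name: str
    base: Optional[LowerObject] = None     # object on the read-only base
    storage: Optional[LowerObject] = None  # object on the local storage fs

    def effective(self) -> LowerObject:
        # Assumed rule: use the local copy if there is one, else the base.
        if self.storage is not None:
            return self.storage
        if self.base is not None:
            return self.base
        raise FileNotFoundError(self.name)


obj = FanOutObject("etc/motd", base=LowerObject("base", "/mnt/base/etc/motd"))
print(obj.effective().fs)  # base
obj.storage = LowerObject("storage", "/mnt/storage/etc/motd")
print(obj.effective().fs)  # storage
```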

When a mini_fo filesystem is first created, it appears as an exact copy of the underlying base filesystem. Any operation which reads files or directories is simply passed through to the base filesystem, with almost no additional overhead. In this mode, mini_fo functions as a sort of loopback filesystem.

Things change, however, when a file is opened for writing. In this case, mini_fo will create a copy of the file on the storage filesystem, with all of the data copied over. Any subsequent operations on that file will use the locally-stored version rather than the base version. So any changes made will appear locally, but they will not be propagated back to the base. Changes will be persistent across mounts as long as the storage directory used by mini_fo is not modified by anything except mini_fo.
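The read pass-through and copy-on-open behavior described above can be imitated in user space with two ordinary directories. The sketch below is a toy model under stated assumptions, not how mini_fo is implemented: the `CowOverlay` class, its method names, and the `demo()` helper are invented here, and the real filesystem does this work on kernel structures rather than on paths.

```python
import os
import shutil
import tempfile


class CowOverlay:
    """Toy model: reads fall through to base, opening for write copies up."""

    def __init__(self, base_dir: str, storage_dir: str):
        self.base_dir = base_dir
        self.storage_dir = storage_dir

    def _resolve(self, rel: str) -> str:
        # A local copy, if present, hides the base version.
        local = os.path.join(self.storage_dir, rel)
        if os.path.exists(local):
            return local
        return os.path.join(self.base_dir, rel)

    def open(self, rel: str, mode: str = "r"):
        if any(c in mode for c in "wax+"):
            local = os.path.join(self.storage_dir, rel)
            if not os.path.exists(local):
                os.makedirs(os.path.dirname(local), exist_ok=True)
                src = os.path.join(self.base_dir, rel)
                if os.path.exists(src):
                    shutil.copy2(src, local)  # copy all data before writing
            return open(local, mode)
        return open(self._resolve(rel), mode)


def demo():
    with tempfile.TemporaryDirectory() as base, \
         tempfile.TemporaryDirectory() as storage:
        with open(os.path.join(base, "motd"), "w") as f:
            f.write("hello from base\n")
        fs = CowOverlay(base, storage)
        with fs.open("motd") as f:
            before = f.read()
        with fs.open("motd", "a") as f:
            f.write("local change\n")
        with fs.open("motd") as f:
            after = f.read()
        with open(os.path.join(base, "motd")) as f:
            base_now = f.read()
        return before, after, base_now
```

Calling `demo()` returns the file contents seen before the write, after the write, and directly on the base directory; the last of these stays unchanged, which is the point of the exercise.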

Modified files are not the full story, of course; mini_fo must also cope with operations like deletes and renames. To that end, it maintains a set of lists of files which it knows about locally; there is one list for modified files, one for deleted files, one for files created locally, etc. These lists are stored in-kernel as standard linked lists. They are also written to the storage filesystem in a magic file (named META_dAfFgHE39ktF3HD2sr, for what it's worth) and reloaded from that file when the filesystem is mounted.
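A rough picture of this bookkeeping, again as a user-space sketch: Python sets stand in for the kernel's linked lists, and the JSON layout of the meta file is purely an assumption made for the example (the article does not describe the real on-disk format). Only the file name is taken from the text above; `MetaLists`, `unlink()`, `visible()`, and `demo()` are hypothetical names.

```python
import json
import os
import tempfile

META_NAME = "META_dAfFgHE39ktF3HD2sr"  # file name quoted in the article
KINDS = ("modified", "deleted", "created")


class MetaLists:
    """Toy stand-in for the per-mount lists (sets instead of linked lists)."""

    def __init__(self):
        self.modified = set()
        self.deleted = set()
        self.created = set()

    def unlink(self, rel: str) -> None:
        if rel in self.created:
            self.created.discard(rel)  # purely local file: nothing to hide
        else:
            self.deleted.add(rel)      # hide the base file from now on
        self.modified.discard(rel)

    def visible(self, base_names):
        # Base entries minus deletions, plus locally created entries.
        return sorted((set(base_names) - self.deleted) | self.created)

    def save(self, storage_dir: str) -> None:
        # Assumed format: one JSON object holding the three lists.
        data = {kind: sorted(getattr(self, kind)) for kind in KINDS}
        with open(os.path.join(storage_dir, META_NAME), "w") as f:
            json.dump(data, f)

    @classmethod
    def load(cls, storage_dir: str) -> "MetaLists":
        lists = cls()
        path = os.path.join(storage_dir, META_NAME)
        if os.path.exists(path):
            with open(path) as f:
                data = json.load(f)
            for kind in KINDS:
                setattr(lists, kind, set(data.get(kind, [])))
        return lists


def demo():
    with tempfile.TemporaryDirectory() as storage:
        lists = MetaLists()
        lists.created.add("notes.txt")
        lists.unlink("passwd")
        lists.save(storage)                  # written out, as at unmount
        reloaded = MetaLists.load(storage)   # read back, as at mount
        return reloaded.visible(["passwd", "motd"])
```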

This release of mini_fo works with both the 2.4 and 2.6 kernels. Its author claims that it is intended for use with embedded systems, and thus has a small memory footprint. See the mini_fo web page for more information.

Index entries for this article
Kernel	Filesystems

translucent fs for shared root access?

Posted May 12, 2005 14:12 UTC (Thu) by bkw1a (subscriber, #4101) [Link] (3 responses)

I was just thinking about translucent filesystems and userspace filesystems (the subject of another article in today's lwn kernel section). How about using a translucent filesystem to give unprivileged users "root" access? What I mean is, mount "/" as a read-only base filesystem for these users, but allow them to overlay changes (that would only be visible by them) through a mechanism like mini_fo. This would be one way to (safely?) let unprivileged users install new software, without requiring any changes in the way the software is packaged.

The first problem that occurs to me is that the root filesystem isn't really static. Can mini_fo deal with changes in the underlying filesystem?

translucent fs for shared root access?

Posted May 12, 2005 18:31 UTC (Thu) by bronson (subscriber, #4806) [Link]

RTFAQ: http://www.denx.de/twiki/bin/view/Know/MiniFOFAQ

I don't know how well it would handle move/renames.

translucent fs for shared root access?

Posted May 19, 2005 21:27 UTC (Thu) by klossner (subscriber, #30046) [Link]

This would be one way to (safely?) let unprivileged users ...

This isn't safe. Consider what happens if you let the unprivileged user overlay their own version of /etc/passwd. They won't stay unprivileged for long.

translucent fs for shared root access?

Posted May 22, 2005 10:49 UTC (Sun) by markus78 (guest, #30082) [Link]

For now mini_fo can only deal to some extent with changes in the underlying file system, e.g. modifying existing files, and even creating new files should be ok. What will definitely cause trouble is removing a file, which is like "pulling the carpet" you're standing on: the file system will expect to find a file that is gone.
I've got advanced error recovery that will allow this on my Todo list though.

The mini_fo filesystem

Posted May 12, 2005 15:40 UTC (Thu) by madscientist (subscriber, #16861) [Link] (1 responses)

How does this compare and contrast with unionfs, which seems to be a longer-standing project to allow translucent filesystems?

I'm really interested in this capability for a project I happen to be working on at the moment, actually. Yet again LWN comes through with timely info!

The mini_fo filesystem

Posted May 22, 2005 11:31 UTC (Sun) by markus78 (guest, #30082) [Link]

Actually mini_fo has been around longer, it just didn't implement all features until the last release ;-)

The main difference between unionfs and mini_fo is features and complexity. Unionfs allows merging two or more branches with various options for each branch, while mini_fo focuses on merging only two: the base branch, which will never be modified, and the storage branch, which contains the "diff".

This "lack" of features makes mini_fo a lot smaller, which is important as we use it in embedded systems.

The mini_fo filesystem

Posted May 13, 2005 18:36 UTC (Fri) by Ross (guest, #4065) [Link] (2 responses)

Are changes in the underlying filesystem tracked with some useful semantic
guarantees or does it depend on it being static? What happens if, for
example, X is renamed to Y in the overlay and then Y is renamed to Z in
the original filesystem?

The mini_fo filesystem

Posted May 21, 2005 6:03 UTC (Sat) by AnswerGuy (guest, #1256) [Link]

I don't know the answer to that question in this case, but consider that
a "rename" is really a series of link and unlink operations (at the system
call level). Also a directory is a type of file (a list of link/inode
pairs).

Given those semantics I'd guess that a "rename" would unlink on the
top/writable layer (writing a version of the directory that did NOT
contain the link in question) and a link (possibly to that same directory
possibly to another) resulting in more writes to the writable layer.
This wouldn't affect the underlying inode (but that would probably
be copied up from the lower layer to the write layer because the
link count was incremented and then decremented).

So now I have one or two directory "files" that contain updated
contents (like any other file that got copied up to the writable
layer). The ls command (and other readdir() operations) will show
the copy of the directory that does not contain the old name and does
contain the new one.

Is this making any sense?

I do have to wonder what happens if you have multiple layers that are
writable and mounted in multiple places (bind mounts of some layers
outside of the stack). That sounds ugly.

mini_fo renaming

Posted May 22, 2005 12:28 UTC (Sun) by markus78 (guest, #30082) [Link]

Renaming works differently for directories and non-directories. For example, renaming a regular file will result in the file being marked as deleted (whiteouted), and then copied up to the storage branch with the new name. From then on, renaming this file again will really only rename it in the storage branch.

Renaming directories is a lot more complicated, because we don't want to copy up all directory contents (by the way, this is what "mv" does when a file system's rename function returns -ENOSUPP). So what happens is that the original directory is whiteouted, a new empty directory with the new name is created in storage and both directories are associated by a special meta tag that is saved in the meta-data.

If you rename that directory in the underlying file system while the mini_fo file system is not mounted (you should not do this while it is mounted, see above post), this association will be broken, as mini_fo has no way to "detect" changes that occurred while not mounted.
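For readers who prefer code, the two rename paths described in this comment can be modeled roughly as follows. This is an illustrative sketch only: dictionaries stand in for the two branches, the names `RenameModel`, `rename_file`, `rename_dir`, and `dir_map` are invented for the example, and only the simple case of a top-level base directory renamed once is handled.

```python
class RenameModel:
    """Toy model of the rename handling described above."""

    def __init__(self, base_files):
        self.base = dict(base_files)  # path -> content; never modified
        self.storage = {}             # copied-up files: path -> content
        self.storage_dirs = set()     # empty directories created in storage
        self.whiteouts = set()        # base names hidden from the user
        self.dir_map = {}             # new directory name -> base directory

    def rename_file(self, old: str, new: str) -> None:
        if old in self.storage:
            content = self.storage.pop(old)  # already local: plain rename
        else:
            content = self.base[old]         # first rename: copy up
        if old in self.base:
            self.whiteouts.add(old)          # hide the base name
        self.storage[new] = content

    def rename_dir(self, old: str, new: str) -> None:
        # No contents are copied; only the association is recorded.
        self.whiteouts.add(old)
        self.storage_dirs.add(new)
        self.dir_map[new] = old

    def read(self, path: str) -> str:
        if path in self.storage:
            return self.storage[path]
        head, _, tail = path.partition("/")
        if head in self.dir_map:
            # Follow the recorded association back to the base directory.
            return self.base[self.dir_map[head] + "/" + tail]
        if path in self.whiteouts or head in self.whiteouts:
            raise FileNotFoundError(path)
        return self.base[path]


fs = RenameModel({"a.txt": "A", "docs/readme": "R"})
fs.rename_file("a.txt", "b.txt")  # whiteout a.txt, copy up as b.txt
fs.rename_file("b.txt", "c.txt")  # now only a rename within storage
fs.rename_dir("docs", "manual")   # nothing under docs/ is copied
print(fs.read("c.txt"), fs.read("manual/readme"))  # A R
```

In this model, renaming "docs" in the base while unmounted would leave `dir_map` pointing at a name that no longer exists, which mirrors the broken association mentioned above.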

The mini_fo filesystem

Posted Nov 9, 2005 16:30 UTC (Wed) by markus78 (guest, #30082) [Link]

Update:

The link to the mini_fo project page has changed (slightly):

http://www.denx.de/wiki/view/Know/MiniFOHome