
Unionfs

A longstanding (and long unsupported in Linux) filesystem concept is that of a union filesystem. In brief, a union filesystem is a logical combination of two or more other filesystems to create the illusion of a single filesystem with the contents of all the others.

As an example, imagine that a user wanted to mount a distribution DVD full of packages. It would be nice to be able to add updated packages to close today's security holes, but the DVD is a read-only medium. The solution is a union filesystem. A system administrator can take a writable filesystem and join it with the read-only DVD, creating a writable filesystem with the contents of both. If the user then adds packages, they will go into the writable filesystem, which can be smaller than would be needed if it were to hold the entire contents.

The unionfs patch posted by Josef Sipek provides this capability. With unionfs in place, the system administrator could construct the union with a command sequence like:

    mount -r /dev/dvd /mnt/media/dvd
    mount    /dev/hdb1 /mnt/media/dvd-overlay
    mount -t unionfs \
          -o dirs=/mnt/media/dvd-overlay=rw:/mnt/media/dvd=ro \
          none /writable-dvd

The first two lines just mount the DVD and the writable partition as normal filesystems. The final command then joins them into a single union, mounted on /writable-dvd. Each "branch" of a union has a priority, determined by the order in which the branches are given in the dirs= option. When a file is looked up, the branches are searched in priority order, with the first occurrence found being returned to the user. If an attempt is made to write to a file which lives on a read-only branch, that file will be copied into the highest-priority writable branch and written there.
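The lookup and copy-up rules can be illustrated entirely in user space, with ordinary directories standing in for branches. What follows is a minimal sketch under that assumption, not the unionfs implementation; the helper names union_lookup and union_copy_up are hypothetical, and only regular files are handled.

```shell
#!/bin/sh
# Userspace illustration only: plain directories stand in for branches.
# union_lookup and union_copy_up are hypothetical helpers, not unionfs code.

# union_lookup RELPATH BRANCH...
# Branches are listed highest priority first, as in the dirs= option.
# Prints the first existing match and returns 0; returns 1 if none exists.
union_lookup() {
    relpath=$1
    shift
    for branch in "$@"; do
        if [ -e "$branch/$relpath" ]; then
            printf '%s\n' "$branch/$relpath"
            return 0
        fi
    done
    return 1
}

# union_copy_up RELPATH RW_BRANCH BRANCH...
# Ensures RELPATH (a regular file) exists in RW_BRANCH by copying it from
# whichever branch currently provides it, then prints the writable path.
union_copy_up() {
    relpath=$1
    rw_branch=$2
    shift 2
    src=$(union_lookup "$relpath" "$rw_branch" "$@") || return 1
    dst=$rw_branch/$relpath
    if [ "$src" != "$dst" ]; then
        mkdir -p "$(dirname "$dst")" && cp -p "$src" "$dst" || return 1
    fi
    printf '%s\n' "$dst"
}
```

A caller wanting to modify pool/pkg would write to the path printed by union_copy_up; later lookups then find the copy in the writable branch first, while the original on the read-only branch is left untouched.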

As one might imagine, there is a fair amount of complexity required to make all of this actually work. Joining together filesystem hierarchies, copying files between them, and inserting "whiteouts" to mask files deleted from read-only branches are just a few of the challenges which must be met. The unionfs code seems to handle most of them well, providing convincing Unix semantics in the joined filesystem.
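To see why whiteouts are needed, consider deleting a file which exists only on the read-only branch: nothing can be removed there, so the deletion has to be recorded in a writable branch instead. The sketch below again uses plain directories as stand-ins for branches. The ".wh.<name>" marker naming is an assumption made for this illustration, and the helper names are hypothetical.

```shell
#!/bin/sh
# Userspace illustration only. The ".wh.<name>" marker naming is an
# assumption made for this sketch; the helper names are hypothetical.

# union_lookup_wh RELPATH BRANCH...   (highest priority first)
# A whiteout marker in a branch hides RELPATH in all lower-priority branches.
union_lookup_wh() {
    relpath=$1
    shift
    dir=$(dirname "$relpath")
    name=$(basename "$relpath")
    for branch in "$@"; do
        if [ -e "$branch/$relpath" ]; then
            printf '%s\n' "$branch/$relpath"
            return 0
        fi
        if [ -e "$branch/$dir/.wh.$name" ]; then
            return 1    # masked: the file was deleted in the union
        fi
    done
    return 1
}

# union_unlink RELPATH RW_BRANCH
# Removes any copy in the writable branch, then records a whiteout so that
# copies in read-only branches stay hidden.
union_unlink() {
    relpath=$1
    rw_branch=$2
    dir=$(dirname "$relpath")
    name=$(basename "$relpath")
    rm -f "$rw_branch/$relpath"
    mkdir -p "$rw_branch/$dir" && : > "$rw_branch/$dir/.wh.$name"
}
```

After union_unlink, the file is still physically present on the read-only branch, but lookups through the union no longer find it.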

Reviewers immediately jumped on one exception, which was noted in the documentation:

Modifying a Unionfs branch directly, while the union is mounted, is currently unsupported. Any such change can cause Unionfs to oops, or stay silent and even RESULT IN DATA LOSS.

What this means is that it is dangerous to mess directly with the filesystems which have been joined into a union mount. Andrew Morton pointed out that, as user-friendly interfaces go, this one is a little on the rough side. Since bind mounts don't have this problem, he asked, why should unionfs present such a trap to its users? Josef responded:

Bind mounts are a purely VFS level construct. Unionfs is, as the name implies, a filesystem. Last year at OLS, it seemed that a lot of people agreed that unioning is neither purely a fs construct, nor purely a vfs construct.

That, in turn, led to some fairly definitive statements that unionfs should be implemented at the virtual filesystem level. Without that, it's not clear that it will ever be possible to keep the namespace coherent in the face of modifications at all levels of the union. So it seems clear that, to truly gain the approval of the kernel developers, unionfs needs a rewrite. Andrew Morton has been heard to wonder if the current version should be merged anyway in the hopes that it would help inspire that rewrite to happen. No decisions have been made as of this writing, so it's far from clear whether Linux will have unionfs support in the near future or not.


Unionfs

Posted Jan 11, 2007 8:07 UTC (Thu) by k8to (guest, #15413) [Link] (3 responses)

Huh, I modify the contents of a branch out from under a union mount just about every day. In fact, that's the only way I ever modify it.

I have yet to see a single oops or data loss problem.

Surely there must be more to triggering this problem.

Unionfs

Posted Jan 11, 2007 9:50 UTC (Thu) by dlang (guest, #313) [Link] (2 responses)

writing to the underlying filesystem is conceptually the same as writing to /dev/hda1 while you have it mounted.

in both cases the filesystem on top doesn't know about the changes below it and can be (fatally) surprised when it finds them.

later messages in the thread indicate that the warning is (somewhat) intentionally overstating the risk of an oops (rather than going into many pages of quickly obsolete details of what will fail)

I suspect that you get away with it by the fact that you are doing read-only on the result.

Unionfs

Posted Jan 13, 2007 0:33 UTC (Sat) by giraffedata (guest, #1954) [Link] (1 responses)

later messages in the thread indicate that the warning is (somewhat) intentionally overstating the risk of an oops

Do you mean the warning is a lie and it is not in fact possible by design to oops the kernel by modifying an underlying filesystem?

Because that's the only way it's sensible. You cannot oops the kernel by writing to /dev/hda1 while an ext3 filesystem on /dev/hda1 is mounted. You can trash unlimited amounts of data, but as the filesystem is external to the kernel, the kernel is robust to whatever bits it might read from it at any time.

Unionfs

Posted Jan 13, 2007 4:52 UTC (Sat) by raven667 (subscriber, #5198) [Link]

You cannot oops the kernel by writing to /dev/hda1 while an ext3 filesystem on /dev/hda1 is mounted. You can trash unlimited amounts of data, but as the filesystem is external to the kernel, the kernel is robust to whatever bits it might read from it at any time.

That seems like how it should be, but I doubt that this is true. In fact, I believe one of the major fixes in 2.6.19.2 is CVE-2006-5823, a problem where a corrupted cramfs could OOPS the kernel. This same kind of thing can, has and will continue to happen in the filesystem, USB, network, Firewire, block device and other subsystems where bogus data from a piece of hardware isn't sufficiently checked before being used and causes an OOPS (with potential security implications).

Unionfs

Posted Jan 11, 2007 9:45 UTC (Thu) by smurf (subscriber, #17840) [Link]

Ubuntu has been shipping their kernels with unionfs (and squashfs) for a year now, in order to support live CDs properly.

In that case, modifying unionfs parts from behind the scenes is Not A Problem because the r/w branch is hidden by the clever way the initramfs sets up the file system structure -- the unionfs is mounted at /root, and then /root is move-mounted on top of /.

But I can certainly attest to its fragility in the general case -- trying to restore a unionfs filesystem by simply pulling the r/w part from a disk dump does cause a crash pretty much immediately afterwards: unionfs sees some cached directory entries which do not have its data structure attached, and BUG()s out.

Unionfs

Posted Jan 11, 2007 11:33 UTC (Thu) by nix (subscriber, #2304) [Link]

unionfs also stops unioning at mount points, which is really annoying and makes it unusable for a lot of purposes. e.g. a lot of the path-translation stuff in fakeroot could be avoided (and robustness improved) if you could union-mount the fakeroot directory over / before running the fakerooted command, so the changes that `make install' or whatever did landed in the fakerooted directory, but it normally saw the original /. But this doesn't work because of the mount-point-traversal problem.

(Oh, and also the corruption thing as well, of course: it's kind of annoying to avoid writing to /-and-all-subdirectories while fakeroot is running :) ).

You could also update the DVD

Posted Jan 11, 2007 16:19 UTC (Thu) by mcmechanjw (subscriber, #38173) [Link]

While it is true that unionfs is a nice way to provide updates to a DVD, growisofs provides for adding files to an already written DVD+/-R/RW/RAM directly via the -M option until it is full or the disk is closed.
As an example, a script I use looks like this - most of which is concerned with updating the md5sum file, and then checking the md5sum of the files on the dvd.

    mount /mnt/dvd
    cat /mnt/dvd/md5sum >>md5sum
    find $FILES -type f -print0 | xargs -0 md5sum >>md5sum
    umount /mnt/dvd
    ulimit -l unlimited
    growisofs -M /dev/dvd -r $FILES md5sum
    mount /mnt/dvd
    (cd /mnt/dvd;md5sum -c md5sum)

Unionfs

Posted Jan 11, 2007 22:14 UTC (Thu) by dambacher (subscriber, #1710) [Link] (1 responses)

some days ago I searched for unionfs and lookalikes to set up a diskless boot and wondered.
there is more than just unionfs:
unionfs-FUSE is based on unionfs with some interesting features
aufs is a new implementation of unionfs. knoppix just switched from unionfs to this one.

none is in kernel but FUSE and unionfs-FUSE runs in userspace.
none has good documentation .-(
special tricks like root mounting are not documented.

Unionfs

Posted Jan 13, 2007 11:29 UTC (Sat) by aakef (subscriber, #38030) [Link]

Hi dambacher,

we are just trying to use funionfs (also fuse based) for a diskless setup. In principle it already works, but I found one critical bug, see here

http://www.fsl.cs.sunysb.edu/pipermail/unionfs/2007-Janua...

I looked into the sources of unionfs-fuse and I think it will have the very same access() problem.

All I need is a good idea how to fix it. Since everything is done in userspace, the source code is rather simple.

If you need some help to set up your diskless setup, just contact me.

Cheers,
Bernd

Unionfs

Posted Jan 13, 2007 15:14 UTC (Sat) by PlaguedByPenguins (subscriber, #3577) [Link]

for clusters, the 'no writes to underlying layers' thing is a killer - it means you can't upgrade the read-only part of the OS. it would be great if unionfs would recognise when VFS caches were trashed.
in the meantime, I've found oneSIS is a simple and effective way to diskless boot clusters.

cheers,
robin

Unionfs

Posted Jan 15, 2007 14:02 UTC (Mon) by JohnNilsson (guest, #41242) [Link]

Is this an application that would benefit from this
Zipper-based file server/OS thing I saw the other day?

http://okmij.org/ftp/Computation/Continuations.html#zippe...

From my limited understanding it's a copy-on-write based way to make an immutable data structure mutable, which kind of sounds similar to some of the goals of unionfs.

Unionfs

Posted Jan 27, 2008 15:00 UTC (Sun) by TomasM (guest, #50151) [Link]

Well, aufs is a great replacement for unionfs. That particular code should be in kernel IMHO, as it's working much better. It is documented in aufs.5 man page: http://aufs.sf.net/aufs.html

Knoppix switched to AUFS already, all linux-live based distributions including Slax are switching to AUFS as well. Junjiro Okajima (AUFS author) sadly doesn't wish to submit the code to kernel yet, while unionfs developers try very hard to get it in there; I have no idea why that is so important for them. The inclusion won't make unionfs code better.

If you review the unionfs versus aufs development process, AUFS adds cool features (like balancing of writable branches), while unionfs fixes locking, mmap code, NULL pointer dereferences, and so on...


Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds