January 30, 2007
This article was contributed by Michael J. Hammel
This series is all about making small systems, from the kernel on up. In
the first part I covered the
TinyLinux project and its eventual
integration into the kernel to help reduce kernel sizes for small systems.
In
the second part, I looked
at the use of the Initramfs and its role in
providing a root file system (directly or indirectly) for an embedded
system.
Now it's time to look at getting applications and utilities into the
system, still keeping an eye on size. The most direct approach is to use
as few utilities as possible, even replacing /sbin/init with a single
application. This is possible in very small systems but, generally speaking,
if you only have a single application to run you probably didn't need the
complexity of a multitasking system like Linux to run it anyway. There are
other, smaller operating systems that might be better suited in that case.
There are a number of ways to keep application layer tools small.
If you have multiple applications and/or require the facilities in Linux,
then you can (and should, for production systems) consider stripping your
binaries of all symbols. The symbols are useful for debugging purposes but
won't be of much value to your users. Additionally, using compile-time
features to reduce size is another option, and will be the focus of the
final article in this series. For now, we'll consider yet another option:
using a compressed file system.
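The symbol-stripping suggestion above is a one-command step at build time. A minimal sketch, assuming the binutils strip tool is available (the /tmp/myapp path is hypothetical; substitute one of your own binaries):

```shell
# Work on a copy so the original binary is untouched.
cp /bin/ls /tmp/myapp
ls -l /tmp/myapp            # size before stripping

# Remove all symbol and debug information from the copy.
strip --strip-all /tmp/myapp
ls -l /tmp/myapp            # same size or smaller afterwards
```

On a binary built with debugging information the savings can be substantial; on an already-stripped binary the command changes nothing.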
Compressed File Systems
File systems provide the structure for managing files on storage media,
such as disks or tapes. While a device driver knows how to get data to and
from those devices, file systems provide the logical structure of that data.
There are a huge number of file system types, ranging from the standard
ext3 you'll find on many Linux systems to parallel and clustered
file systems, to steganographic file systems that can both encrypt and hide
data on the media.
(Note that Wikipedia has a nice long list
of file systems).
A compressed file system is one that decompresses data as it is retrieved
and may or may not compress data as it is written to the storage media.
Working with compressed files is an obvious benefit for saving space on
small systems. The decision to use a compressed file system is usually
based on the storage media you'll use in your system. A RAM-disk based
system, for example, might copy data from flash into the RAM disk. Since
RAM is essential for system operation, the size of the RAM disk is best
kept small. Compact flash or hard disk based systems, on
the other hand, offer more storage but may still be too small to fit all
the required files without some sort of compression.
While compressed file systems offer you more space for files, they also may
affect performance. There may be unacceptable overhead in managing the
decompression of large files at run time. And compressing files on the fly is
computationally expensive; random writes of compressed data are
difficult to achieve. Therefore it is far more common for compressed
file systems to be read-only.
Compressing data is a common practice for live CD distributions, which use
compression to squeeze a more complete distribution onto the limited size
of a CD or DVD. But many of the live CD distributions don't actually use a
compressed file system, instead using a conventional file system image made
up of compressed blocks which are uncompressed when read using the "cloop",
or compressed loopback, device. But this isn't a compressed file system. It's a
block level device handling compressed data.
The Knoppix distribution popularized the use of cloop
when its author, Klaus Knopper, picked up support of the driver. Many
other live CDs followed suit. One advantage of using this kind of
compressed image is that, since the blocks are compressed independently, it
is possible to seek to specific blocks without uncompressing all the
blocks. The disadvantage of such a device is that the entire image must
fit into memory in order to be uncompressed.
An example of a real compressed file system is CramFS, a file system popular
with embedded users of the 2.4 kernel for use with the initrd image. This
file system stores compressed file data but leaves metadata uncompressed. The
files are placed in the file system from a standard directory using the
mkcramfs program, which compresses the files one page at a time. This is
done, for example, when creating an initrd image.
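As a sketch of that process (the staging directory and image path here are hypothetical), assuming mkcramfs is installed:

```shell
# Stage the files that should appear in the root file system.
mkdir -p /tmp/cramfs-root/etc
echo "hello" > /tmp/cramfs-root/etc/motd

# Package the staging directory into a CramFS image; files are
# compressed one page at a time as described above.
mkcramfs /tmp/cramfs-root /tmp/root.cramfs

# The image can then be mounted read-only (requires root):
#   mount -t cramfs -o loop /tmp/root.cramfs /mnt
```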
Another example of a compressed file system is e2compr. This is actually a
set of patches to make the well known EXT2 file system handle on-the-fly
compression and decompression. It supports both 2.4 and 2.6 kernels, but
has not been submitted for inclusion in either because of the complexity of
the patches. As with CramFS, metadata in e2compr is not compressed.
SquashFS
A more recent (and more actively supported; the last updates came in
mid-January 2007) compressed file system is SquashFS. SquashFS is a kind
of successor to CramFS because it aims at the same target audience while
providing a similar process for creation and use of the file system.
What makes SquashFS an improvement over CramFS is best stated by Phillip
Lougher in a linux-kernel mailing list post:
"SquashFS basically gives better compression, bigger files/file system
support, and more inode information."
Both SquashFS and CramFS use zlib compression. However, CramFS uses a
fixed-size 4KB block while SquashFS supports block sizes from 0.5KB to 64KB. This
variable block size allows for much larger file systems under SquashFS,
something desirable for complex embedded systems like digital video
recorders. Also SquashFS
supports compression of both the metadata and block fragments while CramFS
does not. And, while CramFS is integrated with the kernel source, SquashFS
is not. It comes as a set of kernel patches and the driver module.
The CELinux Forum provides some
comparisons of SquashFS against other file systems (compressed
and uncompressed).
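The block size is chosen on the mksquashfs command line with the -b option. A sketch, assuming the SquashFS tools are installed (the paths are hypothetical):

```shell
# Build the same tree twice with two different block sizes; larger
# blocks generally compress better on large files.
mkdir -p /tmp/sq-tree
dd if=/dev/zero of=/tmp/sq-tree/data bs=1024 count=256 2>/dev/null

mksquashfs /tmp/sq-tree /tmp/small-block.sqfs -b 4096 -noappend
mksquashfs /tmp/sq-tree /tmp/large-block.sqfs -b 65536 -noappend
ls -l /tmp/small-block.sqfs /tmp/large-block.sqfs
```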
JFFS2
Another compressed file system is JFFS2, the Journaling Flash
file system,
version 2. It was designed specifically for use with both NOR and NAND
flash devices, and recently received an update via David Woodhouse for the
NAND flash memory being used in the OLPC project. JFFS2 is actually a bit
more sophisticated than SquashFS because it provides mechanisms for
plugging in different compression algorithms, including not using any
compression at all. But unlike SquashFS, JFFS2 is integrated into the
kernel.
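A JFFS2 image is likewise built from a staging directory, using mkfs.jffs2 from the mtd-utils package. A sketch with hypothetical paths (the erase block size must match your actual flash part):

```shell
# Stage the files for the flash root file system.
mkdir -p /tmp/jffs2-root/etc
echo "hello" > /tmp/jffs2-root/etc/motd

# Build the image; --eraseblock must match the flash device, and
# --pad fills the image out to a whole number of erase blocks.
mkfs.jffs2 --root=/tmp/jffs2-root --output=/tmp/root.jffs2 \
    --eraseblock=0x20000 --pad
```

The pluggable compression mentioned above is visible here too: mkfs.jffs2 accepts options to disable or favor individual compressors at image-creation time.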
So if you're building an embedded system with flash storage, wouldn't you
be better with JFFS2? Not necessarily.
According
to the OpenWRT project, which uses both SquashFS and JFFS2,
SquashFS provides better performance than JFFS2. Additionally, at least
for a production version of the project, where only a few files need to
be updated, a read/write compressed JFFS2 root file system offers too
little advantage to justify its performance cost compared with a read-only
SquashFS root file system paired with a writable JFFS2 file system for
stored files.
JFFS2 is a read/write file system while SquashFS is a read-only file system.
A runtime system very often needs to write to its root file system.
Imagine making updates to /etc/hosts, for example, as you might with an
embedded video recorder client trying to access a server backend on a local network.
If writing to the file system is required for an embedded system, how could
you use SquashFS at all?
Some projects, like OpenWRT, use a hybrid system that uses a read-only root
file system mixed with a read/write file system for saving files. In such a
hybrid you might use special configurations or modified applications to
access read/write file systems, but that doesn't help if you need write
access to /etc/hosts on a read-only file system. What you need is a method
of having parts of the directory structure writable while other parts are
read-only. What you need is a stackable file system like UnionFS.
Using UnionFS: BusyBox and SquashFS together
UnionFS is a mechanism for mounting two or more directories from different
file systems under the same name. For example, I could have a read-only
SquashFS file system and a read/write JFFS2 file system mounted together
under the root directory so that the JFFS2 would be /tmp and
/etc while the SquashFS might be everything else.
So how might you use this with a compressed file system and the BusyBox-based
utilities we created in the last article? First, we build our kernel
with SquashFS patches and then build the UnionFS driver as a loadable module.
Next, we build BusyBox with all the runtime utilities we need and install
the result to a local directory on the build machine, let's call it
"/tmp/busybox". Next, we package those files into a compressed SquashFS
file system:
mksquashfs /tmp/busybox /tmp/busybox.sqfs -info
This command takes the contents of /tmp/busybox and compresses it into a file system
image in /tmp called busybox.sqfs. The -info option
increases verbosity, printing the filenames, original size and compression
ratio as they are processed.
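Before wiring the image into a boot setup, it can be sanity-checked on the build host; unsquashfs, which ships alongside mksquashfs in the SquashFS tools, can list an image's contents without extracting them:

```shell
# List the files packed into the image as a quick sanity check.
unsquashfs -l /tmp/busybox.sqfs
```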
We then create an initramfs with another build of BusyBox that has only
minimal utilities - enough to mount the loopback device and load kernel
modules - plus the UnionFS module we built previously (which we manually
copy into the directory after we rebuild BusyBox). We might also add
support for other devices: a CDROM driver if we store the SquashFS file on
CD, or JFFS2 and flash memory support if we store it in flash.
At runtime, I need a writable file system to go with my read-only SquashFS
file system. I'll use the tmpfs file system which puts all the files I'll
write at runtime in virtual memory. In my init script for my initramfs, I
add:
mkdir /.tmpfs
mount -w -t tmpfs -o size=90% tmpfs /.tmpfs
mkdir /.tmpfs/.overlay
The overlay directory will be used to store data written by my embedded
system.
When you boot your 2.6 kernel, you'll have a BusyBox based initramfs with
an init script and your SquashFS file system (or a way to get to that
file system via commands in your init script). I'm mounting
the busybox.sqfs file from the root directory of a CD over the loopback
device onto a directory in my initramfs, so I add the following to the init
script:
mkdir /.tmpfs/.cdrom
mount -r -t iso9660 /dev/cdrom /.tmpfs/.cdrom
losetup /dev/loop0 /.tmpfs/.cdrom/root.sqfs
Then I can mount the loopback device as a SquashFS file system to another
directory I've created in my tmpfs:
mkdir /.tmpfs/.sqfs
mount -r -t squashfs /dev/loop0 /.tmpfs/.sqfs
UnionFS mounts multiple directories, in either read-only or read-write
mode, onto a single directory. In the init script, I place three
directories side by side under a single UnionFS directory:
mount -w -t unionfs -o \
dirs=/.tmpfs/.overlay=rw:/.tmpfs/.cdrom=ro:/.tmpfs/.sqfs=ro \
unionfs /.union
What this does is place all three directory structures, which are referred
to as branches under UnionFS, under /.union; any conflicting directory
names are resolved by taking the first one found, searching the branches left to
right. So if there is an /.tmpfs/.overlay/etc/hosts (a file we've
created at runtime, for example), it takes precedence over
/.tmpfs/.sqfs/etc/hosts.
With this command, when you write to /.union (which later becomes the root
directory due to a switch_root in the init script), the writes go to the
read/write directory which is on the tmpfs file system. But this writable
space is in memory and won't survive reboots. If you need to save data
between boots, you could mount a compact flash drive under /.tmpfs/cf and
use that instead of /.tmpfs/.overlay in the previous mount command.
Which directory gets the write if there are two read-write branches?
UnionFS uses "copy-up", which causes any attempt to write to a read-only
branch to be written to the next read-write branch on its left. Imagine
creating a SquashFS for /etc, one for /var and one for everything else in
your root partition. Then, if you had two compact flash devices, you could
use one for writes to /etc and one for writes to /var simply by ordering
the branches correctly when you mounted them under the UnionFS file system.
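A sketch of that branch ordering (all paths hypothetical; requires root and the UnionFS module). Interleaving read-write and read-only branches routes each copy-up to the nearest read-write branch to the left of the branch being written:

```shell
# /mnt/cf1 catches copy-ups from etc-sqfs; /mnt/cf2 catches
# copy-ups from var-sqfs and everything to its right.
mount -w -t unionfs -o \
  dirs=/mnt/cf1=rw:/mnt/etc-sqfs=ro:/mnt/cf2=rw:/mnt/var-sqfs=ro:/mnt/rest-sqfs=ro \
  unionfs /.union
```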
UnionFS is considered by some to be too buggy for production use, though
I've never had much trouble with it when building live CDs. If you
experience problems using UnionFS, you might consider AuFS
as an alternative. AuFS started out as a
rewrite of UnionFS but has since evolved into its own file system. SLAX, a
Slackware based live CD that originally used UnionFS, has migrated to AuFS.
In fact, SLAX offered a bug bounty, and the winner of that bounty,
Junjiro Okajima, is the author of AuFS.
Next in the series: uClibc
This long-running series (it's taken me a while to write each of the three
articles so far) has one piece left: using uClibc to reduce program size.
This is a reduced size version of the standard glibc library, specifically
built for small footprint systems.