
Emdebian Grip 1.0: the universal embedded operating system

April 8, 2009

This article was contributed by Koen Vervloesem

In the shadow of the long-awaited release of Debian 5.0 "Lenny", the related Emdebian project made an announcement of its own: Emdebian Grip, a small Debian-compatible Emdebian installation. The Emdebian project provides more fine-grained control over package selection, size, dependencies and content to enable the creation of small and efficient Debian packages for use on resource-limited embedded targets. Emdebian is a work in progress, but it already provides toolchains and two distributions.

One of these distributions, Emdebian Grip, maintains as much compatibility as possible with Debian: in essence, Emdebian Grip unpacks the .deb archives from Debian, removes unneeded files such as manpages, info documents, documentation and unwanted translation files, then repacks the archive. The binaries, maintainer scripts and dependencies of the original Debian package are untouched, but both the package size and the installed size are reduced.
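The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not emgrip's actual rule set: the directory prefixes are assumptions based on the standard Debian filesystem layout, and the function names are hypothetical.

```python
from pathlib import PurePosixPath

# Directories whose contents are dropped when a package is "gripped".
# ASSUMPTION: these prefixes follow the usual Debian layout; the real
# emgrip rules may differ in detail.
STRIP_PREFIXES = tuple(
    PurePosixPath(prefix).parts
    for prefix in (
        "usr/share/man",
        "usr/share/info",
        "usr/share/doc",
        "usr/share/locale",
    )
)


def is_stripped(path):
    """Return True if a packaged file lives under a stripped directory."""
    # PurePosixPath drops "." components; the root "/" is filtered by hand.
    parts = tuple(p for p in PurePosixPath(path).parts if p != "/")
    return any(parts[: len(prefix)] == prefix for prefix in STRIP_PREFIXES)


def grip_file_list(paths):
    """Keep only the files that would survive in the repacked archive."""
    return [p for p in paths if not is_stripped(p)]
```

Binaries, libraries and maintainer scripts fall outside the listed directories, so they pass through unchanged, which is what keeps the result binary-compatible with the original package.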

Emdebian Grip is primarily intended as a native build environment for building custom packages on an Emdebian installation. It's essentially a Debian distribution builder: the emgrip command (in the emdebian-grip package) processes a .deb for any of seven architectures in the Debian archive, previously built by maintainers or buildd machines, and generates an Emdebian Grip package for the same architecture. Emdebian Grip 1.0 supports arm, armel, i386, amd64, powerpc, mips and mipsel, plus source packages; the arm architecture will not be supported in Emdebian Grip 2.0, as it has been deprecated after Debian 5.0 in favor of the new ARM EABI port, armel. When building a custom package for Emdebian Grip, you will probably have to add a Debian mirror to the apt sources to be able to install some -dev and -doc packages. Once the package is built, it can be converted to Grip with emgrip.

The fact that Emdebian Grip was able to support seven architectures in the first release is impressive, but it's just a consequence of the architecture-neutral generation process. Even more impressive is that Emdebian Grip is principally developed by one person: Neil Williams. The original idea for Grip came from Nick Bane and Wookey, during the Emdebian session at Extremadura in September 2008. Other members of the Emdebian project have also contributed ideas and added to the design requirements, but development is mainly done by Williams. So Emdebian Grip evolved from the first rough idea to the first stable release on seven architectures in six months, with just one developer for most of the code. Williams rightly calls this "a testament to the power of architecture-neutrality and of binary compatibility with standard Debian."

Installations of Emdebian Grip 1.0 can be done with standard Debian tools like debootstrap, debian-installer and even debian-live. The project recommends using the Debian Lenny installer in Automatic Installation mode. After setting up the network, the installer prompts for the preconfiguration file. When www.emdebian.org is entered, the Debian base system is migrated to the Emdebian Grip distribution during the installation process. The "Select and install software" section shows some added Grip tasks: "Grip XFCE desktop" (which installs a trimmed-down list of XFCE packages) and "Minimal Grip XFCE desktop". A little comparison: the XFCE task in Debian brings in 354 new packages, needs 214MB of archives and uses 607MB of additional disk space. In contrast, the Grip XFCE task brings in 293 new packages, needs 82.5MB of archives and uses 255MB of additional disk space. The Minimal Grip XFCE task brings in 197 new packages, needs 57.3MB of archives and uses 171MB of additional disk space.
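To put those numbers in proportion, here is a small Python calculation of the savings relative to the Debian XFCE task. The figures are the ones quoted above; the dictionary layout and function name are just illustrative.

```python
# (new packages, archive size in MB, additional disk space in MB),
# as quoted in the comparison above.
TASKS = {
    "Debian XFCE": (354, 214.0, 607.0),
    "Grip XFCE": (293, 82.5, 255.0),
    "Minimal Grip XFCE": (197, 57.3, 171.0),
}


def reduction_percent(baseline, value):
    """Percentage saved relative to the baseline, rounded to one decimal."""
    return round(100.0 * (baseline - value) / baseline, 1)


def savings(task, baseline="Debian XFCE"):
    """Return (packages, archives, disk) savings of a task in percent."""
    return tuple(
        reduction_percent(base, value)
        for base, value in zip(TASKS[baseline], TASKS[task])
    )


for name in ("Grip XFCE", "Minimal Grip XFCE"):
    packages, archives, disk = savings(name)
    print(f"{name}: {packages}% fewer packages, "
          f"{archives}% smaller download, {disk}% less disk space")
```

The Grip XFCE task works out to roughly 17% fewer packages but about 61% less to download and 58% less disk space, which shows that most of the saving comes from shrinking each package rather than from dropping packages.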

As the Emdebian Grip packages are not recompiled, they are completely binary-compatible with Debian, so one can even mix Emdebian and Debian packages. One can also migrate an existing Debian system to Emdebian Grip simply by adding an apt source to /etc/apt/sources.list; for Lenny this is deb http://www.emdebian.org/grip lenny main. After the next apt-get update and apt-get upgrade, the system is converted to Emdebian Grip. The user can still pin individual packages to Debian versions.
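As a rough illustration of that migration step, the following Python sketch builds the apt source line quoted above and adds it to the text of a sources.list exactly once. It works on a string rather than on /etc/apt/sources.list itself, and the function names are hypothetical.

```python
GRIP_MIRROR = "http://www.emdebian.org/grip"  # URL quoted in the article


def grip_source_line(suite="lenny", component="main"):
    """Build the apt source line for an Emdebian Grip suite."""
    return f"deb {GRIP_MIRROR} {suite} {component}"


def add_grip_source(sources_text, suite="lenny"):
    """Return sources.list content with the Grip line appended if missing."""
    line = grip_source_line(suite)
    existing = [entry.strip() for entry in sources_text.splitlines()]
    if line in existing:
        return sources_text  # already migrated, nothing to do
    if sources_text and not sources_text.endswith("\n"):
        sources_text += "\n"
    return sources_text + line + "\n"
```

Keeping the existing Debian entries in place is what allows individual packages to stay pinned to their Debian versions.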

In December 2008, Williams made the first release of Emdebian Grip unstable available. When he converted the Debian Lenny installation on his Acer Aspire One to Emdebian Grip, 600 packages were updated (converted) and nearly 300MB of disk space was freed. He went on to proclaim Emdebian Grip as Debian, only 25% smaller. Installation size is one of the main reasons why people would want to install Emdebian Grip.

Prominent software packages in this release are the Xfce 4.4.2 desktop environment, X.Org 7.3 (which autoconfigures itself with most hardware), Iceweasel (Firefox) 3.0.6, Linux kernel version 2.6.26, Python 2.5.2 and 2.4.6, Perl 5.10.0 and more than 1,000 other packages. Under the hood, it's using coreutils and glibc. Xfce is the default desktop environment. As Emdebian is meant to run on embedded devices, not all Debian packages are added to Emdebian, only the ones that make sense. For example, most -dev, -doc and -dbg packages are missing. The full GNOME or KDE suites are also probably not going to be available in Grip, although smaller parts may find their way into it.

How does Emdebian Grip squeeze its packages?

So how does Emdebian make its packages smaller? Removing manpages, info documents and documentation is simple, but what about localization? On a Debian system, /usr/share/locale consumes 250MB. The way Debian implements localization is not suitable for embedded systems: each binary package includes the translations for all locales. In contrast, Emdebian uses the TDeb system: one TDeb for each locale, for each source package. Emdebian Grip provides methods to install only the localization data needed by the packages actually installed and the locales actually configured. At the moment, there is still one catch with this system: a program that uses non-gettext translations might lose them when "gripped". Examples of such translation formats are those used by OpenOffice.org, Mozilla, Qt and Java properties. According to Williams, non-gettext translations are unlikely to get full support in Emdebian Grip 2.0, but things should be easier in Grip 3.0, based on Debian 7.
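The selection logic behind TDebs can be sketched as follows. This is a minimal illustration under stated assumptions: the package naming scheme in tdeb_name() is hypothetical, and the availability data is passed in as a plain set rather than read from a repository.

```python
def tdeb_name(source, locale):
    """Name of the TDeb for one source package and one locale.

    ASSUMPTION: the "<source>-locale-<locale>" scheme is hypothetical
    and only used here for illustration.
    """
    return f"{source}-locale-{locale}"


def tdebs_to_install(installed_sources, configured_locales, available_tdebs):
    """Select the TDebs needed on this system.

    available_tdebs is a set of (source_package, locale) pairs for which
    a TDeb exists in the repository.
    """
    wanted = []
    for source in sorted(installed_sources):
        for locale in sorted(configured_locales):
            if (source, locale) in available_tdebs:
                wanted.append(tdeb_name(source, locale))
    return wanted
```

A device configured for a single locale therefore pulls in at most one TDeb per installed source package, instead of carrying every translation of every package.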

A related difference between Debian and Grip is the cache data size: there is a noticeable delay when Debian loads the package data for the first time before an installation. Most of that delay is because the Packages.gz file of Debian is so large. Williams explains what he has done in Emdebian Grip to solve this problem: "Grip not only reduces the number of packages listed in the Packages.gz file, but also enforces a limit on the length of individual long descriptions for each package, producing a much smaller Packages.gz file which makes for faster installations and is more suitable for devices where the available space after initial installation could be smaller than the size of the Packages.gz file from Debian."
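The description limit can be illustrated with a short Python sketch that trims the long description of one stanza in a Packages file. The limit of two lines is an arbitrary assumption, as the article does not say which limit Grip enforces, and the function name is hypothetical.

```python
MAX_LONG_DESCRIPTION_LINES = 2  # ASSUMPTION: the real limit is not stated


def trim_description(stanza, max_lines=MAX_LONG_DESCRIPTION_LINES):
    """Cut the long description of one Packages stanza to max_lines lines.

    In Debian control format, the long description consists of the
    continuation lines (starting with whitespace) after "Description:".
    """
    kept_lines = []
    in_description = False
    kept = 0
    for line in stanza.splitlines():
        if line.startswith("Description:"):
            in_description = True
            kept = 0
            kept_lines.append(line)  # the short description always stays
        elif in_description and line.startswith((" ", "\t")):
            if kept < max_lines:
                kept_lines.append(line)
                kept += 1
        else:
            in_description = False
            kept_lines.append(line)
    return "\n".join(kept_lines)
```

Continuation lines of other fields are left alone; only the text following the Description field is shortened.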

Grip, Crush and the future

Together with Emdebian Grip 1.0, another Emdebian variant appeared: Emdebian Crush 1.0. This one goes one stage further: it makes an even smaller Debian version by cross-building packages to modify dependencies and reduce overall package sizes. For example, Perl is removed, and packages that require Perl are removed or reimplemented. These modified dependencies give large gains in installation size. In contrast to Grip, building, installing and maintaining a system running Crush 1.0 is a lot of work and requires detailed knowledge of Debian. Moreover, Crush is not a build environment: the emdebian-tools package used to build packages for Crush doesn't work on Crush itself, because Crush doesn't include Perl. This means Emdebian Crush requires Debian to build. A minimal installation of Emdebian Crush 1.0 without X needs about 24MB.

The future of Emdebian Grip looks interesting. One aim is to prepare an almost complete filesystem without regard to the architecture of the final install. According to Williams, this will extend architecture-neutrality from package generation to package installation. By doing all the work in advance, the installed filesystem does not need to include the downloaded .deb packages, allowing systems to be installed to within much tighter tolerances for available file space after installation. Williams describes this process as follows: "Multistrap allows a complete system to be designed and prepared on a fast amd64 computer using armel packages from more than one repository and including all packages needed by that particular install, e.g. both lenny and lenny-proposed updates, along with security and volatile if desired. This allows the final install to be completed without network access and without needing to install any additional packages. The slower armel embedded device then merely needs to have the almost completed filesystem unpacked and then allow a single command to complete the configuration of all installed packages."

As an extension to this, another idea Williams wants to explore further is something he calls "incremental installation": teaching apt that if there are 100 packages to download and install, it should identify the 10 that can be installed without needing any other dependencies, then download, install and clean up after those, before moving on and identifying the next 10 or 20 that need no other dependencies from the rest of the stack. This way, the hundreds of megabytes that apt commonly needs for a large upgrade can be eliminated, as the temporary space is constantly reused instead of being allocated in one huge lump at the start and not freed until the very end. This would bring Debian one step closer to its goal of being a universal operating system.
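The batching idea resembles a layered topological sort of the dependency graph. The Python sketch below illustrates that technique only; it is not how apt works today nor Williams's design, and install_batches() is a hypothetical name.

```python
def install_batches(depends):
    """Split pending packages into batches that can be installed in order.

    depends maps each pending package to the set of packages it depends
    on. Dependencies that are not themselves pending are treated as
    already installed. Each batch only needs packages from earlier ones.
    """
    pending = {
        pkg: {dep for dep in deps if dep in depends and dep != pkg}
        for pkg, deps in depends.items()
    }
    batches = []
    while pending:
        # Packages whose remaining dependencies are all satisfied.
        ready = sorted(pkg for pkg, deps in pending.items() if not deps)
        if not ready:
            # Real archives do contain dependency cycles; this sketch
            # simply refuses them instead of installing them together.
            raise ValueError(
                "dependency cycle among: " + ", ".join(sorted(pending))
            )
        batches.append(ready)
        for pkg in ready:
            del pending[pkg]
        for deps in pending.values():
            deps.difference_update(ready)
    return batches
```

After each batch is downloaded and installed, its .deb files can be removed, so the temporary space needed is bounded by the largest batch rather than by the whole upgrade.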



Emdebian Grip 1.0: the universal embedded operating system

Posted Apr 9, 2009 20:41 UTC (Thu) by no_treble (guest, #49534)

Very promising. I like all the ideas mentioned.

Emdebian Grip 1.0: the universal embedded operating system

Posted Apr 10, 2009 7:14 UTC (Fri) by Kamilion (guest, #42576)

Hm. Think I'll build some of my next virtual appliance projects with this.
Been using Ubuntu Server or Lenny, but this seems like it would fit much better.

Emdebian Grip 1.0: the universal embedded operating system

Posted Apr 10, 2009 17:25 UTC (Fri) by nlucas (subscriber, #33793)

That 24MB number for minimal Emdebian Crush is really nice. Have to try it some time.

Emdebian Grip 1.0: the universal embedded operating system

Posted Apr 19, 2009 15:04 UTC (Sun) by oak (guest, #2786)

> As Emdebian is meant to run on embedded devices, not all Debian packages
> are added to Emdebian, only the ones that make sense. For example,
> most -dev, -doc and -dbg packages are missing.

Emdebian seems to have more Tdebs than Debian:
http://www.emdebian.org/emdebian/langupdate.html

A larger number of packages makes apt slower and use more memory (this is
relevant e.g. for the automated security updates of networked embedded
devices). Or is the plan not to have the translation package repository
enabled by default?

Also, why doesn't Emdebian support Ddebs like Ubuntu does? Target
debugging is a crucial feature for embedded...

> the XFCE task in Debian brings in 354 new packages ... In contrast, the
> Grip XFCE task brings in 293 new packages,

What (kind of) packages are missing? Is this metapackage(?) package
(dependency?) trimming something that is or can be automated?

Emdebian Grip 1.0: more tdebs than debian

Posted Apr 21, 2009 22:31 UTC (Tue) by speedster1 (subscriber, #8143)

From the page to which you linked:

http://www.emdebian.org/emdebian/langupdate.html
Emdebian generates a single package for every translation of each Emdebian package, leading to a 70% reduction in installation size but a tenfold increase in the number of binary packages built from each source package. To solve this scalability problem, langupdate supports a secondary sources list and secondary apt cache so that the main apt cache can be kept as small as possible.
--------

It looks like this "secondary apt cache" is intended to address that problem of using extra memory on things like automated security updates. Security updates would be in the main apt sources list, which would not include the translation packages. Also note that the biggest explosion of packages is really on the build machine rather than the install target -- a target device with space constraints could enable only one language at a time, even though lots of languages may be available in the package repo.

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds