The Embedded Linux Nightmare
The use of proprietary operating systems in companies over the last 25
years has established a set of constraints which are not really applicable
to the way open source development - and Linux kernel development in
particular - works. My keynote talk at the
Embedded Linux Conference in Santa Clara addressed this mismatch; it
created quite a bit of discussion. I would like to follow up and add some
more details and thoughts on this topic.
Why follow mainline development?
The version cycles of proprietary operating systems are completely
different than the Linux kernel version cycles. Proprietary operating
systems have release cycles measured in years; the Linux kernel, instead,
is released about every three months with major updates to the
functionality and feature set and changes to internal APIs. This
fundamental difference is one of the hardest problems for the embedded
industry to handle.
It is easy to understand that companies try to apply the same mechanisms
they used with their former (and often still-used) operating systems,
in order not to change their procedures for development and quality
assurance. Jamming Linux into these existing procedures may seem
possible, but it is one of the main contributors to the embedded Linux
nightmare, preventing companies from tapping the full potential of open
source software. Embedded distribution vendors are equally guilty, as
they try to keep up the illusion of a one-to-one replacement for
proprietary operating systems by creating heavily patched Linux kernel
trees.
It is undisputed that kernel versions need to be frozen for product
releases, but those freezes are typically done very early in the
development cycle and are kept across multiple versions of the product
or product family. These freezes, a vain attempt to keep the existing
procedures alive, lead to backports of features found in newer kernel
versions and create monsters which leave companies in the isolated
position of maintaining their unique fork forever, without the help of
the community.
I was asked recently whether a backport of the new upcoming wireless
network stack into Linux 2.6.10 would be possible. Of course it is
possible, but it does not make any sense at all. Backporting such a feature
requires backporting other changes in the network stack and in many other
parts of the kernel as well, making it even more complex to verify and
maintain. Each update and bug fix in the mainline code needs to be tracked
and carefully considered for backporting. Bugfixes which are made in the
backported code are unlikely to apply to later versions and are therefore
useless for others.
During another discussion about backporting a large feature into an old
kernel, I asked why a company would want to do that. The answer was: the
quality assurance procedures would require a full verification when the
kernel was upgraded to a newer version. This is ridiculous. What level
of quality does such a process assure if it treats moving to a newer
kernel version differently from patching a heavy feature set into an
old kernel? The risk of adding subtle breakage to the old kernel with a
backport is orders of magnitude higher than the risk of breakage from an
up-to-date kernel release. Up-to-date kernels go through the community
quality assurance process; unique forks, instead, are excluded from this
free-of-charge service.
There is a fundamental difference between adding a feature to a
proprietary operating system and backporting a feature from a new Linux
kernel to an old one. A new feature of a proprietary operating system is
written for exactly the version it enhances. A new
feature for the Linux kernel is written for the newest version of the
kernel and builds upon the enhancements and features which have been
developed between the release of the old kernel and now. New Linux
kernel features are simply not designed for backporting.
I can only discourage companies from even thinking about such things.
The time spent doing backports and maintaining the resulting
unique kernel fork is better spent on adjusting the
internal development and quality assurance procedures to the way
in which Linux kernel development is done.
Otherwise it is just another great example of a useless waste of resources.
Benefits to companies from working with the kernel process
There are many arguments for why mainlining code is not practical in
the embedded world. One of the most commonly used arguments is that
embedded projects are one-shot developments and therefore mainlining is
useless and without value. My experience in the embedded area tells me,
instead, that most projects are built on previous projects and a lot of
products are part of a product series with different feature sets. Most
special-function semiconductors are parts of a product family and
development happens on top of existing parts. The IP blocks, which are the
base of most ASIC designs, are reused all over the place, so the code
to support those building blocks can be reused as well.
The one-shot project argument strikes me as a straw man. The real reasons are
the reluctance to give up control over a piece of code, the already
discussed usage of ancient kernel versions, the work which is related to
mainlining, and to some degree the fear of the unknown.
The reluctance to give up control over code is an understandable but
nevertheless misplaced relic of the proprietary closed source model.
Companies have to open up their modifications and extensions to the Linux
kernel and other open source software anyway when they ship their
product. So handing it over to the community in the first place should be
just a small step.
Of course, mainlining code is a fair amount of work and it forces
changes to the way development happens inside companies. But companies
which have been through this change confirm that it brings real
benefits.
According to Andrew Morton, we change approximately 9000 lines of kernel
code per day, every day. That means we touch something in the range of
3000 lines of real code, once comments, blank lines, and simple
reshuffling are taken into account. The COCOMO estimate of the value of 3000 lines
of code is about $100k. So we have a total investment of $36 million per
year which flows into the kernel development. That's with all the relevant
factors set to 1. Taking David Wheeler's adjustment factors into account
would push this figure up to $127 million.
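The arithmetic behind these figures can be sketched quickly. Note that the per-line cost and the final multiplier below are simply back-derived from the numbers quoted above; they are illustrative assumptions, not official COCOMO parameters.

```python
# Rough reconstruction of the kernel-investment arithmetic above.
# The $100,000 value for 3000 lines is the article's COCOMO estimate
# with all adjustment factors set to 1; the final multiplier is
# back-derived from the quoted $36M -> $127M jump attributed to
# David Wheeler's adjustment factors.

effective_lines_per_day = 3000            # of ~9000 changed lines/day
cost_per_line = 100_000 / 3000            # assumed COCOMO value, factors = 1

daily_cost = effective_lines_per_day * cost_per_line   # $100,000
yearly_investment = daily_cost * 365
print(f"Base estimate: ${yearly_investment / 1e6:.1f}M per year")
# -> Base estimate: $36.5M per year (the "$36 million" quoted above)

wheeler_multiplier = 127e6 / yearly_investment         # ~3.5, inferred
adjusted = yearly_investment * wheeler_multiplier
print(f"With Wheeler's factors: ${adjusted / 1e6:.0f}M per year")
# -> With Wheeler's factors: $127M per year
```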
This estimate does not take other efforts around the kernel into account,
like the test farms, the testing and documentation projects and the immense
number of (in)voluntary testers and bug reporters who "staff" the QA
department of the kernel.
Some companies realize the value of this huge cooperative investment and
add their own stake for the long term benefit. We recently had a
customer who asked if we could write a driver for a yet-unsupported
flash chip. His second question was whether we would try to feed it
back into the mainline. He was even willing to pay for the extra hours,
simply because he understood that it was helpful for him. This is a small
company with fewer than 100 employees and a definitely limited budget,
which cannot afford the waste of maintaining even such small drivers out
of tree. I have seen such efforts from smaller companies quite often in
recent years and I really hold those folks in great respect.
Bigger players in the embedded market apparently have budgets large enough
to ignore the benefits of working with the community and just concentrate
on their private forks. This is unwise with respect to their own
investments, to say nothing of the total disrespect for the value
given to them by the community.
It is understandable that companies want to open the code for new products
very late in the product cycle, but there are ways to get this done
nevertheless. One is to work through a community proxy, such as
consultants or service providers, who know how kernel development works and
can help to make the code ready for inclusion from the very beginning.
The value of community-style development lies in avoiding mistakes and in
benefiting from the experience of other developers. Posting an early draft
of code for comment can help with both code quality and development time.
The largest benefit of mainlining code is the automatic updating of that
code when kernel-internal interfaces change, along with the enhancements
and bug fixes provided by other users of the code. Mainlining also allows
easy kernel upgrades later in a product cycle, when new features and
technologies have to be added. The same is true for security fixes, which
can be hard to backport.
Benefits to developers
I personally know developers who are not interested in working in the open
at all for a very dubious reason: as long as they have control over their
own private kernel fork, they are the undisputed experts for code on which
their company depends. If forced to hand over their code to the
community, they fear losing control and making themselves easier to
replace. Of course this is a short-sighted view, but it happens. These
developers miss the beneficial effect of gaining knowledge and expertise by
working together with others.
One of my own employees went through a ten-round review-update-review
cycle which ended
with satisfaction for both sides:
> Other than that I am very happy with this latest version. Great
> job! Thanks for your patience, I know it's always a bit
> frustrating when your code works well enough for yourself and you
> are still told to make many changes before it is acceptable.

The developer's answer:

> Well, I really appreciate good code quality. If this is the price,
> I'm willing to pay it. Actually, I thank you for helping me so
> much.
Over the course of this review cycle the code quality of the driver
improved; the process also led to some general discussion about the
affected sensors framework and to improvements to it along the way.
The developer improved his skills and gained a deeper insight into
the framework, with the result that his next project will definitely
have a much shorter review cycle. This growth makes him far more
valuable to the company than keeping him as the internal expert for
some "well, it works for us" driver.
The framework maintainer benefited as well, as he needed to look at the
requirements of the new device and adjust the framework to handle it in a
generic way. This phenomenon is completely consistent with Greg
Kroah-Hartman's statement in his OLS
keynote last year:
> We want more drivers, no matter how "obscure", because it
> allows us to see patterns in the code, and realize how we
> could do things better.
All of the above leads to a single conclusion: working with the kernel
development community is worth the costs it imposes in changes to internal
processes. Companies which work with the kernel developers get a kernel
which better meets their needs, is far more stable and secure, and which
will be maintained and improved by the community far into the future.
Those companies which choose to stay outside the process, instead, miss
many of the benefits of the millions of dollars' worth of work
contributed by others. Developers, too, are able to take advantage of
working with a group of smart people with a strong dedication to code
quality.
It can be a winning situation for everybody involved - far better than
perpetuating the embedded Linux nightmare.