Upgrading to a new version of an operating system is always a bit of a
mixed experience. The promise of new features, new applications, and
better performance (one hopes) contends with the fear that the upgrade will
break something that used to work. Even the most worried among us,
however, do not normally expect an upgrade to cause hardware to
self-destruct. Those who have recently attempted to install Mandrake
Linux 9.2 on a system containing an LG CD drive (shipped by Dell and
numerous others) have gotten just that sort of surprise. An
unpatched 9.2 system, it seems, can cause those drives to wipe out their
firmware and cease to function.
This problem has been the centerpiece of a small flood of complaints about
the stability of the 9.2 release - over 250MB of updates have already been
issued by MandrakeSoft. The simple fact of the matter, however, is that it
is hard to blame MandrakeSoft for this problem.
The code which toasts LG drives was added to the Mandrake Linux kernel back
in August, as part of a general packet writing support patch. It issues a
standard ATAPI FLUSH_CACHE command to the drive at times, in order
to ensure that all outbound data reaches its intended destination. A
CD-ROM is a read-only device, so the FLUSH_CACHE command does not
make any particular sense in this context. But, for the purpose of the
packet-writing code, it was easier to simply issue that command.
The ATAPI specification is clear on what should happen in this situation;
the drive should either simply ignore the command, or it should fail it
with an error code. The designer of the LG drive firmware, however, had a
different idea. Since FLUSH_CACHE is not a command that is
applicable in this situation,
why not reuse it to overwrite the firmware in some (undocumented)
way? It must have, in some twisted way, seemed like a good idea at the
time. But standard commands should never be re-purposed in this way; and
they especially should not be turned into a self-destruct operation. The
LG drives are non-compliant and mis-designed, and nobody can blame
MandrakeSoft for having been the first distributor to get burned by this.
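
The two acceptable responses described above can be pictured with a small
toy model. This is only a sketch, not real firmware or driver code: the
ReadOnlyDrive class, its reject_unsupported flag, and the firmware field are
hypothetical names made up for illustration. The opcode names follow the
GPCMD_* constants in the Linux cdrom.h header, and the status and sense
values follow the usual SCSI/ATAPI conventions.

```python
from dataclasses import dataclass
from typing import Tuple

# Opcodes, named after the GPCMD_* constants in Linux's cdrom.h.
GPCMD_READ_10 = 0x28
GPCMD_FLUSH_CACHE = 0x35

# Conventional status codes and the ILLEGAL REQUEST sense key.
GOOD = 0x00
CHECK_CONDITION = 0x02
ILLEGAL_REQUEST = 0x05


@dataclass
class ReadOnlyDrive:
    """Hypothetical model of a read-only drive's command handling."""
    firmware: bytes = b"factory-firmware"
    reject_unsupported: bool = True  # assumption: a per-drive policy choice

    def execute(self, opcode: int) -> Tuple[int, int]:
        """Return (status, sense_key) for the given command opcode."""
        if opcode == GPCMD_READ_10:
            return GOOD, 0
        if opcode == GPCMD_FLUSH_CACHE:
            # Nothing to flush on read-only media. Either fail the
            # command or quietly accept it; never touch other state.
            if self.reject_unsupported:
                return CHECK_CONDITION, ILLEGAL_REQUEST
            return GOOD, 0
        # Any other opcode is unknown to this toy model.
        return CHECK_CONDITION, ILLEGAL_REQUEST


strict = ReadOnlyDrive()
lenient = ReadOnlyDrive(reject_unsupported=False)
print(strict.execute(GPCMD_FLUSH_CACHE))
print(lenient.execute(GPCMD_FLUSH_CACHE))
```

In both variants the firmware field is untouched after the command
completes, which is the property the LG drives reportedly fail to uphold.
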
Some people have tried to lay the blame there anyway, of course. According
to the critics, if MandrakeSoft would only test its releases more
thoroughly and avoid including non-standard kernel patches, this sort of
episode would not occur. These charges do not hold water, however.
Mandrake Linux has, arguably, the most open development process of any
commercial distributor; anybody who is interested can follow the evolution
of each release from one day to the next and, yes, test those releases.
The code in question was included in two 9.2 release candidates,
but nobody pointed out the problem. It is hard to see how much better
MandrakeSoft could do on the testing front.
With regard to patches: for better or worse, shipping patched kernels is
standard practice for distributors. Some distributors ship kernels which
are hard to recognize as being derived from any mainline release; Red Hat's
kernels are called 2.4.x, but, at the moment, are packed with 2.6 code and
features. Even Debian has just been through a lengthy (and somewhat
inconclusive) debate on just how heavily its kernels should be patched.
For many patches, use in distributor kernels is a prerequisite to inclusion
in the mainline. The use of patched kernels in distributions is not only
standard practice, but it's a part of the wider development process.
New code will bring surprises, though not often, one hopes, of this
magnitude. The only real way to be sure of the stability of code is to see
it in wide use, in many different situations. Unfortunately, in the
software world, the only way to achieve that degree of testing is to have
the end users do it. This is true for both free and proprietary software.
Such is life in this industry. MandrakeSoft got unlucky this time; the
next such incident could just as easily happen to anybody else.
(Mandrake users may want to see the errata page
for the LG drive problem.)