
Mandrake Linux 9.2 and self-destructing CD-ROM drives

Upgrading to a new version of an operating system is always a bit of a mixed experience. The promise of new features, new applications, and better performance (one hopes) contends with the fear that the upgrade will break something that used to work. Even the most worried among us, however, do not normally worry about an upgrade causing hardware to self-destruct. Those who have recently attempted to install Mandrake Linux 9.2 on a system containing an LG CD drive (shipped by Dell and numerous others) have gotten just that sort of surprise. An unpatched 9.2 system, it seems, can cause those drives to wipe out their firmware and cease to function.

This problem has been the centerpiece of a small flood of complaints about the stability of the 9.2 release - over 250MB of updates have already been issued by MandrakeSoft. The simple fact of the matter, however, is that it is hard to blame MandrakeSoft for this problem.

The code which toasts LG drives was added to the Mandrake Linux kernel back in August, as part of a general packet writing support patch. It issues a standard ATAPI FLUSH_CACHE command to the drive at times, in order to ensure that all outbound data reaches its intended destination. A CD-ROM is a read-only device, so the FLUSH_CACHE command does not make any particular sense in this context. But, for the purpose of the packet-writing code, it was easier to simply issue that command unconditionally.
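The difference between issuing the flush unconditionally and guarding it can be sketched in a few lines. This is only an illustration in Python, not the kernel patch itself: the `Drive` class, its `writable` flag, and both function names are hypothetical stand-ins invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Drive:
    """Hypothetical stand-in for an ATAPI device; not a kernel structure."""
    name: str
    writable: bool
    commands_seen: list = field(default_factory=list)

    def send(self, command: str) -> None:
        # Record the command instead of talking to real hardware.
        self.commands_seen.append(command)


def flush_unconditionally(drive: Drive) -> None:
    # The behaviour described above: every drive gets the flush,
    # whether or not it could ever hold pending writes.
    drive.send("FLUSH_CACHE")


def flush_if_writable(drive: Drive) -> bool:
    # Guarded variant: a read-only drive never sees the command.
    if not drive.writable:
        return False
    drive.send("FLUSH_CACHE")
    return True
```

With the guarded variant, a read-only `Drive` never receives the command at all, so a firmware quirk in how it handles the flush is never exercised.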

The ATAPI specification is clear on what should happen in this situation; the drive should either simply ignore the command, or it should fail it with an error code. The designer of the LG drive firmware, however, had a different idea. Since FLUSH_CACHE is not a command that is applicable in this situation, why not reuse it to overwrite the firmware in some (undocumented) way? It must have, in some twisted way, seemed like a good idea at the time. But standard commands should never be re-purposed in this way; and they especially should not be turned into a self-destruct operation. The LG drives are non-compliant and mis-designed, and nobody can blame MandrakeSoft for having been the first distributor to get burned by this poor product.

Some people have tried to lay the blame there anyway, of course. According to the critics, if MandrakeSoft would only test its releases more thoroughly and avoid including non-standard kernel patches, this sort of episode would not occur. These charges do not hold water, however. Mandrake Linux has, arguably, the most open development process of any commercial distributor; anybody who is interested can follow the evolution of each release from one day to the next and, yes, test those releases. The code in question was included in two 9.2 release candidates, but nobody pointed out the problem. It is hard to see how much better MandrakeSoft could do on the testing front.

With regard to patches: for better or worse, shipping patched kernels is standard practice for distributors. Some distributors ship kernels which are hard to recognize as being derived from any mainline release; Red Hat's kernels are called 2.4.x, but, at the moment, are packed with 2.6 code and features. Even Debian has just been through a lengthy (and somewhat inconclusive) debate on just how heavily its kernels should be patched. For many patches, use in distributor kernels is a prerequisite to inclusion in the mainline. The use of patched kernels in distributions is not only standard practice, but it's a part of the wider development process.

New code will bring surprises, though, hopefully, not often of this magnitude. The only real way to be sure of the stability of code is to see it in wide use, in many different situations. Unfortunately, in the software world, the only way to achieve that degree of testing is to have the end users do it. This is true for both free and proprietary software. Such is life in this industry. MandrakeSoft got unlucky this time; the next such incident could just as easily happen to anybody else.

(Mandrake users may want to see the errata page for the LG drive problem).



Mandrake Linux 9.2 and self-destructing CD-ROM drives

Posted Oct 30, 2003 14:07 UTC (Thu) by NAR (subscriber, #1313) [Link] (1 responses)

I'd love to see the day when a hardware manufacturer will be sued (and convicted) for a bug like this...

Bye,NAR

Mandrake Linux 9.2 and self-destructing CD-ROM drives

Posted Oct 30, 2003 14:57 UTC (Thu) by donwaugaman (subscriber, #4214) [Link]

As much as I enjoy indulging in schadenfreude at this kind of bogus hardware design, the regular process of returning bad equipment should result in these kinds of bozos either getting a clue quickly or going the way of the dodo in fairly short order, with little in the way of legal action necessary barring poor customer support practices. If a suit is necessary, it would probably end up going class-action given the low average selling price of CD-ROM drives these days.

(And, to be pedantic, in the USA you can't get a conviction in a lawsuit. Lawsuits are civil cases, convictions happen in criminal cases.)

Mandrake Linux 9.2 and self-destructing CD-ROM drives

Posted Oct 30, 2003 16:07 UTC (Thu) by hentosh (guest, #6115) [Link] (1 responses)

The bug isn't really as bad as it sounds. The FLUSH command is ATAPI command 35h. The FLASH_FIRMWARE command was ATA command 35h. The problem wasn't that LG reused an ATAPI flush command to flash the firmware. The problem was that the firmware didn't, at this point, verify whether it was handling an ATAPI command or an ATA command.

This case was probably missed in testing since the ATAPI command for FLUSH didn't make sense to send to a CDROM device (as stated above).
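A rough sketch of the distinction described in this comment, in Python. The opcode value 35h and the two command meanings are taken from the comment; the `CommandSet` enum, the handler names, and both dispatch functions are hypothetical and do not reflect the actual LG firmware.

```python
from enum import Enum


class CommandSet(Enum):
    ATA = "ata"
    ATAPI = "atapi"


OPCODE_35H = 0x35  # value as given in the comment above

# Hypothetical handler table keyed on (command set, opcode).
HANDLERS = {
    (CommandSet.ATAPI, OPCODE_35H): "flush_noop",
    (CommandSet.ATA, OPCODE_35H): "flash_firmware",
}


def buggy_dispatch(command_set: CommandSet, opcode: int) -> str:
    # Looks only at the opcode, so an ATAPI flush is treated as a flash.
    if opcode == OPCODE_35H:
        return "flash_firmware"
    return "reject"


def checked_dispatch(command_set: CommandSet, opcode: int) -> str:
    # Uses the command set as part of the lookup key.
    return HANDLERS.get((command_set, opcode), "reject")
```

In this sketch the only difference between the two functions is whether the command set takes part in the lookup, which is the check the comment says was missing.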


Poor CD-ROM design?

Posted Oct 31, 2003 23:45 UTC (Fri) by giraffedata (guest, #1954) [Link]

What gets me about this is how happily the drive overwrites its firmware with trash. If I were designing a "flash firmware" command, I think I'd stick some kind of signature and/or CRC in there to make absolutely sure that the data being supplied was intended as firmware for the device. Think of all the ways you could intentionally send a "flash firmware" command, but unintentionally send the wrong data.
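The signature-plus-CRC idea could look something like the following Python sketch. The image layout (a 4-byte magic value, a 4-byte CRC32, then the payload), the `FWIM` magic, and the function names are all assumptions made up for illustration; no real drive's firmware format is implied.

```python
import struct
import zlib

MAGIC = b"FWIM"                   # hypothetical signature
HEADER = struct.Struct(">4sI")    # magic, then big-endian CRC32 of payload


def build_image(payload: bytes) -> bytes:
    # Prefix the payload with the signature and its checksum.
    return HEADER.pack(MAGIC, zlib.crc32(payload)) + payload


def accept_image(blob: bytes) -> bool:
    # Refuse anything too short to carry a header.
    if len(blob) < HEADER.size:
        return False
    magic, expected_crc = HEADER.unpack_from(blob)
    payload = blob[HEADER.size:]
    # Both the signature and the checksum must match before flashing.
    return magic == MAGIC and zlib.crc32(payload) == expected_crc
```

Under this scheme, arbitrary data that happened to arrive with a "flash firmware" opcode would fail the signature check and be refused, rather than being written over working firmware.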

Mandrake Linux 9.2 and self-destructing CD-ROM drives

Posted Oct 31, 2003 18:19 UTC (Fri) by taruntius (guest, #1140) [Link] (1 responses)

Article says:

Some people have tried to lay the blame there anyway, of course. According to the critics, if MandrakeSoft would only test its releases more thoroughly and avoid including non-standard kernel patches, this sort of episode would not occur. These charges do not hold water, however. Mandrake Linux has, arguably, the most open development process of any commercial distributor; anybody who is interested can follow the evolution of each release from one day to the next and, yes, test those releases. The code in question was included in two 9.2 release candidates, but nobody pointed out the problem. It is hard to see how much better MandrakeSoft could do on the testing front.
Agreed, the hardware manufacturer really is to blame here. But even so, I have to disagree with absolving MandrakeSoft on the grounds that their development process is particularly open, and that therefore people could have been testing those bits. I've been mulling that logic over ever since reading this article yesterday, but every way I look at it, it seems to be little more than a fancy way of saying "in the open source world, the burden of testing falls on the user." That, my friends, simply cannot be what we're advocating.

As so many linux advocates will claim, we want linux to have success on the desktop. Success on the desktop equates to success with ordinary users, in which case we cannot seriously be suggesting that those same users be deeply involved in the pre-release testing of their OS. QED: if they were, they wouldn't be ordinary users. And frankly, if I'm an ordinary user and you tell me that with Linux I've got no guarantee that my distro of choice may well not have been tested with my hardware, and that furthermore, it's my fault if something blows up because I could have been testing it in pre-release form, I guarantee you my response is going to be "Well screw that! Windows ain't that bad."

As to the claim of "It is hard to see how much better MandrakeSoft could do on the testing front," you must not be looking very hard. Do you really think it's so much of an undue burden to ask MandrakeSoft, a company that purports to be making a living selling distributions, to do some compatibility testing with a variety of hardware? Or even to do a lot of compatibility testing? Is it so unthinkable that just maybe they ought to fork over some dollars to buy a few Dell, Gateway, et al. boxes to see what happens on hardware people are likely to have? Frankly, I think that's entirely reasonable. I don't think that's a particularly high "barrier to entry" for a company like MandrakeSoft. It's true that CD-ROM manufacturers should do a better job of implementing the standards for those devices, but MandrakeSoft, Red Hat, and all the other serious commercial distro manufacturers out there need to step up to the plate and take responsibility for testing their products. That, among many other things, is what it takes to be a credible player in the OS game.

Anyway, that's how I feel about it. But if appeals to responsibility and the plight of the ordinary user don't motivate you, consider this: I'll just bet that somewhere in Redmond there's a big lab where Microsoft tests every random peripheral, CPU, motherboard, and memory stick they can get their hands on to make damn sure this kind of thing doesn't happen to their ordinary users. You want to bet against that?

Mandrake Linux 9.2 and self-destructing CD-ROM drives

Posted Nov 4, 2003 0:24 UTC (Tue) by crouchet (guest, #1084) [Link]

>>And frankly, if I'm an ordinary user and you tell me that with Linux I've got no guarantee
that my distro of choice may well not have been tested with my hardware, and that
furthermore, it's my fault if something blows up because I could have been testing it in
pre-release form, I guarantee you my response is going to be "Well screw that! Windows
ain't that bad."<<

Actually, yes it is. When XP came out there was no guarantee that it would work with
YOUR particular hardware. And if it didn't, who was scrambling to fix it? The hardware
company. If you doubt that, check out this MS document:

http://support.microsoft.com/?kbid=330181

Which basically says to treat incompatible hardware the same as damaged hardware;
remove it from your system and then contact the manufacturer. A "Not Our Problem"
response.

If you check this site:
http://www.microsoft.com/windowsxp/home/using/howto/gettingstarted/guide/setupqanda.asp

You will find out about something called the "Hardware Compatibility List" and the
following statement:

"If your hardware isn't on the Hardware Compatibility List (HCL), contact the hardware
manufacturer to see if there is a Windows XP driver for it."

Nobody promises their software will work with all hardware.

JC

How did they figure this out?

Posted Oct 31, 2003 23:49 UTC (Fri) by giraffedata (guest, #1954) [Link] (1 responses)

Wow, I wonder how many drives had to be toasted for us to know that this was the problem? The source of this problem is by no means obvious from the result, and it's a pretty expensive thing to reproduce.

Or maybe after a small number of problem reports the drive manufacturer managed to reproduce the problem and then track it down with the luxury of being able to restore wiped out firmware?

How did they figure this out?

Posted Nov 2, 2003 12:49 UTC (Sun) by axboe (subscriber, #904) [Link]

It was actually pretty easy. The event of discovering this was some years ago, when Compaq used LG drives. A recently added patch in the SUSE kernel fried two drives at SUSE testing labs, so it was fairly easy to find out that it was the FLUSH_CACHE that was at fault. Debug and discovery was done by myself.

Now fast forward two years, to two weeks ago. Mandrake fries CD-ROM drives; apparently this LG drive bug never got fixed at the source (LG). Unfortunate that the packet writing bug (issuing unconditional FLUSH_CACHE) never got fixed, stupid that LG still has that very serious bug. And very unfortunate for Mandrake.


Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds