LCA: Andrew Tanenbaum on creating reliable systems

Posted Jan 18, 2007 20:14 UTC (Thu) by eklitzke (subscriber, #36426)
In reply to: LCA: Andrew Tanenbaum on creating reliable systems by elanthis
Parent article: LCA: Andrew Tanenbaum on creating reliable systems

The problem with that sentiment, and the whole article, is that it focuses solely on the kernel. I don't think I've had more than 3 or 4 Linux failures in my life, and most of those were when using very new drivers (or NVIDIA). I have had X crash or lock, various GNOME and KDE components crash or lock, various regular applications crash and lock more times than I can possibly count. Definitely into the triple digits, if not quadruple by now.

I tend to agree with you here. The kernel is very stable -- I've had only one real, bona fide kernel oops in the past 18 months or so (I think it was pdflush that crashed it). And I can't even begin to count how many times X has totally locked up the system (usually after starting a misbehaving GNOME application). But that just means those applications need to implement a fault-tolerant model as well. It's totally unacceptable that an application can cause X to lock up the whole computer. If X were self-healing, that would be spectacular.

A lot of the most modular pieces of software on my system (I am thinking particularly of Postfix and Apache) are also the most stable. TCP/IP is another example of a modular (well, layered) system that is particularly resilient to failure. Certainly this level of modularity isn't needed in all cases, but for any really critical software I think that taking some lessons from the microkernel model is a great idea.
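
To make the "self-healing" idea a bit more concrete, here is a minimal sketch of a supervisor that runs a worker as a separate process and restarts it when it exits abnormally. This is only an illustration under stated assumptions: `supervise`, `flaky_worker`, and the `max_restarts` limit are hypothetical names made up for this example, and it is not how Postfix, Apache, or a microkernel actually implements recovery.

```python
import subprocess
import sys


def supervise(make_cmd, max_restarts=3):
    """Run a worker in its own process and restart it after a crash.

    make_cmd(attempt) is a hypothetical callback returning the argv list
    for the given attempt number (starting at 1).
    Returns (attempts_made, final_exit_code).
    """
    attempt = 0
    while True:
        attempt += 1
        # The worker runs in a separate process, so a crash there
        # cannot take the supervisor down with it.
        result = subprocess.run(make_cmd(attempt))
        if result.returncode == 0:
            return attempt, 0
        if attempt > max_restarts:
            # Give up instead of restarting forever.
            return attempt, result.returncode


def flaky_worker(attempt):
    """Hypothetical worker that fails on its first two attempts."""
    code = "import sys; sys.exit(1 if int(sys.argv[1]) < 3 else 0)"
    return [sys.executable, "-c", code, str(attempt)]


if __name__ == "__main__":
    attempts, status = supervise(flaky_worker)
    print(f"worker finished after {attempts} attempt(s), exit status {status}")
```

The point of the sketch is only that the failing component is isolated in its own process, so whatever watches it survives the crash and can start a fresh copy. A real supervisor would also need to deal with hangs, restart rate limiting, and any state the worker lost.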



Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds