LCA: Andrew Tanenbaum on creating reliable systems
Posted Jan 18, 2007 8:24 UTC (Thu) by mingo (guest, #31122)
In reply to: LCA: Andrew Tanenbaum on creating reliable systems by elanthis
Parent article: LCA: Andrew Tanenbaum on creating reliable systems
If you take Tanenbaum's suggestion to heart, the 5-10% "penalty" of the micro-kernel design is irrelevant, because you won't just be swapping in a micro-kernel underneath the bloated, unreliable layers we've built on top of Linux. You'll be building an entire new system, bottom to top, with less bloat and more reliability. Will that total system have a 5-10% penalty over my current system? I doubt it. You can't even *begin* to speculate, because there are just far, far too many variables to really judge that.
Yes. The other cost is not performance but flexibility of design and /flexibility of bugfixes/. Both matter very much. You don't win more reliability by making bugs harder to fix. A 'monolithic' kernel's state might be harder to debug, but you've got everything in one place - if you need to change a few drivers to fix a bug in a core infrastructure API - no problem, you just do it. If you need to expose a data structure to another subsystem - no problem.
In a microkernel design you have explicit, documented, relied-on APIs (which are more like ABIs) between subsystems, making both the ad-hoc sharing of information and the fast fixing of those interfaces a lot more cumbersome.
Furthermore, if there's a failure in any of the subsystems, I definitely do not want to hide this fact by having a "restart and try again" feature. I really want to achieve a bug-free kernel, not a kernel that appears bug-free.
My opinion is that we'll win far more reliability by concentrating on transparent debugging facilities (static ones such as Sparse and dynamic ones such as [plug alert] lockdep), than via limiting the basic flexibility of the kernel's design. I'd rather burn CPU time on running with lockdep enabled to find deadlocks, than to slow down and hinder /all/ kernel development by forcibly isolating components from each other.
Also, there are some areas and subsystems where isolation wins us /more/ flexibility: for example filesystems. But here Linux already has FUSE, which is an /optional/ feature to write filesystems in user-space. NTFS-3G has already proven (by being leagues better than the in-kernel ntfs driver) that at least for that type of filesystem, and in that stage of its lifecycle, development was faster and more flexible in user-space.
Anyway ... we'll see how this works out. I have a huge amount of respect for Mr. Tanenbaum, his books are great and I am sure he is having tons of fun with Minix - and I definitely agree with him that reliability is the #1 challenge of modern OS design. Diversity of opinion and diversity of approach do not bother me; they will only enrich the end result.

Posted Jan 25, 2007 15:26 UTC (Thu) by tjc (guest, #137)
Furthermore, if there's a failure in any of the subsystems, I definitely do not want to hide this fact by having a "restart and try again" feature.
My understanding is that MINIX 3 will log server/driver crashes and email the developer if so configured. I can't remember if I read this somewhere here, or in one of the whitepapers.

Posted Jan 27, 2007 22:50 UTC (Sat) by pascal.martin (guest, #2995)
Minix will log server/driver crashes? To disk? Even if the disk driver crashed? :-)
Let's assume the disk driver was restarted. What happens if the disk driver crashes again, because of the activity caused by the crash log? 8-)
That may seem silly, but I have seen similar "death trap" problems in real life.

Posted Jan 29, 2007 15:22 UTC (Mon) by tjc (guest, #137)
Well yes, there is some chance of that happening, but there's also some chance that you will be hit by a bus and killed before you read this post.
I expect the logging system works in enough cases to be a benefit.

Posted Jan 31, 2007 22:50 UTC (Wed) by tjc (guest, #137)
I just found this bit of information in the paper "Reorganizing UNIX for Reliability":
"If crashes reoccur, a binary exponential backoff protocol could be used to prevent bogging down the system with repeated recoveries."
Unfortunately, no specifics are given. It sounds like something from Star Trek TNG. Data: "Captain, I could use a binary exponential backoff protocol to restart the warp engines." Picard: "Very good Mr. Data -- make it so!"
http://www.minix3.org/doc/ACSAC-2006.pdf

exponential backoff
Posted Feb 1, 2007 12:54 UTC (Thu) by robbe (guest, #16131)
Exponential backoff is a standard technique used, for example by mail servers, in the face of transient failures: after the n-th consecutive error, wait f * k^(n-1) seconds, then retry. Suitable values for f and k depend on the application -- k is often 2 -> binary exponential backoff.
Example with f = 300, i.e. 5 minutes (a viable value for SMTP):
* First try ... fails
* Wait 5 minutes
* Second try ... fails
* Wait 10 minutes
* Third try ... fails
* Wait 20 minutes
* Fourth try ... fails
* Wait 40 minutes
* Fifth try ...
etc.
It would work the same for OS-component restart, of course with values for f in the milliseconds.
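
For anyone who wants to play with the numbers, here is a minimal Python sketch of the backoff schedule described above. It is an illustration only, not taken from MINIX 3 or from any mail server. The names backoff_delays and retry_with_backoff, and the max_delay and max_tries parameters, are made up for this example. It follows the worked example, where the wait after the first failure is f itself and each later wait is multiplied by k.

```python
import time


def backoff_delays(f, k=2, max_delay=None):
    """Yield the wait (in seconds) after the 1st, 2nd, 3rd, ... consecutive failure.

    f is the base delay and k the growth factor; k = 2 gives binary
    exponential backoff. max_delay is a hypothetical cap, not in the thread.
    """
    n = 1
    while True:
        delay = f * k ** (n - 1)
        if max_delay is not None:
            delay = min(delay, max_delay)
        yield delay
        n += 1


def retry_with_backoff(action, f, k=2, max_tries=5, sleep=time.sleep):
    """Call action() until it succeeds or max_tries attempts have failed.

    Returns (succeeded, waits), where waits lists the delays actually used.
    The sleep argument is injectable so the schedule can be tested quickly.
    """
    waits = []
    delays = backoff_delays(f, k)
    for attempt in range(1, max_tries + 1):
        try:
            action()
            return True, waits
        except Exception:
            if attempt == max_tries:
                break  # give up; no point waiting after the last try
            delay = next(delays)
            waits.append(delay)
            sleep(delay)
    return False, waits
```

With f = 300 and k = 2, the first four values from backoff_delays(300) are 300, 600, 1200 and 2400 seconds, i.e. the 5, 10, 20 and 40 minute waits listed above. For restarting an OS component one would presumably pass f in the millisecond range instead, e.g. f=0.001.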
