A survey on kernel quality
Posted Jul 10, 2006 12:09 UTC (Mon) by MathFox
In reply to: A survey on kernel quality
Parent article: A survey on kernel quality
As a developer, I once wrote a device driver for a non-FOSS OS. From that experience I can say that these kinds of "random lockups" are hard to debug: the bugs are usually timing-sensitive, and adding debug statements to the code changes the timing enough to make the bug go away. Furthermore, there is no easy way to get information out of the computer once the kernel hangs.
The best way to make progress here is to find a workload that makes the bug easy to reproduce (say, occurring about once a day) and to instrument the computer with "bus snooping" hardware (logic analysers, etc.) that can give you a log of the activity in the milliseconds before the crash.
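To illustrate the "log of the activity just before the crash" idea, here is a rough software analogy in Python. It is only a sketch: the names CaptureBuffer, run_workload and hang_at are made up for this example, the "bus transactions" are simulated, and a real logic analyser does this capture in hardware without touching the timing of the machine under test.

```python
from collections import deque


class CaptureBuffer:
    """Hypothetical stand-in for a capture device's fixed-size sample memory."""

    def __init__(self, depth):
        # A deque with maxlen drops the oldest sample once it is full,
        # so only the most recent `depth` samples are ever kept.
        self._samples = deque(maxlen=depth)

    def record(self, timestamp_us, event):
        self._samples.append((timestamp_us, event))

    def dump(self):
        # Oldest sample first, newest last.
        return list(self._samples)


def run_workload(buf, n_events, hang_at):
    """Simulate bus activity; return True if the simulated hang was reached."""
    for t in range(n_events):
        buf.record(t, f"bus transaction {t}")
        if t == hang_at:  # assumed hang point, purely for illustration
            return True
    return False


buf = CaptureBuffer(depth=4)
hung = run_workload(buf, n_events=1000, hang_at=500)
trace = buf.dump()  # the last 4 samples leading up to the simulated hang
```

The point of the fixed depth is that the recorder can run for a whole day while waiting for the bug, and what is left afterwards is just the short window before the hang.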
N.B. These kinds of Heisenbugs are influenced by any attempt to pin them down; some species can reliably detect hardware probes.