How about a distro-provided bisection facility?
Posted Apr 15, 2008 21:58 UTC (Tue) by jd
In reply to: How about a distro-provided bisection facility?
Parent article: Bisection divides users and developers
Some distros (Red Hat and SuSE spring to mind) are big enough that some (but not all) bisecting could actually be done automagically on a server at the distro's HQ. I'm picturing something like this:
- Reasonably tech-savvy user finds repeatable kernel bug or regression
- Said user is able to produce a script of the sequence of events that leads up to the bug, plus the test that establishes the presence of the bug or regression
- The script is handed off to a virtual machine at the distro HQ, along with the .config file
- The script is validated by a human, to prevent accidental or deliberate DoS
- The VM builds a test kernel, applies the script and checks against the test
- If the test shows the bug is repeatable on the distro's hardware, the VM uses bisection and the prior step to automatically locate the bug
- If the bug is in distro-supplied or distro-modified patches, the bug report goes to the distro, otherwise it's handed off to the kernel developers
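To make the last two steps concrete, here is a rough Python sketch of what the VM would be doing. It is only an illustration under some loud assumptions: the commit history is linear and ordered oldest to newest, the first entry is known good and the last known bad, and the bug stays present once introduced. The names `first_bad_commit`, `route_report`, `is_bad` and `distro_patches` are all made up for this example; `is_bad` stands in for "build the kernel at this commit, run the user's script, check against the test".

```python
from typing import Callable, Sequence, Set


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    bug persists in every commit after the one that introduced it.
    """
    good, bad = 0, len(commits) - 1
    # Invariant: commits[good] is good, commits[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(commits[mid]):
            bad = mid   # culprit is at mid or earlier
        else:
            good = mid  # culprit is after mid
    return commits[bad]


def route_report(culprit: str, distro_patches: Set[str]) -> str:
    """Decide who receives the report (hypothetical routing rule)."""
    return "distro" if culprit in distro_patches else "upstream"


if __name__ == "__main__":
    # Fake 16-commit history; pretend the bug was introduced in c11.
    history = [f"c{i}" for i in range(16)]
    tested = []

    def fake_test(commit: str) -> bool:
        tested.append(commit)
        return int(commit[1:]) >= 11

    culprit = first_bad_commit(history, fake_test)
    print(culprit, route_report(culprit, {"c11"}), len(tested))
```

The point of the sketch is the cost: each call to `is_bad` is a full kernel build plus a test run, but only about log2(N) of them are needed for N commits, which is why this is cheap for a build farm and painful for a user on a slow machine.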
This method has several advantages. Firstly, if the bug can be easily repeated, it moves the heavy lifting from users to people who (usually) have more powerful hardware at their disposal. Secondly, by distinguishing hardware-specific and hardware-agnostic bugs, there is automatically more information available for debugging. Thirdly, you really want to get to the final destination of having a way of reporting and filtering bug reports that maximizes both the quantity and quality of what kernel developers get, which means the manual parts have to be minimal and reducible by automation.
It also has several disadvantages. More users can bisect than can produce an automatable test plan. It's far harder for an automated system to eliminate non-identical reports that are of the same bug and carry no additional information. Too many automated bug reports may lead to developers ignoring them - and a bug in the bug reporter itself certainly would. So few distros can afford the hardware that would be required to do this well that it would have limited benefit. By necessarily using such high-end hardware, as opposed to what users are likely to have, a lot of hardware-related (and almost all hardware-specific) bugs - which, between the two, will account for a sizeable fraction of all bugs - cannot be automatically bisected on a remote machine. Automated reporting systems cannot answer additional kernel developer questions or carry out additional testing on the developers' behalf.
Ultimately, the question becomes one of how to get the most results from the most testing, given that testing is something programmers generally avoid if possible and the users most likely to do something funky enough to cause a crash are the ones who don't know what they're doing. The semi-automated method above won't solve that last one, though.