
Kernel Summit 2007 - an advance view


Posted Aug 26, 2007 12:30 UTC (Sun) by error27 (subscriber, #8346)
Parent article: Kernel Summit 2007 - an advance view

I don't think slowing down is a good idea. It would just make things languish in git trees for a little longer but it wouldn't result in many fixed bugs.



Kernel Summit 2007 - an advance view

Posted Aug 26, 2007 17:22 UTC (Sun) by mingo (subscriber, #31122)

I don't think slowing down is a good idea. It would just make things languish in git trees for a little longer but it wouldn't result in many fixed bugs.

Yeah - or worse, it would force developers to go elsewhere.

I think the right approach to high-flux development is three-fold:

1) decrease the latency of getting the latest kernel to the user. A 2-3 month release cycle pushes human limits but is doable. 2-3 _days_ is a doable latency for the -git kernel, for bleeding-edge testers: today a fully packaged rpm kernel of the latest -git is yum-upgradable 1-2 days after Linus commits a change into his tree.

2) increase the likelihood that users report bugs back and test fixes. This one has many aspects: good debugging infrastructure (many automated debugging features, automatic bisectability), meaningful (and early enough) debug output, rewards for testers who report back, responsive maintainers, etc. This is where we have the biggest technical deficiencies right now.

3) maintainers must push back on non-robust code, as a function of how stable the _previous_ release was. If the previous release was too unstable, be stricter in the next release. If the previous release was too boring, risk more changes. A 2-3 month delay to a feature is not the end of the world; with a release cycle longer than that, a delay can be lethal to motivation.

The biggest problem is the fundamental lability of this dynamic system: if the kernel gets buggier then users go away, and because of that the kernel gets _even_ buggier (and kernel developers won't even notice initially), then developers go away too, etc. It quickly spirals out of control and detaches the kernel developer community from the user community. That is what happened with Linux 2.5, and it took years to fix up and get Linux 2.6.0 out of the door.

Sticking to the "90 days max" release cycle is key. That is what keeps us all honest. If we mess up then the only true recourse is to release a buggier kernel, and the punishment for that, a "your kernel is buggy" (or "your subsystem is buggy") impression, is much more direct (and unpleasant to the kernel developer) than a "slip in the release date" is. So it's a far more efficient way of feeding the true quality of the kernel back to the developer community, and thus of keeping that quality high enough.

Kernel Summit 2007 - an advance view

Posted Aug 27, 2007 11:03 UTC (Mon) by balbir_singh (subscriber, #34142)

Hi, Ingo,

Good suggestions; I agree with all three. I might add two things:

1. A feature reviewer for every new feature added (as a backup for the
contributor of the feature). That way we ensure that at least three
people understand the code: the contributor, the feature reviewer
and the maintainer. Currently, the maintainer is burdened with too
many features to handle.
2. I like the way Boost handles its formal review process
(http://www.boost.org/more/formal_review_process.htm)

Kernel Summit 2007 - an advance view

Posted Aug 28, 2007 11:53 UTC (Tue) by pointwood (subscriber, #2814)

=== CUT ===
2) increase the likelihood that users report bugs back and test fixes. This one has many aspects: good debugging infrastructure (many automated debugging features, automatic bisectability), meaningful (and early enough) debug output, rewards for testers who report back, responsive maintainers, etc. This is where we have the biggest technical deficiencies right now.
=== CUT ===

I'm just a mere user, but I would like to help out if/where I can. I have no idea how to test a new kernel on my systems, but if it were made easy (LiveCDs?) and/or there were a guide that provided me with relevant info (what to report back, etc.), then I'd be happy to test new kernels. I bet much of it can be automated. Create a mailing list for announcing new kernels to be tested. I think that would get you a lot more testers and get the kernel tested on a lot more hardware.

Kernel Summit 2007 - an advance view

Posted Aug 30, 2007 10:53 UTC (Thu) by intgr (subscriber, #39733)

There indeed appears to be no documentation on kernel testing for
non-developers; Linus is always asking for more testers of release
candidates, but it is often not straightforward without guidance on getting
the kernel installed and reporting problems.

Publishing LiveCDs for every release candidate would really be a waste,
and I am not aware of any. If compiling it all manually is over your head,
don't bother -- debugging and reporting problems also requires quite a bit
of technical know-how which you are unlikely to be prepared for.

This is the "short" guide to testing kernels: follow Linus's release
candidates; kerneltrap.org often reports -rc1 releases earlier than LWN
because it's not bound by the weekly schedule, but LWN is more consistent
-- the latest kernel release candidate is always reported on the kernel
page.

Preferably you would use git because it allows regressions to be bisected
to the patch that caused the regression:
http://www.kernel.org/pub/software/scm/git/docs/v1.4.4.4/...

If that's too complicated for you, get tarballs from www.kernel.org; build
the kernel (look for a howto/tutorial for your distro and take your time,
configuration might take a while for the first time); use it for everything
you normally use your computer for. When problems occur, write to LKML
(linux-kernel@vger.kernel.org) with the details of your hardware, your
configuration, any suspicious messages in dmesg, and what could have
caused the problem. Be sure to mention that it's a regression, and the
last kernel version that worked. More details at:
http://www.kernel.org/pub/linux/docs/lkml/reporting-bugs....
If you can be bothered, use the git-bisect feature to track down the
erroneous commit (see the link about git above).
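
For regressions that can be checked with a command, the bisection can be
scripted. This is only a rough sketch: the function name bisect_regression
is made up for illustration, and it assumes a reasonably recent git that
provides "git bisect run".

```shell
#!/bin/sh
# Hypothetical wrapper around git bisect (the name is illustrative only).
# $1: last known-good revision, $2: first known-bad revision,
# remaining arguments: a test command that exits 0 on a good revision
# and non-zero (other than 125) on a bad one.
bisect_regression() {
    good=$1
    bad=$2
    shift 2
    git bisect start "$bad" "$good" >/dev/null || return 1
    if git bisect run "$@" >/dev/null; then
        # After a successful run, refs/bisect/bad names the first bad commit.
        git rev-parse refs/bisect/bad
        status=0
    else
        status=1
    fi
    # Return the working tree to where it was before bisecting.
    git bisect reset >/dev/null 2>&1
    return "$status"
}
```

A problem that only shows up after booting the new kernel cannot be tested
by a command like this. In that case run "git bisect start", mark revisions
with "git bisect good" and "git bisect bad" by hand after each build and
boot, and finish with "git bisect reset".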

Even if the kernel runs fine, keep your eye on dmesg for OOPSes -- these
are definitely bugs regardless of whether they cause you problems.
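
A minimal sketch of such a check, assuming a POSIX shell with grep. The
function name scan_kernel_log and the pattern list are illustrative; the
patterns are common markers, not a complete list of kernel failure
messages.

```shell
#!/bin/sh
# Hypothetical helper: print log lines that commonly accompany kernel problems.
# Reads the kernel log on stdin, so it works on live dmesg output or a saved file.
scan_kernel_log() {
    # -n prefixes each match with its line number; -E enables alternation.
    # "|| true" keeps the exit status zero when nothing matches.
    grep -n -E 'Oops|BUG:|WARNING:|Call Trace' || true
}
```

It could be used as "dmesg | scan_kernel_log"; empty output means none of
the listed markers were found.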

If there is no release candidate, you can try running the -mm kernel tree;
it contains experimental patches, many of which are probably going to be
merged into the mainline sooner or later. Be aware, though, that it is more
likely to have serious bugs.

Kernel Summit 2007 - an advance view

Posted Aug 30, 2007 18:57 UTC (Thu) by aegl (subscriber, #37581)

"Even if the kernel runs fine, keep your eye on dmesg for OOPSes -- these
are definitely bugs regardless of whether they cause you problems."

If you are regularly testing development kernels, it is really useful to save the dmesg(1) output from every kernel that you boot. I added this to my /etc/rc.local:

# Save this boot's kernel messages, named by timestamp and kernel release
REL=$(uname -r)
TSTAMP=$(date +%Y%m%d%H%M%S)
dmesg -s 100000 > "/dmesg/$TSTAMP-$REL"

Then when I notice something odd happening I can check whether anything new and interesting showed up in the boot messages.

I've spotted a lot of issues by simply running diff(1) on the current and previous dmesg output.
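
That comparison could be scripted roughly as follows. This is a sketch
only: the function name diff_last_boots is made up, and it assumes the
logs are named TIMESTAMP-RELEASE as in the rc.local snippet above, so that
a plain sort puts them in chronological order.

```shell
#!/bin/sh
# Hypothetical helper: diff the two most recent saved dmesg files.
# $1: directory holding the saved logs (defaults to /dmesg).
diff_last_boots() {
    dir=${1:-/dmesg}
    # File names start with a timestamp, so the last two sorted names
    # are the previous boot and the current boot.
    prev=$(ls "$dir" | sort | tail -n 2 | head -n 1)
    curr=$(ls "$dir" | sort | tail -n 1)
    if [ -z "$prev" ] || [ "$prev" = "$curr" ]; then
        echo "need at least two saved logs in $dir" >&2
        return 1
    fi
    # diff exits with status 1 when the files differ; that is expected here.
    diff "$dir/$prev" "$dir/$curr"
}
```

Lines marked ">" in the output are new in the current boot. If the kernel
prefixes messages with timestamps, nearly every line will differ, so the
timestamps would have to be stripped before comparing.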

Kernel Summit 2007 - an advance view

Posted Sep 1, 2007 14:12 UTC (Sat) by kreutzm (subscriber, #4700)

Or use a tool to scan your log files (like logcheck). When I try out a new kernel, I immediately get all the new lines, which I can then add to the logcheck rules once I have acknowledged them (and possibly taken other action).

Kernel Summit 2007 - an advance view

Posted Aug 31, 2007 1:41 UTC (Fri) by rddunlap (guest, #27065)

Have you seen the "Linux Kernel Tester's Guide"?

http://www.stardust.webpages.pl/files/handbook/handbook-e...
or just
http://www.stardust.webpages.pl/files/handbook/handbook-e... for
the non-rc1 version.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds