
The state of the e1000e bug

Posted Oct 6, 2008 7:31 UTC (Mon) by jzbiciak (✭ supporter ✭, #5246)
Parent article: The state of the e1000e bug

I'm not a git user, nor am I a kernel developer, but this caught my eye:

[E]ven if this bug is fixed tomorrow, it will be present in most of the 2.6.27 history. Anybody bisecting the kernel in an attempt to track down an unrelated bug risks being bitten by a zombie version of the e1000e bug. There may be no way to deal with that threat other than the posting of some big warnings.

It seems like it would be useful to have a git bisect mode that allowed you to pin some changesets while otherwise warping you to the next kernel in a bisection sequence. In other words, you want to get "version XXX plus these N changesets." That seems like it might be a generically useful facility. It also seems like it'd let people bisect to find other bugs while holding certain things, such as e1000e, in the present.

Does git provide for such a thing?



The state of the e1000e bug

Posted Oct 19, 2008 1:35 UTC (Sun) by Duncan (guest, #6647)

While you've almost certainly stopped watching for replies, perhaps this
comment will help someone else who comes across it later, perhaps via a
Google search...

Fortunately I don't have an e1000e NIC so this particular bug hasn't been
a problem here (and it has by the time I write this been traced to the
ftrace framework and fixed properly), but I did have a bug with this
kernel and used git bisect on it, so the question does pertain, and would
have been of immediate interest if I did have the hardware.

In general, it's quite possible to revert any specific commit or set of
commits while doing a bisect or otherwise testing with git. However,
there are a couple of problems with doing it that way.

One, it's quite possible that additional commits will have been built upon
the problem commit, so one could well end up reverting a decent-sized
swath of commits, to the point where one couldn't really be said to be
testing the kernel at that particular point anyway, potentially
invalidating any conclusions the testing may come to. Perhaps not (indeed,
probably not), but it's certainly a complicating consideration.

Two, as hindsight shows now that this bug has been traced, the bug can be
in an area entirely unrelated (except via the bug) to where it actually
shows up. (It's sort of like a leaky roof: the hole in the roof may be
several meters away from where the water drips through the ceiling!) In
this case, it was a bug in the new ftrace functionality, coupled with
removing modules, that was eventually found to cause the problem. ftrace
has been disabled in 2.6.27.1, but the point is that until the problem is
fully traced, there's no guarantee that one would pick the correct commits
to revert while bisecting in any case. I've no idea how long ago the last
e1000e commits were, but supposing they happened in this kernel, the
instinctive thing to do would be to revert all of them while doing the
bisect. That wouldn't have helped in this case, and there was no way of
knowing until later what /would/ help, since the ftrace stuff was
otherwise entirely unrelated.

Thus a bisect with the supposedly offending changes reverted might both
lead to the wrong conclusions and fail to remove the danger of bricking
the hardware in any case.
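
For anyone who wants to try the "bisect while holding some changesets
fixed" idea anyway, here is a rough sketch of a single test step that
could be handed to "git bisect run". It is an illustration under stated
assumptions, not a tested kernel workflow: the function name bisect_step,
the pinned commit list, and the test command are all hypothetical, and for
a kernel the "test" would really be a build plus a boot, which is hard to
automate. It relies only on "git cherry-pick --no-commit", "git reset
--hard", and the exit codes "git bisect run" understands (0 for good, 125
for "cannot test, skip", other codes from 1 to 127 for bad). A pinned
commit that no longer applies cleanly is reported as a skip, which is
problem "One" above showing up in practice.

```python
import subprocess

SKIP = 125  # exit code telling `git bisect run` the commit is untestable


def bisect_step(pinned, test_cmd, run=subprocess.call):
    """Test the checked-out commit with the `pinned` commits applied on top.

    `pinned` and `test_cmd` are placeholders supplied by the caller.
    Returns 0 (good), 1 (bad), or 125 (skip) for `git bisect run`.
    """
    try:
        for commit in pinned:
            # Apply the change to the working tree without committing it.
            if run(["git", "cherry-pick", "--no-commit", commit]) != 0:
                return SKIP  # pinned change does not apply here
        # Collapse any test failure to 1 so the result always reads as "bad".
        return 0 if run(test_cmd) == 0 else 1
    finally:
        # Drop the pinned changes so bisect can check out the next commit.
        run(["git", "reset", "--hard"])
```

Saved in a script that ends by exiting with the value of bisect_step(...),
this could be driven by "git bisect start <bad> <good>" followed by "git
bisect run" on that script. To revert an offending commit rather than pin
a fix, "git revert --no-commit" could be substituted for the cherry-pick.
None of this answers problem "Two": it only helps once one knows which
commits to pin or revert.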

Unfortunately, the fact remains that testing unreleased kernels is risky.
Indeed, conservative folks will likely want to stay a full kernel release
back, not installing 2.6.26 until 2.6.27 is out at least, and only then
installing whatever happens to be the latest 2.6.26.x stable release.
Even distribution kernels were bitten by this, although obviously it was
just the most bleeding-edge ones, the ones shipped as -rc testing for
those willing to risk their machines and try it, and this -rc series DID
point out the very literal meaning of the "risk their machines" bit. It's
certainly not for everyone, but as one who does run -rc kernels (though
only from -rc3 or so) myself, I find it can be rewarding too -- there's
nothing quite like the feeling of being able to point to a particular -rc
bug and say "but for me, that may have made it to release, I played my
part in making this kernel a good one", especially for folks (like me)
that might do sysadmin level bash scripts, but little more.

Duncan

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds