The source of the e1000e corruption bug

Posted Oct 23, 2008 2:26 UTC (Thu) by modernjazz (guest, #4185)
Parent article: The source of the e1000e corruption bug

There seem to be other bugs that were fixed by disabling CONFIG_DYNAMIC_FTRACE: see, e.g.,
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/263059
It's interesting that this was discovered by studying what might be the scariest case (bricking the hardware), rather than the much "easier" case of hangs on boot. It goes to show that intense motivation can overcome a lot of the barriers of inconvenience!



The source of the e1000e corruption bug

Posted Oct 23, 2008 3:31 UTC (Thu) by nevets (subscriber, #11875)

Another issue was that ftrace was not a suspect at the time. I corrected any bugs that were passed on to me.

We were designing a new (more robust) version of ftrace in the linux-tip tree. This new version does not have the problems that the old version (in 2.6.27) had. But since the new version was a new design, we held off pushing it to Linus.

Unfortunately, none of our testing of the old design showed any of these issues. It took going out to a larger audience to have them appear.

The source of the e1000e corruption bug

Posted Oct 23, 2008 9:03 UTC (Thu) by alonz (subscriber, #815)

Wouldn't it be better to simply dump the entire contents of the mcount buffer whenever any code is unmapped, instead of just disabling this (useful) optimization in a kernel that is likely to have a long life?

The source of the e1000e corruption bug

Posted Oct 23, 2008 12:11 UTC (Thu) by nevets (subscriber, #11875)

> Wouldn't it be better to simply dump the entire contents of the mcount buffer whenever any code is unmapped, instead of just disabling this (useful) optimization in a kernel that is likely to have a long life?

From a safety point of view, no. Anything other than disabling it was unacceptable in the stable release. If we had found a simple bug (an off-by-one or an out-of-bounds array access), we could have fixed it. But the bug was a design issue (and the design has changed in 2.6.28).

How would we know for sure that we got every place that kernel text was freed? How would we know that we don't add more bugs with this "dump the mcount on release"?

Now if you would like to have dynamic ftrace in 2.6.27, it would not be hard for me to port the new design. I've already ported it to 2.6.24-rt. Just do not expect this backport to show up in the stable branch.
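
As a rough illustration of the trade-off discussed in this thread, here is a toy Python model, not kernel code. Every name in it (ToyTracer, record_call_site, free_text, patch_pending) is hypothetical and invented for the sketch; it only assumes that call sites are recorded in a buffer and patched later, as the comments above imply. It shows why dumping the buffer helps only if every path that frees text remembers to do it.

```python
# Toy model only: these names are hypothetical, not kernel APIs.

class ToyTracer:
    def __init__(self):
        self.pending = set()    # recorded call-site addresses awaiting patching
        self.live_text = set()  # addresses that currently hold code

    def load_text(self, start, end):
        """Mark [start, end) as holding code."""
        self.live_text.update(range(start, end))

    def record_call_site(self, addr):
        """Remember a call site so it can be patched later."""
        self.pending.add(addr)

    def free_text(self, start, end, dump_buffer=True):
        """Release [start, end). With dump_buffer=True, drop the whole buffer,
        as proposed in the thread; dump_buffer=False models a missed free path."""
        self.live_text.difference_update(range(start, end))
        if dump_buffer:
            self.pending.clear()

    def patch_pending(self):
        """Return (patched, stale); stale sites no longer hold code."""
        patched = sorted(a for a in self.pending if a in self.live_text)
        stale = sorted(a for a in self.pending if a not in self.live_text)
        self.pending.clear()
        return patched, stale


def demo(dump_buffer):
    t = ToyTracer()
    t.load_text(0x1000, 0x1100)   # long-lived code
    t.load_text(0x2000, 0x2100)   # code that will be freed
    t.record_call_site(0x1010)
    t.record_call_site(0x2020)
    t.free_text(0x2000, 0x2100, dump_buffer=dump_buffer)
    return t.patch_pending()
```

In this model, demo(True) leaves nothing stale but also throws away the still-valid record for 0x1010, while demo(False) leaves 0x2020 as a stale address that a later patching pass would write to. A single free path that skips the dump is enough to reintroduce the problem, which is the concern raised above.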

