
Coverity's kernel code quality study

Coverity's kernel code quality study

Posted Dec 14, 2004 18:14 UTC (Tue) by southey (guest, #9466)
Parent article: Coverity's kernel code quality study

Note that 54% are in device drivers and 41% are due to NULL pointer dereferences. So Linux probably has even better quality than it first appears.



Coverity's kernel code quality study

Posted Dec 14, 2004 18:21 UTC (Tue) by MathFox (guest, #6104) [Link] (5 responses)

My take on the story is:
After four years of pestering the Linux coders with "Stanford code checker" reports, we hardly find any bugs with our code checker (which is the rebadged Stanford checker).
The number Coverity presents cannot be compared with the number of bugs found in other projects before Coverity's fixes.

Bugs per kLOC

Posted Dec 14, 2004 19:47 UTC (Tue) by man_ls (guest, #15091) [Link] (1 responses)

The number Coverity presents cannot be compared with the number of bugs found in other projects before Coverity's fixes.
Well said. Bugs per thousand lines of code (kLOC) can only be evaluated as a relative number, since we cannot know:
  • if blank lines of code are counted,
  • if comments are counted either,
  • if coding style matters (lone '{'s or '}'s)...
Otherwise, we can only suppose that it is non-blank, non-comment lines of code that we are counting (the usual industry standard), and play with broad estimates, which I will presently do for the fun of it.
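To make the counting convention concrete, here is a minimal sketch of what "non-blank, non-comment lines" might mean for C source. This is my own illustration, not Coverity's method; the regex-based comment stripping is deliberately naive (it would mishandle comment markers inside string literals, for instance):

```python
import re

def count_sloc(source: str) -> int:
    """Count non-blank, non-comment lines of C-like source.

    Naive sketch: strip /* ... */ block comments and // line
    comments, then count lines that still contain non-whitespace.
    Does not handle comment markers inside string literals.
    """
    no_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    no_line = re.sub(r"//.*", "", no_block)
    return sum(1 for line in no_line.splitlines() if line.strip())

example = """
/* a header comment */
int main(void) {
    return 0;  // exit status
}
"""
print(count_sloc(example))  # 3: the declaration, the return, the brace
```

Even this toy version shows why raw kLOC numbers are slippery: whether a lone `}` counts as a line is purely a coding-style question, yet it moves the denominator.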

The figure given by Carnegie Mellon University, 20 or 30 bugs per kLOC, is definitely not for released software, but probably for written software before any testing happens. After release, the number would rather be 1 to 5 bugs per kLOC in commercial software. For mission-critical code, the count can be as low as 0.1 bugs per kLOC (as in Shuttle software), depending on criticality and budget. Project size is also a factor.

Of course the rate in Linux is lower than in "commercial enterprise software"; an operating system kernel arguably is mission-critical software. 0.17 bugs per kLOC looks like a lot, even if those bugs are in device drivers, or especially then since they can take down the whole system, corrupt data, etc. (I remember estimates for w2k were 2 bugs per kLOC after release, but that includes the whole operating system, not just the kernel.)

But there is more. Nobody would expect that, after fixing the 985 bugs, Linux would magically become error-free. So 0.17 bugs per kLOC must be a lower-bound estimate; the real figure will be higher.
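As a quick sanity check on the arithmetic (the 985 bugs and 0.17 bugs/kLOC figures come from the study as quoted above; the implied code-base size is my own derivation):

```python
bugs = 985             # defects reported in the Coverity study
rate = 0.17            # bugs per kLOC, as quoted above
kloc = bugs / rate     # implied size of the scanned code base
print(round(kloc))     # 5794 kLOC, i.e. roughly 5.8 million lines
```

The two published numbers are at least mutually consistent with a kernel tree of several million lines, which matches the size of 2.6-era kernels.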

All in all, a poor press release with not much real value, but great promotion for the Stanford Code Checker.

Bugs per kLOC

Posted Dec 14, 2004 23:02 UTC (Tue) by hppnq (guest, #14462) [Link]

I assume the Coverity program has a proper parser that allows at least a proper comparison of the actual number of lines of code it has inspected. It must be much harder, for instance, to compare the complexity (which is inevitably related to the number of bugs found, I would say) of two programs.

But I fully agree with you that the press release reads more like a promotional flyer, which is a bit strange considering these people must know the tricks of the scientific trade.

Coverity's kernel code quality study

Posted Dec 14, 2004 23:58 UTC (Tue) by brouhaha (subscriber, #1698) [Link] (2 responses)

The number Coverity presents cannot be compared with the number of bugs found in other projects before Coverity's fixes.
Certainly it CAN be compared. If you were going to choose an OS today, would you decide to ignore the higher bug count of <some proprietary os> just because the owner hasn't had those Coverity fixes but might get them (or the equivalent) in the future?

Coverity's kernel code quality study

Posted Dec 15, 2004 0:42 UTC (Wed) by MathFox (guest, #6104) [Link] (1 responses)

You should realise that any (automated or manual) bug-checking process only finds a subset of the bugs that exist in a program. There ain't no silver bullet! The Stanford/Coverity bug checker will only find some of the bugs and be blind to the others.
Running an automated checker on a program will find a lot of bugs in the first run... but it will not find the bugs that hide in the blind spot of the checker. You can fix the bugs that the scanner finds and rerun it; but it cannot help you with the bugs that it is blind to. So there will always be an unknown(!) number of bugs left after a perfect scan.

It makes more sense to compare the numbers of bugs found on the first run of the scanner between different projects, or the total number of bugs spotted by the scanner, instead of the number of <residual> bugs that the scanner finds after several iterations of bug fixing.

The holy grail in software testing is knowing the exact number of bugs in the product. :)

Coverity's kernel code quality study

Posted Dec 15, 2004 0:48 UTC (Wed) by emkey (guest, #144) [Link]

Shouldn't the holy grail be knowing exactly where all those bugs are? :-)


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds