Coverity: one bug fixed every six minutes
"In seven days, the defect density for 32 open source projects analyzed dropped from 0.434 defects per thousand lines of code to 0.371 defects. Samba, a widely used open source project used to connect Linux and Windows networks, showed the fastest developer response, reducing software defects in Samba from 216 to 18 in the first seven days."
Posted Apr 3, 2006 16:07 UTC (Mon)
by JoeBuck (subscriber, #2330)
[Link]
Posted Apr 3, 2006 16:32 UTC (Mon)
by bk (guest, #25617)
[Link] (30 responses)
Posted Apr 3, 2006 16:42 UTC (Mon)
by vondo (guest, #256)
[Link] (3 responses)
This is less of a problem than BitKeeper. It would still be nice to see an open source service with similar functionality, though.
Posted Apr 5, 2006 7:30 UTC (Wed)
by mingo (guest, #31122)
[Link] (2 responses)
1) there's less incentive to develop them because Coverity fixes all the bugs
2) there's less incentive for upstream maintainers to accept debugging frameworks into the kernel codebase ('why this hassle, it doesn't detect many bugs' [but only because Coverity detected them already])
3) Coverity might be building up IP that it can use against free debugging tools later on.
there's a false perception of how 'healthy' the Linux development process is. Once Coverity goes away, things might deteriorate quickly, without any quick replacement. Bugs aren't a one-time thing - they get introduced and fixed all the time. So the bugfixing methodology (and tools) need to be open-source just as much as all the other development tools need to be open-source.
Coverity already tried to attach proprietary strings to their bug reports: some sort of EULA that forbids the use of these bug reports for the development of 'competitors'. That move definitely had a BitKeeper flavor. This requirement was removed for the Linux kernel bug reports, but how about other free projects?
Posted Apr 6, 2006 11:24 UTC (Thu)
by lacostej (guest, #2760)
[Link] (1 responses)
Posted Apr 6, 2006 11:57 UTC (Thu)
by mingo (guest, #31122)
[Link]
Posted Apr 3, 2006 16:46 UTC (Mon)
by anLWNreader (guest, #36915)
[Link] (17 responses)
However, unlike with BitKeeper, if and when they eventually disappear it will not hurt anybody. Open-source static analysis tools have been available for a long time (splint and uno come to mind); it's just that they were not used much. The reason is that the class of bugs they can detect is very limited, and this applies to Coverity's tool too.
Posted Apr 3, 2006 17:15 UTC (Mon)
by AnswerGuy (guest, #1256)
[Link] (16 responses)
I recall that Coverity evolved out of the Stanford Checker, which used a modified version of gcc (called xgcc). Of course Engler and his team never distributed any derivative of their work. Ergo they have never been obliged to release their sources. However, I think it's high time a group in the open source community undertook a similar approach.
There are limits to what can be accomplished by static code analysis. But the yields are low-hanging fruit, which should be plucked as efficiently as possible so we can leave our best and brightest minds free to focus on more interesting problems.
JimD
Posted Apr 3, 2006 17:29 UTC (Mon)
by halla (subscriber, #14185)
[Link]
Posted Apr 3, 2006 17:36 UTC (Mon)
by madscientist (subscriber, #16861)
[Link]
Posted Apr 3, 2006 21:43 UTC (Mon)
by iabervon (subscriber, #722)
[Link] (4 responses)
The real issue is that checking is only useful with a precise definition of what would be wrong. If it carefully does something which isn't actually what it's supposed to do, the static checker can't tell that that wasn't what it was supposed to do. For example, I've just had bugs where it was printing the wrong values for labels on a chart, and using the wrong string length for deciding how many labels to have. It would be impractical to write a checker which could identify that my generated chart shows things wrong.
Posted Apr 3, 2006 23:53 UTC (Mon)
by nix (subscriber, #2304)
[Link] (3 responses)
Posted Apr 4, 2006 1:02 UTC (Tue)
by iabervon (subscriber, #722)
[Link] (2 responses)
Rice's theorem is a problem for compiler optimization, where you don't care whether the code is at all sane; you want to generate code that works regardless. And it's an issue for deciding whether to accept code to run, where you want to run any requested code which doesn't break any rules.
But a static checker doesn't have to worry too much about Rice's theorem (or decidability in general), because you want the code to not just be correct, but obviously correct. And that's a bit vague, but a finite limit on the complexity of the analysis that should be attempted to prove correctness is certainly appropriate.
Posted Apr 4, 2006 19:30 UTC (Tue)
by nix (subscriber, #2304)
[Link] (1 responses)
I wish this wasn't true. I want a better universe: this one's broken.
Posted Apr 4, 2006 20:15 UTC (Tue)
by iabervon (subscriber, #722)
[Link]
And, of course, a static checker is only really useful when the number of cases where it fails to complete the analysis, but people can tell the code is okay, is low enough that it can usefully flag everything it can't prove as a flaw.
Posted Apr 3, 2006 23:30 UTC (Mon)
by cventers (guest, #31465)
[Link] (2 responses)
Posted Apr 4, 2006 0:20 UTC (Tue)
by dlang (guest, #313)
[Link] (1 responses)
Posted Apr 4, 2006 0:44 UTC (Tue)
by jtc (guest, #6246)
[Link]
That's right! To elaborate: DBC is not a run-time tool, although some toolsets that support DBC provide useful run-time checking facilities (checking assertions, preconditions, etc. at run time). The main point of DBC is to document precise specifications for interfaces, which allow clients (programmers using the specifications) to use the interfaces correctly and make it easier to discover defects in software that uses those interfaces (whether by testing or inspection).
Run-time checking of assertions certainly is useful, but often must be turned off in production systems for efficiency.
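As a rough illustration of the three assertion forms in plain C (a hypothetical sketch, not any particular DBC toolset; the macro and function names are made up): contract clauses can be written as thin wrappers over assert(), so a production build compiled with -DNDEBUG pays nothing for them.

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical contract macros: thin wrappers over assert(), so a
     * release build compiled with -DNDEBUG removes every check. */
    #define REQUIRE(cond)   assert(cond)   /* precondition  */
    #define ENSURE(cond)    assert(cond)   /* postcondition */
    #define INVARIANT(cond) assert(cond)   /* invariant     */

    struct stack {
        int    *items;
        size_t  count;
        size_t  capacity;
    };

    /* Invariant that must hold before and after every stack operation. */
    static void stack_check(const struct stack *s)
    {
        INVARIANT(s != NULL);
        INVARIANT(s->count <= s->capacity);
        INVARIANT(s->capacity == 0 || s->items != NULL);
    }

    static void stack_push(struct stack *s, int value)
    {
        stack_check(s);
        REQUIRE(s->count < s->capacity);         /* caller must leave room */

        s->items[s->count++] = value;

        ENSURE(s->items[s->count - 1] == value); /* postcondition */
        stack_check(s);
    }

Even with the checks compiled out, the REQUIRE/ENSURE lines still serve as the precise interface documentation described above.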
Posted Apr 4, 2006 0:34 UTC (Tue)
by jtc (guest, #6246)
[Link]
AKA design by contract (DBC):
http://en.wikipedia.org/wiki/Design_by_contract
Posted Apr 4, 2006 9:06 UTC (Tue)
by eru (subscriber, #2753)
[Link] (4 responses)
IMHO C would benefit most from some minor language changes that would remove the most commonly recurring idiotic mistakes at the time the code is first compiled! At the top of my list would be:

1) requiring a prototype to be in scope for every function that is called

2) making a wrong or missing return type a hard error

These changes would greatly reduce C bugs without affecting efficiency at all, or making programs any more verbose. Of course existing programs would need modifications, but these could be largely automated.
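For illustration only (this is not the proposed language change itself, just today's nearest equivalent; the file and function names are made up): both kinds of mistake can already be turned into hard errors with ordinary GCC warning options.

    /*
     * bad.c -- deliberately broken, to show the two mistakes listed above.
     * A strict compiler invocation already rejects both, e.g.:
     *
     *   gcc -std=c99 -Werror=implicit-function-declaration \
     *       -Werror=return-type -c bad.c
     */

    int lookup(int key)
    {
        if (key < 0)
            return -1;
        /* control falls off the end here: "wrong or missing return" */
    }

    int main(void)
    {
        frobnicate(42);   /* no prototype in scope: implicit declaration */
        return 0;
    }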
Posted Apr 4, 2006 15:11 UTC (Tue)
by vmole (guest, #111)
[Link] (3 responses)
Umm, "minor changes"? I don't think breaking 99.9%[1] of the existing code base is "minor".
Which isn't to say they aren't good ideas for a language, but you'll never get them added to C, except possibly the required prototype one (which a lot
of compilers can enforce now) and the wrong or missing return type (ditto).
[1] Totally made up statistic. Don't whine.
Posted Apr 5, 2006 5:41 UTC (Wed)
by eru (subscriber, #2753)
[Link] (2 responses)
As I already acknowledged in my last sentence, existing programs would have to be changed (even though most changes could be automated). In that sense the changes are not "minor". But on the other hand the resulting language would still be almost identical to C, it would retain all the good features of C, and programmers would quickly get used to the modified rules.

Getting this into the official C standard is of course hopeless, but it would probably not be too much work to implement it as an option in GCC.
Posted Apr 6, 2006 0:59 UTC (Thu)
by xoddam (guest, #2322)
[Link] (1 responses)
Posted Apr 6, 2006 5:51 UTC (Thu)
by eru (subscriber, #2753)
[Link]
> You mean, you could automatically convert existing buggy
> code into something semantically identical which passes
> all your new-fangled static checking.

You miss my point. I was just addressing backward-compatibility concerns when initially moving a lot of code to the new system. Certainly a conversion of a buggy program is still a buggy program (although the converter should highlight dubious bits in the code for possible corrections). The real value of the proposed new rules would be realized when writing new code or hand-modifying old code.
> splint already supports most of the things you suggest, without
> changes to the syntax of C itself (it uses comments and/or
> macros) or to the semantics of existing code.

So why isn't it used more? Answer: precisely because it is an extra pass, it is not installed everywhere the compiler is, and it requires extra annotations to be really useful. Correctness is not an add-on feature. Programmers should mind it all the time when writing code, without imagining it can be retrofitted with a final lint run, or with testing. Having every compiler run nag about dubious code does more to achieve this. Anyway, my proposed changes are really not so much about adding extra static checking as about removing totally unnecessary error sources from the language.
Posted Apr 3, 2006 17:04 UTC (Mon)
by rst (guest, #5098)
[Link] (2 responses)
(Also, the problem with BK was the abusive conditions in the "free" license. Coverity couldn't pull the same kind of stunts right now even if they wanted to --- since they're not distributing their code at all, they haven't got the lever that the BK license gave Larry).
I'm not saying this relationship couldn't go sour, but it seems a whole lot more pleasant than the BK situation, so far.
Posted Apr 3, 2006 17:22 UTC (Mon)
by dmarti (subscriber, #11625)
[Link] (1 responses)
Posted Apr 4, 2006 12:13 UTC (Tue)
by gowen (guest, #23914)
[Link]
Posted Apr 3, 2006 17:44 UTC (Mon)
by iabervon (subscriber, #722)
[Link]
And, of course, the BitKeeper situation turned out to be a net win for free software, since adopting BitKeeper greatly improved productivity, and adopting git again improved productivity. Alternatively, it convinced Linus that version control could be done better than CVS did it, so it was worth trying to do better than even BitKeeper did it.
Posted Apr 3, 2006 23:33 UTC (Mon)
by cventers (guest, #31465)
[Link] (3 responses)
Posted Apr 4, 2006 13:11 UTC (Tue)
by Wol (subscriber, #4433)
[Link] (1 responses)
The problem was, as Linus himself said, "Linus can't scale".
BitKeeper was needed to reduce the load on Linus. And, fortunately for us, BK solved the problem of how to scale Linus, so when BK went away it was easy to write a new tool to do the same thing. Without BK, we would never have had git, so on balance BK almost certainly was good, even if it did cause a few problems along the way.
Cheers,
Posted Apr 6, 2006 1:03 UTC (Thu)
by xoddam (guest, #2322)
[Link]
Posted Apr 6, 2006 16:55 UTC (Thu)
by JoeBuck (subscriber, #2330)
[Link]
The situation is not analogous to BitKeeper in that there is no lock-in issue.
There may be an issue with patents that Coverity might hold, but as Bruce Perens has explained, it is safest for those of us in the field not to look for those patents (damages triple for "knowing infringement").
Posted Apr 5, 2006 18:24 UTC (Wed)
by mdhirsch (guest, #5924)
[Link]
I wish they wouldn't phrase things this way. I have in-house code that shows zero defects according to Coverity, and I assure you that it's still full of bugs, just not the class of bugs that Coverity detects.
This is starting to remind me of the infamous BitKeeper situation. Coverity will give free software all this "aid", but once they make a sufficient name for themselves (and build a paying client base) they'll disappear, just like McVoy.
Maybe. On the other hand, if they do that, there is no migration pain. The projects just stop benefiting from their efforts. There won't be a switch-over period. In other words, these projects don't rely on Coverity.
Actually, there is migration pain, because Coverity hinders GPL-ed debugging tools (such as Sparse) in several ways:
If such an open source tool existed, it could still be run against old versions of the software to find the errors in them (before they got fixed).
That would be a good way to check its effectiveness.
Nothing beats the experience of finding bugs in the latest and greatest. Just like Coverity needs (and uses) the resulting PR (both small-scale and large-scale) from finding bugs, an open-source source-code-validator project needs similar feedback. Far fewer people will use and rely on an OSS tool that can only find old bugs.
Yes indeed, Coverity is just using the free-software movement to build a name for themselves, but it is fair because in turn they are giving back.
Clearly there are limits to the types of bugs that can be found through static analysis of C programs. (If nothing else, the halting problem and Gödel's theorem are clear indications that no form of analysis can guarantee that any non-trivial code in any "sufficiently powerful" language (or other axiomatic system) is bug free.)
However, we can do far better than our current limits if we also adopt some programming extensions and programming practices that make the job easier. For example C would benefit from much more extensive use of assertions ... and some language features to support static and stochastic simulation tests which incorporate those assertions --- and some features for PBC (programming by contract) --- (ultimately three specific forms of assertions: pre-conditions, invariants and post conditions).
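As a sketch of the "stochastic simulation" idea (hypothetical code, not an existing framework; the routine under test is made up): hammer a contract-checked routine with large numbers of random inputs and let its own assertions do the judging. This only works in a build that keeps assertions enabled, i.e. one compiled without -DNDEBUG.

    #include <assert.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>

    /* Routine under test: clamp v into [lo, hi], with its contract
     * expressed as assertions. */
    static int clamp(int v, int lo, int hi)
    {
        assert(lo <= hi);                        /* precondition  */
        int r = (v < lo) ? lo : (v > hi) ? hi : v;
        assert(r >= lo && r <= hi);              /* postcondition */
        return r;
    }

    int main(void)
    {
        srand((unsigned)time(NULL));
        for (long i = 0; i < 1000000L; i++) {
            int lo = rand() % 1000;
            int hi = lo + rand() % 1000;         /* keeps lo <= hi */
            int v  = rand() - RAND_MAX / 2;      /* roughly centered on 0 */
            (void)clamp(v, lo, hi);              /* a contract bug aborts here */
        }
        printf("one million random cases, no assertion failures\n");
        return 0;
    }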
Indeed, Krita got some very nice bugfixes from someone who was running an analysis tool, and quite a few bugs were found because we use asserts very liberally in Krita.
I realize you never claimed otherwise, directly, but to be clear Coverity no longer uses GCC, modified or not. They are now using a different code parsing front-end, which is proprietary.
The halting problem doesn't really matter for debugging; it means that you can't determine whether a flaw actually could cause problems in practice, but anything that you can't statically analyze in finite time is at least bad form, because other programmers won't be able to tell whether it works. Of course, the system has to be more clever than current systems are in order to make the same determinations that programmers do.
Rice's theorem really *is* a problem, and that's a direct consequence of the halting problem. It pretty much condemns us to either using heuristics or always providing a way to bail out of an optimization that isn't getting anywhere.
Rice's theorem is only a problem if you consider a program okay if it never commits an array bounds violation (for example), even if it's impractical to demonstrate that it won't. But if it takes substantial analysis to determine that the program doesn't have a problem, then that's a bug anyway; at the very least, the next person to touch the code is likely to make a change that makes it misbehave.
Unfortunately, you don't need particularly contrived code to collide with Rice's theorem, in my experience :( Even static checkers will frequently need to say 'oh, I give up' and not warn about it, at least for some classes of test.
I think that's not so much a consequence of Rice's theorem as an effect of not supporting quite all of the sorts of constraints programmers use. A bigger issue is that, even for decidable questions of interest (e.g., is there a proof of the correctness of this code which would fit on a single page?) they're often NP-complete. So you get code with a comment explaining why it's okay, and programmers can check that it's true in polynomial time, but the static checker can't come up with the correct rule in a reasonable time (and, of course, the original programmer came up with the constraint first and then wrote the code to obey it). But that's an issue of feasibility, not decidability.

This can be made practical with annotations and assertions, although these have to be made acceptable to programmers by being sufficiently readable and writable (so the programmer can write them without much trouble, and so other programmers find them helpful in understanding why the code works).
Well... I'm not a big fan of excessive assertions or runtime PBC. There is a point where it's very obnoxious, because you're wasting time on every single operation just to make sure you (the programmer) didn't make a certain mistake.

I'm not totally against error checking; indeed, I think you should vigorously meter anything coming in and out of your program or library. My own code is extremely anal about checking the return value of every system call / library call. It's almost always possible for my program to back out gracefully and continue operating when it encounters, say, a malloc() failure, etc.

I've found in my own experience that if you build your code very anally in this way, you end up with something that is *very* fault tolerant, and if and when it does fail, it fails very close to the bug site (rather than halfway across the app).
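A tiny sketch of that style (the helper name is made up): every fallible call is checked and failure is reported to the caller, so the program fails, or recovers, right where the problem occurs rather than somewhere far downstream.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical helper: duplicate a string, reporting (not hiding)
     * allocation failure so the caller can back out gracefully. */
    static char *dup_string(const char *src)
    {
        size_t len = strlen(src) + 1;
        char *copy = malloc(len);
        if (copy == NULL) {
            fprintf(stderr, "dup_string: out of memory (%zu bytes)\n", len);
            return NULL;              /* caller decides how to recover */
        }
        memcpy(copy, src, len);
        return copy;
    }

    int main(void)
    {
        char *s = dup_string("hello, world");
        if (s == NULL)
            return EXIT_FAILURE;      /* fail close to the fault site */
        puts(s);
        free(s);
        return EXIT_SUCCESS;
    }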
this isn't a runtime tool
"this isn't a runtime tool"Limitations and Evolution
"However, we can do far better than our current limits if we also adopt some programming extensions and programming practices that make the job easier. For example C would benefit from much more extensive use of assertions ... and some language features to support static and stochastic simulation tests which incorporate those assertions --- and some features for PBC (programming by contract) --- (ultimately three specific forms of assertions: pre-conditions, invariants and post conditions)."Limitations and Evolution
> most changes could be automated

Automatically make buggy code pass static checkers! Yay!

You mean, you could automatically convert existing buggy code into something semantically identical which passes all your new-fangled static checking.
>... it would probably not be too much work to implement it
> as an option in GCC.
splint already supports most of the things you suggest, without
changes to the syntax of C itself (it uses comments and/or
macros) or to the semantics of existing code. Running an
extra checker over the source isn't very much different from
enabling a compiler option. And like -Wall and -Werror, you
can gradually fix warnings by hand, then decide on a per-compilation-unit
basis when to enforce them.
http://www.splint.org
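A small sketch of what those stylized comments look like (the function is hypothetical, and exactly what gets reported depends on the checking level splint is run at):

    /*
     * splint_demo.c -- the annotations are ordinary comments, so plain C
     * compilers ignore them.  Check with:  splint splint_demo.c
     */
    #include <stdlib.h>
    #include <string.h>

    /* The annotation says the result may be NULL; splint then complains
     * about callers that dereference it without checking. */
    static /*@null@*/ char *maybe_dup(const char *src)
    {
        char *copy = malloc(strlen(src) + 1);
        if (copy == NULL)
            return NULL;
        strcpy(copy, src);
        return copy;
    }

    int main(void)
    {
        char *s = maybe_dup("hello");
        if (s != NULL) {        /* drop this check and splint flags a
                                   possible NULL dereference below */
            s[0] = 'H';
            free(s);
        }
        return 0;
    }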
As others have noted, Coverity wouldn't do immediate damage to free software projects if they stopped reporting new bugs. But they would do damage to themselves --- they'd lose the free advertising, and the feedback from an interested set of (effectively) beta test users on the quality of the results from whatever new tests they toss in. Which seems at least as much of a loss for them as the loss of reports on new bugs would be for, say, the Linux kernel developers.
I agree -- Coverity could be finding bugs using a patented kitten-shredding machine and it wouldn't matter once they're reported to the project maintainers.
It would matter to me, since I have a patent on 'Method to detect bugs in a compiled computer language using feline evisceration techniques'.
The Coverity thing is somewhat different because they have a government contract to provide this service, and they aren't exposing any of their IP to do so, so they can't go nuts over licensing.
I doubt it. Coverity might stop contributing reports if someone started developing a competitive free software option; otherwise, when running their checker on open source costs virtually no resources and makes them buddy-buddy with everyone, why not keep it up?
BK was a disaster because in order to really work on Linux you had to
swallow BK, and the license you had to agree to in order to do so was
nuts.
I'm a very strong supporter of free software. I think BK was a huge
mistake (though in general, I very much respect Linus's decision-making).
But I have to be honest -- I see absolutely nothing wrong with Coverity,
as it stands today. Though Coverity hasn't scanned any of my code, I still
have to say "thank you" for helping to improve the quality of many free
software projects I know and love.
Plenty of developers managed without BK ...
Wol
> Without BK, we would never have had git, so on balance BK
> almost certainly was good
Yes. Linus stood for a while on the shoulders of a very cantankerous giant who, like scaffolding, proved to be dispensable once the bridge was completed :-)
Any Coverity customer can continue to run the tool on free software and issue bug reports; the Coverity license agreement does not prevent people from communicating to others about the bugs it finds. When Coverity's contract is up, this is probably what will happen (full disclosure: my employer is a Coverity customer).
Several folks have complained about the non-existence of similar Open Source tools. FindBugs (http://findbugs.sourceforge.net/) doesn't compete directly with Coverity because it is for Java rather than C, but it is a great static analysis tool, and it's GPLed.