
Everyone insane or what?

Posted Apr 1, 2026 20:17 UTC (Wed) by pm215 (subscriber, #98099)
In reply to: Everyone insane or what? by pbonzini
Parent article: The role of LLMs in patch review

Yes, human reviewers can have false positives -- but resolving those has the benefit of passing knowledge to that human reviewer. False positives from an automated tool help nobody, they're pure loss.

The original Coverity authors had a paper decades back noting the importance of a low false positive rate -- if you have too many false positives then users will decide your tool isn't worth paying attention to, and are likely also to ignore any genuine problems it flags up.



Everyone insane or what?

Posted Apr 1, 2026 20:27 UTC (Wed) by mb (subscriber, #50428)

>The original Coverity authors had a paper decades back noting the importance of a low false positive rate

What was the number?

Everyone insane or what?

Posted Apr 1, 2026 22:08 UTC (Wed) by pm215 (subscriber, #98099)

Digging out the ACM article I had in mind: https://www.cs.columbia.edu/~junfeng/18sp-e6121/papers/co... the part about false positives is on the last page. They say:

* above 30% is definitely bad
* they aimed for below 20%
* when forced to choose between more bugs and fewer false positives, choose the latter
* the initial reports are really important -- if the first few are bad then the response is "this tool sucks" and people reject it
* "you never want an embarrassing false positive. A stupid false positive implies the tool is stupid"

(My personal experience of Coverity today is that its false positive rate is way higher than I would like.)
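To make the thresholds above concrete, here is a minimal sketch of checking a tool's triage results against them. It assumes "false positive rate" means the fraction of triaged reports that turned out to be bogus; the `TriageTally` class, the `classify` function, and the "borderline" label for the 20-30% band are hypothetical names chosen for illustration, not anything taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class TriageTally:
    """Counts from manually triaging a tool's reports (hypothetical structure)."""

    true_reports: int   # reports confirmed as genuine problems
    false_reports: int  # reports dismissed as bogus

    def false_positive_rate(self) -> float:
        """Fraction of triaged reports that were false positives (assumed definition)."""
        total = self.true_reports + self.false_reports
        if total == 0:
            raise ValueError("no triaged reports")
        return self.false_reports / total


def classify(rate: float) -> str:
    """Map a false positive rate onto the thresholds quoted above."""
    if rate > 0.30:
        return "definitely bad"   # "above 30% is definitely bad"
    if rate < 0.20:
        return "within target"    # "they aimed for below 20%"
    return "borderline"           # label for the gap between the two is an assumption


if __name__ == "__main__":
    tally = TriageTally(true_reports=3, false_reports=1)
    print(classify(tally.false_positive_rate()))
```

Note that a single overall rate says nothing about ordering: per the list above, a handful of bad reports at the start can sink a tool even if the overall rate is acceptable.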

Everyone insane or what?

Posted Apr 2, 2026 6:36 UTC (Thu) by pbonzini (subscriber, #60935)

Absolutely; however, the Coverity paper is about a different kind of issue and report. A tool that looks at a higher level, is able to look up related code, and understands the names of variables can (at least for me) afford a higher false positive rate.

That said I have used Coverity a lot more than Sashiko so I admit my picture might be excessively rosy.

> False positives from an automated tool help nobody, they're pure loss.

Not entirely: a false positive can suggest that a comment is necessary. For example, see the second report for patch 10 at https://sashiko.dev/#/patchset/20260326181723.218115-1-pb..., which is correct but impossible *now*.

Everyone insane or what?

Posted Apr 2, 2026 8:19 UTC (Thu) by pm215 (subscriber, #98099)

For me, the false positive problem is the same regardless of the tool and what level of analysis it performs, because the cost is the same: I have to go through the bogus reports, figure out what each one is suggesting, determine that it's wrong, and dismiss it. I might hope that a tool capable of higher-level analysis has a lower false positive rate (Coverity f.p. reports are often a result of an inability to see the higher level), but if it doesn't in practice have a low f.p. rate then it's just as bad and time-wasting as any other.

Everyone insane or what?

Posted Apr 2, 2026 8:19 UTC (Thu) by khim (subscriber, #9252)

> False positives from an automated tool help nobody, they're pure loss.

That's not true with LLMs, surprisingly enough, and for precisely the reason that's usually perceived as their weakness.

LLMs can't “think”, but they are the world's best generators of bullshit (a term from the 1986 paper).

That makes their rate of “true false positives” surprisingly low. If you define “false positive” not as “there are no issues with the code but the LLM says there are” but as “there are no issues with the code and it's obvious why there are no issues with the code”, then the false positive rate drops to almost zero, because most places where an LLM finds “something fishy” (when, in fact, everything is fine) are quite tricky and deserve at least a comment, if not a change to the code.

