
Maintainer input and code review

Posted Apr 21, 2026 16:15 UTC (Tue) by jorgegv (subscriber, #60484)
Parent article: Using LLMs to find Python C-extension bugs

From my experience using LLMs for development, I'd suggest two improvements (I have found that they dramatically improve the quality of the result):

First, it would be great if the maintainers specified, broadly, which categories or types of bugs they are _not_ interested in. Then the LLM can be instructed to assess each bug against these restrictions and modify or discard its output accordingly.
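
A minimal sketch of that filtering step, assuming each finding carries a category label. The `Finding` type, the `triage` helper, and the category names are hypothetical illustrations, not part of any existing tool:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    category: str  # hypothetical label assigned during analysis


def triage(findings, excluded):
    """Split findings into those to report and those the maintainers opted out of."""
    kept = [f for f in findings if f.category not in excluded]
    dropped = [f for f in findings if f.category in excluded]
    return kept, dropped


# Hypothetical example data: the maintainers said they do not want style-only reports.
findings = [
    Finding("Missing NULL check after allocation", "null-deref"),
    Finding("Inconsistent brace placement", "style"),
]
kept, dropped = triage(findings, excluded={"style"})
```

Keeping the dropped list, rather than discarding it silently, lets a human spot-check that the exclusion rules are not hiding something important.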

As a second improvement, I always have a prompt request similar to the following: "For every piece of code developed (feature, bugfix, test, etc.), have an independent agent review the code. Be very critical. An agent should NEVER review its own code". This definitely increases the number of tokens consumed per feature, but I have found that the reviewer often finds things that the original developer missed or got wrong. In extreme cases (e.g. when designing a complicated architecture refactor and test plan) I have even requested a second review, to break any ties between the first two agents.
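
The rule can be sketched roughly as follows, assuming agents are identified by name and each one returns a simple verdict string. `pick_reviewers` and `resolve` are hypothetical helpers for illustration only:

```python
from collections import Counter
from typing import Optional


def pick_reviewers(author: str, agents: list, count: int = 1) -> list:
    """Choose reviewers, never letting an agent review its own code."""
    candidates = [a for a in agents if a != author]
    if len(candidates) < count:
        raise ValueError("not enough independent agents to review")
    return candidates[:count]


def resolve(verdicts: list) -> Optional[str]:
    """Return the majority verdict, or None on a tie (ask for another review)."""
    if not verdicts:
        raise ValueError("no verdicts to resolve")
    ranked = Counter(verdicts).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Hypothetical agent names.
reviewers = pick_reviewers("agent-a", ["agent-a", "agent-b", "agent-c"], count=2)
```

With the developer and one reviewer disagreeing, `resolve` returns `None`, which is the cue to bring in the second reviewer as a tie-breaker.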



Maintainer input and code review

Posted Apr 21, 2026 23:32 UTC (Tue) by karath (subscriber, #19025) [Link]

In important work, I’d suggest using an entirely different model to perform the reviews. Defensive review of code is important.

Maintainer input and code review

Posted Apr 22, 2026 9:41 UTC (Wed) by devdanzin (subscriber, #183390) [Link]

> First, it would be great if the maintainers specified, broadly, which categories or types of bugs they are _not_ interested in. Then the LLM can be instructed to assess each bug against these restrictions and modify or discard its output accordingly.

That's a great suggestion, thank you! I've been tailoring the report format and style (removing reproducers, making lists of actionable items, etc.) according to what maintainers request, but asking them about insignificant bug classes is a clear improvement. I'll start doing that.

> As a second improvement, I always have a prompt request similar to the following: "For every piece of code developed (feature, bugfix, test, etc.), have an independent agent review the code. Be very critical. An agent should NEVER review its own code".

There's something vaguely similar to this in place, and I'm working on a new plugin that addresses it more directly.

Right now, the analysis is run three times: two naive passes, in which the agents don't know about each other's results, and one informed pass, where the agent is fed the previous agents' findings. This allows checking for convergence (do two agents reach the same conclusions about a given bug?) and differential analysis (what do they disagree on?). And when we reproduce findings, that works as independent confirmation from the main Claude Code instance.
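
As a rough illustration of the convergence and differential checks (not the plugin's actual code), assume each naive pass yields a set of stable finding keys. The `compare_passes` helper and the `file:function:kind` key format are hypothetical:

```python
def compare_passes(pass_a: set, pass_b: set) -> dict:
    """Compare two independent passes over the same code."""
    return {
        "converged": pass_a & pass_b,  # both agents reported the finding
        "only_a": pass_a - pass_b,     # reported by the first agent only
        "only_b": pass_b - pass_a,     # reported by the second agent only
    }


# Hypothetical finding keys in a "file:function:kind" format.
first = {"mod.c:parse:null-deref", "mod.c:init:refcount-leak"}
second = {"mod.c:parse:null-deref", "mod.c:free:double-free"}
result = compare_passes(first, second)
```

Findings in `converged` have the strongest case for going into a report; the two `only_*` sets are the ones worth handing to the informed pass or to a reproducer.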

I'm working on a related plugin, report-quality-gate, which goes through a report assessing its relevance, tone, factual correctness, etc. In doing this, it reviews the report (as opposed to the findings). Adding an independent "adversarial" finding reviewer before this phase could be interesting; once the new plugin is done I can give it a try and see how it works. Thank you again for the suggestion!


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds