No disclosure for LLM-generated patch?

Posted Jun 27, 2025 10:45 UTC (Fri) by excors (subscriber, #95769)
In reply to: No disclosure for LLM-generated patch? by drago01
Parent article: Supporting kernel development with large language models

> In the end of the day what matters is whether the patch is correct or not, which is what reviews are for. The tools used to write it are not that relevant.

One problem is that reviewers typically assume the patch was submitted in good faith, and look for the kinds of errors that good-faith humans typically make (which the reviewer has learned through many years of experience, debugging their own code and other people's code).

If e.g. Jia Tan started submitting patches to your project, you wouldn't say "I know he deliberately introduced a subtle backdoor into OpenSSH and he's probably a front for a national intelligence service, but he also submitted plenty of genuinely useful patches while building up trust, so let's welcome him and just review all his patches carefully before accepting them". You'd understand that your review process is not infallible and he's going to try to sneak something past it, with malicious patches that look as non-suspicious as possible, so it's not worth the risk and you would simply ban him. Linux banned a whole university for a clumsy version of that: https://lwn.net/Articles/853717/. The source of a patch _does_ matter.

Similarly, LLMs generate code with errors that are not what a good-faith human would typically make, so they're not the kind of errors that reviewers are looking out for. A human isn't going to hallucinate a whole API and write a professional-looking well-documented patch that calls it, but an LLM will eagerly do so. In the best case, it'll waste reviewers' time as they try to figure out what the nonsense means. In the worst case there will be more subtle inhuman bugs that get missed because nobody is thinking to look for them.

At the same time, the explicit goal of generating code with LLMs is to make developers more productive at writing patches, meaning there will be more patches to review and reviewers will be under even more pressure. And in the long term there will be fewer new reviewers, because none of the junior developers who outsourced their understanding of the code to an LLM will be learning enough to take on that role. I think writing code is already the easiest and most enjoyable part of software development, so it seems like the worst part to be trying to automate away.


No disclosure for LLM-generated patch?

Posted Jun 27, 2025 13:38 UTC (Fri) by drago01 (subscriber, #50715)

Well, one of the reviewers would be the patch submitter, who used the LLM to save time.

If you don't trust that person and don't review their submission in detail, the problem is unrelated to whether an LLM has been used or not.

No disclosure for LLM-generated patch?

Posted Jun 27, 2025 15:10 UTC (Fri) by SLi (subscriber, #53131)

Exactly. The submitter takes responsibility for the sanity of their submissions. If a submission is bad, that's on them. It doesn't matter that they found it convenient to edit the scheduler code in MS Word.

No disclosure for LLM-generated patch?

Posted Jun 27, 2025 14:02 UTC (Fri) by martinfick (subscriber, #4455)

If these junior programmers are reviewing the output of the LLM before submitting the code for inclusion, doesn't that mean this would help more coders become reviewers sooner? Perhaps this will actually help resolve the reviewer crisis by finally making reading code a more common skill than writing it?

