On the use of LLM assistants for kernel development
Arguably, the current round of debate began with this article on a presentation by Sasha Levin at the Open Source Summit North America in June; his use of an LLM to generate a kernel patch came as a surprise to some developers, including the maintainer who accepted that patch. Since then, David Alan Gilbert has posted a patch proposing requirements for the disclosure of LLM use in kernel development. Levin has posted a series of his own focused on providing configurations for coding assistants and guidelines for their use. Both of these submissions have provoked discussions ranging beyond their relatively narrow objectives.
Gilbert suggested the use of a new patch tag, Generated-by, to identify a tool that was used to create a kernel patch; that tag would be expected not just for LLM-generated patches, but also patches from long-accepted tools like Coccinelle. Levin, instead, suggests using the existing Co-developed-by tag, but takes pains to point out that an LLM should not add the Signed-off-by tag that normally is required alongside Co-developed-by. Either way, the suggestion is the addition of information to the tags section of any patch that was generated by an LLM-based tool.
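As a rough illustration (the exact tag names and formatting are precisely what is being debated, and the tool and developer names below are placeholders), the two proposals would put something like the following at the end of a patch's tag block:

    Generated-by: Coccinelle

under Gilbert's scheme, or:

    Co-developed-by: <name of the coding assistant>
    Signed-off-by: Ada Developer <ada@example.org>

under Levin's, with the Signed-off-by line added only by the human submitter, never by the tool itself.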
A step back
While much of the discussion jumped directly into the details of these
patches, some developers clearly feel that there is a more fundamental
question to answer first: does the kernel community want to accept
LLM-developed patches at all? Vlastimil Babka responded
that Levin's patch set was "premature", and that there was a need to set
the rules for humans to follow before trying to properly configure LLMs:
So without such policy first, I fear just merging this alone would send the message that the kernel is now officially accepting contributions done with coding assistants, and those assistants will do the right things based on these configuration files, and the developers using the assistants don't need to concern themselves with anything more, as it's all covered by the configuration.
Lorenzo Stoakes said that "an official kernel AI policy document" is
needed first, and
suggested that it would be best discussed at the Maintainers Summit (to be
held in December). He agreed with Babka that merging the patches in the
absence of such a policy would be equivalent to a public statement that
LLM-generated patches are welcome in the kernel community.
A number of developers expressed concerns that these tools will be used to
generate patches that are not understood by their submitters and which may
contain more than the usual number of subtle bugs. David Hildenbrand worried
that he would end up dealing with contributors who simply submit his
questions to the tool that generated the patch in the first place, since
they are unable to explain the code on their own. He also pointed out the
policy adopted by the QEMU project, which essentially bans LLM-generated
contributions in that project. Al Viro described LLM-based tools as "a
force multiplier" for the numerous developers who have, for years, been
submitting machine-generated patches that they don't understand.
Mark Brown, instead, suggested that these tools will be used regardless of the kernel policy:
I'm also concerned about submitters just silently using this stuff anyway regardless of what we say, from that point of view there's something to be said for encouraging people to be open and honest about it so it can be taken into consideration when looking at the changes that get sent.
Levin's point of view is that the current policy for the kernel is that
"we accept agent generated contributions without any requirements beyond
what applies to regular humans"; his objective is to work out what those
extra requirements should be. It should also be noted that some developers
clearly feel that these tools are helpful; Kees Cook, for example, argued
against any sort of ban, saying it would be "not useful, realistic, nor
enforceable". Elsewhere, he has commented that "the tools are finally
getting interesting".
Disclosure
If the kernel project were to ban LLM-generated code, then the rest of the
discussion would be moot, but that would appear to be an unlikely outcome.
If one assumes that there will be (more) LLM-generated code entering the
kernel, a number of questions come up, starting with disclosure of tool
use. Both Gilbert and Levin propose the addition of patch tags to document
this use. A couple of developers disagreed with that idea, though;
Konstantin Ryabitsev said that
this information belongs in the cover letter of a patch series, rather than
in the tags. That is how code generated by tools is described now, and he
did not see a reason to change that practice. Jakub Kicinski argued that
the information about tools was "only relevant during the review", so
putting it into patch changelogs at all "is just free advertising" for the
tools in question.
The consensus view, though, would appear to be in favor of including tool
information in the patch itself. Cook, who initially favored keeping tool
information out of the tags, later acknowledged that it would
be useful should the need arise to track down all of the patches created by
a specific tool. Steve Rostedt said that this
information could be useful to find patterns of bugs introduced by a
specific tool. Laurent Pinchart noted
that formalized patch tags would be useful for tracking down any
copyright-related problems as well. Gilbert commented that disclosure
"lets the people who worry keep track of what our mechanical overlords are
doing".
If one takes the position that tool use must be disclosed, the next
question is inevitably: where should the line be drawn? Levin asked whether the use of a
code-completion tool requires disclosure, for example. Others have
mentioned using compiler diagnostics to find problems or the use of
language-sensitive editors. There is clearly a point where requiring
disclosure makes no sense, but there does not, yet, appear to be a
consensus on where that point is. One possible rule might be this one
suggested by Rostedt: "if AI creates any algorithm for you then it must be
disclosed".
Meanwhile, Levin's first attempt to disclose LLM usage with a Co-developed-by tag drew an amused response from Andrew Morton, who seemingly had not been following this conversation. Hildenbrand responded that a new tag, such as Assisted-by, would be more appropriate; Ryabitsev has also made that suggestion.
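An Assisted-by trailer would presumably look something like this (again, purely illustrative):

    Assisted-by: <name of the coding assistant>
    Signed-off-by: Ada Developer <ada@example.org>

leaving Co-developed-by, and the signoff that must accompany it, for human co-authors.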
Copyright and responsibility
The copyright status of LLM-generated code is of concern to many developers; if LLM-generated code ends up being subject to somebody's copyright claim, accepting it into the kernel could set the project up for a future SCO-lawsuit scenario. This, of course, is an issue that goes far beyond the kernel community and will likely take years of court battles worldwide to work out. Meanwhile, though, maintainers will be asked to accept LLM-generated patches, and will have to make decisions long before the legal processes have run their course.
Levin pointed to the generative-AI guidance from the Linux Foundation, saying that it is the policy that the kernel community is implicitly following now. In short, this guidance suggests that developers should ensure that the tool itself does not place restrictions on the code it generates, and that said code does not incorporate any pre-existing, copyrighted material. Levin suggested using this document as a starting point for judging the copyright status of submissions, but that guidance is only so helpful.
Michal Hocko asked how
maintainers can be expected to know whether the conditions suggested in
that "quite vague
" guidance have been met. Levin's answer reflects a theme that came
up a few times in the discussion: that is what the Signed-off-by
tag applied by the patch submitter is for. By applying that tag, the
submitter is indicating that the patch is a legitimate contribution to the
kernel. As with any other patch, a contributor needs to be sure they are
on solid ground before adding that tag.
That reasoning extends beyond just copyright status to responsibility for
the patch at all levels. Rostedt suggested
documenting that a signoff is also an indication that the submitter
understands the code and can fix problems with it. Viro said that, for any patch
regardless of origin, "there must be somebody able to handle active
questioning" about it. Levin added that "AI doesn't send patches on its
own - humans do", so it is the human behind the patch who will ultimately
be responsible for its contents.
The reasoning makes some sense, but may not be entirely comforting to
nervous maintainers. The people submitting LLM-generated patches are not
likely to be in a better position to judge the copyright status of that
work than maintainers are. Meanwhile, maintainers have had to deal with
patches from contributors who clearly do not understand what they are doing
for many years; documenting that those contributors must understand the
output from coding tools seems unlikely to slow down that flood.
Hildenbrand expressed
his concern this way: "We cannot keep complaining about maintainer
overload and, at the same time, encourage people to bombard us with even
more of that stuff". Based on what has been seen in other areas, it
would not be surprising to see an order-of-magnitude increase in the flow
of low-quality patches; indeed, Greg Kroah-Hartman said that it is
already happening.
More discussion
The end result is that the question of how to incorporate LLM-based development tools into the kernel project's workflow is likely to feature prominently in community discussions for some time. While these tools may bring benefits, including finding patterns that are difficult for humans to see and the patient generation of test code, they also have the potential to bring copyright problems, bugs, and added maintainer stress. The pressure to use these tools is not going away, and even the eventual popping of the current AI bubble seems unlikely to change that.
Within a few milliseconds of the posting of the call for topics for the
2025 Maintainers Summit, there were two separate proposals (from Stoakes
and Jiri
Kosina) on the issue of AI-based tools in the kernel workflow; they
have sparked discussions that will surely have progressed significantly by
the time this article is published. One does not, it seems, need an LLM to
generate vast amounts of text. This conversation is, in other words,
just beginning.
Index entries for this article
Kernel: Development tools
Posted Aug 7, 2025 22:17 UTC (Thu)
by jepsis (subscriber, #130218)
[Link] (23 responses)
LLMs are not so strong for programming, especially when it comes to creating something totally new. They usually need very limited and specific context to work well. Still, they can help in reviewing your own code. I use them mostly for litmus-testing ideas and checking if something could work in practice.
Posted Aug 7, 2025 23:02 UTC (Thu)
by willy (subscriber, #9762)
[Link] (19 responses)
The problem the Linux kernel has is not that we have too few patches to review!
Posted Aug 8, 2025 4:34 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link] (18 responses)
IMHO mandating that patches disclose it when an assistant was involved is good, at least for the copyright reasons. And this can help reviewers decide whether or not they're interested in reviewing something which looks like junk (because it sometimes is), and even decide to simply ignore certain submitters who only provide such low-quality patches. It can also save some reviewer time to figure that it's probably not worth explaining in finest details what is wrong if the submitter is not going to understand.
We also need to think forward. In 5-10 years, there will be little distinction between AI assistants and coworkers, and we can hardly request from a submitter more when involving an AI assistant than what is required when they get help from a friend or coworker.
What is really not acceptable is submitters not able to discuss their changes, whether it comes from an LLM, they stole it from another project or they had someone else write it for them. But if they used an AI-assisted editor which properly formatted the patch and fixed typos in comments, maybe ran a first-level review or helped pick suitable variable names, and stuff that generally improves the quality and reduces the reviewer effort, I don't see a reason to reject this.
Posted Aug 8, 2025 8:01 UTC (Fri)
by khim (subscriber, #9252)
[Link] (17 responses)
> In 5-10 years, there will be little distinction between AI assistants and coworkers
No. There is a huge distinction: coworkers can be taught something, while LLMs will remain forever clueless about your objections and requests. You cannot teach an LLM anything. Well… technically you can, if you train a new model with [slightly] different properties – but that's not something contributors or coworkers would be able to afford any time soon. And that means that we should insist that contributors (and not maintainers) shoulder the responsibility of fixing the warts LLMs would add to their submissions, again and again.
Posted Aug 8, 2025 14:57 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link] (16 responses)
Note that I purposely said "AI assistants", not "LLMs". LLMs are dumb because they're only based on language, and I anticipate that future generations will come leveraging more of the multi-modal approach and will be more flexible, like humans, by involving multiple senses at once. In addition, I'm pretty sure we'll start to imitate the way we currently function, with short-term and long-term memory and conversion phases that we call "sleep" in our case. LLMs can already do quite impressive things, and that's unfortunately why people believe they're very smart. But they can be impressively limited and dumb as well sometimes. Note that, to be honest, we all know super dumb humans with whom we gave up trying to explain certain things. I fear the day we'll have to run interviews with AI assistants to decide if they're smart enough to be hired...
Posted Aug 8, 2025 15:17 UTC (Fri)
by khim (subscriber, #9252)
[Link] (15 responses)
> In addition I'm pretty sure we'll start to imitate the way we currently function with short-term and long-term memory with conversion phases that we call "sleep" in our cases.
That's already the case: when an LLM is trained, it “remembers” facts in its “long-term memory”. The only problem: it's entirely cost-prohibitive to offer a mode where one may train an LLM and later update its long-term memory. In fact, it's even prohibitively expensive to run existing models, too, thus pretty soon we will see serious regressions in all these tools' capabilities. The era of “AI assistants” that you describe would come 10-20 years down the road, maybe even later.
Posted Aug 8, 2025 15:52 UTC (Fri)
by anselm (subscriber, #2796)
[Link] (14 responses)
Not really. LLMs don't deal in facts, they deal in probabilities, as in “what word is most likely to complete this partial sentence/paragraph/text?” These probabilities can be skewed in various ways through training and prompting, but it is important to keep in mind that to an LLM, the world is essentially alphabet soup – it has no underlying body of abstract factual knowledge from which it could draw logical conclusions like we humans do.
LLMs can certainly produce results that seem impressive, but in the long run they're probably just a side branch on the path to actual AI. If you ask Sam Altman he will tell you that OpenAI is only a year or so (and a few tens of billions of dollars) away from “artificial general intelligence”, but he's been doing that for years now and it's very hard to see how that would work given what they've been doing so far.
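(In symbols: an autoregressive language model only ever evaluates a next-token distribution

    P(w_t \mid w_1, w_2, \ldots, w_{t-1})

and generates text by repeatedly sampling from it; any appearance of factual knowledge has to emerge from those conditional probabilities.)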
Posted Aug 8, 2025 16:08 UTC (Fri)
by khim (subscriber, #9252)
[Link] (13 responses)
Yes, but that's an entirely different kettle of fish: humans have world models; in fact, they start to form in the human brain before humans learn to speak, starting with peekaboo and hide-and-seek games. LLMs don't have anything remotely similar to that, and that's why they can't say “I don't know how to do that”: humans say that when their world model shows a “hole”, and LLMs can't do that since there is no world model, it's all probabilities all the way down. In the rare cases where one says “I don't know” or “this thing probably doesn't exist” (it happens sometimes, if rarely), it has simply found it highly probable, based on its training set, that this response would be the most appropriate one. The only “memory” LLMs have is related to probabilities… that doesn't mean that there is no long-term memory, it just means that it's different from what humans have. That's yet another kettle of fish.
I think AGI will soon be relegated to the annals of history, anyway: it's obvious that pure scaling won't give anything similar to an “average human worker” any time soon – and AGI is something that has marketable appeal only in that “scale is all you need” world. If we are forced to match human capabilities by slowly and painstakingly adding more and more specialized modules, then AGI loses its appeal: it's still achievable – but somewhere in the 22nd or 23rd century, when the last 0.01% of something that humans were doing better than pre-AGI systems is conquered. By that time our AI will be so drastically superhuman at everything else that saying we have reached AGI no longer makes sense. It's more of an “oh yeah, finally… it arrived… what else is new?” moment, rather than something to talk about.
Posted Aug 8, 2025 20:45 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link] (12 responses)
Posted Aug 8, 2025 21:09 UTC (Fri)
by khim (subscriber, #9252)
[Link] (10 responses)
> The main problem with "matching humans" is that they'll have to pass by empathy, empathy, self-conciousness and some may even develop their own religions etc.
Surprisingly enough, that's already covered. Existing chatbots don't have “empathy” or “self-consciousness”, but they imitate them well enough to achieve pretty disturbing results. And I'm pretty sure they would do a superb job working as missionaries for various religious sects. No problem there at all: nefarious uses of LLMs scale surprisingly well. LLMs fail utterly when long chains of logical reasoning are needed, though.
Highly unlikely. In fact, the biggest obstacle to the use of LLMs is the fact that people try to apply what they have learned from books and movies about how “sentient robots” would behave over the last century or so. Which is understandable, but also incredibly wrong. In books and movies, “sentient robots” are always logical, correct, and precise, and it's a big problem for them to express emotion or simulate empathy… in the real world, LLMs can easily do all the things that “sentient robots” from all these countless books and movies struggled with… what they can't do are the things that people expect them to do: logical reasoning, precision, reproducibility…
That's another thing that plagues the whole industry: what all the presentations and demos portray and “sell”, and what CEOs expect to buy… are these “sentient robots” from the movies. What they get… is something entirely different, something totally unsuitable for the role where “sentient robots” would fit perfectly. That's why Klarna rehires people back, IBM hires people for “critical thinking” focused domains, and Duolingo puts people back… it's all because LLMs are the total opposite of the “sentient robots” in movies. If you read the summary papers written by press-people, then you would hear how people are rehired because robots “lack empathy” or “emotions”, but that's a big fat lie: robots have more than enough empathy and emotions, spammers simply love that aspect of LLMs… what they lack are “common sense” and “logic”.
Posted Aug 8, 2025 22:41 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
Given that "common sense" isn't common, and rarely makes sense, maybe that's just as well!!!
Cheers,
Posted Aug 9, 2025 5:57 UTC (Sat)
by wtarreau (subscriber, #51152)
[Link] (8 responses)
Posted Aug 9, 2025 9:09 UTC (Sat)
by excors (subscriber, #95769)
[Link]
One example I've seen is giving ChatGPT 5 - which was announced as having "PhD-level intelligence" - the prompt "Solve: 5.9 = x + 5.11". When I repeated it myself, 50% of the time it said x=0.79, and 50% of the time it said x=-0.21.
In both cases it gave a superficially reasonable step-by-step explanation, saying things like "5.90 - 5.11 = (5.90 - 5.11) = 0.79 but since 5.90 is less than 5.11 in the hundredths place, the result will be negative: x = -0.21". That's nonsense, but it's confidently-stated half-correct nonsense, which engenders undeserved levels of trust.
In theory the system could make use of external calculators and powerful, reliable, human-designed algebraic tools. In practice it doesn't - it does probabilistic calculations on language tokens, resulting in something that sounds like a mathematical calculation but actually isn't, making it untrustworthy for even trivial tasks like this. (And somehow this is worth half a trillion dollars.)
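(For the record, the correct algebra here is a single subtraction,

    x = 5.9 - 5.11 = 0.79

so the 0.79 answers are the right ones; the -0.21 answers come from the model treating 5.11 as though it were larger than 5.9.)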
Posted Aug 9, 2025 9:42 UTC (Sat)
by khim (subscriber, #9252)
[Link] (2 responses)
> I.e. in order to think like us they have to be as unreliable.
Nope. In order to think, you need a reliable world model somewhere under all these words. The half-century-old SHRDLU could think, while ChatGPT-5 can't. Sure, humans make mistakes (especially when they are distracted), but they may also notice them automatically and fix them. That doesn't work with LLMs; in fact, if you try to push them, they become even less accurate than when they are not “thinking”. That's not an imitation of a human brain, though. That's an imitation of an insect brain or, maybe, a chimp's brain (although chimps have a world model, even if it is less complicated than a human's). It's pure reaction, with nothing to control the “train of thought” and to stop it from derailing.
The best illustration of what is happening with “reasoning” LLMs is the picture from Wikipedia in the article Taylor series, where it shows “sin x and its Taylor approximations by polynomials of degree 1, 3, 5, 7, 9, 11, and 13 at x = 0”. It's very easy to read that as “as the degree of the Taylor polynomial rises, it approaches the correct function” – but if you actually look at the picture, you'll notice how it does that: it becomes ever more precise in the small, but growing, area around zero while, simultaneously, becoming ever more absurdly wrong outside that central part. And that's what is happening with LLMs: they are becoming ever more impressive at “one-shotting” things, yet, simultaneously, ever more helpless at handling a long series of tasks. This is similar to how very small kids behave, but eventually a human learns to double-check and self-verify things… LLMs can't learn that, they simply have no mechanism suitable for it.
The latest fad in AI is to attach “tools” to LLMs and hope that a Python interpreter will work as a reliable replacement for a world model. It won't: this will slightly expand the area where LLMs are able to “one-shot” things, but it won't fix the fundamental flaw in their construction.
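(For reference, the series behind that picture is

    \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots

and each truncated polynomial is an excellent match close to x = 0 while diverging badly further away, which is exactly the behavior being used as an analogy here.)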
Posted Aug 9, 2025 10:32 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (1 responses)
It's not an imitation of ANY brain. Think about it. The brain has a lot of dedicated hardware, be it visual recognition, auditory recognition, whatever. And a small veneer of general purpose hardware over the top. AI runs on pure general purpose hardware.
And as has been pointed out, a lot of the brain's special-purpose hardware is survival-ware - if the hardware gets it wrong, it's likely to end up as a lion's lunch, or whatever ...
Cheers,
Posted Aug 9, 2025 10:40 UTC (Sat)
by khim (subscriber, #9252)
[Link]
Isn't that what GPT-5 “tools” and voice recognition in Gemini Live are for? Not really. It can be run, in theory, on general-purpose hardware, but it's not clear whether GPT-5 run on general-purpose hardware would be at all practical. Even if you just think about BF16… it's a pretty specialized thingie. Sure, but do we actually use that hardware when we are writing code? Somehow I doubt it. It's like arguing that an LLM couldn't write good code because it doesn't have a liver… sure, a liver is very important for a human, but the lack of a liver is not what stops an LLM from being a good software designer.
Posted Aug 11, 2025 10:11 UTC (Mon)
by paulj (subscriber, #341)
[Link] (3 responses)
The problem is (invent plausible stat and confidently handwave it about - highly appropriate in a thread on LLMs!) 99.999% of the populace doesn't know this, and lack the combination of technical background, curiosity and time to come to understand this. They think - because the hype machine (the effective combination of companies wanting to sell stuff and non-technical media jumping on pushing the latest buzz) has told them so - that this stuff is "intelligent" and will solve all problems.
Posted Aug 12, 2025 11:48 UTC (Tue)
by wtarreau (subscriber, #51152)
[Link]
But, in addition, starting to be careful about LLMs also teaches people to be careful of other people who look too smart. There is a huge confusion between knowledge and intelligence in general. Lots of people use the term "smart" or "intelligent" to describe a very knowledgeable person, and consider that someone lacking culture is "dumb". But I've seen people who, once the details of a problem were explained to them, would suggest excellent ideas on how to solve it. *This* is intelligence. Those who know everything but cannot use it except to look smart in conversations are just parrots. Of course it's way better when you have the two at once in the same person, and often smart people like to learn a lot of new stuff. But each profile has its uses. Right now LLMs solve only one part of the deduction needed for intelligence, and know a little bit of everything but nothing deeply enough to express a valid opinion or give valid advice. Yes, most people (as you say, 99.999% to stick with this thread) tend to ask them for advice and opinions on stuff they are expected to know well, since it comes from the internet, but that they only know superficially.
Posted Aug 12, 2025 17:16 UTC (Tue)
by raven667 (subscriber, #5198)
[Link] (1 responses)
Posted Aug 12, 2025 17:21 UTC (Tue)
by pizza (subscriber, #46)
[Link]
To me it's not the "destruction" of so much [human] capital but the wasted/squandered opportunities.
Posted Aug 11, 2025 10:06 UTC (Mon)
by paulj (subscriber, #341)
[Link]
Posted Aug 8, 2025 4:34 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Basically, you want the LLM to "red team" you and continuously *review* the code you're writing rather than writing code itself. I suspect we'll need some better way to interact than just prompt/response since, in order to reduce token-count explosions, you want to feed it diffs at intervals. If an LLM understood time (highly unlikely), it might even be able to "see" where you're going and possibly help create test cases or the like. If *that* could be fed into an "LSP" that annotates my source as I'm working, that is *much* closer to having `clang-tidy` or `clippy` point out issues as I'm developing (which I already have).
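A minimal sketch of that diff-at-intervals loop, with the actual model call left as a hypothetical review_with_llm() helper since it depends entirely on which tool is in use:

    import subprocess
    import time

    def working_tree_diff() -> str:
        # Collect the current diff against HEAD.
        return subprocess.run(["git", "diff", "HEAD"],
                              capture_output=True, text=True, check=True).stdout

    def review_with_llm(diff: str) -> str:
        # Hypothetical helper: hand the diff to whatever model is in use with a
        # "find problems, don't write code" prompt and return its annotations.
        return "(model annotations would go here)"

    if __name__ == "__main__":
        last_seen = ""
        while True:
            diff = working_tree_diff()
            if diff and diff != last_seen:    # only re-review when something changed,
                print(review_with_llm(diff))  # which keeps token usage down
                last_seen = diff
            time.sleep(60)                    # review at intervals, not per keystroke

The interesting part, as noted above, would be routing those annotations into an LSP-style display rather than a terminal.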
Posted Aug 29, 2025 14:30 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
LLMs are over-engineered plagiarism automatons that have no opinion on the correctness of the stuff they are plagiarising, except it should trigger strong reactions (because the field relies on advertiser money). It’s GIGO on a massive scale, with some checks added post-facto to limit the amount of garbage that spills out. No one has checked that every bit of content that has been used to train an LLM is correct, right, proper, good, free of legal encumbrances, etc.
That’s the core difference and why LLM output requires human review.
Posted Aug 29, 2025 15:04 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
> LLMs are not so strong for programming,
Programming is very much a language-related task. So which is it, LLMs are particularly effective for programming, or LLMs are useless at language? You can't have it both ways!
And as has been pointed out, LLMs are very capable of chucking out text that is simultaneously extremely plausible, and complete bullshit. THAT is the problem.
The problem we want to solve isn't language, it's communication. And with absolutely no concept of comprehension or truth, LLMs are a serious liability.
That said, LLMs are good at cleaning text up round the edges - until this eager-beaverness of all the peddlers of this rubbish actually gets seriously in the way of actually doing what you want to! I'm sick to death of Acrobat's desperation to "Let me summarise this document for you", when I'm actually looking for the *detail* I need which a summary will pretty much inevitably remove. The same with Sheets and Gemini - if I need detail to solve a problem, the LAST thing I need is Artificial Idiocy trying to summarise what I'm looking at!
Cheers,
Posted Aug 7, 2025 22:57 UTC (Thu)
by SLi (subscriber, #53131)
[Link] (8 responses)
What would be the likelihood of a first submitter submitting a non-low-quality patch without an LLM?
This seems potentially both good and bad. I'd argue: *If* this lowers the bar and makes more people eventually graduate into "real" kernel development, that's not purely negative.
But sure, there's something qualitatively new about this; it's not just kids sending code as a Word document.
Actually, I would predict that we will outgrow this problem in a way that will hugely annoy some and make others decide it's not a problem. LLMs are still improving fast. Sure, there will always be people who don't see value in them and would claim they are no good even if they outperformed humans.
I think we will, in the not-so-distant future, reach a point where LLMs do well enough that most of their work will be thought to have been made by a competent hyena. (I mean human, but I love that autocorrect.)
I think this would mean more pragmatically LLMs growing to the level where they are at least ok at kernel development, but also pretty good at knowing what they are not good at.
But if you want to ease maintainer burden, maybe make an LLM review patches where an LLM contributed (I personally find it silly to say that an LLM "authored" something, just like I don't say code completion authored something). And then forward them to an LLM maintainer, who asks TorvaLLM to pull them. And have them argue about whether there should be a disclosure if unreliable humans touched the patch.
Posted Aug 8, 2025 6:24 UTC (Fri)
by gf2p8affineqb (subscriber, #124723)
[Link] (5 responses)
I don't see any issue with a first-time user working with LLVM.
Posted Aug 8, 2025 7:43 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (3 responses)
At the end of the day, most of the stuff on the net is rubbish. The quality of what an LLM outputs is directly correlated to the quality that goes in (it must be; without human review and feedback, it has no clue). Therefore, most LLM output has to be rubbish, too.
If your AI is based on a SMALL Language Model, where the stuff fed in has been checked for accuracy, then the results should be pretty decent. I don't use AI at all (as far as I know, the AI search engine slop generally has me going "what aren't you thinking !!!"), but my work now has a little AI that has access to all our help docs and thus does a decent job for most people - except that as always, people don't think, and people keep getting referred to Guru docs for more detail - HINT roughly 1/3 of the company doesn't have access to Guru, as a matter of policy!!! Argh!!!
Cheers,
Posted Aug 8, 2025 9:04 UTC (Fri)
by jepsis (subscriber, #130218)
[Link] (2 responses)
Here are some examples of useful prompts:
Is the naming of functions and variables consistent in this subsystem?
Are the comments sufficient, or should they be added to or improved?
If I were to submit this upstream, what aspects might attract nitpicking?
Does the commit message accurately reflect the commit, or are there any gaps?
Posted Aug 8, 2025 9:32 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
That's not a “first-time user working with LLVM” (LLM, I assume?). That's an “experienced kernel developer trying an LLM”. A first-time user request would be more like “here's the spec for that hardware that I have, write a driver for it”. And then the resulting mess is sent to the maintainer, warts, bugs and all.
Posted Aug 8, 2025 9:43 UTC (Fri)
by jepsis (subscriber, #130218)
[Link]
Sure. Good example. It would have been good to have that sentence checked by AI, as it would likely have corrected it.
Posted Aug 8, 2025 12:33 UTC (Fri)
by kleptog (subscriber, #1183)
[Link]
The step after that would be an LLM iterating over an AST so it doesn't have to worry about getting the syntax right, but I haven't read about that yet. It's not clear to me if that technology even exists yet.
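The building blocks for that do exist in some languages; here is a minimal sketch in Python (purely as a stand-in language, not anything a current kernel tool does) of what "editing the AST rather than the text" looks like:

    import ast

    SRC = "def scale(values, factor):\n    return [v * factor for v in values]\n"

    class RenameFactor(ast.NodeTransformer):
        # Rename the parameter 'factor' to 'multiplier' wherever it appears.
        def visit_arg(self, node):
            if node.arg == "factor":
                node.arg = "multiplier"
            return node

        def visit_Name(self, node):
            if node.id == "factor":
                node.id = "multiplier"
            return node

    tree = RenameFactor().visit(ast.parse(SRC))
    print(ast.unparse(tree))  # ast.unparse() needs Python 3.9+; the regenerated
                              # source is syntactically valid by construction

A tool working at this level can still produce semantic nonsense, but it cannot produce code that fails to parse, which is the property being asked for.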
Posted Aug 8, 2025 8:43 UTC (Fri)
by khim (subscriber, #9252)
[Link] (1 responses)
LLMs do precisely the opposite: they make the first-ever patch look better than your average patch, but they make it harder for a newcomer to “eventually graduate into "real" kernel development”. That's precisely the issue with current AI: degradation of output. LLMs don't have a world model, and when you try to “teach” them they start performing worse and worse. To compensate, their makers feed them terabytes, then petabytes, of human-produced data… but that well is almost exhausted; there is simply no more data to feed into them. And this scaling only improves the initial output; it does nothing about the lack of a world model and of the ability to learn during the dialogue. Worse: as we know, when an ape and a human interact, the human turns into an ape, not the other way around. The chances are high that the story with LLMs will be the same: when complete novices try to use LLMs to “become kernel developers”, they will become more and more accepting of LLM flaws instead of learning to fix them. This, too, will increase the load placed on maintainers.
Yes and no. They are fed more and more data, which improves the initial response, but does nothing about the gradual degradation of output when you try to improve it. Sooner or later you hit the “model collapse” threshold and then you have to start from scratch.
So far that hasn't worked at all. LLMs are all too happy to generate nonsense output instead of admitting that they don't know how to do something. Given the fact that LLMs tend to collapse when fed their own input (that's why even the most expensive plans don't give you the ability to generate long outputs; instead they give you the ability to request many short ones), this would make the situation worse, not better.
Posted Aug 8, 2025 15:46 UTC (Fri)
by laurent.pinchart (subscriber, #71290)
[Link]
An interesting study on that topic: "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task" (https://arxiv.org/abs/2506.08872)
Posted Aug 8, 2025 6:49 UTC (Fri)
by 奇跡 (subscriber, #178623)
[Link]
Posted Aug 8, 2025 9:15 UTC (Fri)
by rgb (subscriber, #57129)
[Link] (3 responses)
Posted Aug 10, 2025 11:16 UTC (Sun)
by abelloni (subscriber, #89904)
[Link]
Posted Aug 10, 2025 14:45 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (1 responses)
Not if it affects the TYPE of bug that is in the code! As I think someone else pointed out, AIs and humans make different sorts of bugs. And if you don't know whether it was an AI or a human, it either (a) makes review much harder, or (b) makes missing things much more likely.
Having seen some AI code (that I was given) I wasn't impressed. It did the job, but it wasn't what I would have expected from someone who knew our coding style.
At the end of the day, I'm all for "no surprises". Who cares if it's an AI or a person. What matters is that it's declared, so the next guy knows what he's getting.
Cheers,
Posted Aug 11, 2025 7:47 UTC (Mon)
by kleptog (subscriber, #1183)
[Link]
But then it's easy right? "Doesn't match our coding style" is a perfectly valid reason to reject a patch.
I believe I got it from the PostgreSQL lists: after your patch the code should look like it's always been there.
Arguably, if new code doesn't follow the coding style (which is much broader than just where to put whitespace) then the author has not yet understood the code well enough to be submitting. Which covers the LLM case perfectly.
Posted Aug 9, 2025 7:45 UTC (Sat)
by gray_-_wolf (subscriber, #131074)
[Link] (4 responses)
This is interesting. I thought pretty much all the LLMs these days place additional restriction(s?), in particular, that you cannot use the output to improve another LLM. Ignoring the hypocrisy of slurping all of GitHub and then putting this rule on their products, how does that work with submitting the code to the kernel?
I basically see only two possibilities. Either the submitter just cannot license the code as GPL-2.0 due to the restriction above, or they are risking their subscription to the LLM with every patch submission.
What am I missing here?
> and that said code does not incorporate any pre-existing, copyrighted material.
And how exactly am I supposed to ensure this?
Posted Aug 9, 2025 9:39 UTC (Sat)
by jepsis (subscriber, #130218)
[Link] (2 responses)
Patches to the Linux upstream are always derivative works of Linux and therefore fall under the GPLv2. In most cases, authors of patches or patch sets to Linux cannot claim separate copyright, and they typically do not meet the threshold of originality. Using any tools or AI does not change this.
Of course, if someone submits an entirely new and unconventional file system like 'VibeFS', copyright issues might arise. However, it is still highly unlikely that such a contribution would be approved, regardless of the tools used.
Posted Aug 14, 2025 13:00 UTC (Thu)
by rds (subscriber, #19403)
[Link] (1 responses)
Disney just decided to not use machine generated images of an actor (Dwayne Johnson) because of concerns over the copyright status of the film.
Posted Aug 14, 2025 14:10 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
But the output of an LLM is based on the (copyrighted) material fed in. If the material that went in is copyrighted, saying "only material created by people ..." does not mean that what comes out of an LLM is copyright-free. All it means is that the LLM cannot add its own copyright to the mix.
This is very clear in the European legislation, which says it's perfectly okay for an LLM to hoover up copyrighted material to learn from (exactly the same as a human would!), but makes no statement whatsoever as to whether the output is copyrightable or a derivative work (just like a human!)
So assuming your statement is correct, US legislation says nothing whatsoever about whether the output of an LLM is copyrighted or not. All it says is that any *original* work by an LLM cannot be copyrighted.
Cheers,
Posted Aug 11, 2025 13:12 UTC (Mon)
by cesarb (subscriber, #6266)
[Link]
Not all of them. For instance, Qwen3 (https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507 and others) uses the Apache 2.0 license, and DeepSeek-R1 (https://huggingface.co/deepseek-ai/DeepSeek-R1) uses the MIT license (though see the note in that page about its distilled variants, the license depends on the model used as the base).
Posted Aug 10, 2025 11:05 UTC (Sun)
by alx.manpages (subscriber, #145117)
[Link]
"Everyone knows that debugging is twice as hard as writing a program in the first place. So if you're as clever as you can be when you write it, how will you ever debug it?"
— Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.
The people defending that LLMs might make it easy for new programmers (which are otherwise unable to contribute code) to contribute code, somehow expect those new programmers to be able to review the code produced by an LLM?
And for people that are already good programmers, will this reduce the work? Or will it increase it?
You've changed the task of authoring code --in which case you often self-restrict to a set of coding standards that significantly reduce the possibility of bugs--, to the task of reviewing code --which by nature is already twice as hard--, and fully unrestricted, because you can't trust an LLM to consistently self-restrict to some rules. The bugs will appear in the most unexpected corners.
Even for reviewing my own code, I wouldn't use an LLM. Reason: it might let two bugs pass for each one it catches, and I might have a false feeling of safety. I prefer knowing the limits of my deterministic tools, and improve them. And finding quality reviewers. That's what it takes for having good code.
Abandon all hope, ye who accept LLM code.
Posted Aug 13, 2025 19:04 UTC (Wed)
by mirabilos (subscriber, #84359)
[Link]
It’s a slippery slope starting by supporting slop sliding into the source.
Tool use should be tagged in-tree
> Jakub Kicinski argued that the information about tools was "only relevant during the review", so putting it into patch changelogs at all "is just free advertising" for the tools in question.
This strikes me as an oddly myopic take. Are the drawbacks of such "free advertising" not trivial compared to the obvious auditing/analysis benefits of documenting tool use in tree?
Don't Ask, Don't Tell
At the end of the day, a human is the author of the patch. He or she is responsible for the content and is also the point of trust that can hold or break.
How they came up with the code, what tools they used, might be interesting, but not more than what school they went to or what other projects they are working on. It's tangential in the end.