On the use of LLM assistants for kernel development

By Jonathan Corbet
August 7, 2025

By some appearances, at least, the kernel community has been relatively insulated from the onslaught of AI-driven software-development tools. There has not been a flood of vibe-coded memory-management patches — yet. But kernel development is, in the end, software development, and these tools threaten to change many aspects of how software development is done. In a world where companies are actively pushing their developers to use these tools, it is not surprising that the topic is increasingly prominent in kernel circles as well. There are currently a number of ongoing discussions about how tools based on large language models (LLMs) fit into the kernel-development community.

Arguably, the current round of debate began with this article on a presentation by Sasha Levin at the Open Source Summit North America in June; his use of an LLM to generate a kernel patch came as a surprise to some developers, including the maintainer who accepted that patch. Since then, David Alan Gilbert has posted a patch proposing requirements for the disclosure of LLM use in kernel development. Levin has posted a series of his own focused on providing configurations for coding assistants and guidelines for their use. Both of these submissions have provoked discussions ranging beyond their relatively narrow objectives.

Gilbert suggested the use of a new patch tag, Generated-by, to identify a tool that was used to create a kernel patch; that tag would be expected not just for LLM-generated patches, but also patches from long-accepted tools like Coccinelle. Levin, instead, suggests using the existing Co-developed-by tag, but takes pains to point out that an LLM should not add the Signed-off-by tag that normally is required alongside Co-developed-by. Either way, the suggestion is the addition of information to the tags section of any patch that was generated by an LLM-based tool.

A step back

While much of the discussion jumped directly into the details of these patches, some developers clearly feel that there is a more fundamental question to answer first: does the kernel community want to accept LLM-developed patches at all? Vlastimil Babka responded that Levin's patch set was "premature", and that there was a need to set the rules for humans to follow before trying to properly configure LLMs:

So without such policy first, I fear just merging this alone would send the message that the kernel is now officially accepting contributions done with coding assistants, and those assistants will do the right things based on these configuration files, and the developers using the assistants don't need to concern themselves with anything more, as it's all covered by the configuration.

Lorenzo Stoakes said that "an official kernel AI policy document" is needed first, and suggested that it would be best discussed at the Maintainers Summit (to be held in December). He agreed with Babka that merging the patches in the absence of such a policy would be equivalent to a public statement that LLM-generated patches are welcome in the kernel community.

A number of developers expressed concerns that these tools will be used to generate patches that are not understood by their submitters and which may contain more than the usual number of subtle bugs. David Hildenbrand worried that he would end up dealing with contributors who simply submit his questions to the tool that generated the patch in the first place, since they are unable to explain the code on their own. He also pointed out the policy adopted by the QEMU project, which essentially bans LLM-generated contributions in that project. Al Viro described LLM-based tools as "a force multiplier" for the numerous developers who have, for years, been submitting machine-generated patches that they don't understand.

Mark Brown, instead, suggested that these tools will be used regardless of the kernel policy:

I'm also concerned about submitters just silently using this stuff anyway regardless of what we say, from that point of view there's something to be said for encouraging people to be open and honest about it so it can be taken into consideration when looking at the changes that get sent.

Levin's point of view is that the current policy for the kernel is that "we accept agent generated contributions without any requirements beyond what applies to regular humans"; his objective is to work out what those extra requirements should be. It should also be noted that some developers clearly feel that these tools are helpful; Kees Cook, for example, argued against any sort of ban, saying it would be "not useful, realistic, nor enforceable". Elsewhere, he has commented that "the tools are finally getting interesting".

Disclosure

If the kernel project were to ban LLM-generated code, then the rest of the discussion would be moot, but that would appear to be an unlikely outcome. If one assumes that there will be (more) LLM-generated code entering the kernel, a number of questions come up, starting with disclosure of tool use. Both Gilbert and Levin propose the addition of patch tags to document this use. A couple of developers disagreed with that idea, though; Konstantin Ryabitsev said that this information belongs in the cover letter of a patch series, rather than in the tags. That is how code generated by tools is described now, and he did not see a reason to change that practice. Jakub Kicinski argued that the information about tools was "only relevant during the review", so putting it into patch changelogs at all "is just free advertising" for the tools in question.

The consensus view, though, would appear to be in favor of including tool information in the patch itself. Cook, who initially favored keeping tool information out of the tags, later acknowledged that it would be useful should the need come to track down all of the patches created by a specific tool. Steve Rostedt said that this information could be useful to find patterns of bugs introduced by a specific tool. Laurent Pinchart noted that formalized patch tags would be useful for tracking down any copyright-related problems as well. Gilbert commented that disclosure "lets the people who worry keep of track what our mechanical overlords are doing".

If one takes the position that tool use must be disclosed, the next question is inevitably: where should the line be drawn? Levin asked whether the use of a code-completion tool requires disclosure, for example. Others have mentioned using compiler diagnostics to find problems or the use of language-sensitive editors. There is clearly a point where requiring disclosure makes no sense, but there does not, yet, appear to be a consensus on where that point is. One possible rule might be this one suggested by Rostedt: "if AI creates any algorithm for you then it must be disclosed".

Meanwhile, Levin's first attempt to disclose LLM usage with a Co-developed-by tag drew an amused response from Andrew Morton, who seemingly had not been following this conversation. Hildenbrand responded that a new tag, such as Assisted-by, would be more appropriate; Ryabitsev has also made that suggestion.

Copyright and responsibility

The copyright status of LLM-generated code is of concern to many developers; if LLM-generated code ends up being subject to somebody's copyright claim, accepting it into the kernel could set the project up for a future SCO-lawsuit scenario. This, of course, is an issue that goes far beyond the kernel community and will likely take years of court battles worldwide to work out. Meanwhile, though, maintainers will be asked to accept LLM-generated patches, and will have to make decisions long before the legal processes have run their course.

Levin pointed to the generative-AI guidance from the Linux Foundation, saying that it is the policy that the kernel community is implicitly following now. In short, this guidance suggest that developers should ensure that the tool itself does not place restrictions on the code it generates, and that said code does not incorporate any pre-existing, copyrighted material. Levin suggested using this document as a starting point for judging the copyright status of submissions, but that guidance is only so helpful.

Michal Hocko asked how maintainers can be expected to know whether the conditions suggested in that "quite vague" guidance have been met. Levin's answer reflects a theme that came up a few times in the discussion: that is what the Signed-off-by tag applied by the patch submitter is for. By applying that tag, the submitter is indicating that the patch is a legitimate contribution to the kernel. As with any other patch, a contributor needs to be sure they are on solid ground before adding that tag.

That reasoning extends beyond just copyright status to responsibility for the patch at all levels. Rostedt suggested documenting that a signoff is also an indication that the submitter understands the code and can fix problems with it. Viro said that, for any patch regardless of origin, "there must be somebody able to handle active questioning" about it. Levin added that: "AI doesn't send patches on its own - humans do", so it is the human behind the patch who will ultimately be responsible for its contents.

The reasoning makes some sense, but may not be entirely comforting to nervous maintainers. The people submitting LLM-generated patches are not likely to be in a better position to judge the copyright status of that work than maintainers are. Meanwhile, maintainers have had to deal with patches from contributors who clearly do not understand what they are doing for many years; documenting that those contributors must understand the output from coding tools seems unlikely to slow down that flood. Hildenbrand expressed his concern this way: "We cannot keep complaining about maintainer overload and, at the same time, encourage people to bombard us with even more of that stuff". Based on what has been seen in other areas, it would not be surprising to see an order-of-magnitude increase in the flow of low-quality patches; indeed, Greg Kroah-Hartman said that it is already happening.

More discussion

The end result is that the question of how to incorporate LLM-based development tools into the kernel project's workflow is likely to feature prominently in community discussions for some time. While these tools may bring benefits, including finding patterns that are difficult for humans to see and the patient generation of test code, they also have the potential to bring copyright problems, bugs, and added maintainer stress. The pressure to use these tools is not going away, and even the eventual popping of the current AI bubble seems unlikely to change that.

Within a few milliseconds of the posting of the call for topics for the 2025 Maintainers Summit, there were two separate proposals (from Stoakes and Jiri Kosina) on the issue of AI-based tools in the kernel workflow; they have sparked discussions that will surely have progressed significantly by the time this article is published. One does not, it seems, need an LLM to generate vast amounts of text. This conversation is, in other words, just beginning.

Index entries for this article
Kernel	Development tools

Practical use of LLMs

Posted Aug 7, 2025 22:17 UTC (Thu) by jepsis (subscriber, #130218) [Link] (23 responses)

I don’t understand the extreme concern about using tools. LLMs are particularly effective for language-related tasks - obviously. For example, they can proof-read text, generate high-quality commit messages, or at least provide solid drafts. Grammar checkers have been used for decades, and LLMs are simply the next step forward. They are especially beneficial for non-native English speakers.

LLMs are not so strong for programming, especially when it comes to creating something totally new. They usually need very limited and specific context to work well. Still, they can help in reviewing your own code. I use them mostly for litmus-testing ideas and checking if something could work in practice.

Practical use of LLMs

Posted Aug 7, 2025 23:02 UTC (Thu) by willy (subscriber, #9762) [Link] (19 responses)

With every tool comes an army of people who use them without understanding their limitations (either the limitations of the tool or the limitations of the user). Then they send the results to the overloaded maintainer who has to decide whether to apply it.

The problem the Linux kernel has is not that we have too few patches to review!

Practical use of LLMs

Posted Aug 8, 2025 4:34 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (18 responses)

There are goods and bads there. Well trained LLMs can help humans avoid the mistakes that are often encountered in the first review, and in that sense reduce the number of round trips. On the other hand, having more clueless people sign-off an LLM-generated patch without understanding it is going to add noise.

IMHO mandating that patches disclose it when an assistant was involved is good, at least for the copyright reasons. And this can help reviewers decide whether or not they're interested in reviewing something which looks like junk (because it sometimes is), and even decide to simply ignore certain submitters who only provide such low-quality patches. It can also save some reviewer time to figure that it's probably not worth explaining in finest details what is wrong if the submitter is not going to understand.

We also need to think forward. In 5-10 years, there will be little distinction between AI assistants and coworkers, and we can hardly request from a submitter more when involving an AI assistant than what is required when they get help from a friend or coworker.

What is really not acceptable is submitters not able to discuss their changes, whether it comes from an LLM, they stole it from another project or they had someone else write it for them. But if they used an AI-assisted editor which properly formatted the patch and fixed typos in comments, maybe ran a first-level review or helped pick suitable variable names, and stuff that generally improves the quality and reduces the reviewer effort, I don't see a reason to reject this.

Practical use of LLMs

Posted Aug 8, 2025 8:01 UTC (Fri) by khim (subscriber, #9252) [Link] (17 responses)

> In 5-10 years, there will be little distinction between AI assistants and coworkers

No. There huge distinction: coworkers can be taught something LLMs would be forever clueless to your objections and requests.

You can not teach LLM anything.

Well… technically you can, of you train the new model with a [slightly] different properties – but that's not what contributors or coworkers would be able to afford any time soon.

And that means that we should insist that contributors (and not maintainers) should shoulder the responsibility of fixing the warts LLMs would add to their submissions again and again.

Practical use of LLMs

Posted Aug 8, 2025 14:57 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (16 responses)

> You can not teach LLM anything.

Note that I purposely said "AI assistants" not "LLM". LLMs are dumb because they're only based on language and I anticipate that future generations will come leveraging more of the multi-modal approach and will be more flexible like humans, by involving multiple senses at once. In addition I'm pretty sure we'll start to imitate the way we currently function with short-term and long-term memory with conversion phases that we call "sleep" in our cases. LLMs can already do quite impressive things and that's unfortunately why people believe they're very smart. But they can be impressively limited and dumb as well sometimes. Note that to be honest we all know super dumb humans with which we gave up trying to explain certain things. I fear the day we'll have to run interviews to AI assistants to decide if they're smart enough to be hired...

Practical use of LLMs

Posted Aug 8, 2025 15:17 UTC (Fri) by khim (subscriber, #9252) [Link] (15 responses)

> In addition I'm pretty sure we'll start to imitate the way we currently function with short-term and long-term memory with conversion phases that we call "sleep" in our cases.

That's already the case: when LLM model is trained it “remembers” facts its “long-term memory”.

The only problem: it's entirely cost-prohibitively to offer mode where one may train LLM and later its long-term memory.

In fact it's even prohibitively expensive to run existing models, too, thus pretty soon we would see serious regressions in all these tools capabilities.

An era of “AI assistants” that you describe would come 10-20 years down the road, maybe even later.

Practical use of LLMs

Posted Aug 8, 2025 15:52 UTC (Fri) by anselm (subscriber, #2796) [Link] (14 responses)

when LLM model is trained it “remembers” facts its “long-term memory”.

Not really. LLMs don't deal in facts, they deal in probabilities, as in “what word is most likely to complete this partial sentence/paragraph/text?” These probabilities can be skewed in various ways through training and prompting, but it is important to keep in mind that to an LLM, the world is essentially alphabet soup – it has no underlying body of abstract factual knowledge from which it could draw logical conclusions like we humans do.

LLMs can certainly produce results that seem impressive, but in the long run they're probably just a side branch on the path to actual AI. If you ask Sam Altman he will tell you that OpenAI is only a year or so (and a few tens of billions of dollars) away from “artificial general intelligence”, but he's been doing that for years now and it's very hard to see how that would work given what they've been doing so far.

Practical use of LLMs

Posted Aug 8, 2025 16:08 UTC (Fri) by khim (subscriber, #9252) [Link] (13 responses)

> Not really. LLMs don't deal in facts, they deal in probabilities, as in “what word is most likely to complete this partial sentence/paragraph/text?”

Yes, but that's entirely different kettle of fish: humans have the world models, in fact they start form in human brain before humans learn to speak, starting from peekaboo and hide-and-seek games.

LLMs doesn't have anything remotely similar to that, that's why they couldn't say “I don't know how to do that”: humans say that when their world model shows the “hole” and LLMs couldn't do that since there are no world model, it's all probabilities all the way down.

In rare cases where says “I don't know” or “this thing is probably doesn't exist” (it happens sometimes, if rarely) they simply found it highly probably, based on their training set, that this response would be, probably, the most appropriate one.

The only “memory” LLM have are related to probabilities… that doesn't mean that there are long-term memory, it just means that it's different from what humans have.

> If you ask Sam Altman he will tell you that OpenAI is only a year or so (and a few tens of billions of dollars) away from “artificial general intelligence”, but he's been doing that for years now and it's very hard to see how that would work given what they've been doing so far.

That's yet another kettle of fish. I think AGI would soon be relegated to the annals of history, anyway: it's obvious that pure scaling wouldn't give anything similar to “average human worker” any time soon – and AGI is something that have marketable appeal only in that “scale is all you need” world.

If we would be forced to match human capabilities by slowly and painstakingly adding more and more specialized modules, then AGI loses it's appeal: is still achievable – but somewhere in XXII or XXIII century where last 0.01% of something that human was doing better than pre-AGI system is conquered.

By that time our AI is so drastically superhuman at everything else that saying that we have reached AGI no longer makes sense. It's more of “ oh yeah, finally… at arrived… what else is new?” moment, rather that something to talk about.

Practical use of LLMs

Posted Aug 8, 2025 20:45 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (12 responses)

The main problem with "matching humans" is that they'll have to pass by empathy, emotions, self-conciousness and some may even develop their own religions etc. Then at this point we'll have laws explaining how it's inhumane to treat an AI assistant badly by making it work endlessly and ignoring its suffering. So in the end these assistants will end up being new workers with all the same limitations and problems as other ones in the real world and will not solve that many issues for enterprises, except that they'll eat more power ;-)

Practical use of LLMs

Posted Aug 8, 2025 21:09 UTC (Fri) by khim (subscriber, #9252) [Link] (10 responses)

> The main problem with "matching humans" is that they'll have to pass by empathy, empathy, self-conciousness and some may even develop their own religions etc.

Surprisingly enough that's already covered. Existing chatbots don't have “empathy”, “empathy” or “self-conciousness”, but they imitate them well enough to achieve pretty disturbing results. And I'm pretty sure they would do a superb job working as missionaries for various religious sects. No problem there at all: nefarious uses of LLMs scale surprisingly well.

LLMs fail utterly when long chains of logical reasoning is needed, though.

> So in the end these assistants will end up being new workers with all the same limitations and problems as other ones in the real world and will not solve that many issues for enterprises, except that they'll eat more power ;-)

Highly unlikely. In fact the biggest obstacle to the use of LLMs is the fact that people try to apply what they have learned from books and movies about how “sentient robots” would behave over last century or so. Which is understandable, but also incredibly wrong.

In books and movies “sentient robots” are always logical, correct and precise and it's a big problem for them to express emotion or simulate empathy… in real world LLMs can do all these things that “sentient robots” from all these countless books and movies struggled pretty easily… what they couldn't do are things that people expect them to do: logical reasoning, precision, reproducibility…

That's another thing that plagues the whole industry: what all the presentations and demos portray and “sell” and what CEO expect to buy… are these “sentient robots” from movies. What they get… is something entirely different, something totally unsuitable for the role where “sentient robots” would fit perfectly.

That's why Klarna rehires people back, and IBM hires people for “critical thinking” focused domains and Duolingo puts people back… it's all because LLMs are the total opposite from “sentient robots” in movies.

If you read a summary papers written by press-people then you would hear how people are rehired because robots “lack empathy” or “emotions”, but that's a big fat lie: robots have more than enough empathy and emotions, spammers just simply love that aspect of LLMs… what they lack are “common sense” and “logic”.

Practical use of LLMs

Posted Aug 8, 2025 22:41 UTC (Fri) by Wol (subscriber, #4433) [Link]

> what they lack are “common sense”

Given that "common sense" isn't common, and rarely makes sense, maybe that's just as well!!!

Cheers,
Wol

Practical use of LLMs

Posted Aug 9, 2025 5:57 UTC (Sat) by wtarreau (subscriber, #51152) [Link] (8 responses)

I don't know how robots behave in movies (I don't have a TV) but I'm well aware that you cannot expect from an LLM to be precise/exact, because it should not be seen as a sequential computer program that can be debugged and made reliable, but as something trying to imitate our brain with many inter-connections based on what was learned, probabilities and noise. I.e. in order to think like us they have to be as unreliable. Their value however is in having abilities to directly use computer-based tools without having to physically move fingers, so they can use calculators and web search faster than us. But the risk of inaccurately recopying a result remains non-null... like with humans who can get distracted as well.

Practical use of LLMs

Posted Aug 9, 2025 9:09 UTC (Sat) by excors (subscriber, #95769) [Link]

> Their value however is in having abilities to directly use computer-based tools without having to physically move fingers, so they can use calculators and web search faster than us.

One example I've seen is giving ChatGPT 5 - which was announced as having "PhD-level intelligence" - the prompt "Solve: 5.9 = x + 5.11". When I repeated it myself, 50% of the time it said x=0.79, and 50% of the time it said x=-0.21.

In both cases it gave a superficially reasonable step-by-step explanation, saying things like "5.90 - 5.11 = (5.90 - 5.11) = 0.79 but since 5.90 is less than 5.11 in the hundredths place, the result will be negative: x = -0.21". That's nonsense, but it's confidently-stated half-correct nonsense, which engenders undeserved levels of trust.

In theory the system could make use of external calculators and powerful, reliable, human-designed algebraic tools. In practice it doesn't - it does probabilistic calculations on language tokens, resulting in something that sounds like a mathematical calculation but actually isn't, making it untrustworthy for even trivial tasks like this. (And somehow this is worth half a trillion dollars.)

Practical use of LLMs

Posted Aug 9, 2025 9:42 UTC (Sat) by khim (subscriber, #9252) [Link] (2 responses)

> I.e. in order to think like us they have to be as unreliable.

Nope. It order to think you need reliable world model somewhere under all these words. Half-century old SHRDLU may think while ChatGPT-5 couldn't.

Sure, humans make mistakes (especially when they are distracted), but they may also notice them automatically and fix them. Thus doesn't work with LLMs, in fact if you try to push them they become even less accurate, then when they are not “thinking”.

> as something trying to imitate our brain with many inter-connections based on what was learned, probabilities and noise

That's not imitation of human brain, though. That's imitation of insect brain or, maybe, a chimps brain (although a chimps have world model even if they are less complicated than humans world model). It's pure reaction with nothing to control the “train of though” and to stop it from derailing.

The best illustration to what is happening with “reasoning” LLMs is picture from Wikipedia in the article Taylor series where it shows “sin x and its Taylor approximations by polynomials of degree 1, 3, 5, 7, 9, 11, and 13 at x = 0”.

It's very easy to see that as “as the degree of the Taylor polynomial rises, it approaches the correct function” – but if you actually look on picture you'll notice how it does that: it becomes ever more precise in the small, but growing area around zero, but, simultaneously, also become ever more absurdly wrong in the area around that central part.

And that's what is happening with LLMs: they are becoming ever more impressive at “one-shotting” things, yet. simultaneously, ever more helpless with attempts to handle a long series of tasks.

This is similar to how very small kids behave, but eventually human learns to double-check and self-verify things… LLMs couldn't learn that, they simply have no mechanisms suitable for that.

The latest fad in AI is to attach “tools” to LLMs and hope that Python interpreter would be able to work a reliable replacement for a world model. It wouldn't work: this would slightly expand area where LLMs would be able to “one-shot” things, but wouldn't fix the fundamental flaw in their construction.

Practical use of LLMs

Posted Aug 9, 2025 10:32 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

> That's not imitation of human brain, though. That's imitation of insect brain or, maybe, a chimps brain (although a chimps have world model even if they are less complicated than humans world model). It's pure reaction with nothing to control the “train of though” and to stop it from derailing.

It's not an imitation of ANY brain. Think about it. The brain has a lot of dedicated hardware, be it visual recognition, auditory recognition, whatever. And a small veneer of general purpose hardware over the top. AI runs on pure general purpose hardware.

And has been pointed out, a lot of the brain's special-purpose hardware is survival-ware - if the hardware gets it wrong, it's likely to end up as a lion's lunch, or whatever ...

Cheers,
Wol

Practical use of LLMs

Posted Aug 9, 2025 10:40 UTC (Sat) by khim (subscriber, #9252) [Link]

> The brain has a lot of dedicated hardware, be it visual recognition, auditory recognition, whatever.

Isn't that GPT-5 “tools” and voice recognition in Gemini Live is for?

> AI runs on pure general purpose hardware.

Not really. It can be run, in theory, on general purpose hardware, but it's not clear if GPT-5 run on general purpose hardware would be at all practical.

Even if you just think about BF16… it's pretty specialized thingie.

> And has been pointed out, a lot of the brain's special-purpose hardware is survival-ware - if the hardware gets it wrong, it's likely to end up as a lion's lunch, or whatever ...

Sure, but do we actually use that hardware where we are writing code? Somehow I doubt it. It's like arguing that LLM couldn't write good code because it doesn't have liver… sure, liver is very important for human, but lack of liver is not what stops LLM from being good software designer.

Practical use of LLMs

Posted Aug 11, 2025 10:11 UTC (Mon) by paulj (subscriber, #341) [Link] (3 responses)

> I'm well aware that you cannot expect from an LLM to be precise/exact, because it should not be seen as a sequential computer program that can be debugged and made reliable,

The problem is (invent plausible stat and confidently handwave it about - highly appropriate in a thread on LLMs!) 99.999% of the populace doesn't know this, and lack the combination of technical background, curiosity and time to come to understand this. They think - because the hype machine (the effective combination of companies wanting to sell stuff and non-technical media jumping on pushing the latest buzz) has told them so - that this stuff is "intelligent" and will solve all problems.

Practical use of LLMs

Posted Aug 12, 2025 11:48 UTC (Tue) by wtarreau (subscriber, #51152) [Link]

Absolutely!

But in addition starting to be careful about LLMs also teaches people to be careful of other people looking too smart. There is a huge confusion between knowledge and intelligence in general. Lots of people use the term "smart" or "intelligent" to describe a very knowledgeable person, and consider that someone lacking culture is "dumb". But I've seen people who, once explained the details of a problem, would suggest excellent ideas on how to solve them. *This* is intelligence. Those who only know everything and cannot use it except to look smart in conversations are just parrots. Of course it's way better when you have the two at once in the same person, and often smart people like to learn a lot of new stuff. But each profile has its uses. Right now LLMs solve only one part of the deduction needed for intelligence, and know a little bit of everything but nothing deeply enough to express a valid opinion or advice. Yes most people (as you say, 99.999% to stick with this thread) tend to ask them advices and opinions on stuff they are expected to know well since coming from the internet, but that they only superficially know.

Practical use of LLMs

Posted Aug 12, 2025 17:16 UTC (Tue) by raven667 (subscriber, #5198) [Link] (1 responses)

I was saying on Mastodon the other day that when the OpenAI board tried to oust Sam Altman after he offered ChatGPT to the public that was probably the right call, they already had tested and knew the weaknesses of LLM models as a path to "AGI", but once the hype-train had left the station it proved impossible to stop. It's unlikely that any of this "AI" fever will work out to our betterment in the long run, and it's going to be used to destroy trillions of dollars of human capital in the meantime, at a time when we probably cannot afford it.

Practical use of LLMs

Posted Aug 12, 2025 17:21 UTC (Tue) by pizza (subscriber, #46) [Link]

> and it's going to be used to destroy trillions of dollars of human capital in the meantime, at a time when we probably cannot afford it.

To me it's not the "destruction" of so much [human] capital but the wasted/squandered opportunities.

Practical use of LLMs

Posted Aug 11, 2025 10:06 UTC (Mon) by paulj (subscriber, #341) [Link]

Asimov has (at least) 1 story around this. E.g. Bicentennial Man (later made into a film, with Robbin Williams as the robot Andrew).

Practical use of LLMs

Posted Aug 8, 2025 4:34 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

This post by Terry Tao was an interesting read: https://mathstodon.xyz/@tao/114915604830689046

Basically, you want the LLM to "red team" you and continuously *review* the code you're writing rather than writing code itself. I suspect we'll need some better way to interact than just prompt/response as (in order to reduce token count explosions), you want to feed diffs at intervals. If an LLM understood time (highly unlikely), it might even be able to "see" where you're going and possibly help create test cases or the like. If *that* could be fed into an "LSP" that annotates my source as I'm working, that is *much* closer to having `clang-tidy` or `clippy` point out issues as I'm developing (which I already have).

Practical use of LLMs

Posted Aug 29, 2025 14:30 UTC (Fri) by nim-nim (subscriber, #34454) [Link]

Grammar checkers rely on a grammar, a formal description of language that has been checked and rechecked and proofed (sometimes at the state level), and is not legally-encumbered.

LLMs are over-engineered plagiarism automatons that have no opinion on the correctness of the stuff they are plagiarising, except it should trigger strong reactions (because the field relies on advertiser money). It’s GIGO on a massive scale, with some checks added post-facto to limit the amount of garbage that spills out. No one has checked that every bit of content that has been used to train an LLM is correct, right, proper, good, free of legal encumbrances, etc.

That’s the core difference and why LLM output requires human review.

Practical use of LLMs

Posted Aug 29, 2025 15:04 UTC (Fri) by Wol (subscriber, #4433) [Link]

> LLMs are particularly effective for language-related tasks - obviously.

> LLMs are not so strong for programming,

Programming is very much a language-related task. So which is it, LLMs are particularly effective for programming, or LLMs are useless at language? You can't have it both ways!

And has been pointed out, LLMs are very capable at chucking out text that is simultaneously extremely plausible, and complete bullshit. THAT is the problem.

The problem we want to solve isn't language, it's communication. And with absolutely no concept of comprehension or truth, LLMs are a serious liability.

That said, LLMs are good at cleaning text up round the edges - until this eager-beaverness of all the peddlers of this rubbish actually gets seriously in the way of actually doing what you want to! I'm sick to death of Acrobat's desperation to "Let me summarise this document for you", when I'm actually looking for the *detail* I need which a summary will pretty much inevitably remove. The same with Sheets and Gemini - if I need detail to solve a problem, the LAST thing I need is Artificial Idiocy trying to summarise what I'm looking at!

Cheers,
Wol

Lower bar to start kernel development?

Posted Aug 7, 2025 22:57 UTC (Thu) by SLi (subscriber, #53131) [Link] (8 responses)

> Hildenbrand expressed his concern this way: "We cannot keep complaining about maintainer overload and, at the same time, encourage people to bombard us with even more of that stuff". Based on what has been seen in other areas, it would not be surprising to see an order-of-magnitude increase in the flow of low-quality patches; indeed, Greg Kroah-Hartman said that it is already happening.

What would be the likelihood of a first submitter submitting a non-low-quality patch without an LLM?

This seems potentially both good and bad. I'd argue: *If* this lowers the bar and makes more people eventually graduate into "real" kernel development, that's not purely negative.

But sure, there's something qualitatively new about this; it's not just kids sending code as a Word document.

Actually, I would predict that we will outgrow this problem in a way that will hugely annoy some and make others decide it's not a problem. LLMs are still improving fast. Sure, there's always people who wouldn't see value in them and would claim they are no good even if they outperformed humans.

I think we will reach in not distant future a point where the LLMs will do well enough that most of their work will be thought to be made by a competent hyena. (I mean human, but I love that autocorrect.)

I think this would mean more pragmatically LLMs growing to the level where they are at least ok at kernel development, but also pretty good at knowing what they are not good at.

But if you want to ease maintainer burden, maybe make an LLM review patches where LLM contributed (I personally find it silly to say that an LLM "authored" something, just like I don't say code completion authored something). And then forward them to an LLM maintainer, who asks TorvaLLM to pull them. And have them argue if there should be a disclosure if unreliable humans touched the patch.

Lower bar to start kernel development?

Posted Aug 8, 2025 6:24 UTC (Fri) by gf2p8affineqb (subscriber, #124723) [Link] (5 responses)

That is quite speculative. The recent rate of growth is no useful indicator for where it will eventually plateau. And right now LLMs are definitely subpar at many programming tasks (also other tasks).

Lower bar to start kernel development?

Posted Aug 8, 2025 7:43 UTC (Fri) by Wol (subscriber, #4433) [Link] (3 responses)

The problem is if the LLM doesn't get it right FIRST TIME, and the person using the LLM doesn't have the experience to recognise this, then it's pretty much assured the result is going to be rubbish. And the chances of this happening with first time submitters is quite high.

At the end of the day, most of the stuff on the net is rubbish. The quality of what an LLM outputs is directly correlated to the quality that goes in (it must be, without human review and feedback, it has no clue). Therefor, most LLM output has to be rubbish, too.

If your AI is based on a SMALL Language Model, where the stuff fed in has been checked for accuracy, then the results should be pretty decent. I don't use AI at all (as far as I know, the AI search engine slop generally has me going "what aren't you thinking !!!"), but my work now has a little AI that has access to all our help docs and thus does a decent job for most people - except that as always, people don't think, and people keep getting referred to Guru docs for more detail - HINT roughly 1/3 of the company doesn't have access to Guru, as a matter of policy!!! Argh!!!

Cheers,
Wol

Lower bar to start kernel development?

Posted Aug 8, 2025 9:04 UTC (Fri) by jepsis (subscriber, #130218) [Link] (2 responses)

I don’t see any issue with a first-time user working with LLVM.

Here are some examples of useful prompts:

Is the naming of functions and variables consistent in this subsystem?

Are the comments sufficient, or should they be added to or improved?

If I were to submit this upstream, what aspects might attract nitpicking?

Does the commit message accurately reflect the commit, or are there any gaps?

Lower bar to start kernel development?

Posted Aug 8, 2025 9:32 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

That's not “ first-time user working with LLVM” (LLM, I assume?). That's “experienced kernel developer trying LLM”.

First time user request would be more of “here's the spec for that hardware that I have, write driver to it”. And then the resulting mess is sent to maintainer, warts, bugs and all.

Lower bar to start kernel development?

Posted Aug 8, 2025 9:43 UTC (Fri) by jepsis (subscriber, #130218) [Link]

That's not “ first-time user working with LLVM” (LLM, I assume?). That's “experienced kernel developer trying LLM”.

Sure. Good example. It would have been good to have that sentence checked by AI, as it would likely have corrected it.

Lower bar to start kernel development?

Posted Aug 8, 2025 12:33 UTC (Fri) by kleptog (subscriber, #1183) [Link]

One of the most promising developments I see for code LLMs is the switch to a more diffusion style. So rather than generating a sequence of words, it's working on a grid of characters like your editor and iterating over that in the same way that stable-diffusion produces images. This means it has a chance of "seeing" the code as a whole and making changes globally. And so noticing mismatches between different parts and aligning them in the next iteration. So more like how people code.

The step after that would be an LLM iterating over an AST so it doesn't have to worry about getting the syntax right, but I haven't read about that yet. It's not clear to me if that technology even exists yet.

Lower bar to start kernel development?

Posted Aug 8, 2025 8:43 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> I'd argue: *If* this lowers the bar and makes more people eventually graduate into "real" kernel development, that's not purely negative.

LLMs do precisely the opposite: they make first ever patch look better than your average patch, but they make it harder for a newcomer to “eventually graduate into "real" kernel development”.

That's precisely the issue with current AI: degradation of output. LLMs don't have a world model and when you try to “teach” them they start performing worse and worse. To compensate their makers feed them terabytes, then petabytes of himan-produced data… but that well is almost exhausted, there are simply no data to feed into these. And this scaling only improves the initial output, it does nothing to the lack of the world model and ability to learn during the dialogue.

Worse: as we know than when ape and human interact human turns into ape, not the other way around.

The chances are high that story with LLMs would be the same: when complete novices would try to use LLMs to “become a kernel developers” they would become more and more accepting to LLM flaws instead of learning to fix them. This, too, would increase load placed on maintainers.

> LLMs are still improving fast.

Yes and no. They are feed more and more data, which improves the initial response, but does nothing to gradual degradation of output when you try to improve it.

Sooner or later you hit the “model collapse” threshold and then you have to start from scratch.

> but also pretty good at knowing what they are not good at.

So far that haven't worked at all. LLMs are all too happy to generate nonsense output instead of admitting that they don't know how to do something.

> maybe make an LLM review patches where LLM contributed

Given the fact that LLMs tend to collapse when feed their own input (that's why even most expensive plans don't give you the ability to generate long outputs, instead they give you the ability to request many short ones) – this would make the situation worse, not better.

Lower bar to start kernel development?

Posted Aug 8, 2025 15:46 UTC (Fri) by laurent.pinchart (subscriber, #71290) [Link]

> LLMs do precisely the opposite: they make first ever patch look better than your average patch, but they make it harder for a newcomer to “eventually graduate into "real" kernel development”.

An interesting study on that topic: "Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task" (https://arxiv.org/abs/2506.08872)

Tool use should be tagged in-tree

Posted Aug 8, 2025 6:49 UTC (Fri) by 奇跡 (subscriber, #178623) [Link]

Jakub Kicinski argued that the information about tools was "only relevant during the review", so putting it into patch changelogs at all "is just free advertising" for the tools in question.

This strikes me as an oddly myopic take. Are the drawbacks of such "free advertising" not trivial compared to the obvious auditing/analysis benefits of documenting tool use in tree?

Don't Ask, Don't Tell

Posted Aug 8, 2025 9:15 UTC (Fri) by rgb (subscriber, #57129) [Link] (3 responses)

At this point in time I think the only reasonable stance is a don't ask, don't tell policy, which is also basically the status quo anyhow.
At the end of the day, a human is the author of the patch. He or she is responsible for the content and also the point of trust that can hold or break.
How they came up with the code, what tools they used, might be interesting, but not more than what school they went to or what other projects they are working on. It's tangential in the end.

Don't Ask, Don't Tell

Posted Aug 10, 2025 11:16 UTC (Sun) by abelloni (subscriber, #89904) [Link]

This is not true, there is a difference between a patch made to silence a static checker and one for an issue that was actually seen in the field.

Don't Ask, Don't Tell

Posted Aug 10, 2025 14:45 UTC (Sun) by Wol (subscriber, #4433) [Link] (1 responses)

> How they came up with the code, what tools they used, might be interesting, but not more than what school they went to or what other projects they are working on. It's tangential in the end.

Not if it affects the TYPE of bug that is in the code! As I think someone else pointed out, AIs and humans make different sorts of bugs. And if you don't know whether it was an AI or a human, it either (a) makes review much harder, or (b) makes missing things much more likely.

Having seen some AI code (that I was given) I wasn't impressed. It did the job, but it wasn't what I would have expected from someone who knew our coding style.

At the end of the day, I'm all for "no surprises". Who cares if it's an AI or a person. What matters is that it's declared, so the next guy knows what he's getting.

Cheers,
Wol

Don't Ask, Don't Tell

Posted Aug 11, 2025 7:47 UTC (Mon) by kleptog (subscriber, #1183) [Link]

> Having seen some AI code (that I was given) I wasn't impressed. It did the job, but it wasn't what I would have expected from someone who knew our coding style.

But then it's easy right? "Doesn't match our coding style" is a perfectly valid reason to reject a patch.

I believe I got it from the PostgreSQL lists: after your patch the code should look like it's always been there.

Arguably, if new code doesn't follow the coding style (which is much broader than just where to put whitespace) then the author has not yet understood the code we'll enough to be submitting. Which covers the LLM case perfectly.

Generative-AI guidance from the Linux Foundation

Posted Aug 9, 2025 7:45 UTC (Sat) by gray_-_wolf (subscriber, #131074) [Link] (4 responses)

> In short, this guidance suggest that developers should ensure that the tool itself does not place restrictions on the code it generates,

This is interesting. I though pretty much all the LLMs these days place additional restriction(s?), in particular, that you cannot use the output to improve another LLM. Ignoring the hypocrisy of slurping all of GitHub and then putting this rule on their products, how does that work with submitting the code to the kernel?

I basically see only two possibilities. Either the submitter just cannot license the code as GPL-2.0 due to the restriction above, or they are risking their subscription to the LLM with every patch submission.

What am I missing here?

> and that said code does not incorporate any pre-existing, copyrighted material.

And how exactly am I supposed to ensure this?

Generative-AI guidance from the Linux Foundation

Posted Aug 9, 2025 9:39 UTC (Sat) by jepsis (subscriber, #130218) [Link] (2 responses)

> I basically see only two possibilities. Either the submitter just cannot license the code as GPL-2.0 due to the restriction above, or they are risking their subscription to the LLM with every patch submission.

Patches to the Linux upstream are always derivative works of Linux and therefore fall under the GPLv2. In most cases, authors of patches or patch sets to Linux cannot claim separate copyright, and they typically do not meet the threshold of originality. Using any tools or AI does not change this.

Of course, if someone submits an entirely new and unconventional file system like 'VibeFS', copyright issues might arise. However, it is still highly unlikely that such a contribution would be approved, regardless of the tools used.

Generative-AI guidance from the Linux Foundation

Posted Aug 14, 2025 13:00 UTC (Thu) by rds (subscriber, #19403) [Link] (1 responses)

The GPL is based upon copyright. Only material created by people can currently be copyrighted (in the USA at least). Using LLM's runs the risk of destroying the license of any code its used on.

Disney just decided to not use machine generated images of an actor (Dwayne Johnson) because of concerns over the copyright status of the film.

Generative-AI guidance from the Linux Foundation

Posted Aug 14, 2025 14:10 UTC (Thu) by Wol (subscriber, #4433) [Link]

> Only material created by people can currently be copyrighted (in the USA at least).

But the output of an LLM is based on the (copyrighted) material fed in. If the material that went in is copyrighted, saying "only material created by people ..." does not mean that what comes out of an LLM is copyright-free. All it means is that the LLM cannot add its own copyright to the mix.

This is very clear in the European legislation, which says it's perfectly okay for an LLM to hoover up copyrighted material to learn from (exactly the same as a human would!), but makes no statement whatsoever as to whether the output is copyrightable or a derivative work (just like a human!)

So assuming your statement is correct, US legislation says nothing whatsoever about whether the output of an LLM is copyrighted or not. All it says is that any *original* work by an LLM cannot be copyright.

Cheers,
Wol

Generative-AI guidance from the Linux Foundation

Posted Aug 11, 2025 13:12 UTC (Mon) by cesarb (subscriber, #6266) [Link]

> I though pretty much all the LLMs these days place additional restriction(s?), in particular, that you cannot use the output to improve another LLM.

Not all of them. For instance, Qwen3 (https://huggingface.co/Qwen/Qwen3-235B-A22B-Instruct-2507 and others) uses the Apache 2.0 license, and DeepSeek-R1 (https://huggingface.co/deepseek-ai/DeepSeek-R1) uses the MIT license (though see the note in that page about its distilled variants, the license depends on the model used as the base).

AI generated code is not useful

Posted Aug 10, 2025 11:05 UTC (Sun) by alx.manpages (subscriber, #145117) [Link]

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."

— Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.

The people defending that LLMs might make it easy for new programmers (which are otherwise unable to contribute code) to contribute code, somehow expect those new programmers to be able to review the code produced by an LLM?

And for people that are already good programmers, will this reduce the work? Or will it increase it?

You've changed the task of authoring code --in which case you often self-restrict to a set of coding standards that significantly reduce the possibility of bugs--, to the task of reviewing code --which by nature is already twice as hard--, and fully unrestricted, because you can't trust an LLM to consistently self-restrict to some rules. The bugs will appear in the most unexpected corners.

Even for reviewing my own code, I wouldn't use an LLM. Reason: it might let two bugs pass for each one it catches, and I might have a false feeling of safety. I prefer knowing the limits of my deterministic tools, and improve them. And finding quality reviewers. That's what it takes for having good code.

Abandon all hope, ye who accept LLM code.

This is bad.

Posted Aug 13, 2025 19:04 UTC (Wed) by mirabilos (subscriber, #84359) [Link]

Why is Levin still a kernel developer? Why were his questionable commits not reverted? Why do Linux, Debian and OpenBSD not have a hard stance against any uses of the eso-fascist worldburning plagiarism machine like Gentoo and NetBSD do? Why do at least some of them let some of the slop in? Why is rsyslog still in Debian main?

It’s a slippery slope starting by supporting slop sliding into the source.