
Forking Vim to avoid LLM-generated code

By Daroc Alden
April 15, 2026

Many people dislike the proliferation of Large Language Models (LLMs) in recent years, and so make an understandable attempt to avoid them. That may not be possible in general, but there are two new forks of Vim that seek to provide an editing environment with no LLM-generated code. EVi focuses on being a modern Vim without LLM-assisted contributions, while Vim Classic focuses on providing a long-term maintenance version of Vim 8. While both are still in their early phases, the projects look to be on track to provide stable alternatives — as long as enough people are interested.

The Vim project has had a policy on the use of LLMs since December 2025: code generated with assistance from LLMs is acceptable, so long as the use is disclosed and the code matches the style of existing Vim code. NeoVim, the long-term fork of Vim focused on refactoring the code to be more maintainable and extensible, has a similar policy. These policies may have been added too late, however. In November 2025, Brian Carbone claimed (in a comment that is now hidden for being off-topic) that a contributor to both projects had probably been using an LLM in their recent contributions, many of which predate the policy.

Vim maintainer Christian Brabandt didn't think that assessment was fair, but by that point the horse may have already left the stable. The contributor never confirmed whether the contributions Carbone listed were LLM-assisted or not, but the ensuing discussion ended up deciding that the project would be fine with using LLMs. Newer contributions from Brabandt and others have openly included LLM assistance, ranging from the trivial (fixing a regex) to the security-critical (handling composing Unicode characters securely). At least seven such commits have gone into Vim itself, while 22 have been included in NeoVim at the time of writing.

EVi

EVi was forked in March by "NerdNextDoor" from Vim v9.1.0, released in January 2024. As such, it supports most new Vim features, including Vim9 script. The version to fork from was chosen to balance having recent Vim features available for compatibility while probably predating any unknown LLM-driven contributions. While there could in theory have been LLM-assisted commits prior to 2024, the community springing up around the fork deemed that unlikely.

The real challenge for any fork, however, is attracting an actual community to the project, given that many people will prefer to use upstream Vim. EVi looks to be on track to do that, with 13 contributors adding 86 commits in the past month. Vim itself had 214 commits from 54 contributors during the same period. Most of the development work up to this point has been concerned with changing the various places in the program that refer to the name "Vim", but a handful of bug fixes and backports have gone in as well.

Vim Classic

Vim Classic, on the other hand, was forked (also in March) from Vim 8.2.0148, the last version before the introduction of Vim9 script. In the blog post announcing the fork, Drew DeVault explained that he chose a version without Vim9 script because it was still new when Bram Moolenaar, Vim's original creator, passed away in 2023. DeVault felt that Vim Classic would struggle to find the resources to keep up with the work that has been done on Vim9 script since then, and having a buggy, incompatible version would be a disservice to users.

DeVault has backported a handful of patches from the main Vim project to fix security problems and minor bugs. That is also how he means to go on with Vim Classic: focusing on long-term maintenance over adding new features or changing things. That backporting makes it a little difficult to tell exactly how much active work there is on Vim Classic. Patches from 18 authors have made it into the repository, but almost half of the patches were authored by Moolenaar and have been backported. The development mailing list is not very active, but does have some participants, with 65 messages in the few weeks since the fork's announcement.

Prospects

Neither EVi nor Vim Classic has had a formal release yet, but both projects seem to be gearing up to make a release in the near future. That's an important first step, but building a fork up into a durable, separate project is a difficult prospect. The main thing a fork needs, in order to grow a supporting community, is a group of people who prefer the direction of the fork, even in the face of a slower pace of development and less community support; it would not be surprising if either project failed to scrape together the necessary enthusiasm to become viable in the long term.

On the other hand, people can feel strongly about their text editors. For the kinds of people who use Vim, it is not hard to imagine that they spend nearly as much time interacting with Vim as interacting with the rest of their operating system. That's certainly the case for my relationship with Emacs. That kind of time-investment makes it easy to feel connected to one's tools in a way that isn't true of other software. The Vim forks have a natural stream of work in common, in the form of backporting LLM-free fixes for security problems, so some people may choose to contribute to both. Also, the Vim community has supported a long-term fork before, in the form of NeoVim. It may be reasonable to expect the projects to come to resemble previous forks based around excluding a technology, such as Devuan, the systemd-free fork of Debian. Devuan is supported by a core group of enthusiasts who keep the project going, but generally follows the Debian project's lead in areas other than init systems.

LLM-assisted contributions are coming to a lot of open-source projects, from the kernel to the browser, and even to good old-fashioned text editors like Vim. Avoiding LLM-generated software entirely seems like it is fast becoming a relative impossibility. But the open-source-software community was formed by the conviction that people have the right to adapt the software they use for their own needs, and this case is no different: for those Vim users who feel strongly that LLMs should not intrude on the code of their editor, there are options. Whether other projects will head down a similar path is unclear: only time will tell.




As with drinks

Posted Apr 15, 2026 14:44 UTC (Wed) by burki99 (subscriber, #17149) [Link]

We can expect every open-source project to come in regular, light, and zero brands (what sugar or alcohol is to some people, LLMs are to others).

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 14:59 UTC (Wed) by cen (subscriber, #170575) [Link] (55 responses)

LLMs have gotten very good at writing, assisting, reviewing code and finding vulnerabilities. In some tasks they are already better than your average developer. As a contributor and maintainer of a bunch of very niche open source projects, I find them a godsend. In a sole-maintainer role, with nobody else to review my contributions, an LLM is at least a good enough reviewer to keep me in check.

Projects with strict anti-LLM policy will probably be stagnating and left behind. Also, good luck attracting younger contributors who'll likely work with AI assist from day 1.

This to me all looks very similar to the no-binary-blobs purists. Sure, you can deblob a distro, but does it actually run on any relevant hardware? At a certain point practicality wins over purist political views.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 15:11 UTC (Wed) by jpeisach (subscriber, #181966) [Link]

This. Exactly this.

I think it's more "are you using AI appropriately" and not just blindly trusting output (and also testing it thoroughly).

Unfortunately, as for things like environmental concerns - well we were already doomed, this will just bring more attention to the environment :)

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 15:12 UTC (Wed) by pizza (subscriber, #46) [Link] (24 responses)

> Also, good luck attracting younger contributors who'll likely work with AI assist from day 1.

We don't want "younger" contributors, we want *competent* ones that understand the problem space.

Otherwise it's like going to a gym with a forklift. Sure, the weights are going up and down, but how exactly is that supposed to get you into shape?

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 16:08 UTC (Wed) by cen (subscriber, #170575) [Link] (8 responses)

You can use AI assist and also be (somewhat) competent in the space. If a project has a blanket ban on any AI assisted code then what is such a person supposed to do? They'll either try to hide their usage of AI and hope it goes through or just move somewhere else.

I fully agree that just contributing without having a clue what you're actually doing is bad and not what you want but you can have sane AI policies around that.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 17:11 UTC (Wed) by pizza (subscriber, #46) [Link] (7 responses)

> If a project has a blanket ban on any AI assisted code then what is such a person supposed to do?

At the risk of stating the obvious... maybe that person should not do that?

> I fully agree that just contributing without having a clue what you're actually doing is bad and not what you want but you can have sane AI policies around that.

So far this year I'm looking at 0/4 with respect to "AI assisted" patch sets. The submitters don't understand things well enough to have their LLMs fix the glaring correctness problems (of the "tries writing to hardware that the LLM completely hallucinated" variety), much less deal with more subtle issues [2]. Actually, "don't understand things" is the charitable interpretation; I suspect the reality is that they simply can't be bothered to put in the additional effort [1]. Either way the result is the same -- they ghost us and the patches eventually get flushed.

[1] representing an order or two of magnitude more than they put into the initial submission.
[2] eg regressions with different target devices/settings

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 18:56 UTC (Wed) by Paf (subscriber, #91811) [Link] (6 responses)

Yes, and this is bad llm usage. I use llms every day for my development, with care to observe and inspect the outputs. And it works well, with unusual blind spots and problems you have to get used to and guard against.

It has allowed me to make useful merged contributions to several projects I would not ordinarily touch. (The maintainers were happy. But I engaged, didn’t ghost anyone, and labeled my code as llm generated (I do not personally feel this will be necessary long term, but I think it is helpful currently).)

Overall, I predict a blanket ban will quickly come to look like banning planks cut with a particular kind of saw. But it is certainly easy to use the things to let you not understand anything, and that doesn’t end well.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 8:05 UTC (Thu) by elru (subscriber, #170791) [Link] (5 responses)

I agree completely with this. The alternative is to have a more senior programmer of the project work with me, someone I can ask questions as I work on the code, and then eventually put in a PR after working on it for a few rounds.

And where will I find that senior programmer who has the free time and patience for me to fumble and struggle through, and iterate along side with, especially late at night? I don’t want to waste anyone’s time, and everyone is busy enough as it is. If I’m going to ask a question, I’d rather have exhausted all my available tools at my disposal first.

And an LLM is such a tool I’ll use. I can iterate on code with more contextual feedback than generic compiler errors, and then I can test it and continually iterate upon it until I’m confident enough to submit a PR. It doesn’t replace the senior members on the team, no way, but it frees them up from having to help me until I’ve gone through more iterations. By the time I submit that PR, I have read, rewritten, broken it, and fixed it enough times that I know the code I’m submitting well enough for a review.

And LLMs are far from perfect. They’ll suggest constant changes for the sake of changes, and then sometimes confabulate functions that don’t even exist. But like any other tool, you learn to take what you need from them, or you improve the tool.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 11:21 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (2 responses)

> The alternative is to have a more senior programmer of the project work with me, someone I can ask questions as I work on the code, and then eventually put in a PR after working on it for a few rounds.

Not all projects have the bandwidth to be teachers and that is entirely ok.

And teaching a person can be worthwhile because people learn. Spending time hand holding an LLM contribution is a complete waste of time since models don't learn this way.

It might be the case that you are not competent enough to contribute to some projects. That is fine and there is no shame. Nobody is omniscient. Wasting people's time with LLMs won't solve much.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 11:45 UTC (Thu) by pizza (subscriber, #46) [Link] (1 responses)

> And teaching a person can be worthwhile because people learn. Spending time hand holding an LLM contribution is a complete waste of time since models don't learn this way.

...Hand holding a person submitting an LLM contribution represents real, actual *work* from both parties, and both have to be willing to undertake that asymmetrical effort.

> It might be the case that you are not competent enough to contribute to some projects. That is fine and there is no shame. Nobody is omniscient. Wasting people's time with LLMs won't solve much.

I think this is akin to "to meaningfully contribute to this project you need to understand calculus" whereas the submitter is at the "doesn't understand algebra" stage (or heck, "doesn't understand basic arithmetic"). While F/OSS maintainers are usually willing to help teach the domain specific stuff, it's hard to fault them for not being willing/able to individually teach years of foundational theory.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 15:39 UTC (Thu) by elru (subscriber, #170791) [Link]

I agree with what you both said, but I suppose the angle I’m coming from is that using an LLM and improving a learner’s understanding are not mutually exclusive. This is no replacement for studying the source material, nor for proper empirical testing of code.

I think the concept of ‘vibe coding’ has unfortunately taken over many things, but it’s not the only way to use an LLM. I don’t think we have to abandon a high-rigor technical understanding when we use these new tools. That said, I’m not on the receiving end of open source PRs, so perhaps the reality is starkly different.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 11:29 UTC (Thu) by pizza (subscriber, #46) [Link] (1 responses)

> By the time I submit that PR, I have read, rewritten, broken it, and fixed it enough times that I know the code I’m submitting well enough for a review.

For every one that shares your attitude, there are at least three orders of magnitude more folks that unquestioningly use/accept/submit the first iteration that appears to vaguely "work" [1]. And they are utterly incapable of iterating beyond that, if they even care to do so.

[1] ie it compiles (albeit with a large pile of warnings) and doesn't immediately catch fire.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 15:23 UTC (Thu) by elru (subscriber, #170791) [Link]

But then isn’t banning LLMs the wrong answer, in that case? The policy should be to not submit code that you yourself do not understand, shouldn’t it? I think the core issue that you describe is people submitting code over which they have no agency.

And that’s “agency” in not only the legal context, but also the practical context of simply having the ability to iterate upon it because they lack understanding. Methodology and technical rigor can be, and should be, demanded from contributors regardless of whether or not an LLM is used.

I can certainly acknowledge that LLMs enable this bad behavior though. After all, the code wouldn’t even exist if not for LLMs in the cases you described, but I do worry that the approach to ban them wholesale is throwing the baby out with the bath water.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 18:40 UTC (Wed) by nas (subscriber, #17) [Link] (14 responses)

> Otherwise it's like going to a gym with a forklift. Sure, the weights are going up and down, but how exactly is that supposed to get you into shape?

I don't think that's a great analogy. It's more like someone comes to a construction site with a forklift and starts moving blocks around with it. Then the old timers say "we only allow moving blocks around with human muscles" and "how else are you going to get strong?". To stretch the analogy further, you have bozos coming into the job site with forklifts, flinging blocks around randomly, making a huge mess, because they have no knowledge of how the construction is supposed to work or where the blocks should actually go. The old timers learned the ropes while developing the muscles; when they were weak and inexperienced, they couldn't do much damage.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 20:39 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (6 responses)

There is more than one potential issue with using LLMs, and they all tend to get jumbled together as "LLM bad". In addition to the two you mention (vibe coders clogging projects with low-quality LLM-coded contributions and LLM use preventing coders from developing their skills), there are serious worries about the copyright status of LLM-generated code and whether it exposes projects to legal risk. I think all of these questions can be answered with time, and the objections will either be accepted as valid (decreasing LLM use) or invalid (allowing LLM use).

One issue that probably won't just go away is the question of the morality of creating LLMs. Many people still oppose LLMs in general because they depend so heavily on massive use of copyrighted material for their training, much of it obtained illegitimately. The only way of fixing that objection is to figure out how to train LLMs on much smaller data sets. I think that's probably something the LLM designers need to do anyway, but it's going to be a serious moral problem until the training set problem is fixed.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 6:37 UTC (Thu) by ibukanov (subscriber, #3942) [Link] (5 responses)

LLMs are not fundamentally different from the first neural networks from over 60 years ago. The real difference is hardware that allows training on huge data sets. They literally need all human knowledge to work even in specialized areas.

So I am sceptical that we will have models that can be trained on smaller datasets but with the power of big LLMs anytime soon.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 15:02 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (4 responses)

> They literally need all human knowledge to work even in specialized areas.

I'm not sure how true that is. I am absolutely not an expert, but my impression is that the big chatbot LLMs have been trained on absolutely everything because the companies making them want the ultimate generalist that's able to work equally well in any field of knowledge, not because that level of knowledge is actually necessary for specific purposes. For example, an AI coding assistant needs some level of language training so it can understand natural language, but it doesn't necessarily need to have been trained on every novel and movie script to get there.

I think there are two reasons the LLM companies might have taken the generalist approach. One is that they made a strategic decision that it's easier to train a single generalist model than dozens of specialists. I can definitely see the appeal of that approach, though I don't have the technical background to know if it's the right choice. The other possible reason is that they actually want a true generalist, both because they dream of AGI and because it's great marketing. LLM companies really seized public attention by promising their bots were going to be able to be PhD-level smart at every task, and that probably wouldn't have happened if they had presented separate models that functioned as coding assistants, customer service agents, and writing advisors.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 15:42 UTC (Thu) by kleptog (subscriber, #1183) [Link] (3 responses)

> One is that they made a strategic decision that it's easier to train a single generalist model than dozens of specialists

And the fact that it is fairly straightforward to turn a generalist model into a specialist, but the other way round is impossible. Specialising a general model is a service people pay for.

There's also the argument that a coding assistant doesn't need to know the classics. Any natural language contains all sorts of idioms, metaphors, etc that it's going to have to understand to be useful to users. We tell educated people they should read certain classics so they understand where certain phrases, idioms, jokes, etc come from. The meaning of "gaslighting" is much more understandable if you know the story that created the term. A dictionary definition doesn't really cover it. So arguably an LLM should be trained with these classics so it "understands" the idioms.

The next step will be models that attempt to separate the language recognition, like metaphors and idioms, from the underlying knowledge base. I think it's an interesting experiment, since it's not something we can test with humans. It's not at all clear this is possible, so the attempts will teach us something useful.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 16, 2026 22:53 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (2 responses)

> Any natural language contains all sorts of idioms, metaphors, etc that it's going to have to understand to be useful to users. We tell educated people they should read certain classics so they understand where certain phrases, idioms, jokes, etc come from. The meaning of "gaslighting" is much more understandable if you know the story that created the term. A dictionary definition doesn't really cover it. So arguably an LLM should be trained with these classics so it "understands" the idioms.

I'm not sure how much people actually need to know the original sources to understand the references. I'm willing to bet that most of the people who talk about "gaslighting" have never watched Gaslight, but they understand what it means because they've encountered it enough to pick it up. An interesting aspect of this is that idioms shift precisely because people using them lack the original context. For example, the term "sea change" is originally from Shakespeare's "The Tempest", where it was used quite literally to mean a change caused by the sea, but it has taken on a different, idiomatic meaning of a substantial change. Because of those shifts, you would probably learn more about how idioms are used in practice by observing ordinary usage rather than going back to the origins.

One of the interesting things about learning the classics is that the core idea behind "the classics" is curation. Humans have a limited capacity to absorb material, if only because there are only so many hours in a human lifespan, so we have no hope of reading every book and watching every movie and TV show. Instead, we try to find the most important works so we can maximize the value we get from our effort. That's almost the opposite approach from that taken by LLMs, which seem to be trying to take in absolutely everything, good, bad, and indifferent. I do wonder if LLMs wouldn't be better if their trainers applied some curation to their training material. They would likely be better programmers if they learned exclusively from good code rather than bad, for instance.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 17, 2026 9:02 UTC (Fri) by paulj (subscriber, #341) [Link]

> For example, the term "sea change" is originally from Shakespeare's "The Tempest", where it was used quite literally to mean a change caused by the sea,

Having been dragged out sailing a lot as a kid, I always thought that just referred to how the state of the sea around you can sometimes suddenly change quite substantially - a sea change. The state of the sea, and how it is changing, is a fairly regular topic of conversation when you're out on boats at sea.

I strongly suspect Shakespeare's use is just drawing from routine mariners' talk.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 17, 2026 9:10 UTC (Fri) by kleptog (subscriber, #1183) [Link]

> I'm willing to bet that most of the people who talk about "gaslighting" have never watched Gaslight, but they understand what it means because they've encountered it enough to pick it up.

Judging by the number of people I see using it incorrectly I'm not sure this works. This is of course how language drifts so not really a problem. You don't need to watch the movie though, just the one paragraph synopsis is enough.

> That's almost the opposite approach from that taken by LLMs, which seem to be trying to take in absolutely everything, good, bad, and indifferent. I do wonder if LLMs wouldn't be better if their trainers applied some curation to their training material.

This is the trend though. It has been fairly clear for a while that throwing more training data at the problem doesn't improve models measurably. All the improvements now are going to come from architecture and curating the training data.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 8:41 UTC (Mon) by autious (subscriber, #114303) [Link] (6 responses)

Also the forklift is composed of stolen parts.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 11:15 UTC (Mon) by smurf (subscriber, #17840) [Link] (5 responses)

> Also the forklift is composed of stolen parts.

No it's not. It might have been built by somebody who examined a whole lot of machinery, in various states of (dis)repair, and some of which they don't own.

However, I'd contend that, for the most part, LLMs are trained on publicly-available code. If they "steal" by learning and applying patterns that are useful, then so do you and me.

I'm more concerned about the fact that LLMs don't have taste, thus the "states of disrepair" problem. They copy bad patterns, outdated workarounds, obscure XY-problem "solutions" and whatnot, from Stack Overflow and neglected git archives and forum pages etc., along with the proven and up-to-date ones. This shows in their output.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 24, 2026 11:19 UTC (Fri) by autious (subscriber, #114303) [Link] (4 responses)

I'd argue that "learning" is a semantic trick, and ask whether people's views would be different if everyone were instead talking about "encoding".

If LLMs are viewed as a database that encodes data and stores it in a novel way, is that transformative enough? Where between training on video and generating clips or just switching codecs of a video-stream does it stop being transgressive?

And what the LLMs are trained on isn't exactly public information, and I suspect it's a case of training on anything they can get access to. That would be rational in a gold rush such as this, where every little minor edge is a massive financial success. Large companies hire smaller companies for datasets, small companies do everything they can to create useful datasets. So I don't even think they themselves know or track what they've ingested.

As an example, GitHub has not explicitly answered the question of whether they train on private repos, only that they do not train on enterprise repos. So I don't trust that it's cleanly filtered on "publicly available" source.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 24, 2026 11:55 UTC (Fri) by malmedal (subscriber, #56172) [Link]

> "learning" is a semantic trick

I'll argue that LLM learning is extremely human-like. Specifically it's an almost exact match for a human learning a new language. All the quirks attributed to LLMs, need for endless repetition, hallucinations, catastrophic forgetting etc. are also things that normally happen when you try to learn a new language.

The difference is that humans clearly have additional machinery that works differently, but the language bits work pretty much the same.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 24, 2026 17:43 UTC (Fri) by bluca (subscriber, #118303) [Link] (2 responses)

> If LLM's are viewed as a database that encodes data and stores in it a novel way, is that transformative enough? Where between training on video and generating clips or just switching codecs of a video-stream does it stop being transgressive?

In the bit where the law says it does. All training is done in derogation to copyright law, provided the datasets are publicly available and don't explicitly opt out.
On the other hand if _you_ add copyrighted data that you shouldn't touch to the local prompt/context, then _you_ as the user are doing something wrong. Just like if you copied and pasted from stack overflow without attribution it's not your browser's or your IDE's fault, it's all yours.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 26, 2026 4:46 UTC (Sun) by smurf (subscriber, #17840) [Link] (1 responses)

> All training is done in derogation to copyright law, provided the datasets are publicly available and don't explicitly opt out.

I don't understand that sentence. Don't you mean "except when" and not "provided"?

> Just like if you copy and pasted from stack overflow without attribution it's not your browser's or your IDE's fault, it's all yours.

But if I transform and use that copy in my own work it's OK, and so is me learning from it and using that information.

Why should a machine that does the exact same thing, and/or a user who prompts it to do so, be illegal?

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 27, 2026 1:38 UTC (Mon) by bluca (subscriber, #118303) [Link]

> > All training is done in derogation to copyright law, provided the datasets are publicly available and don't explicitly opt out.
> I don't understand that sentence. Don't you mean "except when" and not "provided"?

No? The only exceptions are if the dataset is private and obtained illegally (eg: that company that got caught torrenting stuff), or if the owner of the dataset explicitly opted out (unless the training is for educational/research purposes, in which case there's no opt out). If none of these conditions apply, then anyone is free to use the dataset to train a model in any way they like, as an explicit exception to copyright.

> But if I transform and use that copy in my own work it's OK, and so is me learning from it and using that information.
>
> Why should a machine that does the exact same thing, and/or a user who prompts it to do so, be illegal?

It's not per se, but it becomes a case-by-case situation, depending on the actual details of the individual instance, as there's no explicit permission baked in the law, so one needs to be careful - exactly the same situation whether it's a person or a machine using the materials. For example: is the snippet being copied even copyrightable at all? Or is it "too short" or "too obvious" or "too generic" to even qualify? These questions can't really be answered for the generic case, not in a foolproof way.
Training is different: the law explicitly allows it, so none of that matters, and it's always generally permitted, modulo the exceptions noted above.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 15, 2026 22:00 UTC (Wed) by josh (subscriber, #17465) [Link]

> LLMs have gotten very good at writing, assisting, reviewing code and finding vulnerabilities.

It's one thing to use an LLM to find a vulnerability, so that a human can fix it.

It's another to have LLMs writing code. An LLM-maintained project will quickly become only LLM-maintainable.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 19, 2026 2:31 UTC (Sun) by mirabilos (subscriber, #84359) [Link] (26 responses)

I have two quotes for you:

> A hard ban on LLM/"AI" use in a FLOSS project is the moderate stance.

-- https://come-from.mad-scientist.club/@algernon/statuses/0...

> Folks, I think we're now past the time where we can claim to be in the middle about generative AI usage.
>
> You're either for it and okay with intellectual theft, racism, land grabs, polluted water, higher power bills, and creating an addicted population that can't think.
>
> Or you're against it.

-- https://mstdn.social/@Editrix_Rachel/116369904032087546

If you argue the way you do… well, you’re painting yourself into that corner yourself.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 22:27 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (25 responses)

I really hate the toxic "with us or against us" mentality (note: I hate the *mentality*, not the people who express it).

Yes, AI has problems, many of them quite substantial. Yes, there is a conversation to be had here. But how are we supposed to have that conversation, when you're explicitly polarizing the discussion and demanding that everyone pick one extreme or the other? It is a disingenuous thought-terminating cliché, and I see no point in trying to engage with it.

And yes, I am biased as a Google employee and shareholder (speaking for myself, not my employer). I own a fair amount of RSUs, which will probably lose a lot of value if AI turns out to be a bubble. Even so, I feel deeply conflicted about the technology. But I refuse to turn my brain off and brand the whole enterprise as "good," "evil," or whatever other thing people are labeling it with. It's more complicated than that.

Why is it complicated? For starters, at least half of the complaints people have about AI are at least somewhat applicable to general computing (climate change, data centers), and/or capitalism as a whole. And while I will accept that "no ethical consumption under capitalism" is another disingenuous thought-terminating cliché, we do need to engage with these problems on their own terms.

In the case of climate change, the complaint frankly makes no sense to me at all. There are numerous industrial and commercial processes which use a ton of energy in one form or another. Data centers are one of the *least* concerning of those uses, because they are already electrified and connected to the grid. We already need to swap the grid over to renewables anyway. Sure, increased energy consumption makes that harder, but the whole point of the exercise is to accommodate all current and future electricity needs. We can and should account for growth, because basically all economies are deliberately trying to induce economic growth at basically all times, and it is hardly plausible that we can have significant economic growth without significantly increased power consumption. AI is just the particular means by which the market is expressing that growth. If it wasn't AI, it would be something else, and we'd still have to build more renewables anyway.

Theft, as I've explained before, is a shaky analogy at best and outright misleading at worst. There is a fundamental disconnect over whether copyright is merely a legal instrument, or an ethical right as well. If it is merely a legal instrument, then the answer to this objection is given by whatever the courts say about the matter. If it is an ethical right, then things are far murkier. What is the scope of this right, and how does that scope differ from its legal scope? How do we analyze a proposed invasion of this right, and how does that analysis differ from what judges are already doing? More specifically, can an invasion of this right exist in the absence of substantial similarity, and if so, why? I have yet to see an intellectually serious response to questions such as these, but I'm willing to hear one if it is given.

I'm not going to go through the remaining objections, because this rant is too long already. Instead, I would encourage the reader to think critically and draw your own conclusions.

If I have answers for everything, then why do I feel conflicted? Mostly because of the slop. I don't think there's a great answer for that. But I also don't think that all AI output is necessarily and irredeemably slop. We now know it's somewhat good at finding CVEs, and it is probably good at one or two other things, at a minimum. But flooding the internet with slop feels like a steep price to pay for that.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 22:49 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (20 responses)

> But how are we supposed to have that
> conversation, when you're explicitly polarizing the discussion and
> demanding that everyone pick one extreme or the other?

Which part of “is the moderate stance” did you not understand?

> Why is it complicated? [… lots of supposed reasons …]

Perhaps, but collaborators have found similarly complicated reasons at many points in the past.

Data centres exist, but “AI” data centres are orders of magnitude worse. Mere connectivity is not the point. The amount of electricity required is, and toxic gases (yes! people are getting ill from that!), water pollution, etc. are also problems. I invite you to dig through the list of references I put together, which I’m pasting below, in which there are several that debunk this argument… as well as the others.

And yes, it *is* theft. Unconsensual, DDoSing, theft, with lack of attribution, and worse.

Here’s the references:

Let alone the building of new polluting coal/oil/gas power plants and even of millions-of-years-of-long-term-damage-and-destruction nuclear power plants, including the destruction of "one of the most green and vibrant ecosystems in Iceland", as Baldur noted. https://www.baldurbjarnason.com/2024/your-use-of-ai-harms...

And the eso-fascism of the TESCREAL ideology behind it, active haters of humanity. https://en.wikipedia.org/wiki/TESCREAL ← this especially affects you as a Google person

https://fediscience.org/@jameshowell/116309452763757936 another list

https://unstable.systems/@jneen/116302498173795149 more practically, for those who do want to set aside all the other concerns (notwithstanding what we think about that, as shown above)

https://kolektiva.social/@sidereal/116155034249976161 another practical argument

https://mastodon.scot/@AlienAnomaly/116258275386325907 what every techbro considering using it should know

https://hachyderm.io/@dalias/116150709467968620 on the plagiarism point

https://mastodon.social/@marginalia/116098502707996339 on the brain-rot point

https://mastodon.green/@gerrymcgovern/116125146223980748 on the… ugh… TESCREAL point, I guess, and others:

Big Tech's plan for human genocide and eugenics

"Sam Altman defends AI’s energy toll by saying it also takes a lot to ‘train a human’"

https://social.treehouse.systems/@bodil/116093694360361788 the most glaring point

https://social.treehouse.systems/@jnkrtech/11608377052754... the fascism angle

https://tldr.nettime.org/@w0bb1t/115910853597507208 fashtech network

https://wetdry.world/@lucydev/115912618642468067 it doesn’t democratise anything

https://mastodon.online/@danirabbit/114999615783590491 it’s right-wing anti-labour instead

https://c.im/@cdarwin/115914320802090355 yet more evidence of theft

https://wandering.shop/@xgranade/115760265788529588 more on their dissembling of discussion

https://corteximplant.com/@m4ra/115803691440852788 on the human brain angle

https://infosec.exchange/@reverseics/115622628761543237 a relevant prahou comics, also available as sticker

https://mastodon.social/@nixCraft/115554484108496189 criticism from a founder of modern AI research

https://mastodon.social/@razeback/115531398990093591 on the ecologic angle

https://social.tchncs.de/@kuketzblog/115518890876430898 (🇩🇪) Doctolib puts sensible health data into the slop machine

https://mastodon.green/@gerrymcgovern/114987330170413413 … which is super dangerous

https://helvede.net/@jwcph/115475458905029125 they cannot even summarise!

https://eldritch.cafe/@Tattie/115471948737378653 it’s a trick!

https://mastodon.world/@ApostateEnglishman/11472137601844... more fascist angle

https://mas.to/@mcdutchie/115278780610869693 long-term asbestos

https://toot.cafe/@baldur/115241804708250257 it cannot possibly not confabulate!

https://beige.party/@maxleibman/114520618508217071 it’s confabulation, not hallucination

https://unstable.systems/@sop/114898566686215926 iconic anigif

https://mastodon.green/@gerrymcgovern/115189490568935553 slavery / unethical preparation of “training data”

https://mastodon.green/@gerrymcgovern/115151695574808432 more of the same

https://mastodon.social/@dahukanna/115178318018668062 “Over four months, LLM users consistently underperformed at neural, linguistic and behavioural levels.”

https://mastodon.green/@gerrymcgovern/115151682173197970 “AI chatbots can distort memory”

https://www.baldurbjarnason.com/2025/trusting-your-own-ju... and you cannot self-experiment with this

https://mastodon.green/@gerrymcgovern/115100758921146954 why data centres pollute so much water

https://social.erambert.me/@eramdam/114309822759993528 oh, btw, alt text doesn’t train “AI”

https://social.chinwag.org/@FediThing/113386276555849839 the e-waste angle

https://awful.systems/comment/6568236 thread (multiple relevant posts) about the eso-fascist angle, with a terrifying relationship to why they want Trump to annex Greenland (my [excerpts] in case it doesn’t load)

https://hachyderm.io/@cczona/114518900536403299 a (older, by now) look at environmental impact, climate, but also contamination

https://link.springer.com/content/pdf/10.1007/s10676-024-... it’s bullshit

https://www.theregister.com/2025/05/12/us_copyright_offic... (thanks Fefe)

US Copyright Office found AI companies sometimes breach copyright. Next day its boss was fired

https://awawa.club/objects/93987903-d4bb-41a0-8d07-361006... what to put into a job application form’s field asking about “AI” use

https://toot.cafe/@baldur/114410495463114687 they’re not tools!

https://mastodon.social/@anon_opin/114406490519883990 except to fraudulently hide how bad you really are at your job

https://aus.social/@jimbob/114406766000950223 on dependency

https://social.marxist.network/@yogthos/114348159778338092 “100+ Meta employees, including Head of AI Policy, confirmed as ex-🇮🇱IDF”

https://mastodon.social/@caseynewton/114265531280810009 bandwidth costs caused by “AI” scrapers illegally and without consent DDoSing everyone

https://mastodon.social/@SteveFaulkner/114217712724605423 it cannot even write alt texts!

https://www.baldurbjarnason.com/2024/openai-whisper-risks/ nor transcribe audio

https://mamot.fr/@krazykitty/114179525603659436 its use leads to significantly more security issues (see also the March-April 2026 leak of the code behind Claude)

https://mastodon.social/@firusvg/114091868795515812 “Microslop CEO admits that AI is generating basically no value”

https://towns.gay/@PedestrianError/114067667937606778 do not even use it to prove it wrong / laugh about it / criticise it, for that ⓐ does significant damage to the environment each time and ⓑ confirms their path

https://mastodon.bits-und-baeume.org/@bits_und_baeume/113... (🇩🇪) another older article on environmental impact

https://mastodon.functional.computer/@samir/1140146319707... the term “AI-generated” for its output is misleading; call it “regurgitated”, for that is what it is

https://toot.cafe/@baldur/114013196130019000 it cannot even summarise! (one would have thought that to be the one thing large language models were good at)

http://www.tesio.it/2021/09/01/a_decompiler_for_artificia... “There is no "learning" in "artificial neural network"”

https://someone.elses.computer/@mikarv/113973564347240002 Microslop knows they’re damaging the next generation

https://mastodon.social/@benroyce/113969545750172492 Fakebook admits use of dozens of Tebibytes of stolen works

https://hachyderm.io/@shafik/113909383572730415 companies notice that people who know more about AI are less likely to react well to use of it

https://mastodon.well.com/@rk/113935547297276697 they cannot produce anything novel (or even exceeding average)

https://mstdn.io/@deutrino/113772182693397843 what use of slop in marketing, headlines/teasers/photos, etc. signals (other than “go away”)

https://toot.cafe/@baldur/113724322984825466 LLMs are a dead-end: “From the perspective of software engineering, current AI systems are unmanageable, and as a consequence their use in serious contexts is irresponsible.”

https://toot.cat/@dwenius/113629651684569770 even one gratis “AI“ computer is unsustainable due to power requirements

https://mstdn.social/@ashleemboyer/113567433741433301 disabled people justifiedly getting angry at slop instead of true accessibility

https://infosec.exchange/@david_chisnall/113554112644010193 “LLMs are the new memory-safety bugs.”

https://www.schneier.com/blog/archives/2024/11/ai-industr... Bruce Schneier on how slopbros ruined the Open Source Initiative

https://social.linux.pizza/@lalle/113370532309892596 one of the (by now several) suicides of children(!) caused by the slop machines

https://toot.aquilenet.fr/@civodul/113370339503788304 just say no

https://toot.cafe/@baldur/113373172447099565 because their use in hospitals to transcribe speech confabulates wrong things about the patients 🙀

https://fosstodon.org/@saramg/113365761968400493 it’s a planet-burning scam in a series of worldburning scams

https://weatherishappening.network/@wordshaper/1133569037... what LLM summaries actually are “useful” for 😹

https://final.town/objects/17ac3f46-2cff-4a53-83dc-bc6253... scammer caught

https://tech.lgbt/@ShadowJonathan/113290617698486637 / https://mastodon.me.uk/@CatherineFlick/113294299997512316 “LLMs cannot do formal reasoning” (we knew)

https://xoxo.zone/@clarity/113289517483316507 it’s slavery all the way down

https://toot.cafe/@baldur/113214687561016279 not even significant productivity gains!

https://mastodon.social/@jawnsy/113192864281867455 “By turning off your lights all day every day for a month, you conserved about 1 percent of the energy needed for AI to generate a picture of a duck wearing sunglasses. Isn’t he cute? Aside from the fact that he has the feet of a human man, of course.”

https://mstdn.social/@rysiek/113108955596753698 be careful with metaphors

https://social.heise.de/@heiseonline/113091056784860072 (🇩🇪) of course it violates copyright

https://mas.to/@gleick/113058537194470078 described as “money laundering for copyrighted data” a year before the python-chardet incident

https://mastodon.social/@bkastl/112961789576564659 (🇩🇪) a university discovers how wrong they were about its utility

https://mas.to/@Techaltar/112940344612631126 “if I see a slop intro/teaser/photo for it, I assume it's low quality spam”

https://post.lurk.org/@emenel/112111014479288871 “remove all your stuff from Microslop Github” (and Microslop LinkedIn, which spies on you every time you load it in a Chromium-based browser, getting thousands of data points, including deeply personal info)

https://fediscience.org/@petergleick/111226207940194391 more on energy use

and of course what I wrote about this on https://mbsd.evolvis.org/permalinks/wlog2021_e20240726.htm and https://evolvis.org/~tg/cc.htm#refs contains a few references as well, below the part where I put interpretation guidelines for those works I CC-licenced because the Creative Commons institution followed the OSI in kowtowing to the fashtech bros

https://social.nouveau.community/@andnull/116335068815668672 the social effect and why individuals that cannot respect anti-llm policies must be removed (thread)

https://mastodon.social/@tef/116426764250277484 more on "AI" directly causing preventable death of children, and impact on brains in general; thread links to multiple studies

https://writing.exchange/@hiisikoloart/116368038680446376 they cannot even do one thing (like removing the background from a drawing) without fucking up the rest, which is forbidden under § 14 UrhG; short thread

https://anatomyof.ai/ was reported to be a fantastic resource as well

https://scholar.social/@olivia/116421199865718260 an open letter to sign, yearly if need be

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 23:29 UTC (Mon) by intelfx (subscriber, #130118) [Link] (16 responses)

> Which part of “is the moderate stance” did you not understand?

If your "moderate stance" is already _that_ much extremist, that isn't a rebuttal you were hoping for. Neither was the overt personal attack.

> even of millions-of-years-of-long-term-damage-and-destruction nuclear power plants

Oh, boy. :-)

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 0:04 UTC (Tue) by mirabilos (subscriber, #84359) [Link] (15 responses)

"oh boy", indeed.

https://norden.social/@grimm/116438691194180273

The completely inacceptable damage to the environment from nuclear waste even not calculated in.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 0:38 UTC (Tue) by pizza (subscriber, #46) [Link] (1 responses)

> The completely inacceptable damage to the environment from nuclear waste even not calculated in.

It is technically (quite) feasible to process nuclear waste into something more inert than the original raw uranium ore. It is even feasible to store completely unprocessed waste indefinitely. Unfortunately, there's no political will to do either (at least in the US) to the point where we're stuck with the worst of all worlds thanks to the rug pull at the Yucca Mountain site.

Meanwhile, on average, coal plants release far more radioactive byproducts into the environment than fission plants do, even factoring in the likes of Chernobyl and Fukushima -- And coal ash piles are responsible for some pretty heinous (yet somehow "acceptable") damage to the environment. Oh, and then there's the numerous geopolitical disasters due to the heavily-subsidized fossil fuel supply chain.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 1:00 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

Yes, it is *technically* possible.

But does anyone do that?

Meanwhile, we shut down coal plants, which are now getting reactivated for “AI” data centres.

Easy “no”.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 2:10 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (11 responses)

> The completely inacceptable damage to the environment from nuclear waste even not calculated in.

That's pure BS. In fact, nuclear is _the_ _only_ type of energy source where waste management is baked into the energy price. And right now it's simply more cost-effective to store the waste in dry casks. It costs very little, and the total amount of high-level waste (things like minor actinides) is negligible. The breakpoint of reprocessing versus storage is more than 100 years in the future.

The price of nuclear might not be, in general, competitive with renewables + storage. But in the particular case of Germany it _is_, and that's why we see so many outright lies from the anti-nuclear activists.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 2:22 UTC (Tue) by mirabilos (subscriber, #84359) [Link] (10 responses)

Have you even read the post?

In the calculation from the post, it isn’t.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 3:06 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (9 responses)

Yes. Have YOU? It says nothing whatsoever about the lifecycle cost of nuclear waste.

We'll leave the totally unrealistic 4 cents per kWh estimate of renewable costs aside.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 3:42 UTC (Tue) by mirabilos (subscriber, #84359) [Link] (8 responses)

Feel free to take it up with those who actually did the study and those who monitored that, instead of claiming the armchair LWN commenter you are knows better.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 4:01 UTC (Tue) by intelfx (subscriber, #130118) [Link] (1 responses)

> Feel free to take it up with those who actually did the study and those who monitored that

No, sir, that's not how debates work. You don't get to cite something as an argument and then say "take it up with them" when it is challenged.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 4:47 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

What gave you the impression I want to *debate* with you, people who still haven’t realised the times in which we live and/or who cling to tech-fascism?

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 4:14 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

I actually _do_ know better because I studied this particular question professionally. As in "being paid money to study it".

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 4:48 UTC (Tue) by mirabilos (subscriber, #84359) [Link] (4 responses)

Given you post things such as…

> > The completely inacceptable damage to the environment from nuclear waste
>
> What damage?

… I cannot believe that. Those who paid you ought to ask for their money back.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 5:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

Yes, please elaborate on the damage to the environment the nuclear waste from energy-producing reactors is inflicting or can cause in the near future.

Modern nuclear waste management means chemically separating the high-level waste and storing it in dry casks. These casks do as much damage to the environment as concrete blocks of the same size. The only difference is that they are slightly warm to the touch.

The chemical separation process is not at all nice, but all the waste streams are accounted for and tightly controlled. Any waste that can't be rendered safe is buried in the same casks. The total amount of waste that needs to be or was already reprocessed is around 500000 tons. The resulting amount of high-level waste is around 1-2% of that, so roughly 5000 to 10000 tons.

That's really not a lot. And consider that solar panel, wind turbine, transmission line, and battery production also do damage to the environment from mining and production.

So no, you don't get to wave hands around and proclaim that everything around you is fashtech. You can claim that nuclear can't be scaled fast enough, or that European heavy industry is too far gone to be able to build nuclear reactors, or that renewables will be cheaper in the end. These are points that are at least arguable. But not that nuclear power is somehow über-damaging.

Stop now

Posted Apr 21, 2026 7:50 UTC (Tue) by corbet (editor, #1) [Link] (2 responses)

You all know that this has gone far off topic, stop here please.

Stop now

Posted Apr 21, 2026 14:24 UTC (Tue) by mgb (guest, #3226) [Link] (1 responses)

This happens far too often. Why not just lock the thread?

Stop now

Posted Apr 21, 2026 14:25 UTC (Tue) by jzb (editor, #7867) [Link]

We have. However, we usually try to gently nudge people before hitting the moderation button since that affects all readers and commenters and not just a few folks arguing.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 3:56 UTC (Tue) by intelfx (subscriber, #130118) [Link]

> https://norden.social/@grimm/116438691194180273

Yeah, for starters, that so-called study does not provide any source for the numbers whatsoever. The only way I could re-derive the "10 cent/kWh" figure from the populist tweet was by assuming a useful life of a nuclear power plant to be 10 years, which is a borderline malicious lie.

In other words, bring better sources.

> The completely inacceptable damage to the environment from nuclear waste

What damage?

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 0:16 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (2 responses)

I wanted a discussion, not "here's a giant pile of links, trawl through it and figure out my argument on your own."

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 0:18 UTC (Tue) by bluca (subscriber, #118303) [Link]

You know that thing about trying to play chess with a pigeon?

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 1:01 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

We have had “a discussion”, for way over a year.

Anyone who is still proposing “AI” knowing these arguments is a fashtech proponent, at this point.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 20, 2026 23:26 UTC (Mon) by pizza (subscriber, #46) [Link] (1 responses)

> Yes, AI has problems, many of them quite substantial. Yes, there is a conversation to be had here. But how are we supposed to have that conversation, when you're explicitly polarizing the discussion and demanding that everyone pick one extreme or the other?

I like how merely questioning the narrative heavily being pushed by purveyors of AI tools (and those that would use said tools to replace humans en masse) is "explicitly polarizing".

It's hard to have a "conversation" with someone actively trying to kick in your teeth.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 0:13 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

Saying "you're with us or against us" (paraphrased) is not "merely questioning the narrative." I'd love to have a discussion where we question the narrative. "You're with us or against us" is not that discussion.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 1:11 UTC (Tue) by dskoll (subscriber, #1630) [Link] (1 responses)

I run a few open-source projects. My main one has essentially one author (me) though I've taken minor patches from other people.

I don't accept AI-generated code or documentation in my project. Maybe I'm "insane". Maybe I'm a Luddite. But for me, this is an ethical question.

I'm not putting all my objections to AI in this comment directly, but they are all laid out on my web site.

Anyway. I know I'm probably fighting a losing battle. I'm retired, but my friends still working in the industry are all using AI, either by choice or (more likely) because their management mandates it. I will insanely and Luddite-like keep my little corner of the Internet free of Generative AI.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 21, 2026 1:19 UTC (Tue) by mirabilos (subscriber, #84359) [Link]

Yes, this nonsense needs to stop.

Thanks for staying strong. On the Fediverse, you’ll find that you’re not alone in resisting the latest fad, if you like.

Ignoring LLMs in open source is becoming pure insanity

Posted Apr 19, 2026 21:48 UTC (Sun) by Karellen (subscriber, #67644) [Link]

At a certain point practicality wins over purist political views.

The creation and continued existence of the Free Software movement in the face of commercial software ecosystems would seem to be a living contradiction to this point. If keeping LLMs out of software requires a bunch of bloody-minded contrarians insistent on doing things the hard way over taking the practical convenient path because of some weird philosophical ideal, well, that community seems ideally suited to the task if you ask me :-)

are llm contributions unavoidable?

Posted Apr 15, 2026 16:53 UTC (Wed) by bentley (subscriber, #93468) [Link] (4 responses)

I'm curious how projects like this will handle backports, especially security patches, that were originally contributed with assistance from LLMs. My mind immediately goes to something like a bounds-checking bug - if it was found by an LLM, and the fix is a one line patch co-"authored" by an LLM, should the maintainers of the ai-free project...accept it as is (seemingly violating the project's policy)? re-write the patch (which feels like a spiritual violation of open source principles, even if not an actual license violation)? leave it un-patched?

are llm contributions unavoidable?

Posted Apr 15, 2026 17:37 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

It may surprise you, but the answer already exists and it's so obvious I wonder why anyone even bothers: if a patch is so simple that it couldn't be copyrighted (as in: three lines of code and all three are dictated by relevant standards) then it doesn't matter whether it's written by an LLM or a human; if a patch is complicated enough to be copyrightable, then such contributions are unacceptable.

are llm contributions unavoidable?

Posted Apr 15, 2026 18:04 UTC (Wed) by bentley (subscriber, #93468) [Link] (2 responses)

Maybe I'm misreading things, but I don't think the objection to llm written (or llm assisted) code primarily relates to copyright. Either the projects allow "co-authored-by: claude-code@anthropic.ai" (or similar) for simple changes, or they re-write the commit messages (which again, is probably not a copyright violation but I think goes against the spirit of giving proper attribution to fixes).

are llm contributions unavoidable?

Posted Apr 15, 2026 19:06 UTC (Wed) by khim (subscriber, #9252) [Link] (1 responses)

Copyright is part of the reason why people reject LLM-produced things but, more importantly, the age-old de minimis principle tells us when a change is so simple as to make it irrelevant who made it. It should be applicable to LLM-written code, too; why not?

are llm contributions unavoidable?

Posted Apr 16, 2026 13:44 UTC (Thu) by NAR (subscriber, #1313) [Link]

The EVi contributing documentation does not mention copyright issues. They refuse generative AI use, seemingly regardless of whether the product of AI use is copyrightable or not.

Unenforceable rule

Posted Apr 15, 2026 18:10 UTC (Wed) by mb (subscriber, #50428) [Link] (8 responses)

The rule to ban LLMs reduces the quality of the projects doing so. Today's coding agents are extremely good at writing correct code, if used correctly. And they are extremely good at finding problems. If you ban LLM review assistance, it puts your users at risk, because attackers will just find security problems with LLMs.

And this rule is not enforceable. Or how do you distinguish my handwritten code from the code that I generated and reviewed? It's not possible. The bits have no LLM color.
And why does it even matter?

What does it even mean to have LLM assisted or generated code?
If Copilot gives me an autocompletion for the obvious next edit, is that LLM generated or not? I would have typed the exact same thing anyway.

When accepting code from a developer with or without LLM assist, the situation is exactly the same:
The code has to be reviewed.
I don't see why LLM makes the situation worse.
And yes, I have also already rejected contributions which were LLM slop. Just like I have also rejected other bad human contributions in the past.

Unenforceable rule

Posted Apr 15, 2026 18:15 UTC (Wed) by josh (subscriber, #17465) [Link] (4 responses)

> And this rule is not enforceable. Or how do you distinguish my handwritten code from the code that I generated and reviewed? It's not possible. The bits have no LLM color.

How do you distinguish between your original code and code you copied from a proprietary project in violation of its license? The bits have no color...except that they do. https://ansuz.sooke.bc.ca/entry/23

We solve it in the same way: we make a policy against it, we tell people not to do it, we ask them to assert that they have not, and we treat violation of that policy as a severe breach of trust. It won't be perfect, but it's not nothing.

Unenforceable rule

Posted Apr 15, 2026 18:22 UTC (Wed) by mb (subscriber, #50428) [Link] (3 responses)

And how is that related to LLMs? Nobody is copying code. That's not how LLMs work.

Sure, you can establish all kinds of rules.
Adding rules of any kind will not improve the project's popularity, except in the LLM-hater niche.

>we treat violation of that policy as a severe breach of trust

Sure. That means you have to detect it first, though. How? What color are the bits?

Unenforceable rule

Posted Apr 15, 2026 19:42 UTC (Wed) by smurf (subscriber, #17840) [Link] (2 responses)

Exactly.

There's another logical problem with rejecting LLM output for copyright-color reasons: the LLM training has consumed a bazillion bits, of which maybe 1‰ (I'm generous here) are relevant to the code it produced. The human writing functionally the same patch has some experience, plus they browsed through Stack Overflow for possible solutions, which quite obviously influenced the result by, well, I don't know how much, but arguably way more than 0.1%.

… but the LLM output is supposed to be copyright-tainted, while the human's is not? Gimme a break here.

Unenforceable rule

Posted Apr 17, 2026 6:26 UTC (Fri) by cpitrat (subscriber, #116459) [Link] (1 responses)

I don't know many humans who can reproduce the whole first chapter of Harry Potter when prompted with the first sentence.

Unenforceable rule

Posted Apr 17, 2026 6:37 UTC (Fri) by mb (subscriber, #50428) [Link]

If I was prompted (asked) to reproduce parts of the proprietary source code I work with daily, I could surely reproduce major parts of it out of my memory to the point where it becomes a copyright violation.

Does that make my brain output potentially copyright-tainted? Obviously not.

You explicitly asked for a copyright violation, you got a copyright violation.
This is not special to LLMs. It's a matter of you got what you asked for, which is independent from AI and LLMs.

Unenforceable rule

Posted Apr 16, 2026 10:09 UTC (Thu) by NAR (subscriber, #1313) [Link]

> If Copilot gives me an autocompletion for the obvious next edit, is that LLM generated or not? I would have typed the exact same thing anyway.

Exactly. If I do a search and replace of a variable name with a text editor (instead of typing the new variable name N times), is it OK or is it machine generated? If I click on the Refactor -> Change variable name in the IDE and have it replace the variable names, is it OK or is it machine generated? If I ask copilot to "Please rename this variable", is it OK or is it machine generated?

I have a bad feeling about this whole AI stuff, but refusing to use code that was generated by a different tool seems - counterproductive.

By the way, on more than one occasion I got an autocomplete where I felt "How the hell did it know exactly what I wanted?" It's an uneasy feeling...

Unenforceable rule

Posted Apr 16, 2026 11:18 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (1 responses)

> Today's coding agents are extremely good at writing correct code, if used correctly.

That is akin to saying that bug-free software is easily achievable; one must just avoid writing bugs.

Technically true but how realistic is that?

Unenforceable rule

Posted Apr 16, 2026 17:08 UTC (Thu) by mb (subscriber, #50428) [Link]

No, it's very different from that.
The "If" is on a completely different level of abstraction for AIs. It's on the level of how humans think and communicate.

It's not much different from telling your very experienced coworker to write a specific program.

And it has the added advantage that you as a human can do a real unbiased review of the code implementation, which you cannot do in the same way for code you wrote by yourself.
You can then decide to fix the problems manually - which is usually what I do - or describe the problem you see to the AI in one or two short sentences. Doing the fix manually takes much longer, but costs less money. It's a trade off.

With "if used correctly" I meant things like:
If there is a problem with the implementation, then the problem has to be described to the LLM. Just saying "there's something wrong, fix it" almost never leads to a correct fix. That is an incorrect use of LLMs, and it would be an equally incorrect way of interacting with your human coworker. There's no difference.

>how realistic is that

It's how it works today.
I would like to encourage you to just try it out.

I personally like to combine agentic AI together with strong compilers that produce meaningful warnings and error messages like the Rust compiler. These two tools complement each other perfectly.

Cognitive surrender

Posted Apr 15, 2026 22:55 UTC (Wed) by philh (subscriber, #14797) [Link]

See:

https://youarenotsosmart.com/2026/04/13/yanss-337-how-to-...

and/or:

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646

It seems that using LLMs carries the risk that one will simply not bother to apply the critical thinking that one would have applied in the absence of the LLM. Given what programming actually requires of humans, that seems sub-optimal to me.

The sad thing is that people are going to do this regardless, and will then be convinced that they themselves had whatever "brilliant" ideas the LLM fed them, and will claim that the result is proof enough that LLMs are an unrivaled good that we should all accept without a second thought. *sigh*

Cheers, Phil.

Pointless

Posted Apr 15, 2026 23:04 UTC (Wed) by dw (subscriber, #12017) [Link] (7 responses)

This is a bit like trying to run a club where membership requires your person to measure below some sievert threshold following a thermonuclear war: it's basically meaningless. So we got autocorrect on steroids, and currently it is being abused to generate ungodly amounts of code nobody has any hope of maintaining; the moment will naturally pass. We're not far from almost everyone providing cheap or flat-rate inference switching to a billing model more reflective of the underlying costs, and this would have been true anyway even if it weren't for the pending energy crisis.

Even where the eternal September of code continues, really the mark for a meaningful contribution has not changed: is it readable, reviewable, well presented, well reasoned, and is the person contributing likely to maintain it in future? What would it matter if the contribution was generated with an eye tracking system, happy hacker keyboard, or autocompleted from a single sentence if the source is otherwise known good?

Pointless

Posted Apr 16, 2026 4:35 UTC (Thu) by gmprice (subscriber, #167884) [Link] (6 responses)

In some sense I agree. In other senses - we're already struggling with maintainer burnout, and an avalanche of random garbage does no one any good. I think the OSS burnout reckoning is a long time coming, and mass-shitcode generation might be the thing that pushes some folks over the edge.

And then the world will move on - and the issue is no one knows whether we'll be better for it.

The game has changed, the shitcode you'd spend a week toying with as an experiment "Just to see how it looks" takes an hour now - and the competent among us STILL THROW THAT AWAY AND START OVER - it's just happening faster.

The "problem" (as perceived, anyway) is that the bar for contributing has now lowered from "Knows how to write code that will compile" to "Knows how to write half-intelligible English into a prompt long enough to send a patch". It's a problem, but if you stop and listen - it's also a mass of new people finding ways to shout their software requirements into an aether previously unattainable to them - and in the language they couldn't speak yesterday.

It's an interesting time to be writing software.

Pointless

Posted Apr 16, 2026 9:38 UTC (Thu) by dottedmag (subscriber, #18590) [Link] (4 responses)

Maybe the equilibrium of contribution will shift over time from "patches are welcome" to "patches are an additional load for us, so only submit them if they solve some real problem as we see it (mostly from the roadmap and a list of acknowledged bugs), and follow our guidelines to a T, or it will be closed with no further action or maybe a short pointer to where the process was not followed".

Note that "AI" does not show up in this sentence — as already mentioned, bytes don't have a color.

Obviously this will raise the bar for contribution, but that's the inevitable outcome of lowering the bar for contribution in the first place.

Pointless

Posted Apr 17, 2026 6:45 UTC (Fri) by jorgegv (subscriber, #60484) [Link] (3 responses)

I find it curious that almost everyone here is talking about the increased load AI-generated code causes on OSS projects, but they don't realize that AI can also be used to review the submitted code.

AI can help both parties, and as a result, the overall development pace for the projects can be much higher.

In the case of a solo developer, AI can be used for both things (in separate instances: it's trivial to set up an AI agent to write code based on specs, and an independent one to review the code based on the same specs). And the net result here is a huge force multiplier. As an example, I have spent the last month developing a retro emulator that would have taken me years to do without AI.

But for this to work, AI has to be correctly used, as someone has also mentioned above.

"Defensive" AI use

Posted Apr 17, 2026 6:56 UTC (Fri) by dottedmag (subscriber, #18590) [Link]

Increasingly sophisticated linters, code formatting tools and static code analysers have decreased the burden of mechanical checks. AI-supercharged semantic linting is a great step in the same direction: it can easily shoot down contributions that don't pass the bar. It can even do some architectural checks. That's a natural place for agentic help in the contribution process: a smart linter.

The scarce resource remains judgement: Does this change need to be made at all? Does it have repercussions beyond the developed system? Are these repercussions positive or negative? What are architectural rules that are worth having in the first place? Is this approach for solving the problem acceptable? These are much harder to encode in a set of rules in the current state of AI. If it changes then we'll just be pressing the button "make me an app", I guess.

Pointless

Posted Apr 17, 2026 7:08 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

I have a couple of projects that have been in maintenance mode for many, many years (some for over a decade).
That is simply because they do what I and most other people expect them to do.

However, there are (and have always been) people who want more features. Big features.

For about a year now, at massively increasing rates, people have started contributing to these projects.
This is a "problem", because it increases my workload from basically zero to anything nonzero. Even with the help of AI on my side it won't become zero.

I mean, I generally welcome those contributions and I try to filter them to get the best and most helpful ones merged. But it definitely increases my workload and reduces what I can do for other projects.

The hard part in software development is not - and has never been - writing new code. It's maintenance.

This is a very real tradeoff.

Pointless

Posted Apr 17, 2026 10:16 UTC (Fri) by mbunkus (subscriber, #87248) [Link]

As maintainers we can always say "no" to new features (or any kind of request, really). Sure, we usually want to make users happy and are grateful for contributions, so it's not always easy. On the other hand in OSS projects the users always have a choice to fork your project if your "no new features" stance becomes unacceptable to them, so… we really do have that option.

Pointless

Posted Apr 17, 2026 12:27 UTC (Fri) by smurf (subscriber, #17840) [Link]

Yes, slop code from well-meaning (or not-so-well-meaning …) contributors is a problem for maintainers.

The flip side is that as a maintainer I can take a head-scratcher bug report which before LLMs would have languished indefinitely for lack of time, reproducibility, or motivation, and tell Claude to do a root cause analysis, find a reproducer, and propose a fix. Then *maybe* I need to ask it to refine the solution so it's actually future-proof instead of a band-aid – but that's still two orders of magnitude faster than doing it all by hand.

This also applies to new features.
Data point: I maintain a small infrastructure daemon with a ten-year-old "add secure packet mode, dammit" bug. Guess what appeared in my inbox last week.
Yes, the contribution needs some refinement and cleanup, but the alternative isn't to write the code by hand. The alternative is another ten years of insecure-only operation.

Abandoning vim(1) ASAP

Posted Apr 15, 2026 23:14 UTC (Wed) by alx.manpages (subscriber, #145117) [Link]

Thanks! I'll stop using vim(1) as soon as either of the alternatives is packaged for Devuan (Debian). It *was* a good editor!

I definitely want a Vim with an intelligent 'exit helper'

Posted Apr 17, 2026 7:48 UTC (Fri) by k3ninho (subscriber, #50375) [Link] (1 responses)

While I've got `[esc]:wq!` tattooed on an arm, there's definitely a UX win available for a smart helper to explain to the unwary user how to escape the twisty maze of input and edit modes that look very much alike.

K3n.

I definitely want a Vim with an intelligent 'exit helper'

Posted Apr 17, 2026 15:46 UTC (Fri) by jfb (subscriber, #60805) [Link]

'[esc]ZZ' is much easier to type and does the same thing, I believe.


Copyright © 2026, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds