
Is it free software?

Posted Feb 20, 2026 16:13 UTC (Fri) by dskoll (subscriber, #1630)
In reply to: Is it free software? by epa
Parent article: The Book of Remind

Thank you for that comment. I have updated the README.md file to reflect it.

I don't intend for Remind to be non-Free, but I still want to prevent it from being used to train LLMs whose output is not GPLv2, to the greatest extent permitted under copyright law.



Is it free software?

Posted Feb 20, 2026 18:35 UTC (Fri) by rfontana (subscriber, #52677) [Link] (14 responses)

So the new language is:

> It is not yet settled whether, if you train an AI model on this
> source code, the resulting model is a derivative work of the code. But
> if it is, and does not fall under "fair use" or equivalent in your
> jurisdiction, then as with any other derivative work you may only
> distribute it under the terms of the GNU General Public License,
> version 2.

That seems OK to me, but I think it's a different kind of assertion than what it replaces since here you're talking about the "resulting model" possibly being a GPLv2 derivative work, while before you talked about the "output of the model" or "anything the model produces". Not sure if that change in focus was intentional or not. Even if a model trained on GPLv2 stuff is a derivative work of that GPLv2 stuff, it doesn't mean the output would also be a derivative work (and vice versa).

Is it free software?

Posted Feb 20, 2026 20:09 UTC (Fri) by epa (subscriber, #39769) [Link] (1 responses)

Yes, that’s another open question. I think it would be hard to argue that the output of ChatGPT is a derivative work of the model definition, without at the same time admitting that the model is derivative of its training data.

Is it free software?

Posted Feb 23, 2026 8:49 UTC (Mon) by taladar (subscriber, #68407) [Link]

I don't think anyone ever contested that it is derivative of the training data as a whole, only to what extent it is derivative of any particular piece in the training data, and whether that meets the threshold at which copyright becomes an issue.

Technically speaking you could, e.g., make a model that is a single bit reflecting whether the input has an odd or an even number of characters, and that produces output with the same characteristic. Most people would agree that that would not violate copyright.

On the other end of the spectrum you could just make a "model" that is a filesystem folder of all the input data and the algorithm just selects a random section from the input verbatim. Most people would agree that that should violate copyright.

The question is where in between those two extremes do we draw the line where copyright no longer applies.
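The two extremes above can be sketched in a few lines of code. This is purely an illustration of the thought experiment, not real ML code; the function names and interfaces are invented for this example.

```python
import random


def parity_model(training_corpus):
    """One-bit 'model': remembers only whether the combined training
    text has an odd or even total number of characters."""
    bit = sum(len(doc) for doc in training_corpus) % 2

    def generate(length=16):
        # Produce output whose character count has the same parity
        # as the training data; nothing else survives from the corpus.
        n = length if length % 2 == bit else length + 1
        return "x" * n

    return generate


def verbatim_model(training_corpus):
    """The other extreme: the 'model' is simply a copy of the corpus,
    and generation returns a random section of it verbatim."""

    def generate(span=40):
        doc = random.choice(training_corpus)
        start = random.randrange(max(1, len(doc) - span))
        return doc[start:start + span]

    return generate
```

The first retains a single bit of information about the corpus, so its output plainly cannot infringe; the second retains everything and emits literal copies. Real LLMs sit somewhere between these poles, which is exactly where the line-drawing problem lies.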

Is it free software?

Posted Feb 20, 2026 21:31 UTC (Fri) by dskoll (subscriber, #1630) [Link] (11 responses)

I might change "the resulting model" to "the resulting model and the output it produces", though I think there's already American precedent stating that the output of an AI model is not copyrighted.

As I said, I don't want Remind to be non-Free. But it seems to me that LLMs can be used for "copyleft washing", whereby you pass copyleft code through an LLM and then the results, even if very similar to the training input, are somehow magically no longer copyleft. This seems like an underhanded trick and I'm trying to express my disapproval of it.

Is it free software?

Posted Feb 23, 2026 7:28 UTC (Mon) by gfernandes (subscriber, #119910) [Link]

I think it's perfectly fair for you, the author, to express your intent that the license covers human use and not AI/LLM training.

It has been OK to dual-license GPL code since roughly forever. That is exactly what you have done. You should not have to justify this.

Is it free software?

Posted Feb 24, 2026 0:31 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link] (9 responses)

though I think there's already American precedence stating that the output of an AI model is not copyrighted.

That's right. Under US law, a work must show a degree of creativity to qualify for copyright. Under current precedent, machines and non-human animals are assumed not to have any creativity, so their works are inherently unable to be copyrighted. Just because something can't be copyrighted, though, doesn't mean it can't violate a copyright. For example, the official work of US government employees is automatically in the public domain, but there have been cases where artists have successfully sued the US government for copyright infringement.

Is it free software?

Posted Feb 24, 2026 14:50 UTC (Tue) by anselm (subscriber, #2796) [Link]

Here in Germany, copyright law stipulates that any work must be the “personal mental creation” of a human in order to qualify for protection. Hence, at least for now, no copyright protection for generative-AI output in the Land der Dichter und Denker (country of poets and thinkers).

Is it free software?

Posted Feb 24, 2026 15:08 UTC (Tue) by Wol (subscriber, #4433) [Link] (7 responses)

The obvious way to think of it is that an AI cannot claim creation rights over its work. In other words, as commented above, "the output of an AI is not copyrightable".

But that clearly does not preclude the AI *copying* a copyrighted work, and hence infringing copyright. Going back to that "Bach and Beethoven" example, it's inside out here, but the principle is the same: the music itself is copyright-free but has a wrapper of copyrightable arrangements, fonts, layout, etc. With the AI, the AI wrapper is copyright-free, but the contents may well be copyrighted.

Cheers,
Wol

Is it free software?

Posted Feb 24, 2026 21:47 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link] (6 responses)

It would be very strange if the output of an AI were somehow incapable of infringing copyright. The output of a machine everyone can agree is not intelligent, like a film camera or a photocopier, can clearly be infringing. The output of a human being, something everyone agrees is at least capable of being intelligent, can be infringing. Why should the output of something in the middle suddenly be incapable of infringing? The real question isn't whether the AI's output can be infringing but whether the AI itself is responsible for infringing output or if the humans around it are. Right now, it's pretty clear "AI" isn't actually intelligent enough to be capable of infringement on its own, so the liability should rest on the humans.

In practice, even if the courts decided an AI was capable of infringement, the humans around it would still wind up liable. The AI would probably be treated the same way as an employee is, and businesses can be held liable for their employees' misdeeds when those employees are doing their official duties. You could wind up with several levels of liability. The company operating the AI would be primarily liable, but the company that created it could be secondarily liable for contributory infringement. Unless/until we have free-standing AIs that do things on their own without any kind of human prompting, liability will eventually wind up with the people who keep the AI running and the ones who provide it with the prompts that elicit infringing outputs.

Is it free software?

Posted Feb 25, 2026 9:51 UTC (Wed) by farnz (subscriber, #17727) [Link] (5 responses)

The case of concern is more around how the legalities wind up shaking out.

It's easy to imagine a scenario in which the AI model is not itself responsible for infringement, nor is it inherently a derived work, and where liability therefore falls on a mix of the AI vendor and the user.

In that world, the big copyright holders are in a strong position to negotiate a deal with the AI vendor, where the AI vendor is paying for licensing for "minor" infringement, and passes details of "deliberately" infringing prompts and outputs to the big copyright holders for legal enforcement.

That's a worst case for FOSS. We're not big copyright holders, so we don't get into the deals for "minor" infringement and "deliberate" infringement. But our output is still used to train the models, and to make the AIs profitable - meaning that it's now on us to find the people infringing our copyright, and pursue them in court (giving them the option of bringing the AI vendor in as a contributor to their infringement) to stop AI becoming just another way to infringe FOSS copyrights without significant risk.

The history of FOSS licence enforcement (or rather, the lack thereof in most cases) does not fill me with hope here - the cost of blanket enforcement is high, and that's what's needed to make the AI vendors worry about the risk of infringing our copyrights.

Is it free software?

Posted Feb 26, 2026 11:15 UTC (Thu) by kleptog (subscriber, #1183) [Link] (4 responses)

I'm a bit confused about what kind of infringement you're worried about?

Are you worried that somebody could use an LLM to "vibe-code" a competitor to PostgreSQL without it looking like PostgreSQL, purely because the source is in the training set? What FOSS is out there that people would like to copy, don't feel they can just get away with copying, but figure they can use an LLM to get around it?

What you are referring to with big companies I do see as a thing for purely creative works: preventing LLMs and Stable Diffusion from producing video clips with X-Wing fighters and Mickey Mouse. Though to me this feels more trademark- than copyright-related. ISTM non-commercial stuff should be allowed anyway.

But the value of copyright on source code has always been a bit weird. Code is mostly functional. For any given problem there really are only a few good solutions. Sure, you can split the functions up in different ways, the variables can have different names, you can use a different language. But if you want to write a new program, you are almost always not doing anything new, just doing what other people have done before, framed in a different way.

The GPL is mostly a political tool. We don't care if the FreeBSD guys want to take the source of some driver to make a better driver for FreeBSD. Most of the code in the kernel is useless in any other context. We care that someone takes the Linux kernel as a whole. The tool we are given is copyright law, so we have the GPL.

If someone takes an algorithm from the Linux kernel and uses it in their own product, that's not copyright infringement. ISTM LLMs will mostly take the functional structure and ignore the literal text. That's where their value is, after all.

But maybe there is some risk to FOSS from LLMs that I'm missing. I'm just not seeing it.

Is it free software?

Posted Feb 26, 2026 11:19 UTC (Thu) by farnz (subscriber, #17727) [Link] (3 responses)

If all free software is in the training set, and the output of the LLM is effectively not copyright protected (because it's too expensive to prove), then things like OpenWRT would never have happened, because you could avoid the problem with GPL licensing by having the LLM "vibe-code" an entire router OS, instead of copying Linux.

It's not a new risk. But if you're looking at LLMs as "they will obliterate copyright walls", I think you're barking up the wrong tree; I expect them to be bad for small copyright holders (since proving copying becomes harder), but not big ones.

Is it free software?

Posted Feb 27, 2026 12:27 UTC (Fri) by taladar (subscriber, #68407) [Link] (2 responses)

At least right now it is hard to get an LLM to move a function from the file foo/bar to the file foo/baz without modification, but a hypothetical future AI based on a different fundamental concept might be usable that way.

Is it free software?

Posted Feb 27, 2026 12:32 UTC (Fri) by farnz (subscriber, #17727) [Link] (1 responses)

For someone abusing LLMs to "copyright-wash" infringement, that modification is a feature, not a bug. The copying is no longer literal, so it makes it harder to show that copying took place, while as long as the modifications don't introduce too many new bugs, you can refer back to the original and fix them (or ignore them if they're irrelevant to your product - if you're building a WiFi router, and the bugs relate to DCCP NAT, "just" don't support that).

Is it free software?

Posted Feb 27, 2026 12:58 UTC (Fri) by taladar (subscriber, #68407) [Link]

In the specific case I had in mind the LLM just kept changing APIs back from the version of a library I was using to the version of the same library that was current when it was trained (bevy 0.17 -> 0.14) which was incredibly annoying.

Is it free software?

Posted Feb 20, 2026 20:10 UTC (Fri) by epa (subscriber, #39769) [Link]

Thank you for changing that. I believe it makes explicit the legal position for any GPL-covered work.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds