|
|
Subscribe / Log in / New account

The FSF considers large language models

By Jonathan Corbet
October 14, 2025

Cauldron
The Free Software Foundation's Licensing and Compliance Lab concerns itself with many aspects of software licensing, Krzysztof Siewicz said at the beginning of his 2025 GNU Tools Cauldron session. These include supporting projects that are facing licensing challenges, collecting copyright assignments, and addressing GPL violations. In this session, though, there was really only one topic that the audience wanted to know about: the interaction between free-software licensing and large language models (LLMs).

[Krzysztof
Siewicz] Anybody hoping to exit the session with clear answers about the status of LLM-created code was bound to be disappointed; the FSF, too, is trying to figure out what this landscape looks like. The organization is currently running a survey of free-software projects with the intent of gathering information about what position those projects are taking with regard to LLM-authored code. From that information (and more), the FSF eventually hopes to come up with guidance of its own.

Nick Clifton asked whether the FSF is working on a new version of the GNU General Public License — a GPLv4 — that takes LLM-generated code into account. No license changes are under consideration now, Siewicz answered; instead, the FSF is considering adjustments to the Free Software Definition first.

Siewicz continued that LLM-generated code is problematic from a free-software point of view because, among other reasons, the models themselves are usually non-free, as is the software used to train them. Clifton asked why the training code mattered; Siewicz said that at this point he was just highlighting the concern that some feel. There are people who want to avoid proprietary software even when it is being run by others.

Siewicz went on to say that one of the key questions is whether code that is created by an LLM is copyrightable and, if not, if there is some way to make it copyrightable. It was never said explicitly, but the driving issue seems to be whether this software can be credibly put under a copyleft license. Equally important is whether such code infringes on the rights of others. With regard to copyrightability, the question is still open; there are some cases working their way through the courts now. Regardless, though, he said that it seems possible to ensure that LLM output can be copyrighted by applying some human effort to enhance the resulting code. The use of a "creative prompt" might also make the code copyrightable.

Many years ago, he said, photographs were not generally seen as being copyrightable. That changed over time as people figured out what could be done with that technology and the creativity it enabled. Photography may be a good analogy for LLMs, he suggested.

There is also, of course, the question of copyright infringements in code produced by LLMs, usually in the form of training data leaking into the model's output. Prompting an LLM for output "in the style of" some producer may be more likely to cause that to happen. Clifton suggested that LLM-generated code should be submitted with the prompt used to create it so that the potential for copyright infringement can be evaluated by others.

Siewicz said that he does not know of any model that says explicitly whether it incorporates licensed data. As some have suggested, it could be possible to train a model exclusively on permissively licensed material so that its output would have to be distributable, but even permissive licenses require the preservation of copyright notices, which LLMs do not do. A related concern is that some LLMs come with terms of service that assert copyright over the model's output; incorporating such code into a free-software project could expose that project to copyright claims.

Siewicz concluded his talk with a few suggested precautions for any project that accepts LLM-generated code, assuming that the project accepts it at all. These suggestions mostly took the form of collecting metadata about the code. Submissions should disclose which LLM was used to create them, including version information and any available information on the data that the model was trained on. The prompt used to create the code should also be provided. The LLM-generated code should be clearly marked. If there are any use restrictions on the model output, those need to be documented as well. All of this information should be recorded and saved when the code is accepted.

A member of the audience pointed out that the line between LLMs and assistive (accessibility) technology can be blurry, and that any outright ban of the former can end up blocking developers needing assistive technology, which nobody wants to do.

There were some questions about how to distinguish LLM-generated code from human-authored code, given that some contributors may not be up-front about their model use. Clifton said that there must always be humans in the loop; they, in the end, are responsible for the code they submit. Jeff Law added that the developers certificate of origin, under which code is submitted to many projects, includes a statement that the contributor has the right to submit the code in question. Determining whether that right is something the contributor truly holds is not a new concern; developers could be, for example, submitting code that is owned by their employer.

A real concern, Siewicz said, is whether contributors are sufficiently educated to know where the risks actually are.

Mark Wielaard said that developers are normally able to cite any inspirations for the code they write; an LLM is clearly inspired by other code, but is unable to make any such citations. So there is no way to really know where LLM-generated code came from. A developer would have to publish their entire session with the LLM to even begin to fill that in.

The session came to an end with, perhaps, participants feeling that they had a better understanding of where some of the concerns are, but nobody walked out convinced that they knew the answers.

A video of this session is available on YouTube.

[Thanks to the Linux Foundation, LWN's travel sponsor, for supporting my travel to this event.]

Index entries for this article
ConferenceGNU Tools Cauldron/2025


to post comments

Now can we?

Posted Oct 14, 2025 16:43 UTC (Tue) by gwolf (subscriber, #14632) [Link]

Can we programmers actually «cite any inspirations for code we write»? Do we often do that?
Be it that I learnt programming at school or by reading books, or that I took a "BootCamp", I cannot usually said where I got a particular construct from. I could, of course, say that I write C in the K&R style — but I doubt that's what Siewicz refers to. And of course, Perl-heads will recognize a "Schwartzian transform". But in general, I learnt _how to code_, and I am not able to attribute specific constructs of my programming to specific bits of code. Just like an LLM.

If most of my programming consisted of searching for answers to a question related to mine in StackOverflow... I *could* get persuaded to link to the post in question in a comment before each included snippet. But that's also not something I've seen to be frequent. And if I didn't write the comment _the same moment_ I included said snippet, it's most likely I never will.

So... I think there is an argumentative issue in here :-)

Define “prompt”

Posted Oct 15, 2025 2:13 UTC (Wed) by Baughn (subscriber, #124425) [Link] (21 responses)

Unless you’ve only ever used ChatGPT, you will know that LLM-produced code is not the result of a single prompt, not even a conversation, but rather a workflow that often goes as such:

- Discuss a problem with the LLM. The LLM autonomously reads large parts of the repository you’re working in, during the discussion.

- Ask it to write a plan. Edit the plan. Ask it about the edited plan. Edit it some more.

- Repeatedly restart the LLM, asking it to code different parts of the plan. Debug the results. Write some code yourself. Create, rebase, or otherwise play around with the repository; keep multiple branches of potential code.

- Go back and edit the original plan, now that you know what might work. Port some unit tests back in time, sometimes.

- Repeat until done.

There is a prompt. Actually, there are many prompts, all conveniently stored in verbose JSONL that also requires point in time snapshots of the repository you’re working in to make sense of.

If someone were to ask me for that, I wouldn’t know where to start. It’s like asking for a recording of my desktop so they can be sure I’m not doing something they disapprove of.

Define “prompt”

Posted Oct 15, 2025 4:21 UTC (Wed) by mussell (subscriber, #170320) [Link] (9 responses)

That sounds like way too much effort compared to a standard edit-compile-debug cycle without any LLM and costs of orders of magnitude more power to boot. What's the benefit of these things again? Do we really want to outsource our thinking that much?

Define “prompt”

Posted Oct 15, 2025 14:25 UTC (Wed) by Baughn (subscriber, #124425) [Link] (7 responses)

It’s really not much effort. If you’re doing your job right, you should already have written designs for whatever features you’re coding.

LLMs just force it, since they don’t work well without a plan. You can rely on them reading your mind.

And I don’t know. Is a 5x increase in project scope worthwhile? Because that’s what I’ve been getting.

Define “prompt”

Posted Oct 15, 2025 21:18 UTC (Wed) by SLi (subscriber, #53131) [Link] (5 responses)

Even in the early ChatGPT days when I wouldn't have considered asking it to produce code I found this an excellent way to flesh and simplify designs. Not because they were right; often in fact because their ridiculous solutions made me think of approaches that I would have otherwise missed.

According to lore, some programmers talk to rubber ducks to solve their problems. Well, even GPT-3 was definitely more than a rubber duck. Not necessarily 10x better, but still better. These recent models? I think they're genuinely useful also in domains that you don't know so well. An example (I could also give another from a domain I knew even less about, book binding, but this message is already long):

I've been taking a deep dive into Rust for the past few days, and I don't think how I would replace the crate and approach suggestions I've got from LLMs. Probably the old-fashioned way, reading enough rust code to see what people do today, but I'm sure that would have been several times the effort. The same applies to them digging quickly the reason why a particular snippet makes the borrow checker unhappy and suggesting an alternative. One does not easily learn to search for `smallvec` without having ever heard of it.

Or, today, diving into the interaction of process groups, sessions, their interaction with ptys (which I didn't know well), and "why on earth do I end up with a process tree like this"—the LLM taught be about subreapers, which I did not know and would not have easily guessed to search for.

I think one problem is that people get angry if LLMs are not right 100% of the time. Even that seems a bit like "you're using it wrong". Don't rely on it to be right all the time. (As a side note, don't rely on humans to be either, unless they say very little.) Rely on it to give a big picture fast, which is where you might be after some time of self-study while still harboring misconceptions to be corrected—and much preferable to having no idea.

Define “prompt”

Posted Oct 16, 2025 7:03 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

> According to lore, some programmers talk to rubber ducks to solve their problems.

I have a stuffed Tux on my desk for exactly that reason (although I rarely use it).

But how often has explaining the problem to a colleague resulted in you solving it, often without a word from said colleague? That's why a rubber duck / stuffed Tux / whatever is such a useful debugging aid. It might feel weird holding a conversation with an inanimate object, but don't knock it. It works ...

Cheers,
Wol

Define “prompt”

Posted Oct 16, 2025 11:48 UTC (Thu) by iabervon (subscriber, #722) [Link] (1 responses)

I've been finding that typing the explanation like I was talking to coworkers in a group chat works just as well as saying it out loud, and putting it in a version-controlled file that I clear out before making a pull request often results in having some great phrasing to use in the documentation or commit message, even though the original form would be useless in organization outside of an unfinished topic branch. This also results in some great information when I come back to a preempted project a few months later and want to know what I said to the duck when I was working on it.

Of course, it means I have a file in version control which says that it's a list of explanations of the issues I'm facing with features in progress, and then doesn't have anything else in any mainline commit.

Define “prompt”

Posted Oct 16, 2025 16:04 UTC (Thu) by SLi (subscriber, #53131) [Link]

I agree. Often even better when you put some time into it.

But I think writing clearly in a non-dialog setting is a skill that perhaps even most engineers lack. I think all engineers should be taught technical writing (I know my university didn't for me). Many don't even seem to realize it's a rather different skill set.

Define “prompt”

Posted Oct 16, 2025 13:40 UTC (Thu) by kleptog (subscriber, #1183) [Link] (1 responses)

I find LLMs especially useful for finding what the big picture is. If I'm trying to working why something isn't working, it can give you the name of the component that probably has the issue and so then you can search for it.

The first time I really saw this was when I was trying to do something with CodeMirror and was getting all sorts of conflicting advice from different sites. Eventually fed the errors to ChatGPT and it pointed out that version 5 and 6 use completely different configuration styles. No search engine would have told me that info. No website specifies which version they are using.

And for one off scripts it's amazing. Hey, I need a script that does the steps X, Y and Z in Python. Here is the previous bash script that did this. And voila.

Treat it like an idiot that knows everything and understands nothing. Because that's what it is... The trick is to combine your understanding with its knowledge.

Define “prompt”

Posted Oct 23, 2025 11:07 UTC (Thu) by nye (subscriber, #51576) [Link]

> Treat it like an idiot that knows everything and understands nothing. Because that's what it is... The trick is to combine your understanding with its knowledge.

I think this is the best description of an LLM that I've seen anywhere.

Better than human (sometimes)

Posted Oct 20, 2025 2:14 UTC (Mon) by gmatht (subscriber, #58961) [Link]

While LLMs can produce rubbish, sometimes they can do a better job than me.

Like all C programmers, I can write C in any language. Sometimes when I start writing C in Python the LLM will offer to complete my involved algorithm with a 2 line pythonic solution. Also the LLM's initial draft of a UI looks nicer than the functional but plain version I would call v1.0.

I seem to recall a quote saying something along the lines of: I will always write better code than a compiler/LLM, because I can use a compiler/LLM.

The biggest weakness of LLMs seems to be that it is not possible to reach v1 with vibe coding because once the code base reaches a certain level of quality the LLM will become more interested in adding new bugs than fixing old ones. For example, it will find a polished algorithm and observe that the tests only cover several values so it can simplify the algorithm by just hardcoding those values and still "pass".

Define “prompt”

Posted Oct 16, 2025 6:06 UTC (Thu) by azumanga (subscriber, #90158) [Link]

It's up to you of course, but if you haven't tried any of the more recent models, I'd give it a try.

I was stuck with a slowly dying Python 2 program, which a few people had tried to update (and failed) to Python 3. I previously tried for 4 full days before I realised I was no-where close, and gave up.

I sat for an afternoon with Claude Code, and finished a full Python 3 translation.

Claude found replacement libraries for things without a Python 3 version, wrote fresh implementations of a couple of functions that didn't get a Python 3 upgrade (I checked, it didn't just copy the originals), and helped me then fix up all the unicode issues from the Python 2 -> Python 3 upgrade process.

Define “prompt”

Posted Oct 15, 2025 20:39 UTC (Wed) by SLi (subscriber, #53131) [Link] (7 responses)

> Unless you’ve only ever used ChatGPT, you will know that LLM-produced code is not the result of a single prompt, not even a conversation, but rather a workflow that often goes as such:

Even with ChatGPT this should be the case.

I've come to suspect that the usual difference between people who insist LLMs are absolutely useless and those who get a lot of good out of them is likely exactly that: Take a human who's likely not even very good at communicating textually (few of us are; technical writing is a discipline for a reason), have him write a single sloppy prompt and dismiss the results when the LLM was not able to read his mind.

Define “prompt”

Posted Oct 15, 2025 20:54 UTC (Wed) by Wol (subscriber, #4433) [Link] (1 responses)

Except every time I've tried to make it clearer, the AI just digs itself deeper into the same hole.

Okay, the only AI I've (knowingly) used is Google search. And at least it has the decency to rephrase my query into the query it's going to answer (which it then answers pretty well). It's just that the question it's answering bears precious little resemblance to the question I asked it.

Cheers,
Wol

Define “prompt”

Posted Oct 15, 2025 21:19 UTC (Wed) by SLi (subscriber, #53131) [Link]

Yes, Google search is very hilarious, especially if you mean the "people also ask" results :)

Define “prompt”

Posted Oct 16, 2025 8:17 UTC (Thu) by taladar (subscriber, #68407) [Link]

Oddly enough none of the people who "get a lot of good out of them" have ever made a video showing that off on Youtube or anywhere else that had a convincing result in terms of the ratio of effort to output quality.

Define “prompt”

Posted Oct 20, 2025 8:35 UTC (Mon) by ssmith32 (subscriber, #72404) [Link] (3 responses)

That's a bit of a straw man argument. There are plenty of people who, like me, find it useful for simple transformations or generating boilerplate that, unfortunately, continues to persist in the codebase, for various and sundry reasons. But also recognize it can fail hilariously at simple tasks.

I asked my claude-powered assistant to:

- upgrade a library to a specific version. Instead, it updated an unrelated config value that had a similar name to the library to be the name of the library. The config file was most emphatically _not_ part of the build system. If LLMs truly understood "context" like people claim, it should have ruled out touching that file completely.

- generate a bunch of boilerplate for writing out new objects to a datastore that still needs boilerplate. Mostly got it right.

- generate a dockerfile for me. It saved time and worked, but added an unusual amount of completely useless cruft. Still faster to quickly remove it then make it myself from scratch.

- how to install a particular java version on my mac. Utterly failed. Kept on insisting on using a cask that no longer exists, on downloading it from locations that no longer hosted that particular version, etc. It was clearly just barfing up the suggestions from a bunch of outdated blogs.

For something that has similar patterns in your codebase, or has plenty of (correct) examples in documentation and random websites, it can do great.

For something novel or unique, even if it is something as banal as updating a library version by understanding it's pulled in transitively, and another library must be updated - or something both unique and genuinely interesting, LLMs fail miserably.

Which is not surprising. They are useful tools, once you know how they work. And a remarkable amout of code is not really doing anything that novel or unique.

For conversations about design, a co-worker or rubber duck is still much better for me.

Define “prompt”

Posted Oct 20, 2025 12:36 UTC (Mon) by pizza (subscriber, #46) [Link] (2 responses)

> Which is not surprising. They are useful tools, once you know how they work. And a remarkable amout of code is not really doing anything that novel or unique.

In other words, where LLMs are most useful is are twofold:

* A successor to the boilerplate-generating development environment "Wizards" [1]
* Fancy autocomplete.

[1] Referring to interactive prompt-guided templating engines popularized by Microsoft in Visual<whatever> development environments in the early 90s.

Define “prompt”

Posted Oct 22, 2025 17:25 UTC (Wed) by raven667 (subscriber, #5198) [Link] (1 responses)

That seems about right, I might also add that very simple usage of existing comprehensive frameworks seems like something LLMs should be able to cough up, like boilerplate describing how to make a simple CRUD app should have plenty of examples in the training data, so telling it what field names you want it should be possible to spit out a Django app, but I haven't tested that theory as I haven't touched LLMs, not even once. Maybe a freeform text frontend to ffmpeg invocations ;-)

Define “prompt”

Posted Oct 23, 2025 7:32 UTC (Thu) by taladar (subscriber, #68407) [Link]

If you just want a CRUD app auto-generated LLMs seem like overkill, it is probably easy to do that with a regular template engine, possibly even with the simple ones in project template tools (e.g. cargo-generate, not sure about a Python equivalent)

Define “prompt”

Posted Oct 23, 2025 10:34 UTC (Thu) by jvoss2 (guest, #7065) [Link] (1 responses)

I want to second this: In my opinion, requests to document "the prompt" are not practical, because there are too many small bits of prompt, some typed by the user and some from sytem prompts, tool descriptions, pre-written guidance for sub-agents etc. Even if this could somehow all be bundled together in a digestible form, it would still not be very useful, because what the LLM does also depends on the state of the file system at the time the request was made. (Example: "Please review the code in somefile.c. [LLM thinks about it and reports back] Ok, please fix the integer overflow you found by adding an explicit check near the start of the function.")

Define “prompt”

Posted Oct 24, 2025 7:50 UTC (Fri) by taladar (subscriber, #68407) [Link]

More importantly it also depends on the exact model in use and at least for the hosted models there is no way to get exactly the same version as on some past request.

Define “prompt”

Posted Oct 25, 2025 19:28 UTC (Sat) by davidgerard (guest, #100304) [Link]

you mean: "it can't be that stupid, you must be prompting it wrong"?

Libre AI?

Posted Oct 15, 2025 4:42 UTC (Wed) by pabs (subscriber, #43278) [Link]

Was there any discussion of what a Libre ML model, LLM or AI could look like?

Personally I like Debian's document about that:

https://salsa.debian.org/deeplearning-team/ml-policy/

It would be very useful to have at least some of the former, for things like human language translation, noise removal from audio, text to speech, speech to text and so on.

And if we use bugs in our model?

Posted Oct 16, 2025 1:20 UTC (Thu) by davecb (subscriber, #1574) [Link]

It's hard enough to statistically train from one geometric and one algebraic model. Now make one of them buggy.

Rinse, repeat.


Copyright © 2025, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds