
Human authorship?

Posted Jul 1, 2025 10:30 UTC (Tue) by paulj (subscriber, #341)
In reply to: Human authorship? by jani
Parent article: Supporting kernel development with large language models

Does this mean that companies moving to AI-coding will be producing code they hold no copyright over (because being machine produced, there simply is no [new] copyright created in the output)? If someone produces an app / a piece of software that is 100% AI generated, will others be able to redistribute as they wish?



Human authorship?

Posted Jul 1, 2025 11:14 UTC (Tue) by jani (subscriber, #74547) [Link] (1 responses)

I think at this point the only thing we can say with any certainty is that we're going to see plenty of litigation and lobbying to protect corporate interests, both for LLM input *and* output.

Human authorship?

Posted Jul 1, 2025 16:18 UTC (Tue) by paulj (subscriber, #341) [Link]

It's going to be fascinating. There are corporate interests on both sides of this fence - in a number of cases, the /same/ corporation. ;) Different jurisdictions may well come to different answers too.

Human authorship?

Posted Jul 1, 2025 16:00 UTC (Tue) by kleptog (subscriber, #1183) [Link]

At least under NL law, the copyright belongs to the person who made the creative choices. In other words, no AI tool can ever produce anything copyrightable by itself; the user who made the creative choices that led to the output holds the copyright. This is the same principle that prevents Microsoft from claiming copyright over your Word documents, or a compiler writer from claiming copyright over your binaries.

If those companies include, somewhere in the chain, a human who decides which AI output is acceptable and which isn't, then the result would be copyrightable, even if that human were just writing a program that did the evaluation for them. I expect the actual protection to be somewhat commensurate with the amount of effort, though. And if you're talking to a chatbot, the output is the copyright of the person typing.

This is civil law, so it is set by statute, and no court case can change that. At best the courts can prod the legislature and tell them the law might need updating, but that's it. The US, being common law, is however likely to attract a lot of litigation, unless the legislature explicitly steps in to fix it.

Human authorship?

Posted Jul 1, 2025 18:37 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

It's likely that only non-trivial _prompts_ will be copyrighted.

I can see that, in the future, source code repos will contain the LLM-generated source code along with the list of prompts that produced it. And then lawyers will argue about where exactly the copyright protection stops, e.g. whether a prompt like "a website with the list of issues extracted from Bugzilla" is creative enough, or just a statement of requirements.
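
Purely as a sketch of what such prompt provenance might look like (nothing like this is standardized; the trailer names and model name below are invented for illustration), one could imagine git commit trailers recording the prompt alongside the generated code:

```
feat: add Bugzilla issue dashboard

Generated-By: example-llm-v2          (hypothetical model identifier)
Prompt: a website with the list of issues extracted from Bugzilla
Prompt-Author: alice@example.com
Human-Edits: none
```

Git's existing trailer mechanism would parse lines like these without any tooling changes, which is presumably why a convention of this shape is plausible, but whether such a record would matter legally is exactly the open question.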

Human authorship?

Posted Jul 2, 2025 13:43 UTC (Wed) by kleptog (subscriber, #1183) [Link] (2 responses)

> It's likely that only non-trivial _prompts_ will be copyrighted.

If the prompt is copyrightable, then the output is too. An LLM is just a tool. Photos don't lose copyright by being fed through a tool, so why would an LLM be any different? You'd have to argue that an LLM is somehow a fundamentally different kind of tool from anything else you use to process text, which I don't think is a supportable position.

Human authorship?

Posted Jul 2, 2025 14:24 UTC (Wed) by jani (subscriber, #74547) [Link]

> If the prompt is copyrightable, then the output is too.

It's just not that clear cut: https://www.skadden.com/insights/publications/2025/02/cop...

Human authorship?

Posted Jul 2, 2025 18:55 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

But _where_ does the copyright start? A prompt may be copyrightable if it's specific enough. And edits made by humans to the generated code are likely to be copyrightable.

But suppose we have this case: you build a web service to track sleep times using an LLM, and then I build a service to track blood sugar data using an LLM.

The source code for them ends up 95% identical, simply because, out of the many ways to write a simple CRUD app, we both used the same LLM version. If you had looked at these two code bases 15 years ago, it would have been a clear-cut case of copyright infringement.

But clearly, this can't be the case anymore. Mere similarity of the code can't be used as an argument when LLMs are at play.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds