How to check copyright?

Posted Oct 3, 2025 15:12 UTC (Fri) by stefanha (subscriber, #55072)
In reply to: How to check copyright? by Wol
Parent article: Fedora floats AI-assisted contributions policy

I am not claiming that all AI output is covered by the copyright of its training data. It seems reasonable to treat generated output the same way as work created by humans who have been exposed to copyrighted content.

In the original comment I linked to a paper about extracting copyrighted content from LLMs. A web search brings up a bunch more in this field that I haven't read. Here is one explicitly about generated code (https://arxiv.org/html/2408.02487v3) that says "we evaluate 14 popular LLMs, finding that even top-performing LLMs produce a non-negligible proportion (0.88% to 2.01%) of code strikingly similar to existing open-source implementations".

I think AI policies are getting ahead of themselves when they assume that a contributor can vouch for license compliance. There needs to be some kind of lawyer-approved solution to this so that the open source community is protected from a copyright mess.


How to check copyright?

Posted Oct 3, 2025 15:25 UTC (Fri) by farnz (subscriber, #17727) (4 responses)

There's a critical piece of data missing - what proportion of human-written code is strikingly similar to existing open-source implementations?

We know that humans accidentally and unknowingly infringe, too. Why can't we reuse the existing lawyer-approved solution to that problem for LLM output?

How to check copyright?

Posted Oct 3, 2025 16:47 UTC (Fri) by Wol (subscriber, #4433) (3 responses)

And another thing - how much copyright violation is being blamed on the LLM, when the query being *sent* to the LLM itself is a pretty blatant copyright violation? At which point we're seriously into "unclean hands", and if the querier is not the copyright holder, they could easily find themselves named as a co-defendant (quite likely the more culpable defendant!) even if they're not the deeper pocket.

If I had an LLM and found myself sued like that, I'd certainly want to drag the querier into it ...

Cheers,
Wol

How to check copyright?

Posted Oct 6, 2025 14:24 UTC (Mon) by stefanha (subscriber, #55072) (2 responses)

> If I had an LLM and found myself sued like that, I'd certainly want to drag the querier into it ...

Hence why contributors need a way to check copyright compliance.

How to check copyright?

Posted Oct 6, 2025 14:29 UTC (Mon) by farnz (subscriber, #17727)

TBF, you also need such a mechanism to check the copyright compliance of any code you've written yourself. You are also quite capable of accidental infringement (where, having seen a particular way to write code before, you copy it unintentionally). To defend yourself or the project you contribute to, you have to prove either that you never saw the original code that you're alleged to have copied (the clean room route) or that this code is on the "idea" side of the idea-expression distinction (however that's expressed in local law).

How to check copyright?

Posted Oct 6, 2025 14:36 UTC (Mon) by pizza (subscriber, #46)

> Hence why contributors need a way to check copyright compliance.

This is a legal problem, and cannot be solved via (purely, or even mostly) technical means.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds