
Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 13:52 UTC (Mon) by mb (subscriber, #50428)
In reply to: Parts of Debian dismiss AI-contributions policy by anselm
Parent article: Debian dismisses AI-contributions policy

>LLM is obviously not the personal mental creation of anyone

Well, that is not obvious at all.

Because the inputs were mental creations.
At which point did the data lose its "mental creation" status while traveling through the algorithm?
Will processing the input with 'sed' also remove it, because the output is completely processed by a program, not a human being?
What level of processing do we need for the "mental creation" status to be lost? How many chained 'sed's do we need?
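
To make that concrete, here is a rough sketch (in Python rather than sed, purely for illustration) of a chain of mechanical transformations; no step adds any creative input of its own, no matter how many passes are stacked up:

    import codecs
    import re

    # A chain of purely mechanical transformations (a stand-in for chained 'sed's).
    # No step adds any creative input; the output is a deterministic function of
    # the input, however many passes are applied.
    def mechanical_chain(text: str) -> str:
        text = re.sub(r"\s+", " ", text)      # pass 1: normalise whitespace
        text = codecs.encode(text, "rot13")   # pass 2: rot13 the letters
        text = codecs.decode(text, "rot13")   # pass 3: ...and rot13 them back
        return text.upper()                   # pass 4: shout

    original = "A sentence that is somebody's   personal mental creation."
    print(mechanical_chain(mechanical_chain(original)))   # two chained runs, or twenty: same question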



Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 21:39 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (7 responses)

Chained sed isn’t going to solve it.

Even “mechanical” transformation by humans does not create a work (as defined by the UrhG, i.e. German copyright law). It has to involve some creativity.

Until then, it’s merely a transformation of the original work(s) and therefore bound by the (sum of the) terms and conditions on the original work(s).

If you have a copyrighted thing, you can print it out, scan it, compress it as JPEG, store it into a database… it’s still just a transformation of the original work, and you can retrieve a sufficiently substantial part of the original work from it.

The article where someone reimplemented a (slightly older version of) ChatGPT in a 498-line PostgreSQL query shows, exactly and in an easily understandable way, how this is just lossy compression and decompression: https://explainextended.com/2023/12/31/happy-new-year-15/
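
To make the compression/decompression point concrete, here is a deliberately crude sketch of mine (far simpler than the SQL in that article): a word-level bigram table built from a couple of source sentences, then “decompressed” greedily from a seed word. Verbatim spans of the originals come straight back out:

    from collections import Counter, defaultdict

    # Toy "model": a bigram table built from the training texts.
    def train(texts):
        table = defaultdict(Counter)
        for text in texts:
            words = text.split()
            for prev, nxt in zip(words, words[1:]):
                table[prev][nxt] += 1
        return table

    # "Decompression": greedily emit the most frequent continuation of the context.
    def generate(table, seed, length=12):
        out = [seed]
        for _ in range(length):
            followers = table.get(out[-1])
            if not followers:
                break
            out.append(followers.most_common(1)[0][0])
        return " ".join(out)

    corpus = [
        "the quick brown fox jumps over the lazy dog",
        "the lazy dog sleeps all day",
    ]
    model = train(corpus)
    print(generate(model, "the"))   # reproduces a training sentence verbatim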

There are now feasible attacks for obtaining “training data” from production models at large scale, e.g.: https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html

This is sufficient to prove that these “models” are just databases with lossily compressed, but easily enough accessible, copies of the original, possibly (probably!) copyrighted, works.
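
The general shape of such a memorisation check fits in a few lines. query_model below is a hypothetical stand-in for however the model under test is actually queried (it is not a real API), and the real attack in the write-up linked above is considerably more involved:

    # Sketch of a verbatim-memorisation check, in the spirit of the linked
    # extraction work. query_model is a hypothetical placeholder.
    def query_model(prompt: str) -> str:
        raise NotImplementedError("plug in the model under test here")

    def memorized_suffix(document: str, prefix_words: int = 50) -> int:
        """Prompt with the start of a known document and count how many of
        the following words come back verbatim."""
        words = document.split()
        prompt = " ".join(words[:prefix_words])
        truth = words[prefix_words:]
        completion = query_model(prompt).split()
        matched = 0
        for got, expected in zip(completion, truth):
            if got != expected:
                break
            matched += 1
        return matched   # a long verbatim run suggests the document was memorised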

Another thing I would like to point out is the relative weight. For a work which I offer to the public under a permissive licence, attribution is basically the only remuneration I can ever get. This means a failure to attribute carries a much higher weight than it does for differently licensed or unlicensed material.

Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 21:55 UTC (Mon) by bluca (subscriber, #118303) [Link] (6 responses)

> This is sufficient to prove that these “models” are just databases with lossily compressed, but easily enough accessible, copies of the original, possibly (probably!) copyrighted, works.

While the AI bandwagon greatly exaggerates the capabilities of LLMs, let's not fall into the opposite trap. ChatGPT et al. are toys; real applications like Copilot are very much not "just databases". A database is not going to provide you with autocomplete based on the current, local context open in your IDE. A database is not going to provide an accurate summary of the meeting that just finished, with action items and all that.

Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 22:20 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (5 responses)

Oh, it totally is. Please *do* read the explainextended article: it shows you precisely how the context is what parametrises the search query.
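
For concreteness, the same toy bigram idea from my earlier sketch can literally be written as a parametrised database query, with the current context as the bound parameter (a sketch only, nothing like the scale of the SQL in that article):

    import sqlite3

    # The toy bigram table, stored in SQLite. The "completion" is then
    # literally a parametrised SELECT, with the context as the parameter.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE bigram (context TEXT, next TEXT, n INTEGER)")
    conn.executemany(
        "INSERT INTO bigram VALUES (?, ?, ?)",
        [("the", "lazy", 2), ("the", "quick", 1), ("lazy", "dog", 2)],
    )

    def complete(context: str) -> str:
        row = conn.execute(
            "SELECT next FROM bigram WHERE context = ? ORDER BY n DESC LIMIT 1",
            (context,),
        ).fetchone()
        return row[0] if row else ""

    print(complete("the"))   # -> "lazy": the context selects the stored continuation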

Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 22:44 UTC (Mon) by bluca (subscriber, #118303) [Link] (4 responses)

No, it totally isn't, because it's not about reproducing existing things, which is the only thing a database query can do.

Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 23:14 UTC (Mon) by mirabilos (subscriber, #84359) [Link] (3 responses)

Just read that.

Consider a database in which things are stored lossily compressed and interleaved (yet still retrievable).

Parts of Debian dismiss AI-contributions policy

Posted May 13, 2024 23:58 UTC (Mon) by bluca (subscriber, #118303) [Link] (2 responses)

A database query doesn't work differently depending on local context. You very clearly have never used any of this, besides playing with toys like chatgpt, and it shows.

Parts of Debian dismiss AI-contributions policy

Posted May 14, 2024 0:28 UTC (Tue) by mirabilos (subscriber, #84359) [Link] (1 responses)

Just read the fucking explainextended article, which CLEARLY explains all this, or go back to breaking unsuspecting peoples’ nōn-systemd systems, or whatever.

I don’t have the nerve to even try and communicate with systemd apologists who don’t even do the most basic research themselves WHEN POINTED TO IT MULTIPLE TIMES.

Second try

Posted May 14, 2024 1:26 UTC (Tue) by corbet (editor, #1) [Link]

OK, I'll state it more clearly: it's time to bring this thread to a halt; it's not getting anywhere.

That means all participants should stop, not just the one I'm responding to here.

Thank you.

