
Debian dismisses AI-contributions policy

Posted May 11, 2024 13:56 UTC (Sat) by Paf (subscriber, #91811)
In reply to: Debian dismisses AI-contributions policy by josh
Parent article: Debian dismisses AI-contributions policy

“ Humans and AI are not the same…

AI, on the other hand, should be properly considered a derivative work of the training material. The alternative would be to permit AI to perform laundering of copyright violations.”

I would like to understand better why this is. Plenty of things in my brain are in fact covered by copyright, and I could likely reproduce quite a bit of copyrighted material from memory. For humans, though, infringement is judged entirely by how much of the input material ends up in the output, not by the mere fact of having memorized it.

If we’re just saying “humans are different”, it would be nice to understand *why* in detail, and whether anything non-human could ever clear those hurdles. I get the distinct sense that a lot of these arguments actually boil down to “humans are special and nothing else is like a human, because humans are special”.



Debian dismisses AI-contributions policy

Posted May 11, 2024 14:38 UTC (Sat) by willy (subscriber, #9762) (2 responses)

I actually don't have a problem with "humans are special". You can't meaningfully kill an AI. You can't send an AI to prison. An AI cannot get married. And so on.

Debian dismisses AI-contributions policy

Posted May 12, 2024 1:29 UTC (Sun) by Paf (subscriber, #91811) (1 response)

I guess to this I'd just say that I grew up watching sci-fi, and I am not so comfortable stating that humans are simply special. It's not a principle I feel at all comfortable with as a basis for general moral reasoning.

Debian dismisses AI-contributions policy

Posted May 13, 2024 1:38 UTC (Mon) by raven667 (subscriber, #5198)

> I grew up watching sci-fi

Sci-fi is also fiction: it swaps people for aliens or AI or whatever so it can talk about power dynamics and relationships without existing bias creeping in. That doesn't mean LLMs are "alive" or "moral agents" in any way; they are nowhere near complex and complete enough for that to be a consideration. People see faces in the side of a mountain or a piece of toast, and they perceive the output of LLMs in the same way, mistaking cogent-sounding statistical probability for intelligence. There is no there there: while an LLM might in some small way approximate thought, it's thoroughly lobotomized, with no concept of concepts.

Debian dismisses AI-contributions policy

Posted May 11, 2024 20:21 UTC (Sat) by flussence (guest, #85566) (5 responses)

> If we’re just saying “humans are different”, it would be nice to understand *why* in detail and if anything non human could ever clear those hurdles.

Are you saying there's a threshold of "AI-ness" past which, if someone were caught distributing a 1TB torrent of Disney DVD rips and RIAA MP3s, encrypted with a one-time pad produced by a key derivation function with a trivially guessable input, it would be the torrent file itself that gets arrested instead? Does a training set built by stealing the work of others have legal personhood now? Do the colour of the bits and the intent of the deed no longer matter to a court if the proponent of the technology is sufficiently high on their own farts?

Debian dismisses AI-contributions policy

Posted May 12, 2024 1:28 UTC (Sun) by Paf (subscriber, #91811) (4 responses)

I don't think I understand this comment; it seems to start from the premise that computerized processes are inherently different from biological ones and just proceeds from there. I can't really engage on those terms - there's no argument to have.

Debian dismisses AI-contributions policy

Posted May 13, 2024 10:54 UTC (Mon) by LtWorf (subscriber, #124958) (3 responses)

A person can learn C from one book. An AI needs millions of books. Surely you see a difference of that many orders of magnitude?

Debian dismisses AI-contributions policy

Posted May 13, 2024 15:40 UTC (Mon) by atnot (subscriber, #124910) (2 responses)

I think calling it "learning C" is being too generous. If you learn a language like C from nothing, you will have a relatively complete understanding of the language and be able to write semi-working, conceptually correct solutions to pretty arbitrary simple problems with relative ease.

LLMs don't have that; they just try to predict what the answer would be on stackoverflow. Including, apparently, much to my delight, "closed as duplicate". If you try using them for actually writing code, it very quickly becomes clear they have no actual understanding of the language beyond stochastically regurgitating online tutorials[1]. They falter as soon as you ask for something that isn't a minor variation of a common question or something that has been uploaded to github thousands of times.

If we are to call both of these things "learning", we do have to acknowledge that they are drastically different meanings of the term.

[1] And no, answers to naive queries about how X works do not prove it "understands" X, merely that the training data contains enough instances of this question being answered for it to be memorizable. Which, for a language like C, is going to be a lot. Consider e.g. that an overwhelming majority of universities in the world have at least one C course.

Debian dismisses AI-contributions policy

Posted May 13, 2024 15:44 UTC (Mon) by bluca (subscriber, #118303)

> LLMs don't have that; they just try to predict what the answer would be on stackoverflow. Including, apparently, much to my delight, "closed as duplicate". If you try using them for actually writing code, it very quickly becomes clear they have no actual understanding of the language beyond stochastically regurgitating online tutorials[1]. They falter as soon as you ask for something that isn't a minor variation of a common question or something that has been uploaded to github thousands of times.

That's really not true for the normal use case, which is fancy autocomplete. It doesn't just regurgitate online tutorials or stack overflow; it provides autocompletion based on the body of work you are currently working on, which is why it's so useful as a tool. The process is the same stochastic parroting, mind you; of course language models don't really learn anything in the sense of gaining an "understanding" of something in the human sense.

Debian dismisses AI-contributions policy

Posted May 13, 2024 20:39 UTC (Mon) by rschroev (subscriber, #4164)

Have you tried something like CoPilot? I've been trying it out a bit over the last three weeks (somewhat grudgingly). One of the things that became clear quite soon is that it does not just get its code from StackOverflow and GitHub and the like; it clearly tries to adapt to the body of code I'm working on (it certainly doesn't always get it right, but that's a different story).

An example, to make things more concrete. Let's say I have a struct with about a dozen members, and a list of key-value pairs, where those keys are the same as the names of the struct members, and I want to assign the values to the struct members. I'll start writing something like:

for (auto &kv : kv_pairs) {
	// Copy each value into the struct member whose name matches the key.
	if (kv.first == "name")
		mystruct.name = kv.second;
	// ...
}

It then doesn't take long before CoPilot starts autocompleting with the remaining struct members, offering me the exact code I was trying to write, even when I'm pretty sure the names I'm using are unique and not present in publicly accessible sources.
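To make that concrete, here is a minimal, self-contained sketch of the kind of completed loop being described; the struct, its members, and the apply() helper (Config, name, host, port, user) are hypothetical names invented purely for illustration, not taken from the code above:

#include <string>
#include <utility>
#include <vector>

// Hypothetical struct standing in for the "about a dozen members" case.
struct Config {
	std::string name;
	std::string host;
	std::string port;
	std::string user;
};

// Copy each key-value pair into the member with the matching name; the
// chain of branches is the part a Copilot-style completion fills in.
void apply(Config &cfg, const std::vector<std::pair<std::string, std::string>> &kv_pairs)
{
	for (const auto &kv : kv_pairs) {
		if (kv.first == "name")
			cfg.name = kv.second;
		else if (kv.first == "host")
			cfg.host = kv.second;
		else if (kv.first == "port")
			cfg.port = kv.second;
		else if (kv.first == "user")
			cfg.user = kv.second;
	}
}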

I'm not commenting on the usefulness of all this; I'm just showing that what it does is not just applying StackOverflow and GitHub to my code.

We probably should remember that LLMs are not all alike. It may well be that e.g. ChatGPT would have a worse "understanding" (for lack of a better word) of my code, and would rely much more on what it learned before from public sources.

