Preferred form of modification

Posted Mar 10, 2026 15:42 UTC (Tue) by geofft (subscriber, #59789)
In reply to: Preferred form of modification by kleptog
Parent article: Debian decides not to decide on AI-generated contributions

In the sense of bit-for-bit reproducibility, yes. But in the sense of human understanding, I think it is actually like flex/bison: if you want to understand what a parser is doing, you're going to have a better time looking at the highest-level inputs instead of the C code. Or for autoconf, which I'm more familiar with: it's always a better time editing configure.ac and rerunning autoconf than editing ./configure directly, and also this remains true even if you're on a different version of autoconf and your regenerated ./configure has a whole bunch of uninteresting changes, because the intent of the two generated ./configure files is the same.

The term "preferred form of modification" is from the GPL, and is intended to protect the four software freedoms, specifically, the freedom to study and improve the software, and I think it should be interpreted in that context. By the word "modification" it implies not trying to regenerate anything exactly. I think it's a natural extension to reproducible builds to desire that a small change to the sources produces a correspondingly small change in the binary, but that is not a requirement for the sort of reproducibility you want for automated builds, and it's quite common (especially with compiler optimizations, etc.) for this not to be true already.

For the goal of bit-for-bit reproducibility, I wonder if you can do something like check in both the input and output of the LLM, as well as a proof that the output was generated from the given neural network and given inputs. That proof probably just takes the form of the RNG bitstream and the specific order of evaluation (even if you use a DRBG to deal with the randomness, my understanding is that operating on differently-shaped hardware with different parallelism is going to trigger some chaos theory in the outputs of a neural network). Apparently it is also more efficient to verify matrix multiplication than to actually perform it (Freivalds' algorithm). This might be both too much data and too much computation to be practical at the moment, but maybe it's what we do many years in the future.
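As a rough illustration of the Freivalds' algorithm idea, here is a minimal Python sketch. It assumes small integer matrices stored as lists of lists, so that equality is exact; a real neural network works in floating point, where the comparison would need a tolerance. The helper names (matvec, freivalds_check) are made up for this example.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: O(rows * cols) work."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def freivalds_check(a, b, c, rounds=20, rng=None):
    """Probabilistically check that a @ b == c without computing a @ b.

    Each round multiplies by a random 0/1 vector r and compares
    a @ (b @ r) with c @ r, which costs only matrix-vector products.
    A wrong c survives a single round with probability at most 1/2.
    """
    rng = rng or random.Random()
    cols = len(c[0])
    for _ in range(rounds):
        r = [rng.randint(0, 1) for _ in range(cols)]
        if matvec(a, matvec(b, r)) != matvec(c, r):
            return False  # definitely not the product
    return True  # the product, except with probability <= 2**-rounds


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    print(freivalds_check(a, b, [[19, 22], [43, 50]]))  # correct product
    print(freivalds_check(a, b, [[19, 22], [43, 51]]))  # one entry is wrong
```

The verifier never forms the full product, so each round is quadratic rather than cubic in the matrix size. Whether that saving is enough to make checking a whole LLM run practical is a separate question.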



Preferred form of modification

Posted Mar 10, 2026 16:30 UTC (Tue) by neggles (subscriber, #153254) [Link] (3 responses)

Running the same LLM with the same prompts and the same RNG seed on the same device type will always produce the same output, so there's that.
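The "same device type" caveat matters because floating-point addition is not associative, so hardware that groups a reduction differently can round differently. A small Python sketch of that effect, using plain floats rather than anything LLM-specific (the function names are invented for illustration):

```python
from functools import reduce


def sum_left_to_right(values):
    """Accumulate as ((v0 + v1) + v2) + ..."""
    return reduce(lambda acc, x: acc + x, values)


def sum_right_to_left(values):
    """Accumulate as v0 + (v1 + (v2 + ...))"""
    return reduce(lambda acc, x: x + acc, reversed(values))


if __name__ == "__main__":
    values = [0.1, 0.2, 0.3]
    left = sum_left_to_right(values)
    right = sum_right_to_left(values)
    # Same numbers, same operation, different grouping: the results differ
    # in the last bit.
    print(left, right, left == right)
```

A parallel reduction on a GPU picks its grouping based on how the work is split up, which is one plausible way for two devices to disagree on the low bits of a logit.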

Preferred form of modification

Posted Mar 10, 2026 16:35 UTC (Tue) by koverstreet (✭ supporter ✭, #4296) [Link] (2 responses)

No, it won't. You can run an LLM that way - temperature = 0 - but you generally don't want to. Like in many other algorithms, introducing some stochastic noise often produces better results.

Preferred form of modification

Posted Mar 10, 2026 16:54 UTC (Tue) by geofft (subscriber, #59789) [Link]

There's a difference between setting the temperature to zero, i.e. deterministically taking the most-likely token (basically changing softmax to regular-old deterministic max), and using a PRNG with a deterministic seed with a non-zero temperature, which will sometimes take less-likely tokens but will make that decision in the same way for every re-execution of the same network (with the same hardware, resources, etc.) with the same seed. I agree that setting the temperature to zero is probably not what you want.
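To make the distinction concrete, here is a toy Python sketch. It assumes a fixed list of logits standing in for a model's output at every step (a real model would recompute them per token), and sample_token and generate are hypothetical helpers, not any library's API.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from logits at the given temperature."""
    if temperature == 0:
        # Temperature zero: plain argmax, no randomness involved.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Non-zero temperature: softmax over scaled logits, then a random draw.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def generate(logits, temperature, seed, steps=20):
    """Draw `steps` tokens using a PRNG seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_token(logits, temperature, rng) for _ in range(steps)]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, 0.1]
    # Same seed, non-zero temperature: identical sequences on every run.
    print(generate(logits, 0.8, seed=12345) == generate(logits, 0.8, seed=12345))
    # Temperature zero: always the most likely token, whatever the seed.
    print(generate(logits, 0.0, seed=1))
```

The seeded run can still pick less-likely tokens; it just makes the same picks every time, which is the property that matters for reproducibility.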

Preferred form of modification

Posted Mar 10, 2026 17:34 UTC (Tue) by phm (subscriber, #168918) [Link]

> No, it won't. You can run an LLM that way - temperature = 0 - but you generally don't want to. Like in many other algorithms, introducing some stochastic noise often produces better results.

Given the same settings (temperature, model, RNG seed) an LLM will produce the same results. Here are some example sessions run with llama.cpp (LLM is mdradermacher's quantization i1 of Apertus 8B, abliterated) on a Thinkpad.

t420:~/llama.cpp/build/bin$ ./llama-cli --temp 20 --seed 12345 -m $LLM

> Hello!

Hello! It's great to connect with you. If you have a question or a [^C]

# Running the same command again:

t420:~/llama.cpp/build/bin$ ./llama-cli --temp 20 --seed 12345 -m $LLM

> Hello!

Hello! It's great to connect with you [^C]

./llama-cli --temp 20000000 --seed 12345 -m $LLM

> Hello!

Hello! Welcome to the SwissAI assistant service. What can I help [^C]

Preferred form of modification

Posted Mar 10, 2026 19:50 UTC (Tue) by ptime (subscriber, #168171) [Link] (3 responses)

Flex/bison involve a deterministic mapping between the high-level abstraction and the generated code. LLMs do not.

Preferred form of modification

Posted Mar 10, 2026 20:17 UTC (Tue) by geofft (subscriber, #59789) [Link] (2 responses)

I don't think this comment responds to any of the things I said about determinism, nor does it take into account any of the existing comments about how deterministic execution of LLMs is entirely possible.

Preferred form of modification

Posted Mar 11, 2026 0:15 UTC (Wed) by ptime (subscriber, #168171) [Link]

It might be possible, just like it might be possible to take the tires off a bike and put furniture casters on instead, but the nondeterminism is why people want to use the LLMs in the first place.

Preferred form of modification

Posted Mar 12, 2026 3:46 UTC (Thu) by gf2p8affineqb (subscriber, #124723) [Link]

But determinism isn't the only point. The point is that other tools have formal semantics, and that changes have a predictable local effect. Compare that to an LLM, where the semantics are "whatever it outputs" and no one can predict how the output changes in response to a change in input.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds