
Determinism, reproducibility, and numerical analysis

Posted Jun 6, 2025 10:44 UTC (Fri) by farnz (subscriber, #17727)
In reply to: Accelerators, parallelism, and determinism by DemiMarie
Parent article: The importance of free software to science

GPUs do provide the same IEEE 754 guarantees on determinism when it comes to floats as CPUs do; they're fully deterministic machines in that regard.

The reason people talk about GPUs as non-deterministic is all in arbitration logic; arbitration logic is often deliberately non-deterministic between requesters of the same priority, and in GPUs that happens during memory access and in hardware scheduling of GPU work items.

Also note that having internal non-determinism (e.g. because you're reordering FP operations) does not imply that your result is non-deterministic. You can do things with numerical analysis that prove that, for a given allowed set of possible orders of operation, the output is always the same; if you then ensure that the allowed set of possible orders of operation in your analysis is a non-strict superset of the possible orders of operation due to non-determinism in your implementation, you have a proof that you have a deterministic output from a non-deterministic machine.
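A minimal Python sketch of that idea, not taken from the comment itself: it brute-forces every left-to-right summation order for two hypothetical input sets. The function name `sums_over_all_orders` and both input lists are illustrative assumptions. For the first set, every partial sum is a small integer that a double represents exactly, so no rounding can occur and the result cannot depend on order; for the second, rounding depends on order and the results differ.

```python
import itertools


def sums_over_all_orders(values):
    """Return the set of results of left-to-right summation in every order."""
    results = set()
    for order in itertools.permutations(values):
        total = 0.0
        for x in order:
            total += x
        results.add(total)
    return results


# Assumption: small integers, so every partial sum is exact (far below 2**53).
exact_inputs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Assumption: values chosen so that rounding depends on the order of addition.
inexact_inputs = [0.1, 0.2, 0.3, 1e16, -1e16]

exact_results = sums_over_all_orders(exact_inputs)
inexact_results = sums_over_all_orders(inexact_inputs)

print("order-independent:", len(exact_results) == 1)
print("order-dependent:  ", len(inexact_results) > 1)
```

Brute force only works for tiny inputs and only covers sequential orders; the point of a real numerical analysis is to prove the same property for every order the hardware is allowed to pick, without enumerating them.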

And numerical analysis can extend beyond purely numerical results; if your prediction is that a statistical measure of the simulation's output falls in a range, I can do numerical analysis to show that the simulation's output's error bars are such that the statistical measure must fall inside that range if it falls inside an analytically determined range for a single run of the simulation.

Finally, note that if your result is not reproducible, what you're doing is arguably not science; reproducibility is necessary because otherwise you can claim results from your simulation and insist that your hypothesis is correct, even though my results are different and falsify your hypothesis, simply by saying that my run of the simulation is wrong.



Determinism, reproducibility, and numerical analysis

Posted Jun 6, 2025 13:11 UTC (Fri) by Wol (subscriber, #4433)

> Finally, note that if your result is not reproducible, what you're doing is arguably not science; reproducibility is necessary because otherwise you can claim results from your simulation and insist that your hypothesis is correct, even though my results are different and falsify your hypothesis, simply by saying that my run of the simulation is wrong.

Not "arguably", it *cannot* be science.

Science is accurately predicting the results of your experiments, not just doing experiments and "seeing what happens" - that's called playing.

That's why whenever you see "a new experiment has proven that ...", you know either they don't know what they're doing, or they do know what they're doing and it's called propaganda/lying.

It's not Science until you do the exact same experiment and get the result you predicted. Nothing wrong with the prediction being vague, as long as it is correct as far as it goes - you can always refine it afterwards. And then do another experiment, of course!

Cheers,
Wol

Determinism, reproducibility, and numerical analysis

Posted Jun 7, 2025 16:48 UTC (Sat) by DemiMarie (subscriber, #164188)

At least CUDA explicitly does not guarantee reproducibility unless the hardware and software are unchanged. In fact, you have to opt in to determinism in some cases.

Determinism, reproducibility, and numerical analysis

Posted Jun 9, 2025 9:54 UTC (Mon) by farnz (subscriber, #17727)

CUDA's non-determinism is in scheduling and memory access, not in the computation itself. These are the same guarantees you get with a CPU; the computation is deterministic, but the point at which you work across CPU cores is non-deterministic.

However, non-determinism is not the same as non-reproducibility. There exist plenty of non-deterministic algorithms that have a deterministically reproducible output; it's just extra analysis steps to confirm that your result is reproducible from a set of non-deterministic intermediate steps.
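One textbook instance, offered here as an illustration rather than anything the comment names, is quicksort with random pivots: the sequence of partitions differs from run to run, yet the sorted output is fixed by the input alone. In this sketch, `randomized_quicksort`, the sample `data`, and the use of seeds to stand in for different non-deterministic runs are all assumptions.

```python
import random


def randomized_quicksort(items, rng):
    """Sort items using pivots drawn from rng; intermediate steps vary by rng."""
    if len(items) <= 1:
        return list(items)
    pivot = items[rng.randrange(len(items))]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)


data = [5, 3, 9, 1, 5, 7, 2, 8]

# Each seed models a different run with different pivot choices.
outputs = {tuple(randomized_quicksort(data, random.Random(seed))) for seed in range(50)}

print(len(outputs))  # one distinct output across all runs
```

The "extra analysis step" here is the usual correctness proof for quicksort, which holds for any pivot choice and so covers every run.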

And note that for science purposes, we don't necessarily even need a deterministic output; if your prediction is "this simulation will never have an output whose mean is below 64.0", you can do an analysis that confirms that the algorithm's error bars on the mean are (say) ±31.5 due to non-determinism. Then, if the simulation gets an output of 127.0, you know that, while you can't reproduce the exact value, you can deterministically answer the question "is the mean of the output greater than or equal to 64.0" with "yes, because the worst case an attempted reproducer will see is a mean of 95.5, and the error bar on that will tell them that the algorithm's output must be at or above 64.0".
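The arithmetic in that example can be written out as a short Python sketch. The function name `threshold_is_reproducible` is hypothetical, and the sketch assumes the ±31.5 bound has already been established by analysis and applies to each run's mean.

```python
def threshold_is_reproducible(observed_mean, half_width, threshold):
    """Apply the two-step worst-case argument from the example above.

    Step 1: the lowest mean a reproducer could observe.
    Step 2: the lowest value that reproducer's own error bar still allows.
    """
    worst_case_reproduction = observed_mean - half_width
    guaranteed_lower_bound = worst_case_reproduction - half_width
    return (
        worst_case_reproduction,
        guaranteed_lower_bound,
        guaranteed_lower_bound >= threshold,
    )


# Numbers from the example: observed mean 127.0, error bars ±31.5, threshold 64.0.
worst, bound, ok = threshold_is_reproducible(127.0, 31.5, 64.0)
print(worst, bound, ok)  # 95.5 64.0 True

# A hypothetical run that is too close to the threshold to support the claim.
print(threshold_is_reproducible(120.0, 31.5, 64.0))  # (88.5, 57.0, False)
```

All of these values are exactly representable in binary floating point, so the comparison against 64.0 is not itself subject to rounding.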

