Linear speedup

Posted Mar 27, 2025 16:27 UTC (Thu) by Spudd86 (subscriber, #51683)
In reply to: Linear speedup by Athas
Parent article: Two new graph-based functional programming languages

That's a memory placement problem, and it would also be under the control of the compiler. Because of the execution model of GPUs, there is far less reason to keep a single representation of the data. It's usually far better to build kernels that store their output to a new copy, and to lay that copy out in whatever way is best for the next step.
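
To make that concrete, here is a rough sketch in plain Python rather than real GPU code. The function names and the array-of-structs to struct-of-arrays change are my own illustrative assumptions, not something taken from either language in the article. The first step never touches its input; it writes a fresh copy in the layout the second step wants to read.

```python
# Hypothetical illustration: each "kernel" is a pure function that returns a
# freshly laid-out copy of its result instead of updating its input in place.

def step1_scale(points_aos, factor):
    """Read array-of-structs [(x, y), ...] and emit struct-of-arrays (xs, ys).

    Each output element depends only on the input element at the same index,
    so every element could be computed by an independent thread.
    """
    xs = [x * factor for x, _ in points_aos]
    ys = [y * factor for _, y in points_aos]
    return xs, ys  # new copy, new layout; the input is left untouched


def step2_sum_per_axis(xs, ys):
    """The next step reads each axis as one contiguous sequence."""
    return sum(xs), sum(ys)


points = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
xs, ys = step1_scale(points, 2.0)
totals = step2_sum_per_axis(xs, ys)
```

On a real GPU the point of the layout change would be things like coalesced memory access in the second kernel; the sketch only shows the shape of the transformation, which a compiler could in principle choose for you.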

If you can do the analysis to prove something can be done in parallel without locks, you can likely also use the same analysis to reorganize the data. That is especially true in functional languages, where people aren't writing update-in-place logic in the first place.