Making CPython faster
Posted Jun 5, 2021 6:16 UTC (Sat) by mkbosmans (subscriber, #65556)
In reply to: Making CPython faster by Paf
Parent article: Making CPython faster
For a typical 'HPC' workload you would have a much larger working set: either a single bigger matrix or lots of these small-to-medium-sized matrices. In the latter case you would parallelize over the matrices and do each individual matrix multiplication in a single thread.
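As a rough illustration of parallelizing over the matrices rather than inside each product, here is a minimal sketch. The names `matmul`, `_multiply_pair` and `multiply_batch` are made up for this example, and the matrices are plain lists of rows. The executor is passed in by the caller: with a standard CPython build a thread pool will not speed up pure-Python arithmetic because of the GIL, so a `ProcessPoolExecutor` would be the likelier choice for real work.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Single-threaded product of two matrices given as lists of rows."""
    cols_b = list(zip(*b))  # columns of b, so each dot product zips a row with a column
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def _multiply_pair(pair):
    """Module-level helper so the task can also be pickled for a process pool."""
    a, b = pair
    return matmul(a, b)


def multiply_batch(pairs, executor):
    """One task per (a, b) pair: parallelism is across matrices only."""
    return list(executor.map(_multiply_pair, pairs))


if __name__ == "__main__":
    pairs = [([[1, 2], [3, 4]], [[5, 6], [7, 8]])] * 4
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(multiply_batch(pairs, pool)[0])  # [[19, 22], [43, 50]]
```

Each task here is a whole matrix product, so the per-task overhead is paid once per matrix rather than once per element.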
Posted Jun 5, 2021 9:02 UTC (Sat) by anton (subscriber, #25547)
Of course typical HPC workloads take more than 92 ms on one CPU; otherwise nobody would bother to build parallel systems for them, and nobody would bother with parallelizing the code for those systems. But the question is whether CPython programs behave like typical HPC workloads.
All 490,000 elements of the result matrix can be computed independently, i.e., in parallel. Parallel scalability in this case is limited by resource constraints and by parallel overhead (starting more threads, telling them their jobs, and waiting for all of them to complete their jobs).
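To make the independence concrete, here is a minimal sketch that splits the rows of one result matrix into blocks and hands each block to a worker. The names `row_block` and `parallel_matmul` and the `n_tasks` parameter are invented for this example; matrices are plain lists of rows. Each block reads only the inputs and writes only its own rows, and the submit and wait steps are where the parallel overhead mentioned above shows up. As before, threads in a standard CPython build will not speed up pure-Python arithmetic, so the executor is left to the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def row_block(a, cols_b, start, stop):
    """Compute result rows start..stop-1; needs no other part of the result."""
    return [[sum(x * y for x, y in zip(a[i], col)) for col in cols_b]
            for i in range(start, stop)]


def parallel_matmul(a, b, executor, n_tasks):
    """Split the result rows into at most n_tasks blocks and compute them as separate tasks."""
    cols_b = list(zip(*b))
    n = len(a)
    # Contiguous row ranges; empty ranges (when n_tasks > n) are skipped.
    bounds = [(k * n // n_tasks, (k + 1) * n // n_tasks) for k in range(n_tasks)]
    futures = [executor.submit(row_block, a, cols_b, lo, hi)
               for lo, hi in bounds if lo < hi]
    result = []
    for future in futures:  # waiting for every task is part of the overhead
        result.extend(future.result())
    return result


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    identity = [[1, 0], [0, 1]]
    with ThreadPoolExecutor(max_workers=2) as pool:
        print(parallel_matmul(a, identity, pool, n_tasks=2))  # [[1, 2], [3, 4], [5, 6]]
```

Raising `n_tasks` towards one task per element exposes more parallelism but also multiplies the start-up and wait costs, which is the trade-off that limits scalability for a product this small.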