Completing and merging core scheduling
Posted May 14, 2020 15:57 UTC (Thu) by jtaylor (guest, #91739)
Parent article: Completing and merging core scheduling
This is a very surprising benchmark result. I would not have assumed that turning SMT on or off makes any significant difference for a linpack benchmark.
Typically, well-written linear algebra programs keep all relevant components of the CPU busy constantly, with very little idle time. As I understand it, SMT does not increase the number of floating-point units of the physical CPU; it only allows them to be used by another thread when they are idle, which should not be the case for most numerical code.
Poorly written code, or operations below BLAS level 3, may of course be blocked on memory loads, but in that case SMT on or off should not make much difference either, as it does not double your memory bandwidth (and linpack should not be poorly written).
In my experience, SMT gives numerical programs almost zero performance improvement. In most cases it decreases performance when the program assumes a logical core is a physical core and starts too many workers, leading to cache thrashing.
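To illustrate why operations below BLAS level 3 tend to wait on memory, here is a rough back-of-the-envelope sketch of flops per byte of memory traffic for one representative routine per level. It is an idealized estimate only: it assumes square n x n double-precision operands, counts each operand as moved exactly once, and ignores caches. The function name arithmetic_intensity is made up for this example.

```python
def arithmetic_intensity(n, bytes_per_elem=8):
    """Idealized flops per byte for n-sized double-precision operations.

    Assumption: every operand is read or written exactly once and
    caches are ignored, so these are upper bounds on data reuse.
    """
    # Level 1 (dot product): about 2n flops, two length-n vectors read.
    dot = (2 * n) / (2 * n * bytes_per_elem)
    # Level 2 (matrix-vector): about 2n^2 flops, matrix and x read, y written.
    gemv = (2 * n * n) / ((n * n + 2 * n) * bytes_per_elem)
    # Level 3 (matrix-matrix): about 2n^3 flops, A and B read, C written.
    gemm = (2 * n ** 3) / (3 * n * n * bytes_per_elem)
    return {"dot": dot, "gemv": gemv, "gemm": gemm}


if __name__ == "__main__":
    for name, value in arithmetic_intensity(1000).items():
        print(f"{name}: {value:.3f} flops/byte")
```

Under these assumptions, level 1 and level 2 stay at a fraction of a flop per byte regardless of n, while level 3 grows linearly with n, which is why only the latter can keep the floating-point units fed from cache.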
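One way a program could avoid starting one worker per logical core is to count physical cores instead. The following is a minimal sketch, assuming an x86 Linux system whose /proc/cpuinfo lists "physical id" and "core id" fields for each logical CPU (other architectures may omit them). The helper name count_physical_cores and the sample text are hypothetical.

```python
def count_physical_cores(cpuinfo_text):
    """Count distinct (physical id, core id) pairs in /proc/cpuinfo-style text.

    Assumption: each logical CPU entry lists "physical id" before
    "core id", as on typical x86 Linux systems.
    """
    cores = set()
    physical_id = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip()
        if key == "physical id":
            physical_id = value
        elif key == "core id":
            # SMT siblings share the same pair, so the set deduplicates them.
            cores.add((physical_id, value))
    return len(cores)


# Made-up sample: one socket, two cores, two SMT threads per core.
SAMPLE = """\
processor : 0
physical id : 0
core id : 0

processor : 1
physical id : 0
core id : 1

processor : 2
physical id : 0
core id : 0

processor : 3
physical id : 0
core id : 1
"""

if __name__ == "__main__":
    print(count_physical_cores(SAMPLE))
```

On a real machine the text would come from reading /proc/cpuinfo, and the result could be used as the worker count in place of the logical CPU count.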
