AVX-512
Posted Oct 1, 2022 11:31 UTC (Sat) by atnot (subscriber, #124910)In reply to: AVX-512 by drago01
Parent article: Hybrid scheduling gets more complicated
Posted Oct 2, 2022 7:30 UTC (Sun)
by drago01 (subscriber, #50715)
Posted Oct 2, 2022 12:06 UTC (Sun)
by khim (subscriber, #9252)
The biggest problem there is that the decision to use (or not use) SSE, AVX, or AVX-512 is local (you pick these at the level of tiny, elementary functions), while the question of whether AVX-512 is beneficial or not is global. It is essentially the same dilemma that killed Itanic, just not as acute.
Posted Oct 3, 2022 9:32 UTC (Mon)
by farnz (subscriber, #17727)
The difficulty comes in when my workload is scheduled on a single server with other workloads. The right decision for my workload, if it is on a machine by itself, is AVX-512 at the lower clocks; however, depending on what the scheduler does, the right decision might become AVX2 if other workloads are more important than mine and are adversely affected by the core doing AVX-512 downclocking.
This is the problem with using local state ("does this OS thread make use of AVX-512") to drive a global decision ("what clock speed should this core run at"). The correct answer depends not only on my workload, but also on all other workloads sharing this CPU core - which is fine for HPC type workloads, where there are no other workloads sharing a CPU core, but more of a problem with general deployment of AVX-512.
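To make the local-versus-global point concrete, here is a toy model in Python. Everything in it is an assumption made for illustration: the `MAX_CLOCK_GHZ` table, the `Workload` fields, and the rule that a shared core runs at the lowest clock any sharer requires are simplifications, not a description of how any real CPU or the kernel scheduler behaves.

```python
# Toy model only: the clock figures and the "lowest clock wins" rule are
# illustrative assumptions, not measurements of any real CPU or kernel.
from dataclasses import dataclass

# Hypothetical maximum clock (GHz) a core may run at for each instruction set.
MAX_CLOCK_GHZ = {"scalar": 3.0, "avx2": 3.0, "avx512": 2.8}


@dataclass
class Workload:
    name: str
    isa: str               # local choice, made inside the workload's own code
    work_per_cycle: float  # relative work done per cycle with that choice


def core_clock(workloads):
    """Global outcome: the core runs at the lowest clock any sharer requires."""
    return min(MAX_CLOCK_GHZ[w.isa] for w in workloads)


def throughput(workloads):
    """Relative throughput of each workload per unit of CPU time it receives."""
    clock = core_clock(workloads)
    return {w.name: w.work_per_cycle * clock for w in workloads}


# "mine" gains 1.5x per cycle from AVX-512 (an assumed figure); "other" is scalar.
mine_avx512 = Workload("mine", "avx512", 1.5)
mine_avx2 = Workload("mine", "avx2", 1.0)
other = Workload("other", "scalar", 1.0)

with_avx512 = throughput([mine_avx512, other])
with_avx2 = throughput([mine_avx2, other])
print("mine uses AVX-512:", with_avx512)
print("mine uses AVX2:   ", with_avx2)
```

In this sketch, "mine" does better with AVX-512 while "other" does worse, even though "other" never made any choice about instruction sets. Which configuration is "right" depends on the relative importance of the two workloads, which is information neither function-level choice has access to.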
As a side note, as Intel moves on with process from the 14nm of original AVX-512 CPUs, the downclock becomes less severe, and it's nearly non-existent on the latest designs. This, to me, suggests that the downclock is a consequence of backporting AVX-512 to Skylake on 14nm, and thus will become a historic artefact over time.
Posted Oct 3, 2022 15:15 UTC (Mon)
by drago01 (subscriber, #50715)
Clocks don't matter much, though; what matters is the performance you are getting. And if your workload benefits from wide vectors, that benefit will offset any clock changes.
Posted Oct 3, 2022 15:33 UTC (Mon)
by farnz (subscriber, #17727)
The critical difference is that with SKX, the maximum permitted clock assuming that thermals allowed was massively reduced for "heavy" AVX-512, because it caused thermal hot-spots on the chip that weren't properly accounted for by "normal" thermal monitoring. With ICL and with RKL there's no longer a huge limit - instead of the SKX thing (where a chip could drop from 3.0 GHz "base" to 2.8 GHz "max turbo" if you used AVX-512), you now can always sustain the same "base", but the max turbo is reduced by 100 MHz or so.
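As a rough worked example of what the SKX figures above imply, the following Python sketch computes the per-cycle speedup that vector code needs just to break even. The two clock values are the ones quoted in this comment; the model itself (performance = clock × work per cycle) and the function names are simplifying assumptions for illustration.

```python
# Clock figures are the SKX example quoted above; the model
# (performance = clock * work per cycle) is a simplifying assumption.
BASE_GHZ = 3.0    # clock without heavy AVX-512
AVX512_GHZ = 2.8  # maximum clock under heavy AVX-512


def break_even_speedup(normal_ghz, reduced_ghz):
    """Per-cycle speedup needed just to match performance at the normal clock."""
    return normal_ghz / reduced_ghz


def net_speedup(per_cycle_speedup, normal_ghz, reduced_ghz):
    """Overall speedup once the lower clock is taken into account."""
    return per_cycle_speedup * reduced_ghz / normal_ghz


needed = break_even_speedup(BASE_GHZ, AVX512_GHZ)
print(f"break-even per-cycle speedup: {needed:.3f}")
print(f"net speedup for a 2x per-cycle gain: {net_speedup(2.0, BASE_GHZ, AVX512_GHZ):.3f}")
print(f"net speedup for code that gains nothing: {net_speedup(1.0, BASE_GHZ, AVX512_GHZ):.3f}")
```

Under this model, code that gets about 7% or more out of the wider vectors comes out ahead, while anything else sharing the core that gets no vector benefit simply runs about 7% slower.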
Posted Oct 13, 2022 13:54 UTC (Thu)
by roblucid (guest, #48964)
But given how CPUs work nowadays, that's not entirely true either, because a lighter workload will result in higher clocks and vice versa. CPUs try to maximize performance within the power budget.
