Hardware-level micro-op optimization

Posted Jul 12, 2018 18:57 UTC (Thu) by corbet (editor, #1)
In reply to: Hardware-level micro-op optimization by ncm
Parent article: Spectre V1 defense in GCC

> However, a hardware-level optimization to make a conditional move unconditional because
> the optimizer knows nothing has changed the status bit since its last use is not speculation.

If said "last use" was speculative, and thus the state of the condition code is speculative, then using that code for optimization *is* speculation. The whole point is what happens during speculative execution; the instruction is a no-op in the real world. But an instruction that is defined as not being executed speculatively cannot be elided as the result of a speculative branch prediction.

Hardware-level micro-op optimization

Posted Jul 12, 2018 23:27 UTC (Thu) by jcm (subscriber, #18262)

Jon is right in his summary. But the point about uop caching and optimization is still a good one. Multiple efforts are underway in the industry to analyze this part of the front end in more detail for side channels. There are quite a few interesting possibilities I can think of, in particular with abuse of value prediction. I've asked a few research teams to consider looking at how badly people screwed up value predictors.

Hardware-level micro-op optimization

Posted Jul 12, 2018 23:44 UTC (Thu) by ncm (guest, #165)

I agree, but the op that set the status flag was not a speculative op (unless it was in a block that is itself speculative*). It was a regular check that was supposed to be guarding the block where we inserted the conditional move. Most likely there are no micro-ops between it and the conditional move, which leaves the move ideally situated to be made unconditional.

(*Speculation may pile upon speculation, up to the limit of microarchitectural resources.)

Ultimately we will need assurances from vendors that the conditional nature of the move is not, and won't ever be, optimized away. Later, we will want another version of conditional move that we specifically allow to be micro-optimized; but first things first.
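
For readers who want the pattern in front of them, here is a rough C sketch of the kind of sequence being discussed: a regular bounds check, followed immediately by a masking step that a compiler might lower to a compare plus conditional move. The names in_bounds_mask and load_guarded are made up for illustration. Nothing in C guarantees a cmov is emitted, and an optimizing compiler is free to delete the mask as redundant, which is why the compiler has to provide the barrier itself. Whether the hardware then keeps the move conditional is exactly the assurance asked for above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: all-ones when idx < len, zero otherwise.
 * Branch-free at the source level, so a compiler may turn it into a
 * flag-setting compare followed by a conditional move (or similar).
 * That lowering is an assumption, not something C promises. */
static inline size_t in_bounds_mask(size_t idx, size_t len)
{
    return (size_t)0 - (size_t)(idx < len);
}

/* Hypothetical guarded load illustrating the pattern from the thread. */
uint8_t load_guarded(const uint8_t *table, size_t len, size_t idx)
{
    if (idx < len) {
        /* The regular check above sets the condition; this masking step
         * sits right behind it. Architecturally it changes nothing here.
         * Under a mispredicted branch it is meant to force idx to 0.
         * Caveat: an optimizer may remove it because idx < len is
         * already known to be true inside this block. */
        idx &= in_bounds_mask(idx, len);
        return table[idx];
    }
    return 0; /* out of range: nothing is loaded */
}
```

Architecturally, load_guarded behaves exactly like a plain bounds-checked load; the only intended difference is in what a mispredicted path can reach.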