Posted Sep 1, 2009 20:24 UTC (Tue) by jzbiciak
In reply to: HWPOISON
Parent article: HWPOISON
That's not how I read this. See section 15.6, "Recovery of Uncorrected Recoverable Errors" and especially 15.6.3, "UCR Error Classification".
The first two error types are the "an error was detected, but the CPU hasn't consumed the errant data yet" error types. If you want to pick nits, the first one (UCNA) is not reported as a Machine Check Exception; rather it is reported as a Corrected Machine Check Error Interrupt (described in Section 15.5). My bad for being sloppy; it is a Machine Check Error, but it isn't a Machine Check Exception. The second recoverable error type (SRAO) is a Machine Check Exception, however.
In any case, both are machine checks.
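To make the classification concrete, here is a rough sketch of how software might sort a machine-check bank's status value into those classes. The bit positions (VAL=63, UC=61, PCC=57, S=56, AR=55) are my reading of the IA32_MCi_STATUS layout in the manual, and the macro, enum, and function names are made up for illustration; this is not code from any kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed IA32_MCi_STATUS flag positions; check them against the manual. */
#define MCI_STATUS_VAL (1ULL << 63) /* register contents are valid */
#define MCI_STATUS_UC  (1ULL << 61) /* error was not corrected */
#define MCI_STATUS_PCC (1ULL << 57) /* processor context corrupted */
#define MCI_STATUS_S   (1ULL << 56) /* signaled as a machine check exception */
#define MCI_STATUS_AR  (1ULL << 55) /* software recovery action required */

/* Hypothetical class names for this sketch. */
enum mc_class { MC_INVALID, MC_CORRECTED, MC_UCNA, MC_SRAO, MC_SRAR, MC_FATAL };

static enum mc_class classify(uint64_t status)
{
    if (!(status & MCI_STATUS_VAL))
        return MC_INVALID;      /* nothing logged in this bank */
    if (!(status & MCI_STATUS_UC))
        return MC_CORRECTED;    /* hardware already fixed it */
    if (status & MCI_STATUS_PCC)
        return MC_FATAL;        /* uncorrected and not recoverable */
    if (!(status & MCI_STATUS_S))
        return MC_UCNA;         /* S=0: treated as no action required (AR is not examined) */
    return (status & MCI_STATUS_AR) ? MC_SRAR : MC_SRAO;
}

/* True for the classes delivered as a Machine Check Exception;
 * corrected errors and UCNA arrive through the CMCI path instead. */
static bool signaled_as_mce(enum mc_class c)
{
    return c == MC_SRAO || c == MC_SRAR || c == MC_FATAL;
}
```

With these definitions, `classify(MCI_STATUS_VAL | MCI_STATUS_UC | MCI_STATUS_S)` yields `MC_SRAO`, and `signaled_as_mce()` returns true for it but false for `MC_UCNA`, which is the distinction drawn above.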
Now flip with me to page 15-34 and look at which SRAO errors are architecturally defined, there in section 126.96.36.199:
The following two SRAO errors are architecturally defined.
- UCR Errors detected by memory controller scrubbing; and
- UCR Errors detected during L3 cache (L3) explicit writebacks.
So there we have it. Recoverable, Action Optional Machine Checks due to scrubbing. Can it be any clearer? In case you think this feature is old and was supplanted by something more recent, I urge you to flip back to 15-23 and read along here at the intro to Section 15.6:
Recovery of uncorrected recoverable machine check errors is an enhancement in machine-check architecture. The first processor that supports this feature is 45nm Intel 64 processor with CPUID signature DisplayFamily_DisplayModel encoding of 06H_2EH. This allow system software to perform recovery action on certain class of uncorrected errors and
If I'm not mistaken, that's the processor family this article was referring to. (This document is dated June 2009, so it's not like it's ancient.)
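For anyone who wants to check a given part against that 06H_2EH signature, here is a small sketch of how the DisplayFamily and DisplayModel values are derived from the EAX value returned by CPUID leaf 1, as I understand the rules. The struct and function names are hypothetical, and the sample value 0x000206E6 is just an illustrative input whose fields decode to family 06H, model 2EH.

```c
#include <stdint.h>

/* Hypothetical container for the decoded signature. */
struct cpu_sig {
    unsigned family; /* DisplayFamily */
    unsigned model;  /* DisplayModel */
};

static struct cpu_sig decode_signature(uint32_t eax)
{
    unsigned model      = (eax >> 4)  & 0xF;  /* bits 7:4 */
    unsigned family     = (eax >> 8)  & 0xF;  /* bits 11:8 */
    unsigned ext_model  = (eax >> 16) & 0xF;  /* bits 19:16 */
    unsigned ext_family = (eax >> 20) & 0xFF; /* bits 27:20 */
    struct cpu_sig sig;

    /* The extended family is only added when the base family is 0FH. */
    sig.family = (family == 0xF) ? family + ext_family : family;

    /* The extended model only counts for base families 06H and 0FH. */
    sig.model = (family == 0x6 || family == 0xF)
                    ? (ext_model << 4) + model
                    : model;
    return sig;
}
```

Feeding it 0x000206E6 gives family 0x06 and model 0x2E, i.e. the 06H_2EH encoding quoted above.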
Do you have different documentation that suggests otherwise?