Posted Aug 31, 2009 6:36 UTC (Mon) by jzbiciak
In reply to: HWPOISON
Parent article: HWPOISON
There are a few things at play here:
- The MCA can occur on any "word", where "word" is defined by the width of the ECC applied at the corresponding level of the memory hierarchy. It could be a 64-bit word on a 64-bit + 8-bit DRAM bus, or it could be on the order of a 64-byte cache line. (I think Athlon's on-chip ECC works on whole cache lines, but I admit to not knowing for sure. I know a particular DSP core's L2 cache ECC works in terms of 256-bit data phases on the chips that support that feature.)
- The CPU need not have referenced the particular word that triggered the fault. A CPU read, or better yet, a data prefetch (either triggered explicitly by an instruction or implicitly by a prefetch engine) may have issued the memory reference that triggered the MCA. If the faulting word was brought in by a prefetch, or sits late in a cache line that was read due to a demand fetch, that data may arrive at the CPU long after the instruction that triggered the line fill.
- Whether or not the CPU referenced the particular word that triggered the fault, the existing MCA may consider such faults catastrophic at the task level, and so does not bother to precisely track which instruction(s) may have consumed the bogus data. (See Chapter 15 in this reference where it says: "The implementation of the machine-check architecture does not ordinarily permit the processor to be restarted reliably after generating a machine-check exception.") All that's necessary is to keep track of which task(s) to kill, which is mainly a function of keeping track of the physical address that had a fault.
- In some systems, the MC exception could be asserted by the chipset, not the CPU. The chipset may actually detect the fault and alert the CPU via an exception pin, but nothing really aligns that exception to the data's arrival. Note that this property would be system-dependent: not all systems would necessarily be this imprecise.
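
To make the bookkeeping described in the list above a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption made for illustration: the 4 KiB page size, the `ecc_word_span` helper, and the `PageOwners` reverse map are hypothetical names, not anything taken from the kernel or from a real MCA implementation. It only shows two ideas: a fault is reported at the granularity of the ECC word rather than the byte the CPU asked for, and picking the task(s) to kill only needs the faulting physical address, not the instruction that consumed the data.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def ecc_word_span(phys_addr, ecc_word_bytes):
    """Return (start, end) of the ECC word containing phys_addr; end is exclusive.

    ecc_word_bytes is the ECC granularity at that memory level, e.g. 8 for a
    64-bit + 8-bit DRAM bus or 64 for a whole cache line.
    """
    start = phys_addr - (phys_addr % ecc_word_bytes)
    return start, start + ecc_word_bytes


class PageOwners:
    """Toy reverse map from page frame number to the task IDs that map it."""

    def __init__(self):
        self._owners = defaultdict(set)

    def map_page(self, pid, phys_addr):
        # Record that task `pid` has the page containing phys_addr mapped.
        self._owners[phys_addr // PAGE_SIZE].add(pid)

    def tasks_to_kill(self, fault_addr):
        # Only the faulting physical address is needed; no instruction
        # pointer or precise exception state is consulted.
        return sorted(self._owners.get(fault_addr // PAGE_SIZE, set()))
```

With a fault at physical address `0x12345678`, an 8-byte ECC word covers only `0x12345678` to `0x1234567f`, while a 64-byte ECC word covers `0x12345640` to `0x1234567f`, so the coarser the ECC granularity, the less the reported address says about which data was actually consumed. Either way, every task that has that page mapped shows up in `tasks_to_kill`.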