Posted Mar 18, 2012 16:53 UTC (Sun) by khim
In reply to: Hardware?
Parent article: Toward better NUMA scheduling
Why isn't page migration done automatically by the hardware, by basically applying a cache protocol to RAM contents as well?
Because it's pointless.
I guess the issue is that the CPU caches include hardware to check all cache lines in a set in parallel for whether they contain a certain address, while DDR3 doesn't, so the CPU would need to do lookup operations manually via extra reads/writes, which might be so expensive that just using the interconnect is faster.
DDR3 is not the problem. DMA is. Caches are only “transparent” to userspace programs. The kernel needs to perform a complex dance to support a system with both caches and DMA. And this will be true for your “automatic page migration”, too! This makes the whole exercise totally insane: instead of adding migration logic to the kernel (where it can be done transparently to the rest of the system, because the hardware already includes the required logic: it's called page tables), we add all that complexity to the hardware and then introduce a complex dance in the kernel anyway to support DMA. What will it give us? Additional complexity and restrictions? Do not want.
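To illustrate why page tables are enough for this: below is a toy sketch, not kernel code, of migration as "copy the page to another node, then repoint the page-table entry". The `ToyMachine` class, its two NUMA nodes, and all method names are made up for the example. It leaves out everything that makes the real thing hard (TLB flushes, locking, and pages that a device is doing DMA into), but it shows that the virtual address a program uses never changes.

```python
# Toy model (hypothetical, not kernel code): migrating a page between
# NUMA nodes by copying it and updating the page-table entry.
from dataclasses import dataclass, field

PAGE_SIZE = 4096


@dataclass
class ToyMachine:
    # Physical memory per NUMA node: node id -> {frame number -> page contents}
    nodes: dict = field(default_factory=lambda: {0: {}, 1: {}})
    # Page table: virtual page number -> (node id, frame number)
    page_table: dict = field(default_factory=dict)

    def map_page(self, vpn, node, frame, data=b""):
        """Back virtual page `vpn` with a frame on `node`, zero-padded to PAGE_SIZE."""
        self.nodes[node][frame] = bytearray(data.ljust(PAGE_SIZE, b"\0"))
        self.page_table[vpn] = (node, frame)

    def read(self, vaddr, length):
        """Read through the page table (assumes the read stays inside one page)."""
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        node, frame = self.page_table[vpn]
        return bytes(self.nodes[node][frame][offset:offset + length])

    def migrate(self, vpn, dst_node, dst_frame):
        """Move the page's contents to another node and repoint the mapping."""
        src_node, src_frame = self.page_table[vpn]
        # "Copy" the contents to the destination and free the source frame.
        self.nodes[dst_node][dst_frame] = self.nodes[src_node].pop(src_frame)
        # Only the translation changes; the virtual address stays the same.
        self.page_table[vpn] = (dst_node, dst_frame)


machine = ToyMachine()
machine.map_page(vpn=5, node=0, frame=10, data=b"hello")
before = machine.read(5 * PAGE_SIZE, 5)
machine.migrate(vpn=5, dst_node=1, dst_frame=3)
after = machine.read(5 * PAGE_SIZE, 5)
```

The program reading virtual page 5 sees the same bytes before and after. A device programmed with the old physical frame would not, which is exactly the DMA problem.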