Prefetching considered harmful
The kernel's linked list macros include a set of operators for iterating through a list. At the top of a list-processing loop, the macros will issue a prefetch operation for the next entry in the list. The hope is that, by the time one entry has been processed, the CPU will have fetched the following entry into its cache, avoiding a stall at the beginning of the next trip through the loop. It seems like the sort of micro-optimization which can only help, and nobody has looked closely at these prefetch operations for a long time - until now. Andi Kleen has just posted a patch removing most of those prefetches.
Andi's contention is that, on contemporary processors, the prefetch operations are actually making things worse. These processors already prefetch everything they can get their hands on, so the explicit prefetch is unlikely to help. Even if that prefetch does start a memory cycle earlier than it would have otherwise happened, list processing loops tend to be so short that the amount of additional parallelism gained is quite small. Meanwhile the prefetch operations bloat the kernel image, increase register use, and cause the compiler to generate worse code. So, he says, we are better off without them.
With the prefetch operations removed, Andi's kernel image ends up being
10KB smaller. It also shows no performance regressions over mainline
kernels. Unless somebody else gets different results, that seems like
enough to justify putting this patch into the mainline.
Index entries for this article | |
---|---|
Kernel | Prefetch |
Posted Sep 9, 2010 8:53 UTC (Thu)
by sandmann (subscriber, #473)
[Link] (1 responses)
http://lists.cairographics.org/archives/pixman/2010-June/...
although in that case, the memory access is very linear, so the software prefetch basically competes with the hardware prefetcher.
For more random access, such as accessing a linked list, prefetching seems more likely to be beneficial.
Posted Sep 9, 2010 13:48 UTC (Thu)
by ejr (subscriber, #51652)
[Link]
Explicit prefetch seems useful in a narrow set of conditions: You have many light-weight tasks with tiny memory footprint that require >10s of cycles and don't pollute the cache. If the tasks are too short, the prefetches will stomp on each other. If they're too long, you often end up needing other data in the meantime. Graph analysis algorithms benefit, but not much else seems to benefit.
(What may be more useful architecturally is the ability to stop a HW prefetch engine and retarget it. Consider repeatedly processing the same image/audio frame in memory. The prefetch engine is happily continuing to fetch past the end... It might be nice to say "hey, no, restart prefetching from the beginning" when you're on the last scanline, etc. I don't know if any HW supports it, or if that would even have a place in an OS kernel.)
Posted Sep 9, 2010 12:04 UTC (Thu)
by Trou.fr (subscriber, #26289)
[Link] (1 responses)
Posted Sep 9, 2010 22:41 UTC (Thu)
by smoogen (subscriber, #97)
[Link]
Posted Sep 9, 2010 17:06 UTC (Thu)
by pr1268 (guest, #24648)
[Link] (1 responses)
<Wishlist request> You don't suppose that Andi could make a kernel config option whether the prefetch is included so that I could do my own comparison? Thanks! </Wishlist request>
Posted Sep 9, 2010 17:35 UTC (Thu)
by dtlin (subscriber, #36537)
[Link]
Posted Sep 10, 2010 12:22 UTC (Fri)
by nwatkins (guest, #61119)
[Link] (2 responses)
Is there an underlying assumption for the linked list case that the items in the list are near each other in memory, or is there additional help from the compiler that marks memory for the next node? I'm not coming up with a mechanism which would allow the processor to understand the structure of the list.
Posted Sep 10, 2010 16:20 UTC (Fri)
by jzbiciak (guest, #5246)
[Link]
Posted Sep 10, 2010 22:51 UTC (Fri)
by daglwn (guest, #65432)
[Link]
Soon the data items will not even need to be that close to each other. Google "region prefetching" for some articles. We have to use those billions of transistors for something
Posted Sep 10, 2010 21:04 UTC (Fri)
by rilder (guest, #59804)
[Link] (1 responses)
Posted Sep 18, 2010 6:45 UTC (Sat)
by efexis (guest, #26355)
[Link]
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful
Prefetching considered harmful