Faster page faulting through prezeroing

Posted Jan 13, 2005 7:39 UTC (Thu) by huaz (guest, #10168)
Parent article: Faster page faulting through prezeroing

[quote]If those pages are already cleared, there is no need to load an entire page into the processor cache when it is faulted in.[/quote]

I don't get it. Then what happens when kscrubd wakes up and clears the pages? Yup, it brings the memory into cache, and those cache lines might get evicted before anyone needs them.

I am not convinced this is a useful feature. It looks more like something that only works for one particular (possibly artificially designed) benchmark.



Faster page faulting through prezeroing

Posted Jan 13, 2005 11:26 UTC (Thu) by etienne (guest, #25256)

<quote>Then what happens when kscrubd wakes up and clears the pages? Yup, it brings the memory into cache and might get evicted before someone needs it</quote>

Maybe clearing the page need not be the processor's job. For instance, kscrubd could do an IDE DMA read of a pre-zeroed area on the disk. Then the processor cache is not touched (could be marked dirty but...). That pre-zeroed area could be some reserved blocks at the end of the swap partition, or a contiguous file.

Another clean solution is available on non-ia32 processors: a write-and-invalidate instruction. When the first byte of a cache line is written (to zero), the complete cache line is not first read from memory.
IMHO, even when the repeat count (in register %ecx) is bigger than a cache line, the assembly instruction "rep stosl" still does not produce a write-and-invalidate transaction to external memory on ia32.

Etienne.

Faster page faulting through prezeroing

Posted Jan 13, 2005 22:49 UTC (Thu) by zhjy (guest, #27228)

I didn't read the code. My guess is that kscrubd can zero a lot of pages at once, which saves some unnecessary cache eviction.

Faster page faulting through prezeroing

Posted Jan 14, 2005 14:40 UTC (Fri) by zhjy (guest, #27228)

Another small point: on a context switch between processes, the cache lines may be refilled with new ones anyway, so kscrubd will not add much cache pollution. Page fault handling, however, is a synchronous operation, and afterwards you are still in the same context. In that case, cache pollution is bad.
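
To make the argument above concrete, here is a toy model, not a measurement. It assumes a fully associative LRU cache, 64-byte lines, and 4 KiB pages; the cache size, the working-set size, and every name in it are hypothetical. It counts how many working-set lines a process misses after a fault, depending on whether the page was zeroed inside the fault path or earlier in another context.

```python
from collections import OrderedDict

LINE_SIZE = 64                          # assumed cache line size in bytes
PAGE_SIZE = 4096                        # assumed page size in bytes
LINES_PER_PAGE = PAGE_SIZE // LINE_SIZE


class LRUCache:
    """Toy fully associative LRU cache; it tracks line tags only."""

    def __init__(self, capacity_lines):
        self.capacity = capacity_lines
        self.lines = OrderedDict()
        self.misses = 0

    def touch(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)        # hit: mark most recently used
            return
        self.misses += 1                        # miss: bring the line in
        self.lines[line] = True
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)      # evict least recently used


def zero_page(cache, page):
    # Zeroing through the CPU touches every line of the page.
    for i in range(LINES_PER_PAGE):
        cache.touch(("page", page, i))


def run_working_set(cache, n_lines):
    for i in range(n_lines):
        cache.touch(("ws", i))


def misses_after_fault(zero_in_fault_path, cache_lines=128, ws_lines=96):
    """Working-set misses seen by the process once the fault returns."""
    cache = LRUCache(cache_lines)
    if not zero_in_fault_path:
        zero_page(cache, page=0)        # background zeroing, before we run
    run_working_set(cache, ws_lines)    # the process warms its cache
    if zero_in_fault_path:
        zero_page(cache, page=0)        # synchronous zeroing during the fault
    cache.misses = 0
    run_working_set(cache, ws_lines)    # the process resumes its work
    return cache.misses
```

In this model `misses_after_fault(False)` reports no misses, while `misses_after_fault(True)` reports a nonzero count because the zeroing pushed working-set lines out of the cache. Real caches are set associative and the faulting process may well want the new page in cache, so the model only shows the direction of the effect.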

Cache trashing

Posted Jan 14, 2005 5:06 UTC (Fri) by goaty (guest, #17783)

I think the idea is not so much to prevent cache trashing, which is after all inevitable, but to make it happen less often. If kscrubd pre-zeroes a big sack of pages, then that's more efficient than zeroing them one at a time. And of course if the hardware can be persuaded to zero chunks of RAM without touching the processor cache, then you've got a huge win.
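
The batching idea can be sketched as a small page pool. This is only an illustration under stated assumptions, not the kernel's code: `PagePool`, `scrub`, `alloc_zeroed`, and `free` are hypothetical names, and pages are modelled as plain `bytearray` objects. A background pass zeroes many pages in one go; the allocation path takes a ready page when one exists and zeroes on demand only as a fallback.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page size in bytes


class PagePool:
    """Toy pool that keeps dirty pages and pre-zeroed pages on separate lists."""

    def __init__(self, n_pages):
        # All pages start out dirty (filled with non-zero junk).
        self.dirty = deque(bytearray(b"\xff" * PAGE_SIZE) for _ in range(n_pages))
        self.zeroed = deque()
        self.fault_time_zeroings = 0    # pages zeroed on the allocation path

    def scrub(self, batch):
        """Zero up to `batch` dirty pages in one pass; return how many were done."""
        n = min(batch, len(self.dirty))
        for _ in range(n):
            page = self.dirty.popleft()
            page[:] = bytes(PAGE_SIZE)          # overwrite contents with zeroes
            self.zeroed.append(page)
        return n

    def alloc_zeroed(self):
        """Hand out a zeroed page, preferring one that was scrubbed earlier."""
        if self.zeroed:
            return self.zeroed.popleft()
        if not self.dirty:
            raise MemoryError("pool exhausted")
        page = self.dirty.popleft()             # fallback: zero it right now
        page[:] = bytes(PAGE_SIZE)
        self.fault_time_zeroings += 1
        return page

    def free(self, page):
        """Return a page to the pool; its contents are assumed dirty."""
        self.dirty.append(page)
```

For example, after `pool = PagePool(4)` and `pool.scrub(3)`, the first three calls to `pool.alloc_zeroed()` do no zeroing at allocation time, and only the fourth falls back to clearing a page on the spot.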

In a couple of years it might even be possible to buy a PCI Express "/dev/null" card to accelerate your server.

Cache trashing

Posted Feb 12, 2008 0:31 UTC (Tue) by goaty (guest, #17783)

2+ years ago, I wrote: In a couple of years it might even be possible to buy a PCI Express "/dev/null" card to accelerate your server.

Unfortunately, this did not happen. As someone pointed out in another thread, it's possible to persuade various DMA-capable hardware to act as a /dev/null device. For example, you can stick a page full of zeroes on the swap device and then get the IDE controller to DMA it to wherever it's needed. Provided the drive's cache is larger than the page size, the performance should be acceptable.

The problem is that most of the devices on the system are already busy with their intended function, like reading and writing files, and cannot expend time in the frivolous pursuit of nullage.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds