
Speeding up the page allocator

Posted Feb 27, 2009 10:26 UTC (Fri) by etienne_lorrain@yahoo.fr (guest, #38022)
In reply to: Speeding up the page allocator by bluefoxicy
Parent article: Speeding up the page allocator

Please note that I was talking more about DMA zeroing, which is nearly "free" in CPU time. In some tests I did on PPC, it was more than 10 times faster than CPU zeroing (to be precise, excluding dcbz, which cannot be used on uncached memory).
The big advantage is that it should also remove those cache lines from the memory caches (levels 1, 2 and 3 if present) at the time of free(). It should therefore still be better for a "free, allocate, don't use, free, allocate, don't use" pattern, because the allocated but unused memory isn't even fetched into the cache and isn't made dirty in the other processors' caches.
But it is probably more complex (it needs a multiprocessor DMA semaphore), and for this kind of thing only testing can tell the truth, and that truth is only valid for the tested environment.
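
Since only testing can settle this, here is a minimal user-space sketch of the CPU side of such a test. It is not the PPC test mentioned above and it does not exercise DMA at all: it only times memset()-based zeroing of pages, so the result could be compared against whatever a DMA engine achieves on the same machine. The names cpu_zero_ns_per_page, all_zero and SKETCH_PAGE_SIZE are hypothetical, and the 4 KiB page size is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() */
#endif
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define SKETCH_PAGE_SIZE 4096u    /* assumption: 4 KiB pages */

/* Return 1 if all bytes in [buf, buf + len) are zero, else 0. */
int all_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Dirty npages pages, zero them one page at a time with the CPU,
 * and return the average number of nanoseconds spent per page.
 * Returns a negative value on bad input or allocation failure.
 */
double cpu_zero_ns_per_page(size_t npages)
{
    if (npages == 0)
        return -1.0;

    size_t len = npages * SKETCH_PAGE_SIZE;
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1.0;

    /* Touch every page first so it is mapped and holds non-zero data. */
    memset(buf, 0xAA, len);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (size_t i = 0; i < npages; i++)
        memset(buf + i * SKETCH_PAGE_SIZE, 0, SKETCH_PAGE_SIZE);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    /* Read the buffer back so the zeroing stores are actually used. */
    int ok = all_zero(buf, len);
    free(buf);
    if (!ok)
        return -1.0;

    double ns = (double)(t1.tv_sec - t0.tv_sec) * 1e9
              + (double)(t1.tv_nsec - t0.tv_nsec);
    return ns / (double)npages;
}
```

Calling cpu_zero_ns_per_page(1024) from a small main() and printing the result gives a rough per-page figure. The buffer is freshly written before timing, so this measures the cache-warm case; an optimizing compiler may also shorten the loop, and, as said above, the number is only valid for the machine it was measured on.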



Speeding up the page allocator

Posted Feb 27, 2009 23:14 UTC (Fri) by nix (subscriber, #2304)

It's nearly free, but is it worth the complexity? How many pages are
zeroed, and then not used soon enough that they're still in cache?

IIRC the zero page was removed from the kernel because zeroing pages was
faster than doing pagetable tricks to share a single zero page. Pagetable
manipulation is particularly expensive, but even so...

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds