Huge pages part 1 (Introduction)

Posted Feb 19, 2010 10:57 UTC (Fri) by nix (subscriber, #2304)
Parent article: Huge pages part 1 (Introduction)

I find myself looking at huge pages and thinking that they are a feature that will be useful only for special-purpose single-use machines (basically just the two Mel mentions: simulation and databases, and perhaps here and there virtualization where the machine has *lots* of memory) until the damn things are swappable. Obviously we can't swap them as a unit (please wait while we write a gigabyte out), so we need to break the things up and swap bits of them, then perhaps reaggregate them into a 1GB page once they're swapped back in. Yes, it's likely to be a complete sod to implement, but it would also give that nice TLB-hit speedup feeling without having to worry whether you're about to throw the rest of your system into thrash hell as soon as the load on it increases. (Obviously once you *do* start swapping, speed goes to hell anyway, so the overhead of taking TLB misses is lost in the noise. In the long run, defragmenting memory on swapin and trying to make bigger pages out of it without app intervention seems like a good idea, especially on platforms like PPC with a range of page sizes more useful than 4KB/1GB.)

IIRC something similar to this was discussed in the past, maybe as part of the defragmentation work? ... I also dimly remember Linus being violently against it but I can't remember why.



Huge pages part 1 (Introduction)

Posted Feb 19, 2010 12:03 UTC (Fri) by farnz (guest, #17727) [Link]

What you really want (but it's difficult to do sanely on x86) is transparent huge pages. Where possible, the kernel gives you contiguous physical pages for contiguous virtual pages, and it transparently converts suitable sets of contiguous virtual pages to the next size of mapping up when it can do so, and splits large mappings into the next size down when they're not in use, or when there's memory pressure.
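As a rough illustration of the promotion test described above, here is a minimal Python sketch. It is not how any kernel implements this: the page table is modelled as a plain dict from virtual page number to physical frame number, and the names `can_promote`, `page_table`, and `pages_per_large` are hypothetical. The group size of 16 is only an illustrative ratio.

```python
def can_promote(page_table, base_vpn, pages_per_large):
    """Return True if the small pages starting at base_vpn could be
    replaced by a single mapping of the next size up.

    page_table is a toy model (an assumption for this sketch): a dict
    mapping virtual page number -> physical frame number.
    """
    # The virtual range must start on a large-page boundary.
    if base_vpn % pages_per_large != 0:
        return False

    # The first page must be mapped, and its frame must be aligned too.
    first_pfn = page_table.get(base_vpn)
    if first_pfn is None or first_pfn % pages_per_large != 0:
        return False

    # Every following virtual page must map to the next physical frame.
    return all(
        page_table.get(base_vpn + i) == first_pfn + i
        for i in range(pages_per_large)
    )


# 16 virtually contiguous pages backed by 16 physically contiguous,
# aligned frames: eligible for promotion.
table = {vpn: 32 + vpn for vpn in range(16)}
print(can_promote(table, 0, 16))

# Remap one page to an unrelated frame: no longer eligible.
table[5] = 999
print(can_promote(table, 0, 16))
```

Splitting is the reverse step: one large mapping is replaced by `pages_per_large` small entries pointing at consecutive frames, which always succeeds because the frames are already contiguous.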

The pain on x86 is twofold: first, instead of getting to aggregate (e.g.) 16 4K pages into a 16K page, then 16 16K pages into a 256K page, you get to do things like aggregate 1024 4K pages into a 4M page, and 256 4M pages into a 1GB page. Second, typical x86 TLBs are split by page type; so it's not uncommon to have something like the Core 2 Duo, where you have 128 entries for 4K pages, and just 4 entries for 4M pages (Instruction TLB).

Given that split, most workloads gain more from having the kernel always in the TLB, than from evicting the kernel in favour of your own code (which would have been in the 4K page size TLB otherwise).
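To put numbers on the above, the short Python sketch below works through the aggregation ratios and the amount of memory each part of the split TLB can cover. The entry counts (128 and 4) are the ones quoted above for the instruction TLB; the helper names `pages_per_larger` and `tlb_reach` are made up for this example.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def pages_per_larger(small_size, large_size):
    """How many small pages must be aggregated to form one larger page."""
    assert large_size % small_size == 0
    return large_size // small_size


def tlb_reach(entries, page_size):
    """Bytes of address space covered when every TLB entry is in use."""
    return entries * page_size


print(pages_per_larger(4 * KIB, 4 * MIB))   # 1024
print(pages_per_larger(4 * MIB, 1 * GIB))   # 256

print(tlb_reach(128, 4 * KIB) // KIB)       # 512 (KiB covered by 4K entries)
print(tlb_reach(4, 4 * MIB) // MIB)         # 16 (MiB covered by 4M entries)
```

So the four large-page entries cover far more memory than the 128 small-page entries, but there are only four of them to share between the kernel and any userspace huge pages.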

Huge pages part 1 (Introduction)

Posted Feb 19, 2010 14:15 UTC (Fri) by paulj (subscriber, #341) [Link]

Doesn't Linux already use huge pages for the kernel? Or am I misremembering
something?

Huge pages part 1 (Introduction)

Posted Feb 19, 2010 14:19 UTC (Fri) by farnz (guest, #17727) [Link]

It does. The work being done is for huge pages for userspace, which is a whole different ballgame, and could result in the kernel's hugepage mapping being pushed out of the TLB.

If/when someone does the work, it'll need benchmarking not just on the latest and greatest, but also on real-world older systems with more restrictive TLBs, to see if it's a net loss.

Huge pages part 1 (Introduction)

Posted Feb 20, 2010 16:33 UTC (Sat) by nix (subscriber, #2304) [Link]

This stuff could presumably autotune, kicking in only on CPUs for which it
is a net win.

Huge pages part 1 (Introduction)

Posted Feb 20, 2010 16:31 UTC (Sat) by nix (subscriber, #2304) [Link]

Oh, hell, I forgot about the split TLB. That makes the whole resource
allocation problem drastically harder :/ Still, does the kernel need more
than one or two entries? If you're not using hugepages currently, it seems
to me that some of those hugepage TLB entries are actually wasted...

Alan Cox on FreeBSD

Posted Feb 23, 2010 5:22 UTC (Tue) by man_ls (subscriber, #15091) [Link]

You are probably thinking of this excellent LWN article from 2007 and the excellent Navarro paper (with a contribution by Alan Cox). The fact that they decided to bring transparent huge pages support to FreeBSD and not to Linux is funny, considering.

Alan Cox on FreeBSD

Posted Feb 23, 2010 6:10 UTC (Tue) by viro (subscriber, #7872) [Link]

ac != alc... IOW, it's not who you probably think it is.

Alan Cox on FreeBSD

Posted Feb 23, 2010 21:53 UTC (Tue) by nix (subscriber, #2304) [Link]

I wish I had been thinking of that paper, but I had no idea it existed.
Unfortunately all the links to it appear dead :( 10.1.1.14.2514 is not a
valid DOI as far as I can tell, and the source link throws an error page
at me...

Alan Cox on FreeBSD

Posted Feb 24, 2010 19:37 UTC (Wed) by biged (subscriber, #50106) [Link]

Alan Cox on FreeBSD

Posted Feb 24, 2010 22:47 UTC (Wed) by nix (subscriber, #2304) [Link]

Yay, thank you!

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds