LWN: Comments on "Huge pages part 1 (Introduction)"
http://lwn.net/Articles/374424/
This is a special feed containing comments posted to the individual LWN article titled "Huge pages part 1 (Introduction)".

Huge pages part 1 (Introduction) - http://lwn.net/Articles/530694/rss - 2012-12-26T08:00:58+00:00 - heguanjun
Well, now there is another huge page implementation: transparent huge pages, and with many benefits.

Huge pages part 1 (Introduction) - http://lwn.net/Articles/395104/rss - 2010-07-07T04:48:06+00:00 - glennewton
I've collected a number of good resources on huge pages for Linux, Java, Solaris, MySQL and AMD here: "Java, MySql increased performance with Huge Pages" (http://zzzoot.blogspot.com/2009/02/java-mysql-increased-performance-with.html).

Alan Cox on FreeBSD - http://lwn.net/Articles/376035/rss - 2010-02-24T22:47:20+00:00 - nix
Yay, thank you!

Alan Cox on FreeBSD - http://lwn.net/Articles/375999/rss - 2010-02-24T19:37:41+00:00 - biged
Try here: http://www.usenix.org/events/osdi02/tech/full_papers/navarro/navarro_html/

Alan Cox on FreeBSD - http://lwn.net/Articles/375882/rss - 2010-02-23T21:53:59+00:00 - nix
I wish I had been thinking of that paper, but I had no idea it existed. Unfortunately, all the links to it appear dead :( 10.1.1.14.2514 is not a valid DOI as far as I can tell, and the source link throws an error page at me...

Alan Cox on FreeBSD - http://lwn.net/Articles/375753/rss - 2010-02-23T06:10:23+00:00 - viro
ac != alc... IOW, it's not who you probably think it is.

Alan Cox on FreeBSD - http://lwn.net/Articles/375751/rss - 2010-02-23T05:22:57+00:00 - man_ls
You are probably thinking of this excellent LWN article from 2007 (http://lwn.net/Articles/250335/) and the excellent Navarro paper (http://en.scientificcommons.org/42636713), with a contribution by Alan Cox. The fact that they decided to bring transparent huge page support to FreeBSD and not to Linux is funny, considering.

Huge pages part 1 (Introduction) - http://lwn.net/Articles/375327/rss - 2010-02-20T16:33:52+00:00 - nix
This stuff could presumably autotune, kicking in only on CPUs for which it is a net win.

Huge pages part 1 (Introduction) - http://lwn.net/Articles/375325/rss - 2010-02-20T16:31:19+00:00 - nix
Oh, hell, I forgot about the split TLB. That makes the whole resource allocation problem drastically harder :/ Still, does the kernel need more than one or two entries? If you're not using huge pages currently, it seems to me that some of those huge-page TLB entries are actually wasted...

Huge pages part 1 (Introduction) - http://lwn.net/Articles/375236/rss - 2010-02-19T14:19:03+00:00 - farnz
It does. The work being done is for huge pages for userspace, which is a whole different ballgame, and could result in the kernel's huge-page mapping being pushed out of the TLB.

If/when someone does the work, it'll need benchmarking not just on the latest and greatest, but also on real-world older systems with more restrictive TLBs, to see if it's a net loss.
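As a concrete illustration of the "huge pages for userspace" mentioned above, here is a minimal sketch of explicitly mapping a huge page with mmap() and MAP_HUGETLB. It assumes a kernel with hugetlbfs support and huge pages reserved in advance (for example via /proc/sys/vm/nr_hugepages); the 2 MB size is just the common x86 default, not anything prescribed by the article.

    /* Minimal sketch: explicitly map one 2 MB huge page.
     * Assumes huge pages have been reserved beforehand, e.g.:
     *   echo 20 > /proc/sys/vm/nr_hugepages
     * Build: cc hugemap.c -o hugemap
     */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define HUGE_SIZE (2UL * 1024 * 1024)   /* typical x86 huge page size */

    int main(void)
    {
        void *p = mmap(NULL, HUGE_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED) {
            /* Fails (ENOMEM) if no huge pages are reserved. */
            perror("mmap(MAP_HUGETLB)");
            return 1;
        }
        memset(p, 0, HUGE_SIZE);            /* touch the mapping */
        printf("huge page mapped at %p\n", p);
        munmap(p, HUGE_SIZE);
        return 0;
    }

On kernels too old for MAP_HUGETLB, the equivalent is mounting hugetlbfs and mmap()ing a file on it.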
Huge pages part 1 (Introduction) - http://lwn.net/Articles/375234/rss - 2010-02-19T14:15:43+00:00 - paulj
Doesn't Linux already use huge pages for the kernel? Or am I misremembering something?

Huge pages part 1 (Introduction) - http://lwn.net/Articles/375222/rss - 2010-02-19T12:03:34+00:00 - farnz
What you really want (but it's difficult to do sanely on x86) is transparent huge pages. Where possible, the kernel gives you contiguous physical pages for contiguous virtual pages, transparently converts suitable sets of contiguous virtual pages to the next mapping size up when it can, and splits large mappings into the next size down when they're not in use or when there's memory pressure.

The pain on x86 is twofold. First, instead of getting to aggregate (e.g.) four 4K pages into a 16K page, then 16 16K pages into a 256K page, you get to do things like aggregate 1024 4K pages into a 4M page, and 256 4M pages into a 1G page. Second, typical x86 TLBs are split by page size, so it's not uncommon to have something like the Core 2 Duo, where the instruction TLB has 128 entries for 4K pages and just 4 entries for 4M pages.

Given that split, most workloads gain more from having the kernel always in the TLB than from evicting the kernel in favour of your own code (which would have been in the 4K-page TLB otherwise).
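For readers who want to see what the transparent huge pages farnz describes look like from a program's point of view, here is a rough sketch using the madvise() hint. It only does anything on kernels built with transparent huge page support, which (as the first comment in this feed notes) arrived after this article; the 64 MB size and the lack of real error handling are illustrative only.

    /* Sketch: hint that an anonymous region should be backed by
     * transparent huge pages. On kernels without THP support,
     * madvise(MADV_HUGEPAGE) simply fails with EINVAL.
     * Build: cc thp_hint.c -o thp_hint
     */
    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    #define SZ (64UL * 1024 * 1024)   /* 64 MB working set */

    int main(void)
    {
        void *p = mmap(NULL, SZ, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        /* Ask the kernel to back this range with huge pages when it can
         * assemble enough contiguous, suitably aligned physical memory. */
        if (madvise(p, SZ, MADV_HUGEPAGE) != 0)
            perror("madvise(MADV_HUGEPAGE)");
        /* ... use the memory; the kernel may collapse aligned 2 MB
         * chunks of it into huge pages in the background ... */
        munmap(p, SZ);
        return 0;
    }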
Huge pages part 1 (Introduction) - http://lwn.net/Articles/375220/rss - 2010-02-19T10:57:46+00:00 - nix
I find myself looking at huge pages and thinking that they are a feature that will be useful only for special-purpose, single-use machines (basically just the two Mel mentions, simulation and databases, and perhaps here and there virtualization where the machine has *lots* of memory) until the damn things are swappable. Obviously we can't swap them as a unit (please wait while we write a gigabyte out), so we need to break them up and swap bits of them, then perhaps reaggregate them into a 1GB page once they're swapped back in. Yes, it's likely to be a complete sod to implement, but it would also give that nice TLB-hit speedup feeling without having to worry whether you're about to throw the rest of your system into thrash hell as soon as the load on it increases. (Obviously once you *do* start swapping, speed goes to hell anyway, so the overhead of taking TLB misses is lost in the noise. In the long run, defragmenting memory on swap-in and trying to make bigger pages out of it without app intervention seems like a good idea, especially on platforms like PPC with a range of page sizes more useful than 4KB/1GB.)

IIRC something similar to this was discussed in the past, maybe as part of the defragmentation work? I also dimly remember Linus being violently against it, but I can't remember why.

About NUMA - http://lwn.net/Articles/375100/rss - 2010-02-18T18:17:55+00:00 - cma
Mel, thanks A LOT! It's all very clear now! Best regards!

About NUMA - http://lwn.net/Articles/375036/rss - 2010-02-18T15:00:54+00:00 - mel
> I'm curious... If my app is NUMA aware (let's say it's an old-fashioned threaded app), in this
> case would it be interesting to enable in the BIOS (concretely, for a Dell R610 server) the
> memory option NODE INTERLEAVING? Or just let the BIOS enable NUMA behavior (NODE
> INTERLEAVING disabled)? Thanks and congrats for this great article!

s/NUMA aware/not NUMA aware/

It depends on whether your application fits in one node or not. If it fits in one node, then leave NODE_INTERLEAVING off and use taskset to bind the application to one node's worth of CPUs. The memory will be allocated locally and performance will be decent.

If the application needs the whole machine and one thread faults all of the memory before spawning other threads (a common anti-pattern for NUMA), then NODE_INTERLEAVING will give good average performance. Without interleaving, your performance will sometimes be great and other times really poor, depending on whether the thread is running on the node that faulted the data or not. You don't need to go into the BIOS to test it out; launch the application with

    numactl --interleave=all your-application

About NUMA - corrigenda - http://lwn.net/Articles/375031/rss - 2010-02-18T14:36:34+00:00 - cma
Sorry about the typo: I meant, if my app WAS NOT NUMA aware...

About NUMA - http://lwn.net/Articles/375029/rss - 2010-02-18T14:35:10+00:00 - cma
I'm curious... If my app is NUMA aware (let's say it's an old-fashioned threaded app), in this case would it be interesting to enable in the BIOS (concretely, for a Dell R610 server) the memory option NODE INTERLEAVING? Or just let the BIOS enable NUMA behavior (NODE INTERLEAVING disabled)? Thanks and congrats for this great article! Regards

Well-written article - http://lwn.net/Articles/374972/rss - 2010-02-18T11:12:11+00:00 - sdalley
Well done, Mel Gorman, for writing about a deeply technical topic in such a comprehensible way! I actually understood 80% of it on the first pass.

It's a real art to be able to define unfamiliar concepts as you go without over-simplifying on the one hand or getting lost in nested interrupts on the other.

The only things I had to look up were MPI (http://en.wikipedia.org/wiki/Message_Passing_Interface) and OpenMP (http://en.wikipedia.org/wiki/OpenMP), but that was easily done.
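To illustrate the two placement strategies mel describes above (bind everything to one node, or interleave allocations across nodes) from inside a program rather than with taskset/numactl, here is a rough libnuma sketch; node 0 and the 256 MB allocation are arbitrary choices for illustration, not values from the article.

    /* Rough sketch of the two NUMA placement strategies mel describes,
     * using libnuma. Build: cc numa_sketch.c -o numa_sketch -lnuma
     */
    #include <stdio.h>
    #include <numa.h>

    #define SZ (256UL * 1024 * 1024)

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this system\n");
            return 1;
        }

        /* Strategy 1: the application fits in one node - run on node 0
         * and allocate from node 0, the in-program equivalent of binding
         * with taskset or numactl --cpunodebind/--membind. */
        numa_run_on_node(0);
        void *local = numa_alloc_onnode(SZ, 0);

        /* Strategy 2: one thread faults memory later used by threads on
         * all nodes - spread the pages round-robin across nodes, like
         * numactl --interleave=all. */
        void *spread = numa_alloc_interleaved(SZ);

        if (!local || !spread) {
            fprintf(stderr, "allocation failed\n");
            return 1;
        }
        printf("local=%p interleaved=%p\n", local, spread);

        numa_free(local, SZ);
        numa_free(spread, SZ);
        return 0;
    }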