
Adding a huge zero page

Posted Sep 27, 2012 9:33 UTC (Thu) by alankila (guest, #47141)
Parent article: Adding a huge zero page

Transparent hugepages are nice in theory, but probably not as reliable as mounting hugetlbfs and using that instead. Whether hugepages can actually be used seems to depend, to a degree, on how much unused memory you have at the time of the allocation. If you are unlucky, the allocation doesn't use hugepages, and it won't get converted into hugepages later on, either.
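
A quick way to check whether this is happening, assuming a kernel with THP compiled in, is to compare the system-wide counter against what you expect and to look at the current THP mode:

grep AnonHugePages /proc/meminfo
cat /sys/kernel/mm/transparent_hugepage/enabled   # active mode is shown in brackets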

I observed this issue on my virtual-machine test server, which runs 6 virtual machines of various sizes using a total of 8 of the 16 GB of system memory. After bootup, almost all of the kvm memory was hugepage'd, but the next day only some 10% of it still was. The problem was that the machines were shut down during the night for backup and then brought back up. My guess is that the backup process filled memory with page-cache pages, some of which were dirty, and this defeated the hugepage optimization.
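
If that guess is right, something like the following (untested on this workload, so take it as a sketch) might help before bringing the VMs back up, by returning the page cache to the free pool so that large contiguous regions are available again:

sync                               # flush dirty pages to disk first
echo 3 > /proc/sys/vm/drop_caches  # then drop clean page cache and slab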



Adding a huge zero page

Posted Sep 27, 2012 13:51 UTC (Thu) by ejr (subscriber, #51652) [Link] (7 responses)

Ah, but transparent huge pages are portable. They don't change your code. In HPC land, your task pretty much owns the node(s) on which you're running, so there is little danger of the fragmentation you encountered. THP is often a big performance win with *no* code change. Fixing the zero-page issue will fix the remaining decently-sized gotcha for HPC-style uses.

And some of these HPC codes are old and/or expect to run on more than Linux. Conditionally changing all the user-specified and compiler-generated memory allocations is a painful task.

Adding a huge zero page

Posted Sep 28, 2012 9:11 UTC (Fri) by alankila (guest, #47141) [Link] (6 responses)

Right. Well, I'm just saying that there are cases where it doesn't work, so it's a bit akin to a voodoo feature: you enable it and then convince yourself that there is a speed benefit. It is only when you read the AnonHugePages line in /proc/meminfo and see, for instance, that only 64 MB is actually in hugepages that you realize it isn't all it's cracked up to be. But hey, it's better than nothing, right?

I was wondering if there shouldn't be a memory-defragmenting task that periodically goes through running processes' heaps and moves the 4k pages around until coalescing them into a hugepage becomes possible. I mean, if using these pages really gives around a 5% performance benefit, it would seem reasonable to spend up to a few % of CPU on it for tasks that seem long-lived enough.
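
(As it turns out, the kernel ships roughly such a task already: khugepaged scans anonymous memory in the background and tries to collapse runs of 4k pages into hugepages. How hard it tries is tunable through sysfs; the values below are only illustrative, not recommendations:)

echo 1000 > /sys/kernel/mm/transparent_hugepage/khugepaged/scan_sleep_millisecs  # wake khugepaged more often
echo 4096 > /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan         # pages examined per wakeup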

Adding a huge zero page

Posted Sep 28, 2012 11:13 UTC (Fri) by nix (subscriber, #2304) [Link] (3 responses)

Working quite well here:

AnonHugePages: 788480 kB
AnonHugePages: 2553856 kB

The latter machine is running a single virtual machine, but the former is running no VMs of any kind and has still turned a gigabyte into transpages (probably largely inside monsters like Chromium). That's not insignificant. (For that matter, I routinely see compile jobs getting hugepaged up, and a TLB saving in a pointer-mad monster like GCC really does speed it up. Sure, it's only a few percent, but that's better than nothing, right?)
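
(To see where the transpages actually live, the per-process accounting in smaps can be summed up; 12345 is a placeholder pid:)

grep AnonHugePages /proc/12345/smaps | awk '{sum += $2} END {print sum, "kB"}'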

Adding a huge zero page

Posted Sep 28, 2012 23:19 UTC (Fri) by alankila (guest, #47141) [Link] (2 responses)

Sure. I'm not saying it never works; I just wish it worked for my use case. Anyway, explicit hugepages are not too huge a pain for now: you just have to calculate/measure how many you need and then hack some apparmor rules for kvm to allow writing to the hugepages mount region.
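
For reference, the setup I mean looks roughly like this; the numbers are examples for 8 GB of 2MB pages, and the apparmor tweaking is distribution-specific so I've left it out:

echo 4096 > /proc/sys/vm/nr_hugepages        # reserve 4096 x 2MB = 8GB
mkdir -p /dev/hugepages
mount -t hugetlbfs hugetlbfs /dev/hugepages
grep HugePages_ /proc/meminfo                # check the reservation succeeded
# then start the guests with something like: qemu-kvm -mem-path /dev/hugepages ...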

That being said, out of 1.5 GB of other services on the server:

AnonHugePages: 71680 kB

*sigh*

Adding a huge zero page

Posted Sep 28, 2012 23:37 UTC (Fri) by khc (guest, #45209) [Link] (1 responses)

Are we doing some kind of poll? :-)

MemTotal: 16327088 kB
AnonHugePages: 3102720 kB

Of course, this box has a fairly specialized daemon that allocates 8GB of memory as 2 separate pools, so it's not surprising that auto huge pages work well (although I've never measured the performance impact of that).

Adding a huge zero page

Posted Sep 29, 2012 10:28 UTC (Sat) by nix (subscriber, #2304) [Link]

Yeah, exactly. If you run things with big heaps composed of lots of little pieces (so malloc uses arena allocation and allocates >>2MB), you'll probably do well with transparent hugepages. If instead you have lots of little programs with small heaps, you won't see any benefit; and if you have programs that make lots of medium-big allocations between 512KB and 2MB, you'll probably see glibc's malloc falling back to mmap() of regions a bit too small to be converted into a transparent hugepage.
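
(If you suspect that last case, one experiment, not a guaranteed fix, is to raise glibc's mmap threshold so those medium-big allocations stay on the main heap where khugepaged can reach them; the 4MB value and program name here are just examples:)

MALLOC_MMAP_THRESHOLD_=4194304 ./some-program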

Adding a huge zero page

Posted Sep 28, 2012 14:09 UTC (Fri) by ejr (subscriber, #51652) [Link]

Look up Mel Gorman's memory compaction work. IIRC, there was a somewhat recent (months ago?) bit here on the painful interactions between memory compaction and VFAT on removable devices.
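
(On kernels with CONFIG_COMPACTION, compaction can also be triggered by hand, which is a handy way to test whether fragmentation is what's blocking hugepage allocations:)

echo 1 > /proc/sys/vm/compact_memory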

Adding a huge zero page

Posted Oct 5, 2012 16:10 UTC (Fri) by cpasqualini (guest, #69417) [Link]

Looking at this article and its comments, I started to read about THP and found an interesting file in the Docs: http://lwn.net/Articles/423592/

Did you test any of these?

echo always >/sys/kernel/mm/transparent_hugepage/defrag
echo madvise >/sys/kernel/mm/transparent_hugepage/defrag
echo never >/sys/kernel/mm/transparent_hugepage/defrag
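
Reading any of these files shows all the possible values with the active one in brackets, so it's easy to check what a given kernel defaults to:

cat /sys/kernel/mm/transparent_hugepage/defrag
# e.g.: [always] madvise never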

