|From:||Ulrich Drepper <drepper-AT-redhat.com>|
|Subject:||Re: [patch 00/14] remap_file_pages protection support|
|Date:||Sat, 06 May 2006 09:05:29 -0700|
|Cc:||Nick Piggin <nickpiggin-AT-yahoo.com.au>, Andrew Morton <akpm-AT-osdl.org>, linux-kernel-AT-vger.kernel.org, Linux Memory Management <linux-mm-AT-kvack.org>, Val Henson <val.henson-AT-intel.com>|
Blaisorblade wrote:
> I've not seen the numbers indeed, I've been told of a problem with a "customer
> program" and Ingo connected my work with this problem. Frankly, I've been
> always astonished about how looking up a 10-level tree can be slow. Poor
> cache locality is the only thing that I could think about.

It might be good if I explain a bit how much we use mmap in libc. The
numbers can really add up quickly.

- for each loaded DSO we might have up to 5 VMAs today:

  1. text segment, protection r-xp, normally never written to
  2. a gap, protection ---p (i.e., no access)
  3. a relro data segment, protection r--p
  4. a data segment, protection rw-p
  5. a bss segment, anonymous memory

  The first four are mapped from the file. In fact, the first segment
  "allocates" the entire address space of all segments, even if it's
  longer than the file. Then the gap is done using mprotect(PROT_NONE).
  Then the area for segments 3 and 4 is mapped in one mmap() call. It's
  in the same file but the offset used in the mmap is not the same as
  the offset which naturally is already established through the first
  mmap. I.e., if the first mmap() would start at offset 0 and continue
  for 1000 pages, the gap might start at, say, an offset of 4 pages and
  continue for 500 pages. Then the "natural" offset of the first data
  page would be 504 pages but the second mmap() call would in fact use
  the offset 4 because the text and data segments are contiguous in the
  _file_ (although not in memory).

  Anyway, once relocations are done the protection of the relro segment
  is changed, splitting the data segment in two.

  So, for DSO loading there would be two steps of improvement:

  1. if a mprotect() call wouldn't split the VMA we would have 3 VMAs in
     the end instead of 5. 40% gain.
  2. if I could use remap_file_pages() for the data segment mapping and
     the call would allow changing the protection and it would not split
     the VMAs, then we'd be down to 2 mappings. 60% down.

A second big VMA user is thread stacks.
I think the application which was mentioned in this thread briefly used
literally thousands of threads. Leaving aside the insanity of this (it's
unfortunately how many programmers work) this can create problems
because we get at least two (on IA-64 three) VMAs per thread. I.e.,
thread stack allocation works like this:

1. allocate area big enough for stack and guard (we don't use automatic
   growing, this cannot work)
2. change the protection of the guard end of the stack to PROT_NONE.

So, for say 1000 threads we'll end up with 2000 VMAs.

Threads are also important to mention here because

- often they are short-lived and we have to recreate them often. We
  usually reuse stacks but only keep so much allocated memory around.
  So more often than we like we actually free and later re-allocate
  stacks.
- these thousands of stack VMAs are really used concurrently. All
  threads are woken over a period of time.

A third source of VMAs is anonymous memory allocation. mmap is used in
the malloc implementation and directly in various places. For
randomization reasons there isn't really much we can do here, we
shouldn't lump all these allocations together.

A fourth source of VMAs is the programs themselves, which mmap files.
Often read-only mappings of many small files.

Put all this together and non-trivial apps as written today (I don't say
they are high-quality apps) can easily have a few thousand, maybe even
10,000 to 20,000 VMAs. Firefox on my machine uses at the moment ~560
VMAs and this is with only a handful of threads. Are these the numbers
the VM system is optimized for? I think what our people running the
experiments at the customer site saw is that it's not. The VMA
traversal showed up on the profile lists.

-- 
➧ Ulrich Drepper ➧ Red Hat, Inc. ➧ 444 Castro St ➧ Mountain View, CA ❖
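
The two-step thread stack allocation described in the message (one
mapping for stack plus guard, then mprotect() on the guard end) can be
reproduced with a small sketch. This is not glibc's code: it is an
illustration that assumes Linux, a readable /proc/self/maps, and libc's
mmap/mprotect/munmap reached through Python's ctypes. The helper names
vmas_overlapping and stack_vma_counts are hypothetical.

```python
import ctypes
import mmap
import os

PROT_NONE = 0            # defined locally; not every mmap module exports it
PAGE = mmap.PAGESIZE

# Assumption: the C library is reachable via CDLL(None), as on Linux.
libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]


def vmas_overlapping(start, length):
    """Count the /proc/self/maps entries that overlap [start, start+length)."""
    end = start + length
    count = 0
    with open("/proc/self/maps") as maps:
        for line in maps:
            lo, hi = (int(x, 16) for x in line.split()[0].split("-"))
            if lo < end and hi > start:
                count += 1
    return count


def stack_vma_counts(stack_pages=16):
    """Map a stack-like area, protect its guard page, return VMA counts."""
    size = (stack_pages + 1) * PAGE          # stack plus one guard page
    # Step 1: one anonymous mapping big enough for stack and guard.
    addr = libc.mmap(None, size, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == ctypes.c_void_p(-1).value:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    try:
        before = vmas_overlapping(addr, size)
        # Step 2: the guard end (lowest page here) becomes inaccessible.
        if libc.mprotect(addr, PAGE, PROT_NONE) != 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        after = vmas_overlapping(addr, size)
    finally:
        libc.munmap(addr, size)
    return before, after
```

Calling stack_vma_counts() on a typical Linux system should show the
area covered by one VMA before the mprotect() call and by two
afterwards, which is where the "1000 threads, 2000 VMAs" figure comes
from. The sketch counts only the entries overlapping the new area, so
unrelated mappings made by the interpreter do not affect the result.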
Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds