Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?
From: Andi Kleen <andi-AT-firstfloor.org>
To: Ingo Molnar <mingo-AT-elte.hu>
Subject: Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?
Date: Wed, 13 Aug 2008 22:42:48 +0200
Message-ID: <87fxp8zlx3.fsf@basil.nowhere.org>
Cc: Ulrich Drepper <drepper-AT-redhat.com>, Arjan van de Ven <arjan-AT-infradead.org>, akpm-AT-linux-foundation.org, hugh-AT-veritas.com, linux-mm-AT-kvack.org, linux-kernel-AT-vger.kernel.org, briangrant-AT-google.com, cgd-AT-google.com, mbligh-AT-google.com, Linus Torvalds <torvalds-AT-linux-foundation.org>, Thomas Gleixner <tglx-AT-linutronix.de>, "H. Peter Anvin" <hpa-AT-zytor.com>

Ingo Molnar <mingo@elte.hu> writes:
>
> i find it pretty unacceptable these days that we limit any aspect of
> pure 64-bit apps in any way to 4GB (or any other 32-bit-ish limit).

It's not limited to 2GB, there's a fallback to >4GB of course.
Ok admittedly the fallback is slow, but it's there.

I would prefer to not slow down the P4s. There are **lots** of them
in field. And they ran 64bit still quite well. Also back then I
benchmarked on early K8 and it also made a difference there (but I
admit I forgot the numbers).

I think it would be better to fix the VM because there are other
use cases of applications who prefer to allocate in a lower area.
For example Java JVMs now widely use a technique called pointer
compression where they dynamically adjust the pointer size based
on how much memory the process uses. For that you have to get
low memory in the 47bit VM too. The VM should deal with that gracefully.

To be honest I always thought the linear search in the VMA list
was a little dumb. I'm sure there are other cases where it hurts too.
Perhaps this would be really an opportunity to do something about it :)

-Andi
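
Editorial illustration of the pointer compression idea mentioned above. This is a minimal sketch, not code from any JVM: the names cref, compress, decompress and OBJ_ALIGN_SHIFT are hypothetical, and the 8-byte object alignment is an assumption. It shows why a runtime wants its heap at a low address. If the heap base is 0 and the heap is small enough, a 32-bit reference is just the address, with no base to add.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: every heap object is 8-byte aligned, so the low 3 bits
 * of an offset are always zero and need not be stored. */
#define OBJ_ALIGN_SHIFT 3

typedef uint32_t cref;   /* hypothetical 32-bit compressed reference */

/* Turn a full pointer into a 32-bit reference relative to heap_base. */
static cref compress(uintptr_t heap_base, const void *p)
{
    uintptr_t off = (uintptr_t)p - heap_base;

    assert((off & (((uintptr_t)1 << OBJ_ALIGN_SHIFT) - 1)) == 0); /* aligned */
    assert((off >> OBJ_ALIGN_SHIFT) <= UINT32_MAX);                /* in range */
    return (cref)(off >> OBJ_ALIGN_SHIFT);
}

/* Recover the full pointer from a compressed reference. */
static void *decompress(uintptr_t heap_base, cref r)
{
    return (void *)(heap_base + ((uintptr_t)r << OBJ_ALIGN_SHIFT));
}
```

With heap_base set to 0, decompress reduces to a shift, and with a heap that fits below 4 GB even the shift could be dropped. That is the case that depends on the kernel being able to hand out low addresses cheaply.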
Posted Aug 21, 2008 16:25 UTC (Thu)
by leonb (guest, #3054)
MAP_32BIT should not go because it has uses other than the stack.
For instance Lush (lush.sf.net) uses it for implementing the
dynamic linking capabilities needed by its compiler.
This is because gcc compiles x86_64 code with option
-mcmodel=small by default, and therefore the program
and its symbols must be linked into the lower 2 GB of
the address space. To make things more interesting,
gcc currently does not implement -mcmodel=large.
- L.
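
Editorial illustration of the use case described in this comment. This is a hedged sketch, not Lush's actual loader: map_low_2gb and fits_small_model are hypothetical helper names. It asks mmap() for anonymous memory with MAP_32BIT, the Linux x86-64 flag that places a mapping in the first 2 GB of the address space, and then checks that the result is reachable by small-code-model code. On builds where the flag does not exist, the sketch falls back to an ordinary mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE          /* glibc: expose MAP_32BIT and MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_32BIT
#define MAP_32BIT 0          /* flag unavailable here: plain mapping instead */
#endif

#define LOW_2GB ((uintptr_t)1 << 31)

/* Hypothetical helper: map len bytes of anonymous memory, requesting the
 * low 2 GB. Returns NULL on failure. A real loader would also need execute
 * permission on the region, e.g. via mprotect() once the code is in place. */
static void *map_low_2gb(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* True if [p, p + len) lies entirely below 2 GB, which is where
 * -mcmodel=small expects code and symbols to live. */
static int fits_small_model(const void *p, size_t len)
{
    uintptr_t start = (uintptr_t)p;
    return start < LOW_2GB && len <= LOW_2GB - start;
}
```

A caller would obtain a region with map_low_2gb(size), confirm it with fits_small_model(), copy or link the object code into it, and release it with munmap() when done.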