
Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?

From:  Andi Kleen <andi-AT-firstfloor.org>
To:  Ingo Molnar <mingo-AT-elte.hu>
Subject:  Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?
Date:  Wed, 13 Aug 2008 22:42:48 +0200
Message-ID:  <87fxp8zlx3.fsf@basil.nowhere.org>
Cc:  Ulrich Drepper <drepper-AT-redhat.com>, Arjan van de Ven <arjan-AT-infradead.org>, akpm-AT-linux-foundation.org, hugh-AT-veritas.com, linux-mm-AT-kvack.org, linux-kernel-AT-vger.kernel.org, briangrant-AT-google.com, cgd-AT-google.com, mbligh-AT-google.com, Linus Torvalds <torvalds-AT-linux-foundation.org>, Thomas Gleixner <tglx-AT-linutronix.de>, "H. Peter Anvin" <hpa-AT-zytor.com>

Ingo Molnar <mingo@elte.hu> writes:
>
> i find it pretty unacceptable these days that we limit any aspect of 
> pure 64-bit apps in any way to 4GB (or any other 32-bit-ish limit). 

It's not limited to 2GB; there's a fallback to >4GB, of course. OK,
admittedly the fallback is slow, but it's there.
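
A rough sketch of what such a fallback can look like from user space:
ask for a low mapping first and, if that fails, retry without the
restriction. The helper name map_low_or_anywhere() and its got_low
out-parameter are made up for illustration; this is not the actual
glibc or kernel code, only an assumption about the general shape.
MAP_32BIT only exists on x86-64 Linux, hence the #ifdef.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: try to map len bytes in the low part of the
 * address space, then fall back to an unrestricted mapping.
 * Sets *got_low to 1 if the low mapping succeeded, 0 otherwise.
 * Returns NULL if both attempts fail.
 */
void *map_low_or_anywhere(size_t len, int *got_low)
{
    void *p;

    assert(got_low != NULL);
    *got_low = 0;

#ifdef MAP_32BIT
    /* First attempt: restricted to the low area (x86-64 only). */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
    if (p != MAP_FAILED) {
        *got_low = 1;
        return p;
    }
#endif

    /* Fallback: let the kernel place the mapping anywhere. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The caller pays for a failed first attempt before the second one is
even tried, which is one way a fallback like this ends up slow once
the low area has filled up.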

I would prefer not to slow down the P4s. There are **lots** of them in
the field, and they still run 64-bit quite well. Also, back then I
benchmarked on early K8 and it made a difference there too (but I
admit I forgot the numbers).

I think it would be better to fix the VM, because there are other
use cases of applications that prefer to allocate in a lower area.
For example, Java JVMs now widely use a technique called pointer
compression, where they dynamically adjust the pointer size based
on how much memory the process uses. For that you have to get
low memory in the 47-bit VM too. The VM should deal with that gracefully.
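
For readers who have not seen pointer compression, a minimal sketch,
assuming 8-byte object alignment and a single contiguous heap. The
names cptr_t, compress_ptr() and decompress_ptr() are invented for
this example and do not come from any particular JVM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OBJ_SHIFT 3              /* assumption: objects are 8-byte aligned */

typedef uint32_t cptr_t;         /* compressed reference: scaled heap offset */

/* Store a pointer as a 32-bit offset from the heap base, in 8-byte units. */
cptr_t compress_ptr(const void *base, const void *p)
{
    uintptr_t off = (uintptr_t)p - (uintptr_t)base;

    assert((off & (((uintptr_t)1 << OBJ_SHIFT) - 1)) == 0); /* aligned */
    assert((off >> OBJ_SHIFT) <= UINT32_MAX);               /* fits in 32 bits */
    return (cptr_t)(off >> OBJ_SHIFT);
}

/* Rebuild the full pointer: base + (offset << shift). */
void *decompress_ptr(void *base, cptr_t c)
{
    return (char *)base + ((uintptr_t)c << OBJ_SHIFT);
}
```

With this layout a 32-bit reference covers 32GB of heap. If the heap
can be placed low enough in the address space, the base can be zero
and the add on every decompression goes away, which is one reason such
a runtime cares about getting low memory from the VM.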

To be honest, I always thought the linear search in the VMA list
was a little dumb. I'm sure there are other cases where it hurts
too. Perhaps this would really be an opportunity to do something about it :)
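
To make the cost concrete, here is a toy first-fit search over a
sorted list of mapped regions. It is only an illustration of a linear
walk, not the kernel's code; struct region and find_gap_linear() are
hypothetical names, and the steps counter is there just to show how
many entries get visited.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA: the half-open range [start, end). */
struct region {
    uint64_t start;
    uint64_t end;
};

/*
 * Walk regions (sorted by address, non-overlapping) and return the
 * lowest address in [lo, hi) where len bytes fit, or UINT64_MAX if
 * there is no such gap. *steps receives the number of regions visited.
 */
uint64_t find_gap_linear(const struct region *r, size_t n,
                         uint64_t lo, uint64_t hi, uint64_t len,
                         size_t *steps)
{
    uint64_t addr = lo;          /* current candidate start address */

    *steps = 0;
    for (size_t i = 0; i < n; i++) {
        (*steps)++;
        if (r[i].end <= addr)
            continue;            /* region lies below the candidate */
        if (r[i].start >= addr && r[i].start - addr >= len)
            break;               /* gap before this region is big enough */
        addr = r[i].end;         /* otherwise skip past this region */
    }

    if (addr <= hi && hi - addr >= len)
        return addr;
    return UINT64_MAX;
}
```

When the low area is densely packed, every search that starts from the
bottom has to step over all the existing regions before it finds room,
so creating n mappings this way costs on the order of n*n steps in
total.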

-Andi




Re: pthread_create() slow for many threads; also time to revisit 64b context switch optimization?

Posted Aug 21, 2008 16:25 UTC (Thu) by leonb (guest, #3054) [Link]

MAP_32BIT should not go because it has uses other than the stack.
For instance Lush (lush.sf.net) uses it for implementing the
dynamic linking capabilities needed by its compiler.

This is because gcc compiles for x86_64 with the option
-mcmodel=small by default, and therefore the program
and its symbols must be linked in the lower 2 GB of
the address space. To make things more interesting,
gcc currently does not implement -mcmodel=large.

- L.






Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds