Pettenò: Debunking x32 myths
Posted Jun 26, 2012 7:36 UTC (Tue) by butlerm (subscriber, #13312)
That is a bit misleading. x32 is an ABI for 64 bit processors. ARM 64-bit, when it becomes a reality, is likely to perform much like x32 on all applications that do not make heavy use of pointers, and substantially slower than x32 on those.
If x32 is useful, something similar for ARM 64-bit is likely to be useful for exactly the same reason. It might even predominate over a pure 64 bit ARM ABI. How many portable devices are likely to need to address more than 4 GB per process?
Posted Jun 26, 2012 7:58 UTC (Tue) by dlang (✭ supporter ✭, #313)
The only reason that it makes sense in the x86 world is the historic accidents that led to the creation of the AMD64 architecture.
As far as I know, x86 vs AMD64 is the only case where 32-bit vs 64-bit involves any change other than the size of the registers (and pointers). The fact that it doubles the number of registers on one of the most register-starved CPU platforms in existence makes a very significant difference. I don't expect that ARM64 is going to end up doing a similar thing: ARM isn't nearly as register starved to begin with, and there's far less of a push for perfect backwards compatibility between ARM processor generations. In fact, there's very little push for backwards compatibility at the binary level at all; what little there is exists at the source code level.
Posted Jun 26, 2012 9:52 UTC (Tue) by butlerm (subscriber, #13312)
The question would in both cases be decided by the relative performance of running 64-bit code with either 32-bit or 64-bit pointers. My understanding is that many Java applications slow down by as much as 20% simply by switching from a 32-bit JVM to a 64-bit JVM on the same machine.
Considering that the JVM implementation is gaining registers and register width, that result is rather remarkable. The only explanation appears to be that the cache impact of larger pointers in a Java environment is severe. Take a look at this for example:
Why 64-bit Java is slow
Naturally, the reasonable expectation should be that an x32 JVM would be considerably faster than both current IA-32 and current x86-64 JVMs. An ARM64 JVM with 32 bit pointers should be considerably faster than ARM64 JVM with 64 bit pointers for the same reason.
Posted Jun 26, 2012 9:57 UTC (Tue) by dlang (✭ supporter ✭, #313)
that assumes that an ARM64 system has some advantage other than address space when compared to ARM32
none of the other processors that have both 32-bit and 64-bit modes (SPARC, POWER, etc.) have any advantage other than address space for their 64-bit modes, which is why it is still so common to see such systems running a 64-bit kernel with a 32-bit userspace.
Posted Jun 26, 2012 13:37 UTC (Tue) by butlerm (subscriber, #13312)
That is beside the point - the issue is the relative performance of _ARM64_ code when compiled to use different pointer sizes. The advantage relative to ARM32 is irrelevant. ARM64 with 32 bit pointers must outperform ARM64 with 64 bit pointers on important workloads to be worth supporting at all.
There is plenty of evidence to suggest that will indeed be the case, and the difference between LP64 and L64P32 ("x32") on x86-64 will make that even more clear than the current major performance _loss_ one experiences when going from a pure 32 bit to pure 64 bit JVM.
Take a look at this:
64-bit Performance Throughput/Memory Improvements in WAS7.0
The only way for them to get Websphere on a 64 bit JVM to approach the performance of Websphere on a 32 bit JVM was to use compressed references, i.e. smaller pointers. An L64P32 model on a 64 bit processor is essentially the same idea, and will make essentially the same improvement relative to LP64 without requiring developers to rewrite all the pertinent C code to use compressed pointers by hand.
Posted Jun 26, 2012 13:46 UTC (Tue) by Flameeyes (subscriber, #51238)
You should probably make sure you understand how the ABI works before trying to discuss it in detail. It's ILP32, not L64P32 (i.e. "long" is also 32-bit).
As for the issue of compressed pointers in the JVM, it shows one thing very well: you can solve the performance issue by making your code smarter, instead of creating a new ABI that breaks compatibility with everything done up to now.
Posted Jun 26, 2012 19:11 UTC (Tue) by butlerm (subscriber, #13312)
Thanks for the correction. I am not sure why anyone would want 32 bit longs on a 64 bit processor, unless there is a large code base out there that is lazy about using longs where ints would suffice.
Posted Jun 26, 2012 19:50 UTC (Tue) by slashdot (guest, #22014)
Programs can still use 64-bit integers via long long, int64_t or intmax_t (and presumably int_fast64_t too).
Posted Jun 26, 2012 20:31 UTC (Tue) by butlerm (subscriber, #13312)
The struct timespec / timeval issues would go away, for example. The number of x32 specific syscalls required would go down. Problems with ioctl structure differences would be greatly reduced, as would problems porting LP64 code in general.
32 bit longs are the wave of the past. IA-32 is rapidly becoming obsolete. Why any special effort would be made to retain compatibility with ILP32 rather than with LP64 (as much as possible) is a mystery to me. The whole thing is going to run on an LP64 kernel with a parallel LP64 userspace in many cases, after all. 32 bit pointers can be a major improvement. 32 bit longs on a 64 bit architecture on the other hand just make life difficult without any substantive gains, so far as I can tell.
In any case ILP32 was a mistake. It should have been L64P32 to begin with.
Posted Jun 27, 2012 1:50 UTC (Wed) by slashdot (guest, #22014)
Again, almost all software supports 32-bit compilation, and since all existing relevant 32-bit ABIs (i.e. x86 and arm) are ILP32, it would be insane to use a different size for "long" than the rest of the 32-bit world.
New programs should never use the "long" keyword anyway and should instead use the typedefs in <stdint.h> which actually have a defined useful meaning.
Posted Jun 27, 2012 2:51 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
So would you like to have "int" to be 18 bits or 36 bits in length?
Posted Jun 27, 2012 4:30 UTC (Wed) by cmccabe (guest, #60281)
It would be nice for the stdint.h types to be built-in, and more widely used in some cases, but it's really not a big deal. There are always higher-level languages you can use if you don't want to deal with this stuff. Some of them even have unlimited length integers! The 70s are over, you know.
Posted Jun 27, 2012 13:49 UTC (Wed) by nix (subscriber, #2304)
char is "the smallest addressable unit" (in practice always 1 byte)
Posted Jun 27, 2012 8:38 UTC (Wed) by dgm (subscriber, #49227)
There was a reason for that. The intention was that they mapped cleanly to the word sizes of the underlying architecture.
> despite the fact that there is no way to write a portable, correct and fast programs in such a language.
Absurd. C was invented for exactly that. The original UNIX code was written in PDP-7 assembly (an 18-bit machine), and later rewritten in C and ported to the PDP-11 (a 16-bit one). The first C version of UNIX was portable, correct _and_ fast. And all that in a newborn language that would be considered "crude" compared to today's C.
Posted Jun 26, 2012 20:04 UTC (Tue) by Flameeyes (subscriber, #51238)
It's mostly not to break the assumption that sizeof(long) == sizeof(void*), which holds for _most_ Unix software. Although it's getting less common nowadays due to portability issues with Windows (Win64 being LLP64).
As mansr already pointed out, Spec2k makes wide use of long where int (or properly sized stdint types) should be used, which is one of the reasons its benchmark results look better than they theoretically should.
I can discuss this more, and I'll probably do so, in a blog post with more detailed discussion on the cache issue.
Posted Jun 26, 2012 20:36 UTC (Tue) by butlerm (subscriber, #13312)
It sounds like Spec2k should be rewritten to use int32_t instead of int, and int64_t instead of long, in any places where this might make a significant difference.
Posted Jun 26, 2012 11:07 UTC (Tue) by k3ninho (subscriber, #50375)
This is what Thumb already does for 16-bit code in a 32-bit processor (well, it actually cuts down the instructions as well as the data to 16-bit symbols, and initially restricted the functionality to a subset of the ARMv4 IA). I have no idea what Thumb-like behaviour would be included in a 64-bit ARM IA.
Posted Jun 26, 2012 11:47 UTC (Tue) by mansr (guest, #85328)
Wrong. The original Thumb instruction set has 16-bit instructions with reduced functionality, most notably many instructions can access only the low 8 registers. The register size is still 32 bits. Thumb2 extends the instruction set with additional 32-bit instructions providing full access to the entire architecture.
The 64-bit ARM has no equivalent of Thumb mode, although 32-bit ARMv7 userspace is still supported.
See http://www.arm.com/files/downloads/ARMv8_Architecture.pdf for more information.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds