The x32 system call ABI
Classic 32-bit x86 has easily understood problems: it can only address 4GB of memory, and its tiny register set slows things considerably. Running a current processor in 64-bit mode fixes both of those problems nicely, but at a cost: expanding variables and pointers to 64 bits leads to expanded memory use and a larger cache footprint. It's also not uncommon (still) to find programs that simply do not work properly on a 64-bit system. Most programs do not actually need 64-bit variables or the ability to address massive amounts of memory; for that code, the larger data types are a cost without an associated benefit. It would be really nice if those programs could take advantage of the 64-bit architecture's additional registers and instructions without simultaneously paying the price of increased memory use.
That best-of-both-worlds situation is exactly what the x32 ABI is trying to provide. A program compiled to this ABI will run in native 64-bit mode, but with 32-bit pointers and data values. The full register set will be available, as will other advantages of the 64-bit architecture like the faster SYSCALL64 instruction. If all goes according to plan, this ABI should be the fastest mode available on 64-bit machines for a wide range of programs; it is easy to see x32 widely displacing the 32-bit compatibility mode.
One should note that the "if" above is still somewhat unproven: actual benchmarks showing the differences between x32 and the existing pure modes are hard to come by.
One outstanding question - and the spark for the current discussion - has to do with the system call ABI. For the most part, this ABI looks similar to what is used by the legacy 32-bit mode: the 32-bit-compatible versions of the system calls and associated data structures are used. But there is one difference: the x32 developers want to use the SYSCALL64 instruction just like native 64-bit applications do, for the performance benefits. That complicates things a bit: to know what data size to expect, the kernel needs to be able to distinguish system calls made by true 64-bit applications from those made by x32 applications, even though the processor is running in the same mode in both cases. As an added challenge, this distinction needs to be made without slowing down native 64-bit applications.
The solution involves using an expanded version of the 64-bit system call table. Many system calls can be called directly with no compatibility issues at all - a call to fork() needs little in the way of data-structure translation. Others do need the compatibility layer, though. Each of those system calls (92 of them) is assigned a new number starting at 512, leaving a gap above the native system calls for additions over time. Bit 30 in the system call number is also set whenever an x32 binary calls into the kernel; that enables kernel code that cares to implement "compatibility mode" behavior.
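To make the numbering concrete, here is a minimal sketch of what an x32-marked system call looks like from user space. It assumes a kernel built with x32 support; `__X32_SYSCALL_BIT` is the real kernel constant for bit 30 described above.

```c
/* Minimal sketch: marking a system call as coming from x32 code.
   Assumes a kernel with x32 support enabled; __X32_SYSCALL_BIT is
   bit 30 of the system call number. */
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#ifndef __X32_SYSCALL_BIT
#define __X32_SYSCALL_BIT 0x40000000
#endif

int main(void)
{
    /* getpid() needs no compatibility handling, so its x32 number is
       simply the native 64-bit number with bit 30 set. */
    long pid = syscall(__NR_getpid | __X32_SYSCALL_BIT);
    printf("pid: %ld\n", pid);
    return 0;
}
```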
Linus didn't seem to mind the mechanism used to distinguish x32 system calls in general, but he hated the use of compatibility mode for the x32 ABI.
There are legitimate reasons why some of the system calls cannot be shared between the x32 and 64-bit modes. Situations where user space passes structures containing pointers to the kernel (ioctl() and readv() being simple examples) will require special handling since those pointers will be 32-bit. Signal handling will always be special. Many of the other system calls done specially for x32, though, are there to minimize the differences between x32 and the legacy 32-bit mode. And those calls are the ones that Linus objects to most strongly.
It comes down, for the most part, to the format of integer values passed to the kernel in structures. The legacy 32-bit mode, naturally, uses 32-bit values in most cases; the x32 mode follows that lead. Linus is saying, though, that the 64-bit versions of the structures - with 64-bit integer values - should be used instead. At a minimum, doing things that way would minimize the differences between the x32 and native 64-bit modes. But there is also a correctness issue involved.
One place where the 32- and 64-bit modes differ is in their representation of time values; in the 32-bit world, types like time_t, struct timespec, and struct timeval are based on 32-bit quantities. And 32-bit time values will overflow in the year 2038. If the year-2000 issue showed anything, it's that long-term drop-dead dates arrive sooner than one tends to think. So it's not surprising that Linus is unwilling to add a new ABI that would suffer from the 2038 issue.
The width of time_t cannot change for legacy 32-bit binaries. But x32 is an entirely new ABI with no legacy users at all; it does not have to retain any sort of past compatibility at this point. Now is the only time that this kind of issue can be fixed. So it is probably entirely safe to say that an x32 ABI will not make it into the mainline as long as it has problems like the year-2038 bug.
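The failure mode is easy to demonstrate; this sketch forces time values through a 32-bit integer, the way a 32-bit time_t would:

```c
/* Demonstration of the year-2038 overflow with a 32-bit time_t. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    int32_t t32 = INT32_MAX;          /* 2038-01-19 03:14:07 UTC */
    time_t t = t32;
    printf("last representable second: %s", ctime(&t));

    t = (time_t)INT32_MIN;            /* where a wrapping 32-bit counter lands */
    printf("one tick later:            %s", ctime(&t));   /* back in 1901 */
    return 0;
}
```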
At this point, the x32 developers need to review their proposed system call ABI and find a way to rework it into something closer to Linus's taste; that process is already underway. Then developers can get into the serious business of building systems under that ABI and running benchmarks to see whether it is all worth the effort. Convincing distributors (other than Gentoo, of course) to support this ABI will take a fairly convincing story, but, if this mode lives up to its potential, that story might just be there.
| Index entries for this article | |
|---|---|
| Kernel | User-space API |
| Kernel | x32 |
Posted Sep 1, 2011 1:21 UTC (Thu)
by cma (guest, #49905)
[Link] (4 responses)
A doubt here: will x32 provide for a single process mapping/seeing more than 2GB of RAM? Thanks!
Posted Sep 1, 2011 3:38 UTC (Thu)
by foom (subscriber, #14868)
[Link]
Posted Sep 1, 2011 5:18 UTC (Thu)
by Tuna-Fish (guest, #61751)
[Link] (2 responses)
If you need more than 4GB, you should compile your program for native 64 bit.
Posted Sep 13, 2011 17:43 UTC (Tue)
by cma (guest, #49905)
[Link] (1 responses)
So this could be a problem for apps needing more than 4GB, like MySQL with larger buffers or a memory-based DB. Regards
Posted Sep 13, 2011 17:51 UTC (Tue)
by dlang (guest, #313)
[Link]
a large database is a perfect example of a situation where you would want the full 64 bits available.
given these other memory uses in a system, it's very likely that a machine with 6-8G of ram that's dedicated for database use could still be very happy with x32
however, if you are splitting the database up using sharding (where you have multiple database instances, which could live on separate machines, including virtual machines), it's very possible that each one will only need 4G or less of address space even with far more ram.
also, if you have a database like postgres that uses multiple processes (instead of multiple threads), you should recognize that each process can have 4G of address space, so unless you have a huge amount of shared memory allocated, 4G per process may be a very comfortable limit.
Posted Sep 1, 2011 3:57 UTC (Thu)
by njs (subscriber, #40338)
[Link] (8 responses)
Curiously, that first link does have some benchmarks, and in none of them is x32 actually the best choice -- on one of them ia32 wins, and on one of them x86-64 wins. I guess this must reflect some lack of optimization in the toolchain or something, since I can't see how adding more registers could ever legitimately make a CPU-bound 32-bit program *slower*...?
Posted Sep 1, 2011 4:24 UTC (Thu)
by jzbiciak (guest, #5246)
[Link]
Further down the page is this note:

> The current x32 implementation isn't optimized:
> Atom LEA optimization is disabled.
> Memory addressing should be optimized.

So, that presumably accounts for why 181.mcf slowed down 0.5% to 1% relative to normal 32-bit x86.
Posted Sep 1, 2011 8:08 UTC (Thu)
by slashdot (guest, #22014)
[Link] (5 responses)
Why not just have x32 programs use the x86-64 system calls and otherwise behave as normal x86-64 programs from the kernel's perspective?
The only difference would then be that they would only use 4GB of address space (mmap with MAP_32BIT), and store pointers in 32-bit-sized locations in memory.
In fact, you could probably use #pragma and/or __attribute__ to specify pointer size, and use a 64-bit libc, while most other libraries and the executable are 32-bit.
Posted Sep 1, 2011 11:56 UTC (Thu)
by and (guest, #2883)
[Link] (4 responses)
Posted Sep 1, 2011 12:53 UTC (Thu)
by cesarb (subscriber, #6266)
[Link] (3 responses)
When putting a pointer into these structures, it can simply be zero-extended.
Only memory allocation system calls would need a new flag (to allocate below 4G). Other than these, the kernel does not have to change at all. The rest could be done in userspace.
(The only other change needed in the kernel would be to add a flag in the executable file format to make ASLR use only the lower 32 bits.)
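A minimal sketch of how such a scheme might look in user space: MAP_32BIT is an existing Linux mmap() flag (on x86-64 it actually restricts mappings to the low 2GB), while the compress()/expand() helpers are hypothetical, not a real API.

```c
/* Hand-rolled 32-bit pointers on x86-64: allocate low, store pointers
   in 32-bit fields, and zero-extend them on the way back out.
   compress()/expand() are hypothetical helpers, not a real API. */
#define _GNU_SOURCE             /* for MAP_32BIT and MAP_ANONYMOUS */
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

typedef uint32_t ptr32;

static inline ptr32 compress(void *p) { return (ptr32)(uintptr_t)p; }
static inline void *expand(ptr32 p)   { return (void *)(uintptr_t)p; } /* zero-extends */

static void *alloc_low(size_t len)
{
    /* MAP_32BIT asks for an address that fits in 32 bits; on
       Linux/x86-64 it actually restricts the mapping to the low 2GB.
       Returns MAP_FAILED on error. */
    return mmap(NULL, len, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_32BIT, -1, 0);
}
```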
Posted Sep 1, 2011 22:17 UTC (Thu)
by hummassa (subscriber, #307)
[Link] (2 responses)
Posted Sep 5, 2011 22:05 UTC (Mon)
by butlerm (subscriber, #13312)
[Link] (1 responses)
(1) Change the source level API for all pertinent ioctl structures that contain pointers so that programs have to manually zero extend a 32 bit pointer into some sort of opaque 64 bit value.

(2) Use a compiler extension that does this transparently, i.e. that supports a special pointer type where the high order bits are always zero.
I suspect (1) would break source compatibility in far too many places, although it seems like it is what should have been done way back when these interfaces were first designed.
(2) seems ideal, but requires cooperation from every supporting compiler. I don't know exactly why, but the x32 ABI devs are trying to avoid that if at all possible.
Posted Sep 6, 2011 1:05 UTC (Tue)
by cesarb (subscriber, #6266)
[Link]
(3) Keep the source level API 32-bit, but have glibc do the zero-extension into the true 64-bit API before calling the kernel.
The main problem with that is, of course, ioctl (the same compat ioctl problem the kernel already has). So, how about this:
(4) Same as (3) but add a new x32_ioctl 64-bit syscall which calls into the compat ioctl engine the kernel already has.
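As a concrete sketch of option (3), a widening wrapper for readv() might look like this; struct iovec32 and readv32 are hypothetical names, not part of any real C library:

```c
/* Sketch of option (3): the C library widens a 32-bit iovec array to
   the kernel's 64-bit layout before making the real system call.
   struct iovec32 and readv32 are hypothetical names. */
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <unistd.h>

struct iovec32 {
    uint32_t iov_base;   /* 32-bit user-space pointer */
    uint32_t iov_len;
};

ssize_t readv32(int fd, const struct iovec32 *vec, int cnt)
{
    struct iovec *wide = calloc(cnt, sizeof *wide);
    ssize_t ret;
    int i;

    if (wide == NULL)
        return -1;
    for (i = 0; i < cnt; i++) {
        /* zero-extend the 32-bit pointer into the 64-bit structure */
        wide[i].iov_base = (void *)(uintptr_t)vec[i].iov_base;
        wide[i].iov_len  = vec[i].iov_len;
    }
    ret = syscall(SYS_readv, fd, wide, cnt);
    free(wide);
    return ret;
}
```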
Posted Sep 1, 2011 22:28 UTC (Thu)
by daglwn (guest, #65432)
[Link]
> I can't see how adding more registers could ever legitimately make a CPU-bound 32-bit program *slower*...?

Several things could conspire to make this happen, besides the lack of optimization already noted.
- Function calls are more expensive due to additional callee-save registers.
- System calls are more expensive due to larger context save and restore.
- Things like setjmp/longjmp are slower for the same reason.
- Longer instruction encoding causes icache pressure.
Then there are all sorts of microarchitectural changes resulting from the ISA additions that can reduce clock-for-clock performance, like longer pipelines to compensate for more complicated instruction decoding, though these are likely secondary at best.
Posted Sep 1, 2011 10:28 UTC (Thu)
by nix (subscriber, #2304)
[Link] (2 responses)
Posted Sep 2, 2011 14:50 UTC (Fri)
by BenHutchings (subscriber, #37955)
[Link] (1 responses)
Posted Sep 3, 2011 19:45 UTC (Sat)
by nix (subscriber, #2304)
[Link]
Posted Sep 1, 2011 22:45 UTC (Thu)
by gerdesj (subscriber, #5446)
[Link] (15 responses)
Surely it is application developers rather than distributors who would support this thing?
In the Gentoo case, presumably I'd merely have to remember to pick a kernel option before emerging something that could use this. Said option would be mentioned in the middle of a 300 package emerge, after setting a USE flag, which I'd never notice 8)
For now, fixing how CUPS can cause a 2-hour load time for a LibreOffice file is probably going to yield better performance improvements (it's something to do with printers being unavailable away from "home")
I am not an expert but this looks like a bodge of some sort. An application is written to work with a 2^n bit system. If it runs on a 2^n bit system then great. If not then you'll need a compatibility layer.
Surely the 64 bit version of a (previously) 32 bit app can be efficient in terms of memory and register usage.
I can't help but be reminded of the 16 -> 32 bit migration.
FIX THE BLOODY APPLICATION!
Cheers
Posted Sep 1, 2011 23:00 UTC (Thu)
by dlang (guest, #313)
[Link] (14 responses)
the larger footprint uses more CPU cache, making the system spend more time waiting for the cache to be updated from memory.
this is why many of the chips that have both 32 bit and 64 bit modes tend to run 64 bit kernels with 32 bit userspace: for programs that don't need to address more than 4G of ram, the overhead of the larger data objects results in a slowdown
x86/AMD64 is pretty much unique in the fact that 64 bit mode doesn't just extend the registers to 64 bits, it also gives you twice as many registers to work with. Since the x86 platform has far fewer registers than more modern designs, the availability of more registers means that far more things can happen in the CPU itself, without having to save and load values out to the cache (or worse yet, to RAM) in a constant shuffle to get register space for the things that need it. x86 systems spend a HUGE amount of time doing this register shuffle.
the idea behind the x32 architecture is to be able to take advantage of these extra registers (which almost always result in improved performance) without having to pay the overhead of larger pointers to memory.
the fact that many 32 bit applications that are not 64 bit clean can be made to run in this mode is pure gravy, and if the time change takes place, this may be sacrificed in order to get a better long-term x32 architecture.
Posted Sep 2, 2011 5:35 UTC (Fri)
by eru (subscriber, #2753)
[Link] (13 responses)
I can see the reasoning, but still I feel the idea is very bad. It reminds me too much of the "memory models" of MS-DOS, 16-bit Xenix, and 16-bit OS/2, and the problems associated with having separate library versions for each, and slightly different requirements and capacities of programs depending on how they were compiled. Been there, and did not like it. Please don't bring this mess to Linux!
Having more modes just means more available ways for the programmer to screw things up, and more possibilities for low-level bugs and security holes in the kernel and C library. The now-existing 32-bit mode in x86_64 is justifiable for supporting legacy binaries, but other memory models will just complicate things with very little gain.
Posted Sep 2, 2011 14:41 UTC (Fri)
by nix (subscriber, #2304)
[Link] (12 responses)
> Please don't bring this mess to Linux!

Linux has had 'this mess' since the days of SPARC64 in the 90s, and now with x86-64 and x86-32, biarchy is downright common. The linker and dynamic linker know about it, and you cannot accidentally link against the wrong library. Biarch packaging problems have largely been weeded out by the ubiquity of x86-64.
Posted Sep 2, 2011 19:02 UTC (Fri)
by dlang (guest, #313)
[Link] (11 responses)
and as far as I have seen, almost all distros for those chips ship 64 bit kernels with 32 bit userspace because 32 bit binaries are faster to run than 64 bit ones (due to the more compact code and memory addresses), as long as you can live in 4G of address space as an application.
there are actually very few cases where a single application needs to address more than 4G of address space, and in many, if not most of those cases there are real advantages to just running multiple processes rather than a single giant process. so this works very well in the real world.
Posted Sep 2, 2011 22:36 UTC (Fri)
by martinfick (subscriber, #4455)
[Link] (7 responses)
I guess you think java applications are few. :)
Posted Sep 2, 2011 22:47 UTC (Fri)
by dlang (guest, #313)
[Link] (6 responses)
remember that virtualization is supposed to be the wave of the future, especially for things in datacenters. part of the way this works is that you slice up the memory available on a server to allocate it between many more small servers. most such servers end up with less than 4G per virtual server, and what we are talking about for x32 is 4G per _application_ (not counting OS buffering, kernel allocations, or any other overhead); this is a lot more elbow room.
not every application can fit in 4G, but when you really look at it, a surprising number of them will.
and pointer-heavy things like Java are especially likely to benefit from the smaller pointers of x32
Posted Sep 5, 2011 7:48 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
http://wikis.sun.com/display/HotSpotInternals/CompressedOops and http://blog.juma.me.uk/tag/compressed-oops/
Posted Sep 5, 2011 22:38 UTC (Mon)
by intgr (subscriber, #39733)
[Link] (4 responses)
Offtopic, but interesting: 64-bit Java already offers the -XX:+UseCompressedOops option which turns on pointer compression. By dropping 3 bits from the least significant end of the address, it can address 32GB of memory using 32-bit pointer fields.
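The trick is easy to sketch in C: with 8-byte object alignment, the low three bits of every heap offset are zero, so they can be shifted away and a 32-bit field can span 32GB. This is an illustrative sketch; the names are not HotSpot's.

```c
/* Sketch of compressed-oops-style pointer compression: 8-byte object
   alignment frees the low 3 bits, so a 32-bit field covers 32GB.
   These names are illustrative, not HotSpot internals. */
#include <stdint.h>

static char *heap_base;   /* start of the managed heap */

static inline uint32_t compress_ref(void *obj)
{
    return (uint32_t)(((char *)obj - heap_base) >> 3);
}

static inline void *decompress_ref(uint32_t ref)
{
    return heap_base + ((uint64_t)ref << 3);
}
```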
Posted Sep 6, 2011 14:35 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Apr 2, 2012 14:30 UTC (Mon)
by Richard_J_Neill (subscriber, #23093)
[Link] (2 responses)
Posted Apr 3, 2012 20:49 UTC (Tue)
by ibukanov (subscriber, #3942)
[Link] (1 responses)
On modern CPUs, memory is addressed internally by cache lines that are typically 16, 32, or 64 bytes in size. On x86, byte access is just as fast as 32-bit access. Moreover, misaligned access to 32-bit values is allowed and is not costly as long as the variable does not cross a cache line boundary.
Posted May 21, 2012 15:08 UTC (Mon)
by mikemol (guest, #83507)
[Link]
Posted Sep 3, 2011 11:50 UTC (Sat)
by raven667 (subscriber, #5198)
[Link] (2 responses)
Posted Sep 3, 2011 17:12 UTC (Sat)
by dlang (guest, #313)
[Link] (1 responses)
1. especially early on there were problems with the compatibility mode causing occasional 'strange' errors when running 32 bit userspace on a 64 bit kernel.
2. the added registers of 64 bit mode significantly improve the performance of 64 bit code vs 32 bit code, in almost every case even when you take into account the extra overhead of the larger pointers.
Posted Sep 6, 2011 3:30 UTC (Tue)
by butlerm (subscriber, #13312)
[Link]
Posted Sep 6, 2011 0:24 UTC (Tue)
by kragilkragil2 (guest, #76172)
[Link]
MS and Apple? Intel or AMD?
My guess is they don't because they would need compiler support, but I also think that some engineers at Apple, MS, Intel or AMD did the benchmarks and concluded that it isn't worth it. AFAIK modern CPUs use some sort of shadow registers or something to mask away the performance penalties you get by having so few registers.
Posted Sep 6, 2011 5:49 UTC (Tue)
by gmaxwell (guest, #30048)
[Link] (13 responses)
The inevitable result of this is that I'm going to have _two_ copies of most of my system libraries in core at all times, and we'll be back to the bad old days where common software isn't 64 bit clean (right now it's mostly only proprietary crap-ware like flash that's problematic).
And for what? So a very few overly-pointered, memory-bandwidth-bound test cases can run faster? And many of these cases could run just as well by switching to (e.g.) using pointer offsets internally (which would also reduce their scalability, but no worse than switching to 32 bit mode).
Posted Sep 6, 2011 12:27 UTC (Tue)
by liljencrantz (guest, #28458)
[Link]
Aside from the possibility of getting a 64-bit time_t on 32-bit systems, this sounds like a huge waste of time.
Posted Sep 7, 2011 6:23 UTC (Wed)
by butlerm (subscriber, #13312)
[Link] (3 responses)
With open source applications, what is there to complain about? If you don't like x32 just use x64 only.
And of course the big advantage of x32 over pointer compression is that no source modifications are required, modifications that in a typical C application would be extremely painful.
Posted Sep 7, 2011 6:47 UTC (Wed)
by gmaxwell (guest, #30048)
[Link] (2 responses)
"If x32 compiled distributions run significantly faster than x64" IFF, but based on the currently available micro-benchmarks this seems unlikely. I've yet to see an example of a single application which is faster in x32 than best_of(x86,x86_64), and if we're in the two libraries mode then taking the choice of x86 for those few memory bandwidth bound pointer heavy apps that don't mine the scalability constraint is no worse.
"occasional application" like... my browser? (which is currently using ~4GiB of VM, though not as much resident obviously).
Not to mention the reduced address space for ASLR.
Posted Sep 8, 2011 21:19 UTC (Thu)
by JanC_ (guest, #34940)
[Link]
Posted Sep 9, 2011 2:30 UTC (Fri)
by butlerm (subscriber, #13312)
[Link]
That is the wrong metric to judge an ABI by - unless you agree that we should stick with an x86 + x86_64 biarchy indefinitely, and have distributions compile every other application appropriately. Then we really will end up with both sets of libraries pinned in memory.
x32 is noticeably better than x86, on some benchmarks as much as 30% more. It is also noticeably better than x86_64, another 30% on important workloads. It is a better all around ABI for most applications.
x86 is stunted, and will hopefully go away in a few years. But x32 sounds like it is worth keeping around for a long time. A 30% performance increase on many workloads isn't the sort of thing you want to idly throw away.
Posted Sep 9, 2011 12:23 UTC (Fri)
by NikLi (guest, #66938)
[Link] (7 responses)
A vm like python uses a *lot* of pointers:
- a list of 'n' items is a buffer of 'n' pointers. Same for tuples.
- a dictionary of 'n' items is a buffer of ~6*n pointers
- every string item carries a pointer
- every instance is a dictionary plus a couple of pointers
C programmers think in terms of memory buffers, but dynamic languages, where objects work by reference, are mostly built on tons of pointers; this is what makes them dynamic. And yes, making all those pointers half their size is very important, because when you want to look up something in a list, the list is fetched into the cache and all the pointers are traversed while looking for the item. Fetching a 2k buffer is better than fetching a 4k buffer. In fact, x86 might be more suitable than x86-64 for such vms!
(It would be very interesting to see some python benchmarks for x32 vs x86, nonetheless)
Now, one may say that "if you want speed, do it in C". However, making a dynamic language faster will benefit thousands of programs written in that language, which is important for some people.
Using pointer offsets suffers from one extra indirection and will kill a big part of the cache. On the other hand, pointing to more than 4G of things is overkill.
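A rough C sketch of the layout being described here: every list element is a pointer to a boxed object, so pointer width directly scales the footprint. This is a simplified, hypothetical object model, not CPython's actual structures.

```c
/* Simplified, hypothetical object model (not CPython's real layout):
   a dynamic-language list is an array of pointers to boxed objects,
   so pointer width directly scales the list's cache footprint. */
#include <stddef.h>

typedef struct object {
    struct object *type;    /* every object points at its type */
    size_t refcount;
    /* payload follows */
} object;

typedef struct list {
    size_t len;
    object **items;         /* n items => a buffer of n pointers */
} list;
```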
Posted Sep 9, 2011 14:12 UTC (Fri)
by gmaxwell (guest, #30048)
[Link] (6 responses)
You use a single offset (after all, we're assuming you're willing to take a 4G limit in these applications) and keep it in a register.
Alternatively, how about an ABI that promises you that you can get memory under the 4G mark, so you use 32 bits internally and convert at the boundaries to external libraries? This way single applications can be 32 bit without overhead, but it doesn't drag the whole system with it.
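A minimal sketch of that offset scheme, with all names hypothetical: internal references are 32-bit offsets from a single base, converted to real pointers only at the edges (bounds checks omitted).

```c
/* Sketch of the pointer-offset scheme: internal references are 32-bit
   offsets from one arena base, converted to real pointers only at the
   boundary to external code.  All names here are hypothetical. */
#include <stdint.h>
#include <stdlib.h>

static char *arena;        /* single base; in practice kept in a register */
static size_t arena_used;

typedef uint32_t ref_t;    /* 32-bit internal "pointer" (an offset) */

static int arena_init(size_t size)
{
    arena = malloc(size);  /* the arena itself need not sit below 4GB */
    return arena != NULL;
}

static ref_t arena_alloc(size_t len)
{
    ref_t r = (ref_t)arena_used;
    arena_used += (len + 7) & ~(size_t)7;   /* keep 8-byte alignment */
    return r;
}

static inline void *deref(ref_t r) { return arena + r; }
```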
Posted Sep 9, 2011 15:45 UTC (Fri)
by dlang (guest, #313)
[Link] (5 responses)
however you missed that libraries can allocate memory as well, and so the libraries must be compiled to only request memory under 4G as well.
Posted Sep 9, 2011 18:12 UTC (Fri)
by gmaxwell (guest, #30048)
[Link] (1 responses)
It would have ~all the performance benefits without doubling the libraries in memory. It wouldn't, however, retain the benefit of reduced porting effort for existing 32bit crapware, since pointers in library-owned structures would be the other size. ::shrugs::
Posted Sep 11, 2011 3:23 UTC (Sun)
by butlerm (subscriber, #13312)
[Link]
It would be essentially the same as adding support for 80286 style near and far pointers across the code base. In C, every structure, every header file, every shared pointer declaration would potentially have to be marked whether it was using large or small pointers. The compiler certainly wouldn't know that an arbitrary function or structure declaration was referring to something from a library, and some libraries would have to come in a non-standard flavor in any case.
Now as you say, there are certain advantages to that, in terms of memory and cache footprint. They did it back in the 80286 era for a reason. But it is much more impractical to implement that sort of thing across the source code for practically everything than simply to compile under a new ABI, especially if the new ABI performs well enough to be the system default.
A reasonable distribution policy could be to replace x86 with x32, and not ship x86_64 libraries in x32 distributions. It could simply say that if you want a 64 bit user space, you should use a full 64 bit version. 64 bit addressing could be reserved for the kernel. If I were to guess, half of the people currently planning to use x32 (e.g. in embedded applications) have that sort of thing in mind in any case.
Posted Apr 9, 2012 21:28 UTC (Mon)
by snadrus (guest, #60224)
[Link] (2 responses)
Posted Apr 10, 2012 9:08 UTC (Tue)
by khim (subscriber, #9252)
[Link] (1 responses)
1. Open wikipedia. Read. Perhaps then you'll be considered seriously in some future architecture dispute.
2. Try to pretend you never asked this question.
Posted Apr 13, 2012 6:37 UTC (Fri)
by biged (guest, #50106)
[Link]
Your response here is beyond rude: it is poisonous. You should realise that with more time and attention someone might be able to explain the misconception, help others and avoid insulting anyone.
Please stop treating LWN as your inbox: post less often, and more thoughtfully. For me, you have become a spammer.
Posted Sep 10, 2011 8:27 UTC (Sat)
by bersl2 (guest, #34928)
[Link] (2 responses)
Posted Sep 10, 2011 17:11 UTC (Sat)
by dlang (guest, #313)
[Link] (1 responses)
Posted Sep 11, 2011 14:28 UTC (Sun)
by Baylink (guest, #755)
[Link]
Posted Apr 15, 2012 21:16 UTC (Sun)
by tenchiki (subscriber, #53749)
[Link] (4 responses)
One of the notable specs of the n32 ABI that doesn't seem to have been mentioned for x32 is making the long datatype 64 bits:

| ABI | o32 | n32 | 64 |
|---|---|---|---|
| int | 32 | 32 | 32 |
| long | 32 | 64 | 64 |
| pointer | 32 | 32 | 64 |

(all other types same size)
Posted Apr 15, 2012 23:05 UTC (Sun)
by khim (subscriber, #9252)
[Link] (3 responses)
Why would anyone want this? x32 uses the ILP32 model to minimize the difference between IA32 mode and x32 mode. Any other choice just looks… strange.
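That ILP32 choice is easy to verify; a trivial program like the one below should print 4 for all three sizes when built as x32 (e.g. with GCC's -mx32, assuming a toolchain with x32 support) and 4/8/8 when built for plain x86-64.

```c
/* Data model check: prints "int: 4 long: 4 ptr: 4" under ILP32 (x32),
   "int: 4 long: 8 ptr: 8" under LP64 (plain x86-64). */
#include <stdio.h>

int main(void)
{
    printf("int: %zu long: %zu ptr: %zu\n",
           sizeof(int), sizeof(long), sizeof(void *));
    return 0;
}
```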
Posted Apr 16, 2012 6:56 UTC (Mon)
by paulj (subscriber, #341)
[Link] (2 responses)
What's wrong with using long long for that?
Posted Apr 16, 2012 9:00 UTC (Mon)
by khim (subscriber, #9252)
[Link] (1 responses)
Posted Apr 16, 2012 11:07 UTC (Mon)
by paulj (subscriber, #341)
[Link]
Posted Dec 2, 2012 14:08 UTC (Sun)
by normcf (guest, #88125)
[Link] (1 responses)
Posted Dec 2, 2012 23:06 UTC (Sun)
by dlang (guest, #313)
[Link]
If you have a huge array of pointers, the memory savings will outweigh this cost, but not for the normal uses.
Summing up:
x86_64/amd64 => wastes space (and cache == performance), many registers
ia32 => saves space, few registers
x32 => saves space, many registers