
The x32 system call ABI

The x32 system call ABI

Posted Sep 6, 2011 5:49 UTC (Tue) by gmaxwell (subscriber, #30048)
Parent article: The x32 system call ABI

Ugh. I'm having such a hard time making myself believe that this is a good idea.

The inevitable result of this is that I'm going to have _two_ copies of most of my system libraries in core at all times, and we'll be back to the bad old days where common software isn't 64-bit clean (right now it's mostly only proprietary crap-ware like Flash that's problematic).

And for what? So a very few overly pointered, memory-bandwidth-bound test cases can run faster? And many of these cases could run just as well by switching to (e.g.) using pointer offsets internally (which would also reduce their scalability, but no worse than switching to 32-bit mode).



The x32 system call ABI

Posted Sep 6, 2011 12:27 UTC (Tue) by liljencrantz (subscriber, #28458) [Link]

Agreed. This, to me, sounds like over-optimizing.

Aside from the possibility of getting a 64-bit time_t on 32-bit systems, this sounds like a huge waste of time.

The x32 system call ABI

Posted Sep 7, 2011 6:23 UTC (Wed) by butlerm (subscriber, #13312) [Link]

If x32 compiled distributions run significantly faster than x64, it seems rather likely to me that desktop users will generally end up with _one_ x32 copy of system libraries in memory, with x64 libraries only loaded for the occasional application that needs a very large memory space.

With open source applications, what is there to complain about? If you don't like x32 just use x64 only.

And of course the big advantage of x32 over pointer compression is that no source modifications are required, modifications that in a typical C application would be extremely painful.

The x32 system call ABI

Posted Sep 7, 2011 6:47 UTC (Wed) by gmaxwell (subscriber, #30048) [Link]

"If you don't like x32 just use x64 only" which means I get to go back to the bad old days of playing (int) to (void *)/(size_t) conversion guy, because when 64-bit systems weren't commonly deployed on developers' desktops a lot of stuff simply didn't work without a bunch of fuss. The freedom of open source has tremendous but not infinite value; there is a real cost to being an oddball.

"If x32 compiled distributions run significantly faster than x64" IFF, but based on the currently available micro-benchmarks this seems unlikely. I've yet to see an example of a single application which is faster in x32 than best_of(x86,x86_64), and if we're in the two-libraries mode then choosing x86 for those few memory-bandwidth-bound, pointer-heavy apps that don't mind the scalability constraint is no worse.

"occasional application" like... my browser? (which is currently using ~4GiB of VM, though not as much resident obviously).

Not to mention the reduced address space for ASLR.
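
The (int) to (void *) conversion problem mentioned above can be sketched in a few lines. This is only an illustration, not code from any real project: it uses Python's ctypes to mimic what a C cast such as (int)ptr does to an address. The helper name through_int and both sample addresses are made up for the example, and it assumes a platform where a C int is 32 bits wide.

```python
import ctypes

def through_int(address):
    """Mimic the C expression (int)ptr: keep only what fits in a C int.

    ctypes integer types do no overflow checking, so out-of-range
    values are silently truncated, much like the C cast.
    """
    return ctypes.c_int(address).value

low_address = 0x08048000        # hypothetical address below 4 GiB
high_address = 0x7FFF12345678   # hypothetical address above 4 GiB

# With a low address the cast is harmless, so the bug stays hidden.
print(hex(through_int(low_address)))
# With a high address the upper bits are lost.
print(hex(through_int(high_address)))
```

Code written this way presumably keeps working wherever pointers fit in 32 bits, which is why it tends to go unnoticed until someone runs it with 64-bit pointers.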

The x32 system call ABI

Posted Sep 8, 2011 21:19 UTC (Thu) by JanC_ (guest, #34940) [Link]

It's using almost 4 GiB on a 64-bit system now? But of course your browser would supposedly need significantly less memory when running in x32 mode? And once Firefox also uses out-of-process rendering (like Chrome/Chromium), that would become even less of an issue...?

The x32 system call ABI

Posted Sep 9, 2011 2:30 UTC (Fri) by butlerm (subscriber, #13312) [Link]

>I've yet to see an example of a single application which is faster in x32 than best_of(x86,x86_64)

That is the wrong metric to judge an ABI by - unless you agree that we should stick with an x86 + x86_64 biarchy indefinitely, and have distributions compile every other application appropriately. Then we really will end up with both sets of libraries pinned in memory.

x32 is noticeably faster than x86, by as much as 30% on some benchmarks. It is also noticeably faster than x86_64, by another 30% on important workloads. It is a better all-around ABI for most applications.

x86 is stunted, and will hopefully go away in a few years. But x32 sounds like it is worth keeping around for a long time. A 30% performance increase on many workloads isn't the sort of thing you want to idly throw away.

The x32 system call ABI

Posted Sep 9, 2011 12:23 UTC (Fri) by NikLi (guest, #66938) [Link]

It is not "pointer memory bandwidth bound test cases".

A VM like Python uses a *lot* of pointers:

- a list of 'n' items is a buffer of 'n' pointers. Same for tuples.
- a dictionary of 'n' items is a buffer of ~6*n pointers
- every string item carries a pointer
- every instance is a dictionary plus a couple of pointers

C programmers think in terms of memory buffers, but dynamic languages, where objects work by reference, are mostly built on tons of pointers; this is what makes them dynamic. And yes, making all those pointers half their size is very important. Imagine that when you want to look something up in a list, the list is fetched into the cache and all the pointers are traversed while looking for the item. Fetching a 2k buffer is better than fetching a 4k buffer. In fact, x86 might be more suitable than x86-64 for such VMs!
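
As a rough back-of-the-envelope check of the 2k versus 4k figure, the sketch below computes the size of a flat buffer of references for 4-byte and 8-byte pointers. The helper name pointer_buffer_bytes and the count of 512 items are assumptions chosen for the example. The last lines rely on a CPython implementation detail (a tuple stores one pointer per item) and may not hold on other interpreters.

```python
import struct
import sys

# Size in bytes of a C pointer in the running interpreter build.
POINTER_SIZE = struct.calcsize("P")

def pointer_buffer_bytes(n_items, pointer_size=POINTER_SIZE):
    """Bytes occupied by a flat buffer of n_items pointers."""
    return n_items * pointer_size

# 512 references with 4-byte pointers versus 8-byte pointers.
print(pointer_buffer_bytes(512, 4))
print(pointer_buffer_bytes(512, 8))

# CPython detail: one extra tuple item costs exactly one pointer.
per_item = sys.getsizeof((None,) * 1001) - sys.getsizeof((None,) * 1000)
print(per_item == POINTER_SIZE)
```

This only counts the reference buffer itself; the objects being pointed to carry their own headers and pointers on top of that.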

(It would be very interesting to see some Python benchmarks for x32 vs x86, nonetheless.)

Now, one may say that "if you want speed, do it in C". However, making a dynamic language faster will benefit thousands of programs written in that language, which is important for some people.

Using pointer offsets suffers from one extra indirection and will kill a big part of the cache. On the other hand, pointing to more than 4G of things is an overkill.

The x32 system call ABI

Posted Sep 9, 2011 14:12 UTC (Fri) by gmaxwell (subscriber, #30048) [Link]

> Using pointer offsets suffers from one extra indirection and will kill a big part of the cache. On the other hand, pointing to more than 4G of things is an overkill.

You use a single offset (after all, we're assuming you're willing to take a 4G limit in these applications) and keep it in a register.
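
A minimal sketch of that idea: every link is a 32-bit offset relative to one base, here played by a single pool. The class name OffsetList and its fields are hypothetical, and Python's array module stands in for a raw memory region; in C the base would be an address held in a register and each access would compute base plus offset. It assumes array type code "I" is 4 bytes wide, which holds on common platforms.

```python
from array import array

class OffsetList:
    """Singly linked list whose links are 32-bit offsets into one pool."""

    NIL = 0xFFFFFFFF  # sentinel offset meaning "no next node"

    def __init__(self):
        self.values = array("q")  # one 64-bit payload per node
        self.next = array("I")    # offset of the next node, or NIL
        self.head = self.NIL

    def push_front(self, value):
        """Allocate a node at the end of the pool and link it in front."""
        self.values.append(value)
        self.next.append(self.head)
        self.head = len(self.values) - 1

    def __iter__(self):
        """Follow offsets from the head until the NIL sentinel."""
        offset = self.head
        while offset != self.NIL:
            yield self.values[offset]
            offset = self.next[offset]

numbers = OffsetList()
for n in (1, 2, 3):
    numbers.push_front(n)
print(list(numbers))
print(numbers.next.itemsize)
```

Each link costs four bytes regardless of the native pointer size, at the price of capping the pool at roughly four billion nodes.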

Alternatively, how about an ABI that promises you that you can get memory under the 4G mark, so that you use 32 bits internally and convert at the boundaries to external libraries? This way single applications can be 32-bit without overhead, but it doesn't drag the whole system with it.

The x32 system call ABI

Posted Sep 9, 2011 15:45 UTC (Fri) by dlang (✭ supporter ✭, #313) [Link]

What you are trying to describe is basically what the x32 architecture is doing.

However, you missed that libraries can allocate memory as well, and so the libraries must also be compiled to only request memory under 4G.

The x32 system call ABI

Posted Sep 9, 2011 18:12 UTC (Fri) by gmaxwell (subscriber, #30048) [Link]

It would only take a single syscall to the kernel to tell it to never give this process access to _any_ address space outside of the first 4 GB (not via sbrk, mmap, etc.).

It would have ~all the performance benefits without doubling the libraries in memory. It wouldn't, however, retain the benefit of easier porting of existing 32-bit crapware, since pointers in library-owned structures would be the other size. ::shrugs::

The x32 system call ABI

Posted Sep 11, 2011 3:23 UTC (Sun) by butlerm (subscriber, #13312) [Link]

What you describe could be done, but it would be difficult to implement, require special compiler support to do well, and would break source compatibility even with special compiler support.

It would be essentially the same as adding support for 80286 style near and far pointers across the code base. In C, every structure, every header file, every shared pointer declaration would potentially have to be marked whether it was using large or small pointers. The compiler certainly wouldn't know that an arbitrary function or structure declaration was referring to something from a library, and some libraries would have to come in a non-standard flavor in any case.

Now as you say, there are certain advantages to that, in terms of memory and cache footprint. They did it back in the 80286 era for a reason. But it is far less practical to implement that sort of thing across the source code for practically everything than simply to compile under a new ABI, especially if the new ABI performs well enough to be the system default.

A reasonable distribution policy could be to replace x86 with x32, and not ship x86_64 libraries in x32 distributions. It could simply say that if you want to have a 64-bit user space, you should use a full 64-bit version. 64-bit addressing could be reserved for the kernel. If I were to guess, half of the people currently planning to use x32 (e.g. in embedded applications) have that sort of thing in mind in any case.

The x32 system call ABI

Posted Apr 9, 2012 21:28 UTC (Mon) by snadrus (guest, #60224) [Link]

What about building x32 off ia32 compatibility? There would be no kernel changes, but just compiler changes to use the additional registers. You may even be able to use ia32 or x32 libraries interchangeably if you're not passing by register.

The x32 system call ABI

Posted Apr 10, 2012 9:08 UTC (Tue) by khim (subscriber, #9252) [Link]

1. Open wikipedia. Read.
2. Try to pretend you never asked this question.

Perhaps then you'll be considered seriously in some future architecture dispute.

Your worst yet, and for me your last

Posted Apr 13, 2012 6:37 UTC (Fri) by biged (subscriber, #50106) [Link]

Khim, you have exceeded your usual levels of hostility and brashness with this comment, and so I have added you to my filter. (I mention this as a reminder to others: My Account -> Comment Filtering.)

Your response here is beyond rude: it is poisonous. You should realise that with more time and attention someone might be able to explain the misconception, help others and avoid insulting anyone.

Please stop treating LWN as your inbox: post less often, and more thoughtfully. For me, you have become a spammer.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds