A Gentoo x32 release candidate
Posted Jun 6, 2012 17:05 UTC (Wed) by gmaxwell (guest, #30048) In reply to: A Gentoo x32 release candidate by gmaxwell
Parent article: A Gentoo x32 release candidate
Gah. I meant to say that they're either not using much memory, in which case the savings don't matter, or they are and they're the kind of workload where 64 vs 32 has scaling implications (e.g. the browser).
Posted Jun 6, 2012 17:31 UTC (Wed)
by mikemol (guest, #83507)
[Link] (13 responses)
Your browser is only one out of hundreds (if not thousands) of programs on your computer. Many (most?) of them only run for a few moments, or otherwise don't consume (or don't derive meaningful benefit from) huge amounts of memory.
Take the 'dd' command. top. ls. bash. dash. cp. mv. echo. cat. tee. cupsd. dbus-daemon. lpr. grep. find. xargs.
The programs you spend hours every day staring at? Yeah, those probably benefit from having a 64-bit address space. The programs you don't think about, often when you're not even actively using them? They probably don't.
Posted Jun 6, 2012 17:39 UTC (Wed)
by gmaxwell (guest, #30048)
[Link] (9 responses)
But they do link shared libraries, at least libc, which is rather large. So if you're going to have a mix of x32 and x86_64 programs running, you're going to end up with another copy of libc in memory for those things, passing through your caches, etc., which should easily offset the tiny gains from making those programs x32.
Posted Jun 6, 2012 18:13 UTC (Wed)
by mikemol (guest, #83507)
[Link]
Posted Jun 6, 2012 18:47 UTC (Wed)
by and (guest, #2883)
[Link]
Posted Jun 6, 2012 22:27 UTC (Wed)
by butlerm (subscriber, #13312)
[Link] (6 responses)
Those programs, yes. There is a significant class of other programs that can be sped up by as much as 40% compared to x86-64. The advantage is so great that x32 is reasonably likely to predominate over the latter in the future, outside a relatively narrow set of applications.
Posted Jun 6, 2012 23:28 UTC (Wed)
by andrel (guest, #5166)
[Link] (5 responses)
Posted Jun 7, 2012 0:34 UTC (Thu)
by dlang (guest, #313)
[Link] (2 responses)
I don't know any specific programs, but there are people who have reported that using 32 bit apps on 64 bit systems results in better performance than using 64 bit apps.
This seldom applies on the AMD64 architecture as 64 bit mode also gives you twice as many registers to use, but on Sparc and Power* systems this is a very common situation.
x32 is creating an equivalent architecture for the AMD64 systems.
Posted Jun 7, 2012 9:06 UTC (Thu)
by dvandeun (guest, #24273)
[Link] (1 responses)
Posted Jun 10, 2012 3:48 UTC (Sun)
by vonbrand (subscriber, #4458)
[Link]
... not to mention that quicksort (which is designed for arrays) makes next to no sense on lists...
Posted Jun 7, 2012 21:48 UTC (Thu)
by paulj (subscriber, #341)
[Link]
Also, as overall system memory usage is generally lower with x32, it allows, e.g., more VMs to be run for the same amount of memory.
Posted Jun 8, 2012 20:54 UTC (Fri)
by butlerm (subscriber, #13312)
[Link]
The specific example I had in mind is 181.mcf, part of the SPEC 2000 CPU benchmark.
http://www.spec.org/cpu2000/CINT2000/181.mcf/docs/181.mcf...
I imagine that many Perl, Python, and Java programs will show comparable improvements, in addition to compilers, linkers, web browsers, xml processors, interpreters, x32 native kernels, and garbage collected languages in general.
With support for near and far pointers it is conceivable one could dramatically improve kernel performance as well, making an x32/x86-64 hybrid kernel perform nearly as well as an x32 native one, without losing the ability to support 64 bit applications.
Posted Jun 7, 2012 13:54 UTC (Thu)
by foom (subscriber, #14868)
[Link]
Why do you think that Chrome on Linux would actually need the 64-bit address space when the vast majority of the installs (Windows) are all 32bit and work great?
Posted Jun 18, 2012 7:47 UTC (Mon)
by massimiliano (subscriber, #3048)
[Link]
I very intentionally didn't mention the browser as an application you'd want to be 32-bit. I thought about Chrome's model of one-process-per-tab, and decided I still liked the larger address for mmap and IPC purposes. The browser (or, at least, most of it) should be 64-bit. Perhaps there'd be sufficiently low overhead to have just the JS engine 32-bit.
Well, for most of the world "Chrome" means "Chrome on Windows", and "Chrome on Windows" means "the 32bit Chrome build".
And since Chrome works pretty well on Windows I guess a 32bit build should work well also on our beloved Linux desktops...
In fact here (V8 development team) we work on 64bit Linux hosts but we test and develop 32bit x86 before anything else, and then make sure that also amd64 and arm work perfectly. But when we look at performance numbers we do it mainly on the 32bit builds.
Posted Jun 18, 2012 11:29 UTC (Mon)
by hummassa (subscriber, #307)
[Link]
Posted Jun 6, 2012 20:23 UTC (Wed)
by jpnp (guest, #63341)
[Link] (5 responses)
I'm confident they would benefit, as they already benchmark better running as 32bit code on AMD64; adding the extra registers from the 64bit ABI can only aid the compiler.
Mind you, I don't see a great need for the whole OS to be X32, just support for X32 applications running on AMD64 for those workloads where it helps.
Posted Jun 7, 2012 0:43 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link] (4 responses)
You've probably already considered this, but for workloads like this, why not pre-allocate a moderate-sized pool of memory for this data and store just the offsets? That seems like a less intrusive solution than requiring multiple copies of system libraries to support amd64 and x32 side-by-side.
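For illustration, here is a minimal sketch of the pool-and-offsets idea described above, assuming a simple singly linked list. The names `pool_t`, `node_t`, `list_push`, and `list_sum` are hypothetical and not taken from any program discussed in this thread. Each link is a 32-bit index into one contiguous allocation, so the node stays at 8 bytes even in a 64-bit build, at the cost of an extra base-plus-index computation on every access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical list node: links by 32-bit pool index instead of a pointer. */
typedef struct {
    int32_t  value;
    uint32_t next;              /* index of the next node, or NODE_NONE */
} node_t;

#define NODE_NONE UINT32_MAX    /* sentinel index meaning "no node" */

/* Hypothetical pool: one contiguous block that all indices refer to. */
typedef struct {
    node_t  *nodes;
    uint32_t capacity;
    uint32_t used;
} pool_t;

/* Pre-allocate room for `capacity` nodes. Returns 0 on success, -1 on failure. */
static int pool_init(pool_t *p, uint32_t capacity) {
    p->nodes = malloc((size_t)capacity * sizeof(node_t));
    p->capacity = capacity;
    p->used = 0;
    return p->nodes ? 0 : -1;
}

static void pool_free(pool_t *p) {
    free(p->nodes);
    p->nodes = NULL;
    p->capacity = 0;
    p->used = 0;
}

/* Push `value` in front of the list starting at `head`.
 * Returns the new head index, or NODE_NONE if the pool is full. */
static uint32_t list_push(pool_t *p, uint32_t head, int32_t value) {
    if (p->used == p->capacity)
        return NODE_NONE;
    uint32_t idx = p->used++;
    p->nodes[idx].value = value;
    p->nodes[idx].next = head;
    return idx;
}

/* Walk the list by index; every hop is a base-plus-offset lookup. */
static int64_t list_sum(const pool_t *p, uint32_t head) {
    int64_t sum = 0;
    for (uint32_t i = head; i != NODE_NONE; i = p->nodes[i].next) {
        assert(i < p->used);    /* an index must refer to an allocated node */
        sum += p->nodes[i].value;
    }
    return sum;
}
```

The obvious downside is that every data structure has to be rewritten to go through the pool, whereas an ABI with 32-bit pointers would give the same node size from unmodified source.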
Also, is it too much to ask that x32 applications be capable of interacting with amd64 libraries? Perhaps merge x32 and x86_64 into a single ABI with "near" and "far" pointers? If mixed code always limits itself to a 32-bit address space, and x32 code uses the 64-bit system call ABI, then it should be possible to convert between "near" and "far" pointers transparently and use a single set of libraries for both modes. The only remaining issue that I can see is making sure the compiler knows which pointers need to be "far" pointers even when compiled in a x32 context (e.g. shared library header files).
Posted Jun 7, 2012 2:51 UTC (Thu)
by butlerm (subscriber, #13312)
[Link] (3 responses)
You can recompile well written programs for an ABI like this without any source code changes. Manually adding offsets, on the other hand, is slower and makes for unusually ugly looking code.
> Also, is it too much to ask that x32 applications be capable of interacting with amd64 libraries?
It is conceivable that shims could be provided for some 64-bit libraries, but in the general case (C++ libraries for example) it is not even practical.
Most initial x32 systems are likely to be x32 only. I wouldn't expect a desktop distribution to come with full libraries for both x32 and x86-64, one would probably either have x32 releases that come with a handful of 64 bit packages, or x86-64 releases that come with a handful of x32 packages.
Posted Jun 7, 2012 3:57 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link] (2 responses)
I wasn't actually talking about providing shims. Rather, shared libraries would be compiled just as they are now in amd64 mode. The x32 programs would use 64-bit pointers in shared data structures and APIs, and 32-bit pointers in their own internal structures and APIs. Obviously, for this to work either the x32 parts or the dual-mode parts have to be marked somehow, e.g. with an attribute or a pragma line, so the compiler knows to use the larger pointers when compiling shared APIs for x32. Since any application with x32 components is guaranteed to run in a 32-bit address space, converting between the 64-bit and 32-bit pointers is trivial--the most significant 32 bits of the full-size pointers are always zero. Apart from marking the boundaries, the compiler can do all of the work.
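The conversion described in the comment above can be sketched as follows. This is only an illustration under stated assumptions: "far" pointers are modeled as plain 64-bit addresses and "near" pointers as 32-bit addresses, and the names `far_addr_t`, `near_addr_t`, `near_to_far`, and `far_to_near` are hypothetical, not part of any real ABI or compiler extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: addresses are modeled as integers so the sketch runs anywhere. */
typedef uint64_t far_addr_t;    /* full-size pointer used in shared APIs   */
typedef uint32_t near_addr_t;   /* compact pointer used in internal structs */

static_assert(sizeof(near_addr_t) * 2 == sizeof(far_addr_t),
              "a near address is half the size of a far address");

/* Near to far is plain zero-extension and can never fail. */
static far_addr_t near_to_far(near_addr_t n) {
    return (far_addr_t)n;
}

/* Far to near is only valid when the upper 32 bits are zero, which is
 * what running inside a 32-bit address space would guarantee. */
static bool far_to_near(far_addr_t f, near_addr_t *out) {
    if ((f >> 32) != 0)
        return false;           /* address lies outside the low 4 GiB */
    *out = (near_addr_t)f;
    return true;
}
```

The arithmetic itself is trivial; the open question in this scheme is how the compiler learns which declarations in a shared header need the full-size type.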
> I wouldn't expect a desktop distribution to come with full libraries for both x32 and x86-64, one would probably either have x32 releases that come with a handful of 64 bit packages, or x86-64 releases that come with a handful of x32 packages.
The problem is the dependencies. Add just one moderately complex "foreign" package and you may end up needing duplicates of most of the system libraries. Some packages are relatively standalone, but what if you wanted, say, an x32 build of Chromium on an amd64 system? You'd need x32 builds of around 133 other packages[1] just to provide that one application.
[1] Estimated with: apt-cache depends --recurse -i chromium|awk '/^\s*Depends:\s+lib/{print $2;}'|sort -u
Posted Jun 7, 2012 4:15 UTC (Thu)
by mikemol (guest, #83507)
[Link]
You're *not* going to be able to interlink x32 and x86-64 binaries while sharing headers unless you make those headers aware of the differing binary representations of the types...and if you do that, you're making things significantly more complicated over a broad cross-section of code. That means tons of bugs.
As for having per-arch copies of the same binaries...that's already status quo on multilib systems. Not that big of a problem, really. x32 is poised to replace the old 32-bit ABI, with its segmented memory model and relatively limited register and CPU instruction set, with a 32-bit ABI with more registers and a higher-level guaranteed minimum for CPU instruction set availability. x32, in a sense, represents the new "i686" minimum compiler target for x86 systems with a 32-bit ABI.
Posted Jun 8, 2012 7:34 UTC (Fri)
by khim (subscriber, #9252)
[Link]
> Apart from marking the boundaries, the compiler can do all of the work.
Nope. Think about the standard library. memcpy quite obviously does not need to convert pointers, but aio_read needs to do that. And if you pass structures with pointers to functions around, then it becomes real ugly real fast. x86-64 NaCl is an independent reimplementation of the x32 architecture (we plan to rebase our change on top of x32 when it's stable), and for initial benchmarks we used the standard x86-64 glibc linked with our x32-like binary. This was a disaster: it was possible to compile and run a few simpler SPEC CPU2000 benchmarks this way, but things like 253.perlbmk just refused to work properly. When we finally got the loader and libc ported, we dropped this mixed mode like a hot potato. It's not worth it, believe me.
> Some packages are relatively standalone, but what if you wanted, say, an x32 build of Chromium on an amd64 system? You'd need x32 builds of around 133 other packages[1] just to provide that one application.
Right. This is a lot of work. But it's still simpler than trying to stitch Chromium from x32 pieces and x86-64 pieces.
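To make the memcpy versus aio_read distinction concrete, here is a small sketch, assuming a made-up request structure loosely modeled on what an asynchronous I/O call takes. The types `request32_t` and `request64_t` and the function `widen_request` are hypothetical. A flat byte copy works the same under either pointer size, but a structure that embeds a pointer and a size has a different layout on each side, so something has to translate it field by field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request as laid out by code built with 64-bit pointers. */
typedef struct {
    int32_t  fd;
    uint64_t buf;       /* pointer-sized field, 8 bytes */
    uint64_t nbytes;    /* size_t-sized field, 8 bytes  */
} request64_t;

/* The same request as laid out by code built with 32-bit pointers. */
typedef struct {
    int32_t  fd;
    uint32_t buf;       /* pointer-sized field, 4 bytes */
    uint32_t nbytes;    /* size_t-sized field, 4 bytes  */
} request32_t;

static_assert(sizeof(request32_t) == 12, "three packed 4-byte fields");

/* The kind of shim a mixed-mode call would need: widen every
 * pointer-sized and size-sized field before crossing the boundary. */
static request64_t widen_request(const request32_t *in) {
    request64_t out;
    out.fd = in->fd;
    out.buf = (uint64_t)in->buf;        /* zero-extend the address */
    out.nbytes = (uint64_t)in->nbytes;  /* zero-extend the length  */
    return out;
}
```

One such shim is easy. The trouble is that every structure passed across the boundary needs its own, and nested pointers or callbacks multiply the cases.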
