Why 64 bit?

Posted Sep 13, 2006 4:34 UTC (Wed) by TwoTimeGrime (guest, #11688)
Parent article: What you should (and shouldn't) expect from 64-bit Linux (Linux.com)

What practical advantage does 64 bit have over regular computers other than the ability to address more RAM?


Why 64 bit?

Posted Sep 13, 2006 4:45 UTC (Wed) by shemminger (subscriber, #5739) [Link]

x86-64 code has more registers available so it is faster.

64 bit allows addressing >4 gigabytes of memory without doing the PAE hack.

Why 64 bit?

Posted Sep 13, 2006 15:02 UTC (Wed) by chill633 (guest, #16013) [Link]

64-bit is faster just because it has more registers? Not quite.

My one serious test w/64-bit involved benchmarking Slackware 10.2 w/2.6 kernel against Slamd64 w/2.6 kernel. The one thing I was interested in was converting DVDs to Xvid (MPEG-4) using vobcopy and transcode.

The 32-bit Slackware version was 40-50% faster than the identical Slamd64 setup. By "identical" I mean the same computer (dual-boot), same DVD, same software installed (well, 32-bit in one case, 64-bit in the other).

I never could figure it out. Vobcopy was identical in speed, down to the second. Transcode, however, was a dog using the Slamd64 version.

My next test is going to be Gentoo, when I have some time.

Charles

Why 64 bit?

Posted Sep 13, 2006 15:29 UTC (Wed) by ewan (subscriber, #5533) [Link]

I believe that transcode has some handwritten assembler for some of its
heavy numerical loops that's only used on supported x86-32 processors;
everything else, including both non x86 and x86-64 uses the equivalent
portable versions written in C.
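
The arrangement described above can be sketched roughly as follows. This is not transcode's actual code: the kernel (a sum of absolute differences), the `HAVE_ASM_SAD` macro and the `sum_abs_diff_asm` name are all hypothetical, chosen only to show how a build can end up using hand-written assembler on 32-bit x86 and plain C everywhere else, x86-64 included.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable C version: available on every architecture. */
static uint32_t sum_abs_diff_c(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += (a[i] > b[i]) ? (uint32_t)(a[i] - b[i])
                               : (uint32_t)(b[i] - a[i]);
    return total;
}

#if defined(__i386__) && defined(HAVE_ASM_SAD)
/* Hypothetical hand-tuned routine, assumed to live in a separate
   assembler file that only exists for 32-bit x86. */
uint32_t sum_abs_diff_asm(const uint8_t *a, const uint8_t *b, size_t n);
#define sum_abs_diff sum_abs_diff_asm
#else
/* Everything else, including x86-64, gets the C fallback. */
#define sum_abs_diff sum_abs_diff_c
#endif

uint32_t block_distance(const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(a != NULL && b != NULL);
    return sum_abs_diff(a, b, n);
}
```

With a layout like this, a 64-bit build silently takes the `#else` branch, so a benchmark ends up comparing two different implementations rather than two word sizes.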

Why 64 bit?

Posted Sep 13, 2006 15:33 UTC (Wed) by khim (subscriber, #9252) [Link]

Gosh. You've compared hand-optimized 32bit assembler code and generic 64bit code without SIMD instructions. You found it VERY surprising that optimized code which uses SIMD instructions is faster than non-optimized code. Why? Have you EVER seen a situation where generic code is faster than hand-optimized code with SIMD instructions?

Hint: if you are comparing DIFFERENT programs (and make no mistake: 32bit transcode and 64bit transcode are totally different beasts) all bets are off. Gentoo will not help there...

P.S. AFAIK a 64bit SIMD implementation is in the works and there is something in CVS, but it's not enabled in any builds right now...

Why 64 bit?

Posted Sep 13, 2006 4:54 UTC (Wed) by bryanr (guest, #25324) [Link]

- AMD64 has more registers than IA32 -- resulting in fewer spills to memory (load/store ops)

- The wider registers mean fewer cycles are used during memcpy and
related operations (the CPU can load/store 64 bits of data per
instruction instead of 32)

- Modern CPU instructions are always enabled on AMD64. Many "i686"
distros have gcc schedule code for an i686 pipeline, yet only emit
instructions present since i486 (penalizing new hardware to ensure
compatibility with old)
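
The point about wider registers can be illustrated with a toy word-at-a-time copy. This is only a sketch: real memcpy implementations are far more elaborate, and the function name is made up. It assumes `unsigned long` matches the register width, which holds on Linux (32 bits on i386, 64 bits on x86-64) but not on every platform.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n bytes one machine word at a time and return the number of
   word-sized load/store pairs that were needed. */
size_t copy_by_words(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    unsigned char *d = dst;
    const unsigned char *s = src;
    size_t ops = 0;

    while (n >= sizeof(unsigned long)) {
        unsigned long w;
        memcpy(&w, s, sizeof w);   /* one word-sized load  */
        memcpy(d, &w, sizeof w);   /* one word-sized store */
        s += sizeof w;
        d += sizeof w;
        n -= sizeof w;
        ops++;
    }
    while (n--)                    /* leftover tail bytes */
        *d++ = *s++;
    return ops;
}
```

Copying the same 64-byte buffer takes 16 iterations when the word is 4 bytes and 8 when it is 8 bytes, which is the whole of the effect being described.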

Why 64 bit?

Posted Sep 13, 2006 17:53 UTC (Wed) by raven667 (subscriber, #5198) [Link]

> Modern CPU instructions are always enabled on AMD64. Many "i686"
> distros have gcc schedule code for an i686 pipeline, yet only emit
> instructions present since i486 (penalizing new hardware to ensure
> compatibility with old)

I don't believe that to be accurate. I'm pretty sure that the actual object code does runtime cpu
detection and has separate optimized code paths for i386/486/586/686/etc. so there should be no
difference when using -march=i386 and -mcpu=i686.

Why 64 bit?

Posted Sep 13, 2006 20:41 UTC (Wed) by nix (subscriber, #2304) [Link]

Doesn't that depend on the software?

(Some builds of speed-critical stuff like OpenSSL use the CPU hwcaps and
the per-hwcap library feature of glibc to produce builds optimized for
appropriate processors: look for e.g. /usr/lib/i686 or /usr/lib/v9...)

Why 64 bit?

Posted Sep 14, 2006 2:42 UTC (Thu) by raven667 (subscriber, #5198) [Link]

> Doesn't that depend on the software?

I don't believe so; what I understand is that if you use both -march and -mcpu, gcc will put in
separate optimized code paths for each. There will be a fallback for i386 but optimized code for
i686. The binary will obviously be larger as it may contain multiple implementations of the same
code.

At least that's how I understand it to work, but I can't find a reference quickly. Maybe
someone more knowledgeable than I can comment?

Why 64 bit?

Posted Sep 14, 2006 6:47 UTC (Thu) by khim (subscriber, #9252) [Link]

When everything else fails, RTFM.

And here we go:
-mtune=cpu-type: Tune to cpu-type everything applicable about the generated code, except for the ABI and the set of available instructions.
-march=cpu-type: Generate instructions for the machine type cpu-type. The choices for cpu-type are the same as for -mtune. Moreover, specifying -march=cpu-type implies -mtune=cpu-type.
-mcpu=cpu-type: A deprecated synonym for -mtune.

Nothing like "separate code path", "fallback for i386" and other such nonsense. If you are specifying "-march=i386 -mtune=i686" it just means: "try to optimize for i686, but don't use anything except i386 instructions". Quite sad, really: C++ code can experience a slowdown of up to 30-40% with such options, but usually just 5-10%.

Autoselection can be done (the kernel does it in some configurations, OpenSSL does it, GLibC does it, some other programs do it), but that's up to the application developer; the compiler will not help you there... Where you got the ridiculous idea that it's a task for the compiler, I do not know...
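
For what application-level autoselection can look like, here is a minimal sketch. It is not taken from the kernel, OpenSSL or glibc. The function names are made up, `sum_sse2` is only a stand-in for a genuinely tuned routine, and `__builtin_cpu_supports` is a builtin from later GCC releases that was not available at the time of this thread.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*sum_fn)(const int *v, size_t n);

/* Baseline version that runs on any CPU. */
static long sum_generic(const int *v, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Stand-in for a routine written with SSE2 intrinsics or assembler;
   it just reuses the generic loop to keep the sketch short. */
static long sum_sse2(const int *v, size_t n)
{
    return sum_generic(v, n);
}

/* The application, not the compiler, decides which version to use. */
static sum_fn pick_sum(void)
{
#if (defined(__i386__) || defined(__x86_64__)) && defined(__GNUC__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("sse2"))
        return sum_sse2;
#endif
    return sum_generic;
}

long sum_ints(const int *v, size_t n)
{
    static sum_fn impl;           /* chosen once, on first call */
    assert(v != NULL || n == 0);
    if (!impl)
        impl = pick_sum();
    return impl(v, n);
}
```

Every piece of this (the two implementations, the CPU check, the function pointer) has to be written by the developer; -march and -mtune do none of it.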

Why 64 bit?

Posted Sep 14, 2006 18:57 UTC (Thu) by nix (subscriber, #2304) [Link]

OpenSSL doesn't do it, as far as I know (it can't; many critical-path
things are macro-expanded). Instead, distributors use glibc's hwcap
mechanism to select appropriately-compiled OpenSSL libraries for the
hardware at dynamic link time. (This is especially useful on e.g. SPARC,
because of SPARCv7's lack of integer multiply instructions.)

Why 64 bit?

Posted Sep 13, 2006 21:16 UTC (Wed) by dberkholz (subscriber, #23346) [Link]

Only in code specifically written to do so, e.g. Mesa.

Why 64 bit?

Posted Sep 13, 2006 9:03 UTC (Wed) by jfj (guest, #37917) [Link]

Unless you are using "long long" operations or more than 4GB of memory per process, none. Oh, I forgot, it eats your cache memory so you are using 1/2 of it compared to 32 bits.
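
The cache cost comes from pointers (and `long`) doubling in size; plain `int` and `char` data stays the same. A small sketch with a made-up list node shows the effect. The sizes in the comments assume the usual i386 and x86-64 Linux ABIs.

```c
#include <assert.h>
#include <stddef.h>

/* A typical pointer-heavy structure.
   Usually 12 bytes on i386, 24 bytes on x86-64 (with padding). */
struct node {
    struct node *next;
    struct node *prev;
    int value;
};

size_t node_bytes(void)
{
    return sizeof(struct node);
}

/* How many whole nodes fit in one cache line of the given size. */
size_t nodes_per_cache_line(size_t line_bytes)
{
    assert(line_bytes > 0);
    return line_bytes / sizeof(struct node);
}
```

Under those assumptions a 64-byte cache line holds five such nodes on i386 and two on x86-64, while an array of `int` takes exactly the same space on both.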

Why 64 bit?

Posted Sep 13, 2006 9:11 UTC (Wed) by nix (subscriber, #2304) [Link]

Congratulations on being the only commenter not to assume that '64-bit' means 'x86-64' :)

Why 64 bit?

Posted Sep 13, 2006 15:36 UTC (Wed) by khim (subscriber, #9252) [Link]

Ummm... What OTHER 64-bit architecture is available for "brand-new AMD64 workstation" ?

Why 64 bit?

Posted Sep 13, 2006 20:42 UTC (Wed) by nix (subscriber, #2304) [Link]

Er. Um. True. I forgot the actual *content* of the article.

I'll go away now.

Why 64 bit?

Posted Sep 13, 2006 15:34 UTC (Wed) by charris (subscriber, #13263) [Link]

I haven't noticed much difference on the desktop, but for certain numerical things there can be significant differences. For instance, the random number generator MWC8222 relies on multiplying two 32 bit integers with a 64 bit result and runs about twice as fast on amd64. Oddly enough, that advantage disappeared on the EM64T Xeon machine I tried it on, so go figure.

By and large I am favoring 32 bits for the short term just because of the codecs and such. When those things eventually mature, and with MS in the 64 bit desktop market they will, I will switch over for good.

Chuck

Why 64 bit?

Posted Sep 13, 2006 17:29 UTC (Wed) by zlynx (subscriber, #2285) [Link]

Now, this may not be factual, but I have read that Intel cheated a bit with EM64T instructions.

What they apparently do is run 64-bit operations through their existing 32-bit ALU, and because their P4 architecture has a double-clocked ALU, this isn't so bad. What it does mean though, is that you've just exchanged the instructions a 32-bit compiler generates for 64-bit ops with CPU microcode doing the same thing.
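
The sequence a 32-bit compiler generates for a 64-bit add amounts to an add on the low halves followed by an add-with-carry on the high halves. Written out in C it looks roughly like this; the type and function names are made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value held as two 32-bit halves, as on a 32-bit CPU. */
typedef struct { uint32_t lo, hi; } u64_pair;

uint64_t pair_to_u64(u64_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

u64_pair u64_to_pair(uint64_t v)
{
    u64_pair p = { (uint32_t)v, (uint32_t)(v >> 32) };
    return p;
}

/* 64-bit add built from two 32-bit adds (add, then add-with-carry). */
u64_pair add64_via_32(u64_pair a, u64_pair b)
{
    u64_pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo);   /* low half wrapped around */
    r.hi = a.hi + b.hi + carry;
    /* Cross-check against native 64-bit arithmetic. */
    assert(pair_to_u64(r) == pair_to_u64(a) + pair_to_u64(b));
    return r;
}
```

The second add cannot start until the carry from the first is known, which is why it matters whether the hardware handles both halves at once or runs them back to back.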

Why 64 bit?

Posted Sep 14, 2006 6:57 UTC (Thu) by khim (subscriber, #9252) [Link]

It's not the same thing. 64-bit instructions were discovered in the photo of the Prescott CPU two years before Intel said that it would support EM64T! How? On the photo you can find TWO 32bit ALUs! One is used for 32bit, the other for the high half of the 64bit (it's smaller, its registers don't have tags so it cannot be used separately, etc). It means that while it's slower than AMD64, it's still faster than what the compiler does, because the compiler can only use the first ALU!

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.