Does 64bit really help for doubles? I thought that FPUs on current 32bit processors handled 80 bits or so.
Desktops usually no; Servers yes.
Posted Dec 10, 2004 12:17 UTC (Fri) by evgeny (guest, #774)
Well, at least for my codes it does. YMMV ;-)
> I thought that FPUs on current 32bit processors handled 80 bits or so.
P3 is not bad, and Centrino is even better (per GHz). But you can't get SMP boxes with Centrino, and the CPU speed is not high. P4/Xeon4, on the other hand, is a complete disaster as far as doubles and long longs are concerned; again, this is my experience. Try running md5sum on a large file (e.g. an ISO image) on P3, P4, and Opteron boxes (with a 64bit kernel and md5sum binary in the last case, of course) and compare the user times. I even tried Intel's ICC compiler: all the same. Switching from x86 to amd64 got me a factor of ~2 (per GHz) compared to the P3 and ~4 (!) compared to the P4.
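For anyone who wants to repeat the comparison, here is a minimal sketch of how the user times could be collected the same way on each box. It assumes a Unix-like system and Python 3; the helper name `user_cpu_seconds` is made up for illustration and is not part of any existing tool.

```python
import os
import subprocess
import sys


def user_cpu_seconds(cmd):
    """Run cmd to completion and return the user CPU seconds it consumed.

    os.times().children_user accumulates the user time of child processes
    that have been waited for, so the difference around the call is the
    user time of this one command (and anything it spawned and reaped).
    """
    before = os.times().children_user
    # Discard the command's output; only its CPU time is of interest here.
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    after = os.times().children_user
    return after - before
```

A call such as `user_cpu_seconds(["md5sum", "some-large.iso"])` (the file name is a placeholder) would then be run on each machine, and the results divided by the clock speed to get per-GHz figures. This is roughly the "user" line of `time md5sum ...`, just in a form that is easy to script.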
Posted Dec 10, 2004 21:08 UTC (Fri) by jzbiciak (✭ supporter ✭, #5246)
The speedup on md5sum probably comes from doubling the number of integer registers, and thereby reducing register spills to the stack.
At any rate, doubling the integer register file should have a minimal impact on floating point codes, since all floating point computation occurs in the floating point register file. One area floating point code will see speedups is in block copies. Non-computational manipulation of floating point data in the integer register file (stuff like memcpy(), structure assignment, array initialization) will speed up.
Posted Dec 10, 2004 22:31 UTC (Fri) by evgeny (guest, #774)
Not with FP; with (64bit) long longs. Hmm, or at least that was my impression. I did some benchmarks a year ago and noticed that a couple of utilities ran much faster on amd64 than on x86, and it seemed to be related to the use of 64bit variables/structs.
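One plausible reason 64bit variables would cost more on x86, offered as an assumption rather than something measured here: a 32bit CPU has to hold each long long in two integer registers and add the halves separately, propagating a carry between them. The sketch below models that in Python purely to show the arithmetic; the function name is hypothetical, and real compilers emit an add followed by an add-with-carry rather than anything like this code.

```python
MASK32 = 0xFFFFFFFF  # keeps a value within one 32-bit "register"


def add64_with_32bit_halves(a, b):
    """Add two unsigned 64-bit values using only 32-bit-wide steps."""
    # Split each operand into low and high 32-bit halves.
    a_lo, a_hi = a & MASK32, (a >> 32) & MASK32
    b_lo, b_hi = b & MASK32, (b >> 32) & MASK32

    # Step 1: add the low halves and note whether they overflowed.
    lo_sum = a_lo + b_lo
    carry = lo_sum >> 32
    lo = lo_sum & MASK32

    # Step 2: add the high halves plus the carry; the result wraps
    # modulo 2**64, as unsigned 64-bit arithmetic does.
    hi = (a_hi + b_hi + carry) & MASK32
    return (hi << 32) | lo
```

On amd64 the same addition is a single instruction on a single register, which also leaves more registers free for everything else.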
> The speedup on md5sum probably comes from doubling the number of integer registers, and thereby reducing register spills to the stack.
Probably you're right.
> One area floating point code will see speedups is in block copies.
That's definitely not the case with my number-crunching codes. Furthermore, running a 32bit executable under a 64bit kernel takes exactly twice as much CPU time as the 64bit executable.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds