
My kid hates Linux (ZDNet)

Posted Apr 14, 2008 19:42 UTC (Mon) by pizza (subscriber, #46)
In reply to: My kid hates Linux (ZDNet) by JoeBuck
Parent article: My kid hates Linux (ZDNet)

For "generic" 64 vs 32-bit comparisons, I'd agree, but we're talking about x86 and x86_64
here, and the latter has many many architectural improvements over the former (such as double
the number of registers).  As a result, most software tends to run slightly faster due to less
register contention, despite the additional overhead of larger pointers.  Additionally, if
you're doing lots of 64-bit math (as CAD/EDA is wont to do), things get *much* faster.

And yes, I've benchmarked this for myself.  The one thing that's trivial for me to recompile
now (dcraw) gives me an 11% improvement when built as a 64-bit binary.



My kid hates Linux (ZDNet)

Posted Apr 14, 2008 23:47 UTC (Mon) by djabsolut (guest, #12799)

Not to disparage the generally good idea of moving towards x86_64, but an improvement of 11% is not really worth the hassle of incompatibilities. What are the speedups like on average?

(AFAIK, modern processors "translate" the crufty x86_32 code into their own internal code, and along with a large cache this makes issues such as lack of registers not really a problem. The only practical reason one would want to use x86_64 is larger available memory space and/or 64 bit math -- the number of applications needing this is dwarfed by plain-jane applications).

My kid hates Linux (ZDNet)

Posted Apr 15, 2008 3:44 UTC (Tue) by jwb (subscriber, #15467)

There are major differences with x86_64 that show up everywhere, not just for math.  The
calling conventions on the 64-bit system are far cleaner.  More arguments can be passed to
functions in registers (6, I think) than on 32-bit systems, where the extra arguments have
to be placed on the stack.  Stack management operations on x86 are not free; they take one or a
few cycles during every function call and function return.  This can add up.

x86_64 also allows more and better ways of addressing data that can save an explicit load to
register.

These are not theoretical improvements.  Lots of programs run much better on x86_64 than on
plain old x86.

The 64-bit systems do still have the problem of larger pointers which can crowd the cache, but
some programmers find ways around this.  BEA, for example, uses short heap pointers in their
JVM, which gives them all the speedups of the x86_64 programming model (described above)
without paying the cost of 64-bit pointers.

Sorry, Mr. pizza ...

Posted Apr 15, 2008 1:17 UTC (Tue) by JoeBuck (subscriber, #2330)

... but I wasn't speaking theoretically. I work in electronic design automation.

The doubled-memory effect really does overwhelm the effect of having more registers, 64-bit math and a better machine architecture in many real cases, particularly when the program's working set is in the gigabytes. The time to move that data through the CPU overwhelms all other considerations. The 64-bit executable wins when the working set exceeds the 32-bit address space, of course, but in the range where the 32-bit program requires 1-2 Gbytes and the 64-bit program needs nearly double that, the 32-bit executable is often faster.
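To make the doubling concrete, here is a minimal sketch in Python using ctypes. The node layout (two links plus a 32-bit payload), the names Node32 and Node64, and the node count are all hypothetical and not taken from any real EDA tool; the links are modelled as plain 32-bit and 64-bit integers so the result does not depend on the machine running the script.

```python
import ctypes


class Node32(ctypes.Structure):
    # Hypothetical node as a 32-bit build would lay it out:
    # two 4-byte links plus a 4-byte payload.
    _fields_ = [
        ("next", ctypes.c_uint32),
        ("prev", ctypes.c_uint32),
        ("value", ctypes.c_uint32),
    ]


class Node64(ctypes.Structure):
    # The same node with 8-byte links, as in a 64-bit build.
    # The payload is unchanged, but alignment padding may be added.
    _fields_ = [
        ("next", ctypes.c_uint64),
        ("prev", ctypes.c_uint64),
        ("value", ctypes.c_uint32),
    ]


def working_set_bytes(node_type, count):
    """Bytes needed to hold `count` nodes of the given layout."""
    return ctypes.sizeof(node_type) * count


if __name__ == "__main__":
    nodes = 100_000_000  # assumed problem size, for illustration only
    for node_type in (Node32, Node64):
        size = ctypes.sizeof(node_type)
        total_gib = working_set_bytes(node_type, nodes) / 2**30
        print(f"{node_type.__name__}: {size} bytes/node, {total_gib:.2f} GiB total")
```

For pointer-heavy structures like this one, most of each node is links, so widening the links comes close to doubling the bytes that have to move through the cache. Structures dominated by non-pointer data would grow far less.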

For this reason, many EDA applications are available in both 32-bit and 64-bit versions, and the recommendation to the customer is to use the 32-bit version even on the 64-bit machine except where the problem is too large.

Sorry, Mr. pizza ...

Posted Apr 15, 2008 6:20 UTC (Tue) by motk (subscriber, #51120)

Counterpoint, RAM is pretty cheap these days. Just Add More.

Of course, you do come across motherboard limitations occasionally.

RAM is not the problem

Posted Apr 15, 2008 14:46 UTC (Tue) by GreyWizard (subscriber, #1026)

CPU cache and bandwidth limitations are the issue here, not RAM size.

Sorry, Mr. pizza ...

Posted Apr 15, 2008 6:36 UTC (Tue) by bronson (subscriber, #4806)

If 64 bit pointers are really that big a deal, how come the EDA guys don't use 4GB memory
pools with 32-bit offsets?  That way you get the speed and huge memory space of 64 bits with
the space efficiency of 32 bit.  Seems like a win-win.
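A rough sketch of the pool idea, in Python rather than anything an EDA vendor actually ships: nodes live in flat arrays and refer to each other by 32-bit index instead of by full pointer. NodePool, NIL, alloc and walk are hypothetical names chosen for illustration.

```python
from array import array

# Sentinel index meaning "no node"; the largest value a 32-bit index can hold.
NIL = 0xFFFFFFFF


class NodePool:
    """Singly linked nodes stored in parallel arrays, linked by 32-bit index."""

    def __init__(self):
        # Type code "I" is an unsigned int, 4 bytes on mainstream platforms.
        self.values = array("I")  # payload of each node
        self.next = array("I")    # index of the following node, or NIL

    def alloc(self, value, next_index=NIL):
        """Append a node and return its index within the pool."""
        index = len(self.values)
        if index >= NIL:
            raise MemoryError("pool is full: index no longer fits in 32 bits")
        self.values.append(value)
        self.next.append(next_index)
        return index

    def walk(self, head):
        """Follow the links from `head` and collect the payloads."""
        out = []
        while head != NIL:
            out.append(self.values[head])
            head = self.next[head]
        return out


if __name__ == "__main__":
    pool = NodePool()
    tail = pool.alloc(3)
    middle = pool.alloc(2, tail)
    head = pool.alloc(1, middle)
    print(pool.walk(head))
```

Every link costs 4 bytes regardless of the width of a native pointer, but each dereference goes through the pool (base plus index), and a node in one pool cannot link directly to a node in another.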

It's been quite a while since I've done EDA (some VLSI layout and simulation back in 2003).  I
remember some seriously crufty software produced by vendors who would do anything to avoid an
update.  Some of the tools I used were written *and compiled* pre-1998!  It was a nightmare
trying to get that junk to run.  I eventually got the toolchain working and then I never let
anybody touch that box again.  Not so much as a security update or a package upgrade lest it
break anything.

So...  If the EDA industry is indeed pushing back against 64 bit, there might be more to it
than just pointer size inflating the working set.  :)

Sorry, Mr. pizza ...

Posted Apr 17, 2008 20:34 UTC (Thu) by im14u2c (subscriber, #5246)

If 64 bit pointers are really that big a deal, how come the EDA guys don't use 4GB memory pools with 32-bit offsets? That way you get the speed and huge memory space of 64 bits with the space efficiency of 32 bit. Seems like a win-win.

Sounds like a maintenance nightmare to me, particularly if the code base is shared between 32-bit and 64-bit worlds, and if any portion of the data set has an index larger than 2^32-1. The reason I say "index" is that these pools could be homogeneous pools of structures, and so the addressed memory in that pool could actually be as large as 2^32 * sizeof(struct whatever), rather than just 2^32 bytes.
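The arithmetic behind that distinction can be sketched in a few lines of Python. The function names and the 24-byte record size are assumptions for illustration only.

```python
INDEX_BITS = 32  # width of the index stored in place of a pointer


def max_index(index_bits=INDEX_BITS):
    """Largest index representable in `index_bits` bits."""
    return (1 << index_bits) - 1


def pool_capacity_bytes(record_size, index_bits=INDEX_BITS):
    """Bytes addressable by a homogeneous pool of `record_size`-byte records."""
    return (1 << index_bits) * record_size


if __name__ == "__main__":
    record_size = 24  # hypothetical sizeof(struct whatever)
    print("largest index:", max_index())
    print("byte-offset pool:", pool_capacity_bytes(1) // 2**30, "GiB")
    print("record-index pool:", pool_capacity_bytes(record_size) // 2**30, "GiB")
```

With byte offsets a pool tops out at 4 GiB; with record indices the same 32 bits reach 2^32 records, so the limit scales with the record size.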

Sure, on 64-bit machines you get the compact representation. But, on all machines that share that code base, you add an additional indirection to compute your final pointer, and you've thrown up partitions in your memory map based on where these pools are. If your problem doesn't partition into pools nicely, you're hosed.

Sorry, Mr. pizza ...

Posted Apr 15, 2008 13:21 UTC (Tue) by pizza (subscriber, #46)

Fair enough; your particular daily-use EDA app (proprietary?  you've never actually mentioned
what it is) performs worse.  You use what best supports your needs, after all.

However, my daily-use apps perform significantly better under 64-bit.  That 11% improvement
with dcraw was the only one I could recreate the benchmarks on immediately, as it's trivial to
recompile.

My main daily-use app (GCC cross-compiler building a multi-million line codebase) runs
considerably faster under x86_64.  However, I no longer have an identical 32-bit system for
comparison, so I can't supply benchmarks without blowing half a day on it.  (The
64-bit gnome desktop *feels* faster too, but that's obviously subjective.)

One of the folks I work with has also raved about the improvements he saw using the 64-bit
versions of the particular FPGA synthesizer tools.  

Not to mention the speedup one gets by not needing bounce buffers (and other games) for I/O.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds