1000 single bit operations per FLOP
Posted Jun 23, 2008 7:07 UTC (Mon) by jzbiciak (✭ supporter ✭, #5246)
In reply to: 1000 single bit operations per FLOP by mikov
Parent article: The Kernel Hacker's Bookshelf: Ultimate Physical Limits of Computation
Regarding the multiplier... It turns out that floating point multipliers can get rid of most of the alignment steps that the adder needs to do. (See my other post.) You also don't need to apply the sign to your input or your output. So, you're just left with the addition steps.
A naive 24 × 24 multiplier would effectively do 24 separate 24-bit adds, which comes to 6336 SBOPs. Assume for a moment, though, that we can eliminate about half of those because only 24 output bits are needed. That brings us down to 3168 SBOPs for the multiplier array. Remaining steps:
- Adding the exponents (8 bits): 88 SBOPs.
- Aligning the result (shift by up to 3 positions, IIRC, implemented as 2 levels of 2:1 mux): 2 × 24 × 3 = 144 SBOPs
This gives a total of ~3400 SBOPs, which is about 2× to 3× the cost of an adder.
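For anyone who wants to check the arithmetic, here is a rough Python sketch of the tally. The per-bit costs are assumptions inferred from the figures in this comment rather than stated in it: 11 SBOPs per full-adder bit (6336 / (24 × 24)) and 3 SBOPs per 2:1 mux (144 / (2 × 24)). The function and constant names are made up for illustration.

```python
# Back-of-the-envelope tally of the multiplier estimate.
# ASSUMPTION: per-bit costs are inferred from the totals quoted in the
# comment, not taken from any reference design.
FULL_ADDER_SBOPS = 11   # inferred: 6336 / (24 * 24)
MUX2_SBOPS = 3          # inferred: 144 / (2 * 24)

MANTISSA_BITS = 24      # single-precision significand, hidden bit included
SMALL_ADD_BITS = 8      # width of the 88-SBOP add (single-precision exponent width)
MUX_LEVELS = 2          # two levels of 2:1 mux for the final alignment


def multiplier_estimate():
    """Return each term of the estimate, plus the total, in SBOPs."""
    # 24 adds, each 24 bits wide
    naive_array = MANTISSA_BITS * MANTISSA_BITS * FULL_ADDER_SBOPS
    # keep roughly half, since only 24 output bits are needed
    truncated_array = naive_array // 2
    # the 88-SBOP add step
    small_add = SMALL_ADD_BITS * FULL_ADDER_SBOPS
    # result alignment through the mux levels
    align = MUX_LEVELS * MANTISSA_BITS * MUX2_SBOPS
    return {
        "naive_array": naive_array,
        "truncated_array": truncated_array,
        "small_add": small_add,
        "align": align,
        "total": truncated_array + small_add + align,
    }


estimate = multiplier_estimate()
for name, sbops in estimate.items():
    print(f"{name:16s} {sbops:5d} SBOPs")
```

Changing `FULL_ADDER_SBOPS` or the "keep about half" factor shows how sensitive the total is to those two guesses; the multiplier array dominates everything else.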
Oh, and that reminds me, you can eliminate one of the argument inversions in my adder estimate above. That squeezes another 60-70 SBOPs out. (You do add some other bit inversions here and there on the sign bits, which is why you don't get all 72 back.)