Because the transposition makes O(N^2) accesses and the multiplication makes O(N^3). Each access in
the transposition is more expensive, but there are N times fewer of them than in the multiplication...
Posted Oct 27, 2007 22:41 UTC (Sat) by giraffedata (subscriber, #1954)
Aha. Perfectly clear now. The article neglects to explain this; I'd probably say, "the original traverses mul2 in this expensive nonsequential way 1000 times, whereas the improved version does it only once."
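
To make the counting argument concrete, here is a minimal sketch in C, not the article's actual code. Only the name mul2 comes from the discussion above; mul1, res, tmp, the small N, and the "strided" counters are assumptions added for illustration. Each function returns how many nonsequential (stride-N) reads of mul2 it performs.

```c
#include <stddef.h>

/* Small N for illustration only; the thread above talks about N = 1000. */
enum { N = 4 };

/* Naive product: the inner loop walks mul2 down a column, so every one of
   the N*N*N reads of mul2 jumps a whole row ahead in memory.
   Returns the number of such strided reads. */
static unsigned long mul_naive(double mul1[N][N], double mul2[N][N],
                               double res[N][N])
{
    unsigned long strided = 0;
    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < N; ++k) {
                sum += mul1[i][k] * mul2[k][j];  /* column walk of mul2 */
                ++strided;
            }
            res[i][j] = sum;
        }
    return strided;
}

/* Transposed variant: pay for the column walk once (N*N strided reads)
   while building tmp, then multiply with both operands read row by row.
   Returns the number of strided reads of mul2. */
static unsigned long mul_transposed(double mul1[N][N], double mul2[N][N],
                                    double res[N][N])
{
    double tmp[N][N];
    unsigned long strided = 0;
    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < N; ++j) {
            tmp[i][j] = mul2[j][i];              /* the only column walk */
            ++strided;
        }
    for (size_t i = 0; i < N; ++i)
        for (size_t j = 0; j < N; ++j) {
            double sum = 0.0;
            for (size_t k = 0; k < N; ++k)
                sum += mul1[i][k] * tmp[j][k];   /* sequential in both */
            res[i][j] = sum;
        }
    return strided;
}
```

In this sketch mul_naive reports N*N*N strided reads and mul_transposed reports N*N, which is the "N times fewer" point: the expensive traversal happens once instead of once per row of the result.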