P.S. I'm not voting against AArch64. I'm just saying that "just extending registers to 64-bit" probably won't work as expected.
Supporting 64-bit ARM systems
Posted Jul 11, 2012 18:20 UTC (Wed) by gnb (subscriber, #5132)
Posted Jul 12, 2012 22:16 UTC (Thu) by cesarb (subscriber, #6266)
For instance, the inner loop of SHA-512 needs 8 64-bit registers, plus an array of 16 64-bit values. On a 32-bit architecture with 15 registers, you will have to spill to the stack (each 64-bit value would be represented by a pair of 32-bit registers, so you would need 16 of them for the 8 working values alone). On a 64-bit architecture with 31 registers, you could have the whole state (8+16 values) in the registers, and still have a few left for the intermediate calculations. The entire inner loop can run without having to touch the stack.
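The register arithmetic in this comment can be checked with a small sketch. The register counts (15 and 31) and the state sizes (8 working values plus a 16-entry array) are the figures quoted above; the function and variable names are made up for illustration, and the model ignores whatever extra registers the intermediate calculations need.

```python
# Register-budget arithmetic for the SHA-512 inner loop, using the
# figures quoted in the comment above. All names here are illustrative.

WORKING_VARS = 8      # 64-bit working values in the inner loop
SCHEDULE_WORDS = 16   # the array of 16 64-bit values
VALUE_BITS = 64


def registers_needed(values, reg_bits):
    """Registers required to hold `values` 64-bit quantities."""
    per_value = -(-VALUE_BITS // reg_bits)  # ceiling division
    return values * per_value


def budget(usable_regs, reg_bits):
    """Return (needed, spare); a negative spare means spilling to the stack."""
    needed = registers_needed(WORKING_VARS + SCHEDULE_WORDS, reg_bits)
    return needed, usable_regs - needed


# 32-bit case: 15 registers, as assumed in the comment.
arm32_state_only = registers_needed(WORKING_VARS, 32)
arm32 = budget(15, 32)

# 64-bit case: 31 registers, as assumed in the comment.
arm64 = budget(31, 64)

print("32-bit, working values alone:", arm32_state_only, "of 15")
print("32-bit, full state (needed, spare):", arm32)
print("64-bit, full state (needed, spare):", arm64)
```

Under these assumptions the 32-bit case is over budget before the 16-entry array is even counted, while the 64-bit case holds all 24 values with 7 registers to spare.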
Posted Jul 14, 2012 22:40 UTC (Sat) by jzbiciak (✭ supporter ✭, #5246)
In the specific case of ARMv7 vs. ARMv8, you need to consider NEON. On an ARMv7 with NEON, wouldn't you do the SHA-512 hash using the NEON registers?
(These guys show decent results for other crypto algorithms using NEON. They suggest SHA-512 would speed up well also, but say they "didn't bother" with it yet.)
I guess my point is, ARMv7 already offers a path to way more registers than the base 16 x 32. I would imagine anyone springing for a Cortex-A15 would also include NEON. NEON is designed to absorb the heavy-duty bulk computation, leaving the 16 GPRs for the more general control stuff.
Posted Jul 14, 2012 23:00 UTC (Sat) by dlang (✭ supporter ✭, #313)
But if you are trying to build software for many different devices (like a distro needs to), then you can't count on optional features being there.
Posted Jul 15, 2012 4:42 UTC (Sun) by jzbiciak (✭ supporter ✭, #5246)
Your point stands, though, for devices that would have to run the fallback version. Those would benefit from ARMv8 without having to use NEON. Plus, keeping the multiple versions around takes up more space, wrangling them is added complexity, etc. etc.
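Keeping a NEON version alongside a fallback usually means picking one at run time. Here is a rough sketch of that selection on Linux, assuming the kernel reports the CPU flags on a "Features" line in /proc/cpuinfo ("neon" on 32-bit ARM, "asimd" on AArch64). The function names, the "neon"/"generic" labels and the sample text are hypothetical, and a real library would dispatch to actual routines rather than return a string.

```python
# Sketch of run-time selection between a NEON build and a generic fallback.
# The implementation labels and SAMPLE_CPUINFO are illustrative assumptions.


def cpu_features(cpuinfo_text):
    """Collect the flags from every 'Features' line of /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Features":
            flags.update(value.split())
    return flags


def pick_implementation(flags):
    """Prefer the SIMD version when the CPU advertises it."""
    if "neon" in flags or "asimd" in flags:
        return "neon"
    return "generic"


def read_cpuinfo(path="/proc/cpuinfo"):
    """Return the file contents, or an empty string if it cannot be read."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


# Made-up sample resembling a 32-bit ARM board with NEON.
SAMPLE_CPUINFO = (
    "Processor\t: ARMv7 Processor rev 4 (v7l)\n"
    "Features\t: swp half thumb fastmult vfp edsp neon vfpv3 tls\n"
)

print("sample board :", pick_implementation(cpu_features(SAMPLE_CPUINFO)))
print("this machine :", pick_implementation(cpu_features(read_cpuinfo())))
```

This is the added complexity mentioned above: both code paths have to be shipped, and the choice has to be made on the device.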
For something like a cell phone, where interpreters like Dalvik have a JIT, the compiled code can match the exact CPU in the phone. After all, that isn't exactly a user-serviceable part. ;-)
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds