Supporting 64-bit ARM systems

Posted Jul 12, 2012 22:16 UTC (Thu) by cesarb (subscriber, #6266)
In reply to: Supporting 64-bit ARM systems by Lumag
Parent article: Supporting 64-bit ARM systems

It depends on what you are trying to do.

For instance, the inner loop of SHA-512 needs 8 64-bit working variables, plus an array of 16 64-bit values. On a 32-bit architecture with 15 registers, you will have to spill to the stack (each 64-bit value would occupy a pair of 32-bit registers, so the 8 working variables alone would need 16 of them). On a 64-bit architecture with 31 registers, you could keep the whole state (8+16 values) in registers and still have a few left for the intermediate calculations. The entire inner loop can run without having to touch the stack.
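To make the register count concrete, here is a rough C sketch of a single SHA-512 round over the 8 working variables and a rolling 16-word message schedule, following the round structure in FIPS 180-4. The name `sha512_round` and the array layout are assumptions made for illustration, the round constant `k` is passed in by the caller rather than taken from the standard's table, and this is not a complete or vetted hash implementation.

```c
#include <stdint.h>

/* Rotate a 64-bit word right by n bits (0 < n < 64). */
static inline uint64_t ror64(uint64_t x, unsigned n)
{
    return (x >> n) | (x << (64 - n));
}

/* The SHA-512 logical functions from FIPS 180-4. */
static inline uint64_t big_sigma0(uint64_t x) { return ror64(x, 28) ^ ror64(x, 34) ^ ror64(x, 39); }
static inline uint64_t big_sigma1(uint64_t x) { return ror64(x, 14) ^ ror64(x, 18) ^ ror64(x, 41); }
static inline uint64_t small_sigma0(uint64_t x) { return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7); }
static inline uint64_t small_sigma1(uint64_t x) { return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); }
static inline uint64_t ch(uint64_t e, uint64_t f, uint64_t g) { return (e & f) ^ (~e & g); }
static inline uint64_t maj(uint64_t a, uint64_t b, uint64_t c) { return (a & b) ^ (a & c) ^ (b & c); }

/*
 * One round (hypothetical helper, not from any particular library).
 * s[0..7] are the working variables a..h; w[0..15] is the rolling
 * message schedule; t is the round index; k is the round constant,
 * supplied by the caller. That is 8 + 16 = 24 live 64-bit values,
 * which fits in 31 64-bit registers but not in 15 32-bit ones.
 */
static void sha512_round(uint64_t s[8], uint64_t w[16], unsigned t, uint64_t k)
{
    if (t >= 16) {
        /* W[t] = sigma1(W[t-2]) + W[t-7] + sigma0(W[t-15]) + W[t-16] */
        w[t & 15] += small_sigma1(w[(t + 14) & 15]) + w[(t + 9) & 15]
                   + small_sigma0(w[(t + 1) & 15]);
    }

    uint64_t t1 = s[7] + big_sigma1(s[4]) + ch(s[4], s[5], s[6]) + k + w[t & 15];
    uint64_t t2 = big_sigma0(s[0]) + maj(s[0], s[1], s[2]);

    /* Shift the working variables down by one position. */
    s[7] = s[6];
    s[6] = s[5];
    s[5] = s[4];
    s[4] = s[3] + t1;
    s[3] = s[2];
    s[2] = s[1];
    s[1] = s[0];
    s[0] = t1 + t2;
}
```

Whether a compiler actually keeps all 24 values in registers depends on unrolling and on the target; on a 32-bit target each `uint64_t` above is split across two registers, which is where the spilling comes from.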

Supporting 64-bit ARM systems

Posted Jul 14, 2012 22:40 UTC (Sat) by jzbiciak (subscriber, #5246)

In the specific case of ARMv7 vs. ARMv8, you need to consider NEON. On an ARMv7 with NEON, wouldn't you do the SHA-512 hash using the NEON registers?

(These guys show decent results for other crypto algorithms using NEON. They suggest SHA-512 would speed up well also, but say they "didn't bother" with it yet.)

I guess my point is that ARMv7 already offers a path to way more registers than the base 16 x 32. I would imagine anyone springing for a Cortex-A15 would also include NEON. NEON is designed to absorb the heavy-duty bulk computation, leaving the 16 GPRs for the more general control stuff.

Supporting 64-bit ARM systems

Posted Jul 14, 2012 23:00 UTC (Sat) by dlang (subscriber, #313)

If you are building software for one device, everything you say is reasonable.

But if you are trying to build software for many different devices (like a distro needs to), then you can't count on optional features being there.

Supporting 64-bit ARM systems

Posted Jul 15, 2012 4:42 UTC (Sun) by jzbiciak (subscriber, #5246)

There are CP15 registers (ID_ISARx, MVFRx, CPACR, NSACR) that tell you whether VFP and NEON are present, and if present, whether they're powered up and available. (VFP/NEON are in a separate power domain on A15.) So, you could theoretically build software that includes NEON-optimized algorithms, and set up the right versions of the functions at the start of the code.
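A minimal sketch of that kind of startup-time selection, in C, might look like the following. It assumes a Linux user-space program, where the ID registers are not directly readable and the kernel's `AT_HWCAP` auxiliary vector entry (via glibc's `getauxval`) is consulted instead. The names `sum_generic`, `sum_neon`, `pick_sum`, and `init_dispatch` are hypothetical, and the "NEON" routine is only a stand-in that calls the generic code.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */
#include <asm/hwcap.h>  /* HWCAP_NEON on 32-bit ARM Linux */
#endif

typedef uint64_t (*sum_fn)(const uint8_t *buf, size_t len);

/* Portable fallback: plain byte sum, used here only as a toy workload. */
static uint64_t sum_generic(const uint8_t *buf, size_t len)
{
    uint64_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

/* Stand-in for a NEON-optimized version; a real one would use
 * NEON intrinsics or assembly. Here it just defers to the fallback. */
static uint64_t sum_neon(const uint8_t *buf, size_t len)
{
    return sum_generic(buf, len);
}

/* Ask the kernel whether NEON is usable; report "no" on targets
 * this sketch does not know how to probe. */
static int cpu_has_neon(void)
{
#if defined(__linux__) && defined(__arm__)
    return (getauxval(AT_HWCAP) & HWCAP_NEON) != 0;
#else
    return 0;
#endif
}

/* Choose an implementation given the probe result. */
static sum_fn pick_sum(int has_neon)
{
    return has_neon ? sum_neon : sum_generic;
}

/* Function pointer the rest of the program calls through. */
static sum_fn sum_bytes;

/* Call once at startup, before any use of sum_bytes. */
static void init_dispatch(void)
{
    sum_bytes = pick_sum(cpu_has_neon());
}
```

Callers would run `init_dispatch()` once early in `main` and then call `sum_bytes(buf, len)` everywhere else; the cost is one indirect call per invocation plus carrying both versions in the binary.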

Your point stands, though, for devices that would have to run the fallback version. Those would benefit from ARMv8 without having to use NEON. Plus, keeping the multiple versions around takes up more space, wrangling them is added complexity, etc. etc.

For something like a cell phone, where interpreters like Dalvik have a JIT, the compiled code can match the exact CPU in the phone. After all, that isn't exactly a user-serviceable part. ;-)
