> Given that x86 accesses memory 32-bits at a time anyway,
On a modern CPU, memory is accessed internally in cache lines, which are typically 16, 32, or 64 bytes in size. On x86, a byte access is just as fast as a 32-bit access. Moreover, misaligned access to 32-bit values is allowed and is not costly as long as the value does not cross a cache line boundary.
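To make the cache line condition concrete, here is a rough sketch in C of how one might check whether an access straddles two lines. The 64-byte `CACHE_LINE_SIZE` and the helper name `crosses_cache_line` are assumptions for illustration; the real line size depends on the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size. 64 bytes is common on current x86 parts,
   but this is an assumption, not a value queried from the CPU. */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns true if an access of `size` bytes
   starting at address `addr` touches more than one cache line. */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

Under these assumptions, a 32-bit value at offset 1 is misaligned but stays within one line, while the same value at offset 62 spans two lines and is the case that gets expensive.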
Posted May 21, 2012 15:08 UTC (Mon) by mikemol (subscriber, #83507)
For basic instructions, yes. But take a look at the SSE instructions: several of them come in both unaligned and aligned versions, and the aligned versions will carry better performance.
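As an illustration of the two load forms, here is a minimal sketch in C using the SSE intrinsics `_mm_load_ps` (aligned, requires a 16-byte-aligned address) and `_mm_loadu_ps` (unaligned). The function name `sum_floats` is made up for this example, and the code does not measure the performance difference; it only shows how the choice is made. A scalar fallback is included for builds without SSE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical helper: sums n floats, where n must be a multiple of 4.
   Uses the aligned load when the pointer is 16-byte aligned and the
   unaligned load otherwise. */
static float sum_floats(const float *p, size_t n)
{
    assert(n % 4 == 0);
#if defined(__SSE__)
    int aligned = ((uintptr_t)p % 16) == 0;
    __m128 acc = _mm_setzero_ps();
    for (size_t i = 0; i < n; i += 4) {
        /* p + i keeps the same 16-byte alignment as p, since i steps by 4 floats. */
        __m128 v = aligned ? _mm_load_ps(p + i) : _mm_loadu_ps(p + i);
        acc = _mm_add_ps(acc, v);
    }
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
#else
    /* Scalar fallback when SSE is not available at compile time. */
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += p[i];
    return s;
#endif
}
```

Note that the aligned form is not merely a hint: passing a misaligned address to `_mm_load_ps` can fault, which is why the sketch checks the pointer first.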