|
|
Log in / Subscribe / Register

GCC 12.1 Released

GCC 12.1 Released

Posted May 15, 2022 11:23 UTC (Sun) by excors (subscriber, #95769)
In reply to: GCC 12.1 Released by anton
Parent article: GCC 12.1 Released

> The surviving general-purpose architectures are AMD64, Aarch64, RV64GC, Power, s390.

It does get a lot easier if you exclude ARMv7, though that transition is either pretty recent or hasn't happened yet, depending on what field you're working in.

If I'm reading it right, ARMv8-A says: Unaligned accesses to Device memory (i.e. MMIO) always fault. Most loads/stores to unaligned Normal memory are okay, but multi-register loads/stores will fault if the SCTLR_ELx.A bit is set (though I believe Linux doesn't set that), and Exclusive/Acquire/Release/Atomic accesses will fault unless your CPU is ARMv8.4 (or older with an optional feature) (but even when unaligned atomics are supported, they may (unpredictably) fault if they cross a 16-byte boundary).

ARMv7-A will fault in much less obscure cases, e.g. any unaligned multi-word access (LDM, LDRD, etc) regardless of SCTLR.A. That's a problem whenever you're loading an int64_t, or even two adjacent int32_ts (because the compiler likes to merge them into one instruction), and if it's not aligned you'll need to tell the compiler with __attribute__((packed)).

ARMv8-M also faults on unaligned multi-word accesses. An ARMv8-M Baseline implementation (which I think is the modern replacement for ARMv6-M) will even fault on unaligned single-word accesses.


to post comments

GCC 12.1 Released

Posted May 15, 2022 12:48 UTC (Sun) by anton (subscriber, #25547) [Link] (2 responses)

It does get a lot easier if you exclude ARMv7, though that transition is either pretty recent or hasn't happened yet, depending on what field you're working in.
In general-purpose computers, the transition to ARMv8-A has happened quite a while ago (e.g., with Raspi3 in 2016).

However, maybe it has more to do with the instruction set. In that case, Aarch32 seems to be pretty alive on RaspiOS (although even they have started releasing an Aarch64 version). However the Cortex-X2 and Cortex-A510 announced by ARM almost a year ago don't support Aarch32, so Aarch32 is a second-class citizen already, and I expect that there will be no hardware support on general-purpose computers for it in the not-too-distant future.

Personal experience: I just tried to run an EABI5 binary on all four ARMv8-A machines (with various distributions) we have around. On three I get "no such file or directory" (apparently the kernel does not understand the binary at all), the fourth (a Raspi4 with 64-bit Debian 10) eventually chokes on a missing library. It seems that Aarch32 is not very important for 64-bit Linux distributions.

Concerning the SCTLR_ELx.A bit, IA-32 and AMD64 have a similar bit since the 486, which I tried to use (for portability checking in a development environment), but had to give up on, because on IA-32 the ABI puts doubles at 4-byte boundaries, and the flag would cause fault on such accesses. Another attempt with AMD64 failed because gcc produces unaligned accesses from pairs of user-written aligned accesses. So if Linux has not set SCTLR_ELx.A in the past, setting it now would probably cause quite a bit of breakage.

Concerning atomics, they are no excuse for breaking code that does not perform atomic accesses (I doubt that the auto-vectorizer dares auto-vectorizing atomics).

ARMv8-M is irrelevant for general-purpose computers. To those who think it has anything to do with ARMv8-A: it has not. E.g., there is no Aarch64 (the headline feature of ARMv8-A) in ARMv8-M. Yes ARM's naming is confusing.

GCC 12.1 Released

Posted May 18, 2022 14:23 UTC (Wed) by excors (subscriber, #95769) [Link] (1 responses)

> In general-purpose computers, the transition to ARMv8-A has happened quite a while ago (e.g., with Raspi3 in 2016).
>
> However, maybe it has more to do with the instruction set. In that case, Aarch32 seems to be pretty alive on RaspiOS (although even they have started releasing an Aarch64 version).

True, my previous comment should have said "ARMv8-A AArch64" (not "ARMv8-A") - the rules for ARMv8-A AArch32 look essentially identical to ARMv7-A, so unaligned LDRD/LDM/etc will fault as you showed in a later comment. (And the compiler will happily transform assumedly-aligned loads into LDRD/LDM.)

> ARMv8-M is irrelevant for general-purpose computers.

Also true (well, assuming you mean the main user-visible processor and ignore the potentially dozens of microcontrollers in the same computer), but I'm not sure "general-purpose computer" is that useful a distinction in practice. There are plenty of libraries originally designed for Linux userspace that are quite usable and useful on higher-end microcontrollers, and it would be a shame if the only thing preventing them from working in that environment was an accidental reliance on misaligned data. It would also be a shame if GCC wasted performance on those microcontrollers by assuming all data might be misaligned and never using LDRD/LDM, given the vast majority of existing code does follow the alignment rules correctly and is currently benefiting from that optimisation. So I believe there's still value in following those alignment rules in new code, for portability to real systems that may realistically want to reuse your code.

GCC 12.1 Released

Posted May 18, 2022 21:07 UTC (Wed) by anton (subscriber, #25547) [Link]

(And the compiler will happily transform assumedly-aligned loads into LDRD/LDM.)
I was somewhat surprised how hard it was to find ldrds in the binary in order to exercise them: only 32 non-sp/fp ldrds and 58 ldms in 19587 instructions. For comparison, an Aarch64 binary of (a later version of) the same program has 257 non-sp/fp ldps in 21745 instructions. By general-purpose I mean the, e.g. Zen3 core that's targeted by free software developers and/or ISVs, not, e.g., AMDs PSPs which are indeed Aarch32 cores last I heard, but which we unfortunately cannot program.
There are plenty of libraries originally designed for Linux userspace that are quite usable and useful on higher-end microcontrollers, and it would be a shame if the only thing preventing them from working in that environment was an accidental reliance on misaligned data.
Indeed, ideally already the GPL prevents them from being used in such locked-down environments. But if gcc maintainers' willingness to break programs hurts the proprietary crowd for a change, that's less of a concern to me than when they hurt free software developers and users.
It would also be a shame if GCC wasted performance on those microcontrollers by assuming all data might be misaligned and never using LDRD/LDM, given the vast majority of existing code does follow the alignment rules correctly and is currently benefiting from that optimisation.
On the contrary, I would find it a shame if programmers who know how to get good performance by using unaligned accesses would slow down their programs in order to cater for gcc's sillyness.

GCC 12.1 Released

Posted May 17, 2022 20:39 UTC (Tue) by anton (subscriber, #25547) [Link]

I have now managed to indeed get a SIGBUS on a Raspi4 by running 32-bit code that uses ldrd with an unaligned address (but regular ldr does not produce such an exception). So SSE/SSE2, ldrd (and friends) are the last die-hards in a general-purpose world dominated by instructions that work with unaligned addresses.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds