"This is just an academic discussion about the standard, since it seems everyone agrees GCC needs to change regardless of whether it presently implements the standard."
I absolutely don't agree with this. Forbidding 64-bit reads and writes for 32-bit scalars that sit within an aligned 64-bit boundary is a pessimization. In C99 it should be something you opt into, for example with the -pthread flag as in current gcc implementations (-pthread does require this pessimization wherever adjacent memory locations would otherwise be corrupted).
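To make the trade-off concrete, here is a rough sketch of what "corrupting an adjacent memory location" means. It is not what gcc actually emits; it is a single-threaded simulation in which the other thread's write is placed by hand between a 64-bit load and the matching 64-bit store. The names (struct pair, simulate_wide_store, simulate_narrow_store) are made up for illustration, and it assumes the two uint32_t members pack into exactly 8 bytes with no padding.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Two adjacent 32-bit scalars; assumed to share one aligned 64-bit word. */
struct pair {
    uint32_t a;
    uint32_t b;
};

/* Wide store to `a`: load all 64 bits, patch `a`, store all 64 bits back.
 * The assignment to p.b in the middle stands in for a second thread that
 * runs between the load and the store. Returns the final value of b. */
uint32_t simulate_wide_store(void)
{
    struct pair p = {0, 0};
    struct pair tmp;
    uint64_t word;

    assert(sizeof(struct pair) == sizeof(uint64_t)); /* layout assumption */

    memcpy(&word, &p, sizeof word);   /* "thread 1": 64-bit load, b is 0 */
    p.b = 7;                          /* "thread 2": updates b meanwhile */

    memcpy(&tmp, &word, sizeof tmp);  /* patch a inside the loaded word */
    tmp.a = 1;
    memcpy(&word, &tmp, sizeof word);

    memcpy(&p, &word, sizeof word);   /* "thread 1": 64-bit store, stale b */
    return p.b;                       /* thread 2's update has been lost */
}

/* Narrow store to `a`: only the 32 bits of `a` are written, so the same
 * interleaving leaves b alone. Returns the final value of b. */
uint32_t simulate_narrow_store(void)
{
    struct pair p = {0, 0};

    p.b = 7;                          /* "thread 2": updates b */
    p.a = 1;                          /* "thread 1": 32-bit store to a only */
    return p.b;
}
```

Calling simulate_wide_store() returns 0 (the write of 7 to b is overwritten by the stale copy), while simulate_narrow_store() returns 7. In a single-threaded program the wide version is harmless, which is why forbidding it everywhere costs something.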
Presumably when C11 is implemented in gcc, a way to opt out of this pessimization will also be available. Or possibly gcc will require some specific flag to be set when a multi-threaded program is being compiled; who knows.