depending on the application, the fact that pointers and memory addresses change from 32 bits to 64 bits can actually slow the system significantly.
the larger footprint uses more CPU cache, making the system spend more time waiting for the cache to be updated from memory.
this is why many of the chips that have both 32 bit and 64 bit modes tend to run 64 bit kernels with 32 bit userspace: for programs that don't need to address more than 4G of RAM, the overhead of the larger data objects results in a slowdown.
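to put rough numbers on the footprint argument, here is a minimal sketch using Python's ctypes module to lay out the same linked-list node twice, once with a 32 bit link and once with a 64 bit link. The names Node32, Node64 and nodes_per_cache_line are made up for illustration, the links are plain integers standing in for pointers, and the 64 byte cache line is an assumption (common on x86, but not universal).

```python
import ctypes

# Assumed cache line size; common on x86 hardware, but not universal.
CACHE_LINE_BYTES = 64


class Node32(ctypes.Structure):
    """Hypothetical list node whose 'next' link is 32 bits wide."""
    _fields_ = [("next", ctypes.c_uint32), ("value", ctypes.c_int32)]


class Node64(ctypes.Structure):
    """The same node with a 64 bit 'next' link."""
    _fields_ = [("next", ctypes.c_uint64), ("value", ctypes.c_int32)]


def nodes_per_cache_line(node_type, line_bytes=CACHE_LINE_BYTES):
    """Return how many whole nodes of this type fit in one cache line."""
    return line_bytes // ctypes.sizeof(node_type)


if __name__ == "__main__":
    for node_type in (Node32, Node64):
        print(
            node_type.__name__,
            ctypes.sizeof(node_type), "bytes,",
            nodes_per_cache_line(node_type), "per cache line",
        )
```

the payload is identical in both nodes, only the link grows. On a typical x86-64 build the 64 bit node also picks up alignment padding, so the same cache line holds noticeably fewer of them, which is exactly the extra cache traffic described above.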
x86/AMD64 is pretty much unique in that 64 bit mode doesn't just extend the registers to 64 bits, it also gives you twice as many registers to work with (16 general-purpose registers instead of 8). Since the x86 platform has far fewer registers than more modern designs, the availability of more registers means that far more things can happen in the CPU itself, without having to save values out to the cache (or worse yet, to RAM) and load them back in a constant shuffle to get register space for the things that need it. x86 systems spend a HUGE amount of time doing this register shuffle.
the idea behind the x32 architecture is to be able to take advantage of these extra registers (which almost always result in improved performance) without having to pay the overhead of larger pointers to memory.
the fact that many 32 bit applications that are not 64 bit clean can be made to run in this mode is pure gravy, and if the time change takes place, this may be sacrificed in order to get a better long-term x32 architecture.