very few desktop apps need more than 4G/process (including the browser, at least with current versions),
however, you are missing significant benefits of x32 vs traditional 32 bit:
added total address space (since you are using a 64 bit kernel)
added per-process memory (4G instead of the 2G or 3G that a 32 bit kernel leaves to userspace)
twice the number of registers
ability to rely on newer CPU features such as SSE2 (compared to the i386 baseline)
64 bit instructions
64 bit time_t
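
To put a number on the time_t item, here is a quick sketch (just arithmetic on the standard signed 32 bit and 64 bit integer ranges, nothing specific to x32) showing where a 32 bit time_t runs out and roughly how far a 64 bit one reaches:

```python
from datetime import datetime, timezone

INT32_MAX = 2**31 - 1   # largest value a signed 32 bit time_t can hold
INT64_MAX = 2**63 - 1   # largest value a signed 64 bit time_t can hold

# Last second representable with a 32 bit time_t, in UTC.
last_32bit_second = datetime.fromtimestamp(INT32_MAX, tz=timezone.utc)
print(last_32bit_second.isoformat())  # 2038-01-19T03:14:07+00:00

# A 64 bit time_t is far outside what datetime can represent,
# so just estimate the range in years.
SECONDS_PER_YEAR = 365.25 * 86400
years_64bit = INT64_MAX / SECONDS_PER_YEAR
print(f"{years_64bit:.1e} years")
```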
moving from x32 to full 64 bit adds:
increased per-process address space (beyond 4G) with the drawback of increased pointer size (and the effects this can have on cache pressure)
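
A rough sketch of the pointer-size effect, under assumptions that are mine and not measured: a hypothetical list node with two pointers and a 4 byte key, natural alignment, and a 64 byte cache line. It only does the size arithmetic, it doesn't benchmark anything.

```python
# Hypothetical node layout: two pointers (next, prev) plus one 4 byte key.
CACHE_LINE = 64  # bytes; an assumed, commonly seen x86 cache line size

def node_size(pointer_bytes, n_pointers=2, payload_bytes=4):
    """Size of the node after padding it up to pointer alignment."""
    raw = n_pointers * pointer_bytes + payload_bytes
    align = pointer_bytes
    return (raw + align - 1) // align * align

def nodes_per_line(pointer_bytes):
    """How many whole nodes fit in one cache line."""
    return CACHE_LINE // node_size(pointer_bytes)

for abi, ptr in (("x32", 4), ("x86-64", 8)):
    print(abi, node_size(ptr), "bytes/node,",
          nodes_per_line(ptr), "nodes per cache line")
# x32 12 bytes/node, 5 nodes per cache line
# x86-64 24 bytes/node, 2 nodes per cache line
```

With this particular layout the 64 bit node is twice the size, so the same cache holds well under half as many of them. Pointer-light data structures will see much less of a difference.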
The advantages of moving off of pure 32 bit are pretty clear; the only drawback is that x32 is new and has had less testing.
however, when comparing x32 and 64 bit, it's less clear if the overall win is going to be in the address space or the cache efficiency.
unless, as you point out, the system just doesn't have enough RAM to use the additional address space, and that point is probably higher than most people think. I expect it's probably at 8+GB of RAM, not at 4G of RAM. Remember that memory used for disk caches, the kernel, housekeeping apps, etc. can easily use the RAM beyond what the 'main' app of the system uses, even assuming that the 'main' app of the system is a single process.
If the 'main' app of the system is multiple processes rather than a single process with multiple threads, which is a fairly good thing to do when faced with multi-core CPUs anyway, the amount of memory in a 'typical' system before it really needs 64 bit addressing in userspace can easily go beyond 16G.
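
The arithmetic behind those thresholds, as a back-of-the-envelope sketch. The 4G figure for everything outside the main app (kernel, disk cache, housekeeping) is an assumed illustrative number, not a measurement, and the function name is made up for this example.

```python
PER_PROCESS_LIMIT_GB = 4  # address space of one process with 32 bit pointers

def ram_usable_without_64bit_pointers(n_processes, other_gb):
    """Upper bound on the RAM a system can put to use while every
    process stays within the 4G limit. other_gb covers the kernel,
    disk cache and housekeeping apps (an assumed figure)."""
    return n_processes * PER_PROCESS_LIMIT_GB + other_gb

print(ram_usable_without_64bit_pointers(1, 4))  # 8  (single-process main app)
print(ram_usable_without_64bit_pointers(4, 4))  # 20 (main app split into 4 processes)
```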
on the other hand, some people will have apps that really want more address space, even on low-memory machines (it may be more efficient to address the memory sparsely than to add the complication to use it more efficiently)
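
A minimal sketch of what sparse addressing looks like, assuming a Linux-like system where anonymous mappings are zero-filled and physical pages are only allocated when first touched. The slot size and the 256 MiB region are hypothetical and kept small so this runs anywhere; the apps in question would want far more address space than that.

```python
import mmap

SLOT = 8           # bytes per slot (hypothetical record size)
REGION = 1 << 28   # 256 MiB of address space, kept small for the example

# Anonymous mapping: the address range is reserved up front; on Linux the
# kernel typically backs a page with real memory only once it is touched.
table = mmap.mmap(-1, REGION)

def put(key, value):
    """Store value directly at the offset derived from key, no hashing."""
    off = key * SLOT
    table[off:off + SLOT] = value.to_bytes(SLOT, "little")

def get(key):
    """Read the slot for key; never-written slots read back as 0."""
    off = key * SLOT
    return int.from_bytes(table[off:off + SLOT], "little")

put(3, 111)
put(30_000_000, 222)   # about 240 MB away from the first slot
print(get(3), get(30_000_000), get(12345))  # 111 222 0
```

The lookup is a single multiply and a load, with no hash table or tree to maintain, which is the simplicity being bought with address space.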