Actually, I didn't say it needs at least 4GB, just that it needs no more than 4GB.
The limiting phase is the WPA step that merges all the types. That is a single process. The actual code generation runs later in the LTO-WHOPR build as multiple processes, so its memory use is limited only by how many parallel jobs (-j*) are used.
The 32bit compiler will need less memory than the 64bit compiler because a large share of the compiler's memory is pointers, and those are only half as big. So there's a good chance that a normal modular build, or a reasonably sized monolithic one, will fit in ~2-3GB, which is the limit on a normal 32bit kernel with a 32bit compiler.
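To make the pointer argument concrete, here is a rough back-of-the-envelope sketch. The function name and the input numbers (a 4GB peak on 64bit, half of it pointers) are made up for illustration, not measurements from any real build. It also assumes that only pointer storage shrinks and everything else stays the same size.

```python
def estimate_32bit_footprint(footprint_64_gb: float, pointer_fraction: float) -> float:
    """Estimate the 32bit footprint, assuming only pointer storage halves."""
    if not 0.0 <= pointer_fraction <= 1.0:
        raise ValueError("pointer_fraction must be between 0 and 1")
    # Memory that is not pointers is assumed to keep its size.
    non_pointer = footprint_64_gb * (1.0 - pointer_fraction)
    # Pointers go from 8 bytes to 4 bytes, so that part halves.
    pointer_32 = footprint_64_gb * pointer_fraction / 2.0
    return non_pointer + pointer_32


# Hypothetical inputs: 4GB peak on 64bit, half of it pointers.
estimate = estimate_32bit_footprint(4.0, 0.5)
print(f"{estimate:.1f} GB")  # prints "3.0 GB"
```

With those assumed inputs the estimate lands at the top of the ~2-3GB range. A lower pointer share would push it over, a higher one would leave more room.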
Very large monolithic builds like allyes will not fit in that limit, so they will not work.