If it really is such low overhead and as useful as you say, Why don't you (alone or with help from others who believe this) demonstrate this for us on a real distro.
take some distro (say ubuntu since it supports multiple architectures) download the repository (when I did this a couple years ago it was 600G, nowdays it's probably larger so it make take $150 or so to buy a USB 2TB drive, it will take you a while to download everything), then create a unified version of the distro, making all the binaries and libraries 'fat' and advertise the result. I'm willing to bet that if you did this as a plain repackaging of ubuntu with no changes you would even be able to get people to host it for you (you may even be able to get Cononical to host it if your repackaging script is simple enough)
I expect that the size difference is going to be larger than you think (especially if you include every architecure that ubuntu supports, not just i486 and AMD64), and this size will end up costing performance as well as having effects like making it hard to create an install CD etc.
I may be wrong and it works cleanly, in which case there will be real ammunition to go back to the kernel developers with (although you do need to show why you couldn't just use additional ELF sections with a custom loader instead as was asked elsewhere)
If you could do this and make a CD like the ubuntu install CD, but one that would work on multiple architectures (say i486, AMD64, powerPC) that would get people's attention. (but just making a single disk that does this without having the rest of a distro to back it up won't raise nearly the interest that you will get if you can script doing this to an entire distro)