We can make the reasonable assumption that a media system is dealing with 16-bit 44.1 kHz stereo audio.
I disagree on the need for 64 bits. 32-bit floats already have 23 bits of precision in the mantissa, and plenty of range in the exponent. Given the typical quantization to just 16 bits, it is hard to argue for the need of more intermediate precision.
I agree it's not necessarily the extra code that hurts me (although I do find gstreamer's modules to be pretty gory, and the use of ORC for a trivial volume scaler was astonishing); what I perceive as poor architecture hurts me. Especially the audio pipeline looks pretty strange to me as an audio person. The need to specify conversions explicitly is baffling. How does one even know that in the future some format doesn't get added or removed from a module, thus either requiring the addition of a conversion step, or making the specified conversion unnecessary and potentially even harmful?
I am convinced that a more ideal audio pipeline would automatically convert between buffer types where necessary and possible, and that audio processing would always imply promotion to some general high-fidelity type, be it integer or float (might be compile-time switch) so that at most there is one promotion and one demotion within any reasonable pipeline.