Well, that's why we call it a heuristic :-) It can be helpful even if it's not perfect. A really tough case is flows that can scale their bandwidth requirements but value latency over throughput -- for something like VNC or live video you'd really like to use all the bandwidth you can get, but latency is more important. (I maintain a program like this, bufferbloat kicks its butt :-(.) These should just go ahead and set explicit QoS bits.
Obviously the first goal should be to minimize latency in general, though.