I wouldn't worry too much about an initial congestion window of ten packets. On a five mbps bottleneck link with 1500 byte packets that is only about 2.4 ms of queuing delay. The queuing delay due to ack compression as the congestion window increases is probably going to be considerably higher than that.
There seems to me to be only two good ways to solve the queuing latency problem, beyond simply reducing queuing limits on bottleneck routers and interfaces to reasonable sizes. One is the widespread deployment of packet pacing, which is difficult to do well without hardware support, and which has other challenges. The other is fair (flow specific) queuing at every bottleneck router or interface. The latter seems much more practical to me.