What about the following simple solution? The buffer is divided into two parts, one with 5% of the capacity and one with 95%. The small part gives low latency (a short queue drains quickly); the large part gives high throughput (it can absorb big bursts). The sender sets a bit in each packet to choose which part the packet is put in. The router serves the two parts in round-robin: one packet from the 1st part, one from the 2nd, one from the 1st, one from the 2nd, and so on.
(As an optimisation, when the 95% part is full and the 5% part still has room, packets destined for the 95% part could be put in the 5% part instead.)
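
To make the idea concrete, here is a minimal sketch of the scheme in Python. The class name `SplitBuffer`, its parameters, and the `allow_spill` flag are made up for illustration; sizes are counted in packets rather than bytes, and when one part is empty the router simply serves the other, which is an assumption the description above does not spell out.

```python
from collections import deque


class SplitBuffer:
    """Hypothetical two-part router buffer served in round-robin."""

    def __init__(self, small_size=5, large_size=95, allow_spill=False):
        # Part 0 is the small (low-latency) part, part 1 the large one.
        self.limits = [small_size, large_size]
        self.queues = [deque(), deque()]
        self.allow_spill = allow_spill  # the optional optimisation
        self.next_part = 0  # which part the round-robin serves next

    def enqueue(self, packet, low_latency):
        """Queue a packet according to its bit; return False if dropped."""
        part = 0 if low_latency else 1
        if len(self.queues[part]) < self.limits[part]:
            self.queues[part].append(packet)
            return True
        # Optimisation: the large part is full but the small one has room.
        if (self.allow_spill and part == 1
                and len(self.queues[0]) < self.limits[0]):
            self.queues[0].append(packet)
            return True
        return False  # tail drop

    def dequeue(self):
        """Return the next packet to send, or None if both parts are empty."""
        for offset in range(2):
            part = (self.next_part + offset) % 2
            if self.queues[part]:
                # Alternate: the other part gets the next turn.
                self.next_part = (part + 1) % 2
                return self.queues[part].popleft()
        return None


buf = SplitBuffer(small_size=2, large_size=3)
buf.enqueue("a1", low_latency=True)
for name in ("b1", "b2", "b3"):
    buf.enqueue(name, low_latency=False)
buf.enqueue("a2", low_latency=True)
order = [buf.dequeue() for _ in range(5)]
print(order)
```

In this sketch a low-latency packet waits behind at most `small_size - 1` packets of its own part, each interleaved with at most one packet from the large part, no matter how full the large part is. Note that with `allow_spill=True` bulk packets can occupy the small part, so a sender could find it full.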