When streaming video on small embedded systems (small meaning 32 MB of RAM, a slow processor, and no MMU), kernel read-ahead and write-behind turn out to be problematic. They cause too much memory pressure, and they do it in a rather lumpy way. The lumpiness is the real problem: memory allocations start failing, and the I/O rate varies widely from one second to the next.
But without read-ahead and write-behind, latency is too high unless asynchronous I/O is used to keep the I/O queues full. On Linux, asynchronous I/O only really works with direct I/O (files opened with O_DIRECT). So we're dabbling in asynchronous direct I/O for video streaming on small devices, to make it more reliable.
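
To make the idea concrete, here is a rough sketch of what "asynchronous direct I/O that keeps the queue full" can look like with the Linux kernel AIO syscalls. It is not our actual streaming code. The names `stream_file`, `submit_read`, `CHUNK`, `DEPTH` and `ALIGN` are made up for illustration, the sizes are guesses rather than tuned values, and the fallback to buffered I/O is there only so the sketch still runs on filesystems that reject O_DIRECT. glibc provides no wrappers for these syscalls, so they are called through syscall(2).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes O_DIRECT in <fcntl.h> */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/aio_abi.h>       /* aio_context_t, struct iocb, struct io_event */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

#define CHUNK (64 * 1024)        /* assumed read size; a multiple of ALIGN */
#define DEPTH 4                  /* assumed number of reads kept in flight */
#define ALIGN 4096               /* assumed to satisfy O_DIRECT alignment rules */

/* Queue one CHUNK-sized read at offset off; slot comes back in io_event.data. */
static int submit_read(aio_context_t ctx, struct iocb *cb, int fd,
                       void *buf, long long off, unsigned slot)
{
    struct iocb *list[1] = { cb };
    memset(cb, 0, sizeof *cb);
    cb->aio_data = slot;
    cb->aio_lio_opcode = IOCB_CMD_PREAD;
    cb->aio_fildes = (uint32_t)fd;
    cb->aio_buf = (uint64_t)(uintptr_t)buf;
    cb->aio_nbytes = CHUNK;
    cb->aio_offset = off;
    return syscall(SYS_io_submit, ctx, 1L, list) == 1 ? 0 : -1;
}

/* Read a whole file while keeping up to DEPTH reads queued.
 * Returns the number of bytes read, or -1 on error. */
long long stream_file(const char *path)
{
    struct iocb cbs[DEPTH];
    void *bufs[DEPTH] = { 0 };
    aio_context_t ctx = 0;
    long long next = 0, total = 0;
    int inflight = 0, done = 0;  /* done: EOF or an error has been seen */
    unsigned i;

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);   /* illustration only: no O_DIRECT support */
    if (fd < 0)
        return -1;
    if (syscall(SYS_io_setup, DEPTH, &ctx) < 0) {
        close(fd);
        return -1;
    }

    /* Prime the queue with DEPTH reads at consecutive offsets. */
    for (i = 0; i < DEPTH; i++) {
        if (posix_memalign(&bufs[i], ALIGN, CHUNK) != 0 ||
            submit_read(ctx, &cbs[i], fd, bufs[i], next, i) != 0) {
            total = -1;
            done = 1;
            break;
        }
        next += CHUNK;
        inflight++;
    }

    /* Each completion frees a slot, which is refilled at once. */
    while (inflight > 0) {
        struct io_event ev;
        if (syscall(SYS_io_getevents, ctx, 1L, 1L, &ev, NULL) != 1) {
            total = -1;
            break;
        }
        inflight--;
        unsigned slot = (unsigned)ev.data;
        assert(slot < DEPTH);
        if (ev.res < 0)
            total = -1;              /* ev.res holds a negated errno */
        else if (total >= 0)
            total += ev.res;         /* a real streamer would consume bufs[slot] here */
        if (ev.res < CHUNK)
            done = 1;                /* short read, EOF or error: stop refilling */
        if (!done) {
            if (submit_read(ctx, &cbs[slot], fd, bufs[slot], next, slot) == 0) {
                next += CHUNK;
                inflight++;
            } else {
                total = -1;
                done = 1;
            }
        }
    }

    syscall(SYS_io_destroy, ctx);    /* cancels or waits for anything still queued */
    for (i = 0; i < DEPTH; i++)
        free(bufs[i]);
    close(fd);
    return total;
}
```

The point of the sketch is that memory use is fixed at DEPTH * CHUNK bytes of buffer that the application owns, rather than whatever the page cache decides to hold at a given moment, while the device still always has several requests queued.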