My first thought was the following. Is this the same thing you are saying?
'y' could be some scale value, such as '1' (no scaling) or '0.5'.
The idea here is that if this file handle is having pages ejected, the read-ahead is limited to the number of pages still in RAM that fit within the read-ahead window.
So, at the first read from offset 0, the full window is available. If half of those pages are ejected before they are referenced, the pages-still-in-RAM value is now 1/2 of the read-ahead window, so the next read-ahead window is limited to 1/2 size. If 1/2 of the pages read are also ejected from the head of the file, the read-ahead window would then be 1/4 of the maximum window size (because 1/4 were ejected from the front and 1/2 from the end, leaving us 1/2 of the way into the window with only the most recent 1/4 still in RAM).
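As a rough model of that rule (this is just a sketch; the function and parameter names are hypothetical, and the real state would live in the kernel's readahead bookkeeping):

```python
def scaled_readahead(max_window, resident_pages_before_offset):
    """Hypothetical model: limit the next read-ahead window to the
    number of previously read pages, within the maximum window behind
    the current offset, that are still resident in RAM."""
    return min(max_window, resident_pages_before_offset)

# Maximum read-ahead window, in pages (arbitrary example value):
window = 16

# First read at offset 0: nothing has been ejected yet, full window.
print(scaled_readahead(window, 16))   # 16: full window available
# Half the pages were ejected before being referenced:
print(scaled_readahead(window, 8))    # 8: half survived, half window
# Another ejection round leaves only the most recent quarter resident:
print(scaled_readahead(window, 4))    # 4: quarter window
```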
Referencing our editor's drawings, using this algorithm in the first example would read ahead only 7 blocks into the window (instead of 13, because there are 7 blocks prior to the current offset), and 3 (instead of 2) in the second example.
With an example file usage pattern like RRD's, a seek to the middle of the file would read ahead only 1 block. But even given a 'dd'-style pattern starting from the middle of a file, the read-ahead window would at least grow exponentially (1, 2, 4, 8, ...) until the full window size was saturated, somewhat similar to TCP's slow start.
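The slow-start-like growth can be sketched like this (again a hypothetical model, not kernel code; the doubling rule and the window size of 16 are assumptions for illustration):

```python
def grow_window(current, max_window):
    """Double the read-ahead window on each sequential hit until it
    saturates at the maximum, similar to TCP's slow start (sketch)."""
    return min(max(1, current * 2), max_window)

# A 'dd'-style sequential read starting cold in the middle of a file:
sizes, w = [], 1
while w < 16:
    sizes.append(w)
    w = grow_window(w, 16)
sizes.append(w)
print(sizes)   # [1, 2, 4, 8, 16]
```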
This presumes, of course, that it is cheap to determine which contiguous blocks prior to the current offset are still resident in RAM. But I wonder if something like that would be a simple (and workable) compromise. The maximum read-ahead size could then grow to be very large, helping the SSD scenario.
Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds