
Turning off caching

Posted Aug 8, 2013 9:09 UTC (Thu) by epa (subscriber, #39769)
Parent article: A survey of memory management patches

I can't really see why dropping already cached pages would be helpful, but when working with large files which you will scan sequentially, it is useful to stop the newly read pages being cached. (If the file is bigger than memory, then by the time you get to the end of the file the start of it is no longer in cache, so you have to read it all over again next time. So caching doesn't give any performance benefit, and it would be better to use that memory for other things.)

Is there an option to open a file and specify that newly read pages should not be added to the cache?



Turning off caching

Posted Aug 8, 2013 12:13 UTC (Thu) by Funcan (subscriber, #44209) [Link]

man fadvise64 and look for the POSIX_FADV_SEQUENTIAL flag

Turning off caching

Posted Aug 8, 2013 14:58 UTC (Thu) by sbohrer (subscriber, #61058) [Link]

POSIX_FADV_SEQUENTIAL? I don't think that does what the previous poster asked, but I've been surprised before by what the various fadvise flags _actually_ do. POSIX_FADV_NOREUSE sounds like it might avoid caching pages, but the man page I'm looking at claims that in 2.6.18+ it is a no-op.

I am certain that POSIX_FADV_DONTNEED drops pages from the page cache, but it doesn't apply to future pages. In other words, you have to call it periodically on pages you have already read or written, which is somewhat annoying. The other gotcha for writes is that POSIX_FADV_DONTNEED doesn't drop dirty pages from the page cache; it only initiates writeback, so you have to call it twice for each possibly dirty page range if you really want those pages dropped. I currently use this for write-once files, or for files that I know will no longer be in the page cache by the next time I need them.
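
A minimal sketch of that pattern in Python (3.3+ on Linux, where `os.posix_fadvise` is available). The function names and the 1 MiB chunk size are made up for illustration. Instead of calling POSIX_FADV_DONTNEED twice on possibly dirty ranges, the write path below swaps in an `os.fdatasync()` before the advice, so the pages are already clean when the kernel is asked to drop them. That is a different technique from the one described above, and syncing after every block costs throughput.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def read_without_caching(path, chunk=CHUNK):
    """Read path sequentially and ask the kernel to drop each chunk after use.

    Returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Hint that the file will be read front to back.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        while True:
            data = os.read(fd, chunk)
            if not data:
                break
            # ... process data here ...
            # DONTNEED only affects pages that are already cached, so it is
            # issued after the read, for the range that was just consumed.
            os.posix_fadvise(fd, total, len(data), os.POSIX_FADV_DONTNEED)
            total += len(data)
    finally:
        os.close(fd)
    return total


def write_without_caching(path, blocks):
    """Write the byte strings in blocks to path, dropping each range afterwards.

    Returns the number of bytes written.
    """
    total = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for block in blocks:
            view = memoryview(block)
            while len(view) > 0:
                written = os.write(fd, view)
                view = view[written:]
            # Assumption of this sketch: flush first so the pages are clean,
            # because DONTNEED does not discard dirty pages.
            os.fdatasync(fd)
            os.posix_fadvise(fd, total, len(block), os.POSIX_FADV_DONTNEED)
            total += len(block)
    finally:
        os.close(fd)
    return total
```

The advice is only a hint: the kernel is free to keep pages, and on some filesystems the call may succeed without doing anything.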

Turning off caching

Posted Aug 11, 2013 1:28 UTC (Sun) by giraffedata (subscriber, #1954) [Link]

I don't know exactly what Linux's current page replacement policy is, but this problem of sequential read of a file too big to fit in cache pushing other stuff out of cache as it goes, called a cache sweep, was solved long ago. The kernel should detect that this is happening and stop caching that file before it does much harm, and I presume that it does. That would explain why Linux doesn't do anything special with POSIX_FADV_NOREUSE.

I know that even before modern cache sweep protection was invented, Linux avoided much of the pain by using a version of second-chance, so that these pages, since they were referenced only once, would be the first to be evicted and most of the pages that would actually be referenced again would remain.

Turning off caching

Posted Aug 8, 2013 15:11 UTC (Thu) by sbohrer (subscriber, #61058) [Link]

I'm _not_ a user who calls drop_caches to solve my problems, but I've surely been tempted to do it. If you rarely read back any files, or if you don't care about read performance when you do, then caching file pages hurts the performance of the things you do care about. As an example, we have systems that simply log several hundred GB of data during the day, and that data is backed up in the evenings. The page cache is essentially useless on these machines, since most of the files are bigger than RAM and we really don't care about the read performance as long as the old data is off before the next day starts. On the other hand, we do care about write performance/latency, and as soon as the page cache fills up you can start experiencing write stalls as old pages are dropped and new pages are allocated.

Turning off caching

Posted Aug 8, 2013 15:49 UTC (Thu) by etienne (subscriber, #25256) [Link]

> I'm _not_ a user who calls drop_caches to solve my problems

I am such a user, but my problem is to check that the device that I have just written (FLASH storage partition) has been correctly written (i.e. the FLASH device driver worked) - so I want to really read back from the FLASH partition and compare to what it should be (and see if there are uncorrected read errors)...
It would be nice to have an interface to drop the cache on a single device...

Turning off caching

Posted Aug 8, 2013 19:10 UTC (Thu) by sciurus (subscriber, #58832) [Link]

Won't unmounting it drop the cache for that device?

Turning off caching

Posted Aug 9, 2013 4:17 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Would "-o remount,ro" work? Unmounting might be a little too disruptive. Of course, I'm not sure what happens when the filesystem behind a rw file I'm using gets remounted ro? Is it still writeable until I close it? Hit the next block? Delay the remount?

Turning off caching and switching to read-only

Posted Aug 11, 2013 1:43 UTC (Sun) by giraffedata (subscriber, #1954) [Link]

Switching a filesystem image read-only cleans the cache, but does not purge it. Thus, when you next read the file and see the correct data, that is no proof that the kernel correctly wrote to the device, which is what the OP wants. For that, you need to purge the cache and then read.

As for what happens when you switch to read-only while writing to a file is in progress: The mount() system call to switch to read-only fails. It fails if any file is open for writing.

And I'll tell you when else it fails, which causes no end of pain: when there's an unlinked file (a file not in any directory) in the filesystem. Because the kernel must update the filesystem when the file eventually closes (because it must delete the file at that time), the kernel cannot allow the switch to r/o.

Turning off caching

Posted Aug 9, 2013 8:43 UTC (Fri) by etienne (subscriber, #25256) [Link]

No, because the partition is not mounted.
On most embedded systems, you have two sets of each partition, and you update the whole unused partition by copying the device itself. That device image may contain a filesystem, or just a CPIO, or just a binary file such as the data to initialise the FPGA or the image of the Linux kernel (U-boot cannot read filesystem content).
So you copy the whole partition, check that there is no error writing, drop the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition.
Unlike a PC, there isn't any software recovery in case of failure and no expensive (in terms of PCB space) recovery FLASH; the only recovery is to plug in an external JTAG adapter, and it is slow.
Most cards I use have two copies of U-boot, and all of them have two Device Trees.

Turning off caching

Posted Aug 9, 2013 16:28 UTC (Fri) by jimparis (subscriber, #38647) [Link]

> So you copy the whole partition, check that there is no error writing, drop the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition.

Why don't you just use O_DIRECT?
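
For illustration, here is a rough sketch of a read-back checksum with O_DIRECT in Python on Linux. The function names and the 1 MiB block size are hypothetical. O_DIRECT needs an aligned buffer, so the sketch reads into an anonymous `mmap`, which is page-aligned. It assumes the block size is a multiple of the device's logical block size, and it treats a short read as end of file or device. If the filesystem rejects O_DIRECT, it falls back to an ordinary buffered read, in which case the result proves nothing about what is on the medium.

```python
import hashlib
import mmap
import os

BLOCK = 1 << 20  # assumed to be a multiple of the device's logical block size


def _sha1_of(path, flags, block):
    """Hash path with SHA-1, reading block bytes at a time with the given flags."""
    digest = hashlib.sha1()
    buf = mmap.mmap(-1, block)  # anonymous mapping: page-aligned memory
    try:
        fd = os.open(path, flags)
        with os.fdopen(fd, "rb", buffering=0) as f:
            while True:
                n = f.readinto(buf)
                if not n:
                    break
                digest.update(buf[:n])
                if n < block:
                    # Short read: taken to mean end of file or device.
                    break
    finally:
        buf.close()
    return digest.hexdigest()


def sha1_bypassing_cache(path, block=BLOCK):
    """Return (hex digest, used_direct) for the file or device at path."""
    try:
        return _sha1_of(path, os.O_RDONLY | os.O_DIRECT, block), True
    except OSError:
        # O_DIRECT not supported here; this read may be served from cache.
        return _sha1_of(path, os.O_RDONLY, block), False
```

A caller would compare the returned digest with the checksum of the image that was supposed to be written, and treat `used_direct == False` as "not verified against the device".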

Turning off caching

Posted Aug 11, 2013 2:02 UTC (Sun) by giraffedata (subscriber, #1954) [Link]

> > So you copy the whole partition, check that there is no error writing, drop the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition.

> Why don't you just use O_DIRECT?

One good reason is that then you don't get all the benefits of caching. There's a good reason systems normally write through the buffer/cache, and it probably applies here: you want the kernel to be able to choose the order and size of writes to the device, independent of the order and size of writes by the application, for speed and such.

But I remember using an ioctl(BLKFLSBUF) to purge just the cache of a particular device, for speed testing; that's a lot less reckless than dropping every cached piece of information from the entire system. I wonder if that still works.
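
A sketch of what that call might look like from Python, assuming Linux. The request number is computed the way `<linux/fs.h>` defines it, `_IO(0x12, 97)`; the function name is made up, and any device path passed to it (for example something like `/dev/mtdblock3`) is the caller's assumption. The ioctl normally needs root (CAP_SYS_ADMIN), and I believe `blockdev --flushbufs` issues the same request from the shell.

```python
import fcntl
import os

# <linux/fs.h>: #define BLKFLSBUF _IO(0x12, 97)
# _IO(type, nr) has no direction or size bits, so it is (type << 8) | nr.
BLKFLSBUF = (0x12 << 8) | 97


def flush_device_buffers(device):
    """Ask the kernel to flush and invalidate its buffers for one block device.

    device is a path to a block device node; needs sufficient privileges.
    """
    fd = os.open(device, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKFLSBUF, 0)
    finally:
        os.close(fd)
```

Pages that are still mapped or otherwise in use may survive the call, so this is not a hard guarantee that the next read hits the device.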

Turning off caching

Posted Sep 14, 2013 6:45 UTC (Sat) by Spudd86 (guest, #51683) [Link]

Do you care about the performance of open()? Then don't drop the caches, because that drops the dentry cache too, so the kernel will have to hit the disk for every directory in your path.
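
As far as I understand the kernel's sysctl documentation, how much gets dropped depends on the value written: 1 frees the page cache only, 2 frees reclaimable slab objects (dentries and inodes), and 3 frees both. A small sketch, with a hypothetical helper name and constants; the `control` parameter exists only so the function can be pointed at a scratch file when trying it out without root.

```python
import os

DROP_PAGECACHE = 1  # page cache only
DROP_SLAB = 2       # dentries and inodes
DROP_ALL = 3        # both of the above


def drop_caches(mode=DROP_PAGECACHE, control="/proc/sys/vm/drop_caches"):
    """Write mode to the drop_caches control file (root is needed for the real one)."""
    if mode not in (DROP_PAGECACHE, DROP_SLAB, DROP_ALL):
        raise ValueError("mode must be 1, 2 or 3")
    # Dirty pages are not freed by drop_caches, so write them back first.
    os.sync()
    with open(control, "w") as f:
        f.write("%d\n" % mode)
```

With mode 1 the dentry cache should stay intact, which would avoid the open() penalty, though the whole page cache of the system is still discarded.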

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds