LWN: Comments on "A survey of memory management patches" http://lwn.net/Articles/562211/
This is a special feed containing comments posted to the individual LWN article titled "A survey of memory management patches".

A survey of memory management patches http://lwn.net/Articles/567129/rss 2013-09-17T21:49:30+00:00 proski
<div class="FormattedComment"> I'd rather have several thresholds: when non-privileged user syscalls start failing, when root syscalls start failing, when processes start to be killed, when kmalloc() starts failing.<br> </div>

Turning off caching http://lwn.net/Articles/566807/rss 2013-09-14T06:45:51+00:00 Spudd86
<div class="FormattedComment"> Do you care about the performance of open()? Then don't drop the caches, because that drops the dentry cache too, so the kernel will have to hit the disk for every directory in your path.<br> </div>

A survey of memory management patches http://lwn.net/Articles/563314/rss 2013-08-13T22:25:13+00:00 mathstuf
<div class="FormattedComment"> I'm pretty sure that was sarcasm :) .<br> </div>

A survey of memory management patches http://lwn.net/Articles/563044/rss 2013-08-12T03:58:11+00:00 thedevil
<div class="FormattedComment"> "But, naturally, that will not be a problem since all user-space code<br> diligently checks the return status of every system call and responds<br> with well-tested error-handling code when things go wrong."<br> <p> LOL that's a good one.<br> <p> In fact, I wonder if this is going to lead to another episode of the<br> "Linus vetoes a change for breaking broken user space code" saga.<br> </div>

A survey of memory management patches http://lwn.net/Articles/563026/rss 2013-08-12T00:09:27+00:00 WanpengLi
<div class="FormattedComment"> Then you replace it with fallocate or MADV_WILLNEED?<br> </div>

Turning off caching http://lwn.net/Articles/562969/rss 2013-08-11T02:02:42+00:00 giraffedata
<blockquote> <blockquote> So you copy the whole partition, check that there is no error writing, drop
the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition. </blockquote> Why don't you just use O_DIRECT? </blockquote>
<p> One good reason is that you then lose all the benefits of caching. There's a good reason systems normally write through the buffer/cache, and it probably applies here: you want the kernel to be able to choose the order and size of writes to the device, independent of the order and size of writes by the application. For speed and such.
<p> But I remember using an ioctl(BLKFLSBUF) to purge just the cache of a particular device, for speed testing; that's a lot less reckless than dropping every cached piece of information from the entire system. I wonder if that still works.

Turning off caching and switching to read-only http://lwn.net/Articles/562968/rss 2013-08-11T01:43:34+00:00 giraffedata
<p>Switching a filesystem image read-only <em>cleans</em> the cache, but does not purge it. Thus, when you next read the file and see the correct data, that is no proof that the kernel correctly wrote to the device, which is what the OP wants. For that, you need to purge the cache and then read.
<p>As for what happens when you switch to read-only while a write to a file is in progress: the mount() system call that switches to read-only fails. It fails if any file is open for writing.
<p> And I'll tell you when else it fails, which causes no end of pain: when there's an unlinked file (a file not in any directory) in the filesystem. Because the kernel must update the filesystem when the file eventually closes (because it must delete the file at that time), the kernel cannot allow the switch to r/o.
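A minimal sketch of the per-device purge via ioctl(BLKFLSBUF) that giraffedata recalls above, written in Python for brevity. The function name `purge_device_cache` and the device path in the usage note are hypothetical; the constant is the value of `_IO(0x12, 97)` from `<linux/fs.h>`, which Python's `fcntl` module does not export. The call normally needs root, and this sketch does not verify how completely current kernels empty the cache for the device.

```python
import fcntl
import os

# BLKFLSBUF is _IO(0x12, 97) in <linux/fs.h>; spelled out here because
# Python's fcntl module does not provide it.
BLKFLSBUF = (0x12 << 8) | 97  # 0x1261


def purge_device_cache(device_path):
    """Ask the kernel to flush and invalidate cached buffers of one block device.

    Hypothetical helper for illustration. Raises OSError if the path cannot
    be opened or the ioctl is refused (for example, when not run as root).
    """
    fd = os.open(device_path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, BLKFLSBUF, 0)
    finally:
        os.close(fd)
```

In the flash-update scenario discussed in this thread, one would call something like `purge_device_cache("/dev/mtdblock3")` (an assumed device name) after writing the partition and before reading it back to compute the checksum, so that the read has to come from the device rather than from cached pages.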

Turning off caching http://lwn.net/Articles/562966/rss 2013-08-11T01:28:50+00:00 giraffedata
I don't know exactly what Linux's current page replacement policy is, but this problem, in which a sequential read of a file too big to fit in cache pushes other stuff out of cache as it goes (called a cache sweep), was solved long ago. The kernel should detect that this is happening and stop caching that file before it does much harm, and I presume that it does. That would explain why Linux doesn't do anything special with POSIX_FADV_NOREUSE.
<p> I know that even before modern cache sweep protection was invented, Linux avoided much of the pain by using a version of second-chance, so that these pages, since they were referenced only once, would be the first to be evicted and most of the pages that would actually be referenced again would remain.

A survey of memory management patches http://lwn.net/Articles/562939/rss 2013-08-10T15:14:44+00:00 luto
<div class="FormattedComment"> MADV_WILLWRITE appears to be unnecessary for my application after all, so I'm unlikely to develop it further.
If anyone else has a use for it, please speak up.<br> </div>

Turning off caching http://lwn.net/Articles/562876/rss 2013-08-09T16:28:42+00:00 jimparis
<div class="FormattedComment"> <font class="QuotedText">&gt; So you copy the whole partition, check that there is no error writing, drop the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition.</font><br> <p> Why don't you just use O_DIRECT?<br> </div>

Turning off caching http://lwn.net/Articles/562807/rss 2013-08-09T08:43:32+00:00 etienne
<div class="FormattedComment"> No, because the partition is not mounted.<br> On most embedded systems, you have two sets of each partition, and you update the whole unused partition by copying the device itself (that device image may contain a filesystem or just a CPIO or just a binary file like an image of the data to initialise the FPGA or the image of the Linux kernel (U-boot cannot read filesystem content)).<br> So you copy the whole partition, check that there is no error writing, drop the cache, read it back and check there is no error reading, and check the checksum/SHA1 of the whole partition.<br> Unlike on a PC, there isn't any software recovery in case of failure and no expensive (in terms of PCB space) recovery FLASH; the only recovery is to plug in an external JTAG adapter, and it is slow.<br> Most cards I use have two copies of U-boot; all of them have two Device Trees.<br> </div>

Turning off caching http://lwn.net/Articles/562782/rss 2013-08-09T04:17:34+00:00 mathstuf
<div class="FormattedComment"> Would "-o remount,ro" work? Unmounting might be a little too disruptive. Of course, I'm not sure what happens when the filesystem behind a rw file I'm using gets remounted ro. Is it still writeable until I close it? Hit the next block?
Delay the remount?<br> </div>

Turning off caching http://lwn.net/Articles/562710/rss 2013-08-08T19:10:26+00:00 sciurus
<div class="FormattedComment"> Won't unmounting it drop the cache for that device?<br> </div>

Turning off caching http://lwn.net/Articles/562675/rss 2013-08-08T15:49:54+00:00 etienne
<div class="FormattedComment"> <font class="QuotedText">&gt; I'm _not_ a user who calls drop_caches to solve my problems</font><br> <p> I am such a user, but my problem is to check that the device that I have just written (FLASH storage partition) has been correctly written (i.e. the FLASH device driver worked), so I want to really read back from the FLASH partition and compare to what it should be (and see if there are uncorrected read errors)...<br> It would be nice to have an interface to drop the cache on a single device...<br> </div>

Turning off caching http://lwn.net/Articles/562669/rss 2013-08-08T15:11:45+00:00 sbohrer
<div class="FormattedComment"> I'm _not_ a user who calls drop_caches to solve my problems, but I've surely been tempted to do it. If you rarely read back any files, or you don't care about read performance when you do, then caching file pages hurts the performance of the things you do care about. As an example, we have systems that simply log several hundreds of GB of data during the day, and that data is backed up in the evenings. The page cache is essentially useless on these machines, since most of the files are bigger than RAM and we really don't care about the read performance as long as the old data is off before the next day starts. On the other hand, we do care about write performance/latency, and as soon as the page cache fills up you can start experiencing write stalls as old pages are dropped and new pages are allocated. <br> </div>

Turning off caching http://lwn.net/Articles/562667/rss 2013-08-08T14:58:56+00:00 sbohrer
<div class="FormattedComment"> POSIX_FADV_SEQUENTIAL?
I don't think that does what the previous poster asked, but I've been surprised by what the various fadvise flags _actually_ do before. POSIX_FADV_NOREUSE sounds like it might avoid caching pages, but the man page I'm looking at claims that in 2.6.18+ it is a no-op.<br> <p> I am certain that POSIX_FADV_DONTNEED drops pages from the page cache, but it doesn't work for future pages. In other words, you have to periodically call it on pages you've previously read or written, which is somewhat annoying. The other gotcha for writes is that POSIX_FADV_DONTNEED doesn't drop dirty pages from the page cache; it only initiates writeback, so you have to call it twice for each possibly dirty page range if you really want those pages dropped. I currently use this for write-once files or files that I know will no longer be in the page cache by the next time I'm going to need them. <br> </div>

Turning off caching http://lwn.net/Articles/562625/rss 2013-08-08T12:13:39+00:00 Funcan
<div class="FormattedComment"> man fadvise64 and look for the POSIX_FADV_SEQUENTIAL flag<br> </div>

Turning off caching http://lwn.net/Articles/562587/rss 2013-08-08T09:09:51+00:00 epa
<div class="FormattedComment"> I can't really see why dropping already cached pages would be helpful, but when working with large files which you will scan sequentially, it is useful to stop the newly read pages being cached. (If the file is bigger than memory, then by the time you get to the end of the file the start of it is no longer in cache, so you have to read it all over again next time. So caching doesn't give any performance benefit, and it would be better to use that memory for other things.)<br> <p> Is there an option to open a file and specify that newly read pages should not be added to the cache?<br> </div>

A survey of memory management patches http://lwn.net/Articles/562562/rss 2013-08-08T07:14:12+00:00 xorbe
<div class="FormattedComment"> Often when a disk fills up, it isn't actually full.
The system usually reserves a few percent for the OS to operate.<br> <p> How come memory isn't treated the same way? I have 16GB, start killing user processes when 256MB free is reached ... lots of hard problems avoided?<br> </div>

A survey of memory management patches http://lwn.net/Articles/562546/rss 2013-08-08T03:19:17+00:00 naptastic
<div class="FormattedComment"> A customer told me yesterday that he's been a sysadmin for 20 years, and just needed me to tell him how to set his IP address, netmask, and gateway. In terms of dedicated server customers I've dealt with, he's maybe a little below average.<br> <p> I would *LOVE* to be able to "grep dump_cache /var/log/messages" and find out who thought that would be a good idea. It would be, for me, a welcome addition to the noise of BIND, FTPd, and all the rest.<br> </div>
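For readers who want to try the POSIX_FADV_DONTNEED behaviour sbohrer describes in this thread, here is a rough Python sketch for a write-once file. The helper name `write_without_caching` is hypothetical. Instead of calling the advice twice as that comment suggests, this sketch assumes it is acceptable to call fsync() first so that the pages are already clean when the advice is given; that trades some write latency for a single call, and the advice remains only a hint to the kernel.

```python
import os


def write_without_caching(path, chunks):
    """Write chunks to path, then hint that its cached pages are not needed.

    Hypothetical helper for illustration. Returns the number of bytes written.
    """
    total = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            view = memoryview(chunk)
            while view:
                # os.write may write fewer bytes than requested; loop until done.
                written = os.write(fd, view)
                view = view[written:]
                total += written
        # DONTNEED does not drop dirty pages, so write them back first.
        os.fsync(fd)
        # offset=0 with length=0 covers the file from start to end.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return total
```

A long-running logger would more likely apply the same fsync-then-advise step to each completed range of the file as it goes, since the advice only affects pages that are already in the cache at the time of the call.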