Predictive per-task write throttling
The Linux VM subsystem attempts to address this problem with a simple form of write throttling. When the number of dirty pages gets too large, a process caught in the act of dirtying a page will be sent off to write out a few pages before being allowed to proceed. This technique slows the dirtying of pages while simultaneously helping to reclaim pages which have already been written to. This write throttling code makes no attempt to penalize any specific process, however; it will happily throttle any process which dirties a page at the wrong time.
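The pre-patch behaviour described above can be sketched in a few lines. This is a toy userspace model, not the kernel's code; the names and constants are illustrative only:

```python
# Toy model of the old, indiscriminate throttling: whichever task happens
# to dirty a page while the global dirty count is over the limit is made
# to write back a batch of pages, regardless of who created the pressure.
DIRTY_LIMIT = 100      # illustrative threshold, not the kernel's value
WRITEBACK_BATCH = 8    # pages a throttled task must clean first

nr_dirty = 0           # global count of dirty pages

def dirty_one_page():
    """Called whenever any task dirties a page."""
    global nr_dirty
    if nr_dirty >= DIRTY_LIMIT:
        # Forced writeback: the caller pays, whoever it is.
        nr_dirty -= min(WRITEBACK_BATCH, nr_dirty)
    nr_dirty += 1

for _ in range(1000):
    dirty_one_page()

# The count hovers at the limit; no task escapes the penalty.
print(nr_dirty)
```

The key point the model captures is the indiscriminate nature of the penalty: the throttling cost lands on whichever task shows up at the wrong moment, not on the task responsible for the dirty-page pressure.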
Andrea Arcangeli has decided to improve the situation with a per-task predictive write throttling patch, currently found in the -mm tree. The patch is surprisingly simple - especially after noting that the bulk of it is involved with setting up the /proc and sysctl control interfaces.
At its core, the patch adds a simple accumulator which keeps an approximate count of the number of pages dirtied by each process over the last five seconds. It then assumes that each process will continue to dirty pages at about the same rate into the future. The "are there too many dirty pages?" calculation is then changed to take this rate into account. The code, thus, is making a guess at what the dirty memory situation will be like in the future, based on what each process is doing. Any process which looks like it will cause too much memory to be dirtied gets to perform writeback for a while, while processes which are not writing to lots of pages are not given that particular chore.
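The predictive scheme might be modelled along these lines. Again, this is a userspace sketch under assumed names and constants; the real patch keeps its counts in the task structure and samples over a five-second window:

```python
# Sketch of per-task predictive throttling: each task's recent dirtying
# rate is tracked, and the "too many dirty pages?" test charges each task
# with the pages it is *expected* to dirty.  Heavy writers get sent off to
# do writeback; light writers pass through untouched.
DIRTY_LIMIT = 100      # illustrative, not the kernel's value
WINDOW = 5.0           # seconds over which the dirtying rate is averaged

class Task:
    def __init__(self, name):
        self.name = name
        self.recent_dirtied = 0   # pages dirtied in the current window

    def rate(self):
        return self.recent_dirtied / WINDOW  # pages per second

nr_dirty = 0           # global count of dirty pages

def should_throttle(task, horizon=WINDOW):
    """Throttle if, continuing at this task's recent rate, its own future
    dirtying would push the system past the dirty limit."""
    predicted = nr_dirty + task.rate() * horizon
    return predicted > DIRTY_LIMIT

copier = Task("big-copy")
copier.recent_dirtied = 400      # a bulk copy dirtying pages furiously
editor = Task("editor")
editor.recent_dirtied = 2        # an interactive task, barely writing

nr_dirty = 60                    # the system already has dirty pages
assert should_throttle(copier)       # the copy does writeback for a while
assert not should_throttle(editor)   # the editor proceeds unimpeded
```

The design choice worth noting is that the prediction is per-task: both tasks see the same global dirty count, but only the one whose own rate projects past the limit is handed the writeback chore.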
Andrea's preliminary results show that, with this patch in place, small, interactive tasks running in competition with a large copy task will run more quickly. Since the copy operation is being made to perform writeback (when it would otherwise have been dirtying more pages), more memory is available for the other tasks in the system. The interesting part of the result is that the copy task runs no slower with this patch in place. A process which is bound by the system's ability to write pages to disk will not benefit from being allowed to dirty the bulk of the system's memory, and it will not suffer by being throttled. So this little patch looks like it could be a winner for everybody involved.
Index entries for this article:
  Kernel: Memory management/Writeout throttling
  Kernel: Write throttling
Predictive per-task write throttling
Posted Sep 21, 2005 22:00 UTC (Wed) by yokem_55 (subscriber, #10498)

This is marked as a subscriber article, but the full text of the article is on the front news page even for non-subscribers. Is something not working right?

Predictive per-task write throttling
Posted Sep 21, 2005 22:01 UTC (Wed) by arcticwolf (guest, #8341)

I think you accidentally posted the whole article as the summary (so even non-subscribers can read it on the headlines page, I assume). :)

Predictive per-task write throttling
Posted Sep 21, 2005 22:10 UTC (Wed) by Quazatron (guest, #4368)

This is very good news, as I have been noticing this exact problem for a while. Just when you think the kernel is already very good, someone steps in and makes it better. You gotta love free/libre/open source software.

Good news
Posted Sep 29, 2005 8:52 UTC (Thu) by ringerc (subscriber, #3071)

This has driven me nuts for years. Anything that improves the situation is wonderful in my book. Currently, running a backup on my dual Xeon server at work (on an 8-disk RAID) will cause everything else to slow to a crawl for hours. Worse, I have no way to tell the kernel "this is a low-priority process, please don't let it monopolize memory or disk resources if someone else wants them," nor can I tell the kernel "this process is just doing a once-off copy, please don't shove everything it reads into your IO cache."

A `disknice' command would make me want to send a few cases of nice beer to whoever wrote it. A `memnice' command in the same style, for tuning the amount of system cache memory given to a process, would probably demand a container of beer if only I could provide it. Someone who fixes the kernel so neither is required at all....

Consider a system with a large pile of GUI apps running. It needs the binaries for these apps in RAM, along with things like mailbox index files, various common config files, etc. Now imagine that a backup has pushed all of that out of RAM in favour of data that's only going to be used once, so the GUI processes now need to read it in from disk every time they want to use it. That includes the binary images themselves. That's bad, but what's worse is that the backup is slowing disk I/O right down with no way to throttle it, so each access those GUI apps make takes forever. That's the recipe for almost DoSing a Linux terminal server just by running a backup.

I still don't understand why `tar' and `cp' can't just tell the kernel "this read should not be cached" and "I'm doing bulk I/O".

Predictive per-task write throttling
Posted Sep 29, 2005 15:44 UTC (Thu) by jimwelch (guest, #178)

Arg! Andrea needs to be promoted to Commodore Andrea for this one. A simple, elegant fix for a complex, long-standing problem! Maybe that should be Admiral Andrea. (AA) harg-harg! (OK, so I'm a little late for pirate day; real life is always a drag on free (as in FOSS) time.)

Performance degradation?
Posted Sep 29, 2005 20:18 UTC (Thu) by huaz (guest, #10168)

Hans Reiser claims this patch degrades Reiser4 performance substantially.
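As an aside on the wish, expressed in the comments, that tar and cp could tell the kernel a read should not be cached: POSIX does define such a hint, posix_fadvise(), which Linux honours and which Python exposes as os.posix_fadvise(). A minimal sketch (Linux-specific; the advice flags are real, the scratch file is illustrative):

```python
import os
import tempfile

# Write a scratch file, then read it back while advising the kernel that
# the data is accessed sequentially and read once -- roughly the hint the
# commenters wish bulk-copy tools would give.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"x" * 65536)
    os.fsync(fd)

    # Hint before reading: access will be sequential.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)

    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 65536)

    # Hint after reading: these pages will not be needed again, so the
    # kernel may drop them from the page cache immediately.
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
finally:
    os.close(fd)
    os.remove(path)
```

The hints are advisory: the kernel is free to ignore them, and they do nothing to throttle the I/O itself, but POSIX_FADV_DONTNEED does address the "don't push everything else out of cache" half of the complaint.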