Keeping disks busy

Some of the changes that have been stressing the I/O scheduler have gone in very recently. A couple of patches from Andrew Morton are currently sitting in Linus's 2.5.39 BitKeeper tree; they are worth a look.

In the end, much of the work done by the VM subsystem is deciding which pages to move to which disks, and when. A good set of decisions will lead to good performance, but if the VM is not smart about which pages it shoves out, performance can suffer. Some of Andrew's recent efforts stem from an important observation that has been somewhat overlooked until now: there is little point in trying to write pages to disks which are already overwhelmed with requests.

If you want to direct your efforts toward disks which are not overly busy, you first need some indication of just how much work each drive has to do. So Andrew has added a new set of functions that report on whether a device's request queue is congested or not. The test used is simplistic: a device's read or write queue is not congested if at least 25% of the allocated request queue entries (a fixed number of these is allocated at queue creation time) are available for use. A simple test is good enough, though, especially considering that the state of a request queue tends to be volatile: the number of free entries changes constantly as requests are submitted and completed.
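
The 25% rule is easy to express in code. What follows is only a sketch of the test as described above, not the kernel's actual implementation; the structure and function names (toy_request_list, toy_queue_congested) are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of one direction (read or write)
 * of a device's request queue. Not a real kernel structure. */
struct toy_request_list {
    unsigned int total;  /* entries allocated at queue creation (fixed) */
    unsigned int in_use; /* entries currently holding pending requests */
};

/* The queue counts as congested when fewer than 25% of its
 * allocated entries are free. */
static bool toy_queue_congested(const struct toy_request_list *rl)
{
    assert(rl->total > 0 && rl->in_use <= rl->total);

    unsigned int free_entries = rl->total - rl->in_use;

    /* free/total < 1/4, rewritten without division */
    return 4u * free_entries < rl->total;
}
```

With 128 entries allocated, a queue holding 96 pending requests still has exactly 25% free and is not congested; one more request tips it over the threshold.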

Once you can test for a congested state, you can start making smarter decisions. With these functions in place, Linus merged another patch which causes the ext2 filesystem to cut back on speculative readahead operations if the underlying device is busy. If the disks are backed up, presumably there are more important things for them to be doing than reading ahead data that may or may not be used.
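
The readahead decision can be sketched along these lines. This is an illustration rather than the ext2 patch itself: the function name is hypothetical, and dropping the speculative blocks entirely is an assumption made for simplicity (the real code may merely reduce them).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: how many blocks to request for one read.
 * The block the caller asked for is always read; the speculative
 * extra blocks are dropped when the read queue is congested. */
static unsigned int toy_blocks_to_read(bool read_queue_congested,
                                       unsigned int readahead_window)
{
    const unsigned int demanded = 1;

    /* arbitrary sanity bound, chosen only for this sketch */
    assert(readahead_window < 1024);

    if (read_queue_congested)
        return demanded;               /* busy disk: no speculation */
    return demanded + readahead_window;
}
```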

More impressive performance gains, however, can be had by looking at the pdflush subsystem. pdflush is a set of kernel threads whose job it is to write dirty file data back to the underlying filesystems. A fair amount of effort goes into keeping separate pdflush threads from trying to write back to the same device, and into simply keeping the right number of pdflush threads around.

With the new scheme, life gets easier. pdflush does its best to simply pass over pages when the destination queue is congested; instead, it concentrates on pages that can be written to less busy devices. Thus pdflush no longer blocks on request queues, and can concentrate on keeping them all full. A side benefit is that a single pdflush thread may now be sufficient.
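
The idea can be modeled as a single writeback pass over several devices. The sketch below is a toy, with invented names and the simplifying assumption that each dirty page consumes one request queue entry; it is not the pdflush code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device state for this sketch. */
struct toy_device {
    unsigned int total;  /* request queue entries, fixed at creation */
    unsigned int in_use; /* entries holding pending requests */
    unsigned int dirty;  /* dirty pages destined for this device */
};

/* Congested when fewer than 25% of the entries are free. */
static bool toy_write_congested(const struct toy_device *d)
{
    return 4u * (d->total - d->in_use) < d->total;
}

/* One non-blocking pass: feed each device until its queue becomes
 * congested, then move on instead of waiting. Returns the number
 * of pages queued for writing. */
static unsigned int toy_writeback_pass(struct toy_device *devs, size_t n)
{
    unsigned int queued = 0;

    for (size_t i = 0; i < n; i++) {
        struct toy_device *d = &devs[i];

        assert(d->in_use <= d->total);
        while (d->dirty > 0 && !toy_write_congested(d)) {
            d->in_use++; /* one request per page (simplification) */
            d->dirty--;
            queued++;
        }
        /* any remaining dirty pages wait for a later pass */
    }
    return queued;
}
```

A device whose queue is already full is skipped without blocking, so the same thread can go on to fill the queues of idle devices, which is why one thread may be enough.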

According to Andrew: "This code can keep sixty spindles saturated - we've never been able to do that before." It is increasingly apparent that the 2.6 kernel is going to be an amazing performer in numerous areas, thanks to work like this.


Keeping disks busy

Posted Sep 27, 2002 16:46 UTC (Fri) by aglet (guest, #1334)

Probably the most inappropriate place for this comment, but I have the opposite problem. I'd like my disks to be kept as idle as possible.

A journalled filesystem is a very good thing for laptops, where power outages (and associated unclean shutdowns) are a relatively frequent occurrence. I've switched my laptop over to ext3 for unrelated reasons (IRQ problems cause it to lock up every now and then, necessitating a hard reset, and the disk is big enough for this to be annoying...). I used to run noflushd so that the disk stayed spun down, but since ext3 does a flush every 5 seconds this has stopped working.

Which brings me, finally, to my question: is there any way to reconcile usage of a journalled filesystem with civilised battery usage? I'm aware of course that if changes were only committed to disk rarely I'd risk losing more data, but I'd like to be able to make that trade-off.

Keeping disks busy

Posted Sep 27, 2002 18:47 UTC (Fri) by steveha (guest, #3876)

Have you looked into "noatime"? By default, the disk system keeps track of when you access files -- reading a file is accessing it, so reading a file results in a write to update the access time. If you add the "noatime" option to your /etc/fstab file (where the filesystem mounting options are specified) it will cut down your disk writes.

steveha

Keeping disks busy

Posted Sep 28, 2002 15:09 UTC (Sat) by chaostrophy (guest, #662)

Have a look at the last few kernel traffics (http://kt.zork.net/kernel-traffic/). There were some patches to help with that: they delay writes until some read must happen, then spin the disk up, flush all writes, and do as much reading as they can.

Here it is from LWN:
http://lwn.net/Articles/1652/

OK, so it was more than 2-3 weeks ago.

Good luck,
Ron

looking for a sysadmin gig

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds