Re: statfs() / statvfs() syscall ballsup...

From:  Linus Torvalds <torvalds-AT-osdl.org>
To:  Greg Stark <gsstark-AT-mit.edu>
Subject:  Re: statfs() / statvfs() syscall ballsup...
Date:  Sun, 12 Oct 2003 09:13:50 -0700 (PDT)
Cc:  Joel Becker <Joel.Becker-AT-oracle.com>, Jamie Lokier <jamie-AT-shareable.org>, Trond Myklebust <trond.myklebust-AT-fys.uio.no>, Ulrich Drepper <drepper-AT-redhat.com>, Linux Kernel <linux-kernel-AT-vger.kernel.org>


On 12 Oct 2003, Greg Stark wrote:
> 
> There are other reasons databases want to control their own cache. The
> application knows more about the usage and the future usage of the data than
> the kernel does.

But this again is not an argument for not using the page cache - it's only 
an argument for _telling_ the kernel about its use.

> However on busy servers whenever it's run it causes lots of pain because the
> kernel flushes all the cached data in favour of the data this job touches.

Yes. But this is actually pretty easy to avoid in-kernel, since all of the 
LRU logic is pretty localized.

It could be done on a per-process basis ("this process should not pollute 
the active list") or on a per-fd basis ("accesses through this particular 
open are not to pollute the active list"). 
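
As a rough userspace illustration of the per-fd idea, the sketch below uses posix_fadvise() to tell the kernel about a one-pass scan. This is not the in-kernel LRU change described above, only an approximation from the application side. The function name, the chunk size and the choice of POSIX_FADV_DONTNEED are assumptions made for the example, and DONTNEED drops the pages for every user of the file, not just this open.

```python
import os

CHUNK = 1 << 20  # 1 MiB per read; an arbitrary choice for this sketch


def scan_without_polluting_cache(path):
    """Read a file once, hinting that its pages need not stay cached.

    Hypothetical helper: returns the number of bytes read.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # posix_fadvise is only exposed on platforms that provide it.
        has_fadvise = hasattr(os, "posix_fadvise")
        if has_fadvise:
            # Hint that the whole file (offset 0, length 0) is read sequentially.
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
        offset = 0
        while True:
            buf = os.read(fd, CHUNK)
            if not buf:
                break
            total += len(buf)
            if has_fadvise:
                # Hint that the range just consumed will not be needed again.
                os.posix_fadvise(fd, offset, len(buf), os.POSIX_FADV_DONTNEED)
            offset += len(buf)
    finally:
        os.close(fd)
    return total
```

A nightly batch job could call scan_without_polluting_cache() on each file it touches; whether the hints actually protect the rest of the cache depends on the kernel and the filesystem.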

> And worse, there's no way to indicate that the i/o it's doing is lower
> priority, so i/o bound servers get hit dramatically.

IO priorities are pretty much worthless. It doesn't _matter_ if other 
processes get preferred treatment - what is costly is the latency cost of 
seeking. What you want is not priorities, but batching.
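
To make the batching point concrete, here is a minimal sketch of one way an application might batch its own reads: sort the pending requests by offset and merge the ones that touch, so the disk sees a few long sequential transfers instead of many seeks. The message does not say how batching should be done; batch_requests and max_gap are hypothetical names, and this is a userspace illustration rather than anything the kernel's I/O scheduler does.

```python
def batch_requests(requests, max_gap=0):
    """Merge (offset, length) reads into sorted, coalesced extents.

    Requests whose start lies within max_gap bytes of the end of the
    previous extent are folded into it, trading a little extra data
    transferred for one fewer seek.
    """
    batches = []
    for offset, length in sorted(requests):
        end = offset + length
        if batches and offset <= batches[-1][1] + max_gap:
            # Adjacent or overlapping: extend the current extent.
            batches[-1][1] = max(batches[-1][1], end)
        else:
            # Too far away: start a new extent, which costs a seek.
            batches.append([offset, end])
    return [(start, end - start) for start, end in batches]


pending = [(4096, 4096), (0, 4096), (1 << 20, 4096), (8192, 4096)]
print(batch_requests(pending))  # [(0, 12288), (1048576, 4096)]
```

Four scattered requests become two extents here, which means two seeks instead of up to four, regardless of which process is given priority.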

			Linus




Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds