
Re: [00/17] Large Blocksize Support V3

From:  Christoph Hellwig <hch-AT-infradead.org>
To:  Nick Piggin <nickpiggin-AT-yahoo.com.au>
Subject:  Re: [00/17] Large Blocksize Support V3
Date:  Thu, 26 Apr 2007 17:07:15 +0100
Cc:  Christoph Lameter <clameter-AT-sgi.com>, "Eric W. Biederman" <ebiederm-AT-xmission.com>, linux-kernel-AT-vger.kernel.org, Mel Gorman <mel-AT-skynet.ie>, William Lee Irwin III <wli-AT-holomorphy.com>, David Chinner <dgc-AT-sgi.com>, Jens Axboe <jens.axboe-AT-oracle.com>, Badari Pulavarty <pbadari-AT-gmail.com>, Maxim Levitsky <maximlevitsky-AT-gmail.com>

On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote:
> >Well maybe you could explain what you want. Preferably without redefining 
> >the established terms?
> 
> Support for larger buffers than page cache pages.

I don't think you really want this :)  The whole non-pagecache I/O
path before 2.3 was a total pain just because it used buffers to drive
I/O.  Add to that buffers bigger than a page and you add another
two orders of magnitude of complexity.  If you want to see a mess like
that, download one of the early XFS/Linux releases that had an I/O path
like that.  I _really_ _really_ don't want to go there.

Linux has a long tradition of trading a tiny bit of efficiency for
much cleaner code, and I'd 100% go down Christoph's route here.
Then again, I'd actually be rather surprised if > page buffers
were more efficient - you'd run into shitloads of overhead due to
them being non-contiguous, like calling vmap all over the place,
reprogramming iommus to at least make them look virtually contiguous [1],
etc.

I also don't quite get what your problem with higher order allocations
is.  Order 1 allocations are generally just fine, and in fact
thread stacks are >= order 1 on most architectures.  And if the pagecache
uses higher order allocations that means we'll finally fix our problems
with them, which we have to do anyway.  Workloads continue to grow and
with them the kernel overhead to manage them, while the pagesize for
many architectures is fixed.  So we'll have to deal with order 1
and order 2 allocations better just for backing kmalloc and co.

Or think jumboframes for that matter.


[1] many iommu implementations of course also have a limit on how many
    segments they can actually virtually merge
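
For readers less used to the "order" terminology above: an order-n
allocation is 2^n physically contiguous pages. The following is a
minimal userspace sketch of that arithmetic, not kernel code. The helper
names order_size_bytes() and order_for_size() are hypothetical and chosen
for illustration (the kernel has its own get_order() helper), and the
page size is passed in explicitly because it differs between
architectures.

```c
#include <stddef.h>

/* Bytes covered by an order-`order` allocation: 2^order contiguous pages.
 * Hypothetical helper for illustration only. */
static size_t order_size_bytes(unsigned int order, size_t page_size)
{
    return page_size << order;
}

/* Smallest order whose allocation can hold `size` bytes.
 * Hypothetical helper; assumes `size` is small enough that the
 * shift in order_size_bytes() does not overflow. */
static unsigned int order_for_size(size_t size, size_t page_size)
{
    unsigned int order = 0;

    while (order_size_bytes(order, page_size) < size)
        order++;
    return order;
}
```

Assuming 4096-byte pages, order_for_size(8192, 4096) gives 1, which is
the 8 KiB thread stack case, and order_for_size(9000, 4096) gives 2,
which is what a physically contiguous buffer for a 9000-byte jumbo frame
would need.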




Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds