|From:||Christoph Hellwig <hch-AT-infradead.org>|
|To:||Nick Piggin <nickpiggin-AT-yahoo.com.au>|
|Subject:||Re: [00/17] Large Blocksize Support V3|
|Date:||Thu, 26 Apr 2007 17:07:15 +0100|
|Cc:||Christoph Lameter <clameter-AT-sgi.com>, "Eric W. Biederman" <ebiederm-AT-xmission.com>, linux-kernel-AT-vger.kernel.org, Mel Gorman <mel-AT-skynet.ie>, William Lee Irwin III <wli-AT-holomorphy.com>, David Chinner <dgc-AT-sgi.com>, Jens Axboe <jens.axboe-AT-oracle.com>, Badari Pulavarty <pbadari-AT-gmail.com>, Maxim Levitsky <maximlevitsky-AT-gmail.com>|
On Thu, Apr 26, 2007 at 05:48:12PM +1000, Nick Piggin wrote: > >Well maybe you could explain what you want. Preferably without redefining > >the established terms? > > Support for larger buffers than page cache pages. I don't think you really want this :) The whole non-pagecache I/O path before 2.3 was a toal pain just because it used buffers to drive I/O. Add to that buffers bigger than a page and you add another two mangnitudes of complexity. If you want to see a mess like that download on of the eary XFS/Linux releases that had an I/O path like that. I _really_ _really_ don't want to go there. Linux has a long tradition of trading a tiny bit of efficieny for much cleaner code, and I'd for 100% go down Christoph's route here. Then again I'd actually be rather surprised if > page buffers were more efficient - you'd run into shitloads over overhead due to them beeing non-contingous like calling vmap all over the place, reprogramming iommus to at least make them look virtually contingous , etc.. I also don't quite get what your problem with higher order allocations are. order 1 allocations are generally just fine, and in fact thread stacks are >= oder 1 on most architectures. And if the pagecache uses higher order allocations that means we'll finally fix our problems with them, which we have to do anyway. Workloads continue to grow and with them the kernel overhead to manage them, while the pagesize for many architectures is fixed. So we'll have to deal with order 1 and order 2 allocations better just for backing kmalloc and co. Or think jumboframes for that matter.  many iommu implementation of course also have a limit of how many segments they can actually virtually merge
Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds