Re: [PATCH 3/5] page allocator: Wait on both sync and async congestion after direct reclaim

From:  Pekka Enberg <penberg-bbCR+/B0CizivPeTLB3BmA-AT-public.gmane.org>
To:  Jens Axboe <jens.axboe-QHcLZuEGTsvQT0dZR+AlfA-AT-public.gmane.org>
Subject:  Re: [PATCH 3/5] page allocator: Wait on both sync and async congestion after direct reclaim
Date:  Fri, 13 Nov 2009 15:41:46 +0200
Cc:  Mel Gorman <mel-wPRd99KPJ+uzQB+pC5nmwQ-AT-public.gmane.org>, KOSAKI Motohiro <kosaki.motohiro-+CUm20s59erQFUHtdCDX3A-AT-public.gmane.org>, Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b-AT-public.gmane.org>, Frans Pop <elendil-EIBgga6/0yRmR6Xm/wNWPw-AT-public.gmane.org>, Jiri Kosina <jkosina-AlSwsSmVLrQ-AT-public.gmane.org>, Sven Geggus <lists-+AJD3D7QEjjt/htJsj1pd9AswbaBtrod-AT-public.gmane.org>, Karol Lewandowski <karol.k.lewandowski-Re5JQEeQqe8AvxtiuMwx3w-AT-public.gmane.org>, Tobias Oetiker <tobi-7K0TWYW2a3pyDzI6CaY1VQ-AT-public.gmane.org>, linux-kernel-u79uwXL29TY76Z2rM5mHXA-AT-public.gmane.org, "linux-mm-Bw31MaZKKs3YtjvyW6yDsg-AT-public.gmane.org" <linux-mm-Bw31MaZKKs3YtjvyW6yDsg-AT-public.gmane.org>, Rik van Riel <riel-H+wXaHxf7aLQT0dZR+AlfA-AT-public.gmane.org>, Christoph Lameter <cl-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b-AT-public.gmane.org>, Stephan von Krawczynski <skraw-DcQCyzbjH0jQT0dZR+AlfA-AT-public.gmane.org>, "Rafael J. Wysocki" <rjw-KKrjLPT3xs0-AT-public.gmane.org>, Kernel Testers List <kernel-testers-u79uwXL29TY76Z2rM5mHXA-AT-public.gmane.org>, Linus Torvalds <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b-AT-public.gmane.org>

Hi Jens,

On Fri, Nov 13, 2009 at 3:32 PM, Jens Axboe <jens.axboe-QHcLZuEGTsvQT0dZR+AlfA@public.gmane.org>
wrote:
>> Suggest an alternative that brings congestion_wait() more in line with
>> 2.6.30 behaviour then.
>
> I don't have a good explanation as to why the delays have changed,
> unfortunately. Are we sure that they have between .30 and .31? The
> dm-crypt case is overly complex and lots of changes could have broken
> that house of cards.

Hand-waving or not, we have end user reports stating that reverting
commit 8aa7e847d834ed937a9ad37a0f2ad5b8584c1ab0 ("Fix
congestion_wait() sync/async vs read/write confusion") fixes their
(rather serious) OOM regression. The commit in question _does_
introduce a functional change and if this was your average regression,
people would be kicking and screaming to get it reverted.

So is there a reason we shouldn't send a partial revert of the commit
(switching to BLK_RW_SYNC) to Linus until the "real" issue gets
resolved? Yes, I realize it's ugly voodoo magic but dammit, it used to
work!

                        Pekka



Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds