
Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)

From:  Johannes Weiner <hannes-AT-cmpxchg.org>
To:  John Stultz <john.stultz-AT-linaro.org>
Subject:  Re: [PATCH 0/4] Volatile Ranges (v14 - madvise reborn edition!)
Date:  Tue, 3 Jun 2014 10:57:10 -0400
Message-ID:  <20140603145710.GQ2878@cmpxchg.org>
Cc:  LKML <linux-kernel-AT-vger.kernel.org>, Andrew Morton <akpm-AT-linux-foundation.org>, Android Kernel Team <kernel-team-AT-android.com>, Robert Love <rlove-AT-google.com>, Mel Gorman <mel-AT-csn.ul.ie>, Hugh Dickins <hughd-AT-google.com>, Dave Hansen <dave-AT-sr71.net>, Rik van Riel <riel-AT-redhat.com>, Dmitry Adamushko <dmitry.adamushko-AT-gmail.com>, Neil Brown <neilb-AT-suse.de>, Andrea Arcangeli <aarcange-AT-redhat.com>, Mike Hommey <mh-AT-glandium.org>, Taras Glek <tglek-AT-mozilla.com>, Jan Kara <jack-AT-suse.cz>, KOSAKI Motohiro <kosaki.motohiro-AT-gmail.com>, Michel Lespinasse <walken-AT-google.com>, Minchan Kim <minchan-AT-kernel.org>, Keith Packard <keithp-AT-keithp.com>, "linux-mm-AT-kvack.org" <linux-mm-AT-kvack.org>

On Thu, May 08, 2014 at 10:12:40AM -0700, John Stultz wrote:
> On 04/29/2014 02:21 PM, John Stultz wrote:
> > Another few weeks and another volatile ranges patchset...
> >
> > After getting the sense that a major objection to the earlier
> > patches was the introduction of a new syscall (and its somewhat
> > strange dual length/purged-bit return values), I spent some time
> > trying to rework the vma manipulations so we can be sure we won't fail
> > mid-way through changing volatility (basically making it atomic).
> > I think I have it working, and thus, there is no longer the
> > need for a new syscall, and we can go back to using madvise()
> > to set and unset pages as volatile.
> 
> Johannes: To get some feedback, maybe I'll needle you directly here a
> bit. :)
> 
> Does moving this interface to madvise help reduce your objections?  I
> feel like your cleaning-the-dirty-bit idea didn't work out, but I was
> hoping that by reworking the vma manipulations to be atomic, we could
> move to madvise and still avoid the new syscall that you seemed bothered
> by. But I've not really heard much from you recently so I worry your
> concerns on this were actually elsewhere, and I'm just churning the
> patch needlessly.

My objection was not the syscall.

From a reclaim perspective, using the dirty state to denote whether a
swap-backed page needs writeback before reclaim is quite natural and I
much prefer Minchan's changes to the reclaim code over yours.

From an interface point of view, I would prefer the simplicity of
cleaning dirty bits to invalidate pages, and a default of zero-filling
invalidated pages instead of sending SIGBUS.  This also is quite
natural when you think of anon/shmem mappings as cache pages on top of
/dev/zero (see mmap_zero() and shmem_zero_setup()).  And it translates
well to tmpfs.
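
To make the zero-fill default concrete, here is a minimal userspace sketch
of what "anon memory as a cache on top of /dev/zero" looks like from the
application side.  It is an illustration under assumptions, not the
interface under discussion: it uses the existing MADV_DONTNEED as a
deterministic stand-in for invalidation (on Linux, a discarded private
anonymous page is zero-filled on the next access), and the helper names
all_zero() and discarded_page_reads_as_zero() are made up for this example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if every byte in buf[0..len) is zero. */
int all_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Hypothetical helper: dirty one private anonymous page, discard it, and
 * check what the next read sees.
 * Returns 1 if the page reads back as zeros, 0 if the old data survived,
 * and -1 on a setup error.
 */
int discarded_page_reads_as_zero(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Fresh anonymous memory starts out zero-filled. */
    assert(all_zero(p, len));

    /* Dirty the page so there is real content to lose. */
    memset(p, 0xAB, len);

    /* Stand-in for invalidation: drop the page contents (Linux semantics). */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    /* The next access faults in a fresh zero page instead of the old data. */
    int zero = all_zero(p, len);
    munmap(p, len);
    return zero;
}
```

The point of the sketch is only that an application can detect a discarded
page by its contents, with no signal handling involved.  A lazy scheme such
as the MADV_FREE proposal would differ in when the contents disappear, not
in what a discarded page looks like afterwards.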

At the same time, I acknowledge that there are use cases that want
SIGBUS delivery for more than just convenience in order to implement
userspace fault handling, and this is the only place where I see a
real divergence in actual functionality from Minchan's code.
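
For comparison, the SIGBUS style of userspace fault handling can be
sketched as follows.  This is an illustration under assumptions, not
anyone's proposed patch: since no madvise op for switching a VMA to SIGBUS
delivery exists, the example provokes SIGBUS the traditional way, by
reading a shared file mapping beyond the end of an empty file.  The names
on_sigbus(), fault_env, and read_missing_page_raises_sigbus() are
hypothetical.

```c
#define _DEFAULT_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Jump target used to recover from the fault (hypothetical name). */
static sigjmp_buf fault_env;

/* SIGBUS handler: abandon the faulting access and unwind to sigsetjmp(). */
static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper: map one page of an empty temporary file and read it.
 * Returns 1 if the read raised SIGBUS and was caught, 0 if the read
 * succeeded, and -1 on a setup error.
 */
int read_missing_page_raises_sigbus(void)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();                /* empty file: no page is backed */
    if (page <= 0 || f == NULL)
        return -1;
    size_t len = (size_t)page;

    volatile unsigned char *p = mmap(NULL, len, PROT_READ, MAP_SHARED,
                                     fileno(f), 0);
    if (p == MAP_FAILED) {
        fclose(f);
        return -1;
    }

    struct sigaction sa, old;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        munmap((void *)p, len);
        fclose(f);
        return -1;
    }

    int caught;
    if (sigsetjmp(fault_env, 1) == 0) {
        unsigned char c = p[0];         /* faults: the page lies past EOF */
        (void)c;
        caught = 0;
    } else {
        caught = 1;                     /* reached via siglongjmp() */
    }

    /* Restore the previous handler and clean up. */
    sigaction(SIGBUS, &old, NULL);
    munmap((void *)p, len);
    fclose(f);
    return caught;
}
```

Compared with checking for zeros, this lets the application react at the
exact faulting access, at the cost of installing a process-wide signal
handler and unwinding out of it.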

That, however, truly is a separate virtual memory feature.  Would it
be possible for you to take MADV_FREE and MADV_REVIVE as a base and
implement an madvise op that switches the no-page behavior of a VMA
from zero-filling to SIGBUS delivery?





Copyright © 2014, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds