Re: Inode Lock Scalability V7 (was V6)
[Posted October 26, 2010 by corbet]
From: Al Viro <viro-AT-ZenIV.linux.org.uk>
To: Nick Piggin <npiggin-AT-kernel.dk>
Subject: Re: Inode Lock Scalability V7 (was V6)
Date: Fri, 22 Oct 2010 04:07:28 +0100
Message-ID: <20101022030728.GH19804@ZenIV.linux.org.uk>
Cc: Dave Chinner <david-AT-fromorbit.com>, linux-fsdevel-AT-vger.kernel.org, linux-kernel-AT-vger.kernel.org
On Fri, Oct 22, 2010 at 01:34:44PM +1100, Nick Piggin wrote:
> > * walkers of the sb, wb and hash lists can grab ->i_lock at will;
> > it nests inside their locks.
>
> What about if it is going on or off multiple data structures while
> the inode is live, like inode_lock can protect today. Such as putting
> it on the hash and sb list.
Look at the code. You are overengineering it. We do *not* need a framework
for messing with these lists in arbitrary ways. Where would we need to
do that to an inode that we neither hold a reference to nor have marked
I_FREEING, and with i_lock already held by the caller? Even assuming that we
need to keep [present in hash, present on sb list] in sync (which I seriously
doubt), we can bloody well grab both list locks before i_lock.
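The ordering argued for above can be sketched in userspace roughly as follows. This is only an illustration, not kernel code: pthread mutexes stand in for the spinlocks, and hash_lock, sb_list_lock, struct toy_inode and the toy_* helpers are hypothetical names invented for the example. The only thing it is meant to show is that the list locks are taken first and i_lock nests inside them, so a walker holding a list lock can take i_lock at will.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the list locks; not the kernel's names. */
static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t sb_list_lock = PTHREAD_MUTEX_INITIALIZER;

struct toy_inode {
	pthread_mutex_t i_lock;	/* innermost lock: nests inside the list locks */
	bool hashed;		/* "present in hash" */
	bool on_sb_list;	/* "present on sb list" */
};

static void toy_inode_init(struct toy_inode *inode)
{
	pthread_mutex_init(&inode->i_lock, NULL);
	inode->hashed = false;
	inode->on_sb_list = false;
}

/*
 * Put the inode on both lists while keeping the two flags in sync:
 * both list locks are grabbed first, i_lock last.
 */
static void toy_inode_add(struct toy_inode *inode)
{
	pthread_mutex_lock(&hash_lock);
	pthread_mutex_lock(&sb_list_lock);
	pthread_mutex_lock(&inode->i_lock);

	/* Assumption of this sketch: the inode is not on either list yet. */
	assert(!inode->hashed && !inode->on_sb_list);
	inode->hashed = true;
	inode->on_sb_list = true;

	/* Release in the reverse order of acquisition. */
	pthread_mutex_unlock(&inode->i_lock);
	pthread_mutex_unlock(&sb_list_lock);
	pthread_mutex_unlock(&hash_lock);
}

/*
 * A hash-list walker: it already holds its list lock and takes i_lock
 * inside it. Returns true if the two flags agree.
 */
static bool toy_walker_sees_consistent(struct toy_inode *inode)
{
	bool ok;

	pthread_mutex_lock(&hash_lock);
	pthread_mutex_lock(&inode->i_lock);
	ok = (inode->hashed == inode->on_sb_list);
	pthread_mutex_unlock(&inode->i_lock);
	pthread_mutex_unlock(&hash_lock);
	return ok;
}
```

Because every path takes the list locks before i_lock, the walker and the adder cannot deadlock against each other, and the walker never observes the inode on one list but not the other.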
> > inodes. It's not an accidental subtle property of the code, it's bloody
> > fundamental.
>
> I didn't miss that, and I agree that at the point of my initial lock
> break up, the locking is "wrong". Whether you correct it by changing
> the lock ordering or by using RCU to do lookups is something I want to
> debate further.
>
> I think it is natural to be able to lock the inode and have it lock the
> icache state.
Code outside of fs/inode.c and fs/fs-writeback.c generally has no business
looking at the full icache state, period.