User: Password:
|
|
Subscribe / Log in / New account

Re: [PATCH] [Request for inclusion] Filesystem in Userspace

From:  Linus Torvalds <torvalds-AT-osdl.org>
To:  Jamie Lokier <jamie-AT-shareable.org>
Subject:  Re: [PATCH] [Request for inclusion] Filesystem in Userspace
Date:  Thu, 18 Nov 2004 10:47:44 -0800 (PST)
Cc:  Miklos Szeredi <miklos-AT-szeredi.hu>, hbryan-AT-us.ibm.com, akpm-AT-osdl.org, linux-fsdevel-AT-vger.kernel.org, linux-kernel-AT-vger.kernel.org, pavel-AT-ucw.cz



On Thu, 18 Nov 2004, Jamie Lokier wrote:
>
> Linus Torvalds wrote:
> > Why do you think it would kill the FUSE process? And why do you think 
> > killing _any_ process would make the system come back to life? After all, 
> > memory wasn't filled by process usage, it was filled by dirty FS pages.
> > 
> > I really do believe that user-space filesystems have problems. There's a 
> > reason we tend to do them in kernel space. 
> 
> Are kernel space filesystems immune from this problem?  What happens
> when they need to kmalloc() in order to write some data?

That's why we have GFP_NOFS and other flags (PF_MEMALLOC etc). So yes,
they are "immune" in the sense that they have been inocculated, but not in
the sense that they can't have the bug conceptually.

So the kernel not only keeps a set of reserved pages for atomic 
allocations, but also the VM knows not to recurse into a filesystem 
operation when the reason for the memory allocation was a low-memory 
circumstance. When a filesystem asks for memory in the page-out path, the 
VM may still throw out cached pages for that FS, but it won't try to write 
them back.

Guys, there is a _reason_ why microkernels suck. This is an example of how 
things are _not_ "independent". The filesystems depend on the VM, and the 
VM depends on the filesystem. You can't just split them up as if they were 
two separate things (or rather: you _can_ split them up, but they still 
very much need to know about each other in very intimate ways).

So what do you do? You limit shared dirty pages (inefficient memory use),
or you disallow certain behaviours, or you add tons of new interfaces to
expose essentially the same "every thing that can allocate and is on the
write-out path takes a GFP flag".

User-space filesystems are hard to get right. I'd claim that they are 
almost impossible, unless you limit them somehow (shared writable mappings 
are the nastiest part - if you don't have those, you can reasonably limit 
your problems by limiting the number of dirty pages you accept through 
normal "write()" calls). 

			Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


(Log in to post comments)


Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds