From: Joel Becker <firstname.lastname@example.org>
To: email@example.com, firstname.lastname@example.org, email@example.com, firstname.lastname@example.org
Subject: [PATCH 0/2] [RFC] Adding the MAY_CREATE flag to ->permission()
Date: Wed, 14 Oct 2009 02:57:39 -0700

Ran into a fun problem in ocfs2. ocfs2, being a cluster
filesystem, has cluster locks. Being nice to our users, we allow
signals to interrupt the cluster locking layer if it hasn't gotten too
far yet (that is, while it is still sleeping on local locking rather
than on the cluster).

Now, system calls are only allowed to return -ERESTARTSYS if
they can be safely restarted. In ocfs2_mknod(), which underlies
mkdir(2), mknod(2), and creat(2), we allow signals to interrupt us while
we gather our locks, but once we start changing things, there's no going
back. Everyone else does the same thing.

The problem is open(O_CREAT|O_EXCL). See, ocfs2_mknod() will
successfully create the file. Then we get back to
__open_namei_create(), which promptly calls may_open(). This is
backed by ocfs2_permission(), which needs the cluster lock to
check the new inode's permissions. Send a signal here, and the ocfs2
code will return -ERESTARTSYS. (This is easily verified via
'git-checkout'.) When entry.S restarts the open(O_CREAT|O_EXCL), the
file already exists from the first pass, so it gets -EEXIST. Ouch!
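
To make the sequence concrete, here is a small userspace model of it.
This is a sketch only, not kernel code: the toy_* names and struct
toy_fs are made up for illustration, toy_create() stands in for
ocfs2_mknod(), toy_permission() stands in for ocfs2_permission(), and
the restart done by entry.S is modeled as simply running the whole
call a second time.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal restart code (512); it is not in userspace <errno.h>. */
#define TOY_ERESTARTSYS 512

/* Toy model: one directory entry and one possibly pending signal. */
struct toy_fs {
    bool file_exists;     /* has the name been created yet? */
    bool signal_pending;  /* will the next interruptible wait be interrupted? */
};

/* Stand-in for ocfs2_mknod(): past the point of no return, never restarts. */
int toy_create(struct toy_fs *fs)
{
    if (fs->file_exists)
        return -EEXIST;   /* O_EXCL semantics: the name must not exist */
    fs->file_exists = true;
    return 0;
}

/* Stand-in for ocfs2_permission(): its lock wait is interruptible. */
int toy_permission(struct toy_fs *fs)
{
    if (fs->signal_pending) {
        fs->signal_pending = false;  /* signal is handled before the restart */
        return -TOY_ERESTARTSYS;
    }
    return 0;
}

/* One pass of open(O_CREAT|O_EXCL): create first, then check permissions. */
int toy_open_excl_once(struct toy_fs *fs)
{
    int ret = toy_create(fs);
    if (ret)
        return ret;
    return toy_permission(fs);
}

/* Models the syscall restart: rerun the whole call after -ERESTARTSYS. */
int toy_open_excl(struct toy_fs *fs)
{
    int ret;

    assert(fs != NULL);
    ret = toy_open_excl_once(fs);
    if (ret == -TOY_ERESTARTSYS)
        ret = toy_open_excl_once(fs);
    return ret;
}
```

With no signal pending, toy_open_excl() returns 0. With a signal
pending during the permission check, the file is created on the first
pass and the rerun fails with -EEXIST, which is the behavior described
above.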

We can't naively block signals in ocfs2_permission(). The
majority of calls are not for O_CREAT|O_EXCL. So how do we let
ocfs2_permission() know about this case?

Christoph's suggestion was a new flag to ->permission(). I've
picked MAY_CREATE, but I'm totally open to a better name. I'm open to a
better solution too.
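
As a rough illustration of how such a flag could be consumed, here is
a userspace sketch. Everything in it is an assumption made for the
example: the model_* names, the bit value chosen for MODEL_MAY_CREATE,
and the idea that the filesystem reacts to the flag by taking its lock
without allowing signals to interrupt. The actual patches may well do
this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal restart code (512); it is not in userspace <errno.h>. */
#define MODEL_ERESTARTSYS 512

#define MODEL_MAY_READ    0x4    /* same value as the kernel's MAY_READ */
#define MODEL_MAY_CREATE  0x100  /* hypothetical bit, chosen only for this sketch */

struct model_inode {
    bool signal_pending;  /* is a signal waiting for the calling task? */
};

/* Models the cluster lock: only interruptible waits give up on a signal. */
int model_cluster_lock(struct model_inode *inode, bool interruptible)
{
    assert(inode != NULL);
    if (interruptible && inode->signal_pending)
        return -MODEL_ERESTARTSYS;
    return 0;  /* lock taken; any signal stays pending for later delivery */
}

/* Models a ->permission() that treats a just-created inode specially. */
int model_permission(struct model_inode *inode, int mask)
{
    /* The caller sets MODEL_MAY_CREATE when it has just created the inode. */
    bool just_created = (mask & MODEL_MAY_CREATE) != 0;
    int ret = model_cluster_lock(inode, !just_created);

    if (ret)
        return ret;
    /* The real permission checks would happen here, under the lock. */
    return 0;
}
```

In this model an ordinary check with a signal pending still returns
-MODEL_ERESTARTSYS, so the common case stays interruptible, while a
check carrying MODEL_MAY_CREATE completes and returns 0.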

Following this are the MAY_CREATE patch and the ocfs2 patch to
make use of it.