From: Daniel Phillips <phillips-AT-istop.com>
To: Andrew Morton <akpm-AT-osdl.org>
Subject: Re: Re: GFS, what's remaining
Date: Sun, 4 Sep 2005 15:51:56 -0400
Cc: Joel.Becker-AT-oracle.com, linux-cluster-AT-redhat.com,

On Sunday 04 September 2005 03:28, Andrew Morton wrote:
> If there is already a richer interface into all this code (such as a
> syscall one) and it's feasible to migrate the open() tricksies to that API
> in the future if it all comes unstuck then OK. That's why I asked (thus
> far unsuccessfully):
>
>   Are you saying that the posix-file lookalike interface provides
>   access to part of the functionality, but there are other APIs which are
>   used to access the rest of the functionality? If so, what is that
>   interface, and why cannot that interface offer access to 100% of the
>   functionality, thus making the posix-file tricks unnecessary?

There is no such interface at the moment, nor is one needed in the immediate
future. Let's look at the arguments for exporting a dlm to userspace:

  1) Since we already have a dlm in kernel, why not just export that and save
     100K of userspace library? Answer: because we don't want userspace-only
     dlm features bulking up the kernel. Answer #2: the extra syscalls and
     interface baggage serve no useful purpose.

  2) But we need to take locks in the same lockspaces as the kernel dlm(s)!
     Answer: only support tools need to do that. A cut-down locking api is
     entirely appropriate for this.

  3) But the kernel dlm is the only one we have! Answer: easily fixed, a
     simple matter of coding. But please bear in mind that dlm-style
     synchronization is probably a bad idea for most cluster applications,
     particularly ones that already do their synchronization via sockets.

In other words, exporting the full dlm api is a red herring. It has nothing
to do with getting cluster filesystems up and running. It is really just
marketing: it sounds like a great thing for userspace to get a dlm "for
free", but it isn't free: it contributes to kernel bloat, and it isn't even
the most efficient way to do it.

If after considering that, we _still_ want to export a dlm api from kernel,
then can we please take the necessary time and get it right? The full api
requires not only syscall-style elements, but asynchronous events as well,
similar to aio. I do not think anybody has a good answer to this today, nor
do we even need it to begin porting applications to cluster filesystems.

Oracle guys: what is the distributed locking API for RAC? Is the RAC team
waiting with bated breath to adopt your kernel-based dlm? If not, why not?