
Merging GFS2

Andrew Morton has stated that the OCFS2 cluster filesystem is likely to be merged for 2.6.14. OCFS2 is not the only such filesystem under development, however, and the developers behind the GFS2 filesystem are wondering when it, too, might be merged - into -mm, at least. Much work has been done on GFS to address concerns which have been raised previously; the developers think that it is getting close to ready for wider exposure. The resulting discussion raised a couple of interesting questions about the kernel development process.

The first one was asked by Andrew Morton: "why?". Given that OCFS2 is going in, does the kernel really need another clustered filesystem? What, in particular, does GFS bring that OCFS2 lacks? The answers took two forms: (1) Linux has traditionally hosted a large variety of filesystems, and (2) since cluster filesystems are relatively new, users should be able to try both and see which one works better for them. David Teigland also posted a list of GFS features.

GFS will probably win this argument; there is a clear user community, and filesystems tend not to have any impact on the rest of the kernel. But, still, some developers are starting to wonder; consider, for example, this message from Suparna Bhattacharya:

And herein lies the issue where I tend to agree with Andrew on -- its really nice to have multiple filesystems innovating freely in their niches and eventually proving themselves in practice, without being bogged down by legacy etc. But at the same time, is there enough thought and discussion about where the fragmentation/diversification is really warranted, vs improving what is already there, or say incorporating the best of one into another, maybe over a period of time?

The other issue which came up was the creation of a user-space API for the distributed lock manager (DLM) used by GFS. If nothing else, the two cluster filesystems should have a common API so that applications can be written for either one. One option for this API might be "dlmfs", a virtual filesystem used with OCFS2. The dlmfs approach allows normal filesystem operations to be used for lock management tasks; even shell scripts can perform locking. Concerns with dlmfs include relatively slow performance and a certain unease with aspects of the interface:

Actually I think it's rather sick. Taking O_NONBLOCK and making it a lock-manager trylock because they're kinda-sorta-similar-sounding? Spare me. O_NONBLOCK means "open this file in nonblocking mode", not "attempt to acquire a clustered filesystem lock". Not even close.

(Andrew Morton).
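To make the objection concrete, here is a rough sketch of what lock management through ordinary file operations looks like from user space. It assumes the mapping described in the kernel's dlmfs documentation: a read-only open requests a shared lock, a read-write open requests an exclusive lock, O_NONBLOCK turns the request into a trylock that fails with ETXTBSY, and closing the descriptor drops the lock. The helper names (dlmfs_open_flags(), acquire(), release()) and the /dlm/mydomain path in the usage note are illustrative assumptions, not part of any real API.

```python
import errno
import os
from typing import Optional


def dlmfs_open_flags(exclusive: bool, trylock: bool) -> int:
    """Translate a lock request into open(2) flags, dlmfs-style (assumed mapping)."""
    # Read-only open asks for a shared lock; read-write asks for an exclusive one.
    flags = os.O_CREAT | (os.O_RDWR if exclusive else os.O_RDONLY)
    if trylock:
        # The overloading Andrew objects to: O_NONBLOCK means "trylock" here.
        flags |= os.O_NONBLOCK
    return flags


def acquire(lock_path: str, exclusive: bool = False,
            trylock: bool = False) -> Optional[int]:
    """Open the lock file; return a descriptor, or None if a trylock failed."""
    try:
        return os.open(lock_path, dlmfs_open_flags(exclusive, trylock), 0o600)
    except OSError as exc:
        # Assumption: a busy lock under trylock is reported as ETXTBSY.
        if trylock and exc.errno == errno.ETXTBSY:
            return None
        raise


def release(fd: int) -> None:
    """Closing the descriptor is what drops the lock."""
    os.close(fd)
```

With dlmfs mounted on a hypothetical /dlm and a domain directory already created there with mkdir, acquire("/dlm/mydomain/mylock", exclusive=True, trylock=True) would attempt an exclusive lock without blocking. On an ordinary filesystem the same calls simply create and open a file and provide no locking at all, which is part of why the flag overloading strikes some developers as surprising.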

It is not clear that better alternatives exist, however. One could implement it all with a big set of ioctl() calls, but nobody really wants to do that. Another approach would be to create a new set of system calls specifically for lock management. Some have argued in favor of system calls, but others, such as Alan Cox, are strongly opposed:

Every so often someone decides that a deeply un-unix interface with new syscalls is a good idea. Every time history proves them totally bonkers. There are cases for new system calls but this doesn't seem one of them.

Alan lists a number of reasons why a file descriptor-based approach makes sense for this sort of operation - they mostly come down to well-understood semantics and the fact that many things just work.

This is clearly a discussion which could go on for some time. Daniel Phillips points out that this is not necessarily a problem. There are currently no user-space users of any DLM API beyond a few filesystem management tools, so there is no great hurry to merge any API. The cluster filesystems could go in without any user-space DLM interface at all while the developers figure out what that interface should be. And, says Daniel, perhaps there should not be one at all. Despite the perceived elegance of having a single lock manager on the system, having user space rely upon its own, user-space DLM is a workable solution which could simplify the kernel side of things.



Merging GFS2

Posted Sep 8, 2005 7:10 UTC (Thu) by jwb (guest, #15467)

As the only Lustre user in the universe[1], I certainly wish I sometimes saw Lustre come up in these discussions. Unlike OCFS2, GFS2, and StorNext, Lustre *really* *is* *different*. It's not shared storage! It's distributed storage! This is very different!

[1] This is untrue. Lustre is actually used by a secret cabal of systems admins in governments and shady agencies worldwide who are in control of 92% of the planet's supercomputing power. I'm just the only Lustre user in LWN :)

Merging GFS2

Posted Sep 8, 2005 16:26 UTC (Thu) by bfields (subscriber, #19510)

"As the only Lustre user in the universe[1], I certainly wish I sometimes saw Lustre come up in these discussions."

Talk them into trying to merge their code, and it will....

Merging GFS2

Posted Sep 15, 2005 10:39 UTC (Thu) by pjdc (subscriber, #6906)

Given that their business model is based on selling access to the current version in the guise of a support contract, I can't see why they'd bother.

They did try to get some of their VFS changes merged a while back - I seem to recall that some stuff that other in-tree filesystems were able to use was accepted (intents?), but nothing Lustre-specific.

Merging GFS2

Posted Sep 8, 2005 16:55 UTC (Thu) by daniel (subscriber, #3181)

"Lustre *really* *is* *different*. It's not shared storage! It's distributed storage! This is very different!"

Ahem:

http://sourceware.org/cluster/ddraid/

and redundant too.

Regards,

Daniel

Merging GFS2

Posted Sep 9, 2005 8:15 UTC (Fri) by Jerker (guest, #4582)

Gfarm Grid File System:
http://datafarm.apgrid.org/software/

Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds