The Global File System goes full circle

June 30, 2004

This article was contributed by Joe 'Zonker' Brockmeier.

In 2003, Red Hat announced that it was acquiring Sistina, and that it would work to release Sistina's current technologies as open source in 2004. Red Hat made good on that promise on June 24 by re-releasing the Global File System under the GPL. The Global File System (GFS) has a fairly long and interesting history. According to the OpenGFS website, the GFS project started at the University of Minnesota and was sponsored by the University from 1995 to 2000. Then Matthew O'Keefe, a professor at the university, founded Sistina around GFS.

Sistina stopped making new versions of GFS available under the GPL in 2001. It's important to note that it's inaccurate to say (as many have) that GFS has been "re-released" under the GPL -- the original code that was available under the GPL remained available under the GPL. Sistina simply quit putting out new releases under the GPL, but users still had the option of using and working with releases prior to Sistina's license change, as did the OpenGFS project.

The release put out by Red Hat last week actually consists of more than just GFS the file system; it totals nine components in all. In addition to GFS itself, Red Hat has released the clustering extensions to the Logical Volume Manager 2 (LVM2). Red Hat has also released clustering infrastructure tools and cluster block devices that work with GFS: the Cluster Configuration System (CCS), Cluster Manager (CMAN), Distributed Lock Manager (DLM), GFS Unified Lock Manager (GULM), the Fence I/O fencing system, the Global Network Block Device (GNBD) and the Cluster Snapshot Block Device (CSBD).

Linux has no shortage of filesystems to choose from, but GFS is quite a bit different from Ext3, ReiserFS and other popular file systems being used with Linux today. The GFS release probably isn't that interesting for users with a single Linux workstation or for small installations of Linux systems that don't require a great deal of filesystem sharing or redundancy. For Linux shops that have deployed or plan to deploy Linux in a clustering capacity, or that use a Storage Area Network (SAN) to share filesystems among servers, however, GFS is a very interesting technology.

GFS allows Linux servers to share a single file system on a block device via fiber channel, iSCSI, NBD or other technology. Those servers can read from the file system simultaneously, while GFS coordinates writes to the filesystem so that data is not overwritten. Changes to the filesystem made by one server are immediately available to other servers. GFS is different from the Network File System (NFS) in that it removes the requirement for clients to access storage devices through an NFS server. That removes some of the overhead from working with data, making GFS more robust. One can use the two technologies in conjunction with one another, using GFS to give a set of servers access to a filesystem stored on a set of fiber channel drives (for example) and then exporting the filesystem to clients via NFS.
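
As a rough illustration of what "coordinating writes" means in a shared-storage cluster, the Python sketch below models a lock manager that lets any number of nodes hold a shared (read) lock on a resource but grants an exclusive (write) lock to only one node at a time. This is a toy model under stated assumptions, not the interface of the DLM or GULM; the class name, method names, node names and the "block-7" resource label are all hypothetical.

```python
from collections import defaultdict


class ToyLockManager:
    """Hypothetical illustration only; not the DLM or GULM interface."""

    def __init__(self):
        # resource -> set of nodes currently holding a shared (read) lock
        self.readers = defaultdict(set)
        # resource -> the single node holding the exclusive (write) lock
        self.writer = {}

    def acquire(self, node, resource, exclusive=False):
        """Return True if the lock is granted, False if the node must wait."""
        holder = self.writer.get(resource)
        if holder is not None and holder != node:
            # Another node is writing: neither reads nor writes are granted.
            return False
        if exclusive:
            # A write lock is only granted when no other node is reading.
            if self.readers[resource] - {node}:
                return False
            self.writer[resource] = node
            return True
        self.readers[resource].add(node)
        return True

    def release(self, node, resource):
        """Drop whatever locks the node holds on the resource."""
        self.readers[resource].discard(node)
        if self.writer.get(resource) == node:
            del self.writer[resource]


if __name__ == "__main__":
    lm = ToyLockManager()
    print(lm.acquire("node1", "block-7"))                  # shared read
    print(lm.acquire("node2", "block-7"))                  # second reader
    print(lm.acquire("node1", "block-7", exclusive=True))  # blocked by node2
    lm.release("node2", "block-7")
    print(lm.acquire("node1", "block-7", exclusive=True))  # now granted
```

A real cluster lock manager also has to deal with node failure, fencing and lock recovery, none of which this sketch attempts.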

GFS is highly scalable, which means that hundreds of systems can share a filesystem on a SAN. In addition, as one might expect, file system and volume resizes can be performed while the system is running -- which means that enterprise systems don't need to be brought down for filesystem maintenance when a deployment starts to require more space. The file servers themselves can be clustered to provide high availability, redundancy and increased performance. Just what the doctor ordered for a database cluster, enterprise file servers, large e-mail installations and many other applications.

For those interested in trying out GFS, source RPMs are available for Red Hat Enterprise Linux 3, CVS snapshots are available, and enterprising Fedora user Lennert Buytenhek has already whipped up FC2 RPMs of GFS and the necessary tools. Packages are no doubt being prepared for other popular Linux distributions as well. Instructions on using GFS can be found here.

Of course, RHEL users still have the option of buying GFS for a mere $2200.

The GFS team is now working to put GFS into the mainline Linux kernel. It shouldn't be terribly difficult for a project this useful to find a healthy community of users to apply whatever elbow grease is necessary to make that happen.


Not so sure about that scalability

Posted Jul 1, 2004 0:59 UTC (Thu) by roblatham (subscriber, #1579)

You make the claim "GFS is highly scalable, which means that hundreds of systems can share a filesystem on a SAN", and I'm curious where you got that information.

I have not seen or heard anyone report deploying GFS with more than 32 nodes. The jazz cluster at Argonne National Laboratory, for example, has 8 NFS servers export a GFS file system, but that is only 8 nodes accessing GFS at any time.

While you no doubt *could* have 100s of systems sharing a GFS file system, the expensive SAN infrastructure you would require would make that a very rare configuration. The file-based locking mechanism makes GFS a poor choice for the large high performance clusters Linux is so famous for.

Kudos to Red Hat for releasing more code into the wild. People desiring a way to share a file system between two high-availability web or email servers will rejoice, no doubt.

Not so sure about that scalability

Posted Jul 1, 2004 7:30 UTC (Thu) by AnswerGuy (guest, #1256)

I would think that GFS and perhaps NFSv4 or AFS could be combined into a SAN/NAS cluster product.

You'd have several nodes acting as file servers (to the network) and using a GFS file system on the SAN that they share.

Not so sure about that scalability

Posted Jul 1, 2004 22:00 UTC (Thu) by seanegan (subscriber, #15672)

I work for a large three letter company with a Red Hat support contract. Red Hat came out to our campus to give a talk about GFS. I learned a couple of things.

First, they said there are current deployments of GFS with more than 32 block device nodes. The limiting factor is the lock manager. They now have load-balancing, redundant lock manager servers. So rather than having a single lock manager server, you now have a small cluster of them.

Second, the locks are per-block, not per-file as you said. And the locks are revocable: if a host with a GFS FS mounted becomes uncommunicative, the lock can be recovered from that host.

Third, you do not need a SAN (I'm thinking you meant Fibre Channel SAN) to use GFS. You can cost-effectively set up a Gb ethernet LAN and use the GNBD server to serve blocks over TCP. I pressed them for an estimate of the speed up compared to NFS. The response was reluctant, but they said one customer measured speedups around 25 times faster than NFS on the same LAN. The presenter continued that it could be slower or faster than that number depending on usage but that it would never be slower than NFS.

The Global File System goes full circle

Posted Jul 1, 2004 14:09 UTC (Thu) by jeremiah (subscriber, #1221)

Does anyone know a good resource for learning about Fibre Channel SAN and NAS configurations? I have one coming up in my future and would like to be a little more prepared than I am now without having to go through some highly overpriced training course. The whole area seems to be veiled in mystery to the average small shop.

Fibre channel

Posted Jul 2, 2004 0:08 UTC (Fri) by giraffedata (subscriber, #1954)

I think it's veiled in mystery to a small shop because it's immersed in high costs that are impenetrable for the small shop. Fibre channel hardware and software are generally thought of as being practical only for high end shops. Think hundreds or thousands of servers and disks.

This is particularly true now that Ethernet SANs (iSCSI) are becoming more real.

There are no fibre channel NAS (where NAS means file-level networking) systems. Not that there couldn't be; people just don't see the point when Ethernet NAS is doing the job.

Sorry, I don't have an answer to the actual question. But I do know that when people demonstrate the high cost of fibre channel, they include the "overpriced training courses" mentioned. They note that it's a lot cheaper to get Ethernet/IP expertise (free, really, since you need that anyway).

Fibre channel

Posted Jul 2, 2004 19:41 UTC (Fri) by jeremiah (subscriber, #1221)

I don't really agree that the costs are impenetrable. You can get dual RAID chassis for 12K plus 14 x $800 for a full set of disks. The controllers for each server are not too bad, and the switches are a couple of grand as well. So you're looking at 30K for a 4 terabyte, high speed, redundant array. Even EMC is in this range. But add the required training for Dell/EMC in order to buy it, and that's an additional 20K, and THAT'S not affordable. It is true that 30K isn't cheap, and any time you need that much storage for high-use systems like a web based image server (court documents in our case, with 150K hits per day between 8am-5pm) you're going to spend a chunk of change, but almost half on training is absurd.

Uh...

Posted Jul 1, 2004 17:29 UTC (Thu) by bjn (guest, #2179)

'it's inaccurate to say (as many have) that GFS has been "re-released" under the GPL'

You mean like this article does, right in the preceding paragraph? ;-)

Uh...

Posted Jul 1, 2004 18:06 UTC (Thu) by corbet (editor, #1)

I must take the blame for that one...the sentence in the earlier paragraph was a last-minute edit on my part. It's not Zonker's fault.

The Global File System goes full circle

Posted Jul 30, 2008 9:44 UTC (Wed) by joey159 (guest, #53174)

A recent discussion on the lkml analyzed the possibility of a Linux implementation of Sun's ZFS.
Licenses involved cover file system _code_, rather than storage format.
That is openly specified. Just stand up and implement a driver for the zfs
format from scratch under whatever license you want. This is exactly how
Linux supports "foreign" file systems.
What are the thoughts of the Linux community?
------------------------------
Joey

http://www.widecircles.ca

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds