Posted Nov 22, 2007 11:34 UTC (Thu) by ringerc (subscriber, #3071)
Parent article: The Ceph filesystem
I've long been frustrated by the lack of a solid, general purpose distributed cluster file
system for Linux. There's a real need even if you *don't* have terabytes of data and hundreds
Right now, there's no good way to distribute things like home directories on the network, even
just between a few Linux servers. You either end up using a central server or set of servers
(failure-prone, harder to expand, a pain to maintain, etc.) or using a cluster FS that relies
on an iSCSI/AoE/FC shared storage backend (expensive, complex). You'd also be lucky to find a
network file system that gives you reliable home directories, with correct handling
of lockfiles, various app-specific databases, etc.
In short, even for a common and simple problem like making sure that users have the same home
directory across several machines, all the existing options seem to stink.
For that matter, even traditional options like using NFS to export homedirs from a central
server have, in my experience, been less than reliable. I'm unimpressed with Linux's NFS
support on both the server and client sides; I've seen too many unkillable processes,
unlinked-but-perpetually-undeleted files, etc., and I've had to *REBOOT* too many servers to
fix NFS issues. And that's without going into the issues with NFS's not-quite-POSIX FS
A similar issue applies for virtualized servers. They need storage somewhere, and unless you
can afford a SAN that storage is going to be server based (whether internal or external, it
doesn't matter). There's always one box you can't bring down for maintenance without bringing
down some/all of your VMs. Being able to provide distributed storage for VMs would be quite
If I were really dreaming I'd want a native client for Mac OS X and for Windows, so that there
was no need to re-share the cluster FS through a gateway server. Experience suggests that such
native file system implementations are rarely solid enough to work live over the network with,
though, and are usually only good for copying files back and forth.
Even without that, just being able to collect the storage across the servers at my company
into a shared, internally redundant pool that would remain accessible if any one server went
down would be ... wonderful.