Posted Nov 15, 2007 15:05 UTC (Thu) by dion (subscriber, #2764)
Parent article: The Ceph filesystem
This sounds a lot like Google FS, to the point where I'm sort of missing a comparison.
Could it be that the Ceph author read the GFS paper <http://labs.google.com/papers/gfs.html>
and reimplemented the ideas?
... not that that is a bad thing; many great pieces of code have been written as implementations
of other people's ideas, sometimes better than the original.
Posted Nov 15, 2007 17:44 UTC (Thu) by i3839 (guest, #31386)
Probably not. I was thinking about a distributed filesystem too, and what I came up with
resembles Ceph quite a lot. That was a few years ago; I never had the time or urge to implement
it. The point being, most design decisions are rather obvious once you set out the requirements.
Google fs?
Posted Nov 15, 2007 18:12 UTC (Thu) by dion (subscriber, #2764)
I wasn't trying to dismiss Ceph as being uncreative in any way; I was just missing the
comparison with Google FS.
Google fs?
Posted Nov 15, 2007 18:34 UTC (Thu) by sayler (guest, #3164)
After reading the OSDI paper
(http://www.usenix.org/events/osdi06/tech/full_papers/weil...), my impression is that Ceph tries
to solve a more general problem than GFS.
Note that this makes neither GFS nor Ceph better than the other.
Like many solutions to problems in distributed computing these days, GFS optimizes for
specific workloads (the Ceph authors claim "Similarly, the Google File System is optimized
for very large files and a workload consisting largely of reads and file appends." -- check
out section 8 of the OSDI paper for more comparisons).
In general, Ceph is a much more conservative approach with respect to the file system interface. The claim is
that the FS exported by Ceph is general purpose, exposes POSIX file semantics, and performs
well across a wider variety of workloads. It will be hard to say whether this is *true*
(whatever that means) until it is used much more widely.
I made some comments on the Kernel Trap discussion of Ceph, but I'll repeat the high point
here: it's cool to see research software GPL'd and targeted toward a general audience.
There are quite a few interesting local and distributed filesystem projects going on right now
(meaning, ones that are attempting to be more than research vehicles). I look forward to
seeing btrfs, Hadoop's cluster FS, Ceph, and other projects which I have no doubt forgotten
about. :)
Google fs?
Posted Nov 15, 2007 20:02 UTC (Thu) by zooko (subscriber, #2589)
There's my project, Tahoe (http://allmydata.org). It is an open source distributed
filesystem with very nice security properties -- everything is encrypted and integrity-checked
and you can share or withhold access to any subtree of the filesystem with anyone. It uses
erasure coding so that you can choose any M between 1 and 256 and any K between 1 and M such
that the file gets spread out onto M servers, where the availability of any K of the servers
is sufficient to make the file available to you.
Regards,
Zooko
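To make the "any K of M" property concrete, here is a minimal sketch of K-of-M erasure coding by polynomial interpolation. It is not Tahoe's or zfec's code: the names `encode`, `decode` and `_interpolate` are hypothetical, and it assumes arithmetic modulo the prime 257 for simplicity, so a parity symbol can take the value 256 and not fit in a byte (production libraries usually avoid this by working in a field with exactly 256 elements).

```python
# Illustrative K-of-M erasure code (assumption: prime field mod 257).
# Requires Python 3.8+ for pow(x, -1, P).
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through all (xi, yi) in points, working modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, k, m):
    """Spread `data` (bytes, length a multiple of k) over m shares.
    Shares 0..k-1 hold the data itself; shares k..m-1 hold parity symbols."""
    assert 1 <= k <= m <= 256
    assert len(data) % k == 0
    shares = [[] for _ in range(m)]
    for off in range(0, len(data), k):
        # Treat each stripe of k bytes as the points (0, d0) .. (k-1, d_{k-1}).
        stripe = list(enumerate(data[off:off + k]))
        for s in range(m):
            shares[s].append(_interpolate(stripe, s))
    return shares


def decode(available, k):
    """Rebuild the data from a dict {share_index: share} with >= k entries."""
    chosen = sorted(available.items())[:k]
    assert len(chosen) == k, "need at least k shares"
    out = bytearray()
    for pos in range(len(chosen[0][1])):
        points = [(idx, share[pos]) for idx, share in chosen]
        # Re-evaluate the stripe polynomial at 0..k-1 to get the data bytes back.
        out.extend(_interpolate(points, x) for x in range(k))
    return bytes(out)
```

For example, with `shares = encode(b"tahoe-fs!", 3, 7)`, calling `decode({1: shares[1], 3: shares[3], 6: shares[6]}, 3)` rebuilds the original nine bytes from three of the seven shares. Each share is 1/K the size of the file, so the total storage overhead is M/K.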
Google fs?
Posted Nov 15, 2007 23:31 UTC (Thu) by sayler (guest, #3164)
yay! erasure coding!
Google fs?
Posted Nov 16, 2007 3:40 UTC (Fri) by zooko (subscriber, #2589)
Yeah! Also we've separately published the erasure coding library, which is derivative of
Luigi Rizzo's old library, but to which we added a Python interface, a command-line interface,
performance optimization, and other stuff.
http://pypi.python.org/pypi/zfec