Ceph distributed filesystem merged for 2.6.34
Posted Mar 27, 2010 1:32 UTC (Sat) by martinfick
In reply to: Ceph distributed filesystem merged for 2.6.34
Parent article: Ceph distributed filesystem merged for 2.6.34
POSIX semantics, especially atime and the serialization constraints, tend to consume even more of the scarce network bandwidth.
Methinks you are confused. Those work better with (require is too strong a word) low-latency links, not high-bandwidth ones.
POSIX semantics are useful in local filesystems because programs rely on them for IPC. Distributed filesystems are rarely used for IPC-- or if they are, the semantics of the FS are customized ahead of time to work well for that, like in Hadoop's filesystem (HDFS), Google-FS, or MPI-IO.
I think you are stuck and cannot think beyond what is done today. The point is not to design something new to use a distributed POSIX filesystem, but to take currently working applications and entire virtual machine operating system images and simply run them over a distributed FS instead of locally, so that you get all the benefits of a distributed FS (yes, that also means you get the downsides as well, such as higher latency...). But in many cases the gains will simply outweigh the downsides. Not everything needs supercomputer-class low-latency interconnects, but almost everything can benefit from HA and larger single-namespace filesystems.
Just because things were slow yesterday does not mean they will be slow tomorrow. What is commonplace today for supercomputers will be commonplace tomorrow on desktops. I would even argue that distributed FSes should have been common yesterday; we are way behind the curve on this one. Fast networking gear is getting cheaper and cheaper and can already outpace disk bandwidth. As hard disks grow, so much space is wasted, but even worse, the failure rates are going up and nothing is addressing this. The average desktop has way too big a drive, but rarely uses RAID since that requires a second drive. Why should we stick with NFS when it a) is not POSIX, b) is not HA or distributed, and c) does not scale disk space easily enough?