Posted Apr 26, 2007 22:03 UTC (Thu) by pimlott
Parent article: Filesystems: chunkfs and reiser4
The core idea is to take a filesystem and split it into several independent filesystems, each of which maintains its own clean/dirty state. Should things go wrong, only those sub-filesystems which were active at the time of failure need to be checked.
I'm not a filesystems hacker, but I think this article misses the real point of chunkfs. After all, the central problem is that you don't know when "things go wrong". Corruption occurs unpredictably, for any number of reasons, so you have to assume it can happen anywhere, at any time. The exciting use of the dirty bit, as I understand it, is the ability to fsck, on-line, those chunks that are not presently dirty. Granted, you can also fsck the dirty chunks after a system crash, but for modern filesystems that just requires journal replay, which is fast anyway. Though I suppose if a chunk is so active that it never gets fscked on-line, you would want to full-fsck it whenever the filesystem is off-line.
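To make the idea concrete, here is a minimal sketch of per-chunk dirty state driving an on-line check. The names and structure are mine, not chunkfs's actual design; a real filesystem would clear the bit only once the chunk's metadata is known-consistent on disk.

```python
class Chunk:
    """Toy stand-in for one sub-filesystem with its own dirty bit."""

    def __init__(self, name):
        self.name = name
        self.dirty = False  # set while the chunk has in-flight writes

    def begin_write(self):
        self.dirty = True

    def end_write(self):
        # A real implementation clears this only after the chunk's
        # metadata is safely consistent on disk.
        self.dirty = False


def online_fsck(chunks):
    """Check only the chunks that are clean right now; skip active ones."""
    return [c.name for c in chunks if not c.dirty]


chunks = [Chunk("c0"), Chunk("c1"), Chunk("c2")]
chunks[1].begin_write()       # c1 is active, so it gets skipped
print(online_fsck(chunks))    # → ['c0', 'c2']
```

The point of the sketch is just that the dirty bit lets the checker run concurrently with normal operation, touching only quiescent chunks.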
The next step is writing data with checksums or even error-correcting codes. But the real solution, for those who are serious about data integrity, is end-to-end checksums or ECC; i.e., assuring the integrity of the data from the moment it is created to the moment it is consumed.
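The end-to-end idea can be sketched in a few lines: the checksum is computed when the data is created and verified only when it is consumed, so corruption introduced anywhere in between (filesystem, disk, transport) is caught. This is an illustrative sketch using CRC32, not any particular filesystem's mechanism.

```python
import zlib


def create(data: bytes):
    """Producer computes the checksum at the moment the data is created."""
    return data, zlib.crc32(data)


def consume(data: bytes, checksum: int) -> bytes:
    """Consumer verifies at the moment of use; any intermediate layer
    that corrupted the bytes is caught here, not trusted implicitly."""
    if zlib.crc32(data) != checksum:
        raise ValueError("corruption detected between creation and use")
    return data


payload, crc = create(b"important record")
assert consume(payload, crc) == b"important record"

corrupted = b"importont record"   # simulate a flipped byte in storage
try:
    consume(corrupted, crc)
except ValueError:
    pass  # the end-to-end check catches what the filesystem alone might miss
```

The design point is that no layer in the middle needs to be trusted: only the endpoints compute and verify.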