Lack of data integrity checks
Posted Aug 23, 2007 17:51 UTC (Thu) by brouhaha
Parent article: Distributed storage
There is no data integrity checking built into the DST networking layer; it relies on the networking code to handle that aspect of things.
He's living in a fool's paradise if he thinks that TCP or UDP checksums or the link-level FCS (e.g., Ethernet CRC) are going to be sufficient to guarantee data integrity. I've seen far too many cases where NFS caused data corruption due to the lack of end-to-end checks.
He should define some end-to-end checking, and allow it to be disabled by people who insist on living dangerously.
The checksum/CRC/whatever should be computed over the payload data AND the block identification (device ID, block number), so as to guarantee both that the data has not been corrupted in transit, and that it really is the requested block rather than some other block.
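To make that concrete, here is a minimal sketch of the idea in Python, not anything taken from the DST code. The function names, the header layout (32-bit device ID, 64-bit block number) and the choice of CRC-32 are all assumptions for illustration; a real protocol might well want a stronger checksum.

```python
import struct
import zlib

# Hypothetical block identification layout (an assumption, not DST's wire format):
# 32-bit device ID followed by a 64-bit block number, both big-endian.
_BLOCK_ID = struct.Struct(">IQ")


def block_checksum(device_id: int, block_no: int, payload: bytes) -> int:
    """CRC-32 over the block identification AND the payload."""
    block_id = _BLOCK_ID.pack(device_id, block_no)
    # Seed the payload CRC with the CRC of the identification, so the
    # result is the CRC of the two byte strings concatenated.
    return zlib.crc32(payload, zlib.crc32(block_id))


def verify_block(device_id: int, block_no: int, payload: bytes,
                 checksum: int, enabled: bool = True) -> bool:
    """Check a received block against the block the receiver asked for.

    Passing enabled=False skips the check, for those who insist on
    living dangerously.
    """
    if not enabled:
        return True
    return block_checksum(device_id, block_no, payload) == checksum
```

The sender computes block_checksum() over what it actually read; the receiver calls verify_block() with the device ID and block number it requested. A block that arrives intact but is the wrong block then fails the check just as a corrupted payload would.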