Lack of data integrity checks

Posted Aug 23, 2007 19:22 UTC (Thu) by alex (subscriber, #1355)
In reply to: Lack of data integrity checks by brouhaha
Parent article: Distributed storage

There was a very interesting talk given by a friend of mine from Google about the sort of failures they experience. One example was a data corruption event that wasn't caught by either the TCP checksums or the filesystem's own internal checksums.

You don't protect your data with just one number....

Lack of data integrity checks

Posted Aug 24, 2007 8:46 UTC (Fri) by intgr (subscriber, #39733) [Link]

One number is fine if it is long enough; relying on a 32-bit checksum is naive indeed.
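To put rough numbers on "long enough", here is a back-of-the-envelope sketch. It assumes an n-bit check behaves like a uniformly random function of the corrupted data, so a random corruption slips through with probability about 2^-n. That is an idealization (real checksums and CRCs have structure), and the figure of one billion corrupted packets is purely hypothetical.

```python
def undetected_fraction(bits: int) -> float:
    """Chance that a random corruption passes an idealized n-bit check."""
    return 2.0 ** -bits


def expected_undetected(corrupted_packets: float, bits: int) -> float:
    """Expected number of corruptions that slip through unnoticed."""
    return corrupted_packets * undetected_fraction(bits)


if __name__ == "__main__":
    # Hypothetical workload: one billion corrupted packets reach the check.
    for bits in (16, 32, 128):
        print(bits, expected_undetected(1e9, bits))
```

Under that assumption a 16-bit check lets thousands of those corruptions through, a 32-bit one lets through a fraction of one on average, and a 128-bit one effectively none.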

The TCP MD5 signature option (RFC 2385) in Linux kernels might be useful, but because it is not offloaded to the networking hardware, it's too slow for Ethernet faster than 100 Mbit/s. Using a faster checksum function at the application layer sounds like a more practical idea.

Lack of data integrity checks

Posted Aug 24, 2007 16:30 UTC (Fri) by brouhaha (subscriber, #1698) [Link]

The issue isn't whether a 32-bit CRC is good enough to protect a packet. For maximum-length normal Ethernet frames, I would claim that it is good enough. We're trying to detect errors here, not to make it secure against deliberate alteration. If you need to protect against an adversary that may introduce deliberate alterations in your data, you need cryptography.

The issue for error detection is that the Ethernet FCS only applies for one hop of a route, and gets recomputed by each router along the way. Thus it does not offer end-to-end protection. The packet will have opportunities to be corrupted between hops, and the node that the packet finally arrives at can only trust the FCS to mean that it wasn't corrupted on the wire since leaving the last router.

A UDP checksum is both better and worse. It's better in that it is end-to-end, but it's far worse in that a 16-bit checksum is very weak in its error detection probability compared to a 32-bit CRC. Part of the weakness is the 16-bit size, but part of it is the nature of a checksum.
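A small sketch of what "the nature of a checksum" means in practice. The function below is only the 16-bit one's-complement summing step in the style of RFC 1071; a real UDP checksum also covers a pseudo-header, which is left out here. The sample bytes are arbitrary. Because addition is commutative, swapping two 16-bit words leaves the sum unchanged, while a CRC-32 (here zlib's, which uses the same polynomial as the Ethernet FCS) notices the change.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement sum of 16-bit words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\xab\xcd\x00\x01\xff\xfe"
# Corruption that swaps the first two 16-bit words.
swapped = original[2:4] + original[0:2] + original[4:]

if __name__ == "__main__":
    # True: the sum does not depend on word order.
    print(internet_checksum(original) == internet_checksum(swapped))
    # False: the CRC changes when the words are reordered.
    print(zlib.crc32(original) == zlib.crc32(swapped))
```

The same blind spot applies to compensating errors, for example one word increased by the amount another is decreased.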

I'm not arguing that the integrity checking should be done at the application layer. Although there are certainly applications that should do that, what I'm arguing for is that the remote block device client and server code need to do end-to-end error checking at their own level in the protocol stack.
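To make the idea concrete, here is a minimal sketch of a check done at the block-transfer level. Everything about the framing is hypothetical (the header layout, the names `pack_block` and `unpack_block`, and the choice of CRC-32); it is not the wire format of any actual remote block device implementation. The point is only that the sender computes the check over the sector number and payload before anything touches the network stack, and the receiver verifies it after everything else is done.

```python
import struct
import zlib
from typing import Tuple

# Hypothetical header: 64-bit sector number, 32-bit length, 32-bit CRC.
HEADER = struct.Struct("!QII")


def _block_crc(sector: int, payload: bytes) -> int:
    # Cover the sector number too, so a block delivered to the wrong
    # location also fails the check.
    return zlib.crc32(struct.pack("!Q", sector) + payload)


def pack_block(sector: int, payload: bytes) -> bytes:
    """Sender side: prepend the header carrying the end-to-end check."""
    return HEADER.pack(sector, len(payload), _block_crc(sector, payload)) + payload


def unpack_block(frame: bytes) -> Tuple[int, bytes]:
    """Receiver side: verify the check before accepting the block."""
    sector, length, crc = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated block")
    if _block_crc(sector, payload) != crc:
        raise ValueError("block failed end-to-end check")
    return sector, payload
```

A receiver that gets a `ValueError` here would typically ask for the block again rather than write it out. A stronger hash could replace the CRC-32 if a 32-bit check is considered too short.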

Lack of data integrity checks

Posted Aug 25, 2007 19:29 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

I'm unclear on what corruptions you're concerned about. When people say "end to end," they're pointing to the fact that something could get corrupted before or after some other integrity check is done. Are you saying there's a significant risk that the data gets corrupted inside a router (outside of Ethernet integrity checks) or inside the client or server network stack (outside of UDP integrity checks)? Are we talking about OS bugs?

Just wondering, because while all kinds of failures are possible, it wouldn't make sense to protect against some risk that we routinely accept in other areas.

You also mention the UDP checksum as simply being too weak. If that's the problem, then I would just refer to "additional integrity checks" rather than emphasize "end to end."

Lack of data integrity checks

Posted Aug 24, 2007 21:01 UTC (Fri) by zdzichu (subscriber, #17118) [Link]

That's what IPsec is for. AH or ESP without encryption (hash only) will catch errors missed by TCP/UDP checksums.
