Hashes and collisions

Posted Nov 10, 2003 16:49 UTC (Mon) by dd9jn (subscriber, #4459)
In reply to: Hashes and collisions by nowster
Parent article: An attempt to backdoor the kernel

You are mixing up simple hash functions (e.g. CRC-32) with cryptographic hash functions (e.g. SHA-1). The latter have a couple of important design criteria, and thus you won't ever see a duplicated hash value. If you ever find, or can construct, a second, different file hashing to the same value, you have broken that hash function, with huge consequences for nearly all cryptographic protocols. Even for the old MD5 algorithm, no collision has ever been found (the often-reported weakness found by Hans Dobbertin is on a reduced MD5 variant). Using MD5 or SHA-1 is a perfectly good choice to identify a text.


Hashes and collisions

Posted Nov 13, 2003 10:47 UTC (Thu) by rjw (guest, #10415)

Erm.

A hash function produces a small fixed number of bits from a large, potentially limitless number of bits.

Just think for one second about that.
Let's take a file on a crappy 32-bit operating system as a limit. And hash it to 160 bits.

How the hell do you think that every one of 2^(2^32) combinations can map to 2^160 combinations uniquely?

Cryptographic hash functions merely make it computationally hard to find a match, and even harder to find a match which contains what you want it to.
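
The counting argument above can be tried out on a small scale. What follows is a minimal Python sketch, not part of the original comment: it cuts SHA-1 down to its top 16 bits (an assumption made purely so the search finishes instantly) and hashes distinct inputs until two of them share a value. The helper names truncated_sha1 and find_collision are hypothetical. With only 2^16 possible outputs, a collision is guaranteed after at most 2^16 + 1 distinct inputs, and in practice shows up far sooner; at the full 160 bits the same loop would still terminate in principle, just not within any realistic amount of time or memory.

```python
import hashlib
from itertools import count


def truncated_sha1(data: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-1(data) as an integer (1 <= bits <= 160)."""
    digest = int.from_bytes(hashlib.sha1(data).digest(), "big")
    return digest >> (160 - bits)


def find_collision(bits: int):
    """Hash distinct inputs until two of them share a truncated hash value.

    Returns (first_message, second_message, shared_hash_value).
    The pigeonhole principle guarantees this returns after at most
    2**bits + 1 distinct inputs.
    """
    seen = {}  # truncated hash value -> first message that produced it
    for i in count():
        msg = str(i).encode()  # distinct input for each i
        h = truncated_sha1(msg, bits)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg


if __name__ == "__main__":
    a, b, h = find_collision(16)
    print(f"{a!r} and {b!r} both hash to {h:#06x} under 16-bit truncated SHA-1")
```

Raising the bits argument makes the search slower, roughly by a factor of two for every two extra bits, which is the "computationally hard" part: the collisions are always there, they just get too expensive to find.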

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds