may not be as bad as it appears

Posted Aug 18, 2004 18:02 UTC (Wed) by ajax (guest, #7251)
Parent article: Crypto researchers abuzz over flaws (News.com)

Generally speaking, an alternate file that produces the same MD5
checksum will look like gibberish rather than English, a
C program, or whatever. So, for example, if one
downloads what one thinks is the Apache source, the checksums
match, and the source looks like Apache, then one can have
confidence that it is the unmodified Apache, even if MD5
proves to be flawed.



may not be as bad as it appears

Posted Aug 18, 2004 18:24 UTC (Wed) by AnswerGuy (guest, #1256)

Don't count on it.

While you may be right, and *most* checksum collisions would be obvious and visible (for text files) or non-functional (for executables), you're not accounting for the number of ways in which MD5 and other checksums are embedded into automated processes and protocols.

Nothing is "obviously wrong" to other software; that's a human value judgement.

Also, someone might be able to pad out a file with non-obvious text or no-op code while generating a deliberate hash collision. For example, one might append spaces and tabs to the end of the (compromised/forged) document or to the end of each line.
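To illustrate the padding idea above, here is a minimal Python sketch. It does not find a collision; it only shows that trailing spaces or tabs leave a document looking the same on screen while every variant gets a different MD5 digest, which is the freedom an attacker would search over. The sample text and the helper name `whitespace_variants` are made up for the example.

```python
import hashlib
from itertools import product


def whitespace_variants(text):
    """Yield every copy of text with one space or one tab appended to each line."""
    lines = text.splitlines()
    for padding in product(" \t", repeat=len(lines)):
        yield "\n".join(line + pad for line, pad in zip(lines, padding)) + "\n"


# Hypothetical three-line document; any text would do.
document = "Pay Alice 10 dollars.\nSigned, Bob\nEnd of message\n"

variants = list(whitespace_variants(document))
digests = {hashlib.md5(v.encode()).hexdigest() for v in variants}

# n lines give 2**n variants that look identical when displayed.
print(len(variants), "variants,", len(digests), "distinct MD5 digests")
```

Each line contributes one bit of choice here, so a long file offers a very large number of candidates without any visible change.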

We can intuit that double checksums (using MD5 *and* SHA-1) would make the attacker's job more difficult. But this intuition could be flawed, and it would require some qualified research to assess before it could be recommended for real business use.
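As a sketch of what "double checksums" could look like in practice, the Python below computes MD5 and SHA-1 in a single pass over a file and accepts the file only if both match. The function names and the idea that both expected digests are published as hex strings are assumptions for the example; whether the combination really adds strength is exactly the open question raised above.

```python
import hashlib


def md5_and_sha1(path, chunk_size=65536):
    """Return (md5_hex, sha1_hex) for the file at path, reading it once."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def matches_both(path, expected_md5, expected_sha1):
    """True only if both published digests match the file's contents."""
    actual_md5, actual_sha1 = md5_and_sha1(path)
    return (actual_md5 == expected_md5.strip().lower()
            and actual_sha1 == expected_sha1.strip().lower())
```

A forged file would then have to collide under both functions at once, though that alone says nothing about how much harder the attack becomes.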

I would say that the current weaknesses (theoretical and practical) are still overshadowed by more pragmatic concerns.

Most of us should continue to focus on key/signature management issues (how do we verify any given checksum, how do we verify the provenance of the checksumming utility) and social issues (did the user who generated the checksum use a compromised copy of the tool).

Meanwhile, there are still niches for researchers to continue developing more secure (collision-resistant) hashes, to continue trying to break the existing and newly proposed algorithms, and to quantify the security or weakness of existing hashes, proposed hashes, and combinations of hashes.

JimD

may not be as bad as it appears

Posted Aug 19, 2004 14:46 UTC (Thu) by beejaybee (guest, #1581)

Better still, for open source software, distribute checksums for object code (as compiled by a specific compiler on a specific hardware platform, which may not be the same as the platform the user of distributed source code intends to compile for, nor the same version of the compiler the user intends to use in a production environment) _in addition to_ checksums for the source code.

The point here is that even when it becomes economic to construct a fraudulent source file with a specific checksum, having the checksum of the object matching as well is at least several, possibly many, orders of magnitude more difficult. Downloading the extra checksums is a very marginal cost; whilst, if a very common compiler / hardware platform is chosen, finding a suitable system to run the integrity check on should not be too difficult.

So here's a security plus for OSS. Closed source (binary distribution) software products simply can't compete.

may not be as bad as it appears

Posted Aug 18, 2004 18:47 UTC (Wed) by ncm (subscriber, #165)

In principle it only takes 20 bytes of carefully-chosen garbage added to any text to give a chosen SHA-1 signature, once you've broken it.

It's often pretty easy to find a place to put that much extra stuff, buried in an ELF section or debug annotation of an executable, in extra compression table entries of a tarball, even in text of a diff that you know patch will skip, and that a human would ignore knowing that patch skips it--e.g. in a .sig.

If it must be base64, you need 30 bytes instead, or 40 bytes of hex, or 70 decimal digits. 14 common lower-case four-letter English words suffice.
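For a rough sense of where such figures come from, the Python sketch below computes a pure counting bound: the smallest number of symbols whose possible combinations number at least 2^160, the size of the SHA-1 output space. This is only a lower bound under that counting assumption, not a statement about any real attack; the figures quoted above sit at or above it, presumably leaving some margin. The 2800-word list size is a hypothetical value chosen for the example.

```python
def min_symbols(alphabet_size, digest_bits=160):
    """Smallest n with alphabet_size**n >= 2**digest_bits, using exact integers."""
    target = 2 ** digest_bits
    n, space = 0, 1
    while space < target:
        space *= alphabet_size
        n += 1
    return n


encodings = [
    ("raw bytes", 256),
    ("base64 characters", 64),
    ("hex digits", 16),
    ("decimal digits", 10),
    ("words from a 2800-word list (hypothetical size)", 2800),
]

for name, size in encodings:
    print(f"{name}: at least {min_symbols(size)}")
```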

may not be as bad as it appears

Posted Aug 18, 2004 20:32 UTC (Wed) by smoogen (subscriber, #97)

Don't you mean SHA-0 rather than SHA-1?

may not be as bad as it appears

Posted Aug 19, 2004 7:24 UTC (Thu) by ekj (subscriber, #1524)

That is not necessarily so.

It depends on the details of the flaw. If the attack depends on custom-crafting the entire input, or worse yet, both inputs, to find a collision, then you are correct.

But it's possible to change only 20 bytes in a file and make the sha1sum equal. That little "garbage" could easily fit in, say, a comment in C code or an unused static variable in a binary program. The trick is, of course, how to select those 20 bytes.

With a good (as in cryptographically strong) hash there's no better way to do that than simply randomly try different garbage-strings until you find one that matches. That is impractical for a hash of sufficient size.
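The brute-force search described above can be sketched in Python against a deliberately weakened target: only the first two bytes of the SHA-1 digest are compared, so a match turns up after roughly 2^16 tries on average. The sample C sources, the counter-in-a-comment scheme, and the function names are assumptions made for the illustration; nothing here exploits a flaw in SHA-1.

```python
import hashlib
from itertools import count


def truncated_sha1(data, n_bytes):
    """First n_bytes of the SHA-1 digest of data."""
    return hashlib.sha1(data).digest()[:n_bytes]


def forge_by_brute_force(original, tampered, n_bytes=2):
    """Append a C comment holding a counter to tampered until the
    truncated digests of the result and of original are equal."""
    target = truncated_sha1(original, n_bytes)
    for attempt in count():
        candidate = tampered + b"/* %d */\n" % attempt
        if truncated_sha1(candidate, n_bytes) == target:
            return candidate, attempt + 1


# Hypothetical "good" and "tampered" sources.
original = b"int main(void) { return 0; }\n"
tampered = b"int main(void) { return 1; }\n"

forged, tries = forge_by_brute_force(original, tampered)
print("matched the first 2 digest bytes after", tries, "tries")
```

Every additional digest byte that must match multiplies the expected work by 256, which is why the same loop is hopeless against all 20 bytes of an unbroken hash.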

With a broken hash, all bets are off. It depends on the details.
