Announcing the first SHA-1 collision
We started by creating a PDF prefix specifically crafted to allow us to generate two documents with arbitrary distinct visual contents, but that would hash to the same SHA-1 digest. In building this theoretical attack in practice we had to overcome some new challenges. We then leveraged Google’s technical expertise and cloud infrastructure to compute the collision which is one of the largest computations ever completed." The SHA-1 era is truly coming to an end, even if most attackers lack access to the computing resources needed for this particular exploit.
Posted Feb 23, 2017 15:26 UTC (Thu)
by xav (guest, #18536)
[Link] (15 responses)
Posted Feb 23, 2017 16:23 UTC (Thu)
by lamawithonel (subscriber, #86149)
[Link] (7 responses)
shattered.io says this: How is GIT affected? GIT strongly relies on SHA-1 for the identification and integrity checking of all file objects and commits. It is essentially possible to create two GIT repositories with the same head commit hash and different contents, say a benign source code and a backdoored one. An attacker could potentially selectively serve either repository to targeted users. This will require attackers to compute their own collision. Adding their two files to a repo in separate commits seemed to work OK, but running
Posted Feb 23, 2017 16:33 UTC (Thu)
by lamawithonel (subscriber, #86149)
[Link]
Posted Feb 23, 2017 16:36 UTC (Thu)
by georgm (subscriber, #19574)
[Link] (1 responses)
If I understand it correctly, you could craft two commits with same SHA-1, but manipulate the tree this commit represents with totally different content.
Posted Feb 23, 2017 16:43 UTC (Thu)
by georgm (subscriber, #19574)
[Link]
1. add a file "foo" with content "test" to a new git repo
2. git cat-file -p HEAD
3. git cat-file -p 831915b1b2a9cc94f7ea7e14f4de9e00950ccc28
4. echo -e "blob 5\0test" | sha1sum
Posted Feb 24, 2017 7:19 UTC (Fri)
by alexl (subscriber, #19068)
[Link] (3 responses)
So, unless you enable transfer.fsckObject or manually run git fsck certain of the attacks described in these threads are already possible.
Posted Feb 24, 2017 18:04 UTC (Fri)
by hkario (subscriber, #94864)
[Link] (1 responses)
Posted Feb 24, 2017 20:54 UTC (Fri)
by flussence (guest, #85566)
[Link]
Posted Feb 25, 2017 13:10 UTC (Sat)
by DigitalBrains (subscriber, #60188)
[Link]
That article links to Debian bug 813157, where the last message seems to claim the contrary if I read it right:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813157#39
IIUC, SHA-1 checksums for objects aren't even transferred with a pull, let alone that they can be falsified.
Posted Feb 23, 2017 16:24 UTC (Thu)
by anselm (subscriber, #2796)
[Link] (5 responses)
With these collision attacks, the attacker comes up with two separate new documents that are constructed such that they have the same hash value. This is a lot easier than finding a new document that hashes to the specific given hash value of another (existing) document.
As far as git is concerned, an attacker would have to make two separate repositories that contain a collision (people would presumably then keep using the “benign” version but the attacker would be able to replace it with the “malicious” version even in the face of later changes). It is much more difficult to use this attack to introduce a collision into a pre-existing repository.
The lesson to take away from this as far as documents are concerned is to never sign anything without first tweaking it a bit (e.g., by adding a few random spaces here and there). That should mess up an attacker who has just spent an inordinate amount of CPU time coming up with another version of the document that has the same hash value as the one they gave you to sign.
Posted Feb 23, 2017 17:18 UTC (Thu)
by georgm (subscriber, #19574)
[Link]
Posted Feb 23, 2017 21:50 UTC (Thu)
by tialaramex (subscriber, #21167)
[Link] (3 responses)
This is intended as defence in depth against attacks in which the certificate applied for is one half of a collision (the other half being a target that the CA would not willingly sign, such as a CA:TRUE certificate or a certificate for a valuable name the attacker doesn't control)
Last time this was done (as proof of concept, not by actual bad guys) for MD5, the serial numbers from at least one major public CA were fairly predictable in sequence, allowing the attack team to conclude they should be able to purchase certificate #643015 in a week's time. So they could use that week to design a document for certificate #643015, produce a collision imposter for that document, and then as the appointed moment approached, "run up the score" by purchasing certificates #643012, #643013, #643014 and then purchase #643015, snip the signature off and apply it to their imposter document.
(A natural reaction from the CAs was that they could just revoke certificate #643015 after detecting this. Unfortunately, revocation in the X.509 PKI design applies to a serial number, so that doesn't revoke the imposter, which has a different serial number ...)
Posted Feb 24, 2017 0:42 UTC (Fri)
by anselm (subscriber, #2796)
[Link] (2 responses)
And this is why we now have SHA-3, which is not susceptible to the “append stuff to a collision and still have a collision” artefact.
Posted Feb 27, 2017 17:36 UTC (Mon)
by nevyn (guest, #33129)
[Link] (1 responses)
Posted Feb 27, 2017 18:34 UTC (Mon)
by excors (subscriber, #95769)
[Link]
With SHA-1, SHA-2, etc, the output of the hash function is a copy of the state. Knowing the hash means you know the state and can do a length extension attack - given H(X), but unknown X, you can trivially compute H(pad(X) || Y) for any Y. (That's quite bad if X is a secret and you're using prepend-X-then-hash as a signature. You need something like HMAC to prevent that attack.)
If you have two different inputs with the same hash, you know they have the same state, so you can append the same blocks onto both inputs and they will continue colliding.
With SHA-3 the state is 1600 bits, and the output is some function that maps the state onto a smaller number of bits. Knowing the hash doesn't let you derive the whole state, so you can't do the length extension attack, and there's no need for HMAC. That's a useful property of SHA-3.
Two different inputs with the same 256-bit hash may not have the same 1600-bit internal state, so they may no longer collide after you append blocks to both of them. But if you construct two different inputs that lead to an identical internal state then you could append blocks and continue colliding, so it's still susceptible to that problem.
Posted Feb 23, 2017 21:25 UTC (Thu)
by joey (guest, #328)
[Link]
Summary: It doesn't immediately break git, because the attack does not allow generating two git commits with the same sha1 that point to two different trees of files. This is a single-prefix attack; if a chosen-prefix attack later becomes available then that would be possible, which would break git rather badly.
This attack may result in some mitigations being added to git. Or even, in my wildest dreams, git getting the hash parameterized so repositories can be upgraded to sha3..
Posted Feb 23, 2017 16:22 UTC (Thu)
by giggls (subscriber, #48434)
[Link] (6 responses)
I seriously doubt that it is very likely (maybe it is even possible to compute this probability) to have matching SHA-1 and MD5 sums.
Posted Feb 23, 2017 17:33 UTC (Thu)
by hkario (subscriber, #94864)
[Link] (1 responses)
IOW, security of MD5+SHA1 signatures is as strong (for practical purpouses) as SHA-1 alone
Posted Feb 24, 2017 9:07 UTC (Fri)
by At0mik (guest, #90781)
[Link]
However, if I understood correctly, the proof constructs on the fact that the 2 hash functions don't have any other attacks than the "birthday paradox".
In my limited understanding of the theory, SHA-1 + MD5, MAY still be as secure as a "completely secure" 160-bit hash function.
Seems like the google sha1 collision can be constructed in 2^50 time, which is much faster than 2^80 of a secure 160bit hash function.
Posted Feb 23, 2017 18:03 UTC (Thu)
by flussence (guest, #85566)
[Link] (3 responses)
Posted Feb 27, 2017 13:33 UTC (Mon)
by alankila (guest, #47141)
[Link] (2 responses)
I'd imagine this would make searching for collisions that much harder, even if the other hash is much weaker, e.g. it may take 2^50 effort to discover a collision with SHA-1, but then it takes additional 2^x effort to make that collide with MD5 as well. I do not know what the literature says for the weakness of MD5 but if we for sake of argument assume that collision can be manufactured in 2^40 effort then together you need to do 2^90 worth of effort to collide the concatenation of SHA-1 and MD5, which can be considered secure again.
Posted Feb 27, 2017 14:02 UTC (Mon)
by anselm (subscriber, #2796)
[Link] (1 responses)
According to Joux, Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions, using several MD5/SHA-1/2-style hash functions at the same time doesn't really buy you much as far as collision resistance is concerned. For example, using a single SHA-256 hash would be much more secure against collisions than concatenating an MD-5 hash and 128 bits of a SHA-1 hash to yield a “combined” 256-bit hash.
In addition, concatenating several hashes makes you much less secure against first-preimage attacks (e.g., hashed-password cracking).
Posted Feb 28, 2017 11:51 UTC (Tue)
by alankila (guest, #47141)
[Link]
However, my point was that the md5(M) || SHA1(M) combination would still be secure against this specific attack against SHA1, because of the necessity of finding collisions at specific position. You'd have to actually run the MD5 collision machine to generate alternatives at some specific location of the message, thus raising the cost back to exponential level.
Again, birthday paradoxes don't matter. The fact that combining md5 with sha1 together doesn't improve birthday paradox attack resistance much is not relevant, it's already hopelessly impractical to attack the hash by birthday paradox construction alone.
Posted Feb 23, 2017 23:22 UTC (Thu)
by joey (guest, #328)
[Link] (2 responses)
An attacker could take a working firmware file, disassemble it enough to add a conditional jump to a backdoor, spend $100k or so to generate two colliding versions of it, and submit the one that does not run the backdoor. Then they can provide a mirror of that repository (perhaps on github, people like to use mirrors of repositories on github for some reason) that contains the other version.
The modified firmware file would be 128 bytes longer, plus the size of the backdoor. So not a lot larger.
Question is, would enough people use the mirror long enough before someone noticed the difference, to make it worth their while?
Posted Feb 24, 2017 1:10 UTC (Fri)
by excors (subscriber, #95769)
[Link]
If they do review it at all, the only thing they could reasonably do is compare against a vendor-supplied version of the firmware (under the necessary assumption that the vendor is trusted), but then they'd notice your collision-friendly non-backdoored version doesn't match the original, and it would be hard to justify why you added random bytes onto the end, so you still wouldn't get any real benefit from your colliding files.
Posted Feb 24, 2017 7:04 UTC (Fri)
by nhippi (subscriber, #34640)
[Link]
Posted Mar 3, 2017 21:16 UTC (Fri)
by mina86 (guest, #68442)
[Link] (1 responses)
Posted Mar 6, 2017 14:44 UTC (Mon)
by robbe (guest, #16131)
[Link]
You conveniently forgot to be specific, but I’ll assume you are talking about hostnames of onion services. Proposition 224 will fix that, but is not making progress at stellar pace. I don’t have insight whether qualified reviewers or implementors are lacking.
I guess anybody could help change that by sponsoring. Most institutional sponsors of Tor have little interest in onion services.
Announcing the first SHA1 collision
Announcing the first SHA1 collision
`git diff HEAD~1`
has some interesting results.
diff --git a/shattered-2.pdf b/shattered-2.pdf
new file mode 100644
index 0000000..b621eec
Binary files /dev/null and b/shattered-2.pdf differ
lines 1-4/4 (END)
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Announcing the first SHA1 collision
tree 831915b1b2a9cc94f7ea7e14f4de9e00950ccc28
...
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4 foo
9daeafb9864cf43055ae93beb0afd6c7d144bfa4
Announcing the first SHA1 collision
See e.g.:
https://groups.google.com/forum/#!msg/binary-transparency...
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Malicilously replacing git objects
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Noise must be early
Noise must be early
Practical collision attacks for MD-style hashes (and SHA-1 is such a hash) revolve around inputs that force the internal state of the hash to have desirable properties, which you can't do if somebody gets to shove gibberish through the hash before you can get there.
Noise must be early
Noise must be early
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Announcing the first SHA1 collision
https://www.iacr.org/cryptodb/archive/2004/CRYPTO/1472/14...
Announcing the first SHA1 collision
In practice, SHA-1 and MD5, are vulnerable because researchers found sophisticated ways to construct a collision faster than the bruteforce.
Announcing the first SHA1 collision
Combining by concatenating would make almost zero difference (2⁶⁴ + 2⁸⁰ ≠ 2¹⁴⁴), and you'd just end up with a longer, slower output than using sha256 in the first place.
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Announcing the first SHA1 collision
And, would it be cheaper to spend $100k on backdooring a lot of different firmware files in obfuscated ways, and rely on the obfuscation to prevent the backdoors being noticed?
Announcing the first SHA1 collision
Announcing the first SHA1 collision
Announcing the first SHA-1 collision
Announcing the first SHA-1 collision