Linus on Git and SHA-1
Posted Feb 26, 2017 16:52 UTC (Sun) by ttelford (guest, #44176)
In reply to: Linus on Git and SHA-1 by welinder
Parent article: Linus on Git and SHA-1
First off: How are you supposed to test against a SHA-1 collision when you don’t have one to test with? There’s no way anybody could have tested the condition until a few weeks ago.
There’s also no real reason to test what the consequences of a collision in git would be, as the result is already known: since the SHA-1 is the pointer to the blob (data) that git is storing, a collision means there are two valid objects that the same pointer would have to point to. No matter what happens, one of the references involved in a collision would point to the wrong content.
I disagree with Linus’s reasoning for one simple reason: git stores more than source code, and when there is a collision, it’ll be tricky to figure out how to recover both versions of a blob with the same sha1, especially with the (ahem) utter cluelessness and total lack of interest a lot of developers have in version control.
A lot of us who support corporate development teams, with a lot of ’new’ developers who would rather pass files around on USB sticks than use version control (like they did in college)… well, we’re in a different boat.
We’re guaranteed that a hash collision will be difficult to detect and painful to fix. (Fortunately, the odds of it happening are worse than those of getting struck by lightning while getting married to Rihanna after winning the lottery.)
That said, it’s ridiculously unlikely that a collision will ever happen by accident.
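To make the "SHA-1 is a pointer" point concrete, here is a minimal sketch in Python. The blob ID calculation (SHA-1 over a "blob <size>" header, a NUL byte, then the content) matches what git does for blobs; the dictionary-based store and the names `git_blob_id`, `objects` and `store` are assumptions made up for illustration and are not git's actual storage code.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    # Git hashes a blob as: b"blob <size>" + NUL byte + content.
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# Toy object store (hypothetical, not git's on-disk format): the ID is the
# only key, so one ID can refer to exactly one piece of content.
objects = {}


def store(data: bytes) -> str:
    oid = git_blob_id(data)
    objects.setdefault(oid, data)  # identical content is stored only once
    return oid


first = store(b"hello\n")
second = store(b"hello\n")   # same bytes, same ID, no new object
other = store(b"goodbye\n")  # different bytes, different ID
print(first == second, first != other, len(objects))
```

Because the ID is the only key, two different contents with the same ID cannot both live in such a store, which is why one of the references would end up pointing at the wrong data.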
Posted Feb 26, 2017 17:12 UTC (Sun)
by welinder (guest, #4699)
It's very easy, as I wrote: "the way you test it in the absence of an actual collision is to patch the hash function to return the hash of file B when it encounters file A."
I.e., you are testing with a modified hash function SHA1' that is close enough to SHA1 that everything works with your existing repository, yet one for which you can easily find a collision because file A and B will have the same hash.
It's like testing upcoming leap seconds. You don't do it by time travel, but by lying selectively to the system. In the case of leap seconds, you lie about the current time. In the case of hash functions you lie about what the hash of a specific file is.
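A rough sketch of that idea in Python, for illustration only. In a real test you would patch git's own hashing code; here `make_colliding_hash`, `ToyStore` and the sample file contents are all hypothetical stand-ins, and the "first object wins" policy of the toy store is an assumption, not a statement about what git does.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    return hashlib.sha1(data).hexdigest()


def make_colliding_hash(file_a: bytes, file_b: bytes):
    """Build SHA1': identical to SHA-1, except file_a hashes like file_b."""
    def sha1_prime(data: bytes) -> str:
        if data == file_a:
            return sha1_hex(file_b)  # the selective lie
        return sha1_hex(data)
    return sha1_prime


class ToyStore:
    """Hypothetical content-addressed store that keeps the first object per ID."""

    def __init__(self, hash_fn):
        self.hash_fn = hash_fn
        self.objects = {}

    def put(self, data: bytes) -> str:
        oid = self.hash_fn(data)
        self.objects.setdefault(oid, data)  # assumption: existing object wins
        return oid

    def get(self, oid: str) -> bytes:
        return self.objects[oid]


FILE_A = b"contents of file A\n"
FILE_B = b"contents of file B\n"

repo = ToyStore(make_colliding_hash(FILE_A, FILE_B))
id_b = repo.put(FILE_B)
id_a = repo.put(FILE_A)         # collides with B under SHA1'
unrelated = repo.put(b"other")  # every other input behaves like plain SHA-1

collided = id_a == id_b
lost_a = repo.get(id_a) != FILE_A  # asking for A silently returns B
print(collided, lost_a)
```

Everything except file A still hashes exactly as before, so an existing repository keeps working, and the test can observe how the code under test behaves when the collision shows up.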
Posted Feb 27, 2017 14:50 UTC (Mon)
by bronson (subscriber, #4806)
Lots of links here: http://stackoverflow.com/questions/9392365/how-would-git-...
There are many experiments and discussions on the mailing lists too.
Posted Feb 27, 2017 14:06 UTC (Mon)
by Sesse (subscriber, #53779)
SHA-1 collisions, at least the current ones, tend to induce very specific internal states in the hash function. There are methods to detect such states with very high probability (counter-cryptanalysis).
Of course, eventually we'll see counter-counter-cryptography in the form of other methods of generating collisions… so it's a band-aid.