
Why not just use the SHA1 only?


Posted Nov 18, 2008 14:41 UTC (Tue) by nevets (subscriber, #11875)
In reply to: Why not just use the SHA1 only? by scarabaeus
Parent article: /dev/ksm: dynamic memory sharing

Actually, my comment was a little facetious. My point was not to propose a fix for this algorithm, but to comment on what git is doing. A git repository relies on there being absolutely no hash collisions. If one ever happens, the database is corrupted. I keep hearing that the chances of this happening are astronomically low, but the fact that it can happen at all bothers me.



Why not just use the SHA1 only?

Posted Nov 18, 2008 16:07 UTC (Tue) by nix (subscriber, #2304)

Jean-Luc Herren did the maths recently on the git list, in
<48E4ABC0.80100@gmx.ch>:

In case it's interesting to someone, I once calculated (and wrote
down) the math for the following scenario:

- 10 billion humans are programming
- They *each* produce 5000 git objects every day
- They all push to the same huge repository
- They keep this up for 50 years

With those highly exaggerated assumptions, the probability of
getting a hash collision in that huge git object database is
6e-13. Provided I got the math right.

So, mathematically speaking you have to say "yes, it *is*
possible". But math aside it's perfectly correct to say "no, it
won't happen, ever". (Speaking about the *accidental* case.)
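The quoted message does not show its derivation, so the following is only a rough sketch of how such an estimate could be reproduced. It assumes the standard birthday-bound approximation, a 160-bit SHA-1 output that behaves like a uniform random function, and 365-day years; the function and variable names are made up for illustration.

```python
import math

SHA1_BITS = 160  # SHA-1 produces a 160-bit digest


def collision_probability(num_objects: int, hash_bits: int = SHA1_BITS) -> float:
    """Birthday-bound approximation: p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).

    Assumes the hash behaves like a uniform random function, which only
    models the *accidental* case, not a deliberate attack.
    """
    space = 2 ** hash_bits
    exponent = num_objects * (num_objects - 1) / (2 * space)
    # expm1 keeps precision when the exponent is extremely small
    return -math.expm1(-exponent)


# Scenario from the quoted message (leap days ignored, an assumption here)
people = 10_000_000_000
objects_per_day = 5000
days = 50 * 365

total_objects = people * objects_per_day * days
p = collision_probability(total_objects)
print(f"{total_objects:.3e} objects -> p = {p:.1e}")
```

Under these assumptions the result is a few times 1e-13, the same order of magnitude as the 6e-13 quoted above; the exact figure depends on which form of the approximation is used.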

Why not just use the SHA1 only?

Posted Nov 18, 2008 18:37 UTC (Tue) by dlang (✭ supporter ✭, #313)

Git will never overwrite an object that it thinks it already has.

So git could get corrupted, but not by overwriting old data and losing it. It would be corrupted by failing to save the new data, which is much easier to detect, since that is the data people would be trying to use.

There is an option in git (I don't remember whether it is a compile-time option or not) to do additional checking when saving data: it verifies that the contents are identical even when the hash matches, and gives an error if they are not.
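The behaviour described in this comment can be illustrated with a toy content-addressed store. This is a minimal sketch and not git's actual code: the `ObjectStore` class, the `CollisionError` exception, and the `verify` flag are hypothetical names, and a deliberately weak hash is plugged in only so that a collision can be forced.

```python
import hashlib


class CollisionError(Exception):
    """Raised when two different contents map to the same key."""


class ObjectStore:
    """Toy content-addressed store (hypothetical, not git's implementation)."""

    def __init__(self, hash_func=None, verify=True):
        self._objects = {}
        # Default to SHA-1 over the raw bytes; a weak hash can be injected
        # to simulate a collision.
        self._hash = hash_func or (lambda data: hashlib.sha1(data).hexdigest())
        self._verify = verify

    def put(self, data: bytes) -> str:
        key = self._hash(data)
        existing = self._objects.get(key)
        if existing is None:
            self._objects[key] = data  # new object: store it
        elif self._verify and existing != data:
            # Same key, different bytes: report instead of silently dropping
            raise CollisionError(f"collision on key {key!r}")
        # Otherwise the existing object is kept and never overwritten
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


def weak_hash(data: bytes) -> str:
    """Deliberately bad hash (first byte only) to force a collision."""
    return data[:1].hex()


# Without verification the old object survives and the new one is lost
unchecked = ObjectStore(hash_func=weak_hash, verify=False)
old_key = unchecked.put(b"apple")
new_key = unchecked.put(b"avocado")
print(unchecked.get(new_key))
```

With `verify=False` the second `put` returns the same key and the store still holds the original bytes, so it is the new data that goes missing. With `verify=True` the same call raises `CollisionError` instead.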

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds