
Is the date there?


Posted Dec 14, 2024 16:16 UTC (Sat) by khim (subscriber, #9252)
In reply to: Is the date there? by Wol
Parent article: Facing the Git commit-ID collision catastrophe

> If you ask for eg 10 characters, and there's a collision, can't git just display enough extra characters to disambiguate, along with the commit date?

Not if you have hundreds of independent git repos (as happens with kernel development) and commits go into different trees.

> If the hash wasn't initially ambiguous, it obviously refers to the earliest one.

Not if you have many independent repos.

P.S. With one, single, “canonical” repo the whole discussion wouldn't even make any sense since you can just hand over unique IDs sequentially.



Is the date there?

Posted Dec 16, 2024 7:44 UTC (Mon) by smurf (subscriber, #17840)

? We effectively do have a single repo, in that there is a single official git tree.

The chances that a new hash collides with another new hash in a different tree that happens to *also* not be in Linus' tree are way too small to worry about. You'd need roughly 100'000 commits per release cycle for that to be even remotely likely, using 12-character hashes. (16⁶ is about 16 million, the square root of the 16¹² possible 12-character prefixes, which is roughly the point where the chance of a collision approaches 50%.)
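The birthday-paradox arithmetic above can be checked with a short sketch. It assumes that abbreviated hashes behave like uniformly random hex strings and uses the standard approximation 1 - exp(-n(n-1)/2N); the function names are illustrative and not part of git.

```python
import math


def collision_probability(n_items: int, hex_chars: int) -> float:
    """Approximate chance that at least two of n_items random hashes
    share the same hex_chars-character prefix (birthday approximation)."""
    space = 16 ** hex_chars  # number of distinct prefixes
    return -math.expm1(-n_items * (n_items - 1) / (2 * space))


def items_for_probability(p: float, hex_chars: int) -> float:
    """Approximate number of random hashes needed before the chance of
    any shared prefix reaches p."""
    space = 16 ** hex_chars
    return math.sqrt(2 * space * math.log(1 / (1 - p)))


if __name__ == "__main__":
    print("16**6 =", 16 ** 6)
    print("P(collision), 100'000 items, 12 chars:",
          collision_probability(100_000, 12))
    print("items for 50%, 12 chars:", items_for_probability(0.5, 12))
```

Under these assumptions the 50% point for 12-character prefixes lands slightly above 16⁶, since the approximation carries a factor of about 1.18 on top of the square root.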

Is the date there?

Posted Dec 17, 2024 9:57 UTC (Tue) by khim (subscriber, #9252)

> you'd need roughly 100'000 commits per release cycle for that to be even remotely likely

Is this 100'000 commits or 100'000 unique hashes? Given that more than 10'000 commits are accepted in each release cycle, I would suspect that there are significantly more than 100'000 transient commits that don't go into Linus' tree (rejected patches, temporary commits into different git trees, etc.).

And we are talking about discussions on mailing lists and other such things. These don't belong to a single git tree.

Is the date there?

Posted Dec 17, 2024 11:52 UTC (Tue) by smurf (subscriber, #17840)

> Is this 100'000 commits or 100'000 unique hashes?

Commits. If you ask git to show something that might be a commit and/or something else, you get a nice list with the candidates, including the (we hope) single commit in question.

> significantly more than 100'000 transient commits that don't go into Linus tree (rejected patches, temporary commits into different git trees, etc).

These are not going to be referenced from short hashes that *are* in Linus' tree.

> And we are talking about discussions on mailing lists and other such things. These don't belong to a single git tree.

The number of commit references on mailing lists, online discussions, other git trees, et al. is significantly smaller than 100'000, which violates the birthday paradox-ish assumption that *any* two commits that share a prefix are a problem.

In other words, the real-life probability of a collision that actually matters to anybody is a lot less than what the birthday paradox tells us it might be.
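To illustrate the distinction drawn above, here is a rough sketch comparing two quantities: the chance that *any* two objects share a prefix, and the chance that one of a smaller set of *referenced* prefixes matches some other object. It assumes uniformly random hashes, and the counts in the example (ten million objects, one thousand references) are made-up inputs rather than measurements of any real repository.

```python
import math


def any_pair_probability(n_objects: int, hex_chars: int) -> float:
    """Birthday-style estimate: some pair among n_objects shares a prefix."""
    space = 16 ** hex_chars
    return -math.expm1(-n_objects * (n_objects - 1) / (2 * space))


def referenced_probability(n_refs: int, n_objects: int, hex_chars: int) -> float:
    """Estimate: at least one of n_refs referenced prefixes also matches
    one of n_objects other objects (only these collisions matter)."""
    space = 16 ** hex_chars
    return -math.expm1(-n_refs * n_objects / space)


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the gap between the two.
    objects, refs, chars = 10_000_000, 1_000, 12
    print("any pair collides:      ", any_pair_probability(objects, chars))
    print("a referenced one does:  ", referenced_probability(refs, objects, chars))
```

The second estimate grows with the product of references and objects rather than with the square of the object count, which is why it stays far smaller when few prefixes are actually cited.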


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds