LCA: How to destroy your community

Posted Jan 24, 2010 7:09 UTC (Sun) by zooko (guest, #2589)
In reply to: LCA: How to destroy your community by johnflux
Parent article: LCA: How to destroy your community

> Ah yes, I remember. Your email:
>
> http://www.gelato.unsw.edu.au/archives/git/0506/5298.html
>
> And his reply:
>
> http://www.gelato.unsw.edu.au/archives/git/0506/5299.html

That was one of the conversations that I was thinking of, but the message 5298.html that you cite above wasn't written by me. I wrote an earlier message in the thread that IIRC was forwarded to lkml by someone else.

> I understand that you see his reply as a bit stinging, but your whole argument was based on the assumption that you could crack md5 in a way that lets you generate a meaningful exploit and then on top of that manage to inject that into the kernel.

I'm not entirely sure what you mean by "crack md5 in a way that lets you generate a meaningful exploit". In the exchange that you cite above, the other person, <linux@horizon.com>, was right and Linus was wrong with respect to the question of whether git users depend on the collision-resistance property of the hash function or not. The truth is that they do, but in a subtle way that most people (including Linus, at least at the time he wrote that) don't understand.

At this time (in 2005), when Linus was deciding to stick with SHA-1 for git, certain certificate authorities were deciding to stick with MD5 for signatures, for the same reason -- it seemed to them that they didn't rely on the collision-resistance property. In 2008 it was demonstrated that they did actually rely on that property:

http://www.win.tue.nl/hashclash/rogue-ca/

A similar attack is probably possible on git. It currently costs substantially more than USD 1 million to build a computer that can generate SHA-1 collisions (how much more is not publicly known, but probably less than USD 10 million). For now, only the rich can play.

So while I'm sure that the cryptographers who generated the rogue Root CA (above) can't inject their own code into your git pulls (because they work at public academic institutions and don't have the budget), I'm not sure that the NSA or the Chinese cyberwarriors can't.

> I can see why Linus responded with sillyness :-)

I understand that it seemed ridiculous to him at the time. However he was qualitatively wrong about the properties that git users rely on, and both he and <linux@horizon.com> were quantitatively confused about the cost to generate SHA-1 collisions. (See the rest of the thread that you cited, in which they talk about SHA-1 collisions costing 2^80 computations, when in fact the known upper bound at that time was 2^69. Today the known upper bound on the cost to generate a SHA-1 collision is 2^63.)
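
To put those exponents side by side, here is a minimal sketch in Python. It is only arithmetic on the figures quoted above, not a cost estimate, and the constant and function names are made up for illustration:

```python
# Work factors quoted in the comment above, expressed as powers of two.
BRUTE_FORCE_BITS = 80  # figure assumed in the 2005 thread (2^80 computations)
BOUND_2005_BITS = 69   # known upper bound at that time, per the comment
BOUND_2010_BITS = 63   # known upper bound as of this comment (2010)


def speedup(from_bits: int, to_bits: int) -> int:
    """Return how many times cheaper 2**to_bits work is than 2**from_bits."""
    return 2 ** (from_bits - to_bits)


print(speedup(BRUTE_FORCE_BITS, BOUND_2005_BITS))  # 2048
print(speedup(BRUTE_FORCE_BITS, BOUND_2010_BITS))  # 131072
```

In other words, the 2^80 figure overstated the attacker's work by a factor of about two thousand against the 2005 bound, and by about a hundred thousand against the 2010 bound.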

One effect of mocking things that seem ridiculous to you is that it deters certain kinds of people from participating in the conversation. I suppose this could be useful if you are right and they are wrong and progress is achieved by getting them to shut up, but of course you take the risk that you were wrong in the first place and by doing this you stay wrong.

I, for one, was reading that conversation at the time, and decided not to join in and try to explain more, in part because I didn't want to have my feelings hurt by mockery and in part because it didn't seem like I would have a good chance of making my point understood.

So to attempt to swerve back onto the topic of this LWN article, when I offered some suggestions to the engineers who are adding crypto to ZFS, they responded with technical arguments that were expressed in polite language. I was therefore emboldened to think that they might actually be listening, and went on to offer more ideas: http://opensolaris.org/jive/thread.jspa?threadID=117092&... . From my very specific, narrow, limited viewpoint, Solaris open source development has been easier to participate in than Linux development. ;-)

LCA: How to destroy your community

Posted Jan 24, 2010 8:44 UTC (Sun) by johnflux (guest, #58833) [Link] (7 responses)

An interesting response, thanks.

Just sticking to the technical side, Linus has been critical of the "masturbating monkeys" crowd (his words), who concentrate "on security to the point where they pretty much admit that nothing else matters to them" (his words again). I am not at all surprised at his response to you, whether you were technically right or not (I can't judge, sorry).

On the reaction side - he can be a real ass, and you got off lightly compared to his scathing remarks about svn developers (just for an example). You do need a thick skin to work on the kernel, and that is actually something that people are trying to address.

LCA: How to destroy your community

Posted Jan 24, 2010 15:47 UTC (Sun) by zooko (guest, #2589) [Link] (6 responses)

Yes, I'm quite sympathetic to the sentiment, if not to the terminology, that too many security engineers don't understand the concept of "trade-off", or if they do they seem to think that the security knob should be turned to "11" regardless of the consequences.

I don't fault Linus for being impatient with security worriers in the sense of "not wasting his time listening to them", but I do fault him for being impatient in the sense of insulting them.

One thing that has always bugged me about git's use of SHA-1 is that there was very little engineering cost, and probably not too much cost in CPU cycles, to making it secure -- just use SHA-256 instead of SHA-1. (I believe that the reason git uses SHA-1 is that Monotone did. The earliest prototype of git used MD5 because BitKeeper did.)

The engineering cost for upgrading git from SHA-1 to a new algorithm is much higher. I'm not sure how it can be done well. First of all we probably need to deploy a version of git (let's call it version 2 for this conversation) which allows there to be a slot for a new hash value even though it doesn't read or use that space -- it just uses the SHA-1 hash value which is in the other slot. That way once we *eventually* deploy yet a newer version of git, let's call it version 3, which produces SHA-3 hashes in addition to SHA-1 hashes, people using git version 2 will be able to continue to interoperate. (Although, per this discussion, they may be vulnerable and people who rely on their SHA-1-only patches may be vulnerable.)

As far as I understand, people using today's git, git version 1, will not be able to exchange patches in any way with people using some future version (which I called "version 3" above) that uses a new algorithm.
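
The two-slot idea can be sketched roughly as follows. This is a hypothetical illustration in Python, not git's actual object format: `ObjectId`, `make_id_v2`, `make_id_v3`, `verify_v2` and `verify_v3` are invented names, the hash is taken over the raw bytes rather than over a real git object, and SHA-256 stands in for the new algorithm only because SHA-3 has not been chosen yet.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ObjectId:
    """Hypothetical identifier with one slot per hash algorithm."""
    sha1: str                       # slot that versions 2 and 3 both understand
    new_hash: Optional[str] = None  # reserved slot: version 2 carries it, never reads it


def make_id_v2(data: bytes) -> ObjectId:
    # "Version 2" only produces SHA-1 and leaves the new slot empty.
    return ObjectId(sha1=hashlib.sha1(data).hexdigest())


def make_id_v3(data: bytes) -> ObjectId:
    # "Version 3" fills both slots (SHA-256 is a stand-in for the new hash).
    return ObjectId(
        sha1=hashlib.sha1(data).hexdigest(),
        new_hash=hashlib.sha256(data).hexdigest(),
    )


def verify_v2(data: bytes, oid: ObjectId) -> bool:
    # Version 2 ignores the new slot, so it keeps interoperating with version 3.
    return hashlib.sha1(data).hexdigest() == oid.sha1


def verify_v3(data: bytes, oid: ObjectId) -> bool:
    if hashlib.sha1(data).hexdigest() != oid.sha1:
        return False
    if oid.new_hash is None:
        # SHA-1-only id from a version 2 peer: accepted, but only as strong as SHA-1.
        return True
    return hashlib.sha256(data).hexdigest() == oid.new_hash


patch = b"example patch contents"
print(verify_v2(patch, make_id_v3(patch)))  # True
print(verify_v3(patch, make_id_v2(patch)))  # True
```

The `new_hash is None` branch is where the caveat above shows up: a version 3 user who accepts SHA-1-only identifiers gets no extra protection for those objects.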

LCA: How to destroy your community

Posted Jan 24, 2010 16:09 UTC (Sun) by johnflux (guest, #58833) [Link] (5 responses)

Looking at: http://www.cryptopp.com/benchmarks.html

It seems, roughly, that SHA256 takes about 40% more time than SHA1. From what I understand, the speed of git is determined most by the speed of the SHA1 implementation (Based on a long thread called 'Linus' sha1 is much faster!'). So roughly, switching would make everything 40% slower.

I think that's a trade-off that they wouldn't be willing to make. However, just playing with the C code of the SHA1 code by the git developers ended up making it nearly twice as fast, so I don't know what the optimal speed difference is against SHA256.

If the numbers stay about the same, I think the git guys wouldn't accept a 40% speed decrease.

(On a side point - subscribing to LWN was worth every penny. How I love to have civil conversations with intelligent people.)

LCA: How to destroy your community

Posted Jan 24, 2010 17:14 UTC (Sun) by zooko (guest, #2589) [Link] (4 responses)

Yeah, I was guessing that git is actually network-bound or disk-bound often enough that the CPU hit doesn't matter, but I'm not sure. (According to http://cryptopp.com/benchmarks-amd64.html , SHA-1 is 192 MB/s, SHA-256 is 139 MB/s. Faster than either your disk or your network?)
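
For anyone who wants to check this on their own machine, here is a rough sketch in Python using the standard `hashlib` module. It measures whatever SHA-1 and SHA-256 implementations your Python build links against, not git's own SHA-1 code or Crypto++, so treat the numbers as indicative only; the function name and buffer sizes are arbitrary choices.

```python
import hashlib
import time


def throughput_mb_s(name: str, total_mb: int = 32, chunk_kb: int = 64) -> float:
    """Hash total_mb megabytes of zero bytes and return the rate in MB/s."""
    chunk = b"\0" * (chunk_kb * 1024)
    rounds = (total_mb * 1024) // chunk_kb
    h = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(chunk)
    h.digest()
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


# Ratio implied by the Crypto++ amd64 figures quoted above (192 vs 139 MB/s):
# SHA-256 needs roughly 1.38 times as long per byte as SHA-1.
quoted_ratio = 192 / 139

sha1_rate = throughput_mb_s("sha1")
sha256_rate = throughput_mb_s("sha256")
print(f"sha1:   {sha1_rate:.0f} MB/s")
print(f"sha256: {sha256_rate:.0f} MB/s")
print(f"measured time ratio: {sha1_rate / sha256_rate:.2f} (quoted: {quoted_ratio:.2f})")
```

Comparing the printed rates with your disk and network throughput gives a first guess at whether hashing would be the bottleneck for a cold-cache workload.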

If git holds out for SHA-3 then hopefully SHA-3 will turn out to be faster than SHA-256. There's even a chance that it will turn out to be faster than SHA-1 on modern CPUs!

LCA: How to destroy your community

Posted Jan 25, 2010 0:32 UTC (Mon) by johnflux (guest, #58833) [Link]

http://www.mail-archive.com/bug-coreutils@gnu.org/msg1729... That's the thread I was thinking of.

LCA: How to destroy your community

Posted Jan 25, 2010 3:56 UTC (Mon) by njs (subscriber, #40338) [Link] (1 responses)

SHA-1 can definitely be a bottleneck in some situations. The most extreme -- though I'm not sure whether git does this -- is verifying hashes on an initial clone. (The idea is to prevent one person's disk corruption on an old file or whatever from spreading throughout the network of clones.) Here the disk and network cost is proportional to the size of the delta-compressed repository, but the SHA-1 cost is proportional to the size of the uncompressed repository, which can easily be in the terabyte range.

It can also easily be the bottleneck on, say, committing a large merge (many modified files, all in cache because they were just written).
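
A small Python sketch of why the hashing cost follows the uncompressed size: a git blob id is the SHA-1 of a short header plus the full file content, so every byte has to pass through the hash no matter how small the object is on disk or on the wire. The helper names are invented, and plain zlib compression of a single object is used here only as a simple stand-in for git's delta-compressed packs.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """SHA-1 over 'blob <size>\\0' followed by the uncompressed content."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def compressed_size(content: bytes) -> int:
    """Size of the zlib-compressed object (stand-in for stored/transferred bytes)."""
    header = b"blob %d\0" % len(content)
    return len(zlib.compress(header + content))


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Highly repetitive content compresses well, but SHA-1 still sees all of it.
content = b"the same line, over and over\n" * 10_000
print("bytes hashed:    ", len(content))
print("bytes compressed:", compressed_size(content))
```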

LCA: How to destroy your community

Posted Dec 29, 2015 18:11 UTC (Tue) by Spitfire19 (guest, #106038) [Link]

I feel that you would also take a big hit when performing CI tasks, since in some scenarios you may want your automated tool to delete everything in a given directory before pulling and checking out the latest commit.

LCA: How to destroy your community

Posted Jan 27, 2010 11:58 UTC (Wed) by broonie (subscriber, #7078) [Link]

In normal workflows you end up with the files you're accessing in cache so there's no I/O to physical devices and most operations do get CPU bound, especially read only ones. For performance purposes git pretty much assumes that most of the time you're running from hot cache.