LWN.net Logo

SSL certificates and MD5 collisions

SSL certificates and MD5 collisions

Posted Jan 16, 2009 19:16 UTC (Fri) by jd (guest, #26381)
Parent article: SSL certificates and MD5 collisions

Just a thought, but would it not make sense to start working on a newer version of the SSL protocol that contains two or more hashes, where those hashes are required to use different algorithms?

Whilst it is imaginable that any given hash algorithm will be broken sufficiently for forged certificates in the future, finding a collision which occurs for the same input using two fundamentally different hash algorithms would seem a significantly more complex problem.

Related to that, when I brought up concerns here that SASL also uses MD5 as a digest mechanism, people very kindly posted in reply that SASL uses much stronger validation methods and is much more resiliant to spoofing. (I do genuinely appreciate it when my - or anyone else's error - is gently corrected by highly educational and informative posts.) Are there any times when using SSL for authentication, as opposed to encryption, could be realistically replaced with SASL? What would it take to do so?


(Log in to post comments)

SSL certificates and MD5 collisions

Posted Jan 18, 2009 1:13 UTC (Sun) by dlang (✭ supporter ✭, #313) [Link]

to use SASL instead of SSL for authentication would involve changing how the protocol works

one reason a lot of things use SSL for authentication is that it layers the authentication over the existing protcol without changing the program at all (other than by adding the call to encryption when the connection is established, before the application sends anything over the connection)

SSL certificates and MD5 collisions

Posted Jan 18, 2009 21:49 UTC (Sun) by jd (guest, #26381) [Link]

Yeeees, except that if SSL has a fundamental flaw (and I admit that is a big if) because of inadequate validation, it will be necessary to have an SSL 4.0, which means that code is still going to have to be changed to support the new version.

The problem arises because, in my earlier post asking about the safety of SASL, it was pointed out that SSL doesn't do a whole lot of validation. If SSL is, as was claimed, taking short-cuts on validation and/or ignoring some fields entirely, you have no guarantees that the fields are being correctly populated in the first place. In fact, you can be certain that errors will exist because nobody QAs for things that aren't actual errors.

It would follow that any tightening up of SSL would expose any such coding flaws, which should be assumed to be common-place unless shown otherwise. But if SSL isn't tightened up, you don't necessarily buy a whole lot by phasing out MD5. If a SHA-1 hash trivially replaces an MD5 hash, the argument of the other reply noting problems with n-way collisions applies and finding a SHA-1 collision from a pre-existing MD5 collision would then be much easier. Worse, since both use the same underpinnings, it is theorised that the issues with MD5 mean SHA-1 might easily fall sooner rather than later - a big reason for the SHA-3 challenge.

In the end, postponing the problem might make it easier to fix, though as Y2K demonstrated, everyone waits until the last moment anyway, even when the problem is well-documented and agreed-upon. You're therefore left with the conclusion that code will (at some point) change and that the change will be painful. That suggests you're better off changing code as early as possible. The longer you wait, the more bad code will exist and the more painful the changes will be.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 23:57 UTC (Sun) by dlang (✭ supporter ✭, #313) [Link]

there are several issues here

1. SSL is not fundamentally broken here, one of the hashes that it can optionally use is broken enough that it could be exploited in one case.

2. switching from SSLv3 to SSLv4 would be a simple library change (if it was ever needed), much less invasive than a switch to something completely different like SASL

3. why would you assume that SSL would be broken and SASL wouldn't? swiching form one technology to another based on the fear that someone may break the first one some time in the future is not a very good use of time

4. where did you pick anything up that said that software wasn't verifying fields in the certs, and this is a security problem?

most of the fields in the cert are informational only, they don't matter (unless combined with other knowledge external to the cert)

5. SSL allows for different encryption and hash algorithms to be used. this is how AES has been added to SSL without most of the applications that use it even knowing about it (they just use the openssl libraries, which added a new encryption type). so when SHA-3 is defined, it will be added and people will have the option to start using it.

the issue that another poster mentioned of it being desirable to be able to have multiple signatures to one cert is the reflection of a desire to fundamentally change the basis of trust for the system.

the SSL cert trust model is based on the theory that the signers are trusted absolutly. If they vouch for the identity of someone (or a site), that identity can be trusted.

unfortunantly, in the scale of the Internet today this is a very questionable thing to do. how can you really verify if a person is allowed to get a cert for abc.mysite.com?

and the economic incentives push exactly the other way. A Cert signer who does extensive verification not only spends more to do so (meaning they need to charge more), but it takes longer to fill the customer order, the customer has to spend more time providing the proof that the vendor asks for, and then (to add insult to injury) the resulting cert isn't treated any differently by anyone using it (so there really isn't any benifit from the extra checking)

but if you want to change this you need to define a new approach to solve many problems.

One major problem is that the Internet is just too big for any one entity to really know who is who. but if you try and segment the trust, how do you know when to trust a particular signer and when not to.

if you go to www.mysite.com,
is this a personal site?
is it a banking site?
is it a government site?
is it a store?
is it a business?
is it several of these?

since nobody wants any one country/entity to control the Internet, this means that there will be several entities who could sign for each category. how do you know which ones to trust for this site? it may be that verisign issues the cert that www.microsoft.com uses, bu there is nothing that would prevent a russian or chinese certificate authority from signing a different certificate for the name www.microsoft.com, and if you trust those certificate authorities you would accept either cert. In theory you only want to check the certificate authority that the company has decided to use, but how can you know who that is?

SSL certificates and MD5 collisions

Posted Jan 18, 2009 13:21 UTC (Sun) by djao (guest, #4263) [Link]

Just a thought, but would it not make sense to start working on a newer version of the SSL protocol that contains two or more hashes, where those hashes are required to use different algorithms?

The two hash suggestion arises so frequently that Wikipedia contains an extended discussion of it. Basically, it doesn't work, because it provides less security than one long hash. The only advantage of two hashes is redundancy, but even this advantage only matters in the event of a complete break, which is not the case for the vast majority of hash function attacks. (For example, if you read the article, the MD5 attack being discussed here is a 2^51 attack on a 2^64 problem, representing a factor of 2^13 speedup. This is the kind of partial attack that would be easily foiled by a single long hash.)

In short, two hash functions don't improve security with respect to the weakest link in the system, namely resistance to partial attacks, so the proposal is not generally useful.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 21:26 UTC (Sun) by jd (guest, #26381) [Link]

It's important to note that the Wikipedia page specifically says that using two hash functions in case one is broken is a valid use. It's also important to note that the assumption of a corresponding collision in B if a collision is found in A is most useful if A and B are not orthogonal but have a common basis. For example, using SHA-1 and MD5 is not smart as they have common elements. If A and B are completely orthogonal, then knowing a collision in one should tell you nothing about whether a corresponding collision exists in the other.

The effective strength of two hashes can never be greater than a hash with an equal number of bits to the two combined, assuming all three hashes are orthogonal and have no known weaknesses. One of the problems with simply adding bits to a hash is that doesn't guarantee it will be any stronger. If the bits are poorly generated, it might even weaken it. This could happen if the extended hash is insufficiently close to random and information is exposed.

If, however, you're already working with hashes that are as long as you can usefully generate them, the combined strength of the hashes (again, assuming there are no common elements) must always exceed a hash of twice the length because the algorithms are no longer safe in the extended form, creating an unnecessary weakness in addition to any weakness that might be inherent in the algorithm anyway.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 23:30 UTC (Sun) by djao (guest, #4263) [Link]

It's important to note that the Wikipedia page specifically says that using two hash functions in case one is broken is a valid use.

This is true, and I addressed this exact issue in my previous reply, although perhaps the manner in which I addressed it was too oblique.

I will start with your usage of the word "broken". What does it mean for a hash function to be "broken"? The image that frequently comes to mind is that of a total break, the sort of break where one can generate almost arbitrary collisions with trivial effort. If you are worried about your hash functions suffering a total break, then, yes, two hashes have some advantages over one hash.

The problem is that the stereotypical portrayal of a total break is not something that happens very often in real life, at least not with the more popular hash functions. More often, what happens is that a hash function is broken in some sort of incremental manner, in which the hash can be broken marginally more easily than brute force, but the attack is not effective enough to be considered totally trivial. In such a situation, a single long hash provides much much more of a safety margin than two short hashes. It is important to realize that a long hash, like SHA-256 (which is of course 256 bits), has to be much more thoroughly broken than a short hash before attacks on the long hash become practical. To give a concrete example: SHA-1, which is 160 bits, is nowadays considered marginally broken. The best known attack on SHA-1 requires 2^63 computations. This represents a factor of 2^17 speedup compared to brute force, which sounds bad, until you observe that the theoretical optimum for a 128-bit hash function (such as MD5) is 2^64 computations. In other words, even the so-called "broken" SHA-1 still has the same security as what we thought MD5 used to have before MD5 was broken. This is the kind of thing that a longer hash function does for you: it provides a safety margin, just by virtue of being longer.

It is instructive to compare SHA-1 with MD5 and SHA-256. A hypothetical 2^17 speedup in attacking MD5 leads to MD5 being broken (which is in fact more or less what happened to MD5), and a hypothetical 2^17 speedup in attacking SHA-256 would be a purely theoretical result.

The reason I'm replying at length is because people often don't realize that the safety margin provided by simply using a longer hash function is extremely large and very likely to be more valuable than the relatively minor safety margin provided by using two short hashes, due to the observed fact that many many more hash functions suffer partial breaks than total breaks.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds