SSL certificates and MD5 collisions

By Jake Edge
January 14, 2009

There is a fair amount of confusion surrounding the recent research resulting in the ability to create bogus SSL certificates. The research combined a weakness in the certificate generation process with the ability to create MD5 hash collisions and generated a certificate that would be accepted by all browsers. That certificate could be used to sign other certificates, allowing the researchers to create a valid certificate purporting to be from any domain they chose.

Cryptographic hashes, like MD5, are used in digital signature algorithms; in effect it is the hash that is signed as a stand-in for the actual content. It has been known since 2004 that MD5 collisions—two different inputs generating the same hash value—could feasibly be found. So, a signature on data with a specific MD5 hash would be considered a valid signature on any other data that hashed to the same value. What the researchers did was to create a certificate that the certificate authority (CA), in this case RapidSSL, was willing to sign, then transferred that signature to a different certificate. That second certificate hashed to the same value, but had the ability to sign additional certificates.

This is a very significant attack on SSL that was addressed rather quickly. One wonders why these certificate authorities were still using MD5 long after it had been deprecated because of the collision vulnerability. Inertia is the likely culprit, but RapidSSL and other CAs using MD5 changed to SHA-1 within hours of the report in some cases. In addition, RapidSSL stopped using sequential serial numbers in certificates. That flaw helped the researchers immensely, so that they only needed four attempts (with a 200 node Playstation 3 cluster) to create their colliding certificate. A random serial number over a sizable range once again makes this attack infeasible—at least on today's hardware.

Eventually, MD5 will no longer be accepted as the hash used in the signatures on certificates—or anything else, probably—but as of now, SSL implementations will accept them. There are large numbers of such certificates in use today, so browsers cannot just stop accepting them. CAs are generally offering their customers free replacement certificates that use SHA-1. Because users rarely root through the certificates presented to their browser to determine what hash algorithm was used, there is a extension for Firefox called SSL Blacklist that detects these certificates and pops up a warning.

But, for those sites affected—LWN for example—it can be a bit worrying to hear from users that their certificate may be bad. The LWN certificate and countless others are really no more vulnerable to this attack than any other. A site that has an SHA-1 signature can be spoofed by this attack as easily as one with an MD5 signature. But a site that has an MD5 signed certificate does make it harder to switch away from MD5. That switch won't happen soon in any case, but it could be slowed down by sites that are slow to change.

If an attacker currently has a certificate of the type that the researchers created, they can use it to sign certificates for any domain they wish, and they can use SHA-1 in that signature. This particular vulnerability requires an MD5 signed certificate in the chain of certificates, but does not require that the final, domain-specific certificate be signed with MD5. It should also be noted that some of the root certificates distributed with browsers are MD5-signed. Those are not vulnerable because they are distributed with the browser—if an attacker can change one's root certificate stash, there are much easier attacks possible. For this reason, SSL Blacklist looks for MD5 signatures in the certificate chain anywhere after the root certificate.

This incident is a good illustration of how cryptographic research often proceeds. First, small cracks are found in an algorithm, causing some worry in cryptographic circles, then partial attacks are found, which generally starts to raise the alarm in the wider security community. But it usually takes a full-scale attack or proof-of-concept to really cause those who use the algorithms, knowingly or unknowingly, to take remedial action. That delay provides a nice window that attackers can and will exploit.

Index entries for this article
Security	Secure Sockets Layer (SSL)/Certificates

SSL certificates and MD5 collisions

Posted Jan 15, 2009 9:04 UTC (Thu) by HenrikH (subscriber, #31152) [Link] (7 responses)

While not a fix for the problem as such, I think it would be good if the browser (say Firefox) did store the certificate's signature together with the URL in a database (for example the same database used for the address bra history).

That way it could be discovered when a site suddenly changes it's certificate. Now this will happen for legit reasons (some time before expiry date) but could also be a sign that some one is trying to forge the certificate.

Perhaps with more explicit warnings if the change was from say a certificate with extra validation (aka the green address bar certs) to a url only certificate.

SSL certificates and MD5 collisions

Posted Jan 15, 2009 9:25 UTC (Thu) by Trou.fr (subscriber, #26289) [Link] (1 responses)

I have been wishing for such a feature for year. The way OpenSSH does it is perfect. I do trust my personnal certificate hash storage more than CAs, their search for profit prevents accurate verification of identities.

The probability that the site I visit was not showing the right certificate the first time I connected to it is very low and comparing the hash is adding a really useful security layer. Defense in depth.

SSL certificates and MD5 collisions

Posted Jan 15, 2009 15:06 UTC (Thu) by iabervon (subscriber, #722) [Link]

Beyond the profit motive, privacy concerns and scalability issues prevent CAs from really verifying identities usefully. In order for them to give useful verification, they'd have to not issue certificates to anyone other than my bank and credit card company; otherwise, there's the risk that some other entity will have a confusing site that makes me think that it's my bank when it is actually something else. Since the CA doesn't know what bank I use, and since it wants to be useful to customers of other banks, this is impossible.

SSL certificates and MD5 collisions

Posted Jan 15, 2009 19:02 UTC (Thu) by bronson (subscriber, #4806) [Link] (3 responses)

Ya, great. Now a simple DNS spoof/poisoning can replace all your browser certs with evil ones.

Beware the cure that's worse than the disease.

SSL certificates and MD5 collisions

Posted Jan 15, 2009 20:54 UTC (Thu) by HenrikH (subscriber, #31152) [Link] (2 responses)

Exactly how could your scenario replace my browser certs with evil ones when all my proposal does is to add an extra layer of verification (that the certificate has been changed since my last visit) ?

SSL certificates and MD5 collisions

Posted Jan 17, 2009 19:55 UTC (Sat) by bronson (subscriber, #4806) [Link] (1 responses)

Sorry, I misread your suggestion. I thought you were suggesting a CRL-like behavior. If you're just talking about adding an SSH-like warning, and not manipulating or revoking the certificates on the browser itself, then I take back what I said.

Of course, the SSH warning has been shown to be fairly ineffective in the real world. And that's with sysadmins who should know better! What about regular people? When presented a dialog box saying "The certificate for Chase Bailout Bank has changed! Do you want to continue?" I would guess that 99 out of 100 of them would simply click "Yes."

But, I agree, it wouldn't hurt.

SSL certificates and MD5 collisions

Posted Jan 20, 2009 9:33 UTC (Tue) by HenrikH (subscriber, #31152) [Link]

That is exactly my suggestion, to add a SSH-like warning. Agreed that not many people would benefit from it, but I know that I would :-)

SSL certificates and MD5 collisions

Posted Jan 19, 2009 14:44 UTC (Mon) by TRS-80 (guest, #1804) [Link]

Perspectives (LWN discussion) sort of does what you want, except storing the cert on public "network notary servers".

SSL certificates and MD5 collisions

Posted Jan 15, 2009 15:01 UTC (Thu) by cde (guest, #46554) [Link]

The real danger is the huge number of CAs currently loaded in browsers, because it really helps for an attacker to focus on the CA having the least physical and/or logical security to compromise their private signing keys.

SSL certificates and MD5 collisions

Posted Jan 16, 2009 19:16 UTC (Fri) by jd (guest, #26381) [Link] (6 responses)

Just a thought, but would it not make sense to start working on a newer version of the SSL protocol that contains two or more hashes, where those hashes are required to use different algorithms?

Whilst it is imaginable that any given hash algorithm will be broken sufficiently for forged certificates in the future, finding a collision which occurs for the same input using two fundamentally different hash algorithms would seem a significantly more complex problem.

Related to that, when I brought up concerns here that SASL also uses MD5 as a digest mechanism, people very kindly posted in reply that SASL uses much stronger validation methods and is much more resiliant to spoofing. (I do genuinely appreciate it when my - or anyone else's error - is gently corrected by highly educational and informative posts.) Are there any times when using SSL for authentication, as opposed to encryption, could be realistically replaced with SASL? What would it take to do so?

SSL certificates and MD5 collisions

Posted Jan 18, 2009 1:13 UTC (Sun) by dlang (guest, #313) [Link] (2 responses)

to use SASL instead of SSL for authentication would involve changing how the protocol works

one reason a lot of things use SSL for authentication is that it layers the authentication over the existing protcol without changing the program at all (other than by adding the call to encryption when the connection is established, before the application sends anything over the connection)

SSL certificates and MD5 collisions

Posted Jan 18, 2009 21:49 UTC (Sun) by jd (guest, #26381) [Link] (1 responses)

Yeeees, except that if SSL has a fundamental flaw (and I admit that is a big if) because of inadequate validation, it will be necessary to have an SSL 4.0, which means that code is still going to have to be changed to support the new version.

The problem arises because, in my earlier post asking about the safety of SASL, it was pointed out that SSL doesn't do a whole lot of validation. If SSL is, as was claimed, taking short-cuts on validation and/or ignoring some fields entirely, you have no guarantees that the fields are being correctly populated in the first place. In fact, you can be certain that errors will exist because nobody QAs for things that aren't actual errors.

It would follow that any tightening up of SSL would expose any such coding flaws, which should be assumed to be common-place unless shown otherwise. But if SSL isn't tightened up, you don't necessarily buy a whole lot by phasing out MD5. If a SHA-1 hash trivially replaces an MD5 hash, the argument of the other reply noting problems with n-way collisions applies and finding a SHA-1 collision from a pre-existing MD5 collision would then be much easier. Worse, since both use the same underpinnings, it is theorised that the issues with MD5 mean SHA-1 might easily fall sooner rather than later - a big reason for the SHA-3 challenge.

In the end, postponing the problem might make it easier to fix, though as Y2K demonstrated, everyone waits until the last moment anyway, even when the problem is well-documented and agreed-upon. You're therefore left with the conclusion that code will (at some point) change and that the change will be painful. That suggests you're better off changing code as early as possible. The longer you wait, the more bad code will exist and the more painful the changes will be.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 23:57 UTC (Sun) by dlang (guest, #313) [Link]

there are several issues here

1. SSL is not fundamentally broken here, one of the hashes that it can optionally use is broken enough that it could be exploited in one case.

2. switching from SSLv3 to SSLv4 would be a simple library change (if it was ever needed), much less invasive than a switch to something completely different like SASL

3. why would you assume that SSL would be broken and SASL wouldn't? swiching form one technology to another based on the fear that someone may break the first one some time in the future is not a very good use of time

4. where did you pick anything up that said that software wasn't verifying fields in the certs, and this is a security problem?

most of the fields in the cert are informational only, they don't matter (unless combined with other knowledge external to the cert)

5. SSL allows for different encryption and hash algorithms to be used. this is how AES has been added to SSL without most of the applications that use it even knowing about it (they just use the openssl libraries, which added a new encryption type). so when SHA-3 is defined, it will be added and people will have the option to start using it.

the issue that another poster mentioned of it being desirable to be able to have multiple signatures to one cert is the reflection of a desire to fundamentally change the basis of trust for the system.

the SSL cert trust model is based on the theory that the signers are trusted absolutly. If they vouch for the identity of someone (or a site), that identity can be trusted.

unfortunantly, in the scale of the Internet today this is a very questionable thing to do. how can you really verify if a person is allowed to get a cert for abc.mysite.com?

and the economic incentives push exactly the other way. A Cert signer who does extensive verification not only spends more to do so (meaning they need to charge more), but it takes longer to fill the customer order, the customer has to spend more time providing the proof that the vendor asks for, and then (to add insult to injury) the resulting cert isn't treated any differently by anyone using it (so there really isn't any benifit from the extra checking)

but if you want to change this you need to define a new approach to solve many problems.

One major problem is that the Internet is just too big for any one entity to really know who is who. but if you try and segment the trust, how do you know when to trust a particular signer and when not to.

if you go to www.mysite.com,
is this a personal site?
is it a banking site?
is it a government site?
is it a store?
is it a business?
is it several of these?

since nobody wants any one country/entity to control the Internet, this means that there will be several entities who could sign for each category. how do you know which ones to trust for this site? it may be that verisign issues the cert that www.microsoft.com uses, bu there is nothing that would prevent a russian or chinese certificate authority from signing a different certificate for the name www.microsoft.com, and if you trust those certificate authorities you would accept either cert. In theory you only want to check the certificate authority that the company has decided to use, but how can you know who that is?

SSL certificates and MD5 collisions

Posted Jan 18, 2009 13:21 UTC (Sun) by djao (guest, #4263) [Link] (2 responses)

Just a thought, but would it not make sense to start working on a newer version of the SSL protocol that contains two or more hashes, where those hashes are required to use different algorithms?

The two hash suggestion arises so frequently that Wikipedia contains an extended discussion of it. Basically, it doesn't work, because it provides less security than one long hash. The only advantage of two hashes is redundancy, but even this advantage only matters in the event of a complete break, which is not the case for the vast majority of hash function attacks. (For example, if you read the article, the MD5 attack being discussed here is a 2^51 attack on a 2^64 problem, representing a factor of 2^13 speedup. This is the kind of partial attack that would be easily foiled by a single long hash.)

In short, two hash functions don't improve security with respect to the weakest link in the system, namely resistance to partial attacks, so the proposal is not generally useful.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 21:26 UTC (Sun) by jd (guest, #26381) [Link] (1 responses)

It's important to note that the Wikipedia page specifically says that using two hash functions in case one is broken is a valid use. It's also important to note that the assumption of a corresponding collision in B if a collision is found in A is most useful if A and B are not orthogonal but have a common basis. For example, using SHA-1 and MD5 is not smart as they have common elements. If A and B are completely orthogonal, then knowing a collision in one should tell you nothing about whether a corresponding collision exists in the other.

The effective strength of two hashes can never be greater than a hash with an equal number of bits to the two combined, assuming all three hashes are orthogonal and have no known weaknesses. One of the problems with simply adding bits to a hash is that doesn't guarantee it will be any stronger. If the bits are poorly generated, it might even weaken it. This could happen if the extended hash is insufficiently close to random and information is exposed.

If, however, you're already working with hashes that are as long as you can usefully generate them, the combined strength of the hashes (again, assuming there are no common elements) must always exceed a hash of twice the length because the algorithms are no longer safe in the extended form, creating an unnecessary weakness in addition to any weakness that might be inherent in the algorithm anyway.

SSL certificates and MD5 collisions

Posted Jan 18, 2009 23:30 UTC (Sun) by djao (guest, #4263) [Link]

It's important to note that the Wikipedia page specifically says that using two hash functions in case one is broken is a valid use.

This is true, and I addressed this exact issue in my previous reply, although perhaps the manner in which I addressed it was too oblique.

I will start with your usage of the word "broken". What does it mean for a hash function to be "broken"? The image that frequently comes to mind is that of a total break, the sort of break where one can generate almost arbitrary collisions with trivial effort. If you are worried about your hash functions suffering a total break, then, yes, two hashes have some advantages over one hash.

The problem is that the stereotypical portrayal of a total break is not something that happens very often in real life, at least not with the more popular hash functions. More often, what happens is that a hash function is broken in some sort of incremental manner, in which the hash can be broken marginally more easily than brute force, but the attack is not effective enough to be considered totally trivial. In such a situation, a single long hash provides much much more of a safety margin than two short hashes. It is important to realize that a long hash, like SHA-256 (which is of course 256 bits), has to be much more thoroughly broken than a short hash before attacks on the long hash become practical. To give a concrete example: SHA-1, which is 160 bits, is nowadays considered marginally broken. The best known attack on SHA-1 requires 2^63 computations. This represents a factor of 2^17 speedup compared to brute force, which sounds bad, until you observe that the theoretical optimum for a 128-bit hash function (such as MD5) is 2^64 computations. In other words, even the so-called "broken" SHA-1 still has the same security as what we thought MD5 used to have before MD5 was broken. This is the kind of thing that a longer hash function does for you: it provides a safety margin, just by virtue of being longer.

It is instructive to compare SHA-1 with MD5 and SHA-256. A hypothetical 2^17 speedup in attacking MD5 leads to MD5 being broken (which is in fact more or less what happened to MD5), and a hypothetical 2^17 speedup in attacking SHA-256 would be a purely theoretical result.

The reason I'm replying at length is because people often don't realize that the safety margin provided by simply using a longer hash function is extremely large and very likely to be more valuable than the relatively minor safety margin provided by using two short hashes, due to the observed fact that many many more hash functions suffer partial breaks than total breaks.

SSL certificates and MD5 collisions

Posted Jan 17, 2009 16:45 UTC (Sat) by sergey (guest, #31763) [Link] (1 responses)

If similar vulnerability were to be found in PGP/GPG, how would it affect the users? Is "web of
trust" in this case more resilient than PKI's "chain of trust?"

SSL certificates and MD5 collisions

Posted Jan 18, 2009 1:17 UTC (Sun) by dlang (guest, #313) [Link]

there are two different things involved.

1. a particular signer is comprimized/doing things wrong

in this case RapidSSL had done two major things wrong
A. they didn't update to not use MD5
B. they used sequential serial numbers.

2. something fundamental in the signing/encryption mechanism is broken

it can be argued that the MD5 problem really falls in this category.

with PGP/GPG you get signatures from many different people. As such the odds of all of them doing things wrong is much lower (they create their keys at different times, with different versions of software, using different options for creating their keys, eetc)

if somthing is fundamentally broken in the core mechansim it could still break everything.

SSL certificates and MD5 collisions

Posted Jan 23, 2009 15:07 UTC (Fri) by forthy (guest, #1525) [Link]

There are large numbers of such certificates in use today, so browsers cannot just stop accepting them.

Of course they can, and they should. Well "stop accepting" is a bit strong, anyway, since you can user-override insecure certificates. What they should is provide an appropriate warning, e.g. orange. It's not completely broken (like red), but when looking for the details, the browser should warn about the outdated certificate hash algorithm.

ObCryptology: All these certificates are broken, including the SHA-1 ones. SHA-1 is an insecure hash key, even though you need way more computing power, or at least a lot longer document than an SSL certificate. But wait a few years, and a campus scale cluster of GPGPUs can crack SHA-1 in weeks. One thing to learn about digital signatures is that you should never just sign a hash. Use salted hashes for signatures (salt: Random stuff injected into the hash generation). An unsalted hash with n bits has only the strength of max n/2 bits, a salted hash (salt=n bits) gains full strength. A certificate signed by several parties, each using a different salt, has even more strength.