LWN.net Logo

RSA keys not as random as they should be (The H)

The H reports on research that found a significant number of RSA public keys are not secure. "Of the 6,185,372 X.509 certificates analysed, the researchers found 266,729 public keys in which moduli were reused. The modulus is the core component of a public key – if it is the same, then the secret key matches. In one extreme case, the same modulus was found 16,489 times. This means that each of the owners of the 16,489 certificates could spoof or spy on each of the other 16,488. The researchers note that it is not unusual to recycle keys when, for example, extending a certificate, but a significant number of these keys belong to entirely independent owners." Interestingly, OpenPGP keys generated by GPG do not seem to suffer from this problem.
(Log in to post comments)

RSA keys not as random as they should be (The H)

Posted Feb 16, 2012 21:25 UTC (Thu) by masoncl (subscriber, #47138) [Link]

Kudos for LWN for being the first place I've seen mention in big friendly letters the status of gpg keys.

RSA keys not as random as they should be (The H)

Posted Feb 16, 2012 21:33 UTC (Thu) by tialaramex (subscriber, #21167) [Link]

Jake's apparent surprise that GnuPG is not affected reflects a general misunderstanding of what's going on here. This is an implementation bug. All the _facts_ reported in the paper are correct, but you should check other sources before believing their conclusions, let alone leaping to more general conclusions like "there's a chance my bank is affected".

Kaminsky explains that this isn't about RSA versus anything, unless in the sense that RSA beats everything because it was the only public key algorithm being used widely enough for this sort of result to be found:

http://dankaminsky.com/2012/02/14/ronwhit/

Heninger explains what causes this sort of phenomenon (bad implementations + poor entropy) and gives you some idea where all these vulnerable keys came from (think $50 Internet appliance, not $50Bn Internet bank)

https://freedom-to-tinker.com/blog/nadiah/new-research-th...

RSA keys not as random as they should be (The H)

Posted Feb 16, 2012 23:23 UTC (Thu) by Ben_P (subscriber, #74247) [Link]

That second link was great, thanks.

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 9:28 UTC (Fri) by jezuch (subscriber, #52988) [Link]

There is also a longer than usual comment from Bruce Schneier:

> The cause of this is almost certainly a lousy random number generator used to create those public keys in the first place. This shouldn't come as a surprise. One of the hardest parts of cryptography is random number generation. It's really easy to write a lousy random number generator, and it's not at all obvious that it is lousy.

http://www.schneier.com/blog/archives/2012/02/lousy_rando...

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 3:45 UTC (Fri) by fredcadete (subscriber, #81023) [Link]

Maybe those keys were all generated under Debian when it had that openssl bug.

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 8:13 UTC (Fri) by geuder (subscriber, #62854) [Link]

> Maybe those keys were all generated under Debian when it had that openssl bug

The exact properties of keys created using versions affected by that bug should be perfectly known to the researchers in this field (not to me though...) We are talking about open source after all. So if that well-known bug is not mentioned there it must be something different/new or the work is not worth its classification as research.

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 8:53 UTC (Fri) by puzzlement (subscriber, #51999) [Link]

The keyspace that Debian was generating was so small that not just the properties of the keys are known, the keys themselves can and have been exhaustively listed, at least for certain common key lengths.

http://packages.debian.org/sid/openssl-blacklist and similar packages contain the relevant lists of keys.

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 16:23 UTC (Fri) by jezuch (subscriber, #52988) [Link]

> The keyspace that Debian was generating was so small that not just the properties of the keys are known, the keys themselves can and have been exhaustively listed, at least for certain common key lengths.

You think this was small? Schneier says: "(One of the reporters who called me about this story said that the researchers told him about a real-world random number generator that produced just seven different random numbers.)"

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 22:50 UTC (Fri) by puzzlement (subscriber, #51999) [Link]

Well, the Debian space was using the process ID, so there were a few tens of thousands of psuedo-random number streams. In cryptographic terms, the magnitude difference between seven and thirty thousand is not that interesting!

RSA keys not as random as they should be (The H)

Posted Feb 17, 2012 12:28 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Yes, the paper's authors mention this precedent and remove its effects as best they can from their source data by eliminating keys from the known lists of bad keys generated by Debian-bug-affected systems.

In any case, the researchers were more interested in the phenomenon of keys with just one shared random factor. The Debian bug could not cause this problem, and although it's conceivable that someone could damage the OpenSSL library in such a way as to create such keys it seems unlikely. So probably all these poor quality keys were generated by hand-rolled RSA key generators that had both a shoddy algorithm AND a bad entropy source.

Suppose we are making 5 million RSA enabled electric toasters. If we embedded an older Debian in these toasters, perhaps they only have 100k distinct RSA private keys. Bad guys can download the list from Debian, or else buy a hundred thousand toasters and recover enough keys to statistically break into almost every other toaster (and... burn people's toast?)

If we instead used a correct OpenSSL implementation but we hook it up to use a very weak entropy source (e.g. xkcd's joke solution of rolling a die and hard-coding the result in the PRNG source code) when the keys are generated, perhaps every toaster gets an identical key. Bad guys need buy only one toaster to fill kitchens across the land with the smell of burning.

However if we hand-rolled our own RSA key generator, and we used a somewhat weak entropy source, then we might create the phenomenon these researchers found in a small but significant minority of wild RSA keys. Of the millions of toasters only a few tens of thousands have identical keys to one another BUT most toasters share a common prime factor with another toaster. As I understanding it in THIS scenario bad guys don't have to buy ANY toasters to get at their private keys, by examining the public keys of many toasters they can soon burn toast across the land.

How this ends up being so bad

Posted Feb 17, 2012 12:49 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

RSA depends on primes. A real RSA system uses large primes, but those boggle the human mind and make examples impractical to follow. So I shall use relatively small primes, and I will dodge lots of other complexities that make RSA hard for humans to follow.

We are toasters, for our RSA key we must pick two primes and then tell them to nobody, but we tell the product of the two numbers to everybody. Knowing the primes is the private key to our "trapdoor" which enables us to perform an operation bad guys cannot afford to, such a trapdoor function is essential to any public key system.

Because of our faulty entropy source, we always pick our prime factors from the following list:

419, 431, 457, 461, 463, 487, 491 and 523

Suppose I pick 419 and 487. I multiply them together and tell people 204053 is my public key. Meanwhile, another toaster picks 491 and 419. Their public key is 205729

Our hypothetical bad guys observe that 205729 - 204053 is 1676 (subtraction is very cheap). They then very easily find that 419 is a factor of 1676 (factorisation of small numbers is quite cheap), and it's also a factor of 204053 and 205729. They have identified both sets of private keys using only cheap operations and the public keys!

How this ends up being so bad

Posted Feb 18, 2012 1:04 UTC (Sat) by bronson (subscriber, #4806) [Link]

Now that's a great description.

How this ends up being so bad

Posted Feb 18, 2012 3:19 UTC (Sat) by rgmoore (✭ supporter ✭, #75) [Link]

The authors actually narrow in on a slightly different explanation than just a small pool of primes. They point out that OpenSSL adds a small amount of entropy (the current time in seconds) to the pool between generating the first and second prime. In that case, a very weak entropy pool will result in a lot of duplication of the first prime in the pair, while the extra entropy results in much less duplication of the second one. That will result in relatively few completely duplicated keys but many keys that share one prime.

They also did their process of finding the re-used primes differently from the way you describe. Comparing all the keys pairwise is O(N^2), which will be slow when looking at millions of keys even if you have an efficient algorithm. What they did instead is to recognize that you can look for common primes between one key and all the other keys by taking the GCD of the one key and the product of all the other keys. Taking the product of millions of keys is also slow, so they sped the process up by tweaking the algorithm so it can work with the product of all keys including the one being compared to the rest. That way they only have to do the slow multiplication step once. That ought to give them an algorithm that's something like O(N*logN), and hence much more tractable when N is large.

How this ends up being so bad

Posted Feb 18, 2012 10:07 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Yes, their method for working with huge volumes of keys is much cleverer than what I've shown (and the GCD is just better outright and I should have used it even in my illustration)

However, I am surprised that you say "They point out that OpenSSL adds a small amount of entropy ...". Do the authors actually claim this? I can't find it in their paper, indeed the paper seems to make out that the origin of the shared factors is unknown to them, and other commentators have put it down to unknown proprietary key generators rather than the easily inspected OpenSSL. Perhaps they have written about it elsewhere?

How this ends up being so bad

Posted Feb 18, 2012 15:26 UTC (Sat) by rgmoore (✭ supporter ✭, #75) [Link]

The comment about OpenSSL is in the Freedom to Tinker link tialaramex provided above. The full quote is:

OpenSSL's RSA key generation functions this way: each time random bits are produced from the entropy pool to generate the primes p and q, the current time in seconds is added to the entropy pool. Many, but not all, of the vulnerable keys were generated by OpenSSL and OpenSSH, which calls OpenSSL's RSA key generation code.

I'm a bit surprised that didn't make it into the paper.

How this ends up being so bad

Posted Feb 18, 2012 17:39 UTC (Sat) by tialaramex (subscriber, #21167) [Link]

Wow. So it does. OK, that seems like a fairly stupid decision in OpenSSL and I hope it gets changed (if your entropy is good it makes things no worse, but if it's bad then it makes things no better AND it makes it harder to spot that you've got a problem)

I presume it's not in the Lenstra et al. paper because that's a different research group. It's not clear how much, beyond the massive pile of SSL data that everyone now has access to, was shared between these groups. If the method is sound then obviously everyone trying it will find similar results (ie lots of keys with a shared factor) but they may draw different conclusions from what they find.

I guess this gives me even more reason to pay attention to the Princeton group's full paper.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds