Unmixing the pool
One of the more useful outcomes from the Snowden revelations may well be the increased scrutiny of the security of our systems. While no one would (sensibly) claim that all of the problems have been found and fixed, there has been improvement in many different areas over the last year or so. One of the areas that has received some attention recently has been the kernel's random number generation. Two recent patches continue that trend, though it is hard to claim that they are the direct result of the exposure of NSA (and other secret agency) spying.
Both patches update the state of the "entropy pool" that is used to generate random numbers for both /dev/random and /dev/urandom. That pool is associated with a (conservative) estimate of the amount of entropy (i.e. state unknowable by an attacker) stored within it. Anything that is believed to truly add entropy gets credited in that estimate, while other, possibly even attacker-controlled, input is simply mixed into the pool without entropy credit. The entropy estimate is used to block /dev/random readers when the amount of data requested is larger than the amount of entropy in the pool.
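The accounting described above can be illustrated with a toy model. This is purely a sketch: the real kernel pool is a fixed-size buffer with a sophisticated mixing function, not a growing byte string, and the class and method names here are invented for illustration.

```python
import os

class ToyEntropyPool:
    """Toy model of the kernel's entropy-credit accounting.

    Illustration only: the real pool is a fixed-size buffer with an
    LFSR-style mixing function, and extraction hashes the pool state.
    """

    def __init__(self):
        self._pool = b""
        self.entropy_bits = 0  # conservative credit estimate

    def mix(self, data, credit_bits=0):
        # Anything may be mixed in; only trusted sources get credit.
        self._pool += data
        self.entropy_bits += credit_bits

    def read_random(self, nbytes):
        # /dev/random semantics: refuse (the kernel would block) when
        # the request exceeds the credited entropy.
        needed = nbytes * 8
        if needed > self.entropy_bits:
            raise BlockingIOError("would block: not enough entropy credit")
        self.entropy_bits -= needed
        return os.urandom(nbytes)  # stand-in for the extraction step

pool = ToyEntropyPool()
pool.mix(b"attacker-controlled", credit_bits=0)  # mixed in, no credit
pool.mix(os.urandom(8), credit_bits=64)          # trusted source: credited
data = pool.read_random(4)                       # 32 bits <= 64 bits: succeeds
```

After the four-byte read, 32 bits of credit remain, so a subsequent eight-byte (64-bit) request would block.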
Adding RDSEED
The first patch is from H. Peter Anvin and it simply adds support for the RDSEED instruction to the kernel. RDSEED is an instruction being added to Intel processors that returns "fully conditioned entropy that is suitable for use as seeds to a PRNG [pseudo-random number generator]". The patches use four bytes of RDSEED output to mix into the entropy pool at boot time (with no entropy credit). In addition, four bytes are generated using the instruction once per second and then mixed into the pool. It is also used to do an "emergency refill" with 64 bytes of RDSEED output if /dev/random is about to block due to a lack of entropy credit to fulfill a request. In both of the latter cases, four bits of credit are given for each byte of RDSEED output that gets mixed into the pool.
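The crediting policy works out as follows in a minimal sketch. All function and class names here are illustrative, not the kernel's; in particular, a real RDSEED caller must handle the instruction's failure case and retry, which this sketch ignores.

```python
import os

# Crediting rate from the patch: four bits per byte of RDSEED output.
RDSEED_CREDIT_BITS_PER_BYTE = 4

def rdseed_bytes(n):
    """Stand-in for RDSEED output (the real instruction can fail and
    must be retried; we fake it with the OS's own generator)."""
    return os.urandom(n)

class Pool:
    """Tracks only the credit counter; the mixing step itself is omitted."""
    def __init__(self):
        self.entropy_bits = 0
    def mix(self, data, credit_bits=0):
        self.entropy_bits += credit_bits

def boot_time_mix(pool):
    pool.mix(rdseed_bytes(4), credit_bits=0)  # boot-time mix: no credit

def periodic_reseed(pool):
    data = rdseed_bytes(4)                    # once per second
    pool.mix(data, credit_bits=len(data) * RDSEED_CREDIT_BITS_PER_BYTE)

def emergency_refill(pool):
    data = rdseed_bytes(64)                   # /dev/random about to block
    pool.mix(data, credit_bits=len(data) * RDSEED_CREDIT_BITS_PER_BYTE)

pool = Pool()
boot_time_mix(pool)     # credit stays at 0
periodic_reseed(pool)   # +16 bits
emergency_refill(pool)  # +256 bits
```

So an emergency refill alone contributes 256 bits of credit, even though 512 bits of RDSEED output were mixed in: the half-credit rate reflects the conservative stance toward the hardware source.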
Some may not be convinced that a black-box hardware random number generator (RNG) buried inside an Intel chip should be given that much (or any) entropy credit. It is a difficult question, as there is no technical barrier to the instruction returning known-to-the-NSA sequences and there is no way for anyone (at least anyone outside of Intel) to know for sure. While that may seem paranoid, many formerly paranoid scenarios have moved into the "plausible" category over the last year. That concern has not been raised about the RDSEED patches, however.
Mixing and unmixing
The other patch, from Kees Cook, would add some output from a newly instantiated hardware RNG into the entropy pool. When the RNG is registered (via hwrng_register()), sixteen bytes of its output would get mixed into the pool, but without any entropy credit. Jason Cooper was concerned that even mixing these bytes into the pool could lead to problems: "By adding this patch, even without crediting entropy to the pool, a rogue hwrng now has significantly more influence over the initial state of the entropy pools."
But Cook didn't see it as any different from mixing in other random or system-specific data at initialization time.
In addition, former random subsystem maintainer Matt Mackall brought up an important aspect of the design of the mixing function. Because it can be reversed, mixing even attacker-controlled data into the pool can never remove randomness that was there at the outset:
That means, if I have an initial secret pool state X, and hostile attacker controlled data Y, then we can do:
X' = mix(X, Y)
and
X = unmix(X', Y)
We can see from this that the combination of (X' and Y) still contain the information that was originally in X. Since it's clearly not in Y.. it must all remain in X'.
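Mackall's argument can be demonstrated concretely. The kernel's actual mixing function is a more elaborate (but equally invertible) construction; plain XOR is used here only as a simplified stand-in that has the same reversibility property.

```python
import os

def mix(x, y):
    # XOR-based mixing: a simplified stand-in for the kernel's
    # invertible mixing function.
    return bytes(a ^ b for a, b in zip(x, y))

def unmix(x_prime, y):
    # XOR is its own inverse: knowing Y lets us recover X exactly,
    # so mixing Y in cannot have destroyed any information in X.
    return bytes(a ^ b for a, b in zip(x_prime, y))

x = os.urandom(16)            # secret initial pool state
y = b"attacker-chosen-"       # hostile, fully known 16-byte input
x_prime = mix(x, y)
assert unmix(x_prime, y) == x  # X is still fully present in (X', Y)
```

Since X is recoverable from X' and Y, an attacker who controls Y but cannot see X learns nothing about the pool, and the pool loses none of its unknownness.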
That didn't entirely mollify Cooper, who was still concerned that built-in hardware RNGs would have their output mixed in early in the boot sequence. He was worried that those bytes could pollute the pool, but Mackall reiterated his argument, putting it in starker terms:
Put another way: mixing can't ever [remove] unknownness from the pool, it can only add more. So the only reason you should ever choose not to mix something into the pool is performance.
While the reversible design of the mixing function (and its implications) was clear to Mackall, it would seem that others in the kernel community did not know about it. It's an interesting property that makes perfect sense once you know about it, but is rather counter-intuitive otherwise. In any case, Cooper withdrew his objections, and hardware RNG maintainer Herbert Xu queued the patch. We should see it in 3.15.
Index entries for this article:
Kernel: Random numbers
Security: Linux kernel
Security: Random number generation
Posted Mar 13, 2014 20:04 UTC (Thu) by jimparis (guest, #38647):

That's assuming the pool state X is secret. If the HWRNG can snoop on things, like the CPU's RDRAND instruction can, then it can easily choose Y based on X, in which case the "it's clearly not in Y" assertion doesn't apply. See e.g. http://blog.cr.yp.to/20140205-entropy.html.
Posted Mar 13, 2014 20:46 UTC (Thu) by dlang (guest, #313):

And if they only know about some inputs, but not others, they can't predict the output.
Posted Mar 13, 2014 21:00 UTC (Thu) by jimparis (guest, #38647):

Assuming they have some way of getting that information out of the compromised system, sure. From the link I mentioned: "Of course, the malicious device will also be able to see other sensitive information, not just x and y. But this doesn't mean that it's cheap for the attacker to exfiltrate this information! The attacker needs to find a communication channel out of the spying device. Randomness generation influenced by the device is a particularly attractive choice of channel, as I'll explain below".
Posted Mar 20, 2014 18:29 UTC (Thu) by mentor (guest, #80761):

I'm no expert in the field, but it seems to me that even if a reversible process has a useful security property, it may not be necessary to use that reversible process to get the property.
Posted Mar 24, 2014 22:48 UTC (Mon) by kleptog (subscriber, #1183):

And if your threat model includes the possibility that the RDRAND can actually peek into the state of the pool it is being mixed into, well, then you're screwed anyway.