Replacing /dev/urandom
The first of these comes from Stephan Müller, who has two independent sets of concerns that he is trying to address:
- The randomness (entropy) in the RNG, in the end, comes from sources of physical entropy in the outside world. In practice, that means the timing of disk-drive operations, human-input events, and interrupts in general. But the solid-state drives deployed in current systems are far more predictable than rotating drives, many systems are deployed in settings where there are no human-input events at all, and, in any case, the entropy gained from those events duplicates the entropy from interrupts in general. The end result, Stephan fears, is that the current RNG is unable to pick up enough entropy to be truly random, especially early in the bootstrap process.
- The RNG has shown some scalability problems on large NUMA systems, especially when faced with workloads that consume large amounts of random data from the kernel. There have been various attempts to improve RNG scalability over the last year, but none have been merged to this point.
Stephan tries to address both problems by throwing out much of the current RNG and replacing it with "a new approach"; see this page for a highly detailed explanation of the goals and implementation of this patch set. It starts by trying to increase the amount of useful entropy that can be obtained from the environment, and from interrupt timing in particular. The current RNG assumes that the timing of a specific interrupt carries little entropy — less than one bit. Stephan's patch, instead, credits a full bit of entropy for each interrupt. Thus, in a sense, this is an accounting change: there is no more entropy flowing into the system than before, but it is being recognized at a higher rate, allowing early-boot users of random data to proceed.
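To get a rough sense of what such an accounting change means, consider the toy program below; the one-bit credit, the fractional "old" credit, and the 256-bit seeding threshold are all values assumed for illustration, not numbers taken from the patch set.

```c
#include <stdio.h>

/*
 * Toy model of entropy accounting: not Stephan's code, just an
 * illustration of crediting a fixed amount of entropy per interrupt.
 * All of the numbers here are assumptions chosen for the example.
 */
#define POOL_GOAL_BITS        256   /* assumed "fully seeded" threshold */
#define OLD_CREDIT_MILLIBITS   64   /* under one bit per interrupt */
#define NEW_CREDIT_MILLIBITS 1000   /* exactly one bit per interrupt */

/* Count the interrupts needed to reach the goal at a given credit rate. */
static int interrupts_to_seed(unsigned int credit_millibits)
{
    unsigned int pool_millibits = 0;
    int interrupts = 0;

    while (pool_millibits < POOL_GOAL_BITS * 1000) {
        pool_millibits += credit_millibits;
        interrupts++;
    }
    return interrupts;
}

int main(void)
{
    printf("one bit per interrupt: seeded after %d interrupts\n",
           interrupts_to_seed(NEW_CREDIT_MILLIBITS));
    printf("fractional credit:     seeded after %d interrupts\n",
           interrupts_to_seed(OLD_CREDIT_MILLIBITS));
    return 0;
}
```

Under the one-bit rule, the pool is deemed seeded after 256 interrupts; under a fractional credit it takes many times longer, which is precisely the early-boot problem being addressed.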
Other sources of entropy are used as well when they are available; these include a hardware RNG attached to the system or built into the CPU itself (though little entropy is credited for the latter source). Earlier versions of the patch used the CPU jitter RNG (also implemented by Stephan) as another source of entropy, but that was removed at the request of RNG maintainer Ted Ts'o, who is not convinced that differences in execution time are a trustworthy source of entropy.
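For the in-CPU case, recent x86 processors expose the RDRAND instruction; the sketch below uses the GCC/Clang intrinsic, assuming a CPU with RDRAND support and compilation with -mrdrnd.

```c
#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    unsigned long long val;

    /* _rdrand64_step() returns 1 on success, 0 if no value was ready. */
    if (_rdrand64_step(&val))
        printf("RDRAND: %016llx\n", val);
    else
        fprintf(stderr, "RDRAND returned no data; retry or fall back\n");
    return 0;
}
```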
The hope is that interrupt timings, when added to whatever other sources of entropy are available, will be sufficient to quickly fill the entropy pool and allow the generation of truly random numbers. As with current systems, data read from /dev/random will remove entropy directly from that pool and will not complete until sufficient entropy accumulates there to satisfy the request. The actual random numbers are generated by running data from the entropy pool through the SP800-90A deterministic random bit generator (DRBG).
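From user space, that blocking behavior is visible through the getrandom() system call (Linux 3.17 and later, glibc 2.25 and later); this sketch shows the semantics just described rather than anything from either patch set. The GRND_RANDOM flag selects the blocking /dev/random pool:

```c
#include <stdio.h>
#include <sys/random.h>

int main(void)
{
    unsigned char key[32];

    /*
     * GRND_RANDOM draws from the blocking pool, like a read from
     * /dev/random; the call will not return until enough entropy
     * has accumulated to satisfy the request.
     */
    ssize_t n = getrandom(key, sizeof(key), GRND_RANDOM);
    if (n < 0) {
        perror("getrandom");
        return 1;
    }
    printf("got %zd bytes from the blocking pool\n", n);
    return 0;
}
```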
For /dev/urandom, another SP800-90A DRBG is fed from the primary DRBG described above and used to generate pseudo-random data. Every so often (ten minutes at the outset), this secondary generator is reseeded from the primary. On NUMA systems, there is one secondary generator for each node, keeping the random-data generation node-local and increasing scalability.
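The two-level structure might be sketched as follows; the types, the names, and the xorshift stand-in for a real SP800-90A generator are all invented for illustration:

```c
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Toy model of the two-level design; all names and types are invented. */
#define RESEED_INTERVAL (10 * 60)   /* seconds: "ten minutes at the outset" */
#define NR_NODES 4                  /* assumed NUMA node count */

struct toy_drbg {
    uint64_t state;                 /* stand-in for real SP800-90A state */
    time_t   last_reseed;
};

static struct toy_drbg primary = { .state = 0x123456789abcdef0ULL };
static struct toy_drbg secondary[NR_NODES];

/* Stand-in for a DRBG generate call: one xorshift64 step. */
static uint64_t toy_generate(struct toy_drbg *d)
{
    d->state ^= d->state << 13;
    d->state ^= d->state >> 7;
    d->state ^= d->state << 17;
    return d->state;
}

/* Node-local read path: reseed from the primary when the interval lapses. */
static uint64_t secondary_read(int node)
{
    struct toy_drbg *d = &secondary[node];
    time_t now = time(NULL);

    /* last_reseed starts at zero, forcing an initial seeding. */
    if (now - d->last_reseed >= RESEED_INTERVAL) {
        d->state = toy_generate(&primary);
        d->last_reseed = now;
    }
    return toy_generate(d);
}

int main(void)
{
    /* Each node serves requests from its own node-local generator. */
    for (int node = 0; node < NR_NODES; node++)
        printf("node %d: %016llx\n", node,
               (unsigned long long)secondary_read(node));
    return 0;
}
```

The point is the shape: reads never touch the shared entropy pool directly, and the only cross-node traffic is an occasional reseed.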
There has been a certain amount of discussion of Stephan's proposal, which is now in its third iteration, but Ted has said little beyond questioning the use of the CPU jitter technique. Or, at least, that was true until May 2, when he posted a new RNG of his own. Ted's work takes some clear inspiration from Stephan's patches (and from Andi Kleen's scalability work from last year) but it is, nonetheless, a different approach.
Ted's patch, too, gets rid of the separate entropy pool for /dev/urandom; this time, though, it is replaced by the ChaCha20 stream cipher seeded from the random pool. ChaCha20 is deemed to be secure and, it is thought, will perform better than SP800-90A. There is one ChaCha20 instance for each NUMA node, again, hopefully, helping to improve the scalability of the RNG (though Ted makes it clear that he sees this effort as being beyond the call of duty). There is no longer any attempt to track the amount of entropy stored in the (no-longer-existing) /dev/urandom pool, but each ChaCha20 instance is reseeded every five minutes.
When the system is booting, the new RNG will credit each interrupt's timing data with one bit of entropy, as does Stephan's RNG. Once the RNG is initialized with sufficient entropy, though, it switches to the current system, which credits far less entropy for each interrupt. This policy reflects Ted's unease with assuming that there is much entropy in interrupt timings; the timing of interrupts might be more predictable than one might think, especially on virtualized systems with no direct connection to real hardware.
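The shape of that policy switch is easy to sketch; the threshold and the post-initialization credit below are assumptions for the example, not values from Ted's patch:

```c
#include <stdbool.h>
#include <stdio.h>

/* Illustrative only: names and numbers are assumptions, not Ted's code. */
#define INIT_GOAL_BITS         128   /* assumed "initialized" threshold */
#define BOOT_CREDIT_MILLIBITS 1000   /* one full bit per interrupt at boot */
#define LATER_CREDIT_MILLIBITS  10   /* far smaller credit afterward */

static unsigned int pool_millibits;
static bool rng_initialized;

/* Credit entropy for one interrupt, switching policy once initialized. */
static void on_interrupt(void)
{
    pool_millibits += rng_initialized ? LATER_CREDIT_MILLIBITS
                                      : BOOT_CREDIT_MILLIBITS;
    if (!rng_initialized && pool_millibits >= INIT_GOAL_BITS * 1000) {
        rng_initialized = true;      /* fall back to conservative accounting */
        printf("initialized; switching to conservative credits\n");
    }
}

int main(void)
{
    for (int i = 0; i < 200; i++)
        on_interrupt();
    printf("estimate after 200 interrupts: %u millibits\n", pool_millibits);
    return 0;
}
```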
Stephan's response to this posting has been gracious: "In general, I have no concerns with this approach either. And thank you that some of my concerns are addressed."
That, along with the fact that Ted is the ultimate decision-maker in this case, suggests that his patch set is the one more likely to make it into the mainline. It would be most surprising to see that merging happen for 4.7 — something as sensitive as the RNG needs some review and testing time — but it could happen not too long thereafter.
Index entries for this article:
- Kernel: Random numbers
- Kernel: Security/Random number generation
- Security: Linux kernel
- Security: Random number generation
Posted May 5, 2016 8:42 UTC (Thu) by dlang (guest, #313):
Remember that even if the official estimate of the entropy in the pool is 1 bit, the pool has had a lot of odds and ends stuffed into it. The official estimate is deliberately low, so by the time userspace is invoked, there is going to be enough data there to initialize the stream cipher. The only question is how pure it is.
Posted May 6, 2016 2:02 UTC (Fri) by tytso (subscriber, #9993):
http://thread.gmane.org/gmane.linux.kernel.cryptoapi/19710
... and the latest patches can be found here:
http://git.kernel.org/cgit/linux/kernel/git/tytso/random....
Posted May 10, 2016 1:16 UTC (Tue) by tytso (subscriber, #9993):
The entropy estimation was being done for the non-blocking (/dev/urandom) pool before, but we weren't really doing anything with it, and in particular we were rate-limiting the amount we would draw down from the input pool if the non-blocking pool was getting aggressively drained. (And if you are running the Chrome browser, it was getting drained extremely aggressively, because the Chrome browser was using /dev/urandom for everything, including session keys for https connections; it wasn't using its own per-process CRNG.) So the entropy accounting for /dev/urandom was largely pointless, which is why, effectively, we have been gradually transitioning /dev/urandom into something that is more and more a CRNG. The most recent patch is just the final step in a gradual evolution.
Posted May 10, 2016 22:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523):
/dev/urandom works and is easy to use. It's also secure: you can't easily dump its state by attaching to a process with ptrace or glean it from core dumps.
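In that spirit, a common user-space pattern (a sketch, not code from this thread) is to prefer getrandom() and fall back to reading /dev/urandom on kernels that lack the system call, so the generator's state stays in the kernel rather than in process memory:

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/random.h>
#include <unistd.h>

/* Fill buf with kernel randomness; returns 0 on success, -1 on error. */
static int get_random_bytes_user(void *buf, size_t len)
{
    ssize_t n = getrandom(buf, len, 0);
    if (n == (ssize_t)len)
        return 0;
    if (n < 0 && errno != ENOSYS)
        return -1;

    /* Fallback for kernels without getrandom(2): read /dev/urandom. */
    int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
    if (fd < 0)
        return -1;
    size_t done = 0;
    while (done < len) {
        ssize_t r = read(fd, (char *)buf + done, len - done);
        if (r <= 0) {
            close(fd);
            return -1;
        }
        done += r;
    }
    close(fd);
    return 0;
}

int main(void)
{
    unsigned char key[32];

    if (get_random_bytes_user(key, sizeof(key)) != 0) {
        perror("random");
        return 1;
    }
    puts("filled 32-byte key from the kernel RNG");
    return 0;
}
```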
Posted May 16, 2016 11:56 UTC (Mon) by Otus (subscriber, #67685):
Is there any evidence for the amounts credited? My immediate impression is to expect interrupts during boot to be *less* random rather than more.
Posted May 17, 2016 4:38 UTC (Tue) by Otus (subscriber, #67685):
If 1 bit / interrupt is a good enough estimate for boot-time entropy, it ought to be good enough for later. If it is not, then we should not pretend to have enough entropy when we do not.
The Raspberry Pi also has a hardware random number generator!