The kernel's random-number generator (RNG) has seen a great deal of attention over the years; that is appropriate, given that its proper functioning is vital to the security of the system as a whole. During that time, it has acquitted itself well. That said, there are some concerns about the RNG going forward that have led to various patches aimed at improving both randomness and performance. Now there are two patch sets to consider, each of which significantly changes the RNG's operation.
The first of these comes from Stephan Müller, who has two independent sets of concerns that he is trying to address:
Stephan tries to address both problems by throwing out much of the current RNG and replacing it with "a new approach"; see this page for a highly detailed explanation of the goals and implementation of this patch set. It starts by trying to increase the amount of useful entropy that can be obtained from the environment, and from interrupt timing in particular. The current RNG assumes that the timing of a specific interrupt carries little entropy — less than one bit. Stephan's patch, instead, accounts a full bit of entropy from each interrupt. Thus, in a sense, this is an accounting change: there is no more entropy flowing into the system than before, but it is being recognized at a higher rate, allowing early-boot users of random data to proceed.
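To make the accounting difference concrete, here is a minimal Python sketch that counts how many interrupts must arrive before a given amount of entropy has been credited. The 128-bit target and the 1/64-bit rate standing in for "less than one bit" are illustrative assumptions, not values taken from either patch set.

```python
from fractions import Fraction
import math


def interrupts_needed(target_bits, credit_per_interrupt):
    """Number of interrupts required before target_bits of entropy are credited."""
    credit = Fraction(credit_per_interrupt)
    if credit <= 0:
        raise ValueError("credit per interrupt must be positive")
    return math.ceil(Fraction(target_bits) / credit)


# Assumed figures, for illustration only.
TARGET_BITS = 128
CONSERVATIVE_CREDIT = Fraction(1, 64)  # hypothetical stand-in for "less than one bit"
FULL_BIT_CREDIT = Fraction(1)          # one bit per interrupt

if __name__ == "__main__":
    print(interrupts_needed(TARGET_BITS, CONSERVATIVE_CREDIT))
    print(interrupts_needed(TARGET_BITS, FULL_BIT_CREDIT))
```

The same stream of interrupts reaches the target far sooner under the one-bit rule; nothing about the input changes, only the bookkeeping.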
Other sources of entropy are used as well when they are available; these include a hardware RNG attached to the system or built into the CPU itself (though little entropy is credited for the latter source). Earlier versions of the patch used the CPU jitter RNG (also implemented by Stephan) as another source of entropy, but that was removed at the request of RNG maintainer Ted Ts'o, who is not convinced that differences in execution time are a trustworthy source of entropy.
The hope is that interrupt timings, when added to whatever other sources of entropy are available, will be sufficient to quickly fill the entropy pool and allow the generation of truly random numbers. As with current systems, data read from /dev/random will remove entropy directly from that pool and will not complete until sufficient entropy accumulates there to satisfy the request. The actual random numbers are generated by running data from the entropy pool through the SP800-90A deterministic random bit generator (DRBG).
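The blocking behavior described here can be modeled with a toy entropy pool. This is only a sketch: the class name `EntropyPool`, the SHA-256 mixing, and the 4096-bit capacity are assumptions made for illustration and do not reflect the kernel's actual mixing function or the SP800-90A DRBG. Returning `None` stands in for the point where a read from /dev/random would block.

```python
import hashlib


class EntropyPool:
    """Toy model: mixes input with SHA-256 and tracks an entropy estimate."""

    def __init__(self, capacity_bits=4096):
        self._state = bytes(32)
        self.entropy_bits = 0
        self.capacity_bits = capacity_bits

    def mix(self, data, credit_bits):
        """Stir data into the pool and credit it with credit_bits of entropy."""
        self._state = hashlib.sha256(self._state + data).digest()
        self.entropy_bits = min(self.capacity_bits,
                                self.entropy_bits + credit_bits)

    def try_extract(self, nbytes):
        """Return nbytes of seed material, or None where a real read would block."""
        needed = nbytes * 8
        if self.entropy_bits < needed:
            return None
        self.entropy_bits -= needed
        out = hashlib.shake_256(self._state + b"extract").digest(nbytes)
        # Advance the state so the same output is never handed out twice.
        self._state = hashlib.sha256(self._state + b"advance").digest()
        return out
```

A caller would feed `mix()` from event sources and pass the result of `try_extract()` on to whatever generator produces the final output.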
For /dev/urandom, another SP800-90A DRBG is fed from the primary DRBG described above and used to generate pseudo-random data. Every so often (ten minutes at the outset), this secondary generator is reseeded from the primary. On NUMA systems, there is one secondary generator for each node, keeping the random-data generation node-local and increasing scalability.
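The primary/secondary arrangement might be sketched as follows. `HashGenerator` is a simple hash-counter construction used as a stand-in; it is not an SP800-90A DRBG, and the class names, the 32-byte seed size, and the explicit `now` argument (used here in place of a real clock) are all assumptions made for the example. Only the ten-minute interval and the one-generator-per-node layout come from the description above.

```python
import hashlib

RESEED_SECONDS = 600  # ten minutes, per the description above


class HashGenerator:
    """Stand-in generator; NOT an SP800-90A implementation."""

    def __init__(self, seed):
        self.reseed(seed)

    def reseed(self, seed):
        self._key = hashlib.sha256(b"seed" + seed).digest()
        self._counter = 0

    def generate(self, nbytes):
        out = b""
        while len(out) < nbytes:
            block_input = self._key + self._counter.to_bytes(8, "little")
            out += hashlib.sha256(block_input).digest()
            self._counter += 1
        return out[:nbytes]


class SecondaryGenerators:
    """One secondary generator per NUMA node, each seeded from the primary."""

    def __init__(self, primary, node_ids, now):
        self.primary = primary
        self.nodes = {n: HashGenerator(primary.generate(32)) for n in node_ids}
        self.last_reseed = {n: now for n in node_ids}

    def read(self, node, nbytes, now):
        # Reseed this node's generator from the primary once the interval passes.
        if now - self.last_reseed[node] >= RESEED_SECONDS:
            self.nodes[node].reseed(self.primary.generate(32))
            self.last_reseed[node] = now
        return self.nodes[node].generate(nbytes)
```

Because each node only touches its own generator on the read path, the shared primary is consulted only at reseed time.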
There has been a certain amount of discussion of Stephan's proposal, which is now in its third iteration, but Ted has said little beyond questioning the use of the CPU jitter technique. Or, at least, that was true until May 2, when he posted a new RNG of his own. Ted's work takes some clear inspiration from Stephan's patches (and from Andi Kleen's scalability work from last year) but it is, nonetheless, a different approach.
Ted's patch, too, gets rid of the separate entropy pool for /dev/urandom; this time, though, it is replaced by the ChaCha20 stream cipher seeded from the random pool. ChaCha20 is deemed to be secure and, it is thought, will perform better than SP800-90A. There is one ChaCha20 instance for each NUMA node, again, hopefully, helping to improve the scalability of the RNG (though Ted makes it clear that he sees this effort as being beyond the call of duty). There is no longer any attempt to track the amount of entropy stored in the (no-longer-existing) /dev/urandom pool, but each ChaCha20 instance is reseeded every five minutes.
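For readers unfamiliar with ChaCha20, the following sketch shows the block function being used as a random-byte generator. It follows the state layout from RFC 7539 (32-bit counter, 96-bit nonce), which is an assumption; the kernel patch may lay out the state differently. The `NodeCRNG` class, its zero nonce, and the explicit `now` argument are hypothetical conveniences for the example, not names from the patch.

```python
import struct

MASK32 = 0xFFFFFFFF
RESEED_SECONDS = 300  # five minutes, per the description above


def _rotl32(v, n):
    return ((v << n) & MASK32) | (v >> (32 - n))


def _quarter_round(s, a, b, c, d):
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = _rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = _rotl32(s[b] ^ s[c], 7)


def chacha20_block(key, counter, nonce):
    """Produce one 64-byte ChaCha20 block (RFC 7539 state layout)."""
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("key must be 32 bytes and nonce 12 bytes")
    init = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]
    init += struct.unpack("<8L", key)
    init += [counter & MASK32]
    init += struct.unpack("<3L", nonce)
    s = list(init)
    for _ in range(10):  # 10 double rounds = 20 rounds
        _quarter_round(s, 0, 4, 8, 12)   # column rounds
        _quarter_round(s, 1, 5, 9, 13)
        _quarter_round(s, 2, 6, 10, 14)
        _quarter_round(s, 3, 7, 11, 15)
        _quarter_round(s, 0, 5, 10, 15)  # diagonal rounds
        _quarter_round(s, 1, 6, 11, 12)
        _quarter_round(s, 2, 7, 8, 13)
        _quarter_round(s, 3, 4, 9, 14)
    return struct.pack("<16L", *[(x + y) & MASK32 for x, y in zip(s, init)])


class NodeCRNG:
    """Hypothetical per-node generator keyed with a 32-byte seed."""

    def __init__(self, seed, now):
        self.reseed(seed, now)

    def reseed(self, seed, now):
        self.key = seed
        self.counter = 0
        self.last_reseed = now

    def stale(self, now):
        """True once the five-minute reseed interval has elapsed."""
        return now - self.last_reseed >= RESEED_SECONDS

    def read(self, nbytes):
        out = b""
        while len(out) < nbytes:
            out += chacha20_block(self.key, self.counter, bytes(12))
            self.counter += 1
        return out[:nbytes]
```

A caller would keep one `NodeCRNG` per NUMA node and, whenever `stale()` reports true, call `reseed()` with fresh material drawn from the input pool.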
When the system is booting, the new RNG will credit each interrupt's timing data with one bit of entropy, as does Stephan's RNG. Once the RNG is initialized with sufficient entropy, though, the RNG switches to the current system, which accounts far less entropy for each interrupt. This policy reflects Ted's unease with assuming that there is much entropy in interrupt timings; the timing of interrupts might be more predictable than one might think, especially on virtualized systems with no direct connection to real hardware.
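The two-phase crediting policy can be expressed as a small state machine. In this sketch the one-bit boot-time credit comes from the description above, while the 128-bit initialization threshold and the 1/64-bit steady-state credit are hypothetical placeholders for "sufficient entropy" and "far less".

```python
from fractions import Fraction

BOOT_CREDIT = Fraction(1)        # one bit per interrupt while booting
STEADY_CREDIT = Fraction(1, 64)  # hypothetical stand-in for "far less"
INIT_THRESHOLD_BITS = 128        # hypothetical "sufficient entropy" level


class InterruptCreditor:
    """Credits interrupt timings generously until the RNG is initialized."""

    def __init__(self):
        self.credited = Fraction(0)
        self.initialized = False

    def on_interrupt(self):
        """Credit one interrupt and return the amount credited."""
        credit = STEADY_CREDIT if self.initialized else BOOT_CREDIT
        self.credited += credit
        if not self.initialized and self.credited >= INIT_THRESHOLD_BITS:
            self.initialized = True
        return credit
```

With these placeholder values, the first 128 interrupts are enough to initialize the generator, after which 64 further interrupts are needed to earn a single additional bit.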
Stephan's response to this posting has been gracious: "In general, I have no concerns with this approach either. And thank you that some of my concerns are addressed." That, along with the fact that Ted is the ultimate decision-maker in this case, suggests that his patch set is the one that is more likely to make it into the mainline; it probably will not come down to flipping a coin. It would be most surprising to see that merging happen for 4.7 — something as sensitive as the RNG needs some review and testing time — but it could happen not too long thereafter.
Replacing /dev/urandom
Posted May 5, 2016 4:17 UTC (Thu) by m45t3r (subscriber, #92849)
Replacing /dev/urandom
Posted May 5, 2016 5:46 UTC (Thu) by dlang (guest, #313)
Replacing /dev/urandom
Posted May 5, 2016 8:25 UTC (Thu) by mgedmin (subscriber, #34497)
Replacing /dev/urandom
Posted May 5, 2016 8:42 UTC (Thu) by dlang (guest, #313)
Remember that even if the official estimate of the entropy in the pool is 1 bit, the pool has had a lot of odds and ends stuffed into it. The official estimate is deliberately low, so by the time userspace is invoked, there is going to be enough data there to initialize the stream cipher. The only question is how pure it is.
Replacing /dev/urandom
Posted May 5, 2016 13:56 UTC (Thu) by jond (subscriber, #37669)
Replacing /dev/urandom
Posted May 5, 2016 14:28 UTC (Thu) by fandingo (guest, #67019)
The Raspberry Pi also has a hardware random number generator!
Posted May 6, 2016 12:27 UTC (Fri) by shane (subscriber, #3335)
Replacing /dev/urandom
Posted May 6, 2016 2:02 UTC (Fri) by tytso (✭ supporter ✭, #9993)
http://thread.gmane.org/gmane.linux.kernel.cryptoapi/19710
... and the latest patches can be found here:
http://git.kernel.org/cgit/linux/kernel/git/tytso/random....
Replacing /dev/urandom
Posted May 9, 2016 17:17 UTC (Mon) by nix (subscriber, #2304)
Replacing /dev/urandom
Posted May 10, 2016 1:16 UTC (Tue) by tytso (✭ supporter ✭, #9993)
The entropy estimation was being done for the non-blocking (/dev/urandom) pool before, but we weren't really doing anything with it, and in particular we were rate-limiting the amount we would draw down from the input pool if the non-blocking pool was getting aggressively drained. (And if you are running the Chrome browser, it was getting drained extremely aggressively because the Chrome browser was using /dev/urandom for everything, including session keys for https connections --- it wasn't using its own per-process CRNG.) So the entropy accounting for /dev/urandom was largely pointless which is why effectively we have been gradually transitioning /dev/urandom into something which is more and more a CRNG. The most recent patch is just the final step in a gradual evolution.
Replacing /dev/urandom
Posted May 10, 2016 21:13 UTC (Tue) by nix (subscriber, #2304)
Replacing /dev/urandom
Posted May 10, 2016 22:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)
/dev/urandom works and is easy to use. It's also secure - you can't easily dump its state by attaching to a process with ptrace or glean it from core dumps.
Replacing /dev/urandom
Posted May 11, 2016 20:58 UTC (Wed) by tytso (✭ supporter ✭, #9993)
Replacing /dev/urandom
Posted May 11, 2016 22:33 UTC (Wed) by nix (subscriber, #2304)
Replacing /dev/urandom
Posted May 16, 2016 11:56 UTC (Mon) by Otus (subscriber, #67685)
Is there any evidence for the amounts credited?
My immediate impression is to expect interrupts during boot to be *less* random rather than more,
Replacing /dev/urandom
Posted May 16, 2016 12:22 UTC (Mon) by tao (subscriber, #17563)
Replacing /dev/urandom
Posted May 16, 2016 13:13 UTC (Mon) by paulj (subscriber, #341)
Replacing /dev/urandom
Posted May 17, 2016 4:38 UTC (Tue) by Otus (subscriber, #67685)
If 1 bit / interrupt is a good enough estimate for boot-time entropy, it ought to be good enough for later.
If it is not, then we should not pretend to have enough entropy when we do not.
Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds