The search for truly random numbers in the kernel
Random number generation in Linux seems to have been fairly well thought out with no obvious mistakes. But that does not mean that all is perfect, or that improvements are not possible. The kernel's random number generator has been the subject of a few different conversations recently, some of which will be summarized here.
Hardware random number generators
A program running on a computer is a deterministic state machine that cannot, on its own, generate truly random numbers. In the absence of a source of randomness from the outside world, the kernel is reduced to the use of a pseudo-random number generator (PRNG) algorithm that, in theory, will produce numbers that could be guessed by an attacker. In practice, guessing the results of the kernel's PRNG will not be an easy task, but those who concern themselves with these issues still believe that it is better to incorporate outside entropy (randomness) whenever it is possible.
One obvious source of such randomness would be a random number generator built into the hardware. By sampling quantum noise, such hardware could create truly random data. So it is not surprising that some processors come with RNGs built in; the RDRAND instruction provided by some Intel processors is one example. The problem with hardware RNGs is that they are almost entirely impossible to audit; should some country's spy agency manage to compromise a hardware RNG, this tampering would be nearly impossible to detect. As a result, people who are concerned about randomness tend to look at the output of hardware RNGs with a certain level of distrust.
Some recently posted research [PDF] can only reinforce that distrust. The researchers (Georg T. Becker, Francesco Regazzoni, Christof Paar, and Wayne P. Burleson) have documented a way to corrupt a hardware RNG by changing the dopant polarity in just a few transistors on a chip. The resulting numbers still pass tests of randomness, and, more importantly, the hardware still looks the same at almost every level, regardless of whether one looks at the masks used or whether one looks at the chip directly with an electron microscope. This type of hardware compromise is thus nearly impossible to detect; it is also relatively easy to carry out. The clear conclusion is that hostile hardware is a real possibility; the corruption of a relatively simple and low-level component like an RNG is especially so. Thus, distrust of hardware RNGs would appear to be a healthy tendency.
The kernel's use of data from hardware RNGs has been somewhat controversial from the beginning, with some developers wanting to avoid such sources of entropy altogether. The kernel's approach, though, is that using all available sources of entropy is a good thing, as long as it is properly done. In the case of a hardware RNG, the random data is carefully mixed into the buffer known as the "entropy pool" before being used to generate kernel-level random numbers. In theory, even if the data from the hardware RNG is entirely hostile, it cannot cause the state of the entropy pool to become known and, thus, it cannot cause the kernel's random numbers to be predictable.
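The property described above can be illustrated with a toy sketch. This is not the kernel's actual mixing code (the real input-mixing function is far more elaborate, folding data through a polynomial over a large pool); it only shows, for the simplest possible mixer, why attacker-chosen input cannot make a secret pool more predictable: XOR with a fixed value is a bijection on pool states, so no information about the pool is lost or revealed.

```python
# Toy illustration (not the kernel's real mixing function): XORing
# attacker-controlled data into a secret pool leaves the pool exactly
# as unpredictable as before.
import secrets

POOL_BYTES = 16

def mix(pool: bytes, incoming: bytes) -> bytes:
    """Mix incoming data into the pool by XOR (simplified)."""
    return bytes(p ^ i for p, i in zip(pool, incoming))

# A secret pool the attacker does not know.
pool = secrets.token_bytes(POOL_BYTES)

# Fully hostile, attacker-chosen input...
hostile = bytes(POOL_BYTES)           # all zeroes, the worst case

# ...maps each possible pool state to exactly one new state:
mixed = mix(pool, hostile)
assert mixed == pool                  # XOR with zeroes is the identity
assert mix(mixed, hostile) == pool    # and mixing is invertible
```

Since the mapping is invertible for any fixed input, an attacker who controls the input but does not know the pool learns nothing about it from the mixing step.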
Given the importance of this mixing algorithm, it was a little surprising to see, earlier this month, a patch that would allow the user to request that the hardware RNG be used exclusively by the kernel. The argument for the patch was based on performance: depending entirely on RDRAND is faster than running the kernel's full mixing algorithm. But the RNG is rarely a performance bottleneck in the kernel, and the perceived risk of relying entirely on the hardware RNG was seen as being far too high. So the patch was not received warmly and had no real chance of being merged; sometimes it is simply better not to tempt users to compromise their security in the name of performance.
H. Peter Anvin raised a related question: what about hardware RNGs found in other components, and, in particular, in trusted platform module (TPM) chips? Some TPMs may have true RNGs in them; others are known to use a PRNG and, thus, are fully deterministic. What should the kernel's policy be with regard to these devices, which, for the most part, are ignored currently? The consensus seemed to be that no particular trust should be put into TPM RNGs, but that using some data from the TPM to seed the kernel's entropy pool at boot could be beneficial. Many systems have almost no entropy to offer at boot time, so even suspect random data from the TPM would be helpful early in the system's lifetime.
Overestimated entropy
As noted above, the kernel attempts to pick up entropy from the outside world whenever possible. One source of entropy is the timing of device interrupts; that random data is obtained by (among other things) reading the time stamp counter (TSC) with a call to get_cycles() and using the least significant bits. In this way, each interrupt adds a little entropy to the pool. There is just one little problem, pointed out by Stephan Mueller: on a number of architectures, the TSC does not exist and get_cycles() returns zero. The amount of entropy found in a constant stream of zeroes is rather less than one might wish for; the natural consequence is that the kernel's entropy pool may contain less entropy than had been thought.
The most heavily used architectures do not suffer from this problem; on the list of those that do, the most significant may be MIPS, which is used in a wide range of home network routers and other embedded products. As it turned out, Ted Ts'o had already been working with the MIPS maintainers to find a solution to this problem. He didn't like Stephan's proposed solution — reading a hardware clock if get_cycles() is not available — due to the expense; hardware clocks can take a surprisingly long time to read. Instead, he is hoping that each architecture can, somehow, provide some sort of rapidly increasing counter that can be used to contribute entropy to the pool. In the case of MIPS, there is a small counter that is incremented each clock cycle; it doesn't hold enough bits to work as a TSC, but it's sufficient for entropy generation.
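The idea might be sketched as follows; the names and bit widths here are hypothetical, not actual kernel interfaces. Even a counter too narrow to serve as a TSC still yields a few unpredictable low-order bits each time an interrupt samples it, and those bits are all the entropy-gathering code needs.

```python
# Illustrative sketch (hypothetical names, not kernel APIs): a small,
# fast-wrapping cycle counter can still contribute entropy, because
# interrupt arrival times are unpredictable in its low-order bits.

COUNTER_BITS = 10          # width of a small hardware cycle counter
ENTROPY_BITS = 4           # low-order bits sampled per interrupt

def sample_low_bits(counter_value: int) -> int:
    """Keep only the least significant ENTROPY_BITS of the counter."""
    return counter_value & ((1 << ENTROPY_BITS) - 1)

# Simulated counter readings at interrupt time; the exact values are
# unpredictable, which is what makes their low bits useful.
arrivals = [12345, 67890, 13579, 24680]
samples = [sample_low_bits(t % (1 << COUNTER_BITS)) for t in arrivals]
assert all(0 <= s < (1 << ENTROPY_BITS) for s in samples)
```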
In the end, a full solution to this issue will take a while but, Ted said, that is not necessarily a problem; it is better to take some time and solve the problem properly.
Meanwhile, Peter came to another conclusion about the entropy pool: when the kernel writes to that pool, it doesn't account for the fact that it will be overwriting some of the entropy that already exists there. Thus, he said, the kernel's estimate for the amount of entropy in the pool is almost certainly too high. He put together a patch set to deal with this problem, but got little response. Perhaps that's because, as Ted noted in a different conversation, estimating the amount of entropy in the pool is a hard problem that cannot be solved without knowing a lot about where the incoming entropy comes from.
The kernel tries to deal with this problem by being conservative in its accounting for entropy. Quite a few sources of unpredictable data are mixed into the pool with no entropy credit at all. So, with luck, the kernel will have a vague handle on the amount of entropy in the pool, and its mixing techniques and PRNG should help to make its random numbers thoroughly unpredictable. The end result should be that anybody wanting to attack the communications security of Linux users will not see poor random numbers as the easiest approach; in this world, one cannot do a whole lot better than that.
Posted Sep 19, 2013 3:55 UTC (Thu)
by lsl (subscriber, #86508)
2.6? Optimistic. There are consumer routers sold today that ship with a 2.4 kernel.
Posted Sep 19, 2013 6:25 UTC (Thu)
by eru (subscriber, #2753)
Posted Sep 20, 2013 17:13 UTC (Fri)
by broonie (subscriber, #7078)
Posted Sep 20, 2013 17:37 UTC (Fri)
by pizza (subscriber, #46)
I simply don't have any meaningful source of entropy available, which is why I'm resorting to hacking together some sort of USB-based RNG.
Posted Sep 20, 2013 20:33 UTC (Fri)
by giraffedata (guest, #1954)
Posted Sep 21, 2013 13:19 UTC (Sat)
by broonie (subscriber, #7078)
Posted Sep 19, 2013 11:54 UTC (Thu)
by alankila (guest, #47141)
Posted Sep 19, 2013 14:04 UTC (Thu)
by cesarb (subscriber, #6266)
Fortuna has two parts. The generator is simply AES (or another good block cipher) in counter mode, with the key being the state. When you request random bytes, it generates the requested bytes plus a new key (rotating the state).
The accumulator has 32 pools. Incoming entropy is uniformly distributed between the pools. Each pool is used for seeding half as often as the previous pool, so the first pool is used every reseed, the second pool every other reseed, the third pool every four reseeds, and so on. A reseed happens just before the generator is called, but only if the first pool has received more than a fixed number of bytes, and at most once every 0.1 seconds.
With this design, at least one of the pools will have received enough entropy when it comes its turn to be used to reseed the generator. You do not have to know which one, so there is no need to estimate entropy. The state for the generator is the counter and the key, and the state for the accumulator is a rolling hash for each pool (which summarizes its contents) plus the size of the first pool and the last reseed time.
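The reseed schedule described above can be sketched in a few lines: on the r-th reseed (counting from one), pool i takes part if and only if 2**i divides r. This is an illustration of the schedule only, not an implementation of Fortuna.

```python
# Sketch of Fortuna's reseed schedule: pool 0 is used every reseed,
# pool 1 every other reseed, pool 2 every fourth, and so on. Pool i
# participates in reseed r exactly when 2**i divides r.

NUM_POOLS = 32

def pools_for_reseed(r: int) -> list:
    """Return the pool indices emptied into the r-th reseed (r >= 1)."""
    return [i for i in range(NUM_POOLS) if r % (2 ** i) == 0]

assert pools_for_reseed(1) == [0]          # every reseed uses pool 0
assert pools_for_reseed(2) == [0, 1]
assert pools_for_reseed(4) == [0, 1, 2]
assert pools_for_reseed(6) == [0, 1]       # 4 does not divide 6
```

Because pool 31 is drawn on only once every 2**31 reseeds, an attacker who can predict the fast-draining pools still faces a pool that has accumulated entropy slowly enough to be unguessable, without anyone having to estimate how much entropy each source provided.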
For more information, see http://th.informatik.uni-mannheim.de/people/lucks/papers/... and https://en.wikipedia.org/wiki/Fortuna_%28PRNG%29.
I believe the current Linux PRNG was originally derived from the one used by PGP, with modifications accumulating over time. Fortuna (and its predecessor Yarrow) is a more modern design.
Posted Sep 19, 2013 17:26 UTC (Thu)
by kleptog (subscriber, #1183)
We could just drop /dev/random altogether and symlink it to /dev/urandom. Which is what using Fortuna would amount to.
Posted Sep 19, 2013 20:56 UTC (Thu)
by cesarb (subscriber, #6266)
Both /dev/random and /dev/urandom are secure pseudo-random number generators, and both are seeded with external entropy in the same way. The only difference between them is that /dev/random blocks if the entropy estimate is too low, while /dev/urandom does not.
You could use Fortuna for /dev/random too, you would just need to add some sort of entropy estimator, to keep compatibility with userspace which expects it to block arbitrarily. Since most users (even cryptographic libraries) seem to use /dev/urandom, any problems with this entropy estimator would not affect them.
And for users of /dev/random, you would gain the advantage that the quality of the random numbers would not be affected by bugs in the entropy estimator.
Yes, an overestimation would mean that more numbers would be generated from the same amount of entropy. But Fortuna (and Yarrow) is designed so that, as long as the generator has more than a minimum amount of good quality entropy to start with (256 bits IIRC), and the state is not compromised, the output is secure even without reseeding (if the state can be compromised from the output, AES is broken and you have bigger problems). After the initial seeding, the reseeding is there only to recover if the generator state is compromised; see the presentation for more details.
Posted Sep 24, 2013 21:29 UTC (Tue)
by joern (guest, #22392)
One of the two might have a more pleasing design. But as long as both get the job done equally well, why would you want to do a replacement? You can easily introduce subtle bugs and, without a strong upside, I would like to avoid that possibility.
Posted Sep 19, 2013 15:04 UTC (Thu)
by faramir (subscriber, #2327)
Personally, I would be happy to contribute $20 to a kickstarter project that wrote the appropriate software for Linux; designed and built the hardware; and made both the software and hardware "open source". For the paranoid, make it an option to just get an unpopulated circuit board so the recipient could source the rest of the off-the-shelf components from wherever they like. Many people may not be able to design the requisite hardware, but almost anyone who cares could learn to solder well enough to put such a device together.
Finally, given the existence of USB flash drives which
1. Pretend to have a higher capacity than they actually do (in order to commit fraud on the purchaser)
2. Pretend to be a USB keyboard/mouse for various nefarious purposes
this would seem to be a simple project for someone with hardware design skills. (Which is unfortunately not me.)
Posted Sep 19, 2013 15:11 UTC (Thu)
by mpr22 (subscriber, #60784)
Posted Sep 19, 2013 16:36 UTC (Thu)
by nix (subscriber, #2304)
Posted Sep 19, 2013 15:21 UTC (Thu)
by pizza (subscriber, #46)
There's an outfit out of the UK (www.entropykey.co.uk) that was selling such a widget for about $60, but they've hit some sort of unrelated financial problems and haven't been able to make/ship any for some time now.
In the mean time, I'm actually working on a USB-attached RNG now, utilizing an STM32F4 MCU, which has a high-quality hardware RNG onboard. You can buy the eval boards for $15 in single-unit quantity (STM32F4DISCOVERY), so I don't think there's much of a point in trying to design a custom board since we won't be able to meet that price target without a large enough initial order.
When I get it working (The USB stack is a bit of a PITA), I'll be releasing the firmware (and appropriate Linux code) under the GPL. If there's enough interest in dedicated hardware, perhaps Kickstarter may be an option, hmm.
Posted Sep 19, 2013 15:46 UTC (Thu)
by felixfix (subscriber, #242)
Posted Sep 19, 2013 17:52 UTC (Thu)
by daney (guest, #24551)
Posted Sep 19, 2013 18:06 UTC (Thu)
by pizza (subscriber, #46)
Meanwhile, I won't be using the RNG output of the STM32 directly; it will be mixed and mangled before being passed to the host -- and since Linux will mix it with its other entropy sources, it's considerably less likely to be a problem.
Besides, let's be honest here, if you distrust commercial RNGs, wouldn't any random pre-packaged RNG design be equally suspect? Just because the design/code is open source doesn't mean there's not a weakness in it that only the NSAs in the world are capable of recognizing. And besides, even assuming noble intentions, designing a good RNG is *hard*; I'm actually more likely to introduce weaknesses (as opposed to improvements) with my meddling.
Posted Sep 20, 2013 1:28 UTC (Fri)
by gmaxwell (guest, #30048)
On many systems the only thing that adds to the pool estimate anymore is the TSC ... at about 50 bits per second. It's very easy to drain the cruddy 4kbit pool dry. There is no good way to increase the size of the pool— you have to modify the kernel and that means moving off a distribution kernel image to something you have to maintain for yourself.
Posted Sep 20, 2013 19:01 UTC (Fri)
by ikm (guest, #493)
Do you have a proof of that? Even simply XORing new entropy into the current seed would never make the seed less random, given that the sources of entropy don't have access to its current value. I actually doubt that it is possible to decrease entropy at all, even if it is certainly possible to not increase it (in the case where you're mixing in zeroes, for example).
Posted Sep 20, 2013 19:10 UTC (Fri)
by gmaxwell (guest, #30048)
The entropy estimator is a counter. It is incremented by a scarce few entropy sources in the kernel and decremented on reads from /dev/random. When it reaches zero, /dev/random blocks. I am talking about the housekeeping, not the quality of the CSPRNG.
On a busy system (e.g. one with a lot of SSL activity) it is very easy to get mysteriously poor performance— things like SSH taking a long time to connect, as various random tasks block trying to read a few bytes from /dev/random.
Here is a monitoring graph from a system which has this problem (but normally has a daemon running that keeps the pool full; you can see what happens when the daemon is down): http://mf4.xiph.org/munin/xiph.org/mf4.xiph.org/entropy-y... (the graph claims 'bytes' but it's really bits, and when it's at zero, ssh connections and such start becoming visibly slow)
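The accounting described here can be modeled as a simple counter; the numbers below are illustrative and this is a toy model, not the kernel's code. It shows why a pool credited at roughly 50 bits per second drains quickly under load.

```python
# Toy model of /dev/random's entropy accounting: a counter that is
# credited by entropy sources, debited by reads, and capped at the
# pool size. A read that cannot be covered "blocks" (returns False).

POOL_SIZE_BITS = 4096      # the "cruddy 4kbit pool"

class EntropyCounter:
    def __init__(self):
        self.bits = 0

    def credit(self, bits):
        """An entropy source adds to the estimate (capped at pool size)."""
        self.bits = min(self.bits + bits, POOL_SIZE_BITS)

    def read(self, bits):
        """A /dev/random read debits the estimate; False means 'block'."""
        if self.bits < bits:
            return False
        self.bits -= bits
        return True

c = EntropyCounter()
c.credit(50)               # ~50 bits/s from the TSC, per the comment
assert c.read(32)          # one small read succeeds...
assert not c.read(32)      # ...and the next one already blocks
```

At 50 bits per second of credit, a handful of concurrent consumers pulling even a few bytes each is enough to pin the counter at zero, which is exactly the stalling behavior described above.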
Posted Sep 20, 2013 19:32 UTC (Fri)
by ikm (guest, #493)
Posted Sep 20, 2013 20:00 UTC (Fri)
by gmaxwell (guest, #30048)
When long-term secrets are used for signing with DSA then whatever argument for /dev/random there was in the first place also really applies to the nonce generation— since weak nonces will leak the private key.
To some extent there is pressure on developers to use the "more secure" thing so long as it exists. No one wants to be wearing the dunce cap for some massive security compromise.
But it would be nice if there were enough space in the pool that it wasn't quite so much of a trap.
Posted Sep 20, 2013 18:22 UTC (Fri)
by ikm (guest, #493)
In this case, one small source of entropy which could be used is the caller's instruction pointer (i.e. the address where get_cycles() should return to). Since the order in which get_cycles() is called due to hardware interrupts is unpredictable in most cases, this would certainly be better than always feeding zeroes.
Another option is mixing in the current top of stack pointer (be it the kernel or the userspace one), depending on the architecture implementation.
While giving this any entropy credit would be wrong, the randomness it provides can be useful.
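The fallback proposed here might look something like the following sketch; this is illustrative Python, not kernel code, and the function name is made up. The point is simply to substitute a non-constant input when the cycle counter yields nothing.

```python
# Sketch of the suggested fallback (hypothetical name, not a kernel
# API): when the cycle counter reads as zero, fold the interrupted
# instruction pointer into the pool word instead of a constant zero.
# No entropy would be credited for this input.

def mix_interrupt_input(pool, cycles, caller_ip):
    """Fold whichever input is available into a 32-bit pool word."""
    word = cycles if cycles != 0 else caller_ip
    return (pool ^ word) & 0xFFFFFFFF

# With a working counter the cycles value is used; without one, the
# caller's address at least varies from interrupt to interrupt.
assert mix_interrupt_input(0, 0, 0xC0DE) == 0xC0DE
assert mix_interrupt_input(0, 5, 0xC0DE) == 5
```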
I thought broonie's point was that you can use a sound processor to sample eru's transistor's output.
It has already been done, and indeed has been mentioned in LWN comments in the past.