On the safety of Linux random numbers
[Posted May 9, 2006 by corbet]
Random number generation is an important operating system function. The
generation of networking sequence numbers, cryptographic session keys, and
public keys all depend on the creation of numbers which are sufficiently
random that they cannot be guessed by an attacker. Weak random numbers can
lead to session hijacking, disclosed secrets, forged identities, and
predictable umber hulks. Any system which is serious about security has to
be serious about creating good random numbers.
Doing that, however, can be a challenge for computers. As a general rule,
designers of computers like to make hardware which does the same thing
every time. Randomness is not normally a desirable feature in computer
operation; for most systems, it is restricted to emacs responding to
mistaken keystrokes. So, while there is no shortage of algorithms which
can produce a random-seeming sequence of numbers, those numbers are not
truly random. Restart the algorithm with the same initial conditions, and
the same sequence of numbers will result.
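That determinism is easy to demonstrate with any off-the-shelf pseudo-random generator; here is a minimal sketch using Python's standard library generator (any seeded PRNG would behave the same way):

```python
import random

# Two generators seeded with identical initial conditions produce
# byte-for-byte identical "random" sequences.
a = random.Random(42)
b = random.Random(42)

seq_a = [a.randint(0, 999) for _ in range(5)]
seq_b = [b.randint(0, 999) for _ in range(5)]

assert seq_a == seq_b  # same seed, same sequence -- nothing truly random here
print(seq_a)
```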
Linux implements a purely algorithmic random number generator, accessible
as /dev/urandom. Its results are good enough for most purposes,
but there are times when true randomness is needed. To that end, the
kernel attempts to harvest randomness (called "entropy") from its
environment. The timing between the keystrokes as your editor types this
article, for example, exhibits some randomness. The same is true of, for
example, the timing of disk interrupts. The lower bits of the system time
stamp counter can also provide a bit of entropy. The kernel collects this
entropy into a special pool of bits, and uses this entropy pool when true
random numbers (obtained from /dev/random) are required. The
amount of accumulated entropy is also tracked; if there is insufficient
entropy in the pool to satisfy a random number request, the requesting
process will block until the needed entropy arrives.
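From user space, both devices are ordinary files, and the kernel also exports its running estimate of the pool's entropy through procfs. A minimal sketch (the paths are Linux-specific, and this only observes the pool from outside; the collection itself happens in the kernel):

```python
import os

# /dev/urandom never blocks: it always returns output from the
# algorithmic generator, seeded from the entropy pool.
with open("/dev/urandom", "rb") as f:
    key = f.read(16)
print("session key:", key.hex())

# The kernel's estimate of accumulated entropy, in bits.  A read from
# /dev/random that asks for more than this will block until interrupt
# timings (or other sources) refill the pool.
path = "/proc/sys/kernel/random/entropy_avail"
if os.path.exists(path):  # absent on non-Linux systems
    with open(path) as f:
        print("estimated entropy bits:", f.read().strip())
```

A blocked reader of /dev/random can often be unblocked simply by wiggling the mouse or typing on the console, which is exactly the harvesting described above.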
One of the most common ways of putting entropy into the pool is to register
interrupt handlers with the SA_SAMPLE_RANDOM flag. That flag
tells the kernel that the indicated interrupt will arrive at random times,
so its timing can be used to generate entropy. This interface has been in
place for many years, but Matt Mackall has recently decided that it is not
the best way to go. So he has posted a series
of patches removing SA_SAMPLE_RANDOM from a large number of
request_irq() calls.
Most of the changes are not controversial. For example, a number of disk
drivers set SA_SAMPLE_RANDOM, but also use the block-specific
add_disk_randomness() function. Removing
SA_SAMPLE_RANDOM in those cases eliminates a source of redundant
"entropy." But Matt rekindled an old debate
when one of his patches removed SA_SAMPLE_RANDOM from a set of
network drivers.
The issue with network drivers is this: network interrupts are created by
incoming and outgoing packets. If an attacker gets access to the network
segment used by a target system, that attacker can observe the timing of
packets entering and leaving that system. The attacker can also influence
that timing by generating packets and sending them to the target in a
carefully-timed manner. Over the years, a number of people have worried
that a well-connected attacker might be able to guess the contents of the
entropy pool and predict future random numbers.
Others argue that nobody has shown a scenario where the ability to observe
and generate packet timings could actually lead to the compromise of the
entropy pool. The actual timing of packets hitting a given system can only
be reliably observed by another system on the same network segment. But
network segments are almost never shared anymore; most systems tend to be
plugged into switches, and a switch will hide packets and change their
timing. In addition, anybody who is in a position to get onto a target
system's network segment is quite likely to be able to obtain physical
access to the target itself. At that point, the installation of a
keystroke logger or hostile kernel patch seems easier than trying to guess
where the entropy pool will go.
If we assume a particularly determined and masochistic attacker, however,
then we can start to think about the other challenges this person will have
to face. One is guessing the contents of the entropy pool at a given
time. Such a guess will have to be made by observing the random numbers
generated by the system, which can be done by looking at sequence numbers
and keys emitted by that system. Then the attacker will have to find a way
to reverse the algorithm (SHA-1) which is used to generate a given random
number from the pool. That reversal will generate a large set of possible
pool values which could all hash to the same value, so the attacker must be
prepared to work with many simultaneous possibilities.
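The flavor of the problem can be shown with a toy model. This is emphatically not the kernel's extraction code: it shrinks the pool to a 16-bit stand-in and lets the attacker see one output byte, just to show that even a tiny pool yields many candidate states per observation:

```python
import hashlib

def output(pool_state: int) -> bytes:
    # Stand-in for the kernel's extraction step: hash the pool with SHA-1.
    return hashlib.sha1(pool_state.to_bytes(2, "big")).digest()

secret_pool = 0x1234
observed = output(secret_pool)[:1]  # attacker sees a truncated output

# "Reversing" SHA-1 here means brute-forcing every state consistent with
# the observation -- and many unrelated states collide on it.
candidates = [s for s in range(2**16) if output(s)[:1] == observed]
print("candidate pool states:", len(candidates))
```

With the real pool (thousands of bits) and SHA-1's full output, the candidate set is astronomically large, which is why the attacker must track many simultaneous possibilities.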
Once the pool has been guessed, it is time to predict its future value, as
determined by the incoming entropy. The problem here is that the timing of
packets on the wire does not exactly match the timing of interrupts within
the kernel. There are delays within the network card, delays in DMAing a
packet into main memory (which can be influenced by other memory traffic
being generated in the system), variable interrupt handling times caused by
critical sections which mask interrupts, cache misses, etc. Then there is
the occasional mixing of bits from the time stamp counter, the value of
which is not available to the attacker. All told, it is a fair stretch to
go from an observation of traffic on the network to any sort of guess as to
what the random number generator will produce next.
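That jitter is visible even from user space. The sketch below samples the low bits of a high-resolution timer in a tight loop; it is a userspace illustration of the noise sources described above, not the kernel's collector:

```python
import time

# Even in a tight loop with no I/O, the low byte of a high-resolution
# timestamp is perturbed by scheduling, cache misses, and other system
# activity that an outside observer cannot reproduce.
samples = [time.perf_counter_ns() & 0xFF for _ in range(1000)]
print("distinct low-byte values observed:", len(set(samples)))
```

An attacker watching the wire sees none of this per-interrupt variation, which is what makes the step from packet timing to pool contents such a stretch.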
Meanwhile, many systems running as network servers have access to
relatively few sources of entropy. If interrupt timings from network
interfaces are made unavailable, those systems could run out of entropy
altogether. Given that need, and given that most developers seem unworried
about the potential weaknesses, the use of network timings is unlikely to
go away anytime soon. What might happen, however, is the addition of some
sort of runtime configuration option. Truly paranoid administrators could
then disallow entropy from network interfaces. Those who are merely
worried could, instead, use those timings, but reduce the amount of entropy
which is credited to a network interface timing value. And most of the
rest of us will probably leave things the way they are now.
[See also: this paper by
Z. Gutterman, B. Pinkas and T. Reinman [PDF] on potential weaknesses in
the Linux random number generator (thanks to Neil Harris).]