By Jake Edge
December 12, 2007
Linux random number generation (RNG) is often a source
of confusion to developers, but it is also an integral part of the
security of the system. It provides random data to generate cryptographic
keys, TCP sequence numbers, and the like, so both unpredictability and
very strong random numbers are required. When someone notices a flaw, or
a possible flaw, in the RNG, kernel hackers take notice.
Recurring universally unique identifiers (UUIDs), as reported by the smolt hardware
profiler client program, had some worried about problems in the
kernel RNG. As it turns out, the problem exists in
the interaction between Fedora 8 LiveCD installations and smolt –
essentially the UUID came from the CD – but it sparked a discussion
leading to some possible improvements. Along the way, some common
misconceptions about the kernel RNG were cleared up.
The kernel gathers information from external sources to provide input to
its entropy pool. This pool contains bits that have extremely strong
random properties, so long as unpredictable events (inter-keypress timings,
mouse movements, disk interrupts, etc.) are sampled. It provides direct
access to this pool via the /dev/random device. Reading from that
device will provide the strongest random numbers that Linux can offer,
depleting the entropy pool in the process. When the entropy pool runs low,
reads from /dev/random block until there is sufficient entropy.
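As a rough sketch (not code from the article or from the kernel), a program
that wants bytes straight from the entropy pool might look something like the
following; the device path is standard, but the buffer size and error handling
are only illustrative:

    /* Sketch: read 16 bytes from /dev/random; the read() may block
     * if the kernel's entropy estimate runs low. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        unsigned char buf[16];
        int fd = open("/dev/random", O_RDONLY);
        if (fd < 0) {
            perror("open /dev/random");
            return 1;
        }

        ssize_t n = read(fd, buf, sizeof(buf));  /* blocks until enough entropy */
        if (n < 0)
            perror("read");
        else
            printf("got %zd strongly random bytes\n", n);

        close(fd);
        return 0;
    }

On a quiet machine with no keyboard or mouse activity, that read() can stall
for a noticeable amount of time, which is exactly the behavior described above.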
The alternative interface, the one that nearly all programs should
use, is /dev/urandom. Reading from that device will not block.
If sufficient entropy is available, it will provide random numbers just as
strong as /dev/random; if not, it uses the SHA cryptographic hash
algorithm to generate very strong random numbers.
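A sketch of the more common case, again purely illustrative, fills a buffer
from /dev/urandom; the helper name and key size are made up for the example,
and the loop is there because read() may return fewer bytes than requested:

    /* Sketch: fill a buffer from /dev/urandom, which never blocks. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    static int get_urandom(unsigned char *buf, size_t len)
    {
        int fd = open("/dev/urandom", O_RDONLY);
        if (fd < 0)
            return -1;

        size_t off = 0;
        while (off < len) {
            ssize_t n = read(fd, buf + off, len - off);
            if (n <= 0) {
                close(fd);
                return -1;
            }
            off += n;
        }
        close(fd);
        return 0;
    }

    int main(void)
    {
        unsigned char key[32];  /* e.g. a 256-bit session key */
        if (get_urandom(key, sizeof(key)) != 0) {
            perror("get_urandom");
            return 1;
        }
        printf("filled %zu key bytes from /dev/urandom\n", sizeof(key));
        return 0;
    }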
Developers often overestimate how strong their random numbers need to be;
they also overestimate how easy "breaking" /dev/urandom would be,
which leads to programs that unnecessarily read /dev/random. Ted
Ts'o, who wrote the kernel RNG, puts it this way:
Past a certain point /dev/urandom will start returning results which
are cryptographically random. At that point, you are depending on the
strength of the SHA hash algorithm, and actually being able to not
just to find hash collisions, but being able to trivially find all or
most possible pre-images for a particular SHA hash algorithm. If that
were to happen, it's highly likely that all digital signatures and
openssh would be totally broken.
There is still a bit of a hole in all of this: how does a freshly installed
system, with little or no user interaction yet, get its initial
entropy? When Alan Cox and Mike McGrath started describing the smolt
problem, the immediate reaction was to look closely at how the entropy pool
was being initialized. While that turned out not to be the problem, it did
lead Matt Mackall, maintainer of the kernel RNG, to start thinking about better pool
initialization. Various ideas about mixing in data specific to the
host, like MAC addresses and PCI device characteristics, were discussed.
As Ts'o points out, that will
help prevent things like UUID collisions, but it doesn't solve the problem
of predictability of the random numbers that will be generated by these
systems.
In order to do that we really do need to improve
the amount of hardware entropy we can mix into the system. This is a
hard problem, but as more people are relying on these facilities, it's
something we need to think about quite a bit more!
Linux provides random numbers suitable for nearly any purpose via
/dev/urandom. For the truly paranoid, there is also
/dev/random, but developers would do well to forget that device
exists for everything but the most critical needs. If one is generating a
large key pair to use for the next century, using some data from
/dev/random is probably right. Anything with lower requirements
should seriously consider /dev/urandom.