
Sharing random bits with Entropy Broker

By Nathan Willis
April 10, 2013

Randomness: it rarely seems like it is in short supply—until you need it. Quite a few security features on a modern operating system rely on a supply of random bits: cryptographic key generation, TCP sequence number selection, and so on. The operating system can provide random bits by collecting "entropy"—in essence, measuring an unpredictable process. But there are instances when the kernel's own entropy supply cannot keep up with demand, either because demand is too high, or supply is too low. Entropy Broker is a framework that allows multiple machines to pool their entropy sources together. It allows client machines that consume more entropy than they can themselves generate to use entropy donations from server machines that produce more entropy than they need.

How random

For starters, a little terminology. Many people (and software projects) use the terms randomness and entropy interchangeably; to make the distinction a little clearer, it might help to think of entropy as a property of a physical system: clock oscillations vary minutely, the least-significant digit of the timestamp on an interrupt is unpredictable, and so on. Measuring that entropy provides input that can be turned into a sequence of random bits. But the rate at which each source can be measured to provide entropy is a property of the system, too. The clock crystal oscillates at its fixed frequency, interrupts arrive only as often as the hardware generates them, and so on.

On a typical Linux system the kernel draws on a small set of entropy sources in order to fill up the pool of random bytes it delivers in the /dev/random pseudo-file. The incoming bits of entropy are "stirred" together using one-way hash functions, in order to make them less predictable. In theory, the stirring process greatly reduces the danger that a pattern will occur in the physical process creating entropy, but one thing stirring cannot do is increase the speed at which the entropy bits are collected. And it is when the supply of collected entropy runs low that concerns arise. There are various approaches to solving this supply problem; true random number generators (RNGs), for example, can provide entropy at rates orders of magnitude higher than software collection, and there are potential entropy sources outside of what the kernel relies on, such as reading the low-level noise from microphones or video cameras.

Nevertheless, when it comes to the availability of random bits, there are concerns other than how to collect sufficient entropy. For instance, although /dev/random will block whenever it temporarily runs short of entropy bits, /dev/urandom will instead use a pseudo-random number generator. But that pseudo-random number generator needs to have a different seed value at every reboot, or it will be predictable. Immediately after boot, when no entropy has been collected, running the generator with the same seed will produce the same output. Fixing that problem can be especially difficult when dealing with virtual machines (consider cloud server images, for example, which are created from scratch on demand).
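
A toy demonstration of that seeding problem, using Python's ordinary (non-cryptographic) PRNG rather than the kernel's generator:

    import random

    # Two generators started from the same seed produce identical output --
    # the position a freshly cloned virtual machine image can find itself in
    # before any real entropy has been mixed in.
    a = random.Random(1234)
    b = random.Random(1234)
    print(a.getrandbits(64) == b.getrandbits(64))   # always True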

Entropy Broker offers a relief option for several of these scenarios. It defines "clients" as those machines needing a supply of entropy, and "servers" as machines that collect entropy and contribute it to the shared pool. A broker process handles mixing the incoming contributions together and servicing the requests from clients. The client machines could be high-volume entropy consumers (e.g., performing a lot of cryptographic work), or they could lack the entropy sources found on a normal machine. A diskless workstation, for example, would have no hard disk interrupt timings (a common entropy source) to measure; a virtual machine might have no reliable entropy sources at all.

Brokering it

Entropy Broker is maintained by Folkert van Heusden; the current release is version 2.1, released in December 2012.

At the lowest level, Entropy Broker starts with a suite of separate "server" programs (most of which can run on any UNIX-like platform, although a few are Linux-specific) that collect entropy bits from a specific source (timer jitter, audio or video4linux noise, or even a hardware RNG). The server processes are configured to send their entropy to a broker process, which can run on the same machine or on a different one. The entropy bits that each server collects are hashed, then the bits and the hash are encrypted with a pre-shared key prior to transmission; the hash allows the broker to check that the data was not tampered with during transmission. Both the cipher and the hash function used are configurable. By default, the broker, servers, and clients use TCP port 55225 to communicate, which is configurable as well.
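
As a rough illustration of that hash-then-encrypt framing (this is not the project's actual wire format, which is C++ built on Crypto++; the sketch assumes the third-party pycryptodome library, with AES and SHA-256 standing in for whatever cipher and hash are configured):

    import os, hashlib
    from Crypto.Cipher import AES   # pycryptodome; stand-in for the configurable cipher

    def frame_entropy(bits, preshared_key):
        """Hash the entropy bits, then encrypt hash+bits under the pre-shared key."""
        digest = hashlib.sha256(bits).digest()          # integrity check value
        payload = digest + bits
        payload += b"\x00" * (-len(payload) % 16)       # naive padding, sketch only
        key = hashlib.sha256(preshared_key).digest()    # derive a 256-bit AES key
        iv = os.urandom(16)
        return iv + AES.new(key, AES.MODE_CBC, iv=iv).encrypt(payload)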

In the broker process, incoming entropy packets are decrypted, and the entropy bits checked against their hash. If the hash verifies their integrity, the entropy bits are "stirred" together into a common pool, using either AES, 3DES, Blowfish, or Camellia. The stirring process takes the existing pool of entropy and encrypts it with the selected cipher, using the new entropy bits as the key (thus, the new bits mix up the old bits, thereby stirring them). As is the case with /dev/random, the broker keeps track of how many bits of entropy it has in its pool; if it runs out, it sends an "empty pool" message to its clients and servers. Conversely, if it fills its entropy pool, the broker sends a "sleep" message to its servers.
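
In the same hedged spirit, the stirring step might be sketched like this (the pool size and the fixed zero IV are simplifications, not what the broker actually does):

    import hashlib
    from Crypto.Cipher import AES   # pycryptodome again, standing in for the chosen cipher

    def stir(pool, new_bits):
        """Mix new entropy into the pool by encrypting the existing pool,
        keyed by the newly arrived bits (pool length must be a multiple of 16)."""
        key = hashlib.sha256(new_bits).digest()   # squeeze arbitrary input into a key
        return AES.new(key, AES.MODE_CBC, iv=b"\x00" * 16).encrypt(pool)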

The Entropy Broker package includes several "client" programs to run on machines that want to request entropy from the broker. The client_linux_kernel program uses entropy from the broker to fill the kernel's local entropy pool. This is the most general-purpose solution for Linux systems; it allows both high-demand and supply-poor machines to fill /dev/random, with no changes visible to user-space applications. But there are other clients as well. The client_egd program can mimic the Entropy Gathering Daemon (EGD) and provide a socket interface, and the client_file program can write entropy bits to a general-purpose file.

When a client needs some entropy, it sends a request message to the broker asking for a particular number of bits. When enough bits are available, the broker "unstirs" them from the pool by hashing the pool, then decrypting the pool with the hash value used as a key (as with stirring, this process is performed so that the bits extracted come from the pool as a whole, as opposed to pushing and popping values from a stack). The hash value is "folded" in half (by XORing the first and last halves of the hash value together, which protects against some attacks by further obfuscating the state of the pool), then it is sent to the client, and the broker decrements its count of entropy bits available.
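
Again as an illustration rather than the actual code, the extraction and folding steps look roughly like this:

    import hashlib
    from Crypto.Cipher import AES

    def extract(pool):
        """Hash the pool, decrypt the pool keyed by that hash, then fold the
        hash in half; the folded value is what goes to the client."""
        digest = hashlib.sha256(pool).digest()                        # 32 bytes
        new_pool = AES.new(digest, AES.MODE_CBC, iv=b"\x00" * 16).decrypt(pool)
        folded = bytes(a ^ b for a, b in zip(digest[:16], digest[16:]))  # XOR the halves
        return folded, new_pool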

At your service

Delivering the accumulated entropy to clients is fairly straightforward; the throttling options allow the broker to tell clients when there is not enough entropy to go around as well as to make more efficient use of hardware RNGs on the servers, which could very well be capable of producing entropy faster than the clients need it. But the entropy servers are interesting in their own right, since collecting entropy is at times a tricky affair. Entropy Broker 2.1 includes twelve servers and one "proxy" server designed to mix together less reliable entropy sources.

The best option, if it is available, is a hardware RNG device. Systems with a hardware RNG either in the chipset or connected via serial port can use server_stream, passing the device name to the server as the -d argument. For example,

    server_stream -I mybroker.example.com -d /dev/ttyS0 -s -X my_preshared_credentials.txt

where mybroker.example.com is the address of the machine running the broker process, and my_preshared_credentials.txt on the client machine contains the pre-shared password used to authenticate with the broker. There are three servers written for hardware RNG products not covered by server_stream. Systems with an EntropyKey device can use server_egd, which requires two additional parameters: -b read_interval (in microseconds) and -a bytes_to_read (per interval). As the name suggests, server_egd can also use the EGD daemon to collect entropy. The server_ComScire_R2000KU server is designed to collect entropy data from the USB ComScire R2000KU RNG device. The server_smartcard server can collect entropy from compatible ISO 7816 smart cards—although "compatible" in this case means that the card must accept the GET_CHALLENGE command, which the Entropy Broker documentation notes is not supported by every card on the market.

Some hardware RNG modules can be used by the kernel directly to fill /dev/random; on these machines the server_linux_kernel server can make use of the hardware without special configuration. Naturally, server_linux_kernel can also be used to donate entropy collected by the kernel through the normal software sources. On a machine that will be used as an entropy source, though, there are other server options worth exploring, even when no hardware RNG is available—such as the server_audio and server_v4l servers. The audio server collects Johnson-Nyquist noise data from an ALSA-compatible sound card. The sound card so utilized needs to be unused, since an active signal will drown out the desired noise. The Video4Linux2 server can extract noise from either a webcam or a TV tuner card. As with the sound card, a TV tuner needs to be picking up static, not an actual signal. It is less obvious how best to collect noise from a webcam; the Entropy Broker documentation points users toward two possibilities: LavaRnd (which requires pointing the webcam at a constantly-moving random image such as a lava lamp), and AlphaRad (which uses a cheap radiation source pulled from a home smoke detector).
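
The general idea behind such noise sources can be sketched quickly; this is not server_audio itself, and the 16-bit little-endian sample format and separate capture step are assumptions. The sketch keeps only the least-significant bit of each raw PCM sample and packs the bits into bytes:

    import struct

    def lsb_noise(pcm):
        """Extract the least-significant bit of each 16-bit PCM sample and
        pack the bits into bytes; whitening would still be applied afterward."""
        pcm = pcm[: len(pcm) & ~1]               # drop a trailing odd byte, if any
        samples = struct.unpack("<%dh" % (len(pcm) // 2), pcm)
        bits = [s & 1 for s in samples]
        out = bytearray()
        for i in range(0, len(bits) - 7, 8):
            byte = 0
            for b in bits[i:i + 8]:
                byte = (byte << 1) | b
            out.append(byte)
        return bytes(out)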

A step down from the above hardware sources, the server_usb server can extract entropy from the response times of attached USB devices (even simple input devices like keyboards). An x86-compatible system can collect entropy from server_cycle_count, which implements a variant of the HAVEGE algorithm to measure CPU flutter. The final hardware option is server_timers, which measures jitter in the length of usleep() timings. All of the servers that collect entropy from hardware sources use Von Neumann whitening to normalize the signal before sending it to the broker.
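
Von Neumann whitening itself is simple enough to sketch in a few lines (the input here is assumed to be a sequence of raw 0/1 integers):

    def von_neumann(bits):
        """Examine non-overlapping pairs of raw bits: emit 0 for the pair (0, 1),
        emit 1 for (1, 0), and discard (0, 0) and (1, 1), removing simple bias."""
        out = []
        for a, b in zip(bits[0::2], bits[1::2]):
            if a != b:
                out.append(a)
        return out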

Two other options exist if none of the others suffice. The first, server_ext_proc, allows one to use any external process as an entropy input source; the user is on his or her own as to the viability of that external process. Similarly, server_file can be used to read entropy bits from a file (the contents of which, hopefully, are replenished between reads). In either case, the Entropy Broker network protocol includes a free-form "server type" field; if the entropy source is less than reliable, at least that fact is communicated by server_ext_proc and server_file. The package does include a utility that can theoretically improve mediocre entropy, eb_proxy_knuth_m. This is a server program that mixes together two or more entropy streams, via algorithm M from Donald Knuth's The Art of Computer Programming, Volume 2, chapter 3.2.2. The same algorithm is also described in Bruce Schneier's Applied Cryptography, in case it needs additional credibility.
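
The heart of algorithm M is small; in this sketch the table size and the way the second stream is reduced to a table index are simplifications, and the two input streams are plain Python iterators rather than Entropy Broker's own sources:

    def algorithm_m(x_stream, y_stream, k=64):
        """Knuth's algorithm M (MacLaren-Marsaglia): one stream fills a table of
        k entries, the other chooses which entry to emit and refill next."""
        x = iter(x_stream)
        table = [next(x) for _ in range(k)]
        for y in y_stream:
            j = y % k                  # pick a slot using the second stream
            yield table[j]             # emit the stored value from the first stream
            table[j] = next(x)         # refill the slot from the first stream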

Altogether, Entropy Broker's server selection covers a wide range of randomness-collection options. Virtual machines may be able to use server_timers or server_cycle_count at the "low end" and contribute to a shared pool of entropy, while systems with fast hardware RNGs can be pooled together to provide high-quality entropy to a pool of clients. In between lies the possibility of a room full of underutilized file servers with built-in motherboard sound cards. There are clearly a variety of different topologies that could make use of Entropy Broker, including sharing a hardware RNG between peers (i.e., one-to-many), or feeding excess entropy into a common pool lest it go unused (i.e., many-to-one). There are some currently unimplemented features described in the network protocol, such as quotas. In the field, making sure that Entropy Broker is producing quality output can be critical, so there are several test utilities provided for plotting and otherwise measuring the randomness of the broker's shared pool. Some of Entropy Broker's more interesting servers are Linux-only (the audio and Video4Linux2 servers, for example), but others should work on other UNIX-like systems as well.

Of course, ultimately, the entropy used to fill /dev/random is important insofar as it provides good, unpredictable input to things like cryptographic key or pad generation. Providing good randomness is a small piece of the overall security puzzle, but attacks that exploit /dev/urandom and /dev/random are certainly not unheard of. For virtual machines or crypto-heavy servers, the need for quality randomness is magnified. In such cases, a framework like Entropy Broker is no silver bullet, but it can offer a flexible solution. Entropy involves a curious paradox on computer systems; although believed to be constantly increasing, it can still be hard to deliver on demand.



Sharing random bits with Entropy Broker

Posted Apr 12, 2013 5:14 UTC (Fri) by kleptog (subscriber, #1183) [Link]

The entropy bits that each server collects are hashed, then the bits and the hash are encrypted with a pre-shared key prior to transmission; the hash allows the broker to check that the data was not tampered with during transmission.
Aargh! Cryptography 101: When doing authenticated encryption, always encrypt-then-hash, never hash-then-encrypt. The former is provably secure, the latter is not. Yes, SSL does this wrong and we're still paying for it with all the security issues related to this.

It's probably too late to easily fix the protocol now...

Courtesy of Dan Boneh's online cryptography course.

Sharing random bits with Entropy Broker

Posted Apr 12, 2013 5:53 UTC (Fri) by apoelstra (subscriber, #75205) [Link]

> Aargh! Cryptography 101: When doing authenticated encryption, always encrypt-then-hash, never hash-then-encrypt. The former is provably secure, the latter is not. Yes, SSL does this wrong and we're still paying for it with all the security issues related to this.

My understanding is that it's the hashes themselves that are being transmitted (since they contain the randomness). That is, the hashes -are- the data, not hashes -of- the data.

Sharing random bits with Entropy Broker

Posted Apr 12, 2013 6:39 UTC (Fri) by kleptog (subscriber, #1183) [Link]

Well, my reading of the protocol is different. What is sent is (under 'get bits'):
- a hash of the data, using a SHA256 (*1) hash function so 32 bytes in size
- the data
    - the data + hash are encrypted with blowfish (*1)
        - the blowfish cipher is at start initialized with the user-password as the key
Essentially, an encryption algorithm is being used for authentication, which is never a good idea. There are some combinations that can be safe, but the protocol allows users to choose the encryption/hash, so you most likely have an unsafe combination. You're right that the data being sent is not the original random data, but it is whatever is going to be added to the entropy pool of the receiver.

Sharing random bits with Entropy Broker

Posted Apr 16, 2013 21:44 UTC (Tue) by alankila (subscriber, #47141) [Link]

The paper which makes the case for insecurity of the authenticate-then-encrypt method appears to construct a very peculiar encoding scheme for messages so that it can show that there are cases where information is disclosed. It achieves this by doing a thing no normal person will ever do: design multiple ways to represent the same payload data in the ciphertext by doubling each bit into 2 bits, and deciding that either '01' or '10' sequence both represents '1' in the actual data. After this it is possible to flip two subsequent bits in the cipher stream and observe if the payload changed or not, and thus gain information of the payload. I guess this sort of sleight of hand can be used to make the case that some encryption algorithm choices (such as the paper's author's!) result in information disclosure, but I have severe doubts about the practical value of this particular attack.

The use case here looks secure to me. The SHA-256 value is a simple digest over the message, and not even an authentication code because there is no key involved. I do not see how any part of the ciphertext could be manipulated without failing the subsequent digest check, except perhaps for padding which may or may not be present, and in any case would not cause changes relative to the payload data. Still, careful implementations must fail the message if the padding appears to have been tampered with.

Sharing random bits with Entropy Broker

Posted Apr 17, 2013 6:38 UTC (Wed) by kleptog (subscriber, #1183) [Link]

No need to look at academic papers, the best example is SSL/TLS. Several of the attacks over the last few years have been ones that would not be possible with encrypt-then-mac. The notable ones being padding-oracle and timing attacks.

At the very least entropy broker is not careful about its checking of the hash (using just a plain memcmp) which means it's probably vulnerable to a timing attack where the attacker could send any data and then find the corresponding hash to get it accepted. It's not clear to me whether the underlying crypto++ library does anything for the padding attacks. From a quick read of the code it's not clear to me what the padding mode is anyway.

Anyway, the important point here is that with mac-then-encrypt you have to be very careful about the implementation because side-channels like timing can reveal information about the plaintext, because decryption is done first. In contrast, with encrypt-then-mac timing does not matter in the mac check because in the worst case it can reveal information about the encrypted data, which the attacker already has. If the mac check passes you know you have a valid message so the timing of the rest doesn't matter either.

However, like you I don't think it's a practical problem in this case, because the attacker might be able to modify the message without seeing what they are modifying and since the data is random the result is random anyway. Padding-oracle would reveal the data being sent but even that would require a lot of work and not reveal very much either (the receiver would probably have gotten entropy from elsewhere in the meantime). But in general non-cryptographers should always do encrypt-then-mac so they don't need to care about the above attacks.
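
For reference, a minimal sketch of the encrypt-then-MAC pattern being advocated here, with a constant-time tag check; the key handling and toy padding are illustrative only, not anything Entropy Broker does:

    import hmac, hashlib, os
    from Crypto.Cipher import AES   # pycryptodome; enc_key must be 16, 24, or 32 bytes

    def seal(plaintext, enc_key, mac_key):
        iv = os.urandom(16)
        padded = plaintext + b"\x00" * (-len(plaintext) % 16)     # toy padding
        ct = iv + AES.new(enc_key, AES.MODE_CBC, iv=iv).encrypt(padded)
        tag = hmac.new(mac_key, ct, hashlib.sha256).digest()      # MAC over the ciphertext
        return ct + tag

    def open_sealed(blob, enc_key, mac_key):
        ct, tag = blob[:-32], blob[-32:]
        expected = hmac.new(mac_key, ct, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):                # constant-time compare
            raise ValueError("bad MAC")
        iv, body = ct[:16], ct[16:]
        return AES.new(enc_key, AES.MODE_CBC, iv=iv).decrypt(body)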

Sharing random bits with Entropy Broker

Posted Apr 17, 2013 11:57 UTC (Wed) by paulj (subscriber, #341) [Link]

The funny encoding scheme is just a way of saying that the plaintext fed into the block cipher may not be raw plaintext. It may be a protocol, with syntax and semantics that will be validated.

If that is the case, then clearly with _just_ encryption it will be possible for an attacker to twiddle the ciphertext and then exploit the fact that the resulting plaintext will go through a validation layer. If the result of that validation can be observed in some way (which often it can be, with 2-way network protocols), then information is leaked about the contents of the message. As the paper gives an example of.

You might think that the MAC of the plaintext guards against this. And I think, if all the input to the cipher is guarded, then that is the case. However, they seem to argue that sometimes ciphers DO add their own encoding, e.g. to add required padding in stream ciphers. When this is so, then you end up with the situation of above - a validation layer after decryption, which can be exploited to reveal bits. Indeed, with AtE you have a perfect validation layer to attack.

So AtE *can* be secure, however it is not generally secure. The security depends on the details of the cipher used - so it is fragile. Encrypt-then-Authenticate on the other hand will be secure, regardless of whether the cipher has an encoding phase.

Seems to be my impression of the paper. :)

My DIY entropy broker

Posted Apr 18, 2013 5:42 UTC (Thu) by cpeterso (guest, #305) [Link]

strace -itttvT nice curl --location --raw --verbose $URL 2>&1 | shasum

where $URL is https://news.google.com or https://en.wikipedia.org/wiki/Special:Random. :)

You get what you test....

Posted Jul 13, 2013 1:11 UTC (Sat) by vomlehn (subscriber, #45588) [Link]

It's a truism that you get what you test, i.e. the outcome of a process is affected and even determined by the results of the tests you use to validate it. This is something missing from /dev/random--we claim that we get good random numbers out of it, but no statistics are available on a given system to determine just how good the randomness is. You might assert that the random numbers on my 32-bit single-processor ARM-based electric meter are going to be awesome based on your tests with a 64-bit x86 cloud server with a zillion cores, but if you do, I'm going to laugh. Then I'm going to cry. Then we can spend some time looking at the contributors to /dev/random and you'll see what I mean.

This is not currently a big enough itch for me to scratch, since I'd have to brush up on random number goodness, but I know enough to say that providing some sort of basic randomness metrics in parallel with handing out values from /dev/random is straightforward. So, consider this to be the whine one makes when something is irritating enough to want *someone* to do something about a problem, but not irritating enough to just go do it myself. Folks with specific and demanding applications will want more, but those are just the people who know enough to do it themselves.

You get what you test....

Posted Sep 20, 2013 22:55 UTC (Fri) by starlight (guest, #92967) [Link]

I ran the NIST STS 2.1.1 test suite

http://www.random.org/analysis/
http://csrc.nist.gov/groups/ST/toolkit/rng/index.html

against 10 one-million-bit streams from the output of

1) physical server /dev/random with 40Kbit/sec ST33 TPM TRNG
2) physical server /dev/urandom, no extra entropy
3) virtual machine /dev/urandom, no extra entropy
4) 200MHz MIPS WRT54G running 'openwrt' 2.4 kernel
5) a TAR gzip of some source code
6) a GPG of (5)

1-4 and 6 all passed *all* the NIST tests

5 failed horribly

Also ran all the same inputs against FIPS 140-2 using 'rngtest'. This test identified one short "bit run" in two of the five good samples, but a reasonable number of bit runs are expected. As with NIST STS, FIPS 140-2 flunked sample 5.

So it would appear that TRNG (/dev/random) and TRNG+PRNG (/dev/urandom) number generation in 2.4 and 2.6 Linux kernels is as good as unclassified random number generation gets; it can be trusted as far as that goes--even when run in a virtual machine.

The Linux implementation is known to fail immediately after a reboot on devices with a fixed initial state and limited sources of entropy (such as small routers), but this case is easily identified and avoided.

Of course the problem with all this, as the Dilbert cartoon (see link above) aptly puts it, "you can never be sure." But Snowden is on record as saying modern crypto works when properly implemented, and Linux has that.

Sharing random bits with Entropy Broker

Posted Sep 20, 2013 20:01 UTC (Fri) by starlight (guest, #92967) [Link]

Cryptographic considerations aside, this project is unfortunately not anywhere close to mature. Took half a day to compile the version 2.4 (9/18/13) code and then immediately ran into a bug testing with the 'file' client and asking for 1000 bytes of random data. After appearing to transfer roughly half that the client simply hung.

In addition to that the 'entropy_broker' daemon creates a ridiculous number of temporary files. This sort of daemon should do everything in memory and shared memory.

Anyone considering trying it out should know it's written entirely in C++.

Personally I do not have time for it, so I'll be using 'virtio-rng' which is mature and is known to work.

Sharing random bits with Entropy Broker

Posted Sep 20, 2013 21:26 UTC (Fri) by starlight (guest, #92967) [Link]

Ha! Here's a "good enough" solution that will make hardcore cryptographers sputter with indignation.

I'm running an old kernel in a VM and don't want to upgrade to a new distro. Every exposed app running is compiled from recent versions as well as the latest openssl-1.0.1e, so from the perspective of elliptic-curve cipher support it's better than current RH/CentOS. No known vulnerabilities to the old kernel and I believe it's safer as the attack surface is way smaller.

So 'virtio-rng' is not likely to work here, which is why I was interested in EB.

Instead I did this:

VM HOST

nc -l -p 9898 vmguest

VM GUEST

cd /etc
mknod vmhost_random p
nc vmhost 9898 >vmhost_random &
rngd -r vmhost_random

And voilà! 40K bits/sec of high-quality random data (the host has an ST33 TPM feeding it, also via 'rngd') is now available on the guest.

rngtest -t5 </dev/random

tells me that it is so:

bits received from input: 2151424
FIPS 140-2 successes: 107
FIPS 140-2 failures: 0
FIPS 140-2(2001-10-10) Monobit: 0
FIPS 140-2(2001-10-10) Poker: 0
FIPS 140-2(2001-10-10) Runs: 0
FIPS 140-2(2001-10-10) Long run: 0
FIPS 140-2(2001-10-10) Continuous run: 0
input channel speed: (min=41.599; avg=131.038; max=4095.460)Kibits/s
FIPS tests speed: (min=71.705; avg=90.013; max=92.142)Mibits/s
Program run time: 16030586 microseconds

The older kernel seems to be conjuring more bits/sec than the host has--have to chalk that up to the radically quirky nature of all this weird entropy code.

And no, I'm not worried about someone hacking into the unencrypted TCP connection passing through the 'br0' interface virtual network on its way from host to guest.

Sharing random bits with Entropy Broker

Posted Sep 20, 2013 21:55 UTC (Fri) by starlight (guest, #92967) [Link]

Works better than I thought.

'rngd' stops reading the random data source as soon as the kernel has a full entropy pool. This causes the pipe and the two 'nc' instances to block, and halts the entropy drain on the host's /dev/random pool. So it should be possible to run several such connections.

One can always use 'ssh' to move the data over any real networks that might need to be traversed.

As is often the situation KISS is the way to go.

Sharing random bits with Entropy Broker

Posted Sep 20, 2013 23:02 UTC (Fri) by starlight (guest, #92967) [Link]

minor typographical error, it's

nc -l -p 9898 vmguest </dev/random

on the VMHOST.

Sharing random bits with Entropy Broker

Posted Sep 21, 2013 17:05 UTC (Sat) by starlight (guest, #92967) [Link]

final tweak:

'nc' is a text/line-oriented utility, so it will hang at times for lack of a NEWLINE character in the binary /dev/random stream. 'base64' corrects that. Note also the -W 3072 parameter, which tells 'rngd' to top up the entropy pool to 3/4 full before stopping, to allow local kernel entropy to be added/mixed in. Older versions of 'rngd' do not handle -W properly, so check '/proc/sys/kernel/random/write_wakeup_threshold' and add a script line to adjust the value if it's not closer to 4096 than to 0.

VM HOST

base64 </dev/random | nc -l -p 9898 vmguest

VM GUEST

cd /etc
mknod vmhost_random p
nc vmhost 9898 | base64 -d >vmhost_random &
/sbin/rngd -r vmhost_random -W 3072
