Random numbers from CPU execution time jitter

Posted Apr 30, 2015 14:26 UTC (Thu) by tpo (subscriber, #25713)
In reply to: Random numbers from CPU execution time jitter by epa
Parent article: Random numbers from CPU execution time jitter

> You may be right but the thing about an entropy pool is that mixing in some new data can never make things worse.

Your assertion is wrong unless you qualify it more precisely.

Let's say you have some entropy pool and add /dev/null as a further source to it. Now, depending on the size of the pipe that sucks randomness out of that pool it might be that the pool is empty - except for /dev/null.

So if instead of blocking you now continue to feed the pipe from /dev/null, then randomness disappears into complete determinism.

So I think you have to explain how the output of your entropy pool is actually mixed before asserting that "it never can make things worse".

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 15:08 UTC (Thu) by fandingo (guest, #67019) [Link] (7 responses)

Your criticism is predicated on a complete change of how entropy pools are used. Randomness is never "sucked out" of an entropy pool. New random data is folded into the existing data, and the overall pool never decreases in size. The output of an entropy pool is transformed data, too, so you're never giving out the seed data (because that would disclose state).

(This seems to confuse a lot of people when they look at the blocking behavior of /dev/random. The pool never depletes, but a calculation of the quality of the randomness in the pool -- i.e. the entropy -- causes blocking, not a depletion of the actual data.)

That's why adding data doesn't hurt. If you have an entropy pool that you trust at t1, folding in a bunch of low-quality data still leaves you with the original t1 randomness.

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 15:26 UTC (Thu) by tpo (subscriber, #25713) [Link]

Thanks for enlightening me!
*t

Actually still confused

Posted Apr 30, 2015 15:51 UTC (Thu) by tpo (subscriber, #25713) [Link] (5 responses)

On further searching around I find, that I still do not understand the basic mechanism of an entropy pool.

My current mental model of the likes of /dev/random is that one starts with a certain amount of gathered randomness R (the seed). Then, once someone pulls data from /dev/random, you apply some function f(R,t) or fn(R,fn-1) that calculates/generates random bytes from the initially gathered seed either incrementally via reiteration or by including some monotonically increasing input such as a clock.

Now, as you explain, I effectively am confused by the fact that let's say ssh-keygen or openssl keygen will block and wait to gather more entropy even after the machine has been running for months and thus has seen "infinite" amounts of randomness. What is the reason to repeatedly start gathering further entropy at that point if as you seem to imply generating an infinite amount of random bytes from the initial seed does not reduce the random quality of future generated random bytes?

I've checked "entropy" and "entropy pool" on Wikipedia, but either I misunderstand it or Wikipedia is confused when using phrases like "entropy depletion" and similar, which according to what you say is not possible inside an entropy pool.

Is there basic, coherent explanation of the whole mechanism somewhere?
*t

Actually still confused

Posted Apr 30, 2015 18:10 UTC (Thu) by fandingo (guest, #67019) [Link] (3 responses)

> Then, once someone pulls data from /dev/random, you apply some function f(R,t) or fn(R,fn-1) that calculates/generates random bytes from the initially gathered seed

Not exactly and perhaps this can clear up some of the confusion. There is an entropy pool of data that is filled immediately and always stays full. Over time, this pool has new random data from a variety of sources mixed into it. As data is mixed in, the kernel estimates how much entropy it thinks is now in the pool and sets a counter appropriately. In the background, there is a kernel thread that checks a different output pool. If the pool isn't full, f(epool) is run to populate the output pool.

Both urandom and random are using the same f() and entropy pool, but they do get individual output pools. The only difference between urandom and random is that the background worker to populate random's output pool will block if the estimated entropy is too low.

> I've checked "entropy" and "entropy pool" on Wikipedia, but either I misunderstand it or Wikipedia is confused when using phrases like "entropy depletion" and similar, which according to what you say is not possible inside an entropy pool.

Check out this image: https://i.imgur.com/VIPLO2d.png from this paper: https://eprint.iacr.org/2012/251.pdf *

There is not much information and a lot of confusion about PRNG and CSPRNG. Linux implements a CSPRNG, and the CS stands for cryptographically secure. This means that a partial disclosure of PRNG state should not compromise the output both forwards and backwards. That's the heart of why the kernel says it "consumes" estimated entropy as the entropy pool data is used for output. The state must continually incorporate new entropy data and mix the pool, or else partial state disclosures can make outputted data predictable.

There are a lot of people that disagree with the blocking nature of /dev/random and how much of the CSPRNG operates. In particular, FreeBSD has a nonblocking /dev/random. They also use the very fast arc4 for their output function. Personally, I prefer the Linux CSPRNG because I like the security considerations, even though they come at a high performance cost. It's better to get high quality and secure random data (that includes urandom) from the kernel, and then feed it in as the key for a very fast stream cipher if that's what you need. (For example, `dd if=/dev/urandom of=/dev/sda` is a terrible misuse. Instead, use something like `openssl enc -aes128 -k "shred" < /dev/urandom > /dev/sda`.)

* This is an excellent paper that covers the CSPRNG in both an approachable and mathematical methodology.

Actually still confused

Posted May 1, 2015 10:10 UTC (Fri) by tpo (subscriber, #25713) [Link] (1 responses)

So I've been reading that paper [1] and think it is neither particularly clear nor precise. But my opinions about that paper are besides the issue that I'm interested in, which is how the random generator actually works.

The one thing that has become clearer to me - thank you! - is that there exists a mechanism to add input data to the entropy pool, which has the property of not reducing the existing entropy in the pool no matter what the entropy quality of the new input data is. I've not verified that claim, but assume it true, it being a long standing mathematical finding. That's good news to me.

However you write:

> There is an entropy pool of data that is filled immediately and always stays full. Over time, this pool has new random data from a variety of sources mixed into it. As data is mixed in, the kernel estimates how much entropy it thinks is now in the pool and sets a counter appropriately. In the background, there is a kernel thread that checks a different output pool. If the pool isn't full, f(epool) is run to populate the output pool.

I think the contentious claim here is "the entropy pool ... always stays full". If you mean "stays full" in the sense of "a stack that never gets an element popped out from it" then I agree with that, since the pool is a fixed size structure, that, even if it were "empty" still contains "something" even if its "only all zeros". However that is not what is relevant in this discussion. The relevant thing is that by generating random data from that pool you transfer entropy out of the entropy pool. I quote the paper:

"When k bytes need to be generated, ... k output bytes are generated
from this pool and the entropy counter is decreased by k bytes."

Thus if we measure "fullness" by the here relevant metric of "amount of entropy" contained in the entropy pool, then the pool is *not* always full and in fact sometimes even empty as in the case where you have ssh-keygen pulling random data out of /dev/random and blocking because the kernel is unable to refill the entropy pool from its entropy sources.

All this said, the above is only my understanding acquired by reading what I have been referred to and what I could find. My understanding may well still be insufficient and wrong. If you've put up enough with an ignorant of my likeness then I can fully understand that. Otherwise I'll be happy to hear more and try to improve my understanding of the matter.

Thanks,
*t

[1] https://eprint.iacr.org/2012/251.pdf

Actually still confused

Posted May 1, 2015 13:52 UTC (Fri) by cesarb (subscriber, #6266) [Link]

Think of the pool's entropy as a "measure of unpredictability". If the entropy is ten bits, for instance, you'd need at most 1024 guesses to find the pool state.

You should think of the random generator as having two separate and independent parts: the pool itself and the output function.

Inputs are mixed into the pool using a cryptographic function which takes two values: the previous pool state and the input value, and outputs the new pool state. This function thoroughly mixes its inputs, such that a one-bit difference on any of them would result on average to half of the output bits changing, and other than by guessing it's not possible to get from the output back to the inputs.

Suppose you start with the pool containing only zeros. You add to it an input containing one bit of entropy. Around half of the pool bits will flip, and you can't easily reverse the function to get back the input, but since it's only one bit of entropy you can make two guesses and find one which matches the new pool state. Each new bit of entropy added doubles the number of guesses you need to make; but due to the pigeonhole principle, you can't have more entropy than the number of bits in the pool.

To read from the pool, you use the second part, the output function. It again is a cryptographic function which takes the whole pool as its input, mixes it together, and outputs a number. Other than by guessing, it's not possible to get from this output to the pool state.

Now let's go back to the one-bit example. The pool started with zero entropy (a fixed initial state), and got one bit of entropy added. It can now be in one of two possible states. It goes through the output function, which prevents one reading the output from getting back to the pool state. However, since there were only two possible states (one bit of entropy), you can try both and see which one would generate the output you got... and now the pool state is completely predictable, that is, it now has zero bits of entropy again! By reading from the pool, even with the protection of the output function, you reduced its entropy. Not only that, but there were only two possible outputs, so the output itself had only one bit of entropy, no matter how many bits you had asked for.

Now if you read a 32-bit number from a pool with 33 bits of entropy, you can make many guesses and find out a possible pool state. However, again due to the pigeonhole principle, there's on average two possible pool states which will generate the same 32-bit output. Two pool states = one bit. So by reading 32 bits, you reduced the remaining entropy to one bit.

This is important because if you can predict the pool state, you can predict what it will output next, which is obviously bad.

----

Now, why isn't the situation that bad in practice? First, the input entropy counter tends to underestimate the entropy being added (by design, since it's better to underestimate than to overestimate). Second, "by guessing" can take a long enough time to be impractical. Suppose you have a 1024-bit pool, and read 1023 bits from it. In theory, the pool state should be almost completely predictable: there should be only two possible states. In practice, you would have to do more than 2^1000 guesses (a ridiculously large number) before you could actually make it that predictable.

However, that only applies after the pool got unpredictable enough. That's why the new getrandom() system call (which you should use instead of reading /dev/random or /dev/urandom) will always block (or return failure in non-blocking mode) until the pool has gotten enough entropy at least once.

Actually still confused

Posted May 7, 2015 16:23 UTC (Thu) by itvirta (guest, #49997) [Link]

> (For example, `dd if=/dev/urandom of=/dev/sda` is a terrible misuse.
> Instead, use something like
> `openssl enc -aes128 -k "shred" < /dev/urandom > /dev/sda`.)

Doesn't that still read from urandom as much as the dd since urandom
is used as the input data?

Maybe you mean something like
openssl enc -aes128 -pass file:/dev/urandom < /dev/zero > /dev/sda

(Or even with the -nosalt flag added, because otherwise the
output always starts with the string "Salted__".)

The idea is good, however. I've used shred(1) for wiping disks, and
in random mode it uses urandom directly to get the randomness. It
makes it hideously slow. Perhaps someone(tm) should patch it to support
a faster generator or just make a smarter dedicated (simple-to-use) tool. :)

Actually still confused

Posted Apr 30, 2015 18:43 UTC (Thu) by HIGHGuY (subscriber, #62277) [Link]

> ... will block and wait to gather more entropy even after the machine has been
> running for months and thus has seen "infinite" amounts of randomness.

Now in this case you would assume that the entropy pool is a piece of memory with random content that grows as it gathers more entropy. This would be impractical as it would deplete memory for the "infinite" amount of randomness.

Actually, you should think of it as a fixed size memory buffer where feeding data into it is a transformation function ent_new = f(ent_old, new_data).

Random numbers from CPU execution time jitter

Posted Apr 30, 2015 16:08 UTC (Thu) by epa (subscriber, #39769) [Link] (3 responses)

Right, you may do some accounting of how much entropy you have or whether the data you have added is 'good'. In that case, you had better be sure of the quality of what you're putting in before labelling it 'good'. However, it remains true (despite, IMHO, what djb wrote) that if you add an additional source of entropy and don't let it affect the accounting of 'good enough', it can at worst do nothing.

Lowering entropy

Posted May 13, 2015 9:36 UTC (Wed) by DigitalBrains (subscriber, #60188) [Link] (2 responses)

> if you add an additional source of entropy and don't let it affect the accounting of 'good enough', it can at worst do nothing.

But that's the whole problem, is it not? You can't lower the actual entropy of the pool by mixing in /dev/zero. This is however beside the point! What is actually affected is the kernel's estimate of the quality of the randomness. Say you're down to an estimate of 8 bits of entropy. The kernel starts mixing in something completely deterministic like /dev/zero, but it thinks it is increasing entropy in the pool and is now under the impression it has a whopping 256 bits of entropy to give out. Too bad it still only has 8 bits of actual entropy, which gets used as a cryptographic key!

I'm using overly dramatic numbers, and the kernel purposely underestimates its availability of entropy. But entropy is a well-defined, if difficult to measure, concept. If I need 128 shannons of entropy for my crypto, I will not get there with 96 shannons and something deterministic mixed in.

Lowering entropy

Posted May 13, 2015 19:25 UTC (Wed) by dlang (guest, #313) [Link] (1 responses)

> If I need 128 shannons of entropy for my crypto, I will not get there with 96 shannons and something deterministic mixed in.

how do you decide that you need "128 shannons of entropy for my crypto"?

and even if you only have 96 shannons of entropy, unless the attacker knows/controls the deterministic data that was mixed in, it's still effectively random as far as the attacker is concerned. This only becomes a problem when the deterministic factor can be known by the attacker, and even different amounts of deterministic data will result in different output.

Lowering entropy

Posted May 14, 2015 15:07 UTC (Thu) by DigitalBrains (subscriber, #60188) [Link]

> how do you decide that you need "128 shannons of entropy for my crypto"?

I'm going by the principle that the only secret thing about my crypto application is its key. I'm assuming for the moment that my attacker is able to exhaustively search the 96 shannons, or reduce the search space far enough to do an exhaustive search on what remains.

Because only my key is unknown, I'm assuming the attacker can reproduce the deterministic portions.

When you argue that mixing in bad quality randomness is not a problem because there's still plenty left, this seems like the bald man's paradox. If you have a full set of hair (good quality randomness), and some individual hairs fall out (bad randomness), you still have a full set of hair. Fine, so mixing in some bad sources doesn't make you go bald. But at some point, if enough individual hairs fall out, you are going bald: producing bad quality randomness. So I think that if you suspect that this cpu execution time jitter produces only 0.2 shannons per bit, you should not use it as if it has a full shannon per bit. You can still use it, but you shouldn't pretend it has more information content than it has. And if you don't feel confident giving a reliable lower bound on the amount of entropy delivered by the method, you might even be better off not using it. Better be safe than sorry.

It's about delivering an application what it expects, about quantifying what you mean when you say you are using a certain amount of randomness. A crypto application requesting 128 shannons of entropy does that because its designers decided this is a good amount of entropy to use. There are always margins built in, so it might be safe to give it only 96 shannons. But you're eating away at the built in margins, and at some point you're going past them.

The main point I tried to make is that I agree with commenters saying that you can't lower the entropy by mixing in determinism, but that that is not the point. Other than that, I think this is a really complicated subject and I'm not an expert at all, just an interested hobbyist.

> [...] unless the attacker knows/controls the deterministic data that was mixed in, it's still effectively random as far as the attacker is concerned.

I think that when you say that something is not a good quality source of randomness, you're effectively saying that you suspect someone could predict a part of its output. So yes, then there are attackers that might know the deterministic data to some extent. They can use this knowledge to reduce their search space. It's still only a small reduction; they still have a long way to go.