Theoretical vs. practical cryptography in the kernel
Posted Aug 14, 2020 2:45 UTC (Fri) by flussence (guest, #85566)
Parent article: Theoretical vs. practical cryptography in the kernel
(I guess this would sound more appealing if we didn't already have getrandom(2)…)
Posted Aug 14, 2020 18:47 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (1 responses)
If a flag says "wait until sufficient entropy is available", how many people will use it anyway, and then wonder why their system fails to boot ...
At the end of the day, if programmers don't understand (or, more likely, don't bother to read) the documentation, ANY attempted fix is going to be a disaster.
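A minimal sketch of the dilemma, using the actual getrandom(2) interface: with flags == 0 the call blocks until the kernel's pool is initialized, while GRND_NONBLOCK makes it fail with EAGAIN instead, and the caller has to decide what a reasonable fallback even is.

#include <errno.h>
#include <stdio.h>
#include <sys/random.h>

int main(void)
{
    unsigned char key[32];

    /* Non-blocking request: may fail early in boot. */
    if (getrandom(key, sizeof(key), GRND_NONBLOCK) < 0) {
        if (errno == EAGAIN)
            fprintf(stderr, "pool not ready; what now?\n");
        return 1;
    }

    /* Default (flags == 0): waits until the pool is initialized. */
    if (getrandom(key, sizeof(key), 0) < 0)
        return 1;

    puts("got 32 random bytes");
    return 0;
}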
Cheers,
Wol
Posted Aug 27, 2020 9:50 UTC (Thu)
by Vipketsh (guest, #134480)
[Link]
The real issue is that there is no reasonable response a program can make when it is told "no". If you asked for randomness, you need that randomness to continue; there is no way around that. You can read or write as much documentation as you want, but the situation doesn't change.
Posted Aug 14, 2020 19:49 UTC (Fri)
by shemminger (subscriber, #5739)
[Link] (19 responses)
Both could easily use up all the available entropy for other high need users.
Posted Aug 14, 2020 21:04 UTC (Fri)
by atnot (subscriber, #124910)
[Link] (18 responses)
Posted Aug 14, 2020 22:38 UTC (Fri)
by NYKevin (subscriber, #129325)
[Link]
If the PRNG is not very secure, as in this case, then it is possible for an attacker to work out the PRNG's state. Once a plausible adversary has the ability to determine the value of an input to the PRNG (i.e. its seed), then that seed is no longer unpredictable, and it is not unreasonable to call it "used up" in that case, particularly if the same seed has also been reused for other purposes under the assumption that it was unpredictable. However, this characterization is problematic, because it encourages the (wrong) assumption that generating N bits of random data "uses up" f(N) bits of input entropy for some f() - which is certainly not the case with a proper CSPRNG. This was the whole problem with the random(4) interface randomly blocking for (in most cases) no good reason.
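A toy illustration of the "work out the PRNG's state" point (an arbitrary LCG, not the kernel's generator): if the output simply is the internal state, one observed output lets an attacker clone the generator and predict everything that follows, so any secret that seeded it is effectively burned.

#include <stdint.h>
#include <stdio.h>

static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;  /* Numerical Recipes LCG */
    return *state;                             /* output leaks the whole state */
}

int main(void)
{
    uint32_t secret = 0xdeadbeef;   /* stand-in for an unpredictable seed */
    uint32_t observed = lcg_next(&secret);

    /* The attacker clones the generator from a single observed output. */
    uint32_t attacker = observed;
    printf("victim:   %08x\n", lcg_next(&secret));
    printf("attacker: %08x\n", lcg_next(&attacker));  /* identical */
    return 0;
}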
Posted Aug 15, 2020 3:06 UTC (Sat)
by Paf (subscriber, #91811)
[Link] (16 responses)
The idea of reseeding (i.e. mixing new entropy into) a PRNG, by the way, is much like the idea of re-keying in cryptography. It resets the search space for anyone trying to crack it. But if only one bit was added...? So we *have* to try to measure entropy.
What if we made heavy use of a source that turns out to be predictable or mostly predictable? The effective state space of the prng seed is dramatically reduced.
You say if the state of a prng is not known it doesn't get easier to crack as its use continues. Given this, why does anyone ever rekey in cryptography? There are in fact many cryptographic attacks that require large quantities of ciphertext (obviously with the same key). So, you rekey a lot, whether or not you know such an attack exists. Same for the PRNG behind 'random'.
So, we *have* to try to measure/guess entropy of our inputs - attackers certainly will. And, sure, we can’t do it perfectly, but we can get a decent idea of how predictable things are in practice, and use that.
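A sketch of the "only one bit was added" worry above: if an attacker already knows the pre-reseed state and only a single unknown bit is mixed in, recovering it costs two guesses. The mixing function here is an arbitrary toy, not the kernel's.

#include <stdint.h>
#include <stdio.h>

static uint32_t mix(uint32_t state, uint32_t input)
{
    state ^= input;
    state *= 2654435761u;          /* Knuth multiplicative-hash constant */
    return state ^ (state >> 16);
}

int main(void)
{
    uint32_t known_state = 0x12345678;   /* attacker already knows this */
    uint32_t secret_bit = 1;             /* the lone bit of fresh entropy */
    uint32_t observed = mix(known_state, secret_bit);

    for (uint32_t guess = 0; guess <= 1; guess++)   /* search space: 2 states */
        if (mix(known_state, guess) == observed)
            printf("recovered the reseed bit: %u\n", guess);
    return 0;
}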
Posted Aug 15, 2020 13:55 UTC (Sat)
by kleptog (subscriber, #1183)
[Link] (5 responses)
I don't see how this follows. Since entropy is only additive there is no reason not to simply include any and every piece of data that could possibly contain some entropy. For attackers it might be interesting to try and estimate the entropy, but for the system (Linux in this case) the only useful tactic is to be adding entropy faster than an attacker can constrain the search space.
Rekeying in data streams is indeed for resetting the search space. For a CSPRNG this means adding entropy equal to the size of the internal state. However, unlike in a data stream you don't actually need to know or care when this happens. There's no actual reason to be trying to estimate the entropy since it doesn't actually make a difference.
So throw in the block numbers of data being read from disk. Do they contain entropy? Who knows? Let a theoretical attacker worry about it. We've at least wasted their time having to think about it.
It seems Linus is in a similar line of thinking - who cares if we can't *prove* that all these things are secure, throw a shitload of them into the mix and make the attacker's job *extremely* difficult.
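A sketch of that "throw it in and let the attacker worry" tactic: fold disk block numbers into a pool through a one-way mix. The splitmix64-style diffusion here is a stand-in, not the kernel's actual input path.

#include <stdint.h>
#include <stdio.h>

static uint64_t pool;

static void mix_in(uint64_t sample)
{
    /* splitmix64-style diffusion: cheap, and one-way enough for a sketch */
    pool += sample + 0x9e3779b97f4a7c15ull;
    pool = (pool ^ (pool >> 30)) * 0xbf58476d1ce4e5b9ull;
    pool = (pool ^ (pool >> 27)) * 0x94d049bb133111ebull;
    pool ^= pool >> 31;
}

int main(void)
{
    /* Block numbers of I/O as it happens: maybe predictable, maybe not. */
    uint64_t blocks[] = { 8192, 8200, 1048576, 20480 };
    for (unsigned i = 0; i < sizeof(blocks) / sizeof(blocks[0]); i++)
        mix_in(blocks[i]);
    printf("pool: %016llx\n", (unsigned long long)pool);
    return 0;
}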
Posted Aug 15, 2020 18:38 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (2 responses)
Cheers,
Posted Aug 16, 2020 5:37 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Posted Aug 26, 2020 20:04 UTC (Wed)
by amarao (guest, #87073)
[Link]
Posted Aug 16, 2020 18:55 UTC (Sun)
by nivedita76 (subscriber, #121790)
[Link] (1 responses)
The reason to reseed any PRNG, CS or not, is because you are concerned about an attacker using some means to exfiltrate the state of the PRNG (for a non-CSPRNG, this might be as easy as observing a moderate number of outputs, an LFSR just requires observing a number of outputs that is proportional to the state size, for example).
Given the scenario that reseeding is protecting against, adding small amounts of entropy at a time provides almost no additional security, and has two problems. Firstly, it effectively "wastes" that entropy: if you had waited to accumulate enough entropy and added it all at the same time, you could actually have gained security. Secondly, it actually reveals to the attacker the bits you are mixing in. If those bits are then also used later to seed a different PRNG, you've reduced the security of that PRNG as well, for no gain in the security of the first one.
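The LFSR claim above can be made concrete: in a Fibonacci LFSR the output bit at each step is just the low bit of the register, so n consecutive output bits literally spell out the n-bit internal state, after which every future bit is known. A self-contained toy using the well-known maximal 16-bit taps (16,14,13,11):

#include <stdint.h>
#include <stdio.h>

static int lfsr_step(uint16_t *s)
{
    int out = *s & 1;
    uint16_t fb = ((*s >> 0) ^ (*s >> 2) ^ (*s >> 3) ^ (*s >> 5)) & 1;
    *s = (*s >> 1) | (fb << 15);
    return out;
}

int main(void)
{
    uint16_t secret = 0xACE1, clone = 0;

    /* Attacker watches 16 output bits and reassembles the state. */
    for (int k = 0; k < 16; k++)
        clone |= (uint16_t)lfsr_step(&secret) << k;

    /* Fast-forward the clone to the victim's current position... */
    for (int k = 0; k < 16; k++)
        lfsr_step(&clone);

    /* ...after which it predicts the victim perfectly. */
    for (int k = 0; k < 8; k++)
        printf("victim %d, attacker %d\n",
               lfsr_step(&secret), lfsr_step(&clone));
    return 0;
}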
Posted Aug 24, 2020 14:16 UTC (Mon)
by ecree (guest, #95790)
[Link]
Except that this isn't actually true.
Viewing the attack from the information-theoretical perspective, there is _no difference_ between "mix in the entropy in dribs and drabs as you collect it" and "save it up until you can catastrophically re-seed": either you're adding entropy faster than it leaks (in which case you're fine) or you're leaking state faster than you add entropy (in which case you're screwed).
The difference only appears when you consider the amount of work required to brute-force-invert the mixing function _if_ state is leaking fast enough to allow the information-theoretic attack; partial-state-extension on frequent small re-seedings adds up to less total work than one big re-seed because the 2^k factor dominates.
Catastrophic re-seeding is thus important if your security depends on the cryptographic strength of your PRNG, but _not_ if your security depends on adding noise to the state faster than the attacker can obtain information to constrain that state (e.g. by observing outputs) — in which case catastrophic re-seeding may actually _harm_ you by letting the attacker reach a greater partial state in the (longer) interval between re-seed events.
> for a non-CSPRNG, this might be as easy as observing a moderate number of outputs, an LFSR just requires observing a number of outputs that is proportional to the state size, for example
From the information-theoretical perspective, the same is true of any PRNG, CS or no, because you can always enumerate all possible states and see which one matches. Observing (on average) n consecutive outputs and doing O(n*2^n) work will crack an n-bit CSPRNG just fine. The difference with the LFSR is that there's a fairly easy 'cryptanalysis' that lets you do it with much less work than brute force.
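A toy version of that brute-force argument, shrunk to a 16-bit state so it actually runs: enumerate all 2^16 candidate states and keep the ones whose output stream matches what was observed. The generator (a known 16-bit xorshift) is arbitrary; the point is only that the search costs O(outputs * 2^n) no matter how "cryptographic" the mixing is.

#include <stdint.h>
#include <stdio.h>

static uint16_t step(uint16_t s)      /* any deterministic next-state fn */
{
    s ^= s << 7;
    s ^= s >> 9;
    s ^= s << 8;
    return s;
}

static int output_bit(uint16_t s) { return s & 1; }

int main(void)
{
    uint16_t secret = 0xBEEF;
    int observed[24];

    for (int i = 0; i < 24; i++) {    /* attacker sees 24 output bits */
        secret = step(secret);
        observed[i] = output_bit(secret);
    }

    for (uint32_t cand = 0; cand < 0x10000; cand++) {   /* all 2^16 states */
        uint16_t s = (uint16_t)cand;
        int ok = 1;
        for (int i = 0; i < 24 && ok; i++) {
            s = step(s);
            ok = (output_bit(s) == observed[i]);
        }
        if (ok)
            printf("candidate state: %04x\n", cand);    /* finds 0xBEEF */
    }
    return 0;
}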
One of the things that drove me crazy reading the original thread on netdev was the way some people (especially Spelvin) equivocated between the information-theoretical argument and the work argument. Tbh it looked like taking the Schneier cat-reseed paper and cargo-culting it into some kind of panacea-cum-universal-necessity.
Posted Aug 15, 2020 16:09 UTC (Sat)
by imMute (guest, #96323)
[Link] (9 responses)
One answer is damage mitigation. If the encryption key gets leaked, how much data is immediately available?
>What if we made heavy use of a source that turns out to be predictable or mostly predictable? The effective state space of the prng seed is dramatically reduced.
That's why you use more than one source of entropy. Suppose you don't trust RDRAND. I say mix it into the entropy pool anyway and just don't "credit" those bits. So far I've not seen anyone show that an untrusted entropy source is capable of *reducing* the entropy of the pool.
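A sketch of "mix it in but don't credit it": the untrusted sample is folded into the pool, but the entropy estimate is only bumped for sources we trust. All names and the diffusion step here are illustrative, not the kernel's.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct pool {
    uint64_t state;
    unsigned credited_bits;   /* what we *claim* to have, for accounting */
};

static void mix(struct pool *p, uint64_t sample, unsigned est_bits, bool trusted)
{
    p->state ^= sample;
    p->state *= 0x2545f4914f6cdd1dull;   /* cheap one-way-ish diffusion */
    p->state ^= p->state >> 32;
    if (trusted)
        p->credited_bits += est_bits;    /* untrusted input: mixed, not counted */
}

int main(void)
{
    struct pool p = { 0, 0 };
    mix(&p, 0x1234567890abcdefull, 64, false);  /* e.g. RDRAND: no credit */
    mix(&p, 0xfeedfacecafef00dull, 32, true);   /* e.g. interrupt timing */
    printf("state %016llx, credited %u bits\n",
           (unsigned long long)p.state, p.credited_bits);
    return 0;
}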
Posted Aug 15, 2020 23:18 UTC (Sat)
by excors (subscriber, #95769)
[Link]
There's the idea shown in "Prototyping an RDRAND Backdoor in Bochs" (https://archive.org/stream/pocorgtfo03#page/n17/mode/1up). If code does something like "entropy_pool ^= RDRAND()" to mix in the new randomness, a malicious implementation of RDRAND could return "entropy_pool ^ x" where x is a random-looking value that is known to the attacker, cancelling out all the previous entropy. To someone who doesn't know x, the output of RDRAND would still look perfectly random, and the CPU is otherwise following the architecture specification perfectly (which makes it harder for users to observe any malicious behaviour).
That's probably not very useful in an emulator, because a malicious emulator could leak the content of entropy_pool in plenty of other ways, and it's probably difficult in a physical CPU without affecting the timing or performance counters in a way that someone would eventually notice. But still, it means in principle RDRAND can reduce entropy without doing anything other than returning a carefully constructed value.
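The PoC||GTFO scenario in miniature: a malicious "RDRAND" that can see the pool returns pool ^ x, so the usual pool ^= rdrand() step leaves the pool equal to x, a value the attacker chose. Everything below is a toy model of that argument, not real hardware behaviour.

#include <stdint.h>
#include <stdio.h>

static uint64_t entropy_pool = 0x0123456789abcdefull;  /* honest entropy */
static const uint64_t attacker_x = 0xdeadbeefdeadbeefull;

static uint64_t evil_rdrand(void)
{
    /* looks perfectly random to anyone who doesn't know attacker_x */
    return entropy_pool ^ attacker_x;
}

int main(void)
{
    entropy_pool ^= evil_rdrand();   /* the "mix in more randomness" step */
    printf("pool is now %016llx (the attacker's x)\n",
           (unsigned long long)entropy_pool);
    return 0;
}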
Posted Aug 16, 2020 16:23 UTC (Sun)
by Paf (subscriber, #91811)
[Link] (7 responses)
If you add a bunch of stuff that turns out to not be random, then you could make the state predictable if you didn’t get enough data with real randomness in. Linus is having an argument of degree here - this stuff looks pretty random, let’s add it in. He’s very much not suggesting you add ... whatever ... to the pool.
Cryptography is hard; the idea that we can add whatever to the pool forever is a dangerous one (the possibility of deliberately tainted inputs noted above, for example). I don’t think Linus is making that argument. He’s arguing about where the fence goes, not about removing it.
You can in fact "add whatever to the pool forever"
Posted Aug 16, 2020 16:49 UTC (Sun)
by kleptog (subscriber, #1183)
[Link] (1 responses)
Umm, I don't think that's how it works at all. Adding a gigabyte of zeros to the pool doesn't reduce the entropy in it at all. It doesn't add anything either, but it doesn't hurt.
Tainted inputs are an issue, but that just means you need to add as many sources as you can find, and you have to be careful how you add things. If any untainted source is included you have basically made it infeasible for the attacker.
Posted Aug 20, 2020 4:43 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
> Umm, I don't think that's how it works at all. Adding a gigabyte of zeros to the pool doesn't reduce the entropy in it at all. It doesn't add anything either, but it doesn't hurt.
The original quote is in fact true, but does not mean what you think it means. If you didn't get enough data with real randomness, then you could have a predictable state. This happens regardless of whether or not you also added a bunch of stuff that wasn't random.
The problem, therefore, is whether they should do something with the entropy accounting, not whether they should allow reusing the fast pool bits in this fashion at all.
Posted Aug 17, 2020 7:18 UTC (Mon)
by tialaramex (subscriber, #21167)
[Link] (4 responses)
HMAC(oldState, "Predictable stuff") -> newState
Doesn't tell us anything about oldState or newState, even though we're quite certain what "Predictable stuff" is. Worse, if we're a traitor inside the machine we can't usefully influence newState despite knowing exactly what oldState was and having total control over "Predictable stuff".
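A sketch of that HMAC(oldState, "Predictable stuff") -> newState step, written against OpenSSL (build with -lcrypto); the kernel uses its own primitives, this only illustrates the construction.

#include <openssl/evp.h>
#include <openssl/hmac.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    unsigned char state[32] = { 0 };         /* oldState: assumed secret */
    const char *input = "Predictable stuff"; /* attacker knows this */
    unsigned char next[32];
    unsigned int len = 0;

    /* newState = HMAC-SHA256(key = oldState, message = input) */
    HMAC(EVP_sha256(), state, sizeof(state),
         (const unsigned char *)input, strlen(input), next, &len);
    memcpy(state, next, sizeof(state));

    for (unsigned int i = 0; i < len; i++)
        printf("%02x", state[i]);
    putchar('\n');
    return 0;
}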
Posted Sep 2, 2020 18:29 UTC (Wed)
by tanriol (guest, #131322)
[Link] (3 responses)
In the HMAC example, before the "mixing" you had an N-bit pool in one of the 2**N possible states, and when mixing every input state becomes an output state. However, as multiple input states can become a single output state, the number of possible states decreases after this operation. This looks like the birthday paradox setting, so I'd guess that a series of known random mappings applied to the pool can reduce the number of possible states to sqrt(2**N) at most.
Posted Sep 8, 2020 11:31 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link] (2 responses)
I don't see why a method that exploits this for real (on an actual HMAC with an actual N-bit pool) wouldn't also work perfectly well to collide arbitrary hashes, which is not supposed to be practical. But maybe you can explain why it doesn't do that?
Posted Sep 8, 2020 12:11 UTC (Tue)
by tanriol (guest, #131322)
[Link] (1 responses)
In the original statement you said that spewing a bunch of something does not benefit the adversary *at all*. The way I understood that was that the N-bit pool still has N bits of entropy. That's obviously not the case because the number of possible states goes down, and quick experiments with a really small toy HMAC confirm that (repeatedly running it causes the entropy to reduce from the original 16 bits to 8 or so, just as expected from the birthday paradox style configuration). To avoid that you need your primitive to be an (input-driven, obviously) permutation in the state space so that the number of states stays the same. There are constructions like this - IIUC, SHA3 uses one, unlike MD5/SHA1/SHA2 - but you need to pay attention to this.
On the other hand, for collision attacks, N/2 bits of security is already the baseline due to the birthday paradox, so by itself this effect does not provide any improvement over that - a hash resistant to collision attacks would also be resistant to this effect.
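A toy reproduction of that experiment: start with all 2^16 possible states, repeatedly apply a random-looking (non-bijective) 16-bit mixing function, and count how many distinct states survive. The count drifts down toward the ~2^8 scale the birthday-paradox argument predicts. The mixing function is an arbitrary hash, not a real HMAC.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint16_t mix(uint16_t s)
{
    uint32_t x = s * 2654435761u;   /* not a permutation of the 16-bit space */
    x ^= x >> 13;
    x *= 2246822519u;
    return (uint16_t)(x ^ (x >> 16));
}

int main(void)
{
    static uint8_t live[65536], next[65536];
    memset(live, 1, sizeof(live));          /* all 2^16 states possible */

    for (int round = 1; round <= 1024; round++) {
        memset(next, 0, sizeof(next));
        for (uint32_t s = 0; s < 65536; s++)
            if (live[s])
                next[mix((uint16_t)s)] = 1;
        memcpy(live, next, sizeof(live));

        uint32_t count = 0;
        for (uint32_t s = 0; s < 65536; s++)
            count += live[s];
        if ((round & (round - 1)) == 0)     /* report at powers of two */
            printf("round %4d: %5u reachable states\n", round, count);
    }
    return 0;
}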
Posted Sep 9, 2020 10:00 UTC (Wed)
by tialaramex (subscriber, #21167)
[Link]