Random numbers and virtual-machine forks
The kernel's RNG maintains an "entropy pool" from which random numbers are derived. As randomness from the environment is harvested, it is mixed into the pool, keeping the level of entropy up. Every running system has its own pool, naturally, with its own internal state. If two systems were to somehow end up with their entropy pools containing the same data, they would produce the same sequence of random numbers, for a while at least. That is something that should never happen.
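To make that structure concrete, here is a toy sketch of the pattern (emphatically not the kernel's actual algorithm, which uses real cryptographic primitives): harvested samples are mixed into a fixed-size pool, and outputs are derived from the pool's state. Two systems whose pools held identical contents would emit identical outputs until their mix-ins diverge.

```c
/*
 * Toy illustration of an "entropy pool": harvested event data is mixed
 * into a fixed-size pool, and outputs are derived from the pool state.
 * This is NOT the kernel's algorithm; it only shows the mix-in/extract
 * structure described above.
 */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

#define POOL_WORDS 8

static uint64_t pool[POOL_WORDS];

/* Mix one 64-bit sample into the pool with a simple diffusion step. */
static void pool_mix(uint64_t sample)
{
    for (int i = 0; i < POOL_WORDS; i++) {
        pool[i] ^= sample + 0x9e3779b97f4a7c15ULL * (i + 1);
        pool[i] = (pool[i] << 13) | (pool[i] >> 51);   /* rotate */
        sample ^= pool[i];
    }
}

/* Derive one output word; a real design would use a cryptographic hash. */
static uint64_t pool_extract(void)
{
    uint64_t out = 0;

    for (int i = 0; i < POOL_WORDS; i++)
        out ^= pool[i];
    pool_mix(out);                      /* perturb the pool between outputs */
    return out;
}

int main(void)
{
    struct timespec ts;

    /* "Harvest" some environmental data: here, just clock readings. */
    for (int i = 0; i < 4; i++) {
        clock_gettime(CLOCK_MONOTONIC, &ts);
        pool_mix((uint64_t)ts.tv_sec ^ (uint64_t)ts.tv_nsec);
    }
    printf("%016llx\n", (unsigned long long)pool_extract());
    printf("%016llx\n", (unsigned long long)pool_extract());
    return 0;
}
```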
But, as Donenfeld pointed out in a patch set first released in February, there is a way that two systems can end up with the same entropy-pool content. If a running virtual machine is somehow duplicated, the entropy pool will be duplicated with it. This can happen if a machine is checkpointed and restored, or if it forks for any reason. Once the restored or forked machine starts running, it will reproduce the sequence of random data created by the previous instance until the addition of new entropy perturbs the pool.
Microsoft, it seems, has already addressed this concern in Windows; the solution takes the form of a "virtual-machine ID" made available via the ACPI firmware. When a virtual machine forks or is restarted, the ID is changed. The kernel can watch this value and, on noticing that it has changed, conclude that some sort of virtual-machine fork has occurred and that action is necessary to keep the random-data stream unique. Some virtualization systems, including QEMU, have implemented this functionality, so it makes sense for Linux to make use of it as well.
The patch set thus adds a new "vmgenid" driver that makes the virtual-machine ID available to the kernel. When this driver is notified (by the ACPI firmware) of a change, it checks the ID and, if that has changed, calls a new function (crng_vm_fork_inject()) to inform the RNG that it needs to muddy up the entropy pool. This is done by mixing in that same virtual-machine ID. It is not claimed to be the ultimate in security, but it does address the immediate problem; Donenfeld intends to merge this work for 5.18.
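The shape of that logic can be sketched in a few lines of plain C (this is an illustration, not the driver itself; read_vmgenid() is a placeholder for the ACPI access, and the crng_vm_fork_inject() prototype shown here is assumed rather than taken from the patches):

```c
/*
 * Sketch of the vmgenid logic: compare a cached copy of the 128-bit
 * generation ID with a freshly read one and, on a change, hand the new
 * ID to the RNG. read_vmgenid() stands in for the real ACPI access;
 * the crng_vm_fork_inject() prototype is assumed, not the kernel's.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint8_t cached_id[16];

/* Placeholder: a real driver reads this from the buffer that the ACPI
 * VMGENID device points at. */
static void read_vmgenid(uint8_t id[16])
{
    memset(id, 0x42, 16);
}

/* Stand-in for the RNG hook named in the article (prototype assumed). */
static void crng_vm_fork_inject(const void *unique_id, size_t len)
{
    (void)unique_id;
    printf("generation ID changed (%zu bytes): perturbing entropy pool\n", len);
}

/* Called when the firmware notifies us that something may have changed. */
static void vmgenid_notify(void)
{
    uint8_t fresh[16];

    read_vmgenid(fresh);
    if (memcmp(fresh, cached_id, sizeof(fresh)) != 0) {
        crng_vm_fork_inject(fresh, sizeof(fresh));
        memcpy(cached_id, fresh, sizeof(fresh));
    }
}

int main(void)
{
    read_vmgenid(cached_id);   /* snapshot the ID at "boot" */
    vmgenid_notify();          /* no change: nothing happens */
    cached_id[0] ^= 1;         /* simulate a fork changing the ID */
    vmgenid_notify();          /* change detected: RNG gets reseeded */
    return 0;
}
```

In the real driver, the comparison happens in an ACPI notification handler, which is what creates the race window Donenfeld complains about below.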
This project does not stop there, though; in a later email, Donenfeld described the changes he would like to make next. He started by complaining about the design of Microsoft's solution, which has a race condition designed into it. The kernel cannot respond to a virtual-machine fork until it notices that the generation ID has changed; the new virtual machine could run for some time before that happens, and it will generate duplicate random numbers during that time. It would have been better, he said, to provide a simple "generation counter" that could be quickly polled by the CPU every time random data is requested; that would allow a change to be caught immediately. "But that's not what we have, because Microsoft didn't collaborate with anybody on this, and now it's implemented in several hypervisors".
Having gotten that off his chest, he proceeded to the real task at hand: propagating the news about a virtual-machine fork to other interested kernel subsystems. He originally envisioned creating a new power-management event to serve as a sort of notifier, but that was seen as an awkward fit; a virtual-machine fork isn't really related to power management in the end. So Donenfeld posted a new solution creating a separate notifier (using the kernel's existing notifier mechanism) to inform subsystems. The first user is, unsurprisingly, the WireGuard VPN, which needs to know about such events:
When a virtual machine forks, it's important that WireGuard clear existing sessions so that different plaintext is not transmitted using the same key+nonce, which can result in catastrophic cryptographic failure.
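A consumer of that notification would use the kernel's ordinary notifier_block pattern, roughly as sketched below; the registration helpers are named as in the posted patches and should be treated as illustrative rather than a settled API.

```c
/*
 * Sketch of a kernel module subscribing to VM-fork notifications via
 * the standard notifier mechanism. The register/unregister helper
 * names follow the posted patches; treat them as illustrative.
 */
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/random.h>

static int vmfork_event(struct notifier_block *nb, unsigned long action,
                        void *data)
{
    /* A WireGuard-like user would discard its stale sessions here. */
    pr_info("VM fork detected: discarding stale cryptographic state\n");
    return NOTIFY_DONE;
}

static struct notifier_block vmfork_nb = {
    .notifier_call = vmfork_event,
};

static int __init vmfork_demo_init(void)
{
    return register_random_vmfork_notifier(&vmfork_nb);
}

static void __exit vmfork_demo_exit(void)
{
    unregister_random_vmfork_notifier(&vmfork_nb);
}

module_init(vmfork_demo_init);
module_exit(vmfork_demo_exit);
MODULE_LICENSE("GPL");
```

WireGuard's actual callback clears established sessions rather than just logging, per the quoted rationale above.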
User-space code may benefit from knowing about virtual-machine forks as well; for example, the Samba server may want to reset sessions in that situation. For user space, Donenfeld proposes adding a new virtual file that programs can poll for VM-fork notifications. This file is currently located in /proc/sys, even though it is not a true sysctl knob in that it cannot be written to as a way of tuning system behavior.
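A user-space consumer might look roughly like the following; the file path is a placeholder (the article only says the file lives under /proc/sys), and the poll semantics follow the usual convention for pollable sysctl files, where POLLPRI/POLLERR signals a change.

```c
/*
 * Sketch of a user-space program waiting for VM-fork notifications on
 * the proposed pollable file. The path below is a placeholder, not the
 * name used by the patches.
 */
#include <fcntl.h>
#include <poll.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[64];
    int fd = open("/proc/sys/kernel/random/vmfork", O_RDONLY); /* placeholder path */

    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (read(fd, buf, sizeof(buf)) < 0)     /* initial read arms the poll */
        perror("read");

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLPRI };

        if (poll(&pfd, 1, -1) > 0 && (pfd.revents & (POLLPRI | POLLERR))) {
            /* The VM has been forked: reset sessions, reseed PRNGs, etc. */
            fprintf(stderr, "VM fork detected\n");
            lseek(fd, 0, SEEK_SET);
            if (read(fd, buf, sizeof(buf)) < 0)
                perror("read");
        }
    }
}
```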
Response to this work has been positive, overall; kernel developers tend to have little appetite for catastrophic cryptographic failure. That said, Greg Kroah-Hartman did observe: "It seems crazy that the 'we just were spawned as a new vm' notifier is based in the random driver, but sure, put it here for now!" So this work, too, seems destined for merging for the 5.18 kernel release. That should help to close a vulnerability that many of us may not have ever been aware existed.
| Index entries for this article | |
|---|---|
| Kernel | Random numbers |
| Kernel | Security/Random number generation |
| Kernel | Virtualization |
Posted Mar 11, 2022 18:43 UTC (Fri) by bof (subscriber, #110741)
I mean, duplicate MACs? Duplicate IPs? Huh? What am I missing?
Posted Mar 11, 2022 21:14 UTC (Fri) by josh (subscriber, #17465)
Fast per-client VMs: boot your VM to the point where it's listening for connections, then fork it.
You don't need to worry about IPs or MAC addresses if your VMM is handling routing and giving your VM an internal-use-only IP.
Posted Mar 12, 2022 0:58 UTC (Sat) by iabervon (subscriber, #722)
There are various applications for checkpoint+restore+restore+..., where the copies are mostly not contemporaneous. It's really nice for producing test environments that aren't full of the remains of previous testing, for example, but you generally don't need to have your pristine copy running all the time and being forked into new copies while live.
Posted Mar 12, 2022 0:00 UTC (Sat) by ejr (subscriber, #51652)
There are "true" random numbers. Defining that (to me) is as difficult as defining true.
There are "cryptographic" random numbers. Far, far outside my bailiwick.
There are "sufficient" but reproducible random numbers. A climate simulation may depend on random numbers for everything from error bounds to intermediate perturbations. If you cannot reproduce the results across a variety of hardware, using those results in international negotiations is, um, tricky. Similar for software debugging where it is my problem.
And certainly there are more.
Randomness is our construct. In our views of nature, sometimes random becomes a pattern. The issue in my uneducated opinion is defining what "random" means in each context. I work with randomized algorithms, so I understand *a few* of those contexts. It ain't easy.
Posted Mar 12, 2022 1:14 UTC (Sat) by NYKevin (subscriber, #129325)
Here's my attempt: A number X is "truly" random if:
1. We know that X follows some specific distribution. For example, we know that "the number of radioactive decay events in an hour" (or a minute, or a second) follows a Poisson distribution with the lambda parameter determined by the quantity of the radioactive element and its half-life (as well as the length of the period sampled). Similarly, "the number of heads when you flip ten fair coins" follows a binomial distribution with p=0.5 and n=10.
2. We know that CDF(X) is not correlated with any quantity that can be physically measured, other than X itself. For radioactive decay events, this is believed to be true under our current theory of physics. For coin flips, it may be correlated with the initial orientation of the coin as well as other physical factors, but those are certainly very *hard* to measure.
3. Nobody knows the value of X (yet).
> There are "cryptographic" random numbers. Far, far outside my bailiwick.
To a rough approximation, "cryptographically secure" means "close enough to truly random that it's unrealistic that an attacker will be able to predict the value in practice." Depending on the application, there may be an allowance for "the attacker can make limited predictions, but they are not accurate enough to be useful for real, plausible cryptanalytic attacks against a particular algorithm." For example, AES-256 provides 256-bit security, meaning that a naive brute-force attack would require trying 2^256 possible keys. If the key were generated in such a way that the attacker can imperfectly predict 2 bits of it, then it still has at least 254-bit security, and 2^254 is still stupidly huge. In practice it's more complicated than that, because you don't actually do a brute-force attack; instead you do various more complicated attacks, which I don't fully understand, and those don't require the full 2^256 attempts in the first place. But that's OK, because even a much "smaller" number like 2^100 is still too big for a practical attack.
Posted Mar 12, 2022 7:02 UTC (Sat) by wtarreau (subscriber, #51152)
That's why, in the end, it's important to combine as many dependencies as possible. Even if some of them are observable by some parties, as long as you mix in as many different ones as possible, you progressively remove the ability of any observer to observe the whole model.
In the situation here, I think that jitter entropy and general system activity will quickly make the pools diverge. Probably, after a few seconds to minutes, the two systems will be completely independent. But for the short initial period, one could be vulnerable to analysis performed on the other one, especially if the target hardware is known and allows the observer to mimic it very closely and limit the divergence.
Thus overall Jason's work on this is definitely useful.
Posted Mar 14, 2022 14:31 UTC (Mon) by plugwash (subscriber, #29694)
One immediate question that springs to mind is "what about userland"?
As you say the kernel RNGs will likely diverge quickly, so the window of opportunity for failure is fairly narrow. OTOH RNGs in userland that are rarely or never re-seeded and are used in a predictable way may remain in sync for much longer.
Crypto libraries often already have code in place to defend against fork calls, but presumably cloning the whole VM would go unnoticed by such countermeasures.
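The process-level countermeasure plugwash mentions usually looks something like the sketch below: register an atfork handler that reseeds the library's generator in the child. A cloned or restored VM never calls fork(), so a handler like this never runs, which is exactly the gap being discussed. The generator here is a toy stand-in; pthread_atfork() and getrandom() are the real APIs.

```c
/*
 * Sketch of a library defending against fork(): the child reseeds its
 * user-space PRNG via a pthread_atfork() handler. A VM clone bypasses
 * fork() entirely, so this protection never triggers for it.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/random.h>
#include <unistd.h>

static uint64_t prng_state;

static void reseed(void)
{
    /* getrandom() pulls fresh seed material from the kernel. */
    if (getrandom(&prng_state, sizeof(prng_state), 0) != sizeof(prng_state))
        abort();
}

static uint64_t prng_next(void)        /* xorshift64: toy generator only */
{
    prng_state ^= prng_state << 13;
    prng_state ^= prng_state >> 7;
    prng_state ^= prng_state << 17;
    return prng_state;
}

int main(void)
{
    reseed();
    pthread_atfork(NULL, NULL, reseed);   /* child reseeds after fork() */

    if (fork() == 0)
        printf("child:  %016llx\n", (unsigned long long)prng_next());
    else
        printf("parent: %016llx\n", (unsigned long long)prng_next());
    return 0;
}
```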
Posted Apr 2, 2022 7:05 UTC (Sat) by sammythesnake (guest, #17693)
What you're describing here is what's called a "hidden variable model" in which there are unknown substrates below the observed behaviour that could (in principle) become known, modelled, and predicted (you'd still have the challenge of making suitable observations in the face of engineering limitations and the Heisenberg uncertainty principle of course...)
I understand that there are proofs for various quantum phenomena that preclude the possibility of such hidden variables. Off the top of my head, I'm not 100% certain that nuclear decay is one of the phenomena covered by these proofs, but I think it is.
I once tried to follow a description of such a proof, but ended up glassy eyed and needing a nap...
Posted Mar 12, 2022 20:47 UTC (Sat) by ballombe (subscriber, #9523)
They are:
1/ ideal RNG: each bit is an independent random variable. We have no way to build that.
2/ hardware RNG: each bit is given by a physical process we expect to be random (slow)
3/ Pseudo-RNG (PRNG): a deterministic state machine that produces output that looks statistically like an ideal RNG for batches of length at most some bound (e.g. 2^32) (usually fast)
4/ Cryptographically-secure PRNG: a PRNG for which it is believed to be computationally infeasible to recover the state from a batch of outputs. (Note that if you have unlimited computational power, you can break it by trying all possible states.)
5/ Backdoored PRNG: as 4/, but there is a key that allows recovering the state from a medium-sized batch of output.
6/ Mixed RNG: a mix of 2/ and 3/ as used by the kernel. A physical process is used to change the state of the PRNG from time to time. A compromise between speed, security, and hardware integrity.
Posted Mar 13, 2022 5:04 UTC (Sun) by NYKevin (subscriber, #129325)
Posted Mar 13, 2022 9:06 UTC (Sun) by atnot (subscriber, #124910)
* I am not a cryptographer, this is not cryptography advice
Posted Mar 13, 2022 22:39 UTC (Sun) by NYKevin (subscriber, #129325)
1. We want each individual output to be as unpredictable as possible. That means it should contain the maximum possible amount of entropy.
2. We want each individual output to provide as little information about the PRNG's internal state as possible. That means it should contain the *minimum* possible amount of entropy.
Obviously, you can't satisfy both goals, so you have to compromise on one or the other, or else you have to use something other than the PRNG's internal state (e.g. RDRAND, jitter, etc.) to increase the entropy of each individual output from a non-PRNG-derived source. Now, I'm assuming that the kernel does use such a non-PRNG source, but that's mathematically equivalent to expanding the entropy pool to include other factors outside the control of the PRNG itself - so you still run into the same "we're running out of entropy" problem as before. The net amount of entropy entering the entire system as a whole must equal or exceed the net amount of entropy leaving the system via PRNG outputs.
Posted Mar 14, 2022 11:20 UTC (Mon) by atnot (subscriber, #124910)
Cryptographic hash functions are designed such that if you have a hash function of size n you need on the order of 2^n inputs to determine any bit of the plaintext with odds >50%. For BLAKE-256 the best known attack is 2^232, which is the number I was citing. So while it is mathematically true that entropy is leaving the pool through the hash function, cryptography means that entropy is infeasible to recover. Every bit is so heavily entangled with every other bit that you get no information until you've broken effectively the entire hash.
If you like, although it's not completely accurate, you can think of it as the bits of information gained from the output of the cryptographic hash function being smeared very thin across its layers and layers of hidden inputs and outputs, never thick enough to get over 50% at the output until you collect a truly extraordinary number of them.
Another way to think about it is via encryption. If you encrypt 16 bytes with a key and the output is 16 bytes, that output must appear random to an outside observer, otherwise the algorithm would leak information. If you used that to encrypt the pool with itself and handed that out, would you be removing 16 bytes from the pool? The bits are correlated, sure, but what does that mean for you if you cannot determine the correlation? Tracking entropy "movement" isn't really a useful concept at that point.
Posted Mar 14, 2022 11:23 UTC (Mon) by cesarb (subscriber, #6266)
The trick is that, even though you might have in theory exhausted the entropy of the system, the only known way to take advantage of it is a brute force search. With a large enough state (around a couple hundred bits), brute force search becomes infeasible.
Posted Mar 14, 2022 13:25 UTC (Mon) by kleptog (subscriber, #1183)
Ideal RNGs are unpredictable, no matter what information you have. That's why cryptographers love them: the proofs become very simple.
CS-PRNGs are intended to be totally predictable if you know the internal state, but not if you just know some of the output.
So in the case of forked VMs, suppose you started one VM up and got a bunch of random numbers. Then you start a second fork. If you are given the two VMIDs, plus the first stream of random numbers, can you predict the random numbers for the new VM? In principle: sure. But you need to think about the computing power required. And you need to decide on a computing model: does it matter if you have a quantum computer?
It's like the discussions about mixing in timing information. Maybe it's predictable, but it matters only if the attacker knows what's being mixed in. So you're better off just mixing everything in; as soon as there's one source that the attacker doesn't know exactly, you've already made it computationally impossible to reverse.
Posted Mar 14, 2022 20:19 UTC (Mon) by wittenberg (subscriber, #4473)
We do. You can buy off-the-shelf random number generators which depend on quantum processes for their randomness. The first google hit for one is: https://www.quintessencelabs.com/products/qstream-quantum... which claims to offer 1 Gb/sec (after whitening, 8 Gb/sec before whitening). If quantum processes are not random, we have a huge hole in our understanding of the physics which underlies chips. Possible, but unlikely.
I don't know about this particular device, but last I checked, similar devices cost about $5000 (US).
--David
Posted Mar 12, 2022 4:25 UTC (Sat) by Otus (subscriber, #67685)
Sure, there are CPUs that don't support it, but how many of them are being used for VM cloning? Seems to me that just mixing in the cycle counter in that case would solve things.
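For what it's worth, the cycle-counter fallback being suggested is cheap to express; a minimal x86-only sketch (using the real __rdtsc() compiler intrinsic, with an illustrative mixing step, not any kernel code) might look like this:

```c
/*
 * Minimal x86-only sketch of the suggestion above: fold the cycle
 * counter into whatever seed material is available. __rdtsc() is a
 * compiler intrinsic; the mix here is a simple illustrative hash step.
 */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>

static uint64_t mix64(uint64_t x)       /* splitmix64-style finalizer */
{
    x ^= x >> 30; x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27; x *= 0x94d049bb133111ebULL;
    return x ^ (x >> 31);
}

int main(void)
{
    uint64_t seed = 0;                  /* stands in for existing seed state */

    seed = mix64(seed ^ __rdtsc());     /* cheap per-request perturbation */
    printf("%016llx\n", (unsigned long long)seed);
    return 0;
}
```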
Posted Mar 13, 2022 22:32 UTC (Sun) by NYKevin (subscriber, #129325)
Posted Mar 14, 2022 6:06 UTC (Mon) by Otus (subscriber, #67685)
Posted Mar 14, 2022 8:43 UTC (Mon) by Wol (subscriber, #4433)
All it takes is an application (which doesn't know) to access RDRAND directly, and you're in trouble.
The article gives the example of WireGuard which, when it hits one of these, locks the system HARD. Actually, that could possibly be behind why my system locks up every now and then ...
Cheers,
Wol
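For reference, "accessing RDRAND directly" from an application typically means using the compiler intrinsic, retried a few times because the instruction can transiently fail; a minimal sketch (x86-64 only, built with -mrdrnd):

```c
/*
 * What direct RDRAND access from an application looks like: the
 * _rdrand64_step() intrinsic, retried a few times since the
 * instruction can transiently report failure.
 */
#include <immintrin.h>
#include <stdio.h>

int main(void)
{
    unsigned long long value;

    for (int tries = 0; tries < 10; tries++) {
        if (_rdrand64_step(&value)) {
            printf("%016llx\n", value);
            return 0;
        }
    }
    fprintf(stderr, "RDRAND kept failing\n");
    return 1;
}
```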
Posted Mar 14, 2022 16:11 UTC (Mon) by Otus (subscriber, #67685)
I do wonder if the latter should be something more generic (not tied to vmid), since I can easily imagine other cases where you might want to tell everyone to reseed. For example, if you are using something like systemd-random-seed.service to feed entropy you trust more than the kernel's entropy collection during boot.
But anyway, that's just academic; clearly something like this is required. Thanks.
Posted Mar 12, 2022 5:58 UTC (Sat) by mtthu (subscriber, #123091)
Posted Mar 12, 2022 16:45 UTC (Sat) by rwmj (subscriber, #5474)
Posted Mar 17, 2022 7:42 UTC (Thu) by gasche (subscriber, #74946)
RNG algorithm designers have devised "splittable" generators, which make it possible to separate a single RNG state into two states that look independent. Our solution was to change our PRNG algorithm to use a "splittable" one (we chose LXM: https://dl.acm.org/doi/abs/10.1145/3485525 ). On `spawn`, the parent thread "splits" its PRNG state, keeps one state and passes the other to the child.
This has the very good property that if threads are created (and create threads themselves, etc.) and consume random numbers without communicating/synchronizing with each other, then the PRNG states and random numbers produced are deterministic -- for a given tree of spawns and draws, the results depend only on the initial PRNG state of the ancestor thread. So, for example, debugging strategies involving fixing a random seed at program startup for reproducibility still work in a multi-threaded scenario. (In contrast, if you initialize new threads' PRNG states by re-seeding, or if you maintain a single PRNG state protected by a mutex, the results will depend on the reseeding input or the non-deterministic scheduling of the threads with respect to each other.)
Of course, this works because we have full control of the OCaml runtime: we can add specific code, run on the parent at spawn time, whenever a new thread is spawned. In the VM situations described here, this would correspond to notifying the VM that it is about to be cloned, and giving it the ability to change its own state and also decide parts of the new clone's state. There is no mechanism providing this amount of control, and working around this is the main point of the article. Unfortunately, the vmgenid approach means that the child will find out after the fact that it was cloned, but the parent is never told, so the "splitting" approach is not applicable -- the splitting operation modifies both the parent and the child PRNG state.
Some PRNG generators support a "jump" primitive where the PRNG state advances by many iterations (for example 2^128) at once. Upon finding out that it is a cloned child, a VM could "jump" by a fixed amount. Unfortunately, this does not solve the problem of guaranteeing independence of PRNG states. It works if you clone a VM (1 jump), then clone the cloned VM again (2 jumps), etc., but it does not work if you clone the original VM several times: all copies will have jumped the same amount, so they end up in the same state. With this little control over the cloning protocol, it looks like reseeding is the only option.
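For readers unfamiliar with the idea, here is a toy sketch of "splitting" in C; it is not LXM and not cryptographically strong, but it shows the key property the comment relies on: deriving the child state also advances the parent state, so repeated splits from the same parent yield different children.

```c
/*
 * Toy illustration of a "splittable" PRNG (not LXM, not secure):
 * splitting derives a child state from the parent and advances the
 * parent as a side effect, so the two streams differ from then on.
 */
#include <stdint.h>
#include <stdio.h>

struct rng { uint64_t state; };

static uint64_t splitmix64(uint64_t *s)
{
    uint64_t z = (*s += 0x9e3779b97f4a7c15ULL);
    z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9ULL;
    z = (z ^ (z >> 27)) * 0x94d049bb133111ebULL;
    return z ^ (z >> 31);
}

static uint64_t rng_next(struct rng *r)
{
    return splitmix64(&r->state);
}

/* Split: the child gets a state derived from (and distinct from) the
 * parent's, and the parent's state advances as a side effect. */
static struct rng rng_split(struct rng *parent)
{
    struct rng child = { .state = rng_next(parent) ^ 0xa5a5a5a5a5a5a5a5ULL };
    return child;
}

int main(void)
{
    struct rng parent = { .state = 42 };
    struct rng child = rng_split(&parent);

    printf("parent: %016llx\n", (unsigned long long)rng_next(&parent));
    printf("child:  %016llx\n", (unsigned long long)rng_next(&child));
    return 0;
}
```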
Posted May 23, 2022 8:29 UTC (Mon) by cortana (subscriber, #24596)
https://docs.microsoft.com/en-us/windows-server/identity/...
Posted Jan 12, 2023 4:22 UTC (Thu) by kmeyer (subscriber, #50720)
[1]: https://github.com/freebsd/freebsd-src/commit/ffac39deae0...