When should a process be migrated?

The performance of modern computers is heavily influenced by how well they use the processor's memory cache. Going to main memory is a slow operation (from a processor's point of view); an operating system which forces main memory accesses too often will run slowly. One of the things the Linux kernel does to optimize cache use is to try to avoid moving processes between CPUs if it is likely that those processes have a fair amount of useful data in the cache. When a process moves, it leaves its cached data behind and must begin populating the new CPU's cache from the beginning. That repopulation requires memory accesses and slows things down.

The metric used by the kernel to decide whether moving a particular task is advisable is a scheduling domain parameter called cache_hot_time. If the process has run on the current processor within the "hot time," it is considered to have significant data in the cache and is not moved unnecessarily. In recent kernels, cache_hot_time for processors on non-NUMA SMP systems is 2.5ms.
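The check described above can be sketched in a few lines. This is an illustration in Python rather than kernel code: the `Task` class, its `last_ran_ns` field, and both function names are hypothetical, and the sketch assumes that "hot" simply means the task last ran on its CPU less than cache_hot_time ago. Only the 2.5ms figure comes from the text.

```python
from dataclasses import dataclass

# 2.5ms, the non-NUMA SMP value quoted in the article, in nanoseconds.
CACHE_HOT_TIME_NS = 2_500_000


@dataclass
class Task:
    name: str
    last_ran_ns: int  # hypothetical: when the task last ran on its current CPU


def is_cache_hot(task, now_ns, cache_hot_time_ns=CACHE_HOT_TIME_NS):
    """Assume the cache still holds useful data if the task ran recently."""
    return now_ns - task.last_ran_ns < cache_hot_time_ns


def pick_migration_candidates(tasks, now_ns, cache_hot_time_ns=CACHE_HOT_TIME_NS):
    """Return the tasks that are cache-cold, and so are cheap to move."""
    return [t for t in tasks if not is_cache_hot(t, now_ns, cache_hot_time_ns)]
```

Raising `cache_hot_time_ns` shrinks the list of candidates, which is the effect under discussion: a longer hot time makes the scheduler more reluctant to move tasks.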

Kenneth Chen recently did some tests to see if that value makes sense. On his four-processor system, he found that workload throughput with a 2.5ms hot time was 12% below its peak level, which is reached with a 10ms value. As it turns out, 10ms was once the default value for the cache hot time; Kenneth proposes that this value be restored. Others have, instead, suggested that a new tunable parameter be provided so that administrators could find and set the optimal value for their systems.

Ingo Molnar has come up with a different approach: have the computer figure out for itself what the optimal "cache hot" time is. To this end, his code performs the following steps for each pair of processors on the system:

  1. The first processor fills a large, shared buffer with data, thus populating its own cache with (some of) the contents of that buffer.

  2. The second processor fills a private buffer, filling its own cache.

  3. The second processor then overwrites the shared buffer, moving the contents of that buffer into its own cache.

The time required for the third step is, to an approximation, a worst-case scenario for what it costs to move a process when it has filled the local cache with data. Ingo tested the code on a few systems and got optimal values ranging from 5ms (on a four-processor Pentium 4 system) to 87ms (on an eight-processor, semi-NUMA, Pentium 3 system). Clearly, one default value for all systems is not the right answer. This also looks like a good number for the computer to find for itself, assuming subsequent tests show that this patch (or a successor) is finding something close to the optimal value.
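The three steps can be approximated from user space. The sketch below is not Ingo's patch and makes several simplifying assumptions: it runs on Linux, a single thread pins itself to each CPU in turn (via `os.sched_setaffinity`) instead of using two cooperating processors, and the buffer size is left to the caller. The function names are made up for illustration, and the measured time includes some Python call overhead.

```python
import ctypes
import os
import time

# CPU affinity calls exist only on some platforms (notably Linux).
HAVE_AFFINITY = hasattr(os, "sched_setaffinity")


def allowed_cpus():
    """CPUs this process may run on; [0] where affinity is unsupported."""
    if HAVE_AFFINITY:
        return sorted(os.sched_getaffinity(0))
    return [0]


def pin_to(cpu):
    """Pin the calling thread to one CPU, if the platform allows it."""
    if HAVE_AFFINITY:
        os.sched_setaffinity(0, {cpu})


def measure_migration_cost_ns(cpu_a, cpu_b, size_bytes):
    """Time step 3: cpu_b overwriting a buffer last written by cpu_a."""
    shared = ctypes.create_string_buffer(size_bytes)
    private = ctypes.create_string_buffer(size_bytes)
    original = os.sched_getaffinity(0) if HAVE_AFFINITY else None
    try:
        pin_to(cpu_a)
        ctypes.memset(shared, 1, size_bytes)   # step 1: fill shared buffer
        pin_to(cpu_b)
        ctypes.memset(private, 2, size_bytes)  # step 2: fill private buffer
        start = time.perf_counter_ns()
        ctypes.memset(shared, 3, size_bytes)   # step 3: overwrite shared buffer
        return time.perf_counter_ns() - start
    finally:
        if original is not None:
            os.sched_setaffinity(0, original)  # restore the original affinity
```

Calling `measure_migration_cost_ns(a, b, size)` for each pair drawn from `allowed_cpus()`, with `size` somewhat larger than the cache, gives one number per pair. How those numbers should be combined into a single cache_hot_time is not specified here and would be a further assumption.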


When should a process be migrated?

Posted Oct 7, 2004 6:40 UTC (Thu) by bradfitz (subscriber, #4378)

Whoa, that's pretty slick. Much better than a tunable, if it proves to work out well.

Process migration timing

Posted Oct 7, 2004 13:27 UTC (Thu) by Duncan (guest, #6647)

Slick indeed.

In any case, it would appear 2.5ms is too short. While auto-config timing
continues development, upping that 2.5ms time to 5-10ms would seem to be
in order. 5ms would in all cases be better than 2.5, while 10ms would
appear to be better in most cases, and a good compromise between the
5ms cases and the 87ms cases.

Also, the article points out that the current 2.5ms time is applied in
non-NUMA cases. A couple of sentences or a paragraph mentioning how NUMA
treatment differs would have been in order. It should obviously be a
higher time (87ms, as mentioned in the one case), as NUMA means potentially
moving data in main memory as well as cache. But is that also a set value,
and if so, what is it? Or is it already calculated dynamically? A single
sentence would cover a fixed NUMA value. A dynamic calculation might take
up to a paragraph if it can be explained easily at a high level, or
otherwise a couple of sentences giving an expected range for comparison
and stating that further detail is beyond the scope of the article.

This is of particular interest here, as I'm running a dual Opteron in NUMA
mode, 512 MB hung off of each CPU.

Duncan

When should a process be migrated?

Posted Oct 7, 2004 10:51 UTC (Thu) by PhilHannent (guest, #1241)

Just some side questions: When does this test occur? During boot? Will the test be rerun after a CPU has been hot swapped?

This is an interesting way of optimising a system.

When should a process be migrated?

Posted Oct 7, 2004 21:51 UTC (Thu) by simlo (subscriber, #10866)

What about latencies? The mentioned tests are only for throughput. In some cases, where you would rather have a responsive system, you might want to migrate threads sooner than on a system where you want the maximum throughput.

When should a process be migrated?

Posted Oct 8, 2004 16:57 UTC (Fri) by giraffedata (subscriber, #1954)

This is measuring a red herring. Apparently, the method uses the "cost to move a process" number -- 5ms and 87ms in the example -- as the cache_hot_time. The two numbers have nothing to do with each other. cache_hot_time is about determining how likely it is that a process has stuff in cache. If it does, there's reason to try to keep it on the same CPU; if not, moving it is free.

What it costs to move a process is another parameter that one could use in conjunction with cache_hot_time to weight the decision whether or not to move a process.

A self-tuning cache_hot_time, on the other hand, would look like this: Keep a process running that regularly accesses some typical amount of memory and sleeps for varying times. Keep it on the same CPU. Have it determine each time it wakes up, by timing, if its memory was still cached. If it was, then cache_hot_time should be greater than the most recent sleep time; if not, it should be less.
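One way to picture that feedback loop is as a search over sleep times. The sketch below is only an illustration under stated assumptions: `was_still_cached` is a hypothetical probe standing in for the timing measurement made at wakeup, the choice of bisection to pick the next sleep time is an assumption (the description above says only "varying times"), and real measurements would be noisy where this one is treated as exact.

```python
def tune_cache_hot_time(was_still_cached, low_ns, high_ns, rounds=30):
    """Narrow down cache_hot_time from repeated sleep-and-probe trials.

    was_still_cached(sleep_ns) is a hypothetical probe: it should return
    True if, after sleeping sleep_ns, the process found its working set
    still in cache. low_ns is assumed short enough to stay cached and
    high_ns long enough to be evicted.
    """
    for _ in range(rounds):
        sleep_ns = (low_ns + high_ns) // 2
        if was_still_cached(sleep_ns):
            # Data survived the sleep: the hot time is at least this long.
            low_ns = sleep_ns
        else:
            # Data was gone on wakeup: the hot time is shorter than this.
            high_ns = sleep_ns
    return (low_ns + high_ns) // 2
```

For example, with a stand-in probe such as `lambda s: s < 7_000_000`, the estimate settles next to 7ms. On a live system the probe would have to be repeated and averaged, since other processes can evict the data at any time.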

When should a process be migrated?

Posted Oct 14, 2004 6:40 UTC (Thu) by Russell (guest, #1453)

What would work even better would be counters in the MMU that count RAM accesses. They would provide the scheduler with good estimates of how much data a particular process has in each CPU's cache.

Copyright © 2004, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds