Memory-management testing and debugging
Testing
Davidlohr Bueso started a session on testing by saying that he has been
working on
improving the mmtests
benchmark suite to improve its ability to detect changes across kernel
versions. To that end, he has looked at a couple of test suites that are
being used in academia: Mosbench
and Parsec. There were
questions about how well these tests worked for testing the kernel in
particular, but, Davidlohr said, these suites do contain some useful tests.
Andi Kleen said there is a new suite out there that is promising despite being named, inevitably, "cloudbench."
Davidlohr asked if anybody else had workload tests that they would like to contribute to mmtests. Laura Abbott said that she would like to see a good set of tests for mobile systems. Scalability tests, she said, tend to be oriented toward scaling up, but mobile developers need tests that focus on scaling down.
Hard conclusions from this session were hard to come by; Davidlohr will continue to work on integrating and documenting other tests aimed at memory-management scalability.
Debugging
Memory-management debugging was the topic of another session run by Dave
Jones, Sasha Levin, and Dave Hansen. Dave Hansen started off by saying
that, while developers have added a number of debugging features to the
memory-management subsystem, they have so far left an important technology
on the table. He was talking about Intel's MPX
mechanism, which is able to check pointer accesses and ensure, in
hardware, that they don't go outside a set of defined boundaries. The nice
thing about MPX is that it has almost no runtime cost, so it can be enabled
on production systems.
Of course, developers may have some excuse for not making much use of MPX so far. It requires the (not yet released) GCC 5 compiler to instrument code properly, and hardware that actually implements MPX is not yet available. So, he said, there is still time to get our act together.
There was some immediate interest in using MPX with the slab allocator in the kernel. That would take some work, though, since the kernel would have to be changed to load the appropriate MPX registers before accessing a given slab object. Christoph Lameter asked if access to all slab objects could be monitored with MPX. It turns out that there's a small practical difficulty there: a typical running kernel has many thousands of slab-allocated objects, but there are only four sets of registers in the MPX hardware. So tracking more than four objects requires juggling information into and out of those registers.
Peter Zijlstra suggested that MPX could be applied to the kernel stack. It is not clear, though, that MPX-based stack checking would provide advantages over the explicit stack-overflow checks done in the kernel now. Still, it may be possible to dedicate one of the registers to the kernel stack and gain some extra protection.
Andy Lutomirski asked if the MPX registers could be written to while running
in atomic context. That turns out to be tricky, since setting up these
registers involves doing a memory allocation. Andy also suggested that
MPX could be used to block direct access to user-space addresses from the
kernel. Laura asked about checking
of DMA operations, but MPX only applies to accesses from the CPU.
Sasha shifted the discussion to the VM_BUG_ON() macro. This macro, which comes in a few variants, dumps out a bunch of information specific to the memory-management subsystem; it is thus useful for identifying memory-management bugs. Sasha would like to add more VM_BUG_ON() instances in the kernel, but he is worried about complaints of false positives. These complaints have kept debugging code out in the past; the result, he said, was that users suffered from a number of race conditions that could otherwise have been caught.
There was some talk about additional information that could be printed out by VM_BUG_ON(), but few conclusions. It was suggested that a full kernel memory dump would be helpful — but that, of course, is rather a large amount of data to print into the kernel log. Dave Jones would, instead, like more information about how the system got into the bad state; that would require adding some sort of transaction log. It was suggested that Intel's upcoming Processor Trace functionality could be helpful in this regard.
Dave Hansen then asked if there were any developers with sets of
memory-management tracepoints that could be considered for merging? It
seems that some exist, but Andi said that, rather than adding more
tracepoints now, it would be better to focus on improving the documentation
of existing tracepoints. Andrea Arcangeli questioned the value of
memory-management tracepoints in general; he does his memory-management
development on
virtualized systems and wonders why anybody would do anything else. When a
system is run under virtualization, it can be examined with an ordinary
debugger. But others argued that there are a lot of problems that only
show up on bare-metal systems, so there will always be a place for
debugging infrastructure that works in that environment.
Fernando Vasquez Cao noted that his group uses SystemTap heavily for memory-management debugging. Among other things, it is handy for injecting faults at specific locations, making it easier to get at hard-to-reproduce problems. Dave Jones agreed that the tools have made life better; it is, he said, a miracle that we were able to solve anything five years ago. He also wondered why there was not more use of the existing fault-injection framework; when he turns it on, he said, "everything breaks," so he concludes that nobody else is doing so. Fernando responded that the injection framework does not allow sufficiently specific fault injection. Besides, he said, when you turn it on everything else breaks, making it hard to focus on the specific problem at hand. It was agreed that somebody (currently unnamed) should fix those problems.
KASan
One tool that has been merged relatively recently is the kernel address sanitizer (or KASan). This tool uses a "shadow memory" array to track which memory the kernel should legitimately be accessing; it can then throw an error whenever the kernel goes out of bounds. KASan developer Andrey Ryabinin led a session on this tool and how it might be improved.
The first idea that came out was to enable KASan to properly validate
accesses to memory obtained with vmalloc(). Doing so would
require putting hooks into vmalloc() itself and creating a new,
dynamic shadow memory array. The amount of work required is not huge; it
is much like tracking slab allocations, except that shadow memory for slab
can be allocated at boot time. There were, unsurprisingly, no objections,
so this work should go forward soon.
A slightly trickier problem is memory that is freed and quickly reallocated to a new user. That memory looks fine to KASan, but quick reallocation can mask use-after-free bugs in the code that previously owned it. The proposed solution here is to put freed memory into a "quarantine" area for a period, delaying its availability to the rest of the system. Memory would emerge from quarantine after a defined period; alternatively, a shrinker could be used to remove memory from quarantine when the system starts to run low. There are concerns that delaying free operations in this way could create a certain amount of memory fragmentation. Andrey is not quite sure how to move forward with this feature, and the group did not appear to have a lot of fresh ideas to share.
Then there is the possibility of catching reads of uninitialized memory. It is possible to get the compiler to instrument code to make this testing possible, but the results include a lot of false positives that are hard to get rid of. Among other things, memory initialized in assembly code must be annotated manually. Andrey has tried doing this and found the result difficult to support. He's afraid that developers will turn the feature on, see all the false positives, and just give up on the whole thing.
Another possibility is using KASan to find data races; there are some tools out there to help with this now. But, he said, it involves some "crazy overhead" — four bytes of shadow memory for every byte of normal memory. There's also a need for a lot of manual annotation; large numbers of false positives are also a problem. The end result is that this feature does not appear to be useful for now.
Other ideas for the more distant future include a quarantine for the page allocator (and not just the slab allocator), and the instrumentation of some inline assembly operations like the atomic bit operators.
Sasha made a plea for developers to enable KASan when they are running their own tests. It has turned up a lot of bugs, he said; the code is in the upstream kernel, it's easy to turn on, and the overhead is low. The only catch is that GCC 5 is needed to gain all of the features, though 4.9 works with reduced functionality.
The final question in this session was: now that we have KASan, is there
still a need to maintain the older kmemcheck utility? Kmemcheck only works on
single-processor systems, it is painful to use, and it is slow. It seems
that nobody is actually making use of it. The consensus of the group was
that kmemcheck should be removed. (It should be noted that Sasha's attempt to implement this decision ran into
some opposition from developers who still use kmemcheck, so it may stay
around for a while yet).
| Index entries for this article | |
|---|---|
| Kernel | Development tools/Testing |
| Kernel | KASan |
| Conference | Storage, Filesystem, and Memory-Management Summit/2015 |
Posted Mar 19, 2015 3:42 UTC (Thu)
by kenmoffat (subscriber, #4807)
[Link] (3 responses)
It's nice to see photos of the various devs inserted into the article but in many case I cannot tell who the photo represents. Perhaps there is a guide somewhere, which I have missed ? On the browser I'm using (qupzilla), the photos appear at the side of the article, taking space from the previous/next paragraphs. Sometimes, an article only really mentions one developer and is reasonable clear. In others, such as this one, there are multiple developers and (rather fewer) photos.
Posted Mar 19, 2015 7:02 UTC (Thu)
by alonz (subscriber, #815)
[Link] (1 responses)
Posted Mar 19, 2015 17:28 UTC (Thu)
by kenmoffat (subscriber, #4807)
[Link]
Posted Mar 19, 2015 7:09 UTC (Thu)
by johill (subscriber, #25196)
[Link]
Memory-management testing and debugging
AFAICS hovering over the pictures will show the names, as will navigating through them.
Memory-management testing and debugging
Memory-management testing and debugging
Memory-management testing and debugging
