
testing and code coverage

Posted May 31, 2006 14:21 UTC (Wed) by jreiser (subscriber, #11027)
In reply to: Re: detecting all possible deadlock conditions by mingo
Parent article: The kernel lock validator

> Put differently: it's not a bad idea to run through every possible branch of code at least once anyway ;-) If some code is never triggered in practice, why is the code there and how was it tested to begin with?

First, it probably has never been tested systematically; this problem is endemic to the Linux development process. Second, the problem is not only "branches of code [coverage of basic blocks]." The problem is conditional execution paths through multiple basic blocks, including re-convergent fanout (correlated conditions) and partial correctness with "don't care" conditions.
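To make the re-convergent-fanout point concrete, here is a hypothetical illustration (not code from the kernel): the two tests (a=1, b=1) and (a=0, b=0) give 100% branch coverage of the function below - every branch is taken both ways - yet the one path that crashes, first branch not taken and second branch taken, is never executed.

static char buffer[16];

void demo(int a, int b)
{
        char *p = NULL;

        if (a > 0)
                p = buffer;             /* basic block 1 */

        /* ... unrelated work; the control paths re-converge here ... */

        if (b > 0)
                *p = 'x';               /* basic block 2: NULL dereference
                                           exactly when a <= 0 && b > 0 */
}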



testing and code coverage

Posted May 31, 2006 14:34 UTC (Wed) by mingo (subscriber, #31122)

> First, it probably has never been tested systematically;

i think the answer to that is in the section you apparently overlooked:

> Projects like LTP do that systematically: they use kernel instrumentation and visualization tools to see which code wasn't executed yet and create testcases to trigger it. Plus chaos helps us too: if LTP doesn't trigger it, users will eventually - and will likely not trigger the deadlock itself, but only the codepath that makes it possible. (hence there is much better debug info and a still working system.)

and thus this IMO misleading proposition results in a misleading conclusion:

> this problem is endemic to the Linux development process.

;-)

LTP has a long way to go

Posted May 31, 2006 15:12 UTC (Wed) by jreiser (subscriber, #11027)

I did not overlook "Projects like LTP do that systematically." The Linux Test Project is a drop in the bucket: far too little. Given its size, the kernel should have over ten thousand individual tests that are run and analyzed [automation helps!] before each release.

LTP has a long way to go

Posted May 31, 2006 15:55 UTC (Wed) by mingo (subscriber, #31122)

> I did not overlook "Projects like LTP do that systematically." The Linux Test Project is a drop in the bucket: far too little. Given its size, the kernel should have over ten thousand individual tests that are run and analyzed [automation helps!] before each release.

the LTP testsuite's 'testcases/' directory sports 7000+ files, most of which are individual testcases. Testcase files often contain more than one testcase. LTP is being run not "before each release" but on Linus' nightly git trees - and yes, it's all automated.

furthermore, there are random automated testing efforts too, like scrashme, which can (and do) hit bugs by chance.

while i don't claim that LTP is perfect (if it were, we'd have no bugs in the kernel), it is certainly a lot more than "a drop in the bucket".

but i digress ...

LTP has a long way to go

Posted Jun 6, 2006 19:34 UTC (Tue) by Blaisorblade (guest, #25465)

The problem with LTP is that, in itself, it tests syscall semantics, and many of its tests are unit tests. They can help in some cases (especially for unusual arches like, say, UML) - and they're very useful when new functionality is introduced, especially if the implementation is complex (that's my experience - on one project I wrote the implementation together with a very complex testcase for it).

nanosleep() tests aren't going to catch many bugs - nanosleep() uses are very simple.
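To make that concrete, here is a minimal sketch of such a syscall unit test (illustrative only, not an actual LTP testcase): it can do little more than check that nanosleep() sleeps at least as long as requested.

#include <stdio.h>
#include <time.h>

int main(void)
{
        struct timespec req = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 };
        struct timespec t0, t1;

        clock_gettime(CLOCK_MONOTONIC, &t0);
        if (nanosleep(&req, NULL) != 0) {
                perror("nanosleep");
                return 1;
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);

        /* the syscall must sleep at least as long as requested */
        long long elapsed = (t1.tv_sec - t0.tv_sec) * 1000000000LL
                          + (t1.tv_nsec - t0.tv_nsec);
        if (elapsed < req.tv_nsec) {
                fprintf(stderr, "FAIL: slept only %lld ns\n", elapsed);
                return 1;
        }
        printf("PASS\n");
        return 0;
}

(On older glibc this may need -lrt for clock_gettime().)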

But, say, run LTP on every possible driver set, with a dmraid setup on "normal" RAID volumes on IDE and SCSI and SATA physical disks (since many bugs are in hardware drivers)...

Or combine networking tests with analysis of packet capture data and match sent data with on-wire data (supposing it's possible - you actually need a full TCP/IP stack to run on captured data, with additional correctness-checking features)...

Then you'll find real bugs. However, this starts to get difficult.
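A hedged sketch of the packet-capture idea above, using libpcap (the capture name "test.pcap" and the payload string are made-up assumptions): rather than the full TCP/IP-stack replay, it merely scans a capture for a known test payload, to confirm the sent data really appeared on the wire.

#include <pcap.h>
#include <stdio.h>
#include <string.h>

static const char sent_payload[] = "LTP-NET-TEST-0001";  /* assumed pattern */

int main(void)
{
        char errbuf[PCAP_ERRBUF_SIZE];
        pcap_t *cap = pcap_open_offline("test.pcap", errbuf); /* assumed file */
        struct pcap_pkthdr *hdr;
        const u_char *data;
        int found = 0;

        if (!cap) {
                fprintf(stderr, "pcap: %s\n", errbuf);
                return 1;
        }
        /* scan every captured packet for the payload we know was sent */
        while (pcap_next_ex(cap, &hdr, &data) == 1)
                for (unsigned i = 0;
                     i + sizeof(sent_payload) - 1 <= hdr->caplen; i++)
                        if (!memcmp(data + i, sent_payload,
                                    sizeof(sent_payload) - 1))
                                found = 1;
        pcap_close(cap);
        printf(found ? "payload seen on the wire\n" : "payload missing!\n");
        return !found;
}

(Link with -lpcap; a real checker would follow sequence numbers and retransmissions instead of a naive byte scan.)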

Another possibility is to extract testcases from atypical applications.

User-Mode Linux (on which I work) has been an excellent testcase for ptrace behaviour.

In the recent past it found subtle ptrace bugs present in three kernel releases (one bug, affecting security, lived through 2.6.9 and 2.6.10; the other affected x86-64 from 2.6.16.5 to 2.6.16.19).

But in those cases, whoever coded the patches didn't run it.

testing and code coverage

Posted May 31, 2006 14:54 UTC (Wed) by mingo (subscriber, #31122)

> Second, the problem is not only "branches of code [coverage of basic blocks]." The problem is conditional execution paths through multiple basic blocks, including re-convergent fanout (correlated conditions) and partial correctness with "don't care" conditions.

yes - code coverage and testing are a lot more than just basic coverage of all existing code.

but coverage that triggers locking is a lot simpler than full coverage, because locks are almost always taken in simple ways, without too many branches. So while in theory code like this:


void function1(unsigned long mask)
{
        if (mask & 0x00000001)
                spin_lock(&lock0);
        if (mask & 0x00000002)
                spin_lock(&lock1);
        if (mask & 0x00000004)
                spin_lock(&lock2);
        if (mask & 0x00000008)
                spin_lock(&lock3);
        if (mask & 0x00000010)
                spin_lock(&lock4);
        ...
        if (mask & 0x80000000)
                spin_lock(&lock31);
}

could exist in the kernel and would require 4 billion values of 'mask' to cycle through all the possible locking scenarios, in practice the kernel is full of much simpler locking constructs:

void function2(void)
{
        spin_lock(&lock);
        ...
        spin_unlock(&lock);
}

where covering the function once is probably enough to map its locking impact. In fact, "tricky", non-straight-line locking code is frowned upon from a review and quality POV, which also works in favor of validation quality. To stay with the function1() example, such code would very likely be rejected at a very early review stage.

and finally, even theoretical full code coverage is a lot simpler than covering the possible combinations of codepaths on a multiprocessor system that could trigger deadlocks.
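for example (a hypothetical userspace pthreads sketch, not kernel code): both functions below are fully covered by any single-threaded test, yet the AB-BA deadlock needs one particular concurrent interleaving - a combination no coverage metric accounts for.

#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

static void *thread_a(void *arg)
{
        pthread_mutex_lock(&lock_a);
        pthread_mutex_lock(&lock_b);    /* blocks if thread_b holds lock_b */
        pthread_mutex_unlock(&lock_b);
        pthread_mutex_unlock(&lock_a);
        return NULL;
}

static void *thread_b(void *arg)
{
        pthread_mutex_lock(&lock_b);
        pthread_mutex_lock(&lock_a);    /* ...and waits for lock_a: AB-BA */
        pthread_mutex_unlock(&lock_a);
        pthread_mutex_unlock(&lock_b);
        return NULL;
}

int main(void)
{
        pthread_t ta, tb;

        pthread_create(&ta, NULL, thread_a, NULL);
        pthread_create(&tb, NULL, thread_b, NULL);
        pthread_join(ta, NULL);         /* usually finishes; rarely hangs */
        pthread_join(tb, NULL);
        return 0;
}

(link with -lpthread; run it in a loop and it will occasionally deadlock.)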

naturally, being a dynamic method, the lock validator can only claim correctness about codepaths it actually observes (any other code doesn't even exist for it), and thus it still depends on how frequently (and with what memory state) those codepaths get triggered.
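a toy sketch of that dynamic approach (not lockdep's actual implementation): record an edge "B taken while A held" on every acquisition, and complain when the reverse edge shows up. Note that merely executing both orderings - even single-threaded, with no deadlock occurring at all - is enough to trigger the report, which is why covering the codepaths matters more than hitting the race itself.

#include <stdio.h>
#include <stdbool.h>

#define MAX_LOCKS 4

static bool dep[MAX_LOCKS][MAX_LOCKS];  /* dep[a][b]: b taken while a held */
static bool held[MAX_LOCKS];

static void toy_lock(int l)
{
        for (int h = 0; h < MAX_LOCKS; h++) {
                if (!held[h])
                        continue;
                if (dep[l][h])          /* reverse edge seen earlier: inversion */
                        printf("potential deadlock: %d->%d vs %d->%d\n",
                               h, l, l, h);
                dep[h][l] = true;       /* record: l taken while h held */
        }
        held[l] = true;
}

static void toy_unlock(int l)
{
        held[l] = false;
}

int main(void)
{
        /* codepath 1: lock 0, then 1 */
        toy_lock(0); toy_lock(1); toy_unlock(1); toy_unlock(0);
        /* codepath 2: lock 1, then 0 - reported, though nothing deadlocked */
        toy_lock(1); toy_lock(0); toy_unlock(0); toy_unlock(1);
        return 0;
}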

