The kernel lock validator

Posted May 31, 2006 6:28 UTC (Wed) by flewellyn (subscriber, #5047)
Parent article: The kernel lock validator

Y'know, it's funny. Back in college, in my Operating Systems course, we learned that it was
impossible to design code which would reliably detect all possible deadlock conditions in a
kernel.

We also learned that schedulers could not run in constant time.

That's twice now that Ingo Molnar has basically rewritten the literature on us.



The kernel lock validator

Posted May 31, 2006 6:54 UTC (Wed) by smurf (subscriber, #17840) [Link]

Umm, the validator doesn't detect all *possible* deadlock conditions -- just those that, given current and past locking history, might occur. It won't find tomorrow's deadlock if the code that could possibly cause it didn't run today.

Oh, and the scheduler doesn't run in constant time. It just delegates the job to small trivial chunks that run when each process starts+stops using the CPU, so technically it's still O(n), not O(1).

Re: detecting all possible deadlock conditions

Posted May 31, 2006 7:17 UTC (Wed) by mingo (subscriber, #31122) [Link]

Well, the validator does reduce the incredibly complex QA task of "trigger all possible locking races that can lead to deadlocks, assuming an arbitrary number of CPUs and interrupt sources" to "trigger all locking codepaths at least once".

In other words, we've reduced a highly probabilistic parallel QA (and manual code review) task to a comparatively _much_ simpler task of serial code coverage.

But yes, to increase the 'reach' of the formal proof of correctness we need to increase code coverage during testing. Projects like LTP do that systematically: they use kernel instrumentation and visualization tools to see which code wasn't executed yet and create testcases to trigger it. Plus chaos helps us too: if LTP doesn't trigger it, users will eventually - and will likely not trigger the deadlock itself, but only the codepath that makes it possible (hence there is much better debug info and a still-working system).

Put differently: it's not a bad idea to run through every possible branch of code at least once anyway ;-) If some code is never triggered in practice, why is the code there and how was it tested to begin with? There is a world of difference between code that is never executed even once and narrow one-instruction races that need many CPUs to even trigger.

So I'd say "total locking correctness" is not an impossible dream anymore, but is attached to another effort (code coverage) that we want to make as complete as possible anyway.
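
To make the reduction concrete, here is a minimal sketch - with hypothetical lock names, not the validator's actual code - of why serial coverage suffices. If these two functions each run once, even far apart in time and never concurrently, the recorded lock orderings A -> B and B -> A form a cycle and the potential deadlock is reported:

static DEFINE_SPINLOCK(A);
static DEFINE_SPINLOCK(B);

void path_one(void)
{
        spin_lock(&A);
        spin_lock(&B);          /* validator records the dependency A -> B */
        spin_unlock(&B);
        spin_unlock(&A);
}

void path_two(void)
{
        spin_lock(&B);
        spin_lock(&A);          /* records B -> A: cycle - a warning is emitted */
        spin_unlock(&A);
        spin_unlock(&B);
}

Neither run deadlocks (or even overlaps with the other), yet the bug is caught: that is the parallel-QA-to-serial-coverage reduction at work.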

testing and code coverage

Posted May 31, 2006 14:21 UTC (Wed) by jreiser (subscriber, #11027) [Link]

Put differently: it's not bad idea to run through every possible branch of code at least once anyway ;-) If some code is never triggered in practice, why is the code there and how was it tested to begin with?

First, it probably never has been tested systematically; this problem is endemic to the Linux development process. Second, the problem is not only "branches of code [coverage of basic blocks]." The problem is conditional execution paths through multiple basic blocks, including re-convergent fanout (correlated conditions) and partial correctness with "don't care" conditions.

testing and code coverage

Posted May 31, 2006 14:34 UTC (Wed) by mingo (subscriber, #31122) [Link]

First, it probably never has been tested systematically;

I think the answer to that is in the section you apparently overlooked:

Projects like LTP do that systematically: they use kernel instrumentation and visualization tools to see which code wasn't executed yet and create testcases to trigger it. Plus chaos helps us too: if LTP doesn't trigger it, users will eventually - and will likely not trigger the deadlock itself, but only the codepath that makes it possible (hence there is much better debug info and a still-working system).

And thus this IMO misleading proposition results in a misleading conclusion:

this problem is endemic to the Linux development process.

;-)

LTP has a long way to go

Posted May 31, 2006 15:12 UTC (Wed) by jreiser (subscriber, #11027) [Link]

I did not overlook "Projects like LTP do that systematically." The Linux Testing Project is a drop in the bucket: far too little. Given its size, the kernel should have over ten thousand individual tests that are run and analyzed [automation helps!] before each release.

LTP has a long way to go

Posted May 31, 2006 15:55 UTC (Wed) by mingo (subscriber, #31122) [Link]

I did not overlook "Projects like LTP do that systematically." The Linux Testing Project is a drop in the bucket: far too little. Given its size, the kernel should have over ten thousand individual tests that are run and analyzed [automation helps!] before each release.

The LTP testsuite's 'testcases/' directory sports 7000+ files, most of which are individual testcases, and testcase files often contain more than one testcase. LTP is run not "before each release" but nightly on Linus' GIT trees - and yes, it's all automated.

Furthermore, there are random automated testing efforts as well, like scrashme, which can (and do) hit bugs by chance.

While I don't claim that LTP is perfect (if it were, we'd have no bugs in the kernel), it is certainly a lot more than "a drop in the bucket".

But I digress ...

LTP has a long way to go

Posted Jun 6, 2006 19:34 UTC (Tue) by Blaisorblade (guest, #25465) [Link]

The problem with LTP is that, in itself, it tests syscall semantics, and many tests are unit tests. They can help in some cases (especially for strange arches - say, UML) - and they're very useful when new functionality is introduced, especially if the implementation is complex (that's my experience: in one project I wrote the implementation together with a very complex testcase for it).

nanosleep() tests aren't going to catch many bugs - nanosleep() uses are very simple.

But, say, run LTP on every possible driver set, with a dmraid setup on "normal" RAID volumes on IDE and SCSI and SATA physical disks (since many bugs are in hardware drivers)...

Or combine networking tests with analysis of packet capture data and match sent data with on-wire data (supposing that's possible - you actually need a full TCP/IP stack to run on captured data, with additional correctness-checking features)...

Then you'll find real bugs. However, this starts to get difficult.

Another possibility is to extract testcases from atypical applications.

UserModeLinux (on which I work) has been an excellent test case for ptrace behaviour.

In the recent past it found subtle ptrace bugs that existed across three kernel releases (one bug, affecting security, lived through 2.6.9 and 2.6.10; the other affected x86-64 from 2.6.16.5 to 2.6.16.19).

But in those cases, whoever coded the patches didn't run it.

testing and code coverage

Posted May 31, 2006 14:54 UTC (Wed) by mingo (subscriber, #31122) [Link]

Second, the problem is not only "branches of code [coverage of basic blocks]." The problem is conditional execution paths through multiple basic blocks, including re-convergent fanout (correlated conditions) and partial correctness with "don't care" conditions.

Yes - code coverage and testing is a lot more than just basic coverage of all existing code.

But coverage that triggers locking is a lot simpler than full coverage, because locks are almost always taken in simple ways, without too many branches. So while in theory code like this:


void function1(unsigned long mask)
{
        if (mask & 0x00000001)
                spin_lock(&lock0);
        if (mask & 0x00000002)
                spin_lock(&lock1);
        if (mask & 0x00000004)
                spin_lock(&lock2);
        if (mask & 0x00000008)
                spin_lock(&lock3);
        if (mask & 0x00000010)
                spin_lock(&lock4);
        ...
        if (mask & 0x80000000)
                spin_lock(&lock31);
}

could exist in the kernel and would require 4 billion values of 'mask' to cycle through all the possible locking scenarios, in practice the kernel is full of much simpler locking constructs:

void function2(unsigned long mask)
{
        spin_lock(&lock);
        ...
        spin_unlock(&lock);
}

where covering the function once is probably enough to map its locking impact. In fact, "tricky", non-straightforward locking code is frowned upon from a review and quality POV, which also works in favor of validation quality. To stay with the function1() example, such code would very likely be rejected at a very early review stage.

And finally, even theoretical full code coverage is a lot simpler than the possible combinations of codepaths on a multiprocessor system that could trigger deadlocks.

Naturally, being a dynamic method, the lock validator can only claim correctness about codepaths it actually observes (any other code doesn't even exist for it), and thus it still depends on how frequently (and with what memory state) those codepaths get triggered.

Re: detecting all possible deadlock conditions

Posted May 31, 2006 15:08 UTC (Wed) by NAR (subscriber, #1313) [Link]

if LTP doesnt trigger it, users will eventually - and will likely not trigger the deadlock itself, but only the codepath that makes it possible. (hence there is much better debug info and a still working system.)

That assumes this feature is turned on in the user's kernel - isn't the performance hit too high for users to keep this configuration option enabled?

If some code is never triggered in practice, why is the code there and how was it tested to begin with?

Error handling (especially hardware-related error handling) might be triggered extremely rarely, but I don't think it should be thrown out. The code could be tested by manually setting some error flag, but obviously the setting of this flag is not included in the production kernel.
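
For illustration (hypothetical names throughout), the kind of arrangement described here - a debug-only flag that forces the rare error path during testing but compiles away in production - might look like:

struct mydev;                                   /* hypothetical device type */
int mydev_read_status(struct mydev *dev);       /* hypothetical driver helpers */
int mydev_recover(struct mydev *dev);

#ifdef CONFIG_DEBUG_INJECT_ERRORS       /* hypothetical debug-only option */
static int inject_hw_error;             /* set from a test harness */
#else
#define inject_hw_error 0               /* constant-folds away in production */
#endif

static int read_device_status(struct mydev *dev)
{
        int status = mydev_read_status(dev);

        if (status < 0 || inject_hw_error)
                return mydev_recover(dev);      /* the rarely-triggered error path */

        return status;
}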

Bye, NAR

Re: detecting all possible deadlock conditions

Posted Jun 8, 2006 8:53 UTC (Thu) by jmansion (guest, #36515) [Link]

>In other words, we've reduced a highly probabilistic
>parallel QA (and manual code review) task to a
>comparatively _much_ simpler task of serial code coverage.

Hardly. If you can legitimately *ever* lock two objects of the same type, then there will be an exception that turns off the 'false positive' - and that exception will also turn off checking for paths that acquire A then B while other paths acquire B then A. Which is exactly the sort of deadlock you normally get in a database server, for example.

Which is not to say that the approach isn't useful, just that it's not a panacea. It would have been handy to have a compare/contrast with FreeBSD's WITNESS facility.
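
For what it's worth, the validator does provide annotations for this same-type case. A sketch (hypothetical function name, assuming the nesting annotation shipped with the validator patches) of the usual idiom - impose a global order, then tell the validator the nesting is intentional:

/*
 * Lock two objects of the same lock class, e.g. two hash buckets.
 * Ordering by address prevents ABBA between concurrent callers; the
 * subclass annotation keeps the validator from reporting the second
 * same-class acquisition as a self-deadlock.
 */
void lock_bucket_pair(spinlock_t *a, spinlock_t *b)
{
        if (a > b) {                    /* always lock the lower address first */
                spinlock_t *tmp = a;
                a = b;
                b = tmp;
        }
        spin_lock(a);
        spin_lock_nested(b, SINGLE_DEPTH_NESTING);
}

But the caveat above stands: within one class the validator can no longer distinguish instances, so the correctness of the address-ordering rule itself rests on the programmer.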

Re: O(1) scheduler

Posted May 31, 2006 7:55 UTC (Wed) by mingo (subscriber, #31122) [Link]

Oh, and the scheduler doesn't run in constant time. It just delegates the job to small trivial chunks that run when each process starts+stops using the CPU, so technically it's still O(n), not O(1).

This statement doesn't parse for me. Sure, if you execute an O(1) operation N times, that makes it O(N). What matters is the overhead of context-switching and wakeup, and those are O(1)!

Just look at an example: sys_getpid(), the simplest system call in existence, does:

return current->tgid;

That's as O(1) as it gets - a handful of instructions. But by your argument, since applications may execute getpid() lots of times, in reality it's O(N)? By that argument, no code at all would qualify as O(1)!

Re: O(1) scheduler

Posted May 31, 2006 8:08 UTC (Wed) by frankie (subscriber, #13593) [Link]

The O(1) for the scheduler refers to the number of processes: scheduling time does not depend on how many processes there are. That's all.

The kernel lock validator

Posted May 31, 2006 9:19 UTC (Wed) by simlo (guest, #10866) [Link]

The old scheduler was O(number of processes on the run-queue). The scheduler in 2.6 is O(number of priorities) = O(140) = O(1).
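
A rough sketch (simplified from the 2.6 scheduler's data structures) of why picking the next task costs O(140) rather than O(number of runnable tasks): a fixed-size bitmap records which priority levels have runnable tasks, so finding the highest-priority task is a bounded bitmap scan plus a list dereference.

#define MAX_PRIO 140

struct prio_array {
        unsigned long bitmap[BITS_TO_LONGS(MAX_PRIO)]; /* one bit per priority */
        struct list_head queue[MAX_PRIO];              /* runnable tasks per level */
};

struct task_struct *pick_next(struct prio_array *array)
{
        /* scans at most 140 bits - independent of the task count */
        int idx = sched_find_first_bit(array->bitmap);

        return list_entry(array->queue[idx].next, struct task_struct, run_list);
}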

The kernel lock validator

Posted May 31, 2006 7:09 UTC (Wed) by piman (subscriber, #8957) [Link]

In fact it's trivial to identify all functions that could deadlock (if you heard otherwise, maybe you should've been paying more attention in class :). What's impossible is identifying all and only those functions that deadlock.
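
Concretely (hypothetical code - the lock name, input type, and predicate are all made up): a sound analysis must flag a function like the one below as "could deadlock", yet whether the deadlock is ever reached depends on a runtime predicate, and deciding that in general is equivalent to the halting problem.

static DEFINE_SPINLOCK(A);

void maybe_deadlock(struct input *in)
{
        spin_lock(&A);
        if (complex_runtime_condition(in))  /* undecidable in general */
                spin_lock(&A);              /* self-deadlock - but only if reached */
        spin_unlock(&A);
}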

