Finding kernel problems automatically

[Posted June 1, 2004 by corbet]

In past years, this page has looked at the work done by the "Stanford checker," which analyzes code in search of various types of programming errors. The checker has found a lot of problems over the years, with the result that a lot of problems have been fixed before they had a chance to bite users of production kernels.

The only problem with the Stanford checker is that it is not free software; it is, in fact, completely unavailable to the world as a whole. Rather than release the code, the checker group went off and formed Coverity to commercialize the checker software (now called "SWAT" and touted, ominously, as being "patent pending"). Developers at Coverity still occasionally post reports of potential bugs found by SWAT, but, for the most part, their attention seems focused on potential revenue opportunities.

It is hard to complain about this outcome. Before heading on this course, the Coverity folks uncovered vast numbers of bugs, and all Linux users benefited from that work. They also demonstrated how valuable static code testing tools can be. The community, however, was left in the position of having to actually write its own checker if it wanted one. Fortunately, this is the sort of thing the community can be good at.

A while back, none other than Linus Torvalds started work on his own tool, which came to be called "sparse." There has recently been a flurry of new activity around sparse, so it seems like a good time to take a look.

sparse is normally obtained by cloning the BitKeeper repository at bk://kernel.bkbits.net/torvalds/sparse. For those who don't use BK, a checked-out version is available (as a bunch of SCCS files) on kernel.org. There is a low-bandwidth sparse mailing list as well.

Essentially, sparse is a parsing and analysis library for the C language. One could put a number of different backends onto it; for example, a code-generation backend would turn it into a simple compiler. For the purposes of the kernel, however, the backend of interest is the analysis code which looks for various types of errors. The analyzer checks for quite a few different types of errors. Many of these (many sorts of type mismatches, for example) are also found by the compiler, but other tests are unique to sparse.

The core test done by sparse is still the check for improper use of user-space pointers. A quick look through the kernel will turn up liberal use of a type attribute called __user; for example, the read() method invoked from system calls is prototyped as:

    ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);

When the kernel is being compiled, __user is defined as the empty string, so gcc doesn't see it at all. When sparse is being used, instead, it marks the pointer as (1) being in a separate address space, and (2) not being legal to dereference. sparse will use those flags to catch any mixing of user- and kernel-space pointers, and any attempt to directly dereference user-space pointers.

These checks have turned up a surprising number of errors. The kernel normally sets up the virtual address space in such a way that direct dereferencing of user-space pointers actually works - most of the time. Using user-space addresses in this way will fail, however, if the user page is not actually resident in memory at the time. More importantly, perhaps, this sort of direct dereferencing bypasses the normal access controls; every such error could, thus, become a security hole.

Catching such mistakes automatically seems like a good idea. It does require, however, that every variable holding a user-space pointer be marked with the __user attribute. Since much of the kernel (including every device driver) deals with user-space pointers, this is not a trivial job. This job is proceeding, however; several dozen patches adding __user annotations (and fixing problems found on the way) have been merged for 2.6.7.

Other checks performed include finding constants which are overly long for their target type, mistakes in embedded assembly language code, empty switch statements, assignments in conditionals, and so on. Its output is rather noisy still, but one assumes that will improve over time. If you have sparse installed, running it on the kernel is simply a matter of adding "C=1" to the make command. External modules can also be checked in this way.

sparse is still clearly far behind the Stanford checker in terms of the variety of errors it can find. Unlike the checker, however, sparse is free software. The core parsing infrastructure is in place, so the addition of new checks should be relatively straightforward. All that's needed is the application of a bunch of developer time.

Index entries for this article
Kernel	Debugging
Kernel	Development tools/Sparse
Kernel	sparse
Kernel	__user

This makes me think...

Posted Jun 3, 2004 5:06 UTC (Thu) by etrusco (guest, #4227) [Link] (5 responses)

Has there ever been a tentative to push a "hardened C" standard, just C plus strong type-checking?

This makes me think...

Posted Jun 3, 2004 10:47 UTC (Thu) by dvdeug (guest, #10998) [Link] (2 responses)

Why would you want a version of C with strong-typing? It wouldn't be C anymore; it would probably be no easier to port code from C to new-C than it would to port it to Ada or some other real language with strong typing. If you want Ada, or Java, or Modula-3, or even C++, you know where to find them.

This makes me think...

Posted Jun 3, 2004 23:27 UTC (Thu) by bronson (subscriber, #4806) [Link] (1 responses)

Except that, with a few preprocessor defines to remove the strong-C keywords, you could compile strong-C on a weak-C compiler. The kernel source code could even be a mix of strong and weak C as it transitions to strong-C over the period of a few years. There are very real benefits to simply extending an existing language -- witness C++.

This makes me think...

Posted Jun 4, 2004 10:38 UTC (Fri) by dvdeug (guest, #10998) [Link]

Yes, witness C++. Part of C++'s problem is that it's an incredible pain to parse correctly (notice that GCC ended up completely rewritting the C++ parser to properly support standard C++.) Part of the reason is that C++ was designed based on the already complex C syntax.

Another problem with C++ is that C--especially real world C, but even standard C--and C++ aren't completely compatible either way. Stroustroup had to make a few minor tweaks that lost compatibility to make C++ work right. A strong-C is more than likely going to have a few edge cases where things don't work exactly the same.

To use a few preprocessor defines to remove the strong-C is going to hamstring your efforts; the differences are just not going to be that minimal, and that tightly limits your syntax. What you're asking for is basically the system described in the article, not a new language.

This makes me think...

Posted Jun 3, 2004 11:34 UTC (Thu) by bdixon (guest, #1055) [Link]

Yes... It is called MISRA (Motor Industry Software Reliability Association) C and it amounts to many rules asking you to do this and not do that. MISRA C compliance can be checked using static analyzers. For example, rule 33 says that the right hand operand of && or || shall not contain side effects.

Some of the MISRA rules, however, may not be compatible with systems programming.

This makes me think...

Posted Jun 3, 2004 15:12 UTC (Thu) by mmarsh (subscriber, #17029) [Link]

You might want to check out http://www.cs.cornell.edu/projects/cyclone/ ,
as well. It's certainly not a standard, but it's a safe C dialect.

I can complain ...

Posted Jun 3, 2004 22:34 UTC (Thu) by AnswerGuy (guest, #1256) [Link] (1 responses)

... that the Stanford Checker was built around gcc (xgcc) and thus should be obligated by the GPL if they ever distribute it.

So, if Coverity is selling (distributing) a derivative of this as a product ...

Now, it's possible that they took *their* code and ported it to some other compiler environment; or it's possible they are billing this as a service and not as a product.

However, it doesn't seem like they are acting in the spirit of free software in either case.

I can also complain since I sent them a number of messages politely asking for information about the project and it's progress and inquiring about the the possibility having anyone from their team speak at a local technical group (BayLISA) and never received the courtesy of a response --- not even a curt note to the effect that they are far to busy and important to speak to the likes of us sysadmins.

Oh well! C'est le vie!

JimD

I can complain ...

Posted Jul 16, 2004 15:37 UTC (Fri) by arafel (subscriber, #18557) [Link]

I sent a couple of notes, and got replies saying that they'd release it when they'd tidied the code up. Doesn't look like that's going to happen.

Regarding the other bit - they'll port the checker code to whichever compiler you want. If it's gcc, then I can't see that they have any way of stopping you from redistributing it afterwards - but you'd have to have enough money to pay for the port.