DHS gears up for research phase of open source bug hunt (Linux.com)

Linux.com looks at a security project that has used the Coverity bug checker to perform security audits on open source software. "It's been nearly a year since the US Department of Homeland Security (DHS) announced the "vulnerability discovery and remediation open source hardening project," a $1.24 million, three-year grant through its research and development arm, the Directorate for Science and Technology. Now, the security project is entering its research phase."

DHS gears up for research phase of open source bug hunt (Linux.com)

Posted Dec 8, 2006 19:23 UTC (Fri) by bluefoxicy (subscriber, #25366) [Link]

What I want to see is Coverity pointed at OpenBSD; last time I went looking for open source alternatives, Theo de Raadt personally e-mailed me to banter about how these tools never increase code quality and just teach programmers to satisfy the tool and not actually look for bugs.

DHS gears up for research phase of open source bug hunt (Linux.com)

Posted Dec 8, 2006 23:40 UTC (Fri) by madscientist (subscriber, #16861) [Link]

This sort of comment is indicative of ignorance of what this tool actually does... which is not surprising as it's not free. Apparently Theo views Prevent as a sort of glorified version of lint, and if that were the case he'd be somewhat correct: sometimes people just learn to "code around" issues that lint raises without actually producing better quality code. Of course, sometimes the simplest way to "code around" the warning is to write better code, even in lint!

However, Prevent is not like that. I've seen the output, on even very complex proprietary code, and you are not getting notes about questionable style that can be worked around with extra parentheses, etc. Prevent generates a complete database of all the possible codepaths in your entire source tree (not just one file at a time), then it does a top-down analysis of virtually every possible codepath, using all available static information. Note I'm not just talking about function call hierarchies, but every conditional within a function as well.

Then it presents potential errors to you in a very simple to use web interface, where it shows the code in your browser, annotated by which decisions needed to be taken at every conditional to generate the bug, and walking back up the function call hierarchy. The code is similar to LXR output in that you can click symbols and jump to that location, etc. It's really amazing (and humbling) the obscure, but generally totally legitimate, types of errors Prevent points out to you. Not only that but the tool maintains a history so you can say that something is not a bug, and then it won't be presented to you when you run the tool again. Much nicer than lint etc. where you have to annotate the code with special comments telling it to ignore various issues.

Of course there can be some setup, especially if your code uses non-standard memory allocation and/or exit ("non-returning") functions; you have to teach Prevent about that. However, if your custom functions eventually call malloc or exit or similar, Prevent will understand that without help.

And BTW, I'm not a Coverity employee or anything; in fact the company I worked for when I organized the demo of Coverity elected to not purchase the tool (over my objections). It's not exactly inexpensive. But it's a fabulously useful and _usable_ tool and you have to give a tip o' the hat to Coverity for making it available to F/OSS projects at no charge.

DHS gears up for research phase of open source bug hunt (Linux.com)

Posted Dec 9, 2006 5:46 UTC (Sat) by JoeBuck (subscriber, #2330) [Link]

Not quite right, of course: the number of code paths is exponential, so Coverity and similar tools have to do pruning. This means that they don't follow all paths (particularly in very large functions). You can also get false positives because the tool can't look deeply enough to see that some path through the code is not possible: you get a report saying that if A is true, and B is true, and C is false, you dereference a null pointer, but it turns out that it's not possible for this combination of conditions to be true at once.
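To make the false-positive case concrete, here is a minimal Python sketch. The function names and the invariant are hypothetical and are not taken from any real tool's output. It enumerates every combination of three conditions the way a checker with no deeper knowledge would, then applies the one constraint the checker cannot see.

```python
from itertools import product

def deref_is_null(a, b, c):
    # Hypothetical function under analysis: the pointer is still null
    # only on the path where A and B hold and C does not.
    return a and b and not c

def feasible(a, b, c):
    # Assumed program invariant that the checker cannot see:
    # whenever A and B both hold, C holds as well.
    return c or not (a and b)

def path_count(n_branches):
    # n independent two-way branches in sequence give 2**n paths,
    # which is why real tools have to prune.
    return 2 ** n_branches

all_paths = list(product([False, True], repeat=3))
reported = [p for p in all_paths if deref_is_null(*p)]   # what the tool flags
real = [p for p in reported if feasible(*p)]             # what can actually happen

print("paths explored:", len(all_paths))
print("reported:", reported)
print("actually reachable:", real)
print("paths for 30 sequential branches:", path_count(30))
```

The single reported path (A true, B true, C false) is filtered out once the invariant is applied, which is the kind of report a reviewer would end up marking as "not a bug".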

Tools for Software Quality

Posted Dec 9, 2006 18:12 UTC (Sat) by Junior_Samples (guest, #26737) [Link]

If a function is too complex for automated static analysis, then it is probably a flawed design. Complexity is a software flaw, and tools exist to measure it. The McCabe metric (cyclomatic complexity), along with the Halstead metrics, is a pretty good indicator of whether a procedure is too complex. A nasty flawed procedure can almost always be refactored into several smaller, less complex procedures.
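For readers who have not met the McCabe metric: it is computed from a procedure's control-flow graph as M = E - N + 2P (edges, nodes, connected components). A minimal sketch in Python follows; the graph and its node names are made up for illustration, and real metric tools build the graph from source code rather than taking it by hand.

```python
def cyclomatic_complexity(edges, components=1):
    """McCabe's M = E - N + 2P for a control-flow graph given as edge pairs."""
    nodes = {n for edge in edges for n in edge}
    return len(edges) - len(nodes) + 2 * components

# Hypothetical control-flow graph for a procedure shaped like:
#     if x: A else: B
#     while y: C
cfg = [
    ("entry", "if"),
    ("if", "then"),
    ("if", "else"),
    ("then", "loop"),
    ("else", "loop"),
    ("loop", "body"),
    ("body", "loop"),
    ("loop", "exit"),
]

# Straight-line code with no decisions at all.
straight_line = [("entry", "exit")]

print(cyclomatic_complexity(cfg))
print(cyclomatic_complexity(straight_line))
```

Each decision point (the `if` and the `while`) adds one to the result, so a procedure with dozens of conditionals scores high and becomes a candidate for splitting up.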

Theo seems to be a self-appointed "expert" in software quality, in the same way that Jesse Jackson is a "reverend". I see no evidence that Theo has a clue about what real software quality entails. I ran some automated metrics on OpenBSD and was shocked to see how gawdawful the code really is.

Theo is mostly content to play in the sandbox he knows, and is unwilling to update his skills or open his mind to better possibilities. I would love to see OpenBSD rewritten in Ada or better yet, SPARK. I would love to see OpenBSD employ complexity metrics to identify problem modules. Theo's buffer overflow obsession has blinded him to the myriad of other problems which affect software quality. Buffer overruns are low hanging fruit, and are a direct result of a poor choice of programming language for mission critical software.

Tools for Software Quality

Posted Dec 9, 2006 21:46 UTC (Sat) by JoeBuck (subscriber, #2330) [Link]

By your criterion, I would expect that almost every compiler front end in existence is a flawed design. The number of paths is very large. Exhaustive analysis of every path is infeasible.

Look, the problem is not as simple as you claim it is, or we would all be producing provably correct C today. Too bad.

Multithreading makes things worse, though there are some promising approaches based on predicate abstraction that can prove that certain specific safety properties hold.

Tools for Software Quality

Posted Dec 10, 2006 8:42 UTC (Sun) by khim (subscriber, #9252) [Link]

By your criterion, I would expect that almost every compiler front end in existence is a flawed design

Yup. Ask any compiler designer - and he (or she) will readily agree. The number of strange codepaths, subtle errors and so on is staggering! Why the hell can we ever trust the compiler then? It's easy: the usual solution for a compiler is the introduction of an "internal error". If the input triggers something strange - crash the compiler and be done with it. If the compiler cannot compile 90% of standard design patterns, but what it does compile it compiles correctly - it's an "OK" compiler. Not a good compiler, but we can use it. Carefully. A usable approach for a compiler, but not really useful for an operating system, where you don't have total control over the input.
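The "internal error" approach can be sketched in a few lines of Python. Everything here (the tuple-based expression tree, the instruction names, the exception class) is invented for illustration and does not mirror any real compiler: anything the translator does not explicitly understand aborts the run instead of producing possibly wrong output.

```python
class InternalCompilerError(Exception):
    """Raised when the compiler reaches a state it was never designed for."""

def compile_expr(node):
    """Translate a tiny expression tree into stack-machine instructions."""
    kind = node[0]
    if kind == "num":
        return [("push", node[1])]
    if kind == "add":
        return compile_expr(node[1]) + compile_expr(node[2]) + [("add",)]
    # Unknown construct: refuse to guess, stop the whole compilation.
    raise InternalCompilerError(f"unhandled node kind: {kind!r}")

print(compile_expr(("add", ("num", 1), ("num", 2))))

try:
    compile_expr(("mul", ("num", 1), ("num", 2)))
except InternalCompilerError as err:
    print("internal error:", err)
```

A compiler can afford this because a failed build is only an inconvenience; a kernel that bailed out on every unexpected input would be a denial-of-service waiting to happen.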

Tools for Software Quality

Posted Dec 11, 2006 3:44 UTC (Mon) by JoeBuck (subscriber, #2330) [Link]

But the huge numbers of branching paths in the compiler are pretty much required by the huge numbers of branching paths in the language specification. Decomposition into small functions doesn't help, because a tool that explores all paths must inline everything. Unless the language has a dead-simple structure, like Lisp, you're pretty much doomed by the problem specification.
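A rough illustration of why splitting into small functions does not shrink the whole-program path count, in Python. It assumes an acyclic call graph in which every callee is invoked exactly once on every path through its caller, which is a deliberate simplification; the function names and per-function path counts are hypothetical.

```python
def whole_program_paths(func, local_paths, calls):
    """Paths through `func` once all of its callees are inlined.

    local_paths: paths inside each function body, ignoring calls
    calls: function name -> list of callees (assumed acyclic)
    """
    total = local_paths[func]
    for callee in calls.get(func, []):
        total *= whole_program_paths(callee, local_paths, calls)
    return total

# Hypothetical front end: each function looks small on its own.
local_paths = {"main": 4, "parse": 8, "lex": 16, "emit": 8}
calls = {"main": ["parse", "emit"], "parse": ["lex"]}

print(whole_program_paths("main", local_paths, calls))
```

No function here has more than 16 local paths, yet the inlined total is their product, so per-function simplicity says little about what a tool exploring all paths has to cover.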

Tools for Software Quality

Posted Dec 11, 2006 12:48 UTC (Mon) by fergal (subscriber, #602) [Link]

One could argue that this case is covered in the OP's "then it is probably a flawed design" statement. That is, if the language you are compiling can only be compiled by a compiler with incomprehensible codepaths then it's because of a design flaw - in the language.

And it's the easy case..

Posted Dec 11, 2006 9:40 UTC (Mon) by eru (subscriber, #2753) [Link]

Yup. Ask any compiler designer - and he (or she) will readily agree. Number of strange codepaths, subtle errors and so on - is staggering! Why the hell can we ever trust the compiler then ?

And this in spite of a language implementation really being a much better-defined problem than most other programming tasks! After all, you have a pretty complete specification of the input and expected results, usually expressed in formal or semi-formal way, and no concurrency problems. The compiler can typically read the input at its own pace, not worrying about timing. And still you get bugs...

Tools for Software Quality

Posted Dec 9, 2006 22:46 UTC (Sat) by k8to (subscriber, #15413) [Link]

The problem occurs not at the function level, but at the whole-application level. Often modern static analysis tools will have to traverse paths that will never occur at runtime, because of constraints that are not visible in the code, or because of the problem's complexity and the limitations of current tools. But even without those cases, any significantly large program will contain, throughout its entire call hierarchy, so many overall logic paths that it will not be possible to traverse them all.

On modern 32-bit systems, you usually hit the address space ceiling first, but analysis time isn't very far behind. Sure, you can scan every logical path in Oracle, but how many years or decades are you willing to wait for the job to complete?

Tools for Software Quality

Posted Dec 10, 2006 15:57 UTC (Sun) by drag (subscriber, #31333) [Link]

I am no expert, not even close.

But this whole discussion makes me think that tight and very modular code is the key to security and being relatively bug-free.

I wouldn't say so much 'object oriented' or whatnot, but at least discrete code blocks that can be fully investigated as individual items rather than having to analyze a monolithic application.

Sort of along the lines of the classic Unix-y vs Windows-y way of doing things, with the 'a tool does one job and does it well' vs 'all functionality should be made available as a whole' sort of thing.

Because in that way a tool like Coverity (or even individuals doing code auditing) would be much more useful, as you would be able to break the problem down into much easier hunks.

Tools for Software Quality

Posted Dec 11, 2006 10:32 UTC (Mon) by viro (subscriber, #7872) [Link]

Show me a single usable general-purpose kernel written in Ada. Poor choice or not, what you suggest is a completely unsuitable one. As in "unfit for that kind of task as far as experimental data shows". And that beats all other considerations. Reality matters.

Tools for Software Quality

Posted Dec 11, 2006 11:49 UTC (Mon) by nix (subscriber, #2304) [Link]

It *might* be practical for parts of the system above the kernel. Of course, reality matters, and there aren't many Ada systems programmers around, although Ada embedded programmers could probably move into systems programming if they wanted to. But they *haven't*... so reality still wins.

Tools for Software Quality

Posted Dec 12, 2006 14:26 UTC (Tue) by Junior_Samples (guest, #26737) [Link]

Certainly Ada has proved itself in embedded systems. It is an ideal language for low level control of the bare hardware. But specific choice of language is not really the issue. The issue is that OSS software remains stuck in software development practices which haven't changed in 20 years.

This whole thread demonstrates that OSS software development continues to make alibis and excuses for maintaining the status quo. Making assertions that "reality" matters, and then ignoring the proven reality of useful methods for improving software quality takes great hubris indeed.

I'm reminded of the transition from K & R C to ANSI C. What there was of the free software world at the time had to be dragged kicking and screaming into a better way of doing things.

At the time the mark of a hot-shot macho programmer was to use C in an idiomatic and cryptic way just to prove that they were a "real programmer". They didn't need no new fangled ANSI C to "get in the way". Gosh, even large scale projects like BSD remained in the K & R world for many years after a better way of doing things was available. Likewise with GNU software.

Today there exist many good and proven tools to improve software quality. It is unfortunate that they are dismissed so easily by the free software community.

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds