LWN.net Logo

Buried in warnings

The 2.6.19-rc4 prepatch release did not go quite as well as the developers might have liked; some confusion over the return type for an internal function led to an undesirable mixing of pointer and integer types in the depths of the block layer. As it turns out, gcc noticed this problem and duly issued warnings about it, but nobody saw them before the mistaken patch was merged and the resulting kernel shipped. This is, in other words, a problem which should have been easily avoidable.

Linus responded this way:

And I have SYSFS enabled, so I should have seen this warning.

But I've become innoculated against warnings, just because we have too many of the totally useless noise about deprecation and crud, and ppc has it's own set of bogus compiler-and-linker-generated warnings..

At some point we should get rid of all the "politeness" warnings, just because they can end up hiding the _real_ ones.

A few kernel developers were doubtless wondering just why it took so long to reach this point - there have been complaints about excessive warnings for some time now. There is a lot of support for having the computer find problems whenever possible, and that has led to an increasing number of "must check" annotations and other changes which cause warnings to be issued whenever something looks suspicious. On top of that, gcc generates a fair number of warnings in situations where no real problems exist. The end result is that warnings which refer to real problems tend to get lost in the flood.

Patches which address many of the spurious "this variable might not be initialized before being used" warnings have been circulating for some time. There is resistance to applying them, however; some developers resent cluttering up the code (and bloating the kernel) with unneeded initializations to deal with what they see as a gcc bug. There is no real sign that this latest episode has changed the thinking on that score; the initialization patches may well continue to languish.

A different approach has been taken by Al Viro. He has developed a little tool called "remapper" which tracks how blocks of code move around from one kernel version to the next. Using the generated information, a set of compiler warnings from an old kernel can be remapped to their line numbers in a newer kernel. Then, a tool like diff can be used to compare the output from old and new compiles; the end result is a listing of the warnings which first appear in the new kernel - and only those. With this filtered output, developers can quickly find places where the compiler has pointed out real problems.

Remapper can be had via git from:

    git://git.kernel.org/pub/scm/linux/kernel/git/viro/remapper.git

Dave Jones also makes daily snapshots available.

Use of remapper is relatively straightforward: after building the remap-log tool, one starts with a command like this:

    diff-remap-data 2.6.19-rc2 2.6.19-rc3 > 2-to-3.map

The resulting "map" file is full of file names and numbers; they simply map line numbers from the old directory tree to the new one - and mark blocks of code which were removed altogether. There is another tool (git-remap-data) which performs the same task for two commits in a git repository; in this case, file renames can be handled properly as well.

The remap-log tool can then be used to move old compile logs into the present:

    remap-log 2-to-3.map < 2.6.19-rc2.log > 2.6.19-rc2-remapped.log

If the new log is then compared to the output from a 2.6.19-rc3 build with diff, the only output will be any warnings (or errors) which have appeared or disappeared between the two kernel versions. Those which have only moved due to changes elsewhere in the file will be filtered out. The short documentation file packaged with the code offers some other potential uses, such as carrying forward annotated grep output as an ongoing "to do" list.

Some developers swear by this tool. Jeff Garzik, however, is not entirely pleased; in an earlier discussion he said:

I think it's both sad, and telling, that the high level of build noise has trained kernel hackers to tune out warnings, and/or build tools of ever-increasing sophistication just to pick out the useful messages from all the noise.

Jeff has, instead, put together a separate kernel tree with many of the bogus warnings silenced. It is a labor-intensive task - each warning must be investigated and shown to be spurious before being quieted. This work is not intended for merging; instead, it's meant to help create a development platform in which the useful warnings can actually be seen. This set of changes has been part of the -mm tree since 2.6.18-mm3.

Yet another approach to the "may be uninitialized" warnings was floated last May; it introduces a special macro which "initializes" a variable without actually doing anything. That silences the warning without adding to the size of the kernel. The macro is only supposed to be used in cases where the code paths have been audited. The objection that was raised at the time was that, while the current use of a variable might be correct, future changes to the code could introduce a path where that variable is, indeed, used without initialization. The warning would still be suppressed, however, and the bug might not be caught until much later. So the patch was never merged.

Compiler bugs can, perhaps, eventually be fixed. But the increasing interest in the use of automated tools to find potential bugs all but guarantees that there will continue to be a stream of spurious warnings for developers to deal with. If those automated warnings are to lead to real fixes - before somebody gets burned - ways of keeping the noise level down will have to be found.


(Log in to post comments)

Buried in warnings

Posted Nov 2, 2006 5:20 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]

A typical example of a false uninitialized variable warning gcc will generate is the following:
bool flag = false;
some_type* pointer;

if (some_condition_is_true()) {
   flag = true;
   pointer = expensive_allocation_function();
}
do_something_else();
if (flag) {
   use_the_fine(pointer);
}

GCC will report that pointer might be used uninitialized, because it does not track the association between the flag variable and the state of pointer. Doing checks of this kind would require rather sophisticated analysis; gated static single assignment would work, but even then, there are cases that a human being can immediately see that the compiler will not. However, use of an uninitialized object can be such a disaster that most consider false positives better than false negatives.

Sometimes suppressing the warning will require a minor time and space penalty: say, an instruction to set some object to zero. Unless you're dealing with the most time-critical of inner loops, I suggest that you're better off trying for a clean compilation with -Wall, even if there is a minor cost, because a microscopically faster but more buggy program isn't worth the cost.

Buried in warnings

Posted Nov 2, 2006 6:26 UTC (Thu) by viro (subscriber, #7872) [Link]

There is such thing as bogusly initialized variable. And I would
argue that it's worse than even genuine uninitialized one; the latter
at least gets you a warning. The former is hidden...

Buried in warnings

Posted Nov 2, 2006 7:26 UTC (Thu) by avik (guest, #704) [Link]

The latter gets you a warning and data corruption. The former is hidden,
and maybe wastes a cycle when run.

I think that a bogus initialization is better than a genuine uninitialized
variable. It's only worse if you never run the kernel in question.

Buried in warnings

Posted Nov 2, 2006 8:48 UTC (Thu) by viro (subscriber, #7872) [Link]

Bogus as in "with value that doesn't make sense". Suppose you used
to have a declaration, then several places assigning to variable,
then several places using it. All paths to the latter actually
pass through the former, so we are fine. gcc is too dumb to prove
that, so it gives a warning. Fine, some kind idio^Wsoul slaps = 0
into declaration. Everything's fine. Until a modification of code
creates a path that *really* does use without assignment. Suddenly
(and without any warning from gcc whatsoever) we get a case that
gets us to use of variable when utter crap is stored in it; the crap
in question is that 0 supplied by helpful idiot several months ago.

Worse yet, code review finding a code path that leads to use without
assignment => OK, we've definitely found a bug. Code review finding
a code path that leads to use of variable explicitly initialized with
something that doesn't make much sense in that place => scratching
head for a long time and trying to figure out whether it's a bug or
not and WTBleedingF was supposed to be done in that place.

Buried in warnings

Posted Nov 2, 2006 15:16 UTC (Thu) by evgeny (guest, #774) [Link]

I'd vote for the bogus initialization. If it does cause a bug later on, it is at least reproducible. Tracking down uninitialized vars typically takes much longer. YMMV, of course.

Buried in warnings

Posted Nov 2, 2006 16:58 UTC (Thu) by nevyn (subscriber, #33129) [Link]

You are saying that the uninitialized vars aren't hidden, but this entire article proves otherwise. There are so many warnings when you compile the kernel that noone is looking at the ones that are being output.

Personally I think there is a huge amount of middle ground, for instance the example code Joe posted could declare the pointer as NULL allocate to it in the if and then just check if the pointer is not NULL later on (Ie. pointer also takes on the job of the seperate boolean).

Finally with decent usage of ASSERT/nonnull you can _very_ easily detect when pointers are still NULL from declaration time.

Buried in warnings

Posted Nov 2, 2006 10:46 UTC (Thu) by Asebe8zu (subscriber, #24600) [Link]

Couldn't you use the following to make it correct?
some_type* pointer;

if (some_condition_is_true()) {
   pointer = expensive_allocation_function();
   do_something_else();
   use_the_fine(pointer);
}
else
{
   do_something_else();
}

Buried in warnings

Posted Nov 2, 2006 15:05 UTC (Thu) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

That is just what you did not want to do, for instance when do_something_else() will be inlined if it is called once...

Either you would replace:
{
if (some_condition_is_true()) {
some_type* pointer = expensive_allocation_function();
do_something_else();
use_the_fine(pointer);
} else {
do_something_else();
}
}

By (definning a new GCC extern_local keyword):
{
if (some_condition_is_true()) {
some_type* pointer = expensive_allocation_function();
}
do_something_else();
if (some_condition_is_true()) {
extern_local some_type* pointer;
use_the_fine(pointer);
}
}

Or by a new GCC attribute:
{
some_type* pointer __attribute__((used_if(some_condition_is_true())));

if (some_condition_is_true())
pointer = expensive_allocation_function();
do_something_else();
if (some_condition_is_true())
use_the_fine(pointer);
}

By the way, to kill "used if not initialised" warning, you do it by
the recognised: "variable = variable;" feature, not by "variable = 0;".

Buried in warnings

Posted Nov 2, 2006 16:44 UTC (Thu) by NAR (subscriber, #1313) [Link]

By the way, to kill "used if not initialised" warning, you do it by the recognised: "variable = variable;" feature, not by "variable = 0;".

I think this still hides the case when a new execution path is added and the variable is failed to get initialized in the new path. I guess the real solution is to initialize at declaration (i.e. with a constructor) and do whatever case handling needed in that constructor - altough I'm not sure it can be done easily in C.

Bye,NAR

Buried in warnings

Posted Nov 3, 2006 1:07 UTC (Fri) by nix (subscriber, #2304) [Link]

I hope that's an assertion. you can't verify that attribute in the general
case without not just solving the halting problem but foretelling the
future. :)

And in that case you've now moved the problem from bogus initializations
to bogus assertions. The problem remains.

(And it's insoluble in the general case, anyway. Hence GCC's use of the
word `may', to emphasise that FPs from this warning are a
quality-of-implementation issue, sure, but not a compiler bug.)

Buried in warnings

Posted Nov 2, 2006 16:10 UTC (Thu) by NAR (subscriber, #1313) [Link]

I know it's just a made-up code, but if the pointer is really not used in do_something_else(), couldn't you just write:

do_something_else();
if (some_condition_is_true()) {
   some_type* pointer = expensive_allocation_function();
   use_the_fine(pointer);
}

I guess if the pointer is not used in do_something_else(), the function can't really use the things allocated there. Or does this syntax only work in C++? In this case it's too bad...

Anyway, just to reflect to the questioning of code quality in closed software (in a separate thread), we've had quarter-million lines of C++ code compiling without a single warning...

Bye,NAR

Buried in warnings

Posted Nov 3, 2006 1:09 UTC (Fri) by nix (subscriber, #2304) [Link]

C99 supports this. C90 does not.

C90 vs C99

Posted Nov 3, 2006 1:24 UTC (Fri) by xoddam (subscriber, #2322) [Link]

Not so. Declarations with initialisations have always been legal at the
beginning of a block.

C99 also permits declarations to follow statements, like C++.

C90 vs C99

Posted Nov 14, 2006 20:52 UTC (Tue) by nix (subscriber, #2304) [Link]

In C90 the initializations must be literals: you can't put an arbitrary
function call in there and expect it to work. (That's a GNU C extension,
copied from C++ and also found in C99: look up `Non-Constant Initializers'
in the manual.)

Reconvergent fanout

Posted Nov 2, 2006 16:19 UTC (Thu) by jreiser (subscriber, #11027) [Link]

The situation JoeBuck describes is a kind of "re-convergent fanout" which the people who do static timing analysis for digital circuit design figured out how to handle 20 years ago.

Reconvergent fanout

Posted Nov 3, 2006 2:59 UTC (Fri) by pimlott (guest, #1535) [Link]

A few more minutes of writing might have made that post a whole lot more educational for the rest of us. :-)

Reconvergent fanout

Posted Nov 3, 2006 19:40 UTC (Fri) by bronson (subscriber, #4806) [Link]

I don't think so... Reconvergent fanout is similar to this problem much like a banana is similar to a school bus because they are yellow.

Reconvergent fanout is a way to make VLSI static timing analysis less pessimistic. Each time a signal passes thorugh a node, a bit of uncertainty is added to the timing delay. However, if two (ore more) signals pass through the same node, follow separate paths, and then return to the same point, you can remove a fair amount of uncertainty because you know that one point they shared a common path. Whatever the uncertainty was, it was identical for both.

So, yes, both techniques trace signals through a system. Beyond that, though, it seems to me that they're totally different. I hope somebody will clue me in if I'm missing something; it's been many years since I was a VLSI hack.

Buried in warnings

Posted Nov 2, 2006 6:25 UTC (Thu) by viro (subscriber, #7872) [Link]

Two notes:
1) documentation is a bit obsolete in one respect: _all_ instances
of <pathname of modified file>:<line number> are replaced, not just
ones in the beginning of the line. IOW, something like
drivers/scsi/BusLogic.c:584: warning: 'check_region' is deprecated (declared at include/linux/ioport.h:133)
might get two replacements. Old version used to leave tons of noise
whenever something like ioport.h got modified. Current one deals
with that just fine.

2) Whatever Jeff thinks of level of gcc warnings, stuff like endianness
warnings is all over the tree simply because many parts hadn't been
touched yet. There the ability to see what had changed e.g. after
rebase (or your own changes to headers done so far) is critical and
you really don't want to drown in the fsckloads of noise. The same
thing happens whenever we get a new class of annotations, obviously.

Buried in warnings

Posted Nov 2, 2006 8:39 UTC (Thu) by pcdavid (subscriber, #4295) [Link]

Since version 1.5 and the addition of annotations, Java compilers support a @SuppressWarnings annotation. It can be attached to almost any element in the code (type, method, parameter, local variable, etc.) and is used to silence specific kinds of warnings related to this element. The exact kinds of warnings which can be silenced depend on the compiler, but some are mandatory. For example, if you have a pre-1.5 class which uses non-generic collections, you can add @SuppressWarnings("unchecked") to the class. This essentially tells the compiler "This class uses operations which might seem unsafe to you, but I know what I'm doing so don't bother me about it."

Couldn't the sparse tool be extended to support this kind of things? It could then filter out the output of GCC and remove the warnings which have been accounted for by the developers.

Buried in warnings

Posted Nov 2, 2006 10:39 UTC (Thu) by Asebe8zu (subscriber, #24600) [Link]

Another very useful feature in Java without equivalent in C
is the final modifier for local variables.
When used, the compiler ensures that the variable is set
once and only once before being used.

I don't know how you would implement this in C though, without
introducing some new keyword.

Java 'final', C 'const'

Posted Nov 2, 2006 15:20 UTC (Thu) by jreiser (subscriber, #11027) [Link]

If the final value is (or could be) assigned at declaration, then the C keyword const suffices.

My own favorite candidate for compiler enhancement is a message, "warning: 'const' omitted". The compiler detected that there would be no complaints if the programmer supplied the keyword const.

Java 'final', C 'const'

Posted Nov 2, 2006 21:16 UTC (Thu) by jzbiciak (✭ supporter ✭, #5246) [Link]

The problem I run into time and time again with const in C is that there are too many functions in too many libraries which take a non-const pointer, even though they do not modify the pointed-to entity. Very annoying.

Pointer to pointer to const

Posted Nov 3, 2006 1:40 UTC (Fri) by xoddam (subscriber, #2322) [Link]

That would be a fine idea, if you didn't run into "C++ Gotcha #32" all the time when trying to do this (it's just as much a gotcha in C as it is in C++). As long as you don't try to take the address of your const pointer, you're safe. But you can't mix const and non-const indirect pointers, because of aliasing problems.

Of course said aliasing problems might all be hidden deep inside the library (and/or the imagination of the compiler) and users shouldn't have to care at all -- but if it causes a problem for the developer, they're far more likely to drop 'const' than to suppress compiler warnings with casts and hope everything is otherwise correct.

Copyright © 2006, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds