LWN.net Logo

What every C Programmer should know about undefined behavior #2/3

What every C Programmer should know about undefined behavior #2/3

Posted May 16, 2011 15:54 UTC (Mon) by gowen (guest, #23914)
Parent article: What every C Programmer should know about undefined behavior #2/3

void contains_null_check(int *P) {
  int dead = *P;
  if (P == 0)
    return;
  *P = 4;
}

This one is interesting because (a) something very much like it caused a real security hole in the linux kernel recently and (b) the ONLY reason it exists is because of C's "declarations go at the start of the block" rule.

Someone wants to declare a variable, and knows its good practice to initialise it and, in the interest of style wants to avoid

void contains_null_check(int *P) {
  if (P == 0) return;
  {  
    int dead = *P;
    *P = 4;
  }
}
Result: bug.

C++ (since RAII strongly encourages initialise-at-declaration) and c99 (should it ever catch on) should make this one considerably less common.


(Log in to post comments)

What every C Programmer should know about undefined behavior

Posted May 16, 2011 16:37 UTC (Mon) by cfischer (subscriber, #3983) [Link]

Why not just say:
void contains_null_check(int *P) {
    int dead;
    if (P == 0)
        return;
    dead = *P;
    /* do with 'dead' whatever you want */
}
Wouldn't that be the more 'natural' C style?

Methinks, the

    int dead = *P;
in the example is a misguided result of the urge to 'initialize', as opposed to 'assign'. In C++ those might be different, but in old-style C, for simple datatypes, avoiding the assignment for such reasons could already be seen as bad influence by new-fangled object-oriented hip languages.

So the problem seems more one of mixing styles to me.

What every C Programmer should know about undefined behavior

Posted May 16, 2011 17:43 UTC (Mon) by ledow (guest, #11753) [Link]

I have to agree - that's how I'd write it.

But then, I'd also explicitly use NULL too, rather than 0.

What every C Programmer should know about undefined behavior

Posted May 16, 2011 18:28 UTC (Mon) by iabervon (subscriber, #722) [Link]

I think it's actually a reflex of being used to C99 (or gnu89), primarily. If you can avoid declaring a variable before where you initialize it, there is nowhere that the variable is in scope and uninitialized, meaning you (or someone else touching the code later) can't accidentally use it uninitialized. If you then have to write C89, the easy thing is to move the whole line, instead of splitting it, particularly when you don't think you care about the ordering of this code with respect to other code.

Of course, there's also the situation of a programmer who doesn't think C99 has caught on modifying a C99 codebase, and moving the line with the declaration up to the start of the block; the result may not be "natural" C89 to write, but it might be a "natural" C89 change to a "natural" C99 block.

What every C Programmer should know about undefined behavior

Posted May 16, 2011 19:57 UTC (Mon) by jreiser (subscriber, #11027) [Link]

If you then have to write C89, the easy thing is to move the whole line, ...

Often it is better to make a block by inserting the braces which enclose the intended scope.

What every C Programmer should know about undefined behavior #2/3

Posted May 16, 2011 21:35 UTC (Mon) by cmccabe (guest, #60281) [Link]

Ok, I know I'm replying to a blatant troll, but let me just point out that:

1. Initializing something before checking if it's NULL has nothing to do with whether you decide to put declarations at the start.

2. C99 doesn't have a "declarations go at the start of the block rule."

That being said, a lot of people consider it good style in C to put the declarations at the start of the block in C, because it encourages you to keep functions short and sweet.

What every C Programmer should know about undefined behavior #2/3

Posted May 17, 2011 5:36 UTC (Tue) by gowen (guest, #23914) [Link]

> Ok, I know I'm replying to a blatant troll

I don't think that word means what you think it means. Clue: It doesn't mean "someone who doesn't share my opinion".

> 1. Initializing something before checking if it's NULL has nothing to do
> with whether you decide to put declarations at the start.

Combined with a coding rule that variables must be initialised when declared, it really kind of does. And thats a common coding rule, because using an unitialised variable is - surprise - undefined behaviour. The vertical space between a variables declaration and its initialisation represent a region in which using that variable is a bug. It is good practice to keep that space as small as possible ("but no smaller" being the extra advice thats forgotten in this case). Yes, its not an unavoidable bug. But its an unnecessary vector for bug transmission.

So, as mentioned above this clash of good advice - or rather, a slightly blind application of usually-good advice - combined with C89's archaism on variable declaration resulted in a bug.

For the recent Linux hole, why do *you* think the (presumably experienced) coder initialised the variable on declaration - which sadly preceded the NULL check?

> 2. C99 doesn't have a "declarations go at the start of the block rule."

I did actually mention that. So zero out of ten for reading the whole post.

What every C Programmer should know about undefined behavior #2/3

Posted Jun 7, 2011 21:14 UTC (Tue) by cmccabe (guest, #60281) [Link]

The reality is that code written in C, C++, and other unsafe languages will always have "holes." We can minimize the holes by careful sandboxing and code inspection, but never quite get them to zero.

Cherry-picking one example of a hole and then using it to justify your style preferences is fairly silly. I could equally well find an overflow caused by signed overflow and say "aha! signed numbers are teh evil."

The reason why I prefer the C89 style of initializing the variables at the top of the block is that it tends to lead to shorter, clearer functions. If you end up with a page worth of declarations, it makes it clear even to the dullest programmer that his function is getting too long. It also makes it clearer how much stack space is actually being used, which is nice when you're really going for performance. And if you're not going for performance, what are you doing using C?

I understand the arguments for the C99/C++ "declare right before use" style. In C++ it's almost a must, because declarations trigger constructors to run, consuming CPU cycles. It also works well with C++'s RAII style. It also can move the definition closer to the use, making it easier to see the type. But again, that assumes you are writing gigantic, multi-page functions, which you *should not do*.

So basically, I think we are going to have to agree to disagree, for C at least. For C++, yes, you should declare as close as possible to where you use a variable.

What every C Programmer should know about undefined behavior #2/3

Posted Jun 8, 2011 13:39 UTC (Wed) by nix (subscriber, #2304) [Link]

If you end up with a page worth of declarations, it makes it clear even to the dullest programmer that his function is getting too long.
You have too much confidence in dull programmers. I have worked on functions with ten pages of variable declarations at the top. (The functions themselves were ten thousand lines long, which *anyone* should have realized was too long, but they had grown slowly to that length and nobody wanted to take the 'risk' of splitting them.)

What every C Programmer should know about undefined behavior #2/3

Posted Jun 16, 2011 0:05 UTC (Thu) by cmccabe (guest, #60281) [Link]

Heh, that sounds like a story from thedailywtf.com

Anyway, lazy or careless people can always find a way to do lazy or careless things. It is nice if you get a helpful hint that what you are doing is wrong, though. For example, using 4 or 8 space indents tends to give you a wakeup call that 20 levels of nesting might be more than the human mind can understand in C or C++. Using 1 or 2 space tabs does not. etc.

What every C Programmer should know about undefined behavior #2/3

Posted May 17, 2011 8:40 UTC (Tue) by marcH (subscriber, #57642) [Link]

I think the main issue is having a dead variable in the first place. If "dead" is not dead, then there is no problem, correct?

What every C Programmer should know about undefined behavior #2/3

Posted May 17, 2011 14:55 UTC (Tue) by nye (guest, #51576) [Link]

>If "dead" is not dead, then there is no problem, correct?

Incorrect - it's not a result of dead code elimination. This bug arises even if dead is used for something later on, because the initial assignment invokes undefined behaviour in the case that P is NULL. It's therefore always conformant for the compiler to assume that P is *not* NULL and remove the check, since if that assumption is incorrect then the behaviour is undefined and the compiler can do whatever it damn well pleases.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds