
Moving the kernel to modern C


Posted Feb 25, 2022 11:14 UTC (Fri) by Wol (subscriber, #4433)
In reply to: Moving the kernel to modern C by wtarreau
Parent article: Moving the kernel to modern C

You're confusing "use" and "initialise".

Don't allow mixing USE and declaration. But DO allow the *compiler* to set the initial value.

Cheers,
Wol



Moving the kernel to modern C

Posted Feb 25, 2022 15:13 UTC (Fri) by wtarreau (subscriber, #51152) [Link] (13 responses)

I'm not sure we're speaking about the same thing. I'm speaking about not making this monstrosity possible, where I'd say "good luck" figuring out the type of "i" and its bounds, depending on the line you're reading:

#include <stdio.h>
#include <unistd.h>

int blah(long x, int j)
{
long i = x ? x : -1;
int k = i;

for (int i = 1; i < j; i++) {
k += i * 2;
char i = (k & 1) ? 'O' : 'E';
int pid = getpid();
printf("i=%d j=%d pid=%d\n", i, j, pid);
}
return k;
}

PS: sorry for the formatting, I didn't find how to make a code block.

Moving the kernel to modern C

Posted Feb 25, 2022 16:16 UTC (Fri) by farnz (subscriber, #17727) [Link] (1 responses)

Making a code block on LWN needs two tags in HTML formatting: <pre> to indicate that formatting matters, and <tt> to indicate that you want monospaced fonts. Below is <pre><tt> followed by your code (with indentation added by my brain), followed by </tt></pre> - I've also had to escape special characters with HTML escapes (but it's a simple matter to write code to do this for you).


#include <stdio.h>
#include <unistd.h>

int blah(long x, int j)
{
    long i = x ? x : -1;
    int k = i;

    for (int i = 1; i < j; i++) {
        k += i * 2;
        char i = (k & 1) ? 'O' : 'E';
        int pid = getpid();
        printf("i=%d j=%d pid=%d\n", i, j, pid);
    }
    return k;
}
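
The escaping step mentioned above could be sketched roughly as follows. This is only an illustration: the function name html_escape, its buffer-based interface, and the choice to escape just the three characters &, < and > are assumptions for this sketch, not anything LWN provides.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into dst, replacing the characters that
 * matter inside <pre><tt> with HTML entities. dst_size is the capacity of
 * dst including the terminating NUL.
 * Returns 0 on success, -1 if dst is too small.
 */
int html_escape(const char *src, char *dst, size_t dst_size)
{
    size_t used = 0;

    if (dst_size == 0)
        return -1;

    for (; *src != '\0'; ++src) {
        char single[2] = { *src, '\0' };
        const char *rep;

        switch (*src) {
        case '&': rep = "&amp;"; break;
        case '<': rep = "&lt;";  break;
        case '>': rep = "&gt;";  break;
        default:  rep = single;  break;
        }

        size_t len = strlen(rep);
        /* Keep one byte in reserve for the terminating NUL. */
        if (used + len + 1 > dst_size)
            return -1;
        memcpy(dst + used, rep, len);
        used += len;
    }

    dst[used] = '\0';
    return 0;
}
```

Running each line of a snippet through something like this before pasting it between the tags should be enough for code such as the example in this thread.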

Moving the kernel to modern C

Posted Feb 28, 2022 10:15 UTC (Mon) by wtarreau (subscriber, #51152) [Link]

> I've also had to escape special characters with HTML escapes

Thanks. That was the thing that made me think I was heading in the wrong direction and that possibly there was something simpler in order to just paste a piece of code.

Moving the kernel to modern C

Posted Feb 25, 2022 16:51 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

I thought that was allowed in ancient C ...

Unless you mean actually declaring inside the "if" statement ... but I thought declaring after a { was permitted anywhere. I dunno, it's ages since I've programmed C in anger.

But it would be nice to say you can ONLY declare after a {, but that includes things like "int i = 1". You shouldn't be able to do things like "int i; i=1; int j; j=2;", though.

Cheers,
Wol
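
The distinction being drawn here can be sketched in C as follows. The function names are made up for illustration. Both functions are valid C99; the second relies on declarations after statements, which C89 rejects and which GCC can flag with -Wdeclaration-after-statement.

```c
/* Declarations only at the top of the block, each with its initial value. */
int sum_declared_with_init(void)
{
    int i = 1;
    int j = 2;

    return i + j;
}

/* Declarations interleaved with statements: the style objected to above. */
int sum_interleaved(void)
{
    int i;
    i = 1;
    int j;      /* declaration after a statement: C99 and later only */
    j = 2;

    return i + j;
}
```

Both return the same value; the difference is purely in where a reader has to look to find each variable's type and first value.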

Moving the kernel to modern C

Posted Feb 25, 2022 18:44 UTC (Fri) by nybble41 (subscriber, #55106) [Link]

In C89 and GNU89 declarations must occur before statements within each block. C89 additionally requires the expressions in brace-enclosed initializer lists to be compile-time constants. However, it's not as if this equivalent GNU89 code is any easier to follow:

#include <stdio.h>
#include <unistd.h>

int blah(long x, int j)
{
    long i = x ? x : -1;
    int k = i;
    {
        int i;
        for (i = 1; i < j; i++) {
            k += i * 2;
            {
                char i = (k & 1) ? 'O' : 'E';
                {
                    int pid = getpid();
                    printf("i=%d j=%d pid=%d\n", i, j, pid);
                }
            }
        }
    }
    return k;
}

The real lessons here are "use meaningful names" and "avoid shadowing".
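
For comparison, here is a sketch of the same function with those two lessons applied. The names (blah_renamed, start, limit, total, step, parity) are invented for this illustration, and the getpid() call is left out to keep the sketch free of POSIX headers, so the printed line differs from the original.

```c
#include <stdio.h>

int blah_renamed(long start, int limit)
{
    long initial = start ? start : -1;
    int total = (int)initial;

    for (int step = 1; step < limit; step++) {
        total += step * 2;
        /* Distinct name, so nothing shadows the loop counter. */
        char parity = (total & 1) ? 'O' : 'E';
        printf("parity=%c limit=%d\n", parity, limit);
    }
    return total;
}
```

The arithmetic is unchanged from the original blah(); only the names differ, and every identifier now has exactly one type throughout the function.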

Moving the kernel to modern C

Posted Feb 26, 2022 0:21 UTC (Sat) by camhusmj38 (subscriber, #99234) [Link]

This is called variable shadowing and is possible in C89 as well. It can be very ugly, which is why it is discouraged; a good compiler can warn about it (GCC and Clang offer -Wshadow).
A good practice in modern languages is to combine declaration and initialisation so that you reduce the chance of accessing an uninitialised value. It also encourages locality in the code which makes it easier to comprehend.
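
A minimal sketch of shadowing in a form that C89 also accepts, since the inner declaration sits at the top of its own block. The function name shadow_demo is made up for this example.

```c
int shadow_demo(void)
{
    int value = 10;
    {
        int value = 20;   /* shadows the outer 'value' within this block */
        value += 1;       /* modifies only the inner object */
    }
    return value;         /* the outer 'value' was never touched */
}
```

Compiling this with -Wshadow on GCC or Clang should produce a warning for the inner declaration.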

Moving the kernel to modern C

Posted Feb 28, 2022 11:48 UTC (Mon) by ianmcc (subscriber, #88379) [Link] (7 responses)

main.cpp:11:6: error: redeclaration of ‘char i’
   11 | char i = (k & 1) ? 'O' : 'E';
      |      ^
main.cpp:9:10: note: ‘int i’ previously declared here
    9 | for (int i = 1; i < j; i++) {
      |          ^

Moving the kernel to modern C

Posted Feb 28, 2022 13:56 UTC (Mon) by jem (subscriber, #24231) [Link] (6 responses)

You seem to have a faulty C compiler, or you didn't copy the code correctly. The curly bracket at the end of line 9 starts a new block, and it's perfectly legal to declare a new 'i' variable inside that block.

Moving the kernel to modern C

Posted Feb 28, 2022 17:06 UTC (Mon) by ianmcc (subscriber, #88379) [Link] (4 responses)

That might be valid C (I don't know why, but an online C compiler accepts it without errors). It isn't valid C++. The scope of the control variable declared in the for loop is the loop itself, so you can't declare another variable with the same name in the same scope.

for (int i = ..)
{
int i = 2; // not valid C++. There is already a variable 'i' declared in this scope
}

Moving the kernel to modern C

Posted Feb 28, 2022 20:10 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (3 responses)

> The scope of the control variable declared in the for loop is the loop itself, so you can't declare another variable with the same name in the same scope.

What you say agrees with the C++ standard, but it makes me wonder why the standard authors appear to have been competing to come up with the most Byzantine special cases and exceptions they could think of to integrate into the standard rather than taking the simplest and least surprising route. Syntactically, the body of the for loop is a single statement which may be a compound statement. That part is the same as C. The braces are *not* part of the syntax for the loop. If the scope of the control variable were in fact the loop itself, and the body were treated the same as any other statement, then the compound statement would be an independent scope nested *within* that for-loop scope, and declarations within the compound statement would shadow any declarations scoped to the for loop (as they do in C). Instead the standard pierces the abstraction and treats compound statements in a for loop body differently than compound statements located elsewhere. There is no logic to this that I can see, just a bald statement that "If a name introduced in an init-statement or for-range-declaration is redeclared in the outermost block of the substatement, the program is ill-formed."

Moving the kernel to modern C

Posted Mar 1, 2022 16:27 UTC (Tue) by ianmcc (subscriber, #88379) [Link] (2 responses)

In C++ the declaration and the body of the loop are the same scope. In C, the initializer in the for loop establishes its own scope, so there are actually two scopes created with a C for loop. This was unintended behavior in C, and a defect report was raised about it, but it seems it wasn't seen as important enough. http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2257.htm...

After all, who would write such code anyway? It's a strange thing to take issue with.

Moving the kernel to modern C

Posted Mar 1, 2022 21:58 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (1 responses)

> In C++ the declaration and the body of the loop are the same scope.

for (init-statement condition[opt] ; expression[opt]) statement

The body of the for loop is just *statement*. If the declaration were in that scope it wouldn't survive from one iteration of the loop to the next or be visible in *condition* or *expression*. Declarations in *init-statement* are scoped over the entire for loop, not just the body.

Normally a compound statement within *statement* would introduce its own separate block scope *below* the level of *statement*, but in C++ the lines are blurred between the body of the for loop and the *inside* of the compound statement. In other words, I would expect this to be a redeclaration error, because `char i` and `int i` are declared in the same scope (note that all the examples in the standard are of this form):

for (int i = 0; i < N; ++i)
    char i = 7;

but not this, because `char i` is declared in the new *nested* scope created by the compound statement and not directly in the body of the for loop:

for (int i = 0; i < N; ++i) {
    char i = 7;
}

Contrast this with the following code which the standard (C++20 draft) claims is "equivalent" to the second example "except that names declared in the init-statement are in the same declarative region as those declared in the condition, and except that a continue in statement (not enclosed in another iteration statement) will execute expression before re-evaluating condition":

{
    int i = 0;  /* init-statement */
    while (i < N  /* condition */) {
        { char i = 7; }  // statement
        ++i;
    }
}

In the "equivalent" while loop version there is clearly no redeclaration error—the `char i` declaration is within not just one but two levels of compound statements under the while loop and the scope where `int i` was declared.
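
To make the comparison concrete, here is a sketch of both forms as complete C functions. The function names and the runs counter are invented for illustration. A C compiler accepts both, while a C++ compiler would be expected to reject the first one with the redeclaration error shown earlier in the thread.

```c
/* for-loop form: 'char i' lives in the block nested inside the loop scope. */
int count_with_for(int n)
{
    int runs = 0;

    for (int i = 0; i < n; ++i) {
        char i = 7;              /* legal C, ill-formed C++ */
        runs += (i == 7);
    }
    return runs;
}

/* The "equivalent" while-loop form from the standard's description. */
int count_with_while(int n)
{
    int runs = 0;

    {
        int i = 0;                               /* init-statement */
        while (i < n) {                          /* condition */
            { char i = 7; runs += (i == 7); }    /* statement */
            ++i;                                 /* expression */
        }
    }
    return runs;
}
```

In both functions the loop counter and the shadowing variable are separate objects, so each loop runs n times.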

> After all, who would write such code anyway? It's a strange thing to take issue with.

Whether you would write that by hand or not, it's an unnecessary (and IMHO completely pointless) complication which moreover breaks compatibility with C. Redeclaration conflicts could appear as a result of macro expansion or other code generation, not just in hand-written code.
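
A contrived sketch of how that can happen through macro expansion. The macro WITH_SCRATCH and the function sum_scratch are hypothetical, invented only for this example. Nobody wrote a second 'i' by hand at the call site, yet the expansion puts one directly in the loop body.

```c
/* Hypothetical macro that declares a scratch variable which happens to be named i. */
#define WITH_SCRATCH(init) long i = (init)

long sum_scratch(int n)
{
    long total = 0;

    for (int i = 0; i < n; ++i) {
        WITH_SCRATCH(100);   /* expands to: long i = (100); */
        total += i;          /* uses the scratch variable, not the counter */
    }
    return total;
}
```

This compiles as C, where the scratch variable simply shadows the counter inside the body; the same source fed to a C++ compiler would be expected to fail with a redeclaration error.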

Moving the kernel to modern C

Posted Mar 2, 2022 8:56 UTC (Wed) by ianmcc (subscriber, #88379) [Link]

You've got the history the wrong way around. The behaviour of C++ here hasn't changed since it was first standardized in 1998. At that time, C didn't allow a declaration in a for statement. C99 borrowed the wording from the C++ Annotated Reference Manual, without realizing that the wording had been updated during the C++ standardization process. So C introduced an incompatibility with C++, not the other way around. The C standards committee documents are very clear that this was accidental, not intentional.

The bottom line is that C++ will flag an error in some instances of very dubious code that is most likely a bug anyway (i.e. declaring a variable that shadows the loop control variable) where C99 would allow it. Neither standards committee sees it as something worth the bother of fixing. If you really did intend to introduce a shadow declaration, the simple fix is to enclose it in another compound statement.
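
A sketch of that fix, with made-up names. The extra pair of braces moves the shadowing declaration out of the outermost block of the loop body, so the code should be accepted by both C and C++ compilers.

```c
long sum_shadow_portable(int n)
{
    long total = 0;

    for (int i = 0; i < n; ++i) {
        {
            long i = 100;    /* shadows the counter in a nested block */
            total += i;
        }
    }
    return total;
}
```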

Moving the kernel to modern C

Posted Feb 28, 2022 17:14 UTC (Mon) by ianmcc (subscriber, #88379) [Link]

See also http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1865.htm

