Fixing programmers

Posted Mar 13, 2019 20:38 UTC (Wed) by rweikusat2 (subscriber, #117920)
In reply to: Fixing programmers by rahulsundaram
Parent article: Cook: security things in Linux v5.0

JFTR: This was already a feature of the original lint dating back to about 1977.

Fixing programmers

Posted Mar 13, 2019 20:45 UTC (Wed) by rahulsundaram (subscriber, #21946) [Link] (49 responses)

> JFTR: This was already a feature of the original lint dating back to about 1977.

Lots of tooling existed before but being part of the compiler makes them more readily available and even better if it is the default.

Fixing programmers

Posted Mar 13, 2019 21:38 UTC (Wed) by rweikusat2 (subscriber, #117920) [Link] (48 responses)

Clang/ gcc copying features from lint doesn't make these any more interesting than they were before. The idea that programmers should add cute little comments like

/* fallthru */

to express that they were actually planning to use a certain language feature in a machine-readable way is seriously old. The effect of this is that features someone considers particularly disagreeable become somewhat more complicated to use and that the code becomes cluttered (to a degree) with completely useless comments. Presence or absence of such comments doesn't indicate anything about correctness or incorrectness of the code.

Fixing programmers

Posted Mar 14, 2019 6:33 UTC (Thu) by epa (subscriber, #39769) [Link] (42 responses)

That would be true if someone were mindlessly adding ‘fall through’ markers in every place the language feature is used. That would indeed be a completely pointless exercise. But as I understand it, before adding the marker you first audit the code to make sure falling through is deliberate and not an oversight. Given the number of bugs that have been found in the past this auditing is a good use of time and likely to uncover and fix more bugs. When new code is written, having to add the marker serves as a reminder to the programmer to think about whether falling through is wanted. If programmers were machines this reminder would not be helpful. But experience shows that even the most highly skilled programmers occasionally make mistakes, and this is one of the more common of then, yet easy to prevent with a bit of tooling.

Fixing programmers

Posted Mar 14, 2019 16:19 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link] (41 responses)

This "reminder" that a very vocal group of people has been disagreeing with pretty much any C design decision since 197x isn't useful. A much better idea would be to push for a language extension with the desired semantics, assuming some set of desired semantics can ever be agreed on. The sorry mess of Perl given/ when which will probably forever have someone's weird ideas about DWIM operators ("dumb matching") tied to its ankles should serve as a warning here. This would help to avoid the problem (if there is a problem) instead of just "keep on nagging for a few more decades!".

Fixing programmers

Posted Mar 14, 2019 18:31 UTC (Thu) by HelloWorld (guest, #56129) [Link] (38 responses)

> This "reminder" that a very vocal group of people has been disagreeing with pretty much any C design decision since 197x isn't useful. A much better idea would be to push for a language extension with the desired semantics, assuming some set of desired semantics can ever be agreed on.
How is a "fallthrough comment" at the end of a case branch not a language extension, assuming that its presence is enforced by the compiler?

Besides, this has been standardized in C++17. The only reason it's not in C is that the C committee is run by a bunch of stick-in-the-muds.

Fixing programmers

Posted Mar 14, 2019 21:19 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link] (35 responses)

What's so wonderful about always trying to misunderstand something?

It is claimed that the semantics of switch, specifically, that it provides a way to jump to a certain location in a block but no automatic exit from there before the code tagged with the next case-label starts, are problematic. Assuming this problem exists (cf next paragraph), adding a metasyntactical sort-of language extension requiring an explicit something in both cases doesn't solve it. It just punishes people who need the fallthrough semantics. What would solve it would be to add another syntactic construct providing an actual multiway conditional, where the code blocks associated with the cases are independent of each other and at most one will be executed.

As to the "problematic", are we perhaps totemically repeating some "But that's not how I think it should be done!"-complaints someone initially made at least 40 years ago which since keep being repeated out of habit? According to the numbers in the text, about 99% of the warnings were false positives. Comparing the number of case-labels in a current -stable tree with the number of missing breaks indicates that this aspect of the code was correct in about 99.99% of all cases. This doesn't look like a "common" problem to me.

Fixing programmers

Posted Mar 14, 2019 23:03 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (30 responses)

> What would solve it would be to add another syntactic construct providing an actual multiway conditional, where the code blocks associated with the cases are independent of each other and at most one will be executed.
Why? Switch is fine for this. In 99% of cases it's NOT used as computed goto, so implicit fallthrough is actively harmful.

That's why other more modern languages (C#, Go, ...) have ditched that stupid semantics.

And yes, it's stupid. So stupid that the very first linter for C had the "forgotten break" inspection.

Fixing programmers

Posted Mar 15, 2019 15:56 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link] (29 responses)

"Switch is fine except that it's not"?

I agree that the fallthrough is mostly useless and that all the time people and computers wasted by typing or processing

break;

statements could have been put to better uses. But overloading the syntax supposed to be used for inline documentation so that yet more useless code has to be added is not the answer, it's the problem taken to the next power.

Fixing programmers

Posted Mar 15, 2019 20:46 UTC (Fri) by myxie (subscriber, #127909) [Link] (28 responses)

If only the original designers of C had thought of using a different keyword to indicate continuing to the next case... but that would have meant another keyword to add to the language... or would it have? :-)

I'm almost scared to look up the semantics of continue inside a switch statement.

Fixing programmers

Posted Mar 15, 2019 21:15 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

continue inside a switch has no special semantics - if you're inside a loop construct then it interacts with the loop, otherwise it's an error.

Fixing programmers

Posted Mar 15, 2019 21:36 UTC (Fri) by madscientist (subscriber, #16861) [Link] (26 responses)

Continue and return work the same inside a switch statement as outside a switch statement. They are not special.

The main problem is K&R chose the wrong "default" behavior. Usually you DON'T want to continue, so that should have been the default (that is, the current case is automatically terminated by the next case or the end of the statement), and there should be a special keyword used only when you DO want to fall through.

The one exception is empty cases ("case a: case b: case c: foo(); break;") A special dispensation could have been made that the keyword is not needed to fall through unless there are other statements in the case: empty cases don't need the keyword.

Of course, you need a different keyword. They could have used "continue", but both "break" and "continue" are already useful from within switch statements (to break out of or continue an enclosing loop) so using either one of them as meaningful inside a switch was (IMO) not a good decision. I've definitely run into situations where I wanted to break out of a loop from within a switch statement and having "break" overloaded is annoying.

A "fallthrough" or "nobreak" keyword would have been fine. Or if you REALLY didn't want to add a keyword you could have played tricks with "goto", maybe, allowing "goto case X;" where "case X:" existed somewhere in the current switch statement, plus "goto default;"

But that's not the language we have, so FALLTHROUGH-type comments that are recognized by the compiler are great things and everyone should use them. Even if they're weren't recognized by the compiler, they're recognized by the next programmer to come along and read the code, and make clear that you didn't forget something, so they should still always be used.

Fixing programmers

Posted Mar 15, 2019 21:44 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link] (22 responses)

> But that's not the language we have, so FALLTHROUGH-type comments that are recognized by the compiler are great things and
> everyone should use them. Even if they're weren't recognized by the compiler, they're recognized by the next programmer to come > along and read the code, and make clear that you didn't forget something, so they should still always be used.

They make clear that someone was using lint (or similar). Not particularly interesting and technically useless as the default case is - well - the default case. Whether or not this is correct is a different conversation, but that's something which can only be determined by looking at the code. There's no more reason to assume that two cases with fallthrough are correct or incorrect than two cases seprated by a break;. I've actually written code where I forgot to omit the break in the past :-).

Fixing programmers

Posted Mar 15, 2019 22:42 UTC (Fri) by madscientist (subscriber, #16861) [Link] (20 responses)

> They make clear that someone was using lint (or similar). Not particularly interesting and
> technically useless as the default case is - well - the default case. Whether or not this is
> correct is a different conversation, but that's something which can only be determined by
> looking at the code. There's no more reason to assume that two cases with fallthrough
> are correct or incorrect than two cases seprated by a break;. I've actually written code
> where I forgot to omit the break in the past :-).

I'm not going to get drawn too far into this conversation because it's ridiculous, but I'll say that the above is at best missing the point.

A fallthrough comment is not trying to tell you whether or not the code will fall through. Obviously it will, there's no break! The comment is telling you that someone thought about it and INTENDED the fallthrough to happen.

If lint or the compiler warns someone about a missing break and they want a break, they'll obviously just add a break. If they get a warning and they intended to fallthrough, they will add a fallthrough comment. If they didn't add either one then you don't know whether they thought about it and intended the fallthrough, or they just forgot.

Of course, even with the comment the fallthrough behavior could be a bug. Code has bugs. But the comment is not telling you the code is bug-free, it's telling you that it's intentional, rather than an omission. That's still a big benefit.

Fixing programmers

Posted Mar 17, 2019 20:29 UTC (Sun) by rweikusat2 (subscriber, #117920) [Link] (19 responses)

> A fallthrough comment is not trying to tell you whether or not the code will fall through. Obviously it will, there's no break! The
> comment is telling you that someone thought about it and INTENDED the fallthrough to happen.

It's usually safe to assume that code came to be in a certain form because someone was convinced it would be the right way to solve a particular problem. But that's not particularly interesting. Interestsing is "is it the right way".

Fixing programmers

Posted Mar 18, 2019 13:49 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (18 responses)

Depends. Sometimes "uninteresting" parts of a patch are done at a late (or early) hour and is just more prone to typos and errors. Reviews are invaluable to every coder, from those learning anew all the way up to Linus. One of the first things I do after opening a merge request is do another look over the code. I've lost count of how many 5-minute-later pushes I've made because of that. Or if I find a bigger issue, I at least leave a comment that something looks weird and might need more thought.

Fixing programmers

Posted Mar 18, 2019 16:14 UTC (Mon) by jezuch (subscriber, #52988) [Link]

I do that too, *and* I do the initial commit using git commit -p as a kind of self-review. Catches plenty of stuff I wanted to do or did temporarily but forgot etc.

Fixing programmers

Posted Mar 18, 2019 19:40 UTC (Mon) by rweikusat2 (subscriber, #117920) [Link] (16 responses)

I don't understand what that's supposed to mean in the given context.

I was trying to express two things:

1) Noting that a break is absent isn't sufficient grounds to assume that this must have been an oversight unless there's an a priori conviction that this is a very likely cause for an absent break. Considering the "99% false positives", such a conviction doesn't seem sensible to me.

2) Computers execute code as it was written, not as it was intended to be written. An absent break which was an oversight isn't necessarily an error. And neither is a conscious omission or a present break necessarily correct.

Fixing programmers

Posted Mar 18, 2019 19:53 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (15 responses)

As a reviewer, I would want a positive indication of intent for either behavior. A lack of either break or a fallthrough comment/attribute is an automatic request for "please clarify". The goal isn't to make code writer's jobs easier, but those who have yet to come and read the code.

Fixing programmers

Posted Mar 18, 2019 21:46 UTC (Mon) by rweikusat2 (subscriber, #117920) [Link] (14 responses)

As I've already explained: Intent doesn't matter. Correctness does. YMMV.

Fixing programmers

Posted Mar 18, 2019 22:11 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (9 responses)

Actually you got it wrong. Correctness doesn’t matter. The intent does.

Fixing programmers

Posted Mar 18, 2019 22:29 UTC (Mon) by rweikusat2 (subscriber, #117920) [Link] (8 responses)

If you think so, that's your opinion and not mine.

As person who spends a seriously lot of time working with other people's code (and has done so for about 15 years), I can assure you that I don't give $random_small_quantity_of_money for documentation of "programmer intent", especially not in form of otherwise uninformative comments. I need to know what the code does, not what someone believed it should be doing.

Fixing programmers

Posted Mar 18, 2019 22:33 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> I need to know what the code does, not what someone believed it should be doing.
So you need to know the intent. Duh. You're just deluding yourself at this point.

Correctness checking is TRIVIAL, it's not even worthy of mentioning. Checking of intent is anything but. And that's exactly why modern computer languages try to make it easier for developers to express their intent through code.

Fixing programmers

Posted Mar 18, 2019 22:39 UTC (Mon) by sfeam (subscriber, #2841) [Link]

I believe you have this backwards. What the code does can be determined in the absence of comments. The intent, not so much. That is why comments are valuable for bug-finding. Any place where the documented intent does not match the observed actual behavior is candidate for causing problems.

Fixing programmers

Posted Mar 19, 2019 12:24 UTC (Tue) by anselm (subscriber, #2796) [Link] (5 responses)

The goal of computer programming is to write code that does what it is supposed to be doing. You can use the code itself to figure out what it does, but you can't use the code itself to figure out whether it does what it is supposed to be doing.

It's very easy to write code that does something. Writing code that does what it it supposed to do is a lot harder, and requires outside context so you can determine when you're done. This is why in Real Life™ we have comments, specifications, unit tests, and so on – all to be able to figure out whether code does what it is supposed to do.

Fixing programmers

Posted Mar 19, 2019 14:14 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (4 responses)

There are a few fields where I can see the actual behavior of the code is of utmost importance, damn the intent. Static analysis and zero day hunting to name a few. In these cases, what the intent was actually doesn't matter and correctness is all that counts. However, these fields usually result in changes to the analyzed code when something "interesting" is found, so intent can matter again at that point.

Fixing programmers

Posted Mar 19, 2019 14:31 UTC (Tue) by NAR (subscriber, #1313) [Link] (3 responses)

what the intent was actually doesn't matter and correctness is all that counts.

How can you check correctness when you don't know what that code is supposed to do?

Fixing programmers

Posted Mar 19, 2019 14:43 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (2 responses)

For zero-day hacking, the intent doesn't matter. All that matters is that you can thread your special arguments through to some internal state that allows you access to something else that you normally wouldn't be able to access. In *fixing* the issue, intent again enters the picture since just closing the hole naïvely can block some use case that is supposed to be intended (or break some subtle backwards compatibility). Static analysis is similar. It's goal is to detect what the actual code does, match it against code smells and flag such code. Intent then comes in when you either suppress the notice or change the code to make the code not wrong.

Fixing programmers

Posted Mar 19, 2019 14:55 UTC (Tue) by NAR (subscriber, #1313) [Link] (1 responses)

that allows you access to something else that you normally wouldn't be able to access.

You're making the assumption that this particular access was not intended. So intent matters.

Fixing programmers

Posted Mar 19, 2019 14:59 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Granted, but I feel that the intent that matters is a much higher-level construct in black hat hacking than "should this case statement have a break statement?".

Fixing programmers

Posted Mar 19, 2019 10:02 UTC (Tue) by NAR (subscriber, #1313) [Link] (3 responses)

Intent doesn't matter. Correctness does.

Let's say I have this code snippet: a = b + c; Is it correct?

Fixing programmers

Posted Mar 19, 2019 21:44 UTC (Tue) by neilbrown (subscriber, #359) [Link] (2 responses)

> Let's say I have this code snippet: a = b + c; Is it correct?

Yes it is.
Your statement "I have this code snippet: a = b + c" is self-evidently correct.

Fixing programmers

Posted Mar 19, 2019 22:03 UTC (Tue) by pizza (subscriber, #46) [Link] (1 responses)

By that definition, the Therac-25 "correctly" killed three people.

Fixing programmers

Posted Mar 19, 2019 23:24 UTC (Tue) by neilbrown (subscriber, #359) [Link]

> By that definition, the Therac-25 "correctly" killed three people.

Sorry to correct you, but I think you mean "It is correct that 'The Therac-25 killed three people' ".

And this is the point - when people start using broad terms like "intent" and "correct" without ensuring that all corespondents are using them in the same sense, you can hardly expect a useful conversation to result.

(wouldn't it be great if people would think about what they write, instead of just writing about what they think).

Fixing programmers

Posted Mar 19, 2019 10:37 UTC (Tue) by HelloWorld (guest, #56129) [Link]

> There's no more reason to assume that two cases with fallthrough are correct or incorrect than two cases seprated by a break;. I've actually written code where I forgot to omit the break in the past :-).
There are actually two reasons:
– the case where you want to use break is vastly more common than the case where you want to fall through
– it's easier to forget adding the break keyword than it is to accidentally add the fall-through attribute

> I've actually written code where I forgot to omit the break in the past :-).
The article mentions 20 cases of missing break statements that were uncovered by this work, while it doesn't mention any cases like the one you describe. It's safe to say that missing break statements are a much more frequent occurrence.

Fixing programmers

Posted Mar 15, 2019 23:15 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

C# blocks implicit fallthrough and requires an explicit goto statement referencing another switch label. goto might be considered harmful in general, but its use there seems consistent with the way it's generally used in the kernel.

Fixing programmers

Posted Mar 16, 2019 23:53 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

Maybe we could add an optional keyword?

switch nofallthrough {
case a:
case b:
}

As for break being overloaded, can't remember the language but it was something like

labela: for i = 1 to 10
labelb: for j = 10 to 1 step -1
do something
if x continue labela
next
next

so break or continue could take labels to indicate which construct they were meant to break out of.

Cheers,
Wol

Fixing programmers

Posted Mar 17, 2019 0:28 UTC (Sun) by ABCD (subscriber, #53650) [Link]

Among possibly other languages, Java allows this:

labela:
for (int i = 1; i <= 10; i++) {
    labelb:
    for (int j = 10; j >= 1; j--) {
        do_something(i, j);
        if (x) continue labela;
    }
}

Fixing programmers

Posted Mar 15, 2019 7:11 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> Assuming this problem exists (cf next paragraph), adding a metasyntactical sort-of language extension requiring an explicit something in both cases doesn't solve it.
Actually it does.

Fixing programmers

Posted Mar 15, 2019 13:33 UTC (Fri) by anselm (subscriber, #2796) [Link] (2 responses)

What would solve it would be to add another syntactic construct providing an actual multiway conditional, where the code blocks associated with the cases are independent of each other and at most one will be executed.

That would help. But programs using this construct would no longer be C programs. If sticking to a popular and well-understood language standard is of value to you (which makes sense in a large community like the Linux kernel crowd), then this is something not to be thrown away lightly.

OTOH, code that uses the “metasyntactical sort-of language extension” is valid C whether the compiler pays attention to it or not. But if you have a compiler that does pay attention, it's probably a good idea to exploit that, much like we exploit our compiler's paying attention to all sorts of other little things that might otherwise slip us by (think “if (foo = bar) …”).

Like a surgical scalpel, the C programming language is conceptually pretty simple but using it still requires considerable care and attention to detail. I haven't done lots of C programming recently but when I do I'm humble enough to appreciate all the assistance I can get.

Fixing programmers

Posted Mar 15, 2019 15:51 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link] (1 responses)

Linux makes heavy use of gcc extensions already, so, their would be little, additional damage done here. Further, my opinion on this is that C would benefit more from proper multiway conditional than from other, invasive additions like threading support, IOW, if someone would implement support for something like this in a compiler I could use, I'd certainly be using it and expect that the standardized language would eventually catch up.

BTW, I've come to dislike if (a = b) warnings as well: Mistyping == as = is another, extremely rare error and some things will always have to be found and fixed via code rewiev and/ or testing. Algorithmic errors are far more common than any kind of syntax misuse.

Fixing programmers

Posted Mar 15, 2019 17:52 UTC (Fri) by anselm (subscriber, #2796) [Link]

Mistyping == as = is another, extremely rare error and some things will always have to be found and fixed via code rewiev and/ or testing.

You want code review and testing, but you also want reasonable compiler warnings. If a developer is made aware of a “==” vs. “=” typo by a compiler warning while they're writing the code in the first place, the issue doesn't even come up in code review or testing (where it would be more expensive, in terms of developer time, to detect and fix). This is what in security circles we call “defense in depth”.

Algorithmic errors are far more common than any kind of syntax misuse.

Yes, but the syntax problems still exist and are often easier to detect and fix. This is like saying garbage in the street is not important because there are millions of children starving in Africa.

Fixing programmers

Posted Mar 15, 2019 15:28 UTC (Fri) by jrigg (guest, #30848) [Link] (1 responses)

> the C committee is run by a bunch of stick-in-the-muds.
This could be considered a feature.

Fixing programmers

Posted Mar 16, 2019 9:22 UTC (Sat) by HelloWorld (guest, #56129) [Link]

> This could be considered a feature.
Yes, by other stick-in-the-muds.

Fixing programmers

Posted Mar 15, 2019 8:53 UTC (Fri) by epa (subscriber, #39769) [Link] (1 responses)

OK, so I can only speak for my personal experience here. When writing switch statements I sometimes forget the 'break;' at the end of one case. This happens by accident rather more frequently than the times when I deliberately want to use fallthrough. So for me, it saves time to have a warning or error from the compiler when implicit fallthrough is used -- it saves my time in headscratching and debugging when the code doesn't work, and may in some cases stop buggy code from being used in production systems. This has nothing to do with "disagreeing" with the C standard. There are plenty of other cases where the standard allows a particular construct but I prefer to get a warning, because it is empirically more likely to indicate a mistake on my part rather than a deliberate choice, and so the warning helps me be a more productive programmer. YMMV.

A language extension is a great idea. GCC effectively provides a GCC-specific extension to the C language here -- or perhaps you could call it a restriction, since the code allowed with -Wimplicit-fallthrough is a subset of that without the warning. Either way, it's useful in practice to many programmers, and Linux has started making use of it. Pushing it upstream to the ISO C standard would be better still of course.

Fixing programmers

Posted Mar 15, 2019 10:28 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> Pushing it upstream to the ISO C standard would be better still of course.
Or people could just switch to C++ where is this already standardized, along with dozens of other useful features.
https://en.cppreference.com/w/cpp/language/attributes/fal...

Fixing programmers

Posted Mar 16, 2019 5:09 UTC (Sat) by scientes (guest, #83068) [Link] (4 responses)

It is also a bad idea to parse comments. It breaks distcc for example, so I submitted a patch fixing this stupidness

https://lkml.org/lkml/2019/3/15/792

Fixing programmers

Posted Mar 16, 2019 9:25 UTC (Sat) by HelloWorld (guest, #56129) [Link] (3 responses)

How does this break distcc?

Fixing programmers

Posted Mar 16, 2019 21:32 UTC (Sat) by zlynx (guest, #2285) [Link] (2 responses)

If I remember correctly (it's been many years since I last used distcc), distcc sends each compiler a preprocessed source file. It does this to avoid the need to send every include file along with the source file.

Preprocessed files do not include comments.

To fix this, you can of course use whatever compiler flag disables this switch case check.

Fixing programmers

Posted Mar 17, 2019 10:49 UTC (Sun) by HelloWorld (guest, #56129) [Link]

gcc also supports using an attribute instead of a comment, that will not be removed by the preprocessor.

Fixing programmers

Posted Mar 17, 2019 12:30 UTC (Sun) by madscientist (subscriber, #16861) [Link]

GCC provides options that allow the preprocessor to leave comments in its output files.

However, using attributes is probably a better way to go in general.