Not Again

Posted Apr 17, 2012 9:12 UTC (Tue) by juliank (subscriber, #45896)
In reply to: Not Again by deepfire
Parent article: PHP: a fractal of bad design (fuzzy notepad)

OK, formally it's not undefined but unspecified. But for evaluation order, that basically doesn't make much difference. The draft I have (N1256, which is basically C99 + TC3) specifically says so in section 6.5:

"The grouping of operators and operands is indicated by the syntax. Except as specified later (for the function-call (), &&, ||, ?:, and comma operators), the order of evaluation of subexpressions and the order in which side effects take place are both unspecified."


Not Again

Posted Apr 17, 2012 20:24 UTC (Tue) by jzbiciak (✭ supporter ✭, #5246)

Yep, and that bit me when I ported Doom to one of our DSPs. The code used the idiom P_Rand()-P_Rand() all over the place to get signed random numbers that were roughly normally distributed. GCC evaluated left-to-right, DSP compiler right-to-left.

While it wouldn't really affect the game play much, it was enough of a change that the prerecorded demo loops didn't function properly, because they expected the engine to be 100% deterministic. Once I found and fixed this error, they started working.
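A minimal sketch of that kind of fix, assuming a simple deterministic generator. The names next_rand(), seed_rand() and the generator itself are hypothetical stand-ins, not Doom's actual P_Rand(). Storing each call's result in its own statement forces the order, so every compiler produces the same sequence:

```c
/* Hypothetical stand-in for a deterministic game PRNG (not Doom's P_Rand()). */
static unsigned long rng_state = 1;

void seed_rand(unsigned long seed) { rng_state = seed; }

/* Returns a value in 0..255 and advances the generator state. */
int next_rand(void) {
    rng_state = (rng_state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (int)((rng_state >> 16) & 0xffUL);
}

/* Order-dependent: the compiler may evaluate either call first,
 * so the sign of the result can differ between compilers. */
int signed_rand_unsequenced(void) {
    return next_rand() - next_rand();
}

/* Portable: each call is its own statement, so the first call
 * always completes before the second one starts. */
int signed_rand_sequenced(void) {
    int first = next_rand();
    int second = next_rand();
    return first - second;
}
```

With the same seed, signed_rand_sequenced() returns the same value on every conforming compiler, while signed_rand_unsequenced() may return that value or its negation.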

C is predictable when your program is bug free and relies on no implementation defined, unspecified or underspecified behavior. Such a program is extremely tedious to write, and next to impossible even for experienced practitioners. That's because implementation defined behavior is pervasive and useful.

I remember coming across a long, tiresome thread in comp.lang.c once where the goal was to bit-reverse a large integer array in 100% portable ANSI C. I don't recall if anyone succeeded, but the thread went far longer than I expected. Everyone made subtle assumptions about the environment, despite the guarantee that arithmetic on unsigned int wraps around modulo a power of two.
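As a rough illustration of how such assumptions can be avoided, here is a sketch that reverses the bits of a single unsigned int without assuming its width. It reads the problem as per-element reversal, which may not be what that thread meant, and the function names are hypothetical. The number of value bits is derived from UINT_MAX rather than from sizeof, since sizeof would also count any padding bits:

```c
#include <limits.h>

/* Count the value bits of unsigned int by shifting UINT_MAX down to zero.
 * This avoids assuming that the width equals sizeof(unsigned) * CHAR_BIT. */
static unsigned value_bits(void) {
    unsigned n = 0;
    for (unsigned m = UINT_MAX; m != 0; m >>= 1) {
        n++;
    }
    return n;
}

/* Reverse the value bits of x: the lowest bit becomes the highest, and so on.
 * Only unsigned shifts and masks are used, all of which are fully defined. */
unsigned reverse_bits(unsigned x) {
    unsigned r = 0;
    for (unsigned i = value_bits(); i > 0; i--) {
        r = (r << 1) | (x & 1u);
        x >>= 1;
    }
    return r;
}
```

Reversing an array this way would apply reverse_bits() to each element; reversing the array as one long bit string would additionally swap the elements end to end.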

Not Again

Posted Apr 19, 2012 14:19 UTC (Thu) by nye (guest, #51576)

>C is predictable when your program is bug free and relies on no implementation defined, unspecified or underspecified behavior.

Okay, I know what 'undefined' means (nasal demons, etc.); I think I know what 'implementation defined' means; I guess 'underspecified' means that the standard writers didn't consider something quite precisely enough? But what - exactly - does 'unspecified' mean?

The first few Google hits don't distinguish sufficiently clearly for my liking between 'implementation defined' and 'unspecified'.

I'm guessing it's something like "a given compiler implementation can do whatever it likes, including behave non-deterministically, but your program remains valid so all other well-defined constructs remain well-defined", whereas 'implementation defined' is the same but without "behave non-deterministically"?

Not Again

Posted Apr 19, 2012 14:43 UTC (Thu) by anselm (subscriber, #2796)

»Implementation-defined« means that there are various options out of which the implementation needs to pick one and adhere to it. For example, an implementation can set the size of »short« arbitrarily (as long as it is not smaller than that of »char« and not larger than that of »int«) but the choice, whatever it turns out to be, must be consistently enforced.
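A small sketch of how a program can query such a choice instead of assuming it. The function names are hypothetical; sizeof and CHAR_BIT are standard. The first function checks the size relationship described above, and the second reports the storage size of »short« in bits as chosen by the implementation:

```c
#include <limits.h>
#include <stddef.h>

/* Returns 1 if short is no smaller than char and no larger than int,
 * which is the relationship described in the comment above. */
int short_size_is_within_bounds(void) {
    return sizeof(char) <= sizeof(short) && sizeof(short) <= sizeof(int);
}

/* Storage size of short in bits on this implementation.
 * The value varies between implementations but is fixed within one. */
size_t short_storage_bits(void) {
    return sizeof(short) * CHAR_BIT;
}
```

The result of short_storage_bits() can differ from one compiler or target to another, but it never changes within a single implementation.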

With undefined behaviour the implementation can do anything it wants and doesn't have to do the same thing twice even in the same situation.

Not Again

Posted Apr 19, 2012 17:22 UTC (Thu) by nix (subscriber, #2304)

... and doesn't have to do what the Standard says elsewhere in the program either, as you are clearly no longer using the Standard as reference. (Or that's what it says, though as a quality-of-implementation (QoI) issue compilers try not to produce a program that reformats the disk on every little error.)

Not Again

Posted Apr 23, 2012 8:37 UTC (Mon) by ekj (guest, #1524)

Why are compilers not FORBIDDEN from compiling programs that contain instructions which are "undefined"? Given that a valid program can format the hard disk when it contains undefined instructions, what are the odds that the programmer actually intended to say: "at this spot, do whatever random thing"?

Not Again

Posted Apr 23, 2012 9:48 UTC (Mon) by mpr22 (subscriber, #60784)

With regard to your first sentence: because people want to be able to compile programs which contain functions that perform arithmetic on signed integers, and the results of signed integer arithmetic overflow may be outside the implementation's reasonable ability to control. (Unsigned integer arithmetic, on the other hand, has strictly defined overflow behaviour, summarized as (UINT_MAX + 1) == 0.)
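A short sketch of the difference, with hypothetical function names. The unsigned increment simply wraps, while the signed addition has to be checked before it is performed, because the overflowing addition itself is what the standard leaves undefined:

```c
#include <limits.h>
#include <stdbool.h>

/* Unsigned arithmetic wraps: wrap_increment(UINT_MAX) yields 0. */
unsigned wrap_increment(unsigned x) {
    return x + 1u;
}

/* Signed addition with an overflow check done BEFORE adding.
 * Returns false and leaves *out untouched if a + b would overflow. */
bool checked_add(int a, int b, int *out) {
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
        return false;
    }
    *out = a + b;
    return true;
}
```

A compiler cannot in general prove at compile time that the check in checked_add() is unnecessary or that an unchecked a + b is safe, which is why rejecting every program that might overflow is not practical.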

With regard to your second sentence: given a contradiction, everything is true.

Not Again

Posted Apr 23, 2012 9:57 UTC (Mon) by ekj (guest, #1524)

Wouldn't arithmetic operations on signed integers that overflow be unspecified rather than undefined? I was thinking mostly of constructs that are by themselves -always- undefined, not subject to "undefined *if* the sum of these two overflows", which the compiler cannot generally know about beforehand.

What is the rationale for letting "void main(void)" compile and produce a program that you can run (if you dare!) despite the fact that it means, according to the C-spec: "Do nothing, or anything whatsoever."

Not Again

Posted Apr 23, 2012 10:40 UTC (Mon) by anselm (subscriber, #2796)

>What is the rationale for letting "void main(void)" compile and produce a program that you can run (if you dare!) despite the fact that it means, according to the C-spec: "Do nothing, or anything whatsoever."

According to the C standard, the prohibition on prototypes for »main« other than »int main(void)« and »int main (int, char **)« applies only to what the standard calls a »hosted environment«, i.e., an operating system like Linux. The standard makes certain stipulations about how such an environment is supposed to call into a C program, and this is where the restrictions on »main()« come from. The output from a C compiler could, however, be useful in what the standard calls a »freestanding environment«, where – among other differences – the implementation defines how a program is actually started. It could force a different prototype for »main()« or call a differently-named function altogether. (An obvious example of a »freestanding environment« would be the Linux kernel, which runs on the bare machine, without the benefit of an underlying operating system, since of course it is supposed to be the operating system that would make up part of a »hosted environment« for ISO C.)

Having said that, it is probably safe to say that 99%+ of programs compiled with, say, GNU C, are intended to be run in the hosted environment, which is why, in the highly recommended »-Wall« mode, gcc emits warnings complaining about non-conforming definitions of »main()« unless the »-ffreestanding« option is specified on the command line. If you're serious you could use the »-Werror=main« option to turn this warning into an error.

Not Again

Posted Apr 23, 2012 10:42 UTC (Mon) by mpr22 (subscriber, #60784)

The result of void main(void) is only formally undefined if you're targeting a hosted implementation (which, admittedly, application programmers generally are). If you're using a freestanding implementation, then both the type and the name of your program's entry point are implementation-defined, so main might not be magic and even if it is, it might legitimately have a return type of void.

(Note to self: check whether it's defining void main(/*whatever*/) or returning from main having done so, that crosses the undefined-behaviour threshold on hosted implementations.)

Not Again

Posted Apr 20, 2012 5:19 UTC (Fri) by jzbiciak (✭ supporter ✭, #5246)

Here, go read the doc: www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf

Here's my understanding, such as it is:

  • Implementation defined: We expect implementations to pick a behavior, tell the user about it, and stick to it. It's something you can rely on, but only in that implementation. Example: whether unadorned char is signed or unsigned.
  • Undefined: Here lie demons. An implementation is entirely within its rights to call system("nethack") or something else equally capricious when it sees one of these. Example: void main(void).
  • Unspecified: The environment must behave "reasonably", as in, it's not allowed to reformat your hard drive. But it doesn't have to document its choices, and it can behave differently compile-to-compile. It can do whatever is convenient with best effort. Example: in f() + g(), which gets called first?
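A small sketch of the first and third cases, with hypothetical helper names. The first function reports the implementation's documented choice for plain char. The rest records the order in which f() and g() are called: either order is allowed, but the sum is the same both ways because neither function depends on the other:

```c
#include <limits.h>

/* Implementation-defined: plain char is signed on some
 * implementations and unsigned on others; CHAR_MIN tells us which. */
int plain_char_is_signed(void) {
    return CHAR_MIN < 0;
}

/* Log of which function ran first, kept as a short string. */
static char call_log[3];
static int call_count = 0;

static int f(void) { call_log[call_count++] = 'f'; return 1; }
static int g(void) { call_log[call_count++] = 'g'; return 2; }

/* Unspecified: the log may end up as "fg" or "gf",
 * but the returned sum is 3 in either case. */
int sum_and_log(void) {
    call_count = 0;
    int sum = f() + g();
    call_log[call_count] = '\0';
    return sum;
}

/* Returns the order recorded by the most recent sum_and_log() call. */
const char *last_call_order(void) {
    return call_log;
}
```

A program that only uses the sum is portable; a program that inspects last_call_order() and acts on it is relying on unspecified behavior.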

Make sense?

Not Again

Posted Apr 24, 2012 10:56 UTC (Tue) by nye (guest, #51576)

>Make sense?

Yes, thank you.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds