LWN: Comments on "Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks" http://lwn.net/Articles/544123/ This is a special feed containing comments posted to the individual LWN article titled "Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks". Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545645/rss 2013-04-02T22:56:40+00:00 jwakely <div class="FormattedComment"> It should be called -Over9000 though<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545637/rss 2013-04-02T21:18:39+00:00 hummassa <div class="FormattedComment"> It's on my default makefile since forever, because clang does not like -O5 and beyond and I am too lazy to look up which is the biggest effective level for each compiler...<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545619/rss 2013-04-02T20:08:02+00:00 nix <div class="FormattedComment"> I've seen people use -O4, -O6, -O64 ("it's a nice round number and higher than 3" he said, so at least he knew what he was aiming for), and of course glibc, of all things, used -O99 for donkey's years. GCC obviously needs an "-Olots" for these people.<br> <p> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545610/rss 2013-04-02T19:17:20+00:00 jwakely <div class="FormattedComment"> You know GCC doesn't have a -O4 optimisation level, right? ;)<br> <p> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545595/rss 2013-04-02T17:30:50+00:00 hummassa I am answering to my own comments in this subthread, and it really feels like I am losing my mind... 
:-D<br/> anyway, I tried this with -O4 and both<pre> for(auto x: d) satd += abs(x); </pre>and<pre> auto satd = accumulate(begin(d), end(d), 0, [](int a, int x) { return a+abs(x); }); </pre>generated the same code, roughly:<pre>
movl (%rax), %edx
movl (%rax), %ecx
addq $4, %rax
sarl $31, %edx
xorl %edx, %ecx
subl %edx, %ecx
leaq 64(%rsp), %r
addl %ecx, %ebx
cmpq %rax, %rdx
jne .L3 #,
</pre>which seemed nice to me. Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545454/rss 2013-04-01T16:57:20+00:00 hummassa <div class="FormattedComment"> Actually, the "properest" version would be<br> <p> auto satd = accumulate(begin(d), end(d), 0, [](int a, int x) { return a+abs(x); });<br> <p> But I suppose that has the potential to be less efficient, at least it involves some function calls here...<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545448/rss 2013-04-01T16:19:32+00:00 hummassa <div class="FormattedComment"> isn't the "proper" c++ version<br> <p> for(auto x: d)<br> satd += abs(x);<br> <p> ?? (generates the same code, no errors and READABLE...)<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/545422/rss 2013-04-01T14:51:48+00:00 khim Screen real estate is not measured in tokens. Inches, centimeters, maybe pixels, but most definitely not tokens. By this measure it's shorter. Is it worth it? That's debatable and depends very much on the individual, but of course it's a separate issue. Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544741/rss 2013-03-27T22:27:45+00:00 HelloWorld <div class="FormattedComment"> <font class="QuotedText">&gt; and the mistake in the chain of reasoning is the very first one where it assumes that the loop variable will never be out of range.</font><br> That's not the chain of reasoning. 
The reasoning is that if the loop variable is out of range, the program's behaviour is undefined, thus not testing the variable is just as valid as testing it or doing something else entirely. <br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544695/rss 2013-03-27T19:23:25+00:00 HelloWorld Use the string.h, Luke! <pre> static inline int framelen(char s[]) { return strnlen(s, 256); } </pre> ;) Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544691/rss 2013-03-27T19:17:04+00:00 HelloWorld <div class="FormattedComment"> <font class="QuotedText">&gt; It may not run faster, but it certainly is shorter</font><br> Uh, no it's not. The obfuscated version is 25 tokens, the sensible one is 22 tokens. Sure, if you use conventional formatting, you'll end up with one more line for the normal version, but nobody says you have to do that...<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544662/rss 2013-03-27T17:43:29+00:00 dlang <div class="FormattedComment"> The balance between clear and concise will vary depending on how familiar you are with the language in question.<br> <p> The obfuscated C contest shows clear examples where concise is far more important than clear.<br> <p> But the line itself is rather fuzzy.<br> <p> Hijacking an example from elsewhere: if you have a bucket filled with water and start punching holes in the bottom, when does it stop being a bucket that leaks and start being a sieve? At some point it will be very clear that you have passed the line, but exactly where the line is is hard to define.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544619/rss 2013-03-27T15:16:22+00:00 redden0t8 <div class="FormattedComment"> Interesting observation, it really made me think.<br> <p> I'm definitely a C-thinker, but over-shortening code (like in this article's example) really makes me cringe. 
I really don't understand the drive to make pieces of code as short as possible. I'm more along the lines of "clear and concise", with "clear" being more important than "concise". I guess you could think of it as writing code so as to optimize the time it takes to read and follow, rather than optimizing the line count.<br> <p> Then again maybe this comes from being a hobbyist programmer and not a professional... maybe I just have a different definition of "clear" lol.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544607/rss 2013-03-27T13:32:39+00:00 nye <div class="FormattedComment"> Oh please. Take your macho trolling somewhere else.<br> </div> access off end of array http://lwn.net/Articles/544482/rss 2013-03-26T16:52:35+00:00 brouhaha Yes, but the 6809 came along much later than the PDP-11, so it's not relevant to discussion of where the C pre/post-increment/decrement operators came from. access off end of array http://lwn.net/Articles/544481/rss 2013-03-26T16:26:16+00:00 tjc <div class="FormattedComment"> I think a warning flag would be a step in the right direction, and maybe as far as things should go. 
-Wall doesn't warn against this sort of thing, but since the "all" in -Wall is not really all, there might already be a flag for this.<br> <p> </div> access off end of array http://lwn.net/Articles/544480/rss 2013-03-26T16:04:28+00:00 hummassa <div class="FormattedComment"> I know for a fact the 6809 microprocessors had some instructions "load/store from pointer with post/pre-auto-increment/decrement" so that one of:<br> <p> a = *b++<br> a = *++b<br> a = *b--<br> a = *--b<br> *b++ = a<br> *++b = a<br> *b-- = a<br> *--b = a<br> <p> was a single instruction; they made it easy to implement real fast stacks and queues, and zero-terminated strings (because "a = *b++" &amp;c set the Z flag if the char was zero).<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544441/rss 2013-03-26T09:44:13+00:00 khim <p>It may not run faster, but it certainly is shorter and a lot of guys (including me) always try to make code shorter.</p> <p>It's funny, really: it looks like I've finally found where these clashes come from. Less than two years ago I had no idea and <a href="http://lwn.net/Articles/470853/">struggled to understand</a>, but now, after a lot of discussions with other guys on an important piece of code in our project, I know that there are two types of programmers: the ones who think about their program in C (C++, C#, Java, JavaScript (uh-oh), PHP (ugh), Python, etc) and the ones who think about their program in English (Hebrew, Mandarin, Russian, whatever).</p> <p>For the "C thinkers" the size of the code is very important (the shorter it is the easier it is to observe large chunks of code at once) and most comments are just a useless distraction (and/or an admission of defeat: what, you mean this piece of code is so convoluted and cryptic that you can't understand it just from the C code... gosh I think it's time to give up and add a couple of comments). 
Sure, the high-level interface must be described in human language (C is great for low-level bit manipulations, but for describing the relationship between an HTML document and the DOM tree created from said document it's too low-level), but everything below it must be understandable from the code.</p> <p>For "English thinkers" comments are a vital piece of information: they expect to fully understand the program from comments alone and perceive the need to actually read C code as something degrading (or as a necessary evil when something does not work). Even if they read C code they usually just compare it to the comment near it (and they become angry when they find no comments to compare the code to). For them the size of the code is less important (because they only ever perceive it in small pieces) and a verbose style is, actually, better (it makes it easier to compare code to comments).</p> <p>I'm not sure which style is better, but I found that C thinkers usually produce fast and efficient code which may contain small, localized bugs (the code in the article is a prime example) while English thinkers produce code which is verbose and slow yet still contains a plethora of bugs - but these bugs are distinctly different: instead of off-by-one errors or a simple "++" vs "--" mixup we have cases where one module produces a subtly broken object which is mishandled by another module and then everything blows up in a third one.</p> <p>Easy to understand why: there is no "safety net" in C thinkers' code, thus localized bugs are easy to miss, but interfaces are very narrow and well-defined, while English thinkers produce code which is locally correct but globally hopeless because there are so many interactions between different pieces of code. 
Think XBox or Wii bootloader code (few bugs in the initial runs which were eventually ironed out and now there are no new bugs in sight) vs JVM code (there are endless bugs with no end in sight - and most of them are because different pieces of code interact "quirkily").</p> access off end of array http://lwn.net/Articles/544438/rss 2013-03-26T09:00:29+00:00 khim <blockquote><font class="QuotedText">Actually, the ++/PDP-11 connection is urban legend -- see "More History", paragraph 2 at this link:<br /><br /> <a href="http://cm.bell-labs.com/cm/cs/who/dmr/chist.html">The Development of the C Language</a></font></blockquote> <p>Well, your own link shows that it's not an "urban legend" but more like oversimplification: <font class="QuotedText">This is historically impossible, since there was no PDP-11 when B was developed. The PDP-7, however, did have a few `auto-increment' memory cells, with the property that an indirect memory reference through them incremented the cell. This feature probably suggested such operators to Thompson; the generalization to make them both prefix and postfix was his own.</font></p> <p>While factually incorrect (C design predates PDP-11) both "++" in C and "(RX)+" in PDP-11's assembler come from the same source.</p> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544436/rss 2013-03-26T08:39:32+00:00 mlopezibanez <div class="FormattedComment"> No, the assumption is how C programs work. It is in general impossible to tell if there is going to be an out-of-bounds access without checking every access. 
If you want code that checks that, then wrap every array access in the equivalent of vector.at() and let the compiler try to remove redundant checks.<br> <p> Of course, GCC could do better at static analysis and warning about such cases, but that is a different problem from optimization, and GCC needs new developers that are interested in such things.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544434/rss 2013-03-26T08:26:20+00:00 mlopezibanez <div class="FormattedComment"> The code that is always true might not have been even written by the user, but come from a system header file, a macro expansion (but not a constant expression), or generated code (very common in C++), from transformations of the code that don't match any original code, etc.<br> <p> Regardless, there are very few GCC developers, so if you think you could do something better, you should give it a try. Sometimes you realize how difficult your obvious thing turns out to be, and other times you realize that it was indeed obvious but nobody had time to do it before. I can tell you from personal experience that there is a lot of the latter in GCC.<br> </div> access off end of array http://lwn.net/Articles/544432/rss 2013-03-26T08:09:51+00:00 alankila <div class="FormattedComment"> I guess there would be many ways to improve C, which largely are about breaking expressions that used to work but which are ugly, confusing and sometimes semantically broken. Perhaps GCC can slowly over time nudge people away from doing multiple things in a single statement -- that definitely sounds like an improvement.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544431/rss 2013-03-26T08:02:09+00:00 jezuch <div class="FormattedComment"> <font class="QuotedText">&gt; Exactly. 
Looks like obfuscated C contest stuff ...</font><br> <p> I guess it's a result of a *very* popular misconception that the more you cram into a single statement the faster it is ;)<br> <p> Seeing how the compiler unwinds all of this stuff is an eye-opening experience. We, humans, have a very limited operating memory; the compiler can analyze much, much larger structures than we imagine it can.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544429/rss 2013-03-26T05:31:07+00:00 bronson <div class="FormattedComment"> I think ifdefs would fall apart too quickly to be of much use.<br> <p> Compiler warnings would too... If the compiler emits 22 "might be true" warnings, and 21 of them are spurious, what are the chances I'll catch the meaningful one? Knowing me, probably next to nil.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544427/rss 2013-03-26T05:10:23+00:00 dlang <div class="FormattedComment"> and the mistake in the chain of reasoning is the very first one where it assumes that the loop variable will never be out of range..<br> <p> The rest of the optimizations make sense, but that first one is optimistic thinking on the part of the compiler writer.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544423/rss 2013-03-26T04:12:28+00:00 iabervon <div class="FormattedComment"> It's not actually able to prove that (or, really, it's not set up to consider proving that type of thing). It's actually just making a series of optimizations: first, it assumes that there won't be an out-of-bounds access, then it determines that this means that the loop can't exit normally (like in my example), then it finds that the return is unreachable, then it finds that the value being calculated is unused, then it finds that nothing is needed except for the infinite loop. 
Each of these optimizations improves the performance of some correct code, and it doesn't have the deeper analysis to notice that it can prove that the loop executes 16 times in violation of the assumption.<br> <p> It doesn't really have an overall knowledge set that could find contradictions; it's got patterns that produce warnings and patterns that produce optimizations, and so it can't tell when optimizations are leading to total nonsense.<br> <p> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544422/rss 2013-03-26T03:28:54+00:00 butlerm <div class="FormattedComment"> The difference is that in the original example the compiler can prove that an out of bounds array access will occur under all input conditions. This should be considered an error. Dividing by a constant zero is a similar example. What possible good could come from the compiler just making something up in a situation like that?<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544417/rss 2013-03-26T00:57:33+00:00 nix <div class="FormattedComment"> The 'nonsense', btw, is that the optimization pass which is coming to conclusions about the conditional test does not care where that test is located: in particular, it doesn't know nor care that it might be bounding a for loop. Generality in optimizations is generally a good thing...<br> <p> </div> access off end of array http://lwn.net/Articles/544402/rss 2013-03-26T00:34:27+00:00 tjc Actually, the ++/PDP-11 connection is urban legend -- see "More History", paragraph 2 at this link: <p><a href="http://cm.bell-labs.com/cm/cs/who/dmr/chist.html">The Development of the C Language</a></p> <p>I think i++ is fine from a syntax point of view, so long as it's a stand-alone statement, where it produces the same code as i += 1. 
But I try to avoid embedding increment operators within expressions that produce easily overlooked side effects.</p> access off end of array http://lwn.net/Articles/544392/rss 2013-03-25T23:13:07+00:00 HelloWorld <div class="FormattedComment"> The only reason i++ and --i were invented is that that made it possible to generate more efficient code for the PDP-11 with simple-minded compilers. There's no reason for them nowadays as i +=1 is almost as short and most languages today also feature a proper for loop.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544388/rss 2013-03-25T21:09:16+00:00 mansr <div class="FormattedComment"> The problem is that the source code is self-contradicting when interpreted strictly. While your observation is correct, the source also implicitly (through the d[k] array access) promises that k &lt; 16. When given conflicting information like this, the interpretation is unpredictable. That is (part of) what undefined behaviour means. Get over it.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544385/rss 2013-03-25T20:37:32+00:00 xorbe <div class="FormattedComment"> OTOH, the compiler should know:<br> #1 k starts as zero<br> #2 k is incremented every loop, which is its only assignment point.<br> #3 k is 16 when the loop stops.<br> <p> Therefore, it also can't be an infinite loop.<br> <p> <font class="QuotedText">&gt; It doesn't know that the test is used for exiting the loop.</font><br> <p> What nonsense is this. "k&lt;16" is the exact test in the for loop! 
Anyways, glad to know the situation is better than the initial blog post.<br> </div> access off end of array http://lwn.net/Articles/544374/rss 2013-03-25T19:27:39+00:00 tjc <div class="FormattedComment"> If you trace this back to the root problem you may come to the conclusion that changing the state of a variable in a non-assignment expression can be more trouble than it's worth, especially if you're dealing with concurrency. A language that allows<br> <p> i++;<br> <p> as a stand-alone statement would be a useful compromise, since it could still be used as the increment statement in a for loop, which is by far the most common idiom for this construct.<br> <p> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544333/rss 2013-03-25T15:43:29+00:00 hthoma <div class="FormattedComment"> <font class="QuotedText">&gt; &gt; for (dd=d[k=0]; k&lt;16; dd=d[++k])</font><br> <p> <font class="QuotedText">&gt; That's.... horrifying.</font><br> <p> Exactly. Looks like obfuscated C contest stuff ...<br> <p> I would guess that if you write it the "normal" way, i.e.<br> <p> for(k = 0; k &lt; 16; k++) {<br> dd = d[k];<br> satd += (dd &lt; 0 ? 
-dd : dd);<br> }<br> <p> the compiler would not optimize the code to an infinite loop and get a better chance to optimize.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544321/rss 2013-03-25T13:41:43+00:00 dlthomas <div class="FormattedComment"> It's not constraining the lookup, which would make valgrind miss it, but constraining its later assumptions about the variable.<br> <p> int d[16];<br> <p> d[k] = 10; /*A*/<br> <p> if(/*B*/ k &lt; 16) {<br> ...<br> }<br> <p> <p> The idea is that when it hits B, if that expression would be false, behavior is already undefined because of what happened at A, so let's optimize for the case that didn't segfault (or corrupt data).<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544320/rss 2013-03-25T13:37:52+00:00 dlthomas <div class="FormattedComment"> Fortunately, GCC 4.8 also keeps track of what code is the result of expansion of what macros...<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544305/rss 2013-03-25T10:12:05+00:00 jezuch <div class="FormattedComment"> <font class="QuotedText">&gt; "Because the SPEC CPU benchmarks are drawn from the compute intensive portion of real applications"</font><br> <p> Well...<br> <p> <font class="QuotedText">&gt; for (dd=d[k=0]; k&lt;16; dd=d[++k])</font><br> <p> That's.... 
horrifying.<br> </div> Regehr: GCC 4.8 Breaks Broken SPEC 2006 Benchmarks http://lwn.net/Articles/544293/rss 2013-03-25T07:53:52+00:00 jakub@redhat.com <div class="FormattedComment"> BTW, if you want some testcase where -fno-aggressive-loop-optimizations still makes a difference even in GCC 4.8.0 release, one testcase is e.g.:<br> int a[4];<br> <p> __attribute__((noinline, noclone)) int<br> foo (int x)<br> {<br> int i, r = 0, n = x &amp; 31;<br> for (i = 0; i &lt; n; i++)<br> r += a[i];<br> return r;<br> }<br> <p> int<br> main ()<br> {<br> int x = 255;<br> __asm volatile ("" : "+r" (x));<br> return foo (x);<br> }<br> . With -O3 and not -fno-aggressive-loop-optimizations, GCC from the loop determines the high bound to be 4 and completely unrolls the loop into 4 reads from a (+ additions for 2nd and up iteration), each preceded by a test of the n variable, so the code won't actually read beyond the end of the a array, while with -O3 -fno-aggressive-loop-optimizations GCC won't compute the upper bound estimate so low (VRP can figure out it is 32, while other passes just estimate INT_MAX), so the loop happily will read beyond the end of a, is vectorized (which is unlikely to be desirable for such a small number of iterations), etc. Even without the "&amp; 31" the situation is similar, and certainly for that case I don't see why a warning would ever be useful: the routine just can be valid only when called with a parameter 0 to 3, but there is no reason not to assume that all the callers do that (the asm is an optimization barrier, the compiler isn't supposed to look through it).<br> </div> access off end of array http://lwn.net/Articles/544292/rss 2013-03-25T05:09:26+00:00 cesarb <div class="FormattedComment"> Except that it is not an "exam question". It is supposed to be real code, in this case from a reference implementation of the H.264 codec.<br> <p> It was not written to stress test compilers. 
It is just not very optimized (and since it is only a reference implementation, it does not have to be).<br> </div> access off end of array http://lwn.net/Articles/544252/rss 2013-03-24T19:02:39+00:00 iabervon <div class="FormattedComment"> Probably the real reason to have this in a benchmark is because it's stupid. Unless your compiler does particularly good flow control analysis, it'll generate a read of d[16], which is a likely cache miss (if the compiler aligns the array, there's a good chance that d[16] will be in a different cache line from d[15] and anything else that's hot). If the compiler can figure out that dd isn't used outside the loop, and that it can therefore be set after the test instead of before, you'll get code that runs faster than if the compiler is less clever. Of course, if you wanted to get a fast result, you'd just write it the obvious way and get the optimal result on any compiler, but they want to have some compilers do better than other compilers.<br> <p> It's like writing an exam question: it would be easy to write a question that everybody would get right, but you want to write a question that people who know the material will get right more often than people who don't. Obviously, in ordinary life, you want to ask questions which will be more likely to get correct answers, and you want to write code that all compilers will make as fast as possible, but that's not the situation here.<br> </div>
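For readers following the thread, here is a minimal sketch in C of the two well-defined ways to write the loop under discussion. It assumes a 16-element int array d and the absolute-value sum quoted above; the function names satd_plain and satd_compact are hypothetical and do not come from the SPEC source. The second variant keeps a compact, single-header shape but loads d[k] only after k &lt; 16 has been tested, so d[16] is never read.

```c
#include <assert.h>
#include <stddef.h>

#define N 16

/* Conventional form: d[k] is read only after k < N has been checked. */
int satd_plain(const int d[N])
{
    int satd = 0;
    assert(d != NULL);
    for (size_t k = 0; k < N; k++) {
        int dd = d[k];
        satd += (dd < 0 ? -dd : dd);
    }
    return satd;
}

/*
 * Compact form (hypothetical rewrite, not the benchmark's code): the load
 * sits behind the bounds test, so the one-past-the-end element is never
 * accessed. The comma expression always evaluates to 1 after storing dd.
 */
int satd_compact(const int d[N])
{
    int satd = 0, dd = 0, k;
    assert(d != NULL);
    for (k = 0; k < N && (dd = d[k], 1); k++)
        satd += (dd < 0 ? -dd : dd);
    return satd;
}
```

Both functions return the same sum for any input whose elements are not INT_MIN (negating INT_MIN would itself overflow). The plain version is probably the one to prefer; the compact one is shown only to illustrate that brevity does not require the out-of-bounds read.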