ACCESS_ONCE()
ACCESS_ONCE()
Posted Aug 3, 2012 16:55 UTC (Fri) by corbet (editor, #1)In reply to: ACCESS_ONCE() by daglwn
Parent article: ACCESS_ONCE()
Nobody said the (potential) bug was anywhere else. But declaring that variable as volatile would deoptimize things much more than is necessary. Rather than do that, the developers involved used ACCESS_ONCE() and things work as they should.
Posted Aug 3, 2012 22:00 UTC (Fri)
by daglwn (guest, #65432)
[Link] (32 responses)
That's what I was responding to. As a professional compiler developer, it's really irritating to read about compilers "creating bugs" when the problem is actually with undefined, unspecified or implementation-defined behavior in the code.
Posted Aug 7, 2012 10:17 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (31 responses)
Compiler developers (professional or otherwise) should take into account that not all developers are as aware as they of each and all language subtleties. Most compiler users use their common sense, not the language specification, to infer what the code does.
If something has the potential of doing harm, and the user is perfectly capable of doing it by hand anyway if really needed, then it's best for the compiler not to do it. I believe that the case at hand is a perfect example of that.
Posted Aug 7, 2012 11:40 UTC (Tue)
by gioele (subscriber, #61675)
[Link] (24 responses)
> Compiler developers (professional or otherwise) should take into account that not all developers are as aware as they of each and all language subtleties. Most compiler users use their common sense, not the language specification, to infer what the code does.
The famous --do-what-I-mean-not-what-I-write compiler switch.
It is a risky thing to do to abandon the written spec to start developing against unwritten common sense stashed in the brain of developers.
This is dangerous because it assumes something that is even more difficult to achieve than making the developers know the spec by heart: it assumes that all the developers around the globe know the same part of the spec and all use it in the same, exact, consistent way.
There will always be differences in the way people use tools and languages. These differences will create conflicts. I think that the best way to resolve conflicts is to point out the relevant part of the spec that dictates what to do. Yes, the spec may be not well written or ambiguous, but those are problems that can be solved. The alternative is to argue in a bug report and let the most vocal person (or the one backed by the biggest implementer) win, ignoring the fact that there may be heaps of people out there using the language in the way the less vocal debater think it should be used.
Posted Aug 8, 2012 13:31 UTC (Wed)
by dgm (subscriber, #49227)
[Link] (23 responses)
That would be a good switch, yes. By default the compiler should be in --do-what-I-write-not-what-some-ambiguous-line-of-the-spec-allows-you-to-interpret, though. It would be better for everybody.
Posted Aug 8, 2012 16:51 UTC (Wed)
by daglwn (guest, #65432)
[Link] (22 responses)
How can the compiler possibly know what you meant when you wrote something outside the language spec?
Take a very simple but common issue: undefined data. There are countless compiler analyses and transformations that mode code around, reallocate stack space, etc. that cause the undefined variable to have different (garbage) values. What might happen to work one day most assuredly will not when compiled with a compiler 2-3 versions newer.
What should the compiler do? Disable all code motion of and around the offending expression? That would be lunacy.
Posted Aug 9, 2012 3:47 UTC (Thu)
by mmorrow (guest, #83845)
[Link]
Posted Aug 10, 2012 7:40 UTC (Fri)
by jezuch (subscriber, #52988)
[Link] (14 responses)
Undefined data, hah. In Java it is a compilation error to use a variable that is not provably assigned in all possible code paths between declaration and use. I like this feature *a lot* and I'm always surprised that C compilers have such a great difficulty with detecting it. (The JVM verifier does this analysis as well during class loading, so it has to be *fast* as well as correct.)
Posted Aug 10, 2012 14:58 UTC (Fri)
by quanstro (guest, #77996)
[Link] (6 responses)
that is not the choice in this case. the choice is to
personally, i find this sort of code reorg by compilers
Posted Aug 10, 2012 15:40 UTC (Fri)
by daglwn (guest, #65432)
[Link] (4 responses)
This is a false choice. (a) and (b) are the same thing in the presence of undefined/unspecified/implementation-defined behavior. And it's not arbitrary. It's the decision of the compiler engineers.
> personally, i find this sort of code reorg by compilers
I hear this a lot. Then people get surprised when I show them what the compiler did to their code. Believe me, there is no reason anyone should waste time hand-optimizing code without proof of need. Either they're going to miss a lot of opportunity or they are going to screw things up and make the compiler's job harder.
If a developer were to hand-optimize code to achieve the same performance result, the source code would be unmaintainable.
We want optimizing compilers. I can't believe anyone would suggest otherwise.
Posted Aug 12, 2012 16:16 UTC (Sun)
by quanstro (guest, #77996)
[Link] (3 responses)
as i see it, the trade-off here is non-standard constructions, and the
i'm not convinced of the claim that this is always a win.
the compiler i use on a daily basis does not do strength reduction or optimize
(never mind that with modern intel cpus, strength reduction can be a loss due
i think this is a good trade off for my case because it avoids obtuse, and
just my two cents, and i realize that there are cases in the linux kernel
Posted Aug 13, 2012 12:05 UTC (Mon)
by nye (subscriber, #51576)
[Link] (2 responses)
No, the coder *was* wrong, and the assumption is *always correct in standard C*. That's the point. The programmer might have assumed semantics which are not C, but the compiler merely assumed that the programmer was writing in C, not writing in some unspecified language that looks a lot *like* C and exists only in the programmer's head.
It is axiomatic that a valid optimisation (ie. one which precisely follows C semantics, and any which don't are buggy and tend to be quickly fixed) cannot break correct valid C; if code breaks then it is because the programmer has made incorrect assumptions about the exact meaning *in C* of what they're writing.
If your variable might change between accesses, then you need to tell the compiler that, because it is not the case in the standard C model, which is why there's a keyword existing specifically for that purpose.
Posted Aug 16, 2012 14:52 UTC (Thu)
by quanstro (guest, #77996)
[Link] (1 responses)
for me, making code safe from all possible according-to-hoyle legal
if restricting the compiler from making some theoretically legal
as it is i believe there are some gcc optimizations that can break the
Posted Aug 19, 2012 19:38 UTC (Sun)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link]
Posted Aug 10, 2012 17:33 UTC (Fri)
by nix (subscriber, #2304)
[Link]
(I've seen this from various safety-critical embedded people too: they want the compiler to 'not optimize'. I've tried pointing out that this is a meaningless request, that translation necessarily implies many of the same transformations that optimization does, but they never seem to get it. What they generally *mean* is that they want the smallest possible number of transformations -- or perhaps that they want the transformations to be guaranteed bug-free.)
Posted Aug 10, 2012 15:35 UTC (Fri)
by daglwn (guest, #65432)
[Link] (6 responses)
Posted Aug 12, 2012 18:52 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
We had a bunch of warnings we couldn't get rid of, hence it didn't say "fix all warnings", but "explain it away" is just as effective, if less satisfying.
Cheers,
Posted Aug 13, 2012 8:02 UTC (Mon)
by jezuch (subscriber, #52988)
[Link] (1 responses)
It is. And they do. And every time I see someone fixing a "use of undefined variable" warning by initializing it to zero at declaration point, I cringe. I've seen it in stable updates to the kernel a lot...
Posted Aug 13, 2012 16:31 UTC (Mon)
by daglwn (guest, #65432)
[Link]
Posted Aug 13, 2012 9:38 UTC (Mon)
by etienne (guest, #25256)
[Link] (2 responses)
void fct (void) {
Posted Aug 13, 2012 16:34 UTC (Mon)
by daglwn (guest, #65432)
[Link]
Still, even in this case the compiler could warn about it even if it doesn't know for sure. The code certainly looks suspicious and a warning would be appropriate. False positives are just fine if they are limited in number. The compiler can provide directives to suppress them if the programmer knows it's not a problem.
Note that gcc does just this. It warns that variables *might* be uninitialized.
Posted Aug 13, 2012 16:50 UTC (Mon)
by nybble41 (subscriber, #55106)
[Link]
Assuming "loglevel = 4" was replaced with "loglevel == 4", I would expect the compiler to generate a warning in this case, since it can't _prove_ that avar was initialized before the printf() call. I would also hope that the compiler would take advantage of the fact that avar is either 10 or undefined to simply set it to 10 regardless of loglevel, for code size and performance reasons if nothing else.
In cases like this, IMHO, it would be better to use a common flag rather than testing loglevel multiple times:
void fct (void) {
I'm not sure whether the compiler can follow this any better than before, so there may still be a warning, but at least the coupling is visible to anyone reading the code, and doesn't depend on the implementation of external functions.
Posted Aug 13, 2012 9:56 UTC (Mon)
by dgm (subscriber, #49227)
[Link] (5 responses)
The code we are arguing about, the one with ACCESS_ONCE(), is NOT outside the spec in any way, is it?
Posted Aug 13, 2012 12:11 UTC (Mon)
by nye (subscriber, #51576)
[Link] (1 responses)
> The code we are arguing about, the one with ACCESS_ONCE(), is NOT outside the spec in any way, is it?
No it isn't. The point is that, absent the ACCESS_ONCE() macro, the assumption that the compiler shouldn't pull that access out of the loop is what's outside the language spec, because the spec says it can safely do so.
Posted Aug 17, 2012 10:57 UTC (Fri)
by dgm (subscriber, #49227)
[Link]
But does the spec say it _has_ to do so? Does it more good or harm?
Not everything that is allowed is good. For example, the compiler is (in theory) allowed to do anything it wants when presented with code that raises undefined behavior. Anything. "rm -rf /" for instance would be 100% correct. Are GCC developers planing this "feature" for gcc 4.8? Of course not, that would be stupid when you could be playing nethack instead...
Posted Aug 13, 2012 16:35 UTC (Mon)
by daglwn (guest, #65432)
[Link] (2 responses)
But ignoring that, nye has the more useful response. :)
Posted Aug 17, 2012 10:37 UTC (Fri)
by dgm (subscriber, #49227)
[Link] (1 responses)
Posted Aug 20, 2012 10:07 UTC (Mon)
by etienne (guest, #25256)
[Link]
Posted Aug 7, 2012 19:42 UTC (Tue)
by daglwn (guest, #65432)
[Link] (5 responses)
My point is that a bug caused by the programmer not adhering to the language standards is not caused by the compiler. The compiler team may be willing to work around the problem but that doesn't mean the code ain't broke.
> If something has the potential of doing harm, and the user is perfectly
The problem is that the compiler has *no way* to know if the code is potentially harmful because the code is doing something outside the definition of the language. The problem ACCESS_ONCE is trying to solve is a perfect example. The compiler doesn't know which parts of code will be operating in a multi-threaded environment, much less which pieces of memory are shared. We have "volatile" to mark variables in a way that happens to work for our current threading models. But it is the only thing we have in C right now.
Compiler writers generally don't want to limit legal transformations because those transformations can help the vast majority of programmers who don't have whatever specialized problem one programmer might complain about.
Posted Aug 8, 2012 14:30 UTC (Wed)
by dgm (subscriber, #49227)
[Link] (4 responses)
It's not the code that it's harmful, but the transformation the compiler does in the optimization. To put it another way: the compiler chooses to do something that is NOT what the code says. It does so because it assumes it's safe. Because there's no way it can be sure if the transformation is safe (or not!), it should not be doing it.
> The compiler doesn't know which parts of code will be operating in a multi-threaded environment
Isn't it common sense that any code compiled today _can_ be used in a multi-threaded environment?
Posted Aug 8, 2012 16:56 UTC (Wed)
by daglwn (guest, #65432)
[Link] (3 responses)
No. The code is the specification of what the programmer wants to happen. If the programmer doesn't mark something "volatile" when it should be, there's no way the compiler can automatically infer that information.
> Because there's no way it can be sure if the transformation is safe (or
You've just killed all compiler optimization.
> Isn't it common sense that any code compiled today _can_ be used in a
And any variable might be shared. So now the compiler must assume everything is volatile.
You've just killed all compiler optimization.
Posted Aug 13, 2012 10:07 UTC (Mon)
by dgm (subscriber, #49227)
[Link] (2 responses)
Threads are supposed to share all memory, aren't they?
>You've just killed all compiler optimization.
Only the unsafe ones. There are plenty of opportunities for optimization, but I don't think this is an example of one.
Posted Aug 13, 2012 16:41 UTC (Mon)
by daglwn (guest, #65432)
[Link] (1 responses)
It depends on the threading model and since the standard doesn't specify one, the compiler cannot know in the general case.
Now, gcc has for example the -pthreads switch that tells it something about what to expect so that can help. But the kernel build system almost certainly doesn't use that.
> There are plenty of opportunities for optimization, but I don't think
When one restricts what is know about memory accesses, the compiler kills a *lot* of compiler transformation. Much much more than one would think. Spill code gets a lot more inefficient, for example, which you wouldn't expect to happen since spills are entirely local. But it does because the compiler cannot do certain peeps that can help reduce the amount of spill code.
The "big" transformations like loop restructuring, vectorization and the like are almost impossible if the behavior of memory can't be determined. These are no more "unsafe" than any other transformation. The compiler won't do them if it doesn't know it's safe according to the standard.
Posted Aug 13, 2012 16:58 UTC (Mon)
by jwakely (subscriber, #60262)
[Link]
Except that for many platforms it's equivalent to '-D_REENTRANT -lpthread' and so only affects the preprocessor and linker, which doesn't tell the compiler anything.
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
Take a very simple but common issue: undefined data. There are countless compiler analyses and transformations that mode code around, reallocate stack space, etc. that cause the undefined variable to have different (garbage) values. What might happen to work one day most assuredly will not when compiled with a compiler 2-3 versions newer.
A good example of this can be seen in GCC's constant propagation implem (tree-ssa-ccp.c):
And we can see this in action with e.g.
/*
...
- If an argument has an UNDEFINED value, then it does not affect
the outcome of the meet operation. If a variable V_i has an
UNDEFINED value, it means that either its defining statement
hasn't been visited yet or V_i has no defining statement, in
which case the original symbol 'V' is being used
uninitialized. Since 'V' is a local variable, the compiler
may assume any initial value for it.
...
*/
which gives
size_t f(int x){size_t a; if(x) a = 42; return a;}
f:
movl $42, %eax
ret
.ident "GCC: (GNU) 4.8.0 20120408 (experimental)"
ACCESS_ONCE()
ACCESS_ONCE()
(a) follow the standard, or
(b) do something arbitrary.
either
(a) rearrange the code in optimization, or
(b) leave it as the developer wrote it.
to be problematic as it generally makes code quite hard
to reason about. experienced developers would lift the
assignment out of the loop if it mattered, and it were
legal.
ACCESS_ONCE()
> (a) follow the standard, or
> (b) do something arbitrary.
> to be problematic as it generally makes code quite hard
> to reason about. experienced developers would lift the
> assignment out of the loop if it mattered, and it were
> legal.
ACCESS_ONCE()
loop, except if it might change. the compiler made the assumption that the
coder was wrong. that might not be a useful assumption.
principle of least surprise for performance.
away indirection. it assumes you know what you're doing. i don't notice that
it is slower. i also don't have to worry that the compiler will break my
drivers by "optimizing" them.
to microop caching.)
non-portable constructions that can be hard to remember to apply. that is,
for most code, developer time is more expensive than run time.
where complex macros need this sort of optimization. but perhaps that's
complexity begetting itself.
ACCESS_ONCE()
> loop, except if it might change. the compiler made the assumption that the
> coder was wrong. that might not be a useful assumption.
ACCESS_ONCE()
(and btw, i don't think that ACCESS_ONCE is standard c. nor can the
kernel be compiled with an arbitrary compiler; it depends on gcc.)
transformations of the code is not really interesting or useful.
i'd much rather focus on having a tractable, easy-to-use programming
environment.
code transformations reduces bugs and generally makes life easier,
then why not do it?
kernel.
ACCESS_ONCE()
ACCESS_ONCE()
personally, i find this sort of code reorg by compilers
to be problematic as it generally makes code quite hard
to reason about.
I'm curious. Why don't you say the same about register allocation? Combined with stack spilling, that can often have the same effect as code motion. How do you plan to get rid of that?
ACCESS_ONCE()
ACCESS_ONCE()
Wol
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
Something like (very simplified):
extern unsigned loglevel;
int avar;
if (loglevel = 4) avar = 10;
increase_loglevel();
do_some_unrelated_stuff();
if (loglevel == 5) printf ("%d\n", avar);
}
ACCESS_ONCE()
ACCESS_ONCE()
int avar;
bool use_avar;
use_avar = (loglevel == 4);
if (use_avar) avar = 10;
increase_loglevel();
do_some_unrelated_stuff();
if (use_avar) printf ("%d\n", avar);
}
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
ACCESS_ONCE()
> capable of doing it by hand anyway if really needed, then it's best for
> the compiler not to do it.
ACCESS_ONCE()
ACCESS_ONCE()
> does in the optimization.
> not!), it should not be doing it.
> multi-threaded environment?
ACCESS_ONCE()
ACCESS_ONCE()
> this is an example of one.
ACCESS_ONCE()