Who's afraid of a big bad optimizing compiler?
Posted Jul 25, 2019 14:13 UTC (Thu) by jerojasro (guest, #98169)
Parent article: Who's afraid of a big bad optimizing compiler?
I know, the first thing the article says is: "the C standard grants the compiler the right to make some assumptions that end up causing these weird/non-obvious things if you don't guard against them", but I must ask: why can't the compiler grow an optimization mode that does the right thing (and thus avoids all of the issues documented here) in the presence of global variables and concurrent execution? That situation (globals + concurrency) is so common that solving it in the compiler would benefit most, if not all, C users, in both kernel and user space.
Are there any reasons for not fixing this issue (once) in the compiler, other than the amount of work involved (which I guess must be ... not trivial, of course), and this new optimization mode not being standards-compliant?
(I'm not attempting to troll anybody; it just seems to me that what I propose is an obvious solution, and since it's not being used, I'd like to know what obvious thing I'm missing that prevents us from using it.)
Posted Jul 25, 2019 18:09 UTC (Thu) by excors (subscriber, #95769) [Link] (3 responses)
Probably the other main issue is performance, particularly with concurrency: CPUs often require explicit memory barriers if you want guaranteed ordering, and if the compiler added memory barriers around every memory access just in case you were using the same memory in another thread, it would be pretty slow. Even if it were only a tiny performance regression, many C programmers care a lot about performance (especially microbenchmark performance) and won't be happy with a new compiler that's measurably slower, and users won't be happy if their application runs slower after updating their kernel. They'd rather have a dangerous but fast compiler, and rely on the programmer being smart enough to avoid the dangerous cases.
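As a rough sketch of what "barriers around every access" would mean in practice (the function names here are made up, not from the comment): if every access to potentially shared memory were compiled as a sequentially consistent atomic operation, the ordinary loop below would lose its freedom to keep values in registers, vectorize, or reorder, and would pick up barrier or ordered-load instructions on weakly ordered CPUs.

    #include <stdatomic.h>
    #include <stddef.h>

    /* Ordinary code: the compiler may keep values in registers, reorder,
     * or vectorize these loads. */
    long sum_plain(const long *a, size_t n)
    {
        long s = 0;
        for (size_t i = 0; i < n; i++)
            s += a[i];
        return s;
    }

    /* "Barriers everywhere" version: every load is a sequentially
     * consistent atomic, so the compiler must perform each one, in order,
     * and weakly ordered CPUs need extra barrier instructions. */
    long sum_paranoid(_Atomic long *a, size_t n)
    {
        long s = 0;
        for (size_t i = 0; i < n; i++)
            s += atomic_load_explicit(&a[i], memory_order_seq_cst);
        return s;
    }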
Posted Apr 9, 2020 5:45 UTC (Thu) by gmatht (guest, #58961) [Link] (2 responses)
Posted Apr 9, 2020 8:30 UTC (Thu) by geert (subscriber, #98403) [Link] (1 response)
Posted Apr 10, 2020 17:22 UTC (Fri) by zlynx (guest, #2285) [Link]
In any threaded situation you want an atomic access, which might be implemented using volatile but requires more than that, such as memory-barrier operations.
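A minimal C11 sketch of the difference (the flag names are invented): volatile only forces the compiler to actually perform the read, while the atomic version also lets you specify ordering.

    #include <stdatomic.h>
    #include <stdbool.h>

    static volatile bool stop_v;  /* read can't be optimized away, but no ordering */
    static atomic_bool  stop_a;   /* read can't be optimized away, ordering selectable */

    void wait_volatile(void)
    {
        while (!stop_v)
            ;   /* no barrier: surrounding accesses may still be reordered */
    }

    void wait_atomic(void)
    {
        /* acquire load: whatever the stopping thread wrote before its
         * release store to stop_a is guaranteed visible after the loop */
        while (!atomic_load_explicit(&stop_a, memory_order_acquire))
            ;
    }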
Posted Jul 25, 2019 19:01 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)
The short answer is performance; each of the eight transforms in the article permits the generated code to be optimized into much, much faster code, as long as the original C code did what the programmer intended it to do.
That statement ("as long as the original C code did what the programmer intended it to do") is the source of all the pain. The C standard specifies an abstract C machine that directly executes C code, and the job of a compiler is to translate C code into machine code that has the same semantics as C running on the abstract C machine. However, most C programmers are ignorant of the abstract C machine (not all, but most, including me); instead, they rely either on "my compiler turns it into machine code that does what I want" or on "the obvious translation to my preferred machine code does what I want". Each of these interpretations of "what my C means" leads to its own set of problems.
In short, it's hard, because of C's legacy as "basically one step above a macro assembler for the PDP-11".
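As one concrete illustration of such a transform (store fusing, one of the eight from the article; the names below are made up): on the abstract C machine nothing observes the intermediate stores, so the compiler is entitled to keep progress in a register and store it once after the loop, even though the programmer intended another thread to watch it advance.

    struct item;                      /* details don't matter here */
    void process(struct item *);      /* hypothetical worker function */

    long progress;                    /* meant to be polled by another thread */

    void process_items(struct item **items, long n)
    {
        for (long i = 0; i < n; i++) {
            process(items[i]);
            progress = i;  /* may legally be fused into a single final store */
        }
    }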
Posted Jul 26, 2019 15:08 UTC (Fri) by jerojasro (guest, #98169) [Link] (1 response)
but after reading both of your comments, I realized (I hope correctly) that globals are *both*: things declared as global, *and also* anything allocated on the heap. And that last part is what makes automating all of these checks/guards such a performance hit, and what makes it worth bothering the programmer with handling the unsafe situations manually.
Man I'm glad I just concatenate strings for a living.
Posted Jul 26, 2019 15:32 UTC (Fri) by farnz (subscriber, #17727) [Link]
It's a bit worse than that - things in C are global in the sense of "need concurrency-awareness baked-in" if they are syntactically global, or if there is ever a pointer to the thing. That last covers all heap allocations, and any stack allocations whose address you take with &, and in turn means that you have to add barriers etc to all heap allocations plus some stack allocations.
And, of course, this is only going to help code that does not execute correctly on the C machine, as it ignores the semantics of the C machine. We don't want to strengthen the semantic guarantees of the C machine, since that results in the output code running slower (it needs more barriers, as more of the "internal" effects are now externally visible). So it only actually helps developers who don't actually understand what they're telling the computer to do - while this may be a common state, it's not necessarily something to encourage.
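To illustrate the "there is ever a pointer to the thing" case above with a small invented example: done is a stack variable, but the moment its address is handed to another thread it has to be treated with the same care as any global.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    static void *worker(void *arg)
    {
        atomic_bool *done = arg;
        /* ... do the actual work ... */
        atomic_store_explicit(done, true, memory_order_release);
        return NULL;
    }

    void run_job(void)
    {
        atomic_bool done = false;   /* stack allocation, but its address escapes */
        pthread_t tid;

        pthread_create(&tid, NULL, worker, &done);
        while (!atomic_load_explicit(&done, memory_order_acquire))
            ;                       /* spin until the worker reports completion */
        pthread_join(tid, NULL);
    }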
Posted Jul 26, 2019 18:18 UTC (Fri) by andresfreund (subscriber, #69562) [Link] (1 response)
I'd say that C11/C++11 made a large step in that direction by having a formalized memory model and builtin atomics. Before that, there really was no way to get correctly working (even though formally undefined) concurrent programs without relying on compiler implementation details.
It does require you, however, to actually use the relevant interfaces.
I don't quite see how you'd incrementally get to a language that doesn't have any of these issues without making it close to impossible to ever incrementally move applications towards that hypothetical version of C. I mean, there's basically no language that allows the use of shared memory and doesn't require escape hatches from its safety mechanisms to implement fast concurrent data structures (e.g. Rust needing to drop to unsafe for core pieces). And the languages that get closest require enough of a different approach that it's hard to imagine C moving towards it.
That's not to say that C/C++ have progressed far enough in allowing one to at least opt into safety. The C11/C++11 memory model and the atomics APIs are a huge step, but they came at the very least 10 years too late (while some of the formalisms were developed somewhat more recently, there ought at least to have been some progress before then). And there are plenty of other issues where no proper facilities are provided (e.g. signed integer overflow handling, mentioned in nearby comments).
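For what it's worth, a minimal sketch of what "using the relevant interfaces" looks like in C11 (names invented): the producer publishes data with a release store and the consumer reads it with an acquire load, so visibility of the payload is guaranteed by the standard rather than by compiler folklore.

    #include <stdatomic.h>
    #include <stddef.h>

    struct msg { int payload; };

    static struct msg slot;
    static _Atomic(struct msg *) ready;   /* NULL until a message is published */

    void publish(int value)
    {
        slot.payload = value;                 /* fill in the message... */
        atomic_store_explicit(&ready, &slot,
                              memory_order_release);   /* ...then publish it */
    }

    int try_consume(int *out)
    {
        struct msg *m = atomic_load_explicit(&ready, memory_order_acquire);

        if (m == NULL)
            return 0;           /* nothing published yet */
        *out = m->payload;      /* guaranteed to see the payload write */
        return 1;
    }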
Posted Jul 29, 2019 16:44 UTC (Mon) by PaulMcKenney (✭ supporter ✭, #9624) [Link]
So there are great opportunities for innovations in the area of locating old concurrent C code in need of an upgrade! :-)
Example, read once may or may not be the "right thing"

We even have an example in the article of different code making incompatible assumptions. Here the code assumes the variable need_to_stop will be read many times:

    while (!need_to_stop)  /* BUGGY!!! */
        do_something_quickly();

The following code is instead assuming that global_ptr won't change. This could be ensured by only reading global_ptr once:

    if (global_ptr != NULL &&
        global_ptr < high_address)
        do_low(global_ptr);

In general it might be hard to determine which of these two contradictory assumptions the code is making.
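The article's remedy, READ_ONCE() (or a volatile/atomic access outside the kernel), makes the chosen assumption explicit either way; roughly, and assuming the same declarations as the snippets above:

    /* First case: the flag really should be re-read on every iteration. */
    while (!READ_ONCE(need_to_stop))
        do_something_quickly();

    /* Second case: read global_ptr exactly once, then test and use that
     * single value (assuming global_ptr points to a struct foo). */
    struct foo *p = READ_ONCE(global_ptr);

    if (p != NULL && p < high_address)
        do_low(p);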