Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Posted Aug 16, 2012 3:40 UTC (Thu) by gmaxwell (guest, #30048)Parent article: Signed overflow optimization hazards in the kernel
Posted Aug 16, 2012 21:53 UTC (Thu)
by daglwn (guest, #65432)
[Link] (3 responses)
Posted Aug 16, 2012 22:19 UTC (Thu)
by PaulMcKenney (✭ supporter ✭, #9624)
[Link] (2 responses)
Posted Aug 16, 2012 22:45 UTC (Thu)
by gmaxwell (guest, #30048)
[Link] (1 responses)
First, for code in the kernel, I assume that vectorizing is off the table because IIRC the kernel doesn't normally use the XMM registers to avoid saving them... so thats a chunk of the benefit that wouldn't exist in the kernel.
Branch prediction is often effective enough to make the looping cost small. But not always. I've seen easily measurable gains in numerical code performance just from reorganizing things so the compiler could find limits and unroll, but it doesn't always matter.
The most obvious thing to do would be to compile the kernel without the switches and hope that it runs long enough to run a few benchmarks. :)
Another issue is that overflow is _usually_ a bug where it exists (Yes, there are plenty of times where a programmer is using it intentionally but they are far less common than cases where an overflow is a bug). By preferring to use signed values and preventing overflow where you won't handle it, and using unsigned only where you must for range or where you intend overflow you make tools that catch bugs with dynamic instrumentation of overflow far more powerful. ( E.g. http://embed.cs.utah.edu/ioc/ ). Though since no one is applying these sorts of tools to the kernel I guess it doesn't matter currently. :)
Posted Aug 18, 2012 17:20 UTC (Sat)
by ppisa (subscriber, #67307)
[Link]
Posted Aug 16, 2012 22:47 UTC (Thu)
by cmccabe (guest, #60281)
[Link] (9 responses)
Most of the examples I've seen have been of the form "aha! I can (incorrectly) assume that this if statement that the programmer put here is a no-op!" In all of these cases I've seen, the "optimization" is something that a good programmer could have and probably would have done manually anyway.
I'd really like to see some real-world performance numbers about the effects of this optimization. Based on everything I've seen so far, this is a case where compiler writers got so carried away considering what they _could_ do, that they didn't stop to think if they _should_.
Posted Aug 16, 2012 23:10 UTC (Thu)
by cmccabe (guest, #60281)
[Link] (2 responses)
> int i;
I don't see how -fnowrapv will prevent you from unrolling the loop. You know that starting at 0 and adding 1 will get you to 5 before it will get you to overflow.
I guess you could come up with a scenario where you don't know the initial value of the counter, but you do have a constant positive increment and a set upper limit. So something like this:
> for (i = function_from_another_translation_unit(); i < 5; i++) {
Even in this case, though, you still know that either you'll run the loop no times, or there will be no overflow. You have to get to something like this to start seeing a problem:
> for (i = function_from_another_translation_unit(); i < 0x7ffffffe; i += 2) {
This is pretty exotic scenario, though. If i starts off as 0x7fffffff and -fwrapv is enabled, the loop will never terminate, whereas with -fnowrapv it's undefined. To me, this feels like buggy code in the first place, so I don't really care if it's not optimized.
Am I missing something here? Is there a good example of a real-world scenario where -fnowrapv helps well-written code?
Posted Aug 18, 2012 22:56 UTC (Sat)
by jzbiciak (guest, #5246)
[Link] (1 responses)
Where I've heard it coming up is when you have code that effectively looks like this:
The compiler wants to know it's safe to assume this loop goes around 16 times. That isn't true if "x + 16" could overflow.
Posted Aug 20, 2012 7:22 UTC (Mon)
by jezuch (subscriber, #52988)
[Link]
If you want to make it explicit, there's a flag for this in GCC: -funsafe-loop-optimizations (along with -Wunsafe-loop-optimizations if you want to know when it happens; it's a warning, though, so beware of build environments which insist on -Werror). AFACT there are two cases handled by this flag: assuming that the loop counter does not overflow, and assuming that the loop is not infinite. Don't know the exact implementation details, though.
Posted Aug 16, 2012 23:10 UTC (Thu)
by klossner (subscriber, #30046)
[Link]
Posted Aug 17, 2012 19:09 UTC (Fri)
by iabervon (subscriber, #722)
[Link]
If the compiler inlines the second call, it can drop the "i != lim" test by assuming that overflow is impossible.
Posted Aug 20, 2012 16:45 UTC (Mon)
by daglwn (guest, #65432)
[Link] (3 responses)
Posted Aug 21, 2012 7:58 UTC (Tue)
by jezuch (subscriber, #52988)
[Link] (2 responses)
Posted Aug 21, 2012 14:30 UTC (Tue)
by daglwn (guest, #65432)
[Link] (1 responses)
Not true. The Intel compiler, for example, will vectorize its brains out automatically.
A code using lots of non-restrict pointers will certainly be difficult to vectorize. Such code can sometimes be auto-parallelized but I don't believe gcc has that capability.
gcc's vectorization is also pretty weak, though it is getting better. That may be part of what you're seeing.
Posted Aug 22, 2012 10:39 UTC (Wed)
by jezuch (subscriber, #52988)
[Link]
It sorta-kinda does (-ftree-parallelize-loops) but I haven't tested it much. But I expect it to be even weaker than vectorization.
> gcc's vectorization is also pretty weak, though it is getting better. That may be part of what you're seeing.
It may be that. Or it may be that most of the code in non-HPC world does not lend itself to vectorization easily. I actually don't know, I'm just an amateur and hobbyist :)
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
> for (i = 0; i < 5; i++) {
> [expression not involving i]
> }
> [expression not involving i]
> }
> [expression not involving i]
> }
Signed overflow optimization hazards in the kernel
for (i = x; i < x + 16; i++)
{
/* code that does not modify either x or i */
}
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
int find(char *s, char c, int lim)
{
for (int i = 0; i != lim && s[i]; i++)
if (s[i] == c)
return i;
return -1;
}
int main()
{
find("foo", 'o', 1);
find("foo", 'o', -1);
}
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
Signed overflow optimization hazards in the kernel
> to the compiler, as I understand it)
Signed overflow optimization hazards in the kernel
