GCC 4.3.0 exposes a kernel bug

Posted Mar 7, 2008 21:00 UTC (Fri) by flewellyn (subscriber, #5047)
Parent article: GCC 4.3.0 exposes a kernel bug

Er, I think the new behavior can be called "correct" only for a very narrow definition of the term. It's true that the ABI does specify that DF should be cleared, but because it's widely known (or should be) among system-level hackers that many important programs do not honor this part of the ABI, GCC should ensure correct behavior in cases where the callers do not.

Okay, so the Linux and BSD kernels were not properly following this part of the ABI, and that's technically incorrect. But the thing is, the GCC developers apparently knew that, and prior to this had fixed it; adherence to standards and specifications is a good thing, but so is not breaking working code. Hopefully they'll revert this change, perhaps with a compile-time warning for the incorrect behavior.

GCC 4.3.0 exposes a kernel bug

Posted Mar 7, 2008 21:23 UTC (Fri) by zlynx (guest, #2285) [Link] (16 responses)

I don't see why they should revert it.  Maybe make a flag to change the default behavior.

But people are going to have to live with it.  Apparently ICC has been doing this for years
already.  So the problem is already out there on anything ICC has compiled.  Like, say,
commercial versions of MySQL.  Don't those use ICC?  And I think some Linux games were built
with ICC.  And what about Oracle?

I think that having the kernel do the wrong thing and then claiming that GCC has to fix the
problem is just ridiculous.  Especially after all the years of trying to make GCC follow the
ABI standards so that it can interoperate with other compilers and libraries.

Follow the standards, don't make up your own.  That's Microsoft all over, and we open source
types are supposed to hate it.

GCC 4.3.0 exposes a kernel bug

Posted Mar 7, 2008 21:37 UTC (Fri) by flewellyn (subscriber, #5047) [Link] (8 responses)

There's following the standards, and then there's not communicating about a potentially code-breaking change. Because GCC previously was emitting cld instructions before every inlined function call, we can conclude they knew about the problem existing in the wild. They should have given the Linux and BSD developers a heads up about this.

GCC 4.3.0 exposes a kernel bug

Posted Mar 7, 2008 21:55 UTC (Fri) by daney (guest, #24551) [Link] (7 responses)

GCC changes every day.  If you are interested in what every change is, you are free to look
at: http://gcc.gnu.org/ml/gcc-cvs/

This particular change to GCC was made many months ago and underwent extensive testing on many
different platforms.  That this bug existed and was exposed only after 4.3.0 was released is
perhaps unfortunate, but you imply that someone knew about the problem and withheld that
information.  That is not the case.

GCC 4.3.0 exposes a kernel bug

Posted Mar 7, 2008 22:27 UTC (Fri) by JoeBuck (subscriber, #2330) [Link] (6 responses)

From the gcc point of view, it was simply a matter of observing that the compiler was emitting an unneeded instruction: we don't have to clear that flag because it's already cleared; the standard says so, and real implementations follow the standard correctly (or so it was thought). The result was that code sequences that use the x86 string instructions are slightly smaller and faster with gcc 4.3.0.

The issue of kernels not following the rules in the case of signal handlers was not noticed until the 4.3.0 release process had already started.

If you think that kernel developers should be notified of every change of this kind, just in case it does something, they'd need to subscribe to the svn commit mailing list, and they'd be overwhelmed with messages describing small changes.
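
To make the failure mode concrete, here is a toy model in Python, not real kernel or GCC code. It assumes a simplified `rep movsb` that steps its pointers backward when DF is set, and every name in it (`rep_movsb`, `handler_copy`, `deliver_signal`, `fresh_memory`) is made up for illustration. The interrupted program has DF set, the kernel may or may not clear it before entering the handler, and the handler's inlined copy may or may not be preceded by cld.

```python
# Toy model of the x86 direction flag (DF); illustrative names only.

def rep_movsb(mem, dst, src, count, df):
    """Copy count bytes; pointers step forward if DF is clear, backward if set."""
    step = -1 if df else 1
    for _ in range(count):
        mem[dst] = mem[src]
        dst += step
        src += step


def handler_copy(mem, df_at_entry, emit_cld):
    """Signal handler's inlined 4-byte copy from offset 4 to offset 12.

    emit_cld=True models the older GCC behavior (clear DF first);
    emit_cld=False models GCC 4.3.0 (trust the ABI: DF is clear at entry).
    """
    df = False if emit_cld else df_at_entry
    rep_movsb(mem, dst=12, src=4, count=4, df=df)


def deliver_signal(mem, interrupted_df, kernel_clears_df, emit_cld):
    """Enter the handler; an unfixed kernel leaks the interrupted code's DF."""
    df_at_entry = False if kernel_clears_df else interrupted_df
    handler_copy(mem, df_at_entry, emit_cld)
    return bytes(mem)


def fresh_memory():
    return bytearray(b"0123ABCD--------")  # 16 bytes, "ABCD" at offsets 4..7


for kernel_clears_df, emit_cld in [(False, False), (False, True), (True, False)]:
    result = deliver_signal(fresh_memory(), True, kernel_clears_df, emit_cld)
    print(f"kernel clears DF={kernel_clears_df}, cld emitted={emit_cld}: {result}")
```

Only the combination of an unfixed kernel and a handler built without cld copies in the wrong direction and clobbers bytes below the destination; either fix alone restores the intended copy in this model.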

GCC 4.3.0 exposes a kernel bug

Posted Mar 8, 2008 2:55 UTC (Sat) by dlang (guest, #313) [Link] (2 responses)

this is a case where the history of the code is needed to tell what's really going on.

did older GCC versions add the instruction because some programmer in the past ran into this
bug and fixed it (in which case the changelog for the commit that introduced this would
theoretically be found), or was the original programmer of this function in GCC exercising
defensive programming by not assuming that other programs leave things in any particular state
(which is what was assumed)?

how large is this instruction, and how many clock cycles does it take?

GCC 4.3.0 exposes a kernel bug

Posted Mar 8, 2008 19:21 UTC (Sat) by ibukanov (subscriber, #3942) [Link] (1 responses)

History may not be relevant here. It could be that in the past GCC was simply not able to
track the state of the control bit when generating the code. As such the compiler had to
insert the explicit instructions to reset the bit even if it was known that they were not
necessary from the ABI's point of view.

GCC 4.3.0 exposes a kernel bug

Posted Mar 9, 2008 3:37 UTC (Sun) by dlang (guest, #313) [Link]

the history is very relevant. you are listing a third option (very similar to the second one I
listed above). knowing which of these is correct (or if there is a fourth that is correct) is
significant in evaluating what needs to change.

Was it worth the trouble?

Posted Mar 8, 2008 22:46 UTC (Sat) by eru (subscriber, #2753) [Link] (2 responses)

> The result was that code sequences that use the x86 string instructions are slightly smaller and faster with gcc 4.3.0.

The eliminated instruction is one byte long, executes very quickly, and string instructions are not very common in most real code anyway. When they occur, they are heavyweight operations, because the sources and count have to be set up into particular registers, and the string instruction itself usually takes much more time than simple instructions. Whether or not the direction flag instruction appears might then change the time of the string operation by perhaps 1% or less. So except for contrived programs that consist almost entirely of these string operations, I suspect it is impossible to measure any execution time reduction in actual programs that could be attributed to this compiler change.

Of course removing a redundant instruction is aesthetically the right thing to do, but in this case I think it does not have practical benefits.

Was it worth the trouble?

Posted Mar 9, 2008 20:07 UTC (Sun) by vonbrand (subscriber, #4458) [Link]

Was it worth the trouble fixing this in the kernel? Definitely. The kernel must do "the right thing", without regard to any idiocy committed by the programs it is running. Not doing so might open vulnerabilities.
Was it really worth it in GCC? Not so sure... but the compiler should enforce the relevant ABIs (and is also entitled to assume they are being followed).

Was it worth the trouble?

Posted Mar 19, 2008 9:38 UTC (Wed) by pharm (guest, #22305) [Link]

> The eliminated instruction is one byte long, executes very quickly, and string instructions
> are not very common in most real code anyway. When they occur, they are heavyweight
> operations, because the sources and count have to be set up into particular registers, and the
> string instruction itself usually takes much more time than simple instructions. Whether or
> not the direction flag instruction appears might then change the time of the string operation
> by perhaps 1% or less. So except for contrived programs that consist almost entirely of these
> string operations, I suspect it is impossible to measure any execution time reduction in
> actual programs that could be attributed to this compiler change.

Unfortunately, you're wrong. CLD can have a latency of 50+ cycles on some x86 implementations:
that's not an insignificant amount. Plus we're not just talking about "string operations",
we're talking about functions like memset() & memcpy() too, which often use them.

See: http://gcc.gnu.org/ml/gcc/2008-03/msg00360.html for some benchmarks
and http://gcc.gnu.org/ml/gcc/2008-03/msg00404.html for a link to a document which gives a
latency of 52 cycles for CLD.
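
A back-of-envelope way to compare the two estimates, as a sketch only: the 52-cycle figure is the one cited above, while the 1-cycle cld, the 200-cycle string operation and the 50-cycle short memset() are assumed numbers chosen purely for illustration, and `cld_share` is a hypothetical helper.

```python
def cld_share(cld_cycles, string_op_cycles):
    """Fraction of the whole cld + string-op sequence that is spent in cld."""
    return cld_cycles / (cld_cycles + string_op_cycles)


# A cheap cld in front of a heavyweight string operation (both figures assumed).
cheap = cld_share(cld_cycles=1, string_op_cycles=200)

# The cited 52-cycle cld in front of a short memset() (50 cycles is assumed).
costly = cld_share(cld_cycles=52, string_op_cycles=50)

print(f"cheap cld: {cheap:.1%} of the sequence")
print(f"52-cycle cld: {costly:.1%} of the sequence")
```

Under the first set of assumptions cld is well under 1% of the sequence; under the second it is about half, which is why the answer depends so heavily on the particular CPU and on how short the copies are.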

GCC 4.3.0 exposes a kernel bug

Posted Mar 14, 2008 20:52 UTC (Fri) by giraffedata (guest, #1954) [Link] (6 responses)

I don't think what people knew and/or concealed is relevant, but the fact that the behavior has existed for 15 years and exists in countless systems today matters a lot. 15 years of practice is a much stronger standard than any prescriptive document. I say the standard is that DF's value is undefined at entry to a function, and Gcc 4.3.0 fails to conform.

This is a classic dilemma. You can make Gcc right or you can make it work.

If you offered both versions to the public, very few would opt for the "right" one. That's not the last word, of course. I'm sure some people believe the Gcc project has higher goals than giving its users what they want.

But traditionally, prescriptive standards nearly always bow to what actual practice demands.

GCC 4.3.0 exposes a kernel bug

Posted Mar 14, 2008 21:21 UTC (Fri) by zlynx (guest, #2285) [Link] (4 responses)

Claiming that it's standard "because GCC does it that way" completely ignores all the other
compilers that do *not* do it that way.

GCC 4.3.0 exposes a kernel bug

Posted Mar 14, 2008 22:26 UTC (Fri) by giraffedata (guest, #1954) [Link] (3 responses)

> Claiming that it's standard "because GCC does it that way" completely ignores all the other compilers that do *not* do it that way.

I think you got it backward. I claim it's standard because Linux does it that way. Linux is what violates the prescribed standard.

I also didn't state the de facto standard as precisely as I could have, because Linux clearly should change to clear the DF flag. But Gcc should continue to clear it too, because old Linux exists.

GCC 4.3.0 exposes a kernel bug

Posted Mar 14, 2008 22:42 UTC (Fri) by zlynx (guest, #2285) [Link] (2 responses)

> But Gcc should continue to clear it too, because old Linux exists.

This does not buy you anything except slowing down all your code unnecessarily.  Any user
might use a binary built with some other compiler, like the precompiled commercial MySQL
server, or a game.  Software running through Wine is probably built with Visual Studio.  A JIT
like Mono or Java might generate code that doesn't reset DF.  A developer might be using TCC
for ultra-fast compiles.  There is also LLVM: I don't know, but it might not do the DF clear
either.

See what I mean about other compilers?  Do you wish to have every one of them also clear DF on
every function?

GCC 4.3.0 exposes a kernel bug

Posted Mar 15, 2008 0:07 UTC (Sat) by nix (subscriber, #2304) [Link]

ICC has apparently never cleared DF. I guess nobody's ever tried compiling 
programs that make heavy use of asynchronous signal handlers with ICC on 
Linux...

GCC 4.3.0 exposes a kernel bug

Posted Mar 15, 2008 2:23 UTC (Sat) by giraffedata (guest, #1954) [Link]

OK, I see your point.

the standard

Posted Mar 21, 2008 11:29 UTC (Fri) by gvy (guest, #11981) [Link]

> 15 years of practice is a much stronger standard
> than any prescriptive document.
...over at sco dotcom. :)

Well, IMHO trying to follow standards in a way which creates artificial and hard to debug
problems for the rest of the crowd *is* ignorance too.

GCC 4.3.0 exposes a kernel bug

Posted Mar 24, 2008 12:52 UTC (Mon) by olecom (guest, #42886) [Link]

> It is hard to see how that could be turned into a security breach,
> but it would be a mistake to assume that it can't. Other kernel bugs,
> like the one that allowed the recent vmsplice() exploit, have looked
> like memory corruption, but were found to be more than that.

| After a bit more poking around, we discovered how to alter the page
| mappings so that sections of kernel and I/O memory were directly mapped
| into all user address spaces.[2]

[2] Talk about security holes!
(C) 1992 http://valhenson.org/synthesis/SynthesisOS/ch7.html

Not checking userspace-supplied pointers is the most basic security hole in
userspace + kernel memory based systems.