Was it worth the trouble?
Posted Mar 8, 2008 22:46 UTC (Sat) by eru (subscriber, #2753)
In reply to: GCC 4.3.0 exposes a kernel bug by JoeBuck
Parent article: GCC 4.3.0 exposes a kernel bug

<i>The result was that code sequences that use the x86 string instructions are slightly smaller and faster with gcc 4.3.0.</i>

The eliminated instruction is one byte long, executes very quickly, and string instructions are not very common in most real code anyway. When they occur, they are heavyweight operations, because the sources and count have to be set up into particular registers, and the string instruction itself usually takes much more time than simple instructions. Whether or not the direction flag instruction appears might then change the time of the string operation by perhaps 1% or less.

So except for contrived programs that consist almost entirely of these string operations, I suspect it is impossible to measure any execution time reduction in actual programs that could be attributed to this compiler change. Of course removing a redundant instruction is aesthetically the right thing to do, but in this case I think it does not have practical benefits.
Was it worth the trouble? Posted Mar 9, 2008 20:07 UTC (Sun) by vonbrand (subscriber, #4458)
Was it worth the trouble? Posted Mar 19, 2008 9:38 UTC (Wed) by pharm (guest, #22305)

<i>The eliminated instruction is one byte long, executes very quickly, and string instructions are not very common in most real code anyway. When they occur, they are heavyweight operations, because the sources and count have to be set up into particular registers, and the string instruction itself usually takes much more time than simple instructions. Whether or not the direction flag instruction appears might then change the time of the string operation by perhaps 1% or less. So except for contrived programs that consist almost entirely of these string operations, I suspect it is impossible to measure any execution time reduction in actual programs that could be attributed to this compiler change.</i>

Unfortunately, you're wrong. CLD can have a latency of 50+ cycles on some x86 implementations: that's not an insignificant amount. Plus we're not just talking about "string operations", we're talking about functions like memset() & memcpy() too, which often use them.

See: http://gcc.gnu.org/ml/gcc/2008-03/msg00360.html for some benchmarks and http://gcc.gnu.org/ml/gcc/2008-03/msg00404.html for a link to a document which gives a latency of 52 cycles for CLD.
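
To put the two estimates side by side, here is a rough back-of-the-envelope sketch in Python. It assumes the 52-cycle CLD latency quoted above is paid in full before every string operation, which is a simplification (an out-of-order core may hide part of it), and the string operation costs in the loop are hypothetical values chosen only for illustration, not measurements.

```python
def cld_share(cld_cycles: float, string_op_cycles: float) -> float:
    """Fraction of total time spent in CLD when it precedes one string operation.

    Assumes the CLD latency is fully additive (not overlapped with other work).
    """
    return cld_cycles / (cld_cycles + string_op_cycles)


def break_even_cycles(cld_cycles: float, share: float) -> float:
    """Cycles the string operation must take for CLD to be exactly `share` of the total."""
    return cld_cycles * (1.0 - share) / share


CLD_LATENCY = 52  # cycles, the figure cited in the comment above

# Hypothetical string operation costs, for illustration only.
for op_cycles in (50, 500, 5000):
    share = cld_share(CLD_LATENCY, op_cycles)
    print(f"string op of {op_cycles:>4} cycles: CLD is {share:.1%} of the total")

# How long must the string operation run before CLD drops to 1% of the total?
print(f"1% threshold: about {break_even_cycles(CLD_LATENCY, 0.01):.0f} cycles")
```

Under these assumptions, the "1% or less" estimate only holds when the string operation itself runs for about 5,148 cycles or more. For a short memset() or memcpy() call, the CLD would dominate instead.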
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds