[PATCH] fast AND correct strncpy

From:  Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
To:  bk-commits-head@vger.kernel.org
Subject:  [PATCH] fast AND correct strncpy
Date:  Sun, 17 Aug 2003 15:48:39 +0000

ChangeSet 1.1207, 2003/08/17 08:48:39-07:00, albert@users.sourceforge.net

	[PATCH] fast AND correct strncpy
	
	This is Erik Andersen's excellent strncpy.
	It works like magic. That "if" isn't a jump;
	gcc uses a few integer instructions to wipe
	out all jumps except for the loop itself and
	the function call/return.
	
	This has been exhaustively tested against glibc.
	
	The existing code has 5 extra branches and
	is over twice as large. (my gcc, etc.)


# This patch includes the following deltas:
#	           ChangeSet	1.1206  -> 1.1207 
#	        lib/string.c	1.11    -> 1.12   
#

 string.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)


diff -Nru a/lib/string.c b/lib/string.c
--- a/lib/string.c	Sun Aug 17 09:06:29 2003
+++ b/lib/string.c	Sun Aug 17 09:06:29 2003
@@ -87,13 +87,12 @@
 {
 	char *tmp = dest;
 
-	while (count && (*dest++ = *src++) != '\0')
-		count--;
-	while (count > 1) {
-		*dest++ = 0;
+	while (count) {
+		if ((*tmp = *src) != 0) src++;
+		tmp++;
 		count--;
 	}
-	return tmp;
+	return dest;
 }
 #endif
 



"fast AND correct strncpy" not working by itself?

Posted Aug 24, 2003 13:03 UTC (Sun) by dwheeler (subscriber, #1216) [Link]

When I slip this code into its own file and compile
it separately I don't seem to get this magic
"no jump statement" behavior. Any explanation?

I slipped this text into a file (adding headers) and used:
gcc -O3 -S strncpy.c
To see the assembly.

I still get a "jump" corresponding to the "if" statement.
Here's the relevant snippet for the loop. Notice that there's both a
"je" and "jne" instruction; I expected to only see the "jne", and
that the "je" was going to be automagically removed:

.L6:
	movb	(%ebx), %al
	testb	%al, %al
	movb	%al, (%edx)
	je	.L5
	incl	%ebx
.L5:
	incl	%edx
	decl	%ecx
	jne	.L6


This is using gcc (GCC) 3.2.2 20030222 (Red Hat Linux 3.2.2-5).
Is this later gcc not optimizing this as well? Does the Linux kernel
use additional compilation option tricks that make this work
(if so, it'd be good to know what they are)? Or, is there
a different assumption of what this "magic" is?

Thanks.
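
For readers who want to repeat this experiment, here is a minimal sketch of a standalone file. It assumes only the loop bodies from the diff above. The names strncpy_old, strncpy_new, strncpy_arith and count_mismatches are made up for this example (they avoid clashing with libc) and are not kernel identifiers. strncpy_arith spells out one way the "if" can be expressed as integer arithmetic; it is an illustration, not a claim about what any particular gcc emits.

```c
/* Hypothetical standalone test file, not kernel code. */
#include <assert.h>   /* for callers that assert on count_mismatches() */
#include <stddef.h>
#include <string.h>

enum { BUF = 16 };

/* The loop removed by the patch (the "-" lines of the diff). */
char *strncpy_old(char *dest, const char *src, size_t count)
{
    char *tmp = dest;

    while (count && (*dest++ = *src++) != '\0')
        count--;
    while (count > 1) {
        *dest++ = 0;
        count--;
    }
    return tmp;
}

/* The loop added by the patch (the "+" lines of the diff). */
char *strncpy_new(char *dest, const char *src, size_t count)
{
    char *tmp = dest;

    while (count) {
        if ((*tmp = *src) != 0)
            src++;
        tmp++;
        count--;
    }
    return dest;
}

/* Same loop with the "if" written as arithmetic: src advances by 0 or 1. */
char *strncpy_arith(char *dest, const char *src, size_t count)
{
    char *tmp = dest;

    while (count--) {
        char c = *src;

        *tmp++ = c;
        src += (c != 0);
    }
    return dest;
}

/* Number of cases in which a variant's output buffer differs from libc's. */
int count_mismatches(void)
{
    static const char *const srcs[] = { "", "a", "abc", "hello, world" };
    int bad = 0;

    for (size_t i = 0; i < sizeof(srcs) / sizeof(srcs[0]); i++) {
        for (size_t n = 0; n <= BUF; n++) {
            char want[BUF], a[BUF], b[BUF], c[BUF];

            /* Pre-fill so bytes written past n would be detected. */
            memset(want, 'x', BUF);
            memset(a, 'x', BUF);
            memset(b, 'x', BUF);
            memset(c, 'x', BUF);
            strncpy(want, srcs[i], n);
            strncpy_old(a, srcs[i], n);
            strncpy_new(b, srcs[i], n);
            strncpy_arith(c, srcs[i], n);
            bad += memcmp(want, a, BUF) != 0;
            bad += memcmp(want, b, BUF) != 0;
            bad += memcmp(want, c, BUF) != 0;
        }
    }
    return bad;
}
```

Compiling this file with "gcc -O3 -S" and comparing the loops generated for strncpy_new and strncpy_arith is one way to see whether a given compiler and -march setting removes the inner jump.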

"fast AND correct strncpy" not working by itself?

Posted Aug 24, 2003 14:50 UTC (Sun) by dmantione (guest, #4640) [Link]

Try "gcc -O2 -march=athlon -S strncpy.c".
It seems the new version is not an improvement in any case.

Daniël

"fast AND correct strncpy" not working by itself?

Posted Aug 24, 2003 16:11 UTC (Sun) by IkeTo (subscriber, #2122) [Link]

> It seems the new version is not an improvement in any case.

Interesting, but real. I've tested it, and the old version is consistently faster than the old version, whether count is smaller or larger than the source string length.

"fast AND correct strncpy" not working by itself?

Posted Aug 25, 2003 2:41 UTC (Mon) by IkeTo (subscriber, #2122) [Link]

I mean, the old version is consistently faster than the new one. It is in the range of 20% faster.
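
The measurement setup is not given here, so the following is only a sketch of how such a comparison could be timed. The helper time_copy and the type copy_fn are hypothetical names, the buffer size of 256 is an arbitrary assumption, and clock() is coarse, so the iteration count needs to be large before differences of around 20% become visible.

```c
/* Hypothetical timing harness; not the one used for the numbers above. */
#include <assert.h>   /* for callers that want to assert on results */
#include <stddef.h>
#include <string.h>
#include <time.h>

typedef char *(*copy_fn)(char *, const char *, size_t);

/*
 * CPU seconds spent in `iters` calls of fn(buf, src, count).
 * count must not exceed 256, the size of the internal buffer.
 */
double time_copy(copy_fn fn, const char *src, size_t count, long iters)
{
    static char buf[256];
    volatile char sink = 0;   /* read buf each round so calls are not dropped */
    clock_t start = clock();

    for (long i = 0; i < iters; i++) {
        fn(buf, src, count);
        sink ^= buf[0];
    }
    (void)sink;
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Passing the old and the new loop (each wrapped in a function with the strncpy signature) with the same src, count and iters gives two numbers to compare; trying counts both below and above strlen(src) covers the two cases mentioned above.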

"fast AND correct strncpy" not working by itself?

Posted Aug 28, 2003 9:30 UTC (Thu) by akukula (guest, #3862) [Link]

>It seems the new version is not an improvement in any case.

I agree. The same result can be obtained with a different representation of this function, e.g.:
char* strncpy(char* dest, const char* src, unsigned count)
{
  char *tmp = dest;
  for (; count; --count)  if ((*tmp++ = *src))  ++src;
  return dest;
}
It's hard to imagine this sort of function being more efficient...

I'm not a kernel hacker, but why do none of those functions terminate 'dest' if strlen(src) >= count ???

And in case SCO is playing around, the above code is (C) 2003 Free Software Foundation, written from scratch by me, licensed under GPLv2. SCO is not permitted to use this code unless it makes a donation of $1,000,000 (one million US dollars) to the FSF :)))

"fast AND correct strncpy" not working by itself?

Posted Aug 28, 2003 13:08 UTC (Thu) by IkeTo (subscriber, #2122) [Link]

> I'm not a kernel hacker, but why do none of those functions
> terminate 'dest' if strlen(src) >= count ???

You don't need to be a kernel hacker to know that strncpy is defined that way by the C standard and POSIX. In contrast, strncat does terminate the resulting string in such cases.
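
A small sketch of the difference, using only the standard strncpy and strncat. The helpers demo_strncpy and demo_strncat and the 8-byte buffer are assumptions made for this illustration; the '#' fill just makes untouched bytes visible.

```c
#include <assert.h>   /* for callers that assert on the buffer contents */
#include <string.h>

/* Fill an 8-byte buffer with '#', then strncpy at most n (<= 8) bytes. */
void demo_strncpy(char buf[8], const char *src, size_t n)
{
    memset(buf, '#', 8);
    strncpy(buf, src, n);
}

/* Same, but append to an empty string with strncat (n <= 7). */
void demo_strncat(char buf[8], const char *src, size_t n)
{
    memset(buf, '#', 8);
    buf[0] = '\0';            /* strncat needs a terminated destination */
    strncat(buf, src, n);
}
```

With src "abcdef" and n = 4, demo_strncpy leaves the bytes "abcd####" (no terminator), while demo_strncat leaves "abcd" followed by a '\0' and then "###". With src "ab" and n = 4, demo_strncpy writes "ab" followed by two '\0' bytes, which is the padding required by the definition.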

"fast AND correct strncpy" not working by itself?

Posted Aug 28, 2003 15:44 UTC (Thu) by dwheeler (subscriber, #1216) [Link]

Exactly right. strncpy _must_, by spec, \0 out the rest of the string. Which means that, in most code, you don't really want strncpy(); it wastes time zero'ing out bytes you probably don't care about.

If you want limited-length string operations, I suggest using the strlcpy() / strlcat() functions, which aren't in the official C standard but ARE widely available.
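
For systems whose libc lacks strlcpy(), its documented behaviour can be sketched in a few lines. This is an illustration under the name sketch_strlcpy (a made-up name, so that it cannot clash with a real strlcpy), not the OpenBSD implementation: copy at most size - 1 bytes, always terminate when size is nonzero, never pad, and return strlen(src) so the caller can detect truncation.

```c
#include <assert.h>   /* for callers that assert on the result */
#include <stddef.h>
#include <string.h>

/* Sketch of strlcpy semantics; `size` is the full size of dst. */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size != 0) {
        /* Leave room for the terminator. */
        size_t n = len < size - 1 ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

A return value greater than or equal to size means the copy was truncated; unlike strncpy(), nothing beyond the terminator is written.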

"fast AND correct strncpy" not working by itself?

Posted Aug 28, 2003 20:09 UTC (Thu) by akukula (guest, #3862) [Link]

Thanks for the suggestions. Indeed, I found a table comparing the performance of the different 'str*cpy()' functions. Worth noting!
http://www.courtesan.com/todd/papers/strlcpy.html

> it wastes time zero'ing out bytes you probably don't care about.
Maybe someone can then explain why the heck it is in the kernel, which I presume should use the most efficient algorithms?

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds