strlcpy()

strlcpy()

Posted May 29, 2003 20:27 UTC (Thu) by nas (subscriber, #17)
Parent article: strlcpy()

I wrote a public domain version of strlcpy since the BSD version is licensed with the annoying advertising clause.



better code

Posted May 29, 2003 22:20 UTC (Thu) by ncm (subscriber, #165) [Link]

I have posted a better implementation, also public domain.

Anyway I think it's better. You decide.

better code

Posted May 30, 2003 4:15 UTC (Fri) by tjc (guest, #137) [Link]

I timed both your implementation and Linus's through a 4-billion-iteration loop, and Linus has you by about 10 to 15 percent. But then he's using memcpy(), so there's an issue with overlapping source and destination strings.

better code

Posted May 30, 2003 19:36 UTC (Fri) by tjc (guest, #137) [Link]

OK, here's my implementation:

size_t strlcpy(char *dest, const char *src, size_t n)
{
        int len, i;

        if (!n)
                return 0;

        len = strlen(src);
        if (len >= n)
                len = n - 1;

        /* check for overlapping source and destination */
        if ((src < dest && src + len >= dest)
                || (dest < src && dest + n > src)
                || src == dest)
        {
                size_t i;

                for (i = 0; src[i] && i < n - 1; i++)
                        dest[i] = src[i];

                dest[i] = (char) 0;
        }
        else
        {
                memcpy(dest, src, len);
                dest[len] = (char) 0;
        }

        return len;
}

I haven't tested this extensively, but I've found that the best way to debug code is to post it on the internet. ;-) People come out of the woodwork...

better code

Posted May 31, 2003 1:39 UTC (Sat) by dododge (subscriber, #2870) [Link]


> /* check for overlapping source and destination */
> if ((src < dest && src + len >= dest)

Pointer comparison is undefined if the pointers are not within the same object, so this is not portable standard C.

memmove is safe for overlapping regions, and since it's part of the standard library it's allowed to use architecture-specific magic to compare arbitrary pointers. It can also make use of optimized machine code, so it's potentially more efficient than any implementation written in standard C.

> but I've found that the best way to debug code is to post it on the internet.

I haven't looked closely at the rest of the function. If you want it thoroughly picked apart you could try posting it to USENET comp.lang.c [insert evil laugh] :-)
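
The memmove suggestion above can be sketched roughly as follows. This is only an illustration, not the code of anyone in this thread, nor Linus's or OpenBSD's; the name strlcpy_sketch is made up here, and it assumes the usual strlcpy contract (always terminate when n is nonzero, return the length of src so the caller can detect truncation).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; a sketch under the assumptions stated above. */
size_t strlcpy_sketch(char *dest, const char *src, size_t n)
{
        /* Measure the source first, before dest is modified. */
        size_t srclen = strlen(src);

        if (n != 0) {
                /* Copy at most n - 1 bytes, leaving room for the terminator. */
                size_t len = (srclen >= n) ? n - 1 : srclen;

                /* memmove tolerates overlap, so no pointer comparison is needed. */
                memmove(dest, src, len);
                dest[len] = '\0';
        }

        /* A return value >= n tells the caller the copy was truncated. */
        return srclen;
}
```

Whether memmove is as fast as memcpy here depends on the C library and compiler, so the trade-off would need measuring on the target system.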

better code

Posted May 31, 2003 6:04 UTC (Sat) by tjc (guest, #137) [Link]

> Pointer comparison is undefined if the pointers are not within the same object, so this is not portable standard C.

I've heard of this, but I have never read a good explanation. I just assumed that this restriction has something to do with pointer aliasing. If you understand this, now is the time to show off! ;-)

> memmove is safe for overlapping regions, and since it's part of the standard library it's allowed to use architecture-specific magic to compare arbitrary pointers.

How is the performance of memmove()? Does it copy memory a double word at a time? I couldn't find a general way to do this without using architecture-specific magic as you say, so I fell back to memcpy().

BTW, s/int len, i/size_t len/ above. I noticed this about 100ns after I posted. Always use -Wall...

better code

Posted Jun 2, 2003 6:13 UTC (Mon) by eru (subscriber, #2753) [Link]

>> Pointer comparison is undefined if the pointers are not within the same
>> object, so this is not portable standard C.
>
> I've heard of this, but I have never read a good explanation. I just
> assumed that this restriction has something to do with pointer aliasing. If
> you understand this, now is the time to show off! ;-)

I always assumed the restriction in the C standard exists mainly because
in segmented memory management, the numeric values of pointers do not
necessarily correspond to their relative arrangement in memory.
Comparison is meaningful only for pointers that have the same segment
part. Since Linux does not use segmentation, at least not in a
user-visible way, there is no need to worry about this. (Few operating
systems use segments these days, but I happen to work with one that does,
even though it runs on the 32-bit versions of x86. Yes, 48-bit pointers!)
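
For a flat address space like the one described above, one commonly cited workaround is to convert both pointers to uintptr_t and compare the integers. The sketch below assumes a C99 compiler that provides uintptr_t (it is an optional type), and the helper name ranges_overlap_flat is hypothetical. The conversion is implementation-defined rather than undefined, but the standard does not promise that the integer order reflects the layout in memory, which is exactly the segmented-memory caveat.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: reports whether the byte ranges [a, a + alen)
 * and [b, b + blen) overlap. ASSUMPTION: a flat address space in which
 * the integer value of a pointer tracks its position in memory.
 */
int ranges_overlap_flat(const void *a, size_t alen,
                        const void *b, size_t blen)
{
        uintptr_t pa = (uintptr_t)a;
        uintptr_t pb = (uintptr_t)b;

        /* Compare distances instead of end addresses to avoid wraparound. */
        if (pa < pb)
                return (pb - pa) < alen;
        return (pa - pb) < blen;
}
```

On a segmented system this could give wrong answers, so it is no substitute for memmove in portable code.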

better code

Posted Jun 4, 2003 3:39 UTC (Wed) by dododge (subscriber, #2870) [Link]

>> Pointer comparison is undefined if the pointers are not within the same object,
> I've heard of this, but I have never read a good explanation. I just assumed that this restriction has something to do with pointer aliasing.

Well, it's undefined because the standard explicitly says so :-). As to why the standard says so, there is presumably some architecture out there that C works on which cannot reliably support comparing arbitrary pointers; or allowing this comparison might make it too difficult to implement C on certain architectures. The most obvious reason for allowing this would be to implement memmove, which the standard already provides.

Portability discussions come up in comp.lang.c fairly often, and someone occasionally chimes in with an example of a real architecture they deal with where common-sense assumptions about computer architecture don't hold true. There are a lot of weird designs out there, and when you start talking about embedded devices they may even have a larger installed base than anything x86-derived. The worst case is the "DeathStation 9000", a hypothetical machine where even the most subtle undefined behavior produces catastrophic results.

> How is the performance of memmove()? Does it copy memory a double word at a time?

Depends on your C library, compiler, operating system, chip architecture, etc. You'll have to examine your libc source to find out how it's done for your system. And you'll also have to check your compiler output to make sure it actually calls the libc implementation. For example gcc 2.95.3 on sparc-sun-solaris produces inline assembly for small memcpy operations rather than actually calling into libc.

> Always use -Wall...

I'm rather fond of -ansi -pedantic -Wall -W myself :-)

better code

Posted May 31, 2003 20:36 UTC (Sat) by fjord (guest, #6510) [Link]

Hmm, I haven't actually read the original documentation for the strl* functions, but according to this:

http://sources.redhat.com/ml/libc-alpha/2000-08/msg00110.html

strlcpy should return the size of the source string and do nothing if the buffer is too small.

Off by one?

Posted May 29, 2003 22:21 UTC (Thu) by raph (guest, #326) [Link]

I'm pretty sure you have an off-by-one error there - your code can write up to (size+1) bytes of dst, while from the paper it looks like the correct semantics are to write up to size bytes only - so that strlcat(buf, src, sizeof(buf)) is safe.
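
The "at most size bytes, terminator included" rule is what makes the strlcat(buf, src, sizeof(buf)) idiom safe. Here is a minimal sketch of an append written to that rule, assuming the semantics described in the strlcpy/strlcat paper as I read them; strlcat_sketch is a made-up name, not the OpenBSD code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name; never writes past dst[size - 1]. */
size_t strlcat_sketch(char *dst, const char *src, size_t size)
{
        size_t dlen = 0;
        size_t slen = strlen(src);
        size_t room, n;

        /* Find the end of dst, but never look past size bytes. */
        while (dlen < size && dst[dlen] != '\0')
                dlen++;

        /* dst was not terminated within size bytes: write nothing. */
        if (dlen == size)
                return size + slen;

        /* Bytes available for new characters, excluding the terminator. */
        room = size - dlen - 1;
        n = (slen < room) ? slen : room;

        memcpy(dst + dlen, src, n);
        dst[dlen + n] = '\0';

        /* Length the result would have had with unlimited space. */
        return dlen + slen;
}
```

A return value greater than or equal to size means the result was truncated, so the caller can check for that with a single comparison.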

Off by one?

Posted May 30, 2003 4:40 UTC (Fri) by ncm (subscriber, #165) [Link]

I agree, Neil's is buggy. Walk through it with size == 1. It clobbers one beyond the end of the input array.

I wonder if we should bother about what to do if size == 0. Mine crashes spectacularly, which is a Good Thing.

Getting within 15% of memcpy is pretty damn good, in my estimation. Of course I didn't read Linus's version, or OpenBSD's; that would be cheating, and I would be tainted besides. Of course now that I have been told, via cleanroom methods, I can adjust mine to be equally fast, and maybe (one can hope) actually identical to both Linus's and OpenBSD's.

Off by one?

Posted May 30, 2003 16:28 UTC (Fri) by tjc (guest, #137) [Link]

> Getting within 15% of memcpy is pretty damn good, in my estimation. Of course I didn't read Linus's version, or OpenBSD's; that would be cheating, and I would be tainted besides. Of course now that I have been told, via cleanroom methods, I can adjust mine to be equally fast, and maybe (one can hope) actually identical to both Linus's and OpenBSD's.

You're probably going to have to copy more than one char at a time to match memcpy() for speed.

strlcpy()

Posted Jun 5, 2003 14:32 UTC (Thu) by djm (subscriber, #11651) [Link]

Rubbish - the OpenBSD strlcpy has NO advertising clause. It is licensed under an ISC license, which is about as liberal as you can get.

It is licensed that way so people don't have to make stupid errors when reinventing the wheel.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds