strlcpy()
As is often the case with this sort of security-related improvement, OpenBSD got there first. In fact, back in 1996, the OpenBSD team came up with a new string API which avoids the problems of both strcpy() and strncpy(). The resulting functions, with names like strlcpy(), have been spreading beyond OpenBSD. The basic function is simple:
size_t strlcpy(char *dest, const char *src, size_t size);
The source string is copied to the destination (at most size - 1 bytes) and properly terminated; the return value is the length of the source. If that length is greater than or equal to the size of the destination buffer, the caller knows that the string has been truncated.
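A minimal sketch of how a caller might use the return value to detect truncation. Because strlcpy() is not available in every C library, the example carries its own stand-in: `sketch_strlcpy()` and `copy_name()` are hypothetical names invented for illustration, not the OpenBSD or kernel code, and the stand-in assumes the two buffers do not overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy() semantics: copy at most size - 1
 * bytes, NUL-terminate whenever size > 0, and return strlen(src). */
size_t sketch_strlcpy(char *dest, const char *src, size_t size)
{
    size_t len;

    assert(src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;
        memcpy(dest, src, n);   /* assumes non-overlapping buffers */
        dest[n] = '\0';
    }
    return len;
}

/* Returns 0 on success, -1 if src did not fit into dest. */
int copy_name(char *dest, size_t size, const char *src)
{
    /* A return value >= size means the copy was truncated. */
    return sketch_strlcpy(dest, src, size) >= size ? -1 : 0;
}
```

The caller never has to call strlen() separately: one comparison against the buffer size is enough to decide whether to report an error or retry with a larger buffer.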
Linus agreed that following OpenBSD's lead was the right way forward, and
strlcpy() is in his BitKeeper repository, waiting for 2.5.71.
There has also been a flurry of activity to convert kernel code over to the
new function. By the time 2.6.0 comes out, strncpy() may no
longer have a place in the Linux kernel.
Posted May 29, 2003 16:26 UTC (Thu)
by eh (guest, #266)
[Link] (6 responses)
Posted May 29, 2003 17:05 UTC (Thu)
by cpeterso (guest, #305)
[Link] (5 responses)
I think glibc should go a step further and actually remove the dangerous functions like strcpy(). There are safe replacements. glibc should use a carrot and stick approach. Offer the safe replacements and remove (or just deprecate) the dangerous functions.
Posted May 29, 2003 17:40 UTC (Thu)
by Ross (guest, #4065)
[Link]
Feel free to use strlcpy() exclusively yourself, but don't tell everyone
Posted May 29, 2003 19:54 UTC (Thu)
by dwheeler (guest, #1216)
[Link] (3 responses)
However, it is NOT in glibc, because Ulrich Drepper doesn't want it
in there. You can see his rationale for this in the glibc
mailing list.
I don't agree with his decision.
One of the biggest security problems is STILL buffer overflows, and
strlcpy()/strlcat() can really help reduce their incidence.
And since it's not in glibc, everyone has to "roll their own"
implementation (which may not be as efficient as it would be if
it were in the standard library).
If a future C standard included strlcpy() in the standard C library,
then I believe glibc would add it too.
Thus, if you want strlcpy() available everywhere,
it might be best to appear to the ISO C group to add strlcpy()
and strlcat() to the C standard library.
Posted May 29, 2003 23:14 UTC (Thu)
by eh (guest, #266)
[Link] (2 responses)
What was his rationale?
> If a future C standard included strlcpy() in the standard C library,
I certainly hope you're right about that.
> [...] it might be best to appear to the ISO C group to add strlcpy()
Yes, provided s/\(appea\)r/\1l/ ;)
Posted May 30, 2003 4:52 UTC (Fri)
by rloomans (guest, #759)
[Link] (1 responses)
> What was his rationale?
The thread starts here.
Christoph Hellwig posted a patch to implement strlcat() and strlcpy(). Ulrich Drepper replies scathingly...
Posted May 30, 2003 14:59 UTC (Fri)
by eh (guest, #266)
[Link]
Posted May 29, 2003 20:27 UTC (Thu)
by nas (subscriber, #17)
[Link] (12 responses)
Posted May 29, 2003 22:20 UTC (Thu)
by ncm (guest, #165)
[Link] (7 responses)
Anyway I think it's better. You decide.
Posted May 30, 2003 4:15 UTC (Fri)
by tjc (guest, #137)
[Link]
Posted May 30, 2003 19:36 UTC (Fri)
by tjc (guest, #137)
[Link] (5 responses)
I haven't tested this extensively, but I've found that the best way to debug code is to post it on the internet. ;-) People come out of the woodwork...
Posted May 31, 2003 1:39 UTC (Sat)
by dododge (guest, #2870)
[Link] (3 responses)
Pointer comparison is undefined if the pointers
are not within the same object, so this is not
portable standard C.
I haven't looked closely at the rest of the function. If you
want it thoroughly picked apart you could try posting it to
USENET comp.lang.c [insert evil laugh]
Posted May 31, 2003 6:04 UTC (Sat)
by tjc (guest, #137)
[Link] (2 responses)
I've heard of this, but I have never read a good explanation. I just assumed that this restriction has something to do with pointer aliasing. If you understand this, now is the time to show off! ;-)
> memmove is safe for overlapping regions, and since it's part of the standard library it's allowed to use architecture-specific magic to compare arbitrary pointers.
How is the performance of memmove()? Does it copy memory a double word at a time? I couldn't find a general way to do this without using architecture-specific magic, as you say, so I fell back to memcpy().
BTW, s/int len, i/size_t len/ above. I noticed this about 100ns after I posted. Always use -Wall...
Posted Jun 2, 2003 6:13 UTC (Mon)
by eru (subscriber, #2753)
[Link]
I always assumed the restriction in the C standard exists mainly because
Posted Jun 4, 2003 3:39 UTC (Wed)
by dododge (guest, #2870)
[Link]
Well, it's undefined because the standard explicitly says so
Portability discussions come up in
Depends on your C library, compiler, operating
system, chip architecture, etc. You'll have to examine
your libc source to find out how it's done for your system. And
you'll also have to check your compiler output to make sure
it actually calls the libc implementation. For example gcc 2.95.3
on sparc-sun-solaris produces inline assembly for small
I'm rather fond of
Posted May 31, 2003 20:36 UTC (Sat)
by fjord (guest, #6510)
[Link]
http://sources.redhat.com/ml/libc-alpha/2000-08/msg00110.html strlcpy should return the size of the source string and do nothing, if the buffer is too small.
Posted May 29, 2003 22:21 UTC (Thu)
by raph (guest, #326)
[Link] (2 responses)
Posted May 30, 2003 4:40 UTC (Fri)
by ncm (guest, #165)
[Link] (1 responses)
I wonder if we should bother about what to do if size == 0.
Mine crashes spectacularly, which is a Good Thing.
Getting within 15% of memcpy is pretty damn good, in my estimation.
Of course I didn't read Linus's version, or OpenBSD's; that would be
cheating, and I would be tainted besides. Of course now that I have
been told, via cleanroom methods, I can adjust mine to be equally
fast, and maybe (one can hope) actually identical to both Linus's and
OpenBSD's.
Posted May 30, 2003 16:28 UTC (Fri)
by tjc (guest, #137)
[Link]
You're probably going to have to copy more than one char at a time to match memcpy() for speed.
Posted Jun 5, 2003 14:32 UTC (Thu)
by djm (subscriber, #11651)
[Link]
It is licensed that way so people don't have to make stupid errors when reinventing the wheel.
Posted Jun 10, 2003 19:22 UTC (Tue)
by hogsberg (guest, #11751)
[Link]
Kristian
Thanks for noting this function; note also that *BSD has a strlcat().
Maybe the glibc folks should consider adding these two.
BTW, OpenBSD might have been first, but Free&NetBSD have them now too.
Once again LWN has called my attention to something I'm glad to know.
I agree that glibc should add these functions. *BSD has had them for years. I think even the Solaris libc has them.
There are many perfectly valid uses of strcpy(). In fact most uses
of strcpy() are probably correct. Removing that function would lead
to huge problems with backwards compatibility, cross-platform portability,
and standards compliance. It would be a very, very bad idea.
else what they can or can't use. The only function I have ever agreed
with removing is gets() because there is a nearly 100% chance of a bug
every time it is used.
Yes, strlcpy() is in the *BSDs, it's also in Sun Solaris.
It's also in the library Glib (NOT glibc), the basic library for
GTK+ and GNOME.
I grabbed libc-hacker-*.bz2 from
ftp://sources.redhat.com/pub/glibc/mail-archives/
and grep found neither strlcpy nor strlcat.
> then I believe glibc would add it too.
> and strlcat() to the C standard library.
Well, thanks for the link, but I almost wish I hadn't asked.
Now sickened early in the morning, waiting for beer o'clock.
> Ulrich Drepper replies scathingly...
... and inappropriately, even idiotically.
sentence 1: claims inefficiency, contradicting Usenix paper cited by
Hellwig, does not support claim.
sentence 2: claims strl*() lead to ``other errors'', again unsupported.
sentence 3: On the soapbox with disregard for real world and the problem
strl*() are meant to address.
Next, in reply to Hellwig's reply to his first reply he reveals the One True:
*((char *) mempcpy (dst, src, n)) = '\0';
He doesn't say whether n is sizeof(dst)-1, or strlen(src), but either way
this must be preceded by setup and error-checking code. The former
needs n >= strlen(src) or the copy is potentially truncated. So either way
previous error-prone code is necessary, a strlen() is necessary (he argued
efficiency), and you wind up with the elements necessary for plain old
strcpy() (stpcpy() if you're saving the return). So what does mempcpy()
do that's better than strlcpy()?
Next muddled paragraph suggests strl*() are buggy because ``If a string is
too long for an allocated memory block the copying must not simply
silently stop.'' He seems to have missed that his mempcpy() thingy will
``simply silently stop'' and requires advance knowledge of strlen(src)
whereas the return from strlcpy() eases truncation detection which
subsequent code can then handle.
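To make the comparison concrete, here is a rough sketch of the setup code the one-liner needs before it is safe. mempcpy() is a GNU extension that returns dst + n, so this sketch spells it with memcpy() plus pointer arithmetic to stay portable. The names `copy_n_and_terminate()` and `bounded_copy()` are invented for illustration and do not come from the glibc thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Portable spelling of the one-liner: memcpy() returns dst, so adding n
 * gives the same pointer that mempcpy() would return. */
void copy_n_and_terminate(char *dst, const char *src, size_t n)
{
    *((char *) memcpy(dst, src, n) + n) = '\0';
}

/* Hypothetical wrapper showing the work left to the caller: measure src,
 * clamp n to the buffer, then copy. Returns 1 if the copy was truncated. */
int bounded_copy(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);
    size_t n;

    assert(dstsize > 0);
    n = (len < dstsize) ? len : dstsize - 1;
    copy_n_and_terminate(dst, src, n);
    return len >= dstsize;
}
```

Once the strlen() call and the clamp are written out, the wrapper does the same job as strlcpy(), which is the point being argued above: the one-liner on its own neither bounds the copy nor reports truncation.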
I don't know why I bothered writing this. I'm annoyed now. There's a reason
I usually just lurk.
I don't read the list and don't know Ulrich Drepper's character. I only hope
he just had a bad day. That was 08/2000, maybe someone should bring
the subject up again. (Now someone will say it's been brought up again
and he maintains the same arguments, right?)
(To counteract the tone of this posting I want to say I really admire glibc and
its developers.)
I wrote a public domain version of strlcpy since the BSD version is licensed with the annoying advertising clause.
I have posted a better implementation, also public domain.
better code
I timed both your implementation and Linus's through a 4 billion iteration loop, and Linus has you by about 10 to 15 percent. But then he's using memcpy(), so there's an issue with overlapping source and destination strings.
OK, here's my implementation:
size_t strlcpy(char *dest, const char *src, size_t n)
{
    int len, i;

    if (!n)
        return 0;
    len = strlen(src);
    if (len >= n)
        len = n - 1;
    /* check for overlapping source and destination */
    if ((src < dest && src + len >= dest)
        || (dest < src && dest + n > src)
        || src == dest)
    {
        size_t i;
        for (i = 0; src[i] && i < n - 1; i++)
            dest[i] = src[i];
        dest[i] = (char) 0;
    }
    else
    {
        memcpy(dest, src, len);
        dest[len] = (char) 0;
    }
    return len;
}
/* check for overlapping source and destination */
if ((src < dest && src + len >= dest)
memmove is safe for overlapping regions, and since it's part of the
standard library it's allowed to use architecture-specific magic to
compare arbitrary pointers. It can also make use of optimized machine
code, so it's potentially more efficient than any implementation
written in standard C.
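As a rough sketch of that suggestion, a strlcpy()-style function can hand the overlap problem to memmove() and drop the pointer comparison entirely. The name `strlcpy_memmove()` is hypothetical, and the return value follows the OpenBSD convention described in the article (the length of the source), which differs from the implementation posted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant: copies at most size - 1 bytes, always terminates
 * when size > 0, returns strlen(src). No pointers are compared. */
size_t strlcpy_memmove(char *dest, const char *src, size_t size)
{
    size_t len;

    assert(src != NULL);
    len = strlen(src);          /* measured before dest is modified */
    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;
        memmove(dest, src, n);  /* defined even if the regions overlap */
        dest[n] = '\0';
    }
    return len;
}
```

Whether this is faster or slower than the memcpy() version depends on the C library and compiler, so it would need to be measured rather than assumed.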
but I've found that the best way to debug code is to post it on the internet.
:-)
Pointer comparison is undefined if the pointers are not within the same object, so this is not portable standard C.
>> Pointer comparison is undefined if the pointers are not within the same
>> object, so this is not portable standard C.
>
> I've heard of this, but I have never read a good explanation. I just
> assumed that this restriction has something to do with pointer aliasing. If
> you understand this, now is the time to show off! ;-)
in segmented memory management, the numeric values of pointers do not
necessarily correspond to their relative arrangement in memory.
Comparison is meaningful only for pointers that have the same segment
part. Since Linux does not use segmentation, at least not in a
user-visible way, there is no need to worry about this. (Few operating
systems use segments these days, but I happen to work with one that does,
even though it runs on the 32-bit versions of x86. Yes, 48-bit pointers!)
Pointer comparison is undefined if the pointers are not within the same object,
I've heard of this, but I have never read a good explanation. I just assumed that this restriction has something to do with pointer aliasing.
:-)
. As to why the standard says so,
there is presumably some architecture out there that C works on
which cannot reliably support comparing arbitrary pointers; or
allowing this comparison might make it too difficult to implement
C on certain architectures. The most obvious reason for allowing
this would be to implement memmove, which the standard
already provides.
comp.lang.c fairly often, and someone occasionally chimes in with an
example of a real architecture they deal with where common-sense
assumptions about computer architecture don't hold true.
There are a lot of weird designs out there, and when you start
talking about embedded devices they may even have a larger installed
base than anything x86-derived. The worst case is
the "DeathStation 9000", a hypothetical
machine where even the most subtle undefined
behavior produces catastrophic results.
How is the performance of memmove()? Does it copy memory a double word at a time?
memcpy operations rather than actually calling into libc.
Always use -Wall...
-ansi -pedantic -Wall -W myself :-)
Hmm, I haven't actually read the original documentation for the strl* functions, but according to this:
I'm pretty sure you have an off-by-one error there - your code can write up to (size+1) bytes of dst, while from the paper it looks like the correct semantics are to write up to size bytes only - so that strlcat(buf, src, sizeof(buf)) is safe.
I agree, Neil's is buggy. Walk through it with size == 1.
It clobbers one beyond the end of the input array.
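One way to check a claim like this is to walk through the size == 1 case mechanically with a guard byte placed just past the permitted region. The sketch below is illustrative only: the code under discussion is not reproduced in this thread, so `off_by_one_strlcpy()` is an invented example of the kind of bug described, and `candidate_strlcpy()` and `stays_in_bounds()` are likewise hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef size_t (*strlcpy_fn)(char *dest, const char *src, size_t size);

/* Candidate that leaves room for the terminator. */
size_t candidate_strlcpy(char *dest, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size > 0) {
        size_t n = (len >= size) ? size - 1 : len;
        memcpy(dest, src, n);
        dest[n] = '\0';
    }
    return len;
}

/* Invented buggy variant: copies up to size bytes, then writes the
 * terminator at dest[size], one past the permitted region. */
size_t off_by_one_strlcpy(char *dest, const char *src, size_t size)
{
    size_t len = strlen(src);
    size_t n = (len > size) ? size : len;

    memcpy(dest, src, n);
    dest[n] = '\0';
    return len;
}

/* Returns 1 if fn left the byte at index size untouched. The scratch
 * buffer is larger than size, so the overrun itself stays harmless. */
int stays_in_bounds(strlcpy_fn fn, const char *src, size_t size)
{
    char buf[64];

    assert(size < sizeof(buf));
    memset(buf, 0x7f, sizeof(buf));     /* fill with the guard value */
    fn(buf, src, size);
    return (unsigned char) buf[size] == 0x7f;
}
```

Calling `stays_in_bounds()` with a source longer than size exercises the truncation path, which is where this kind of error shows up.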
Off by one?
Getting within 15% of memcpy is pretty damn good, in my estimation. Of course I didn't read Linus's version, or OpenBSD's; that would be cheating, and I would be tainted besides. Of course now that I have been told, via cleanroom methods, I can adjust mine to be equally fast, and maybe (one can hope) actually identical to both Linus's and OpenBSD's.
Rubbish - the OpenBSD strlcpy has NO advertising clause. It is licensed under an ISC license, which is about as liberal as you can get.
Forgive me for nit-picking, but the correct term is NUL-terminated. NULL is the special pointer in C, NUL is the ASCII character with integer value 0 used for terminating strings.
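A tiny sketch of the distinction in code, with `is_empty_string()` as a hypothetical helper: NULL is compared against the pointer itself, NUL against the character it points to.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if s is the empty string, 0 otherwise. */
int is_empty_string(const char *s)
{
    assert(s != NULL);      /* NULL: the null pointer, no string at all */
    return s[0] == '\0';    /* NUL: the character with value 0 */
}
```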