The ups and downs of strlcpy()
The ups and downs of strlcpy()
Posted Jul 19, 2012 4:26 UTC (Thu) by wahern (subscriber, #37304)Parent article: The ups and downs of strlcpy()
First of all, checking and branching on the return value of strlcpy is easier than other interfaces. The return value gives you all the information you need to know: specifically, whether truncation occurred and how much you need to grow your buffer. Other interfaces require more logic to get at this information. In other words, strlcpy is a more elegant abstraction.
But more importantly, the idea that you _need_ to check the return value is wrong. If truncation occurs, it's likely because there's input that is unexpected by the developer. This may because the developer was lazy, or maybe because there's an some implicit contract or protocol.
Either way, the input is garbage. Garbage in is garbage out. Now, you can try to catch that garbage, but why? If someone gives me a domain length of 300 characters, should I bail or just silently truncate? It depends. you could bail, but then you may have to add a new error path unlikely to be tested much, if ever. Sometimes it's better to just handle garbage as sanely as possible, and that means just going through the motions.
More bugs occur in "exception" blocks than perhaps anywhere else. NTP bug last month? Exception. Memory errors. Signals. Etc. I mean, come on people!
So, sometimes the sanest thing to do with garbage is to keep trucking along, and let the user reap what he has sown, at least as long as the garbage output is direct function of the garbage input. Thus, strlcpy has a third useful mode: truncation.
No other alternative to strlcpy does all three of these things. And no other alternative or set of alternatives is as simple to use. Simple code is better code, _always_.
This debate over strlcpy is plain stupid. Of course there are always better ways to do address any particular scenario. But the job of a C library isn't to give you the perfect tool for each and every job. It's to provide a small collection of tools with higher average utility. This reductive obsession with telling people to resort to things like memcpy, or to use dynamic string libraries, is insane. It's like taking away condoms and telling people to get married before they have sex. It ain't gonna happen. The world isn't that simple.
Posted Jul 19, 2012 15:55 UTC (Thu)
by RobSeace (subscriber, #4435)
[Link] (8 responses)
Posted Jul 19, 2012 22:08 UTC (Thu)
by nix (subscriber, #2304)
[Link] (4 responses)
Posted Jul 19, 2012 22:44 UTC (Thu)
by RobSeace (subscriber, #4435)
[Link] (3 responses)
I'm not sure how long glibc has had open_memstream(), but you could've done it with that and fprintf() instead... Or, asprintf(), if that's been around longer... Or, hell, you could always have just fprintf()'d to a temp file, checked the size, allocated a buffer, and read the file back in... Oh, wait, you said "sane"... ;-)
I just thought the same "Oh noes, truncation!!!" worries applied to snprintf(), as well... And, frankly, I haven't heard of that causing major security nightmares anywhere yet... Has it?
Posted Jul 19, 2012 23:11 UTC (Thu)
by nix (subscriber, #2304)
[Link] (2 responses)
Posted Jul 20, 2012 10:09 UTC (Fri)
by RobSeace (subscriber, #4435)
[Link] (1 responses)
Posted Jul 20, 2012 12:44 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Obviously the only real solution is to write your own string abstraction using counted strings or whatever floats your boat...
Posted Jul 20, 2012 4:18 UTC (Fri)
by cmccabe (guest, #60281)
[Link] (2 responses)
Programmers who want to check for truncation will find it easier to do with strlcpy. It requires fewer lines of code. That makes it easier to audit the code and find bugs. Programmers who don't want to check for truncation aren't going to do so anyway, no matter what API you add or don't add to the standard library.
C is a fundamentally different language than most of the modern languages out there. It's not managed. There will always be a way for the programmer to screw up. People seem to not grasp this concept. They think strlcpy should not be added, because somewhere-- maybe hiding under the couch-- is a perfect function which will magically make copying strings in C safe, like in the managed languages. There isn't.
Posted Jul 20, 2012 21:08 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (1 responses)
Look, we all know that there's no magic bullet in programming. Different tools do different parts of the job well, others … not so much.
Fortunately, there are two quite simple workarounds for not having strlcpy in libc:
* add -lbsd to your GCC command line.
* #define strlcpy(d,n,s) snprintf((d),(n),"%s",(s))
Which of these is more efficient is left as an exercise to the reader. :-P
Posted Aug 24, 2013 23:57 UTC (Sat)
by tjc (guest, #137)
[Link]
I just tried this (better late than never!), and parameters 2 and 3 are reversed. Fore the sake of posterity, it should be:
#define strlcpy(d,s,n) snprintf((d),(n),"%s",(s))
Posted Jul 20, 2012 12:13 UTC (Fri)
by vonbrand (subscriber, #4458)
[Link]
The problem with your "just keep on trucking with garbage input" is that that case will be as little tested as the error handling code it replaces, and I'm not so sure it will be handled with any real care for wacky data by your average programmer.
Posted Jul 20, 2012 21:21 UTC (Fri)
by epa (subscriber, #39769)
[Link] (11 responses)
The _s family of functions mentioned elsewhere in the discussion sound like the right approach. If you want truncation, and you've thought about the consequences, then you can ask for it. If you haven't thought about it and you don't explicitly check for the too-long error case, then the fail-safe behaviour is just to abort if this happens.
Posted Jul 20, 2012 21:36 UTC (Fri)
by dlang (guest, #313)
[Link] (10 responses)
truncating and returning an error saying that you did so leaves the error checking up to the programmer where it belongs.
sometimes programmers will not check error conditions properly, if they don't their software will have problems no matter what the library routines do.
If I have a service serving thousands of users per second, shutting the entire service down because someone entered a too-long string is unlikely to be what I want to have happen.
Posted Jul 20, 2012 23:48 UTC (Fri)
by apoelstra (subscriber, #75205)
[Link] (9 responses)
But if this happened, and it had never occurred to you that it -could- happen, better that the program crashes than to do something unpredictable.
On the other had, if it did occur to you, you'd have put a check in.
Posted Jul 21, 2012 0:27 UTC (Sat)
by dlang (guest, #313)
[Link] (8 responses)
not everything is so critical that being protected against every possible exploit is the most important thing.
If you are running a game server, doing something "unexpected" for someone who puts in a 200 character name may be preferred to shutting down the game for everyone else.
It also depends on what the worst 'unexpected' think that it could do is. If it's "gain a shell prompt on the server" it's a lot more significant than put the wrong thing in a high score list"
You are exibiting the biggest failing of security people, mistaking security as an end in and of itself as opposed to being a tool to support everything else. As a security person myself, it's a tendency that I trip over regularly in myself. Everything has a cost and sometimes the cost of something is higher than the thing it's preventing.
Posted Jul 23, 2012 20:12 UTC (Mon)
by mmeehan (subscriber, #72524)
[Link] (5 responses)
I like how glib handles this (G_DISABLE_CHECKS and G_DISABLE_ASSERT). Normally function calls are made safe with g_assert and g_return_if_fail macros, but if you'd like to be unsafe (and slightly faster), you can disable them with compile-time options. By default you're safer security-wise, but you can remove the brakes if you desire.
Posted Jul 25, 2012 19:13 UTC (Wed)
by bronson (subscriber, #4806)
[Link] (4 responses)
It would be nice if safety was always the default. Alas, libc (as standardized) only started thinking this way relatively recently.
Posted Jul 25, 2012 20:58 UTC (Wed)
by smurf (subscriber, #17840)
[Link] (3 responses)
The fundamental point is that you cannot know beforehand whether looking at the string twice is a [performance] problem, whether truncating (with or without fixing incomplete UTF-8 codes) is better than not starting to fill the buffer in the first place, whether calling abort() is a good idea (I'd say that if you are a library, it almost never is), whether to return something negative or the new length or the source length or …, and a host of related questions, all of which do not lend themselves to consensus answers. As this discussion shows quite clearly, IMHO.
My point is that, with the sole exception of leaving the destination buffer undisturbed when the source won't fit, any of the aforementioned behaviors can be implemented with a reasonably-trivial O(1) wrapper around strlcpy(). Therefore, keeping strlcpy() out of libc is … kindof stupid. Again, IMHO.
Instead, people are told to use strncpy(). Which they'll do incorrectly. Let's face it, running off the end of a string into la-la land is always worse than truncating it.
Posted Jul 26, 2012 1:28 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link] (2 responses)
Posted Jul 26, 2012 8:53 UTC (Thu)
by renox (guest, #23785)
[Link] (1 responses)
That said, one size doesn't fit all so having different function is reasonable, the biggest issue is that there is no sane default behaviour..
Posted Jul 26, 2012 16:27 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
The strxcpy function isn't just a wrapper; it does all of the real work. The strxcpy_abort, strxcpy_truncate functions only run when an overflow condition is detected. This allows you to substitute your own preferred method of error-handling. This is actually rather similar to the way exceptions are handled in Common Lisp or Scheme programs, except that the Lisp version would use dynamic variables rather than explicit arguments for the handler code, which results in less cluttered code. Scheme-style parameters are attached to the current continuation, meaning that they're not only thread-safe, but that the bindings only affect the inner dynamic scope of the (parameterize) form, even in exceptional cases such as non-local return (like the middle example above) and even re-entry into a dynamic scope which was previously exited.
Posted Jul 25, 2012 12:30 UTC (Wed)
by NAR (subscriber, #1313)
[Link] (1 responses)
Posted Jul 26, 2012 11:51 UTC (Thu)
by hppnq (guest, #14462)
[Link]
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
> securely -- you couldn't tell how big a buffer it would need in the
> general case without reimplementing the guts of sprintf() yourself.
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
The ups and downs of strlcpy()
* If I send garbage to a function it should fail immediately (abort). The state of my program is undefined.
* If I send garbage to a function it should deal with it somehow (truncate & null pad). No library should ever abort, because that may make my program crash when the error was survivable.
* If I send garbage to a function it's my own fault and my code should check pre and post conditions (silently do whatever). If you fail to do this you will have buffer overflows and get pwned.
The ups and downs of strlcpy()
Aborting is not "safe". Aborting is one of at least five ways to handle this particular error. Whether that is 'safe' depends on the context, i.e. your definition of that word.
The ups and downs of strlcpy()
Here's a suggestion (only partly sarcastic):
The ups and downs of strlcpy()
typedef size_t (*strxcpy_handler_t)(char *dst, const char *src, size_t size, void *data);
size_t strxcpy(char *dst, const char *src, size_t size, strxcpy_handler_t overflow_fn, void *overflow_data)
{
char *p;
const char *q;
for (p = dst, q = src; *q; ++p, ++q)
{
if ((p - dst) >= size)
{
return overflow_fn(dst, src, size, overflow_data);
}
*p = *q;
}
/* get here only if strlen(src) < size */
*p++ = '\0';
return (p - dst);
}
size_t strxcpy_truncate(char *dst, const char *src, size_t size, void *data)
{
if (size <= 0) abort();
dst[size - 1] = '\0';
return size + strlen(src + size);
}
size_t strxcpy_abort(char *dst, const char *src, size_t size, void *data)
{
abort();
return size;
}
if (strxcpy(dst, src, dst_size, strxcpy_truncate, NULL) >= dst_size) ...;
(void)strxcpy(dst, src, dst_size, strxcpy_abort, NULL);
(void)strxcpy(dst, src, dst_size, strxcpy_subst, "(input too long)");
/* ... */
The ups and downs of strlcpy()
The ups and downs of strlcpy()
(define (default-error-handler error-value) (abort))
(define current-error-handler (make-parameter default-error-handler))
(define (do-something)
(... (if (ok? var) var ((current-error-handler) var)) ... ))
; aborts on error
(do-something)
; evaluates to #t on success, or #f on error
(let/cc return
(parameterize ([current-error-handler (lambda _ (return #f))])
(do-something)
#t)
; uses "value" in place of var on error
(parameterize ([current-error-handler (lambda _ value)])
(do-something))
Why would aborting a single process shut down the whole gaming server? I mean if you have a sufficiently large and complicated software, there will be terminal bugs there that you have to handle (e.g. have a lightweight supervisor process that starts a separate server for each new client). But if you already handle these problems, then you might as well crash on purpose in the worker processes.
The ups and downs of strlcpy()
It would not make sense to handle the relatively complex case of failing modules correctly but not string manipulation routines.
The ups and downs of strlcpy()