How is strlcpy more dangerous? It behaves more like snprintf and friends in that it always adds null termination.
In any case, string handling imo is a weak and error-prone area of C, and anything is better than functions that sometimes null terminate, sometimes don't, don't perform bounds checking etc.
I'm not suggesting the original ones be removed; clearly that would break a lot of existing software. But let's provide some better and less error-prone alternatives without getting caught in the whole "well the other functions work" or "everyone uses them" trap that seems to hold up progress.
Posted Mar 28, 2012 17:39 UTC (Wed) by arjan (subscriber, #36785)
the problem with strlcpy is that it truncates, and lets the program continue to operate on known-flawed data.
you really want a strcpy variant that, on overflow, zeroes out the WHOLE THING, and ideally reports a proper, easy to check error return code for it.
if you hit an overflow, you outright DO NOT want to trust what you copied; not in any way.
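A minimal sketch of the kind of variant described above, assuming a hypothetical helper named copy_or_wipe() (it is not part of glibc or any other libc): on overflow it zeroes the entire destination and returns an error code that is easy to check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of any libc: copy src into dst, which has
 * room for dstsize bytes. Returns 0 on success. If src does not fit, the
 * whole destination buffer is zeroed and ERANGE is returned, so no
 * truncated data is left behind for the caller to trust by accident.
 */
int copy_or_wipe(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (len >= dstsize) {          /* no room for the string plus '\0' */
        memset(dst, 0, dstsize);   /* wipe the WHOLE buffer */
        return ERANGE;
    }
    memcpy(dst, src, len + 1);     /* len + 1 includes the terminator */
    return 0;
}

/* Usage sketch: the error path leaves an empty string behind. */
void copy_or_wipe_demo(void)
{
    char buf[8] = "old";

    assert(copy_or_wipe(buf, sizeof buf, "short") == 0);
    assert(strcmp(buf, "short") == 0);

    assert(copy_or_wipe(buf, sizeof buf, "much too long") == ERANGE);
    assert(buf[0] == '\0');
}
```

Returning an errno-style value is only one possible convention; the point is that the caller cannot mistake a failed copy for a usable string.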
A turning point for GNU libc
Posted Mar 28, 2012 17:40 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
How about a function which just does abort() in case of overflow?
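Roughly what that could look like, as a sketch only: copy_or_abort() is a hypothetical name, and the message printed before abort() is an assumption, not anything a real libc does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical "copy or die" helper: copies src into dst (dstsize bytes)
 * and returns dst. If src does not fit, it reports the problem on stderr
 * and calls abort() instead of writing anything to dst.
 */
char *copy_or_abort(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);

    if (len >= dstsize) {
        fputs("copy_or_abort: destination too small\n", stderr);
        abort();                   /* never returns */
    }
    memcpy(dst, src, len + 1);     /* len + 1 includes the terminator */
    return dst;
}

/* Usage sketch: only the success path can be observed from the caller. */
void copy_or_abort_demo(void)
{
    char buf[8];

    assert(strcmp(copy_or_abort(buf, sizeof buf, "fits"), "fits") == 0);
}
```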
A turning point for GNU libc
Posted Mar 28, 2012 17:45 UTC (Wed) by slashdot (guest, #22014)
strcpy_s in the Microsoft CRT does that (unless you explicitly set an error handler).
A turning point for GNU libc
Posted Mar 28, 2012 17:48 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
Good for them. IMO, that's the only correct behavior in this case.
A turning point for GNU libc
Posted Mar 28, 2012 21:45 UTC (Wed) by dashesy (subscriber, #74652)
Accepting the useful strcpy_s, however, can open a can of worms (strncpy_s and the rest of that family).
A turning point for GNU libc
Posted Mar 28, 2012 18:25 UTC (Wed) by arjan (subscriber, #36785)
that's what strcpy() does...
but yes, that's one of the options, and as default behavior it's not all that bad (it means you can have a crash dump collector detect it and send it to the developer team).
It's also a death-sentence kind of API, though; you also want a variant which is "try to see if it fits, but if it doesn't let me know and I'll handle it gracefully"....
A turning point for GNU libc
Posted Mar 28, 2012 17:44 UTC (Wed) by cmorgan (guest, #71980)
Ahh I see why that isn't a good idea either :-)
It would be handy if the libc people would see about innovating in that area a little bit. Maybe 3rd party libraries work, but why get so stuck on POSIX, as someone else mentioned?
Chris
A turning point for GNU libc
Posted Mar 28, 2012 17:50 UTC (Wed) by arjan (subscriber, #36785)
letting other libraries work out what the right solution is, and when a winner emerges, pull that into glibc... doesn't sound like too bad a strategy to be honest.
A turning point for GNU libc
Posted Mar 28, 2012 19:20 UTC (Wed) by wahern (subscriber, #37304)
That doesn't make any sense. Almost _all_ C routines continue on their merry way on undefined behavior.
The difference with strlcpy is that the behavior is well-defined as far as the language is concerned. The bad behavior (if truncation is undesired) is between chair and keyboard, and that's a huge step forward from the mess that traditional string routines have _actually_ made.
And truncation isn't silent: if (sizeof buf <= strlcpy(buf, str, sizeof buf)) oops("truncated!");
There, happy?
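The one-liner above can be spelled out as a self-contained sketch. Since not every libc ships strlcpy(), this uses a stand-in under the hypothetical name my_strlcpy() that is assumed to follow the BSD semantics (always terminates, returns strlen(src)); copy_truncated() is likewise a made-up wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Stand-in with strlcpy() semantics, under a hypothetical name so it does
 * not clash with a libc that already provides strlcpy(). Copies at most
 * dstsize - 1 bytes, always terminates when dstsize > 0, and returns
 * strlen(src), i.e. the length of the string it tried to create.
 */
size_t my_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/* Returns 1 if src was truncated while copying into dst, 0 otherwise. */
int copy_truncated(char *dst, size_t dstsize, const char *src)
{
    return my_strlcpy(dst, src, dstsize) >= dstsize;
}

/* Usage sketch: truncation is detectable from the return value alone. */
void copy_truncated_demo(void)
{
    char buf[4];

    assert(copy_truncated(buf, sizeof buf, "abc") == 0);
    assert(copy_truncated(buf, sizeof buf, "abcdef") == 1);
    assert(strcmp(buf, "abc") == 0);   /* truncated but still terminated */
}
```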
Most unix systems have strlcpy except Linux and compatriots like HP/UX. Nobody would have even questioned the utility of it if Drepper hadn't been so stubborn. And now somehow strlcpy() is faulted for supposed flaws that exist in almost all C routines, and are _especially_ egregious in the ones that Drepper proposed as alternatives. Just the other week on stackoverflow I saw someone post a memccpy() "alternative" which contained an off-by-one buffer overflow. (They apparently thought memccpy returned a pointer to the terminal character, not a pointer after the terminal character.)
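To make the memccpy() pitfall concrete, here is a hedged sketch of a bounded copy built on it. memccpy_copy() is a hypothetical wrapper, and this is not the code from the post mentioned above; the comments mark where an off-by-one creeps in if the return value is taken to point at the terminator.

```c
#define _XOPEN_SOURCE 700   /* memccpy() is POSIX, not ISO C */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: bounded string copy built on memccpy().
 * Returns 0 if src fit (terminator included), -1 if it was truncated.
 * dst is always terminated when dstsize > 0.
 */
int memccpy_copy(char *dst, size_t dstsize, const char *src)
{
    char *end;

    if (dstsize == 0)
        return -1;

    /* memccpy() returns a pointer one PAST the copied '\0', or NULL. */
    end = memccpy(dst, src, '\0', dstsize);
    if (end == NULL) {
        /* No terminator was copied: add one inside the buffer. */
        dst[dstsize - 1] = '\0';
        return -1;
    }
    /* The terminator sits at end[-1]; writing to *end could overflow. */
    assert(end[-1] == '\0');
    return 0;
}

/* Usage sketch. */
void memccpy_copy_demo(void)
{
    char buf[4];

    assert(memccpy_copy(buf, sizeof buf, "abc") == 0);
    assert(memccpy_copy(buf, sizeof buf, "abcd") == -1);
    assert(strcmp(buf, "abc") == 0);
}
```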
Ridiculous.
And alternatives like strcpy_s from Microsoft are even more idiotic. Have you ever even tried to use them? The entire Annex K is stupid, and not even Microsoft supports the entire Annex even though they sponsored it.
People are entitled to their opinions, but this particular opinion just causes way too many headaches for too many people. You don't see systemd haters trying to keep systemd from shipping, do you? They just don't want to be _forced_ to use systemd. Nobody is forcing you to use strlcpy.
A turning point for GNU libc
Posted Mar 28, 2012 20:03 UTC (Wed) by arjan (subscriber, #36785)
I would argue that the whole debate has shown that it's not a clear cut good API... and glibc, with its compat promise, does well to be conservative about adopting a solution.
(having been on the receiving end of such 'security scanner fixes' introducing more bugs than the original code had... I'm no big fan of either strncpy or strlcpy as replacement for strcpy.... especially since the linux strcpy is checking by itself in most of the critical cases already)
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 12:58 UTC (Fri) by cjcoats (guest, #9833)
But see this discussion about GLIBC memcpy() changes at Version 2.11 breaking many existing executables, with Linus' performance evaluation (which showed it did not produce the claimed speedup, and provided alternatives) and his remarks about breaking those existing executables (it turns out Flash was the most prominent culprit, but there were lots of others). [Warning: there's a lot of stuff here...]
And Drepper not only did not care: he said he did the right thing!
FWIW
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 15:05 UTC (Fri) by khim (subscriber, #9252)
Generally, compatibility is guaranteed only if the programmer follows the guidelines. And in this sense Drepper has a point: "The memory areas should not overlap. Use memmove(3) if the memory areas do overlap." has been listed as a requirement for memcpy for decades.
Sometimes, if there are a lot of broken programs, bug-for-bug compatibility is important enough to go further, and that is what happened here in the end, as you can see here.
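For illustration, a small sketch of the rule being quoted: when source and destination overlap, memmove() is the call with defined behavior. drop_prefix() is a hypothetical helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: drop the first n characters of a string in place.
 * Source (s + n) and destination (s) overlap, so memmove() is required;
 * calling memcpy() here would be undefined behavior.
 */
void drop_prefix(char *s, size_t n)
{
    size_t len = strlen(s);

    if (n > len)
        n = len;
    memmove(s, s + n, len - n + 1);   /* + 1 moves the terminator too */
}

/* Usage sketch. */
void drop_prefix_demo(void)
{
    char s[] = "hello world";

    drop_prefix(s, 6);
    assert(strcmp(s, "world") == 0);
}
```

A memcpy() that happens to copy front to back makes this particular case appear to work, which is how programs came to depend on behavior the specification never promised.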
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 17:37 UTC (Fri) by dlang (✭ supporter ✭, #313)
that's a valid argument if the change improves something, but if (as in this case) the change doesn't give any performance improvement, the only 'advantage' of the change is that it breaks existing programs.
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 20:15 UTC (Fri) by khim (subscriber, #9252)
> that's a valid argument if the change improves something, but if (as in this case) the change doesn't give any performance improvement, the only 'advantage' of the change is that it breaks existing programs.
Wow. Just… wow. I guess it's time to ask the usual question: are you an idiot or do you just play one on TV? You were given a link; now please go and read it.
People tend to see the message from Linus (which says "I bothered to _measure_ the speed, and according to my measurements, glibc wasn't any faster than my trivial version and was likely slower") and immediately switch to the "Linus is god, GLibC developers are stupid" mindset. That is not justified at all.
Because of course the very next sentence ("but I only tested two cases") in the same paragraph flies right over their heads, and the detailed explanation ("At last on Core2 we gain 1.83x speedup compared with original instruction sequence" and "Based on our micro-benchmark small bytes from 1 to 127 bytes, we got up to 2X improvement, and up to 1.5X improvement for 1024 bytes on Corei7") is totally ignored or, at best, hand-waved away with a "hopefully Linus has answered this one" appeal to authority.
This is the same never-ending fight between pragmatists and standards hairsplitters. Linus, ever the pragmatist, never disputed the speedup claim once it was pointed out that he was incorrect (good for him: the speedup is very much hardware-dependent and was simply unobservable on the hardware he used; it's quite real and measurable on different hardware), but he said a quite sensible thing from a pragmatist's POV: the new version of memcpy may be more efficient, but it's more complex as well, thus the usual excuse (we have a trivial and fast memcpy and a slower but more robust memmove) no longer applies. But the GLibC and GCC developers, ever the nitpickers, say that this makes no sense: the spec most definitely says that "if copying takes place between objects that overlap, the behaviour is undefined" (and even that other OS agrees), so why should they add any such checks to code which works fine in the standard-mandated case?
Note: the GLibC guys rolled back the change for old binaries pretty quickly when it was found that their improvement broke real programs. After that point it's no longer about “stable ABI” and “backward compatibility”, but about “doing the right thing”.
I think the end result (old programs get the old behavior and new ones should finally fix their bugs in accordance with the documentation) makes sense in this context. You can say that this is what should have happened from the beginning, but it was not all that obvious that so many programs actually depended on the broken behavior of the old memcpy.
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 20:17 UTC (Fri) by corbet (editor, #1)
...and that's my cue for the usual request: can we please be just a bit more respectful of each other? If you disagree then by all means say so and why. But we don't need to be playing elementary-school name-calling games here.
"does well to be conservative about adopting a solution"?
Posted Apr 6, 2012 20:27 UTC (Fri) by paulj (subscriber, #341)
There should be a naughty chair for commentators who resort to name-calling. Maybe ½ to 1 hour of not being allowed to comment?
A turning point for GNU libc
Posted Mar 31, 2012 21:35 UTC (Sat) by lacos (subscriber, #70616)
I don't like this. strlcpy() continues to find the end / length of the source string even if the target buffer has ended way earlier. And it does it every single time.
Instead of this, know the length of your source string (which takes a single strlen() in the beginning), and allocate the target buffer accordingly. Or, if the target is a given, and too small, bail out before copying any characters.
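As a sketch of the "measure once, then size the target to match" approach (copy_exact() is a hypothetical name, and the result is essentially what strdup() already gives you, plus the length):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: scan src exactly once with strlen(), allocate a
 * buffer of exactly the right size, and copy into it. Returns NULL if the
 * allocation fails. If len_out is non-NULL it receives the length, so the
 * caller never has to rescan the string. The caller frees the result.
 */
char *copy_exact(const char *src, size_t *len_out)
{
    size_t len = strlen(src);          /* the only scan of src */
    char *dst = malloc(len + 1);

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);         /* len + 1 includes the terminator */
    if (len_out != NULL)
        *len_out = len;
    return dst;
}

/* Usage sketch. */
void copy_exact_demo(void)
{
    size_t n = 0;
    char *p = copy_exact("hello", &n);

    assert(p != NULL && n == 5 && strcmp(p, "hello") == 0);
    free(p);
}
```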
A turning point for GNU libc
Posted Apr 3, 2012 0:20 UTC (Tue) by gdt (subscriber, #6284)
Write the trailing 0 character which strlcpy() should have done:
Posted Apr 3, 2012 6:58 UTC (Tue) by smurf (subscriber, #17840)
That code doesn't make any sense whatsoever. strlcpy does write a trailing zero.
What he wants is essentially a string copy which, when the buffer is too small, zeroes the buffer's FIRST byte, not its last.
A turning point for GNU libc
Posted Apr 9, 2012 11:04 UTC (Mon) by gdt (subscriber, #6284)
You're right, I was thinking of strncpy().
A turning point for GNU libc
Posted Mar 28, 2012 20:43 UTC (Wed) by HelloWorld (guest, #56129)
There *are* better and less error prone alternatives, such as GString from glib. So stop whining already.
A turning point for GNU libc
Posted Mar 28, 2012 22:47 UTC (Wed) by wahern (subscriber, #37304)
By that logic, people would also be better off using C++. In many cases it'd be easier to port a C project to C++ than to adopt glib in all its monstrous glory.
Saying "just use glib" is cargo cult programming, IMNSHO. Why would I use a 300k SLoC library just for a saner way to copy a string? Some people like coding in plain, vanilla C, especially when our code has to be ported to many different environments. Getting strlcpy into glibc would be just one less extra burden to carry when trying to write sane, simple, secure code.
And also, why do I need a dynamic string object? Is Linux generally ever going to support file names longer than NAME_MAX characters, or paths longer than PATH_MAX characters? Are DNS labels magically going to become longer than 63 characters, or domains longer than 255 characters? Do my enum-to-human-readable-string mappings need an indefinite buffer?
Most people who code in C aren't doing unbounded string operations. Not every C application or library is an editor or a text processor. In fact, very few are. Don't confuse "C-strings" with "strings". People don't use strcpy or strlcpy to manipulate "strings" (the abstract computer science concept); they use them to manipulate C-strings, a very narrow and well-defined data type which in practice is almost always bounded by a rather small limit.
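A small sketch of what working with such a bounded C-string tends to look like in practice, here with snprintf() and a fixed-size buffer. The DNS_NAME_MAX macro and join_name() are hypothetical names for this example; the 255 figure is the domain limit mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define DNS_NAME_MAX 255   /* hypothetical macro for the limit cited above */

/*
 * Hypothetical helper: write "host.domain" into dst (dstsize bytes).
 * Returns 0 on success, -1 if the result did not fit or snprintf() failed.
 * snprintf() always terminates dst when dstsize > 0 and returns the length
 * the full result would have had, which is what makes the check possible.
 */
int join_name(char *dst, size_t dstsize, const char *host, const char *domain)
{
    int n = snprintf(dst, dstsize, "%s.%s", host, domain);

    if (n < 0 || (size_t)n >= dstsize)
        return -1;
    return 0;
}

/* Usage sketch: a fixed buffer sized from the known limit. */
void join_name_demo(void)
{
    char name[DNS_NAME_MAX + 1];

    assert(join_name(name, sizeof name, "www", "example.org") == 0);
    assert(strcmp(name, "www.example.org") == 0);
}
```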
A turning point for GNU libc
Posted Mar 29, 2012 0:48 UTC (Thu) by geofft (subscriber, #59789)
> Saying "just use glib" is cargo cult programming, IMNSHO. Why would I use a 300k SLoC library just for a saner way to copy a string?
Because it's a saner way to do _everything_. If you just want strings, you can go use the Better String Library (bstring), but then you need to go find a better everything else library, too.
Besides, you're cargo-culting a one million SLoC library called glibc, and arguing for the inclusion of more functions in it... if you like "plain, vanilla C", you're welcome to copy strings without glibc. :) (Or with another standard library like dietlibc.)
A turning point for GNU libc
Posted Mar 29, 2012 11:47 UTC (Thu) by jnareb (subscriber, #46500)
> Because it's a saner way to do _everything_. If you just want strings, you can go use the Better String Library (bstring), but then you need to go find a better everything else library, too.
Nb. Git went the way of creating own 'strbuf' micro-library for string manipulation.
A turning point for GNU libc
Posted Mar 29, 2012 7:13 UTC (Thu) by BenHutchings (subscriber, #37955)
> Is Linux generally ever going to support file names longer than NAME_MAX characters, or paths longer than PATH_MAX characters?
Absolutely - PATH_MAX is a totally meaningless value.
A turning point for GNU libc
Posted Mar 30, 2012 3:55 UTC (Fri) by wahern (subscriber, #37304)
Fair enough. I was listing common buffer size macros off the top of my head. But that's beside the point. The others, and many more, are meaningful. Programming, like life, is full of arbitrary limits, and programming as if you'll ever need to meaningfully store a 1MB path name often leads to needless complexity, and complexity breeds bugs.
But let people continue to use strcpy, and let the exploits continue to roll in. Fortunately they've slowed down over the years, thanks to alternatives like snprintf and people copy+pasting strlcpy, and not so much because people are passing glib string objects to library routines.
A turning point for GNU libc
Posted Mar 29, 2012 16:12 UTC (Thu) by HelloWorld (guest, #56129)
> Saying "just use glib" is cargo cult programming, IMNSHO. Why would I use a 300k SLoC library just for a saner way to copy a string?
Glib provides a hell of a lot more than just a sane (well, sane by C standards) string implementation.
> Some people like coding in plain, vanilla C, especially when our code has to be ported to many different environments.
Well, you don't seem to be one of them, as strlcpy is clearly *not* standard C. Besides, how is providing strlcpy going to help with portability? Microsoft's C implementation doesn't include it either, so you can't rely on it either way.
> Do my enum-to-human-readable-string mappings need an indefinite buffer?
Of course not, but for that use case, strcpy is perfectly sufficient as the required buffer length is known at compile time.
A turning point for GNU libc
Posted Mar 30, 2012 22:24 UTC (Fri) by jengelh (subscriber, #33263)
>Glib provides a hell of a lot more than just a sane (well, sane by C standards) string implementation.
I agree - it has utterly pointless typedefs like gchar (char is standardized by C, you know). Hiding an indirection (behind gpointer) is not nice either.
A turning point for GNU libc
Posted Apr 5, 2012 15:32 UTC (Thu) by welinder (guest, #4699)
> char is standardized by C, you know
Really?
What, exactly, is standardized about char in C?
* it's a numeric type.
* it is distinct from signed char and unsigned char, but has the same range of values as one of the two
* it's a magic type for aliasing
* it's the element type of "foo"
* sizeof(char)==1
char c = 0; /* valid */
char c = -1; /* valid, but might not read as -1 */
char c = 128; /* implementation dependent: exact if 128 fits in char, otherwise an implementation-defined result (not undefined) */
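The signedness question in the list above can be checked on a given implementation with <limits.h>. A minimal sketch; both function names are hypothetical:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain char is signed on this implementation, 0 if not. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* What `char c = -1;` reads back as once promoted to int. */
int minus_one_roundtrip(void)
{
    char c = -1;

    return c;   /* -1 if char is signed, UCHAR_MAX if it is unsigned */
}

/* Usage sketch: the read-back value follows the signedness of char. */
void char_demo(void)
{
    assert(sizeof(char) == 1);
    if (plain_char_is_signed())
        assert(minus_one_roundtrip() == -1);
    else
        assert(minus_one_roundtrip() == UCHAR_MAX);
}
```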
A turning point for GNU libc
Posted Apr 5, 2012 16:33 UTC (Thu) by nybble41 (subscriber, #55106)
True, but gchar is defined as a typedef for char, so it adds no additional guarantees. Why not just use char? At least "guchar" is shorter than "unsigned char"; I see no benefit at all in using gchar.
If you want guaranteed ranges, uint8_t and int8_t are standardized in C99 as integer types with exactly eight bits and (in the signed case) two's-complement representation. Both definitions are required unless the implementation has no compatible integer type.
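A short sketch of what those guarantees buy you, assuming an implementation that provides the exact-width types; add_wrapping_u8() is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: add two uint8_t values with wraparound modulo 256.
 * The operands are promoted to int for the addition; the conversion back
 * to uint8_t is well-defined for unsigned types and reduces modulo 256.
 */
uint8_t add_wrapping_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Usage sketch: the ranges are fixed wherever the types exist. */
void fixed_width_demo(void)
{
    assert(INT8_MIN == -128 && INT8_MAX == 127);
    assert(UINT8_MAX == 255);
    assert(add_wrapping_u8(250, 10) == 4);   /* 260 mod 256 */
}
```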