The ups and downs of strlcpy()

Posted Jul 19, 2012 9:46 UTC (Thu) by etienne (guest, #25256)
In reply to: The ups and downs of strlcpy() by paulj
Parent article: The ups and downs of strlcpy()

> [leave] programmer kindergarten [and use] a more sane string API that tracks sizes.

And obviously lose the use of any string whatsoever in a place where you cannot allocate memory.
And lose the ability to use multilingual constant strings, because the size of string memory can never be bigger than the total size of your code. I mean:
const char *error_mlstr = "error\0erreur\0erro\0Ошибка\0";
But the main problem anyway is that strlcpy() does not even try to behave with UTF-8: cutting the string in the middle of a character may create bigger security problems.
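
For readers unfamiliar with the packed constant above, here is a minimal sketch of how such a NUL-separated string can be indexed without allocating anything. The helper name mlstr_get is hypothetical and not from the comment. The sketch relies on the literal ending in an explicit \0 followed by the compiler's implicit terminator, so an empty entry marks the end of the table.

```c
#include <stddef.h>
#include <string.h>

/* The packed constant from the comment: entries are separated by NUL,
 * and the explicit trailing \0 plus the literal's implicit terminator
 * leave an empty entry that marks the end of the table. */
static const char *const error_mlstr = "error\0erreur\0erro\0Ошибка\0";

/* Hypothetical helper: return the idx-th entry of a packed table,
 * or NULL if the table has fewer than idx + 1 entries. */
static const char *mlstr_get(const char *table, size_t idx)
{
    const char *p = table;

    while (*p != '\0') {          /* empty entry: end of table */
        if (idx == 0)
            return p;
        p += strlen(p) + 1;       /* skip this entry and its NUL */
        idx--;
    }
    return NULL;
}
```

Under these assumptions, mlstr_get(error_mlstr, 1) would point at "erreur", and an out-of-range index yields NULL rather than reading past the table.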

Posted Jul 19, 2012 15:49 UTC (Thu) by smurf (subscriber, #17840)

There are three cases to consider here:

* if you reallocate the buffer anyway, or if your program does not care about the character set, this is not a problem.

* if your program blindly assumes that its input is valid UTF-8, don't bother – you're going to fail anyway.

* otherwise, a wrapper which NULs out an incomplete UTF-8 character at the end of your buffer is ten lines of C and left as an exercise to the reader. ;-)
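
One possible take on the exercise in the last bullet, as a sketch only. The name utf8_strlcpy is hypothetical, and the function is written standalone rather than as a wrapper, since strlcpy() is not available in every C library. It assumes the source is valid UTF-8 and only checks whether the cut would land on a continuation byte (bit pattern 10xxxxxx).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of any libc: copy src into dst with
 * strlcpy()-style truncation, but never leave a partial UTF-8 sequence
 * at the end of dst. Assumes src is valid UTF-8.
 * Returns strlen(src), as strlcpy() does, so callers can detect
 * truncation by comparing the result against size. */
static size_t utf8_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);
    size_t n;

    if (size == 0)
        return srclen;

    n = srclen < size - 1 ? srclen : size - 1;

    /* If we are truncating and the first byte left out is a
     * continuation byte, the cut is inside a character: back up
     * until the first byte left out is that character's lead byte. */
    if (n < srclen) {
        while (n > 0 && ((unsigned char)src[n] & 0xC0) == 0x80)
            n--;
    }

    memcpy(dst, src, n);
    dst[n] = '\0';
    return srclen;
}
```

With a 4-byte buffer, copying "Ошибка" this way keeps only "О" (two bytes) instead of "О" plus the first half of "ш".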

Posted Jul 19, 2012 22:10 UTC (Thu) by nix (subscriber, #2304)

> And obviously lose the use of any string whatsoever in a place where you cannot allocate memory.

Places where you cannot allocate memory are vanishingly rare (excepting in OOM situations, where the only sane thing to do is to terminate the process and let a parent deal with it). It is not worth crippling the string API just for this.

> const char *error_mlstr = "error\0erreur\0erro\0Ошибка\0";

And for this, you want a string table abstraction. C gives you all the tools you need to write proper ADTs; why do so many C programmers persist in trying to do everything without such help?

Posted Jul 20, 2012 9:56 UTC (Fri) by etienne (guest, #25256)

>> const char *error_mlstr = "error\0erreur\0erro\0Ошибка\0";
> And for this, you want a string table abstraction. C gives you all the tools you need to write proper ADTs; why do so many C programmers persist in trying to do everything without such help?

And replace three- to eight-byte strings with arrays of eight-byte pointers?
Some ways of writing code call for small strings:
cout << "The " << (pet ? "cat" : "dog") << ((nb > 1) ? " are " : " is ") << "black.";

> Places where you cannot allocate memory are vanishingly rare

Places where you need non-standard memory allocation (fail if allocation would sleep, allocate as virtual or physical memory, fail if allocation obviously too big at 10 Mbytes for a string, force a stack allocation) may not be so rare.

Posted Jul 20, 2012 12:59 UTC (Fri) by nix (subscriber, #2304)

> And replace three- to eight-byte strings with arrays of eight-byte pointers?

No. I'd expect such an abstraction to return a struct (for information-hiding purposes) which has one member, an offset into the string table. No pointers needed.
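
A minimal sketch of what such an abstraction might look like. The names strtab_data, struct strref, strtab_lookup, strref_valid and strref_cstr are hypothetical, and the 16-bit offset is an assumption that limits the table to under 64 KiB. The handle is a one-member struct holding an offset, so it is smaller than a pointer on typical platforms.

```c
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical illustrations. */

/* The packed table; an empty entry marks the end. */
static const char strtab_data[] = "error\0erreur\0erro\0Ошибка\0";

/* Handle returned to callers: one member, an offset into strtab_data.
 * Wrapping it in a struct keeps callers from doing arithmetic on it. */
struct strref {
    unsigned short off;
};

#define STRREF_INVALID ((unsigned short)-1)

/* Find the idx-th entry; returns an invalid handle if out of range. */
static struct strref strtab_lookup(size_t idx)
{
    struct strref r = { STRREF_INVALID };
    size_t off = 0;

    while (off < sizeof strtab_data && strtab_data[off] != '\0') {
        if (idx == 0) {
            r.off = (unsigned short)off;
            return r;
        }
        off += strlen(strtab_data + off) + 1;  /* next entry */
        idx--;
    }
    return r;
}

static int strref_valid(struct strref r)
{
    return r.off != STRREF_INVALID;
}

/* Resolve a handle to a C string, or NULL for an invalid handle. */
static const char *strref_cstr(struct strref r)
{
    return strref_valid(r) ? strtab_data + r.off : NULL;
}
```

Callers would store struct strref values and resolve them with strref_cstr() only at the point of use. A real table would more likely assign offsets at build time than search linearly.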

> Places where you need non-standard memory allocation (fail if allocation would sleep, allocate as virtual or physical memory, fail if allocation obviously too big at 10 Mbytes for a string, force a stack allocation) may not be so rare.

To a first approximation these are all things that are only going to happen in kernel coding. If you're writing kernel code I expect you to be smart enough to use the language you're writing in, or at the very least to have appropriate abstractions that can be told things like 'do not allocate now' (and indeed the kernel's various internal abstractions can be told just this).

However, most people are not kernel programmers, and don't operate under such harsh constraints. For them, there's no excuse to not use appropriate abstractions other than a pointlessly minimalist C coding style more appropriate for the tiny systems of the 1970s than for now.