> Well, strlcpy does not make any sense in UTF-8 world thus your wish is obviously granted.
Reasonable programming languages have real character data type.
Reasonable programming languages also have one-dimensional arrays.
By this route, reasonable programming languages get strings (for free).
They also have automatic OOB access checks.
No part of this strncpy/strlcpy() idiocy makes sense in reasonable programming language universe. They won't even understand what the fuss is all about.
Posted Mar 23, 2012 13:22 UTC (Fri) by dgm (subscriber, #49227)
> No part of this strncpy/strlcpy() idiocy makes sense in reasonable programming language universe. They won't even understand what the fuss is all about.
That's fine. They are not the people who need to care. If a slow and safe string is all you need, why worry about all this? Just use whatever is given in your chosen language and move on. There's nothing for you to see here.
Still no strlcpy and friends
Posted Mar 23, 2012 14:07 UTC (Fri) by adobriyan (guest, #30858)
> If a slow and safe string is all you need
nice
> Just use whatever is given in your chosen language and move on.
The problem is that I'm mostly a C guy.
But there is nothing given in C.
Exactly nothing.
The correct fix belongs in the programming language.
Looking at other PLs and several C "solutions" it should be obvious.
From this POV, Ulrich's decision is very smart.
Still no strlcpy and friends
Posted Mar 23, 2012 15:12 UTC (Fri) by tialaramex (subscriber, #21167)
“Reasonable programming languages have real character data type.”
A "real character data type" is almost never what you actually want because of how poorly defined characters are (or from another point of view, how many different and incompatible definitions there are for "character").
It's tempting to create a data type for Unicode code points (the Java specifications, some parts of the Win32 API, and various databases historically did this, to their cost) because code points are sometimes called "characters". But they can also represent things which aren't intuitively characters (such as the Byte Order Mark, or the LTR/RTL mode switches), and they can represent fractions of a character (like a macron) or symbols which are arguably groups of characters (like ligatures).
On the whole it's best to forget "characters" and handle only strings and sequences of bytes. This obliges the programmer to focus with due caution on any places that translate between the two. On the rare occasion that you do want to process Unicode code points, they fit nicely into any modern integer type, such as the 32-bit signed integers common in C.
This still leaves you with plenty of tricky problems (e.g. canonicalisation) with your Unicode strings if you need more work.
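As a rough illustration of that translation boundary, here is a minimal sketch in C that decodes one code point from a byte sequence into a 32-bit signed integer. It assumes the bytes are UTF-8, which the comment above does not specify; the name utf8_decode and its signature are hypothetical, not taken from any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point from the UTF-8 bytes at s
 * (at most n bytes are available). Stores the code point in *out and
 * returns the number of bytes consumed, or 0 if the input is empty,
 * truncated, or malformed. */
size_t utf8_decode(const unsigned char *s, size_t n, int32_t *out)
{
    if (n == 0)
        return 0;

    unsigned char b0 = s[0];
    size_t len;
    int32_t cp, min;

    if (b0 < 0x80)                { len = 1; cp = b0;        min = 0; }
    else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }
    else
        return 0;   /* stray continuation byte or invalid lead byte */

    if (len > n)
        return 0;   /* sequence runs past the end of the buffer */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;   /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong encodings, UTF-16 surrogates, and out-of-range values. */
    if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;

    *out = cp;
    return len;
}
```

The result is only a code point. Deciding whether it is a "character" in any of the senses above (a combining mark, a mode switch, a ligature) is still left to the caller.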
Still no strlcpy and friends
Posted Mar 23, 2012 20:23 UTC (Fri) by cmccabe (guest, #60281)
> > “Reasonable programming languages have real character data type.”
> A "real character data type" is almost never what you actually want
> because of how poorly defined characters are (or from another point of
> view, how many different and incompatible definitions there are for
> "character").
Please. We're trying to do "programming language advocacy" here.
Don't intrude on this with your "facts" or "logic."
Can I get an A-men?
Still no strlcpy and friends
Posted Mar 26, 2012 18:58 UTC (Mon) by bronson (subscriber, #4806)
Great post! It's unfortunate how many people still think they should be using C/C++'s fundamental char type.
The days of pointer arithmetic and character arrays have mostly drawn to a close. Even in C.
Still no strlcpy and friends
Posted Mar 27, 2012 15:09 UTC (Tue) by dgm (subscriber, #49227)
I think you need to reread the post you're replying to.
Still no strlcpy and friends
Posted Apr 2, 2012 20:57 UTC (Mon) by bronson (subscriber, #4806)
Care to say more? Code like "while(*c != '/') *b++ = *c++" doesn't work so well anymore.
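For comparison, a bounded version of that loop might look like the sketch below. It is only an illustration under stated assumptions: the name copy_until and its strlcpy-style return convention are hypothetical, and the input is assumed to be a NUL-terminated byte string. Unlike the one-liner, it stops at the terminating NUL and never writes past the destination buffer.

```c
#include <stddef.h>

/* Hypothetical helper: copy bytes from src into dst until delim or the
 * terminating NUL, never writing more than dstsize bytes including the NUL.
 * Returns the number of bytes that precede the delimiter in src, so a
 * result >= dstsize means the copy was truncated (the strlcpy convention). */
size_t copy_until(char *dst, size_t dstsize, const char *src, char delim)
{
    size_t i = 0;

    while (src[i] != '\0' && src[i] != delim) {
        if (i + 1 < dstsize)
            dst[i] = src[i];    /* copy only while room remains for the NUL */
        i++;
    }
    if (dstsize > 0)
        dst[i < dstsize ? i : dstsize - 1] = '\0';
    return i;
}
```

With UTF-8 input the byte-wise search for '/' is safe, since every byte of a multi-byte sequence has its high bit set. A truncated copy can still cut a multi-byte sequence in half, so the caller should check the return value against dstsize.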