C, Fortran, and single-character strings
Posted Jun 22, 2019 5:05 UTC (Sat) by marcH (subscriber, #57642)
Parent article: C, Fortran, and single-character strings
Not just Fortran but any remotely sane/safe/modern language including C++. Even newer and safer C APIs.
Some other comment mentioned "cargo cult programming techniques": null-terminated strings are probably one of the top examples of that. Is any other language doing it?
Posted Jun 22, 2019 17:45 UTC (Sat)
by ncm (guest, #165)
[Link] (7 responses)
But it's not the only dodgy practice around strings, and they are accumulating at an impressive rate. A lot of Pascal-family languages store (or stored) the length in the first byte, with no great answer for how to handle a longer string. Others use the first two bytes. Lots of languages switched to two-byte, first-generation Unicode but have no concept of normalizing different representations with modifier code points, so, e.g., strings that produce the same set of glyphs compare unequal, and there is no concept of a character representable only as a pair of inseparable code units.
Unicode has characters that have no visible glyph and take no space, so they could be sprinkled anywhere, and lots of code points have glyphs necessarily identical to others, yet normalization isn't allowed to pick just one of them. Lots of languages have adopted UTF-8 but have not tackled any of the similar problems.
Getting exercised over the choice of representing length with a null terminator will leave you entirely unequipped for the much bigger problems that matter.
Posted Jun 22, 2019 18:12 UTC (Sat)
by marcH (subscriber, #57642)
[Link] (6 responses)
I was referring to *memory* length from a safety and performance perspective.
> Every language that has to interact with C does, which today is all of them.
Yeah, sure. Off-topic too.
Posted Jun 22, 2019 21:04 UTC (Sat)
by ncm (guest, #165)
[Link] (5 responses)
Null termination is an example of a venerable programming practice, the use of sentinel elements, lately fallen from favor now that memory and cycles are thousands, millions, or even billions of times cheaper than they once were.
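To make the sentinel idea concrete, here is a minimal sketch in C. The function names `sentinel_length` and `sentinel_find` are made up for illustration; the second one assumes the caller has left one writable slot past the last element.

```c
#include <stddef.h>

/* Length by sentinel: walk until the terminating '\0'.
 * No size is stored anywhere, so the cost is O(n) per call. */
size_t sentinel_length(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Sentinel search over n ints. ASSUMPTION: buf[n] is writable scratch
 * space owned by the caller. Planting the key there guarantees the loop
 * terminates, so the loop body needs no separate bounds check. */
size_t sentinel_find(int *buf, size_t n, int key)
{
    size_t i = 0;
    buf[n] = key;
    while (buf[i] != key)
        i++;
    return i;   /* i == n means the key was not among the first n items */
}
```

The saving is one comparison per iteration and one word of storage per string, which mattered a great deal more on the machines of the time than it does now.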
If we sneer at choices made then, under the constraints of the time, how much more derision do we deserve for unfortunate choices made without such constraints? 'Cause I could list such, all day long, about any system, language, or technology you can think of.
Posted Jun 22, 2019 21:37 UTC (Sat)
by marcH (subscriber, #57642)
[Link] (4 responses)
> 'Cause I could list such, all day long, about any system, language, or technology you can think of.
Sure, let's start by looking at some CVE statistics. Wait, I said no digression, sorry.
Posted Jun 23, 2019 4:04 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
C strings allow you to pass substrings as a pair of pointers (or just one pointer for tail substrings), for example.
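A rough sketch of both cases, for anyone who hasn't seen the idiom. The names `skip_prefix`, `struct span`, and `first_word` are invented for this example and are not from any standard header.

```c
#include <stddef.h>
#include <string.h>

/* Tail substring: one pointer is enough, because the tail shares the
 * original string's terminator. Nothing is copied. */
const char *skip_prefix(const char *s, const char *prefix)
{
    size_t n = strlen(prefix);
    return strncmp(s, prefix, n) == 0 ? s + n : s;
}

/* Interior substring: a half-open [begin, end) pointer pair into the
 * original buffer. The span itself is NOT null-terminated. */
struct span {
    const char *begin;
    const char *end;
};

/* Returns the first space-delimited word of s as a span. */
struct span first_word(const char *s)
{
    struct span w;
    while (*s == ' ')
        s++;
    w.begin = s;
    while (*s != '\0' && *s != ' ')
        s++;
    w.end = s;
    return w;
}
```

Both results alias the caller's buffer, so they are only valid for as long as that buffer is alive and unmodified.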
Posted Jun 23, 2019 18:02 UTC (Sun)
by marcH (subscriber, #57642)
[Link] (2 responses)
Yes, the type of (safer) arrays would have been one step above "primitive".
Looking at string.h on opengroup.org, it's interesting to see almost half the functions there already have some size_t argument.
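As an illustration of what those size_t arguments buy you, here is a small sketch built only on `memchr` and `memcpy` from string.h. The wrapper `copy_bounded` is a hypothetical helper, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into dst, which has room for cap bytes. The result is always
 * null-terminated (when cap > 0). Returns 0 if the whole string fit,
 * -1 if it had to be truncated or cap was 0. */
int copy_bounded(char *dst, size_t cap, const char *src)
{
    if (cap == 0)
        return -1;

    /* memchr examines at most cap bytes, so an overlong src is never
     * scanned past the point where it stops fitting. */
    const char *nul = memchr(src, '\0', cap);
    size_t n = nul ? (size_t)(nul - src) : cap - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return nul ? 0 : -1;
}
```

The length still travels separately from the pointer here, which is exactly the bookkeeping a counted string type would do for you.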
> C strings allow you to pass substrings as a pair of pointers (or just one pointer for tail substrings), for example.
This is indeed a performance optimization. It's also a dangerous one if the array is not const (who owns it now?). I don't see how "higher level" arrays would stop you from still doing that; I would just discourage you from doing it routinely in non-critical paths.
Posted Jun 23, 2019 21:07 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
I'm not saying that it's a good idea now, but null-terminated strings certainly make sense in C.
Posted Jun 25, 2019 16:41 UTC (Tue)
by rgmoore (✭ supporter ✭, #75)
[Link]
No. A language with only safe arrays won't be C. C is supposed to provide access to low-level functions and that includes unsafe pointers and arrays. But C is also supposed to allow programmers to build higher-level abstractions, including things like safe arrays and strings, and there's excellent reason to use those safe arrays and strings in place of the unsafe alternatives when performance is not critical.
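For what it's worth, such an abstraction doesn't take much code. Below is a minimal sketch of a counted, bounds-checked string view in plain C; the type `struct str_view` and the `sv_*` functions are hypothetical names for this example, not an existing library.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A counted view: pointer plus explicit length, no terminator required. */
struct str_view {
    const char *data;
    size_t len;
};

/* Build a view from an ordinary C string (one strlen, then never again). */
struct str_view sv_from_cstr(const char *s)
{
    struct str_view v = { s, strlen(s) };
    return v;
}

/* Bounds-checked element access: returns false instead of reading
 * out of range. */
bool sv_at(struct str_view v, size_t i, char *out)
{
    if (i >= v.len)
        return false;
    *out = v.data[i];
    return true;
}

/* Substring with clamping: pos and n are cut down to what exists,
 * so the result always lies inside the original view. */
struct str_view sv_sub(struct str_view v, size_t pos, size_t n)
{
    if (pos > v.len)
        pos = v.len;
    if (n > v.len - pos)
        n = v.len - pos;
    struct str_view r = { v.data + pos, n };
    return r;
}
```

The raw pointer is still reachable through `data`, so the fast path stays available for the code that really needs it.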
Sure, but C was designed without such arrays. And a language with safe arrays won't be C.