And then we have Pascal-layout strings (as opposed to actual Pascal 'strings', a nightmare for other reasons, see Kernighan). They don't fix this problem (you just have to overwrite the start of the string, not its end) and have two much bigger problems: finite string length, and an increase in size of every string. The finite string length means that writing general string-handling algorithms without special cases for the rare event of large strings is impossible, and the increase in size of every string bloats small strings, which are by far the common case. You can patch both of these: the first, by making the finite string length as large as a pointer; and the second, by noting that alignment constraints in existing systems bloat the effective size of strings anyway. But of course this soon turns into a special case of nul-terminated strings: point the pointer at the end of the string, bingo, one rather hard-to-consult nul by any other name.
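To make the layout concrete, here is a minimal sketch in C of a Pascal-layout string with a one-byte length header. The names `pstr`, `PSTR_MAX` and `pstr_set` are made up for illustration and come from no real library; the one-byte header is just the classic choice, assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical Pascal-layout string: one length byte, then the content.
 * The one-byte header is an assumption; it caps the length at 255. */
enum { PSTR_MAX = UINT8_MAX };

typedef struct {
    uint8_t len;            /* header: sits right in front of the content */
    char    data[PSTR_MAX]; /* content: not nul-terminated */
} pstr;

/* Copy a C string into s. Returns 0 on success, or -1 if src is longer
 * than the header can describe (the "finite string length" special case). */
static int pstr_set(pstr *s, const char *src)
{
    size_t n;

    assert(s != NULL && src != NULL);
    n = strlen(src);
    if (n > PSTR_MAX)
        return -1;
    s->len = (uint8_t)n;
    memcpy(s->data, src, n);
    return 0;
}
```

Every caller of `pstr_set()` has to handle the -1 case, and a single stray write to the first byte of the object changes the recorded length without touching the content, which is the same corruption problem moved from the end of the string to its start.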
The biggest downside is probably a long-term ABI problem. The scheme is inflexible. If your Pascal string-length header is too short, however do you expand it? It's wired into every string-using program out there! At least nul-terminated strings need no expansion.
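A rough illustration of that ABI trap, as a sketch only: suppose some code is built against a one-byte header ("v1") and other code is later rebuilt with a four-byte little-endian header ("v2"). Both layouts and the names `write_v2` and `read_v1` are hypothetical, invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Writer built against the hypothetical "v2" layout:
 * 4-byte little-endian length, then the content. Returns bytes written. */
static size_t write_v2(unsigned char *buf, const char *src)
{
    uint32_t n = (uint32_t)strlen(src);

    buf[0] = (unsigned char)(n & 0xff);
    buf[1] = (unsigned char)((n >> 8) & 0xff);
    buf[2] = (unsigned char)((n >> 16) & 0xff);
    buf[3] = (unsigned char)((n >> 24) & 0xff);
    memcpy(buf + 4, src, n);
    return 4 + (size_t)n;
}

/* Reader built against the hypothetical "v1" layout:
 * 1-byte length, then the content. It cannot know the header grew. */
static size_t read_v1(const unsigned char *buf, char *out, size_t outsz)
{
    size_t n = buf[0];

    assert(n < outsz);
    memcpy(out, buf + 1, n); /* content is assumed to start at offset 1 */
    out[n] = '\0';
    return n;
}
```

Feed the output of `write_v2()` to `read_v1()` and the old reader gets the length right for short strings but picks up the remaining header bytes as content. Nothing in the data says which header width is in use, so every program has to be rebuilt in lockstep.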
The real solution to string-handling unfortunately requires a VM of some description which can prevent the program from accidentally overwriting fields in aggregates by writes to any other field or variable. Then you can do reliable Pascal strings, separating the length from the content, or reliable nul-terminated strings, with the separate compartment containing a pointer into the string. Unfortunately this is incompatible with low-level all-the-world's-a-giant-arena languages like C without very specialized fine-grained MMU hardware.
(I have, like everyone, written my own dynamic string-handling library when younger. It starts out simple, but it's amazing how soon you have to introduce extra code to track pointers and make freeing them in error cases less verbose, and extra code to track memory leaks... you need that in C anyway, of course, but the massive increase in dynamic memory use that dynamically allocating most strings brings tends to force them on you sooner than otherwise.)