> not all the punctuation characters are single bytes
I was referring to the punctuation characters used to delimit CSV, which are all ASCII characters (as are those used in XML).
> The rest of your points stand: if all you want to do is manipulate
> ASCII characters in a UTF-8 stream, you can do that without being
> Unicode-aware
My points were that you can do all of those things (e.g. search and replace) EVEN IF the input is non-ASCII.
Your example of delete key behaviour is an interesting one that comes under my category of "GUI and similar I/O". It is clearly necessary to delete back as far as the last character-starting byte. Doing so is not very hard.
> I suppose misbehaviour from this change is unlikely *if* you're in
> the US. Anywhere else? Bite your knuckles.
I am not in the U.S., and my code works with UTF-8 without the sort of major headaches that you allude to.