Um, not all the punctuation characters are single bytes. A huge variety of
punctuation is up above U+2000, for instance, including U+2010 (the
hyphen) and U+2003 (the em space). Helpfully this is somewhat jumbled up
with nonpunctuation stuff like numeric superscripts.
The rest of your points stand: if all you want to do is manipulate ASCII
characters in a UTF-8 stream, you can do that without being Unicode-aware
at all. But this will tend to annoy your users when they type in a € and
find that your program can't manipulate it because it's U+20AC. It'll
annoy your users even more to find that they can remove some characters,
but that others take several keystrokes to remove and miraculously
transmogrify into other characters as they do so. (More mess: the Euro
cent sign is U+00A2!)
I suppose misbehaviour from this change is unlikely *if* you're in the US.
Anywhere else? Bite your knuckles.