Yes, isalpha() and the rest of ctype are one thing that should be fixed. There are only three kinds of byte with the high bit set:
1. bytes that are not allowed in UTF-8.
2. "second" (continuation) bytes
3. "first" (lead) bytes
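To make the three kinds concrete, here is a rough sketch of a classifier. It assumes the RFC 3629 definition of UTF-8 (nothing above U+10FFFF, so 0xF5 and up can never appear); the enum and function names are made up for illustration and are not part of any existing library.

```c
/* Hypothetical names, for illustration only. */
enum utf8_byte_kind {
    UTF8_ASCII,   /* 0x00-0x7F: high bit clear */
    UTF8_CONT,    /* 0x80-0xBF: "second" (continuation) byte */
    UTF8_LEAD,    /* 0xC2-0xF4: "first" (lead) byte */
    UTF8_INVALID  /* 0xC0, 0xC1, 0xF5-0xFF: never appear in UTF-8 */
};

/* Classify one byte by its value alone, without looking at its neighbours.
 * Ranges assume RFC 3629; older UTF-8 definitions allowed more lead bytes. */
static enum utf8_byte_kind utf8_classify(unsigned char b)
{
    if (b < 0x80)
        return UTF8_ASCII;
    if (b < 0xC0)
        return UTF8_CONT;
    if (b < 0xC2 || b > 0xF4)
        return UTF8_INVALID;   /* overlong lead bytes or beyond U+10FFFF */
    return UTF8_LEAD;
}
```

Note that this only looks at single bytes. It says nothing about whether a whole sequence is well formed, e.g. whether a lead byte is followed by the right number of continuation bytes.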
I think first and second bytes should pass the isalpha() test. This would allow UTF-8 letters to be used in identifiers and keywords (of course it also allows UTF-8 punctuation and lots of other stuff, but that is about the best that can be done). I also think ctype should not vary depending on the locale. That is another thing that causes me nothing but trouble: most programmers resort to writing ">='a' && <='z'" and thus make their software even less portable.
Probably the ctype tables should add some bits to identify these byte types.
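Something along these lines is what I have in mind. This is only a sketch: the flag names, the table, and ct_isalpha() are all hypothetical, the byte ranges assume RFC 3629 UTF-8, and a real ctype table would carry many more bits (digit, space, and so on). The point is that the table is fixed at startup and never consults the locale.

```c
/* Hypothetical flag bits; a real table would have more. */
#define CT_ALPHA 0x01  /* ASCII letter A-Z or a-z */
#define CT_LEAD  0x02  /* UTF-8 "first" byte, 0xC2-0xF4 */
#define CT_CONT  0x04  /* UTF-8 "second" byte, 0x80-0xBF */

static unsigned char ct_table[256];

/* Fill the table once. Nothing here depends on setlocale(). */
static void ct_init(void)
{
    int b;
    for (b = 0; b < 256; b++) {
        unsigned char flags = 0;
        if ((b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z'))
            flags |= CT_ALPHA;
        else if (b >= 0x80 && b <= 0xBF)
            flags |= CT_CONT;
        else if (b >= 0xC2 && b <= 0xF4)
            flags |= CT_LEAD;
        /* 0xC0, 0xC1 and 0xF5-0xFF get no flags: not allowed in UTF-8. */
        ct_table[b] = flags;
    }
}

/* Locale-independent "is alpha" that also accepts UTF-8 first and
 * second bytes, so multibyte letters survive an identifier scanner. */
static int ct_isalpha(unsigned char b)
{
    return (ct_table[b] & (CT_ALPHA | CT_LEAD | CT_CONT)) != 0;
}
```

After calling ct_init() once, a scanner can use ct_isalpha() wherever it used isalpha(). With the separate bits, code that cares can still tell a lead byte from a continuation byte instead of lumping them all together as "alpha".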