You're talking as if the phrase "split a string on '\t'" is somehow ambiguous. If you rewrote that to say "split a bytestring on the byte value 9", would it somehow be different, even though the operations are identical?
Assuming ASCII is so normal that people don't even write down the assumption anymore. In every ASCII-compatible encoding still in wide use, the two are identical. Not even SJIS is crazy enough to break it: byte 9 never appears as part of a multi-byte character there. I suppose for completeness we should specify a byte as 8 bits, but hardly anyone questions that anymore.
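To make that concrete, here is a minimal Python sketch that compares the two operations on a few ASCII-compatible encodings. The helper name `split_matches` and the sample strings are made up for illustration; the SJIS sample deliberately uses characters whose second byte is 0x5C, the classic troublemaker.

```python
def split_matches(text: str, encoding: str) -> bool:
    """Return True if splitting the encoded bytes on byte value 9
    yields the same fields as splitting the text on '\t'."""
    raw = text.encode(encoding)
    # Byte-level split: no knowledge of the encoding is used here.
    by_bytes = [part.decode(encoding) for part in raw.split(b"\x09")]
    # Text-level split on the tab character.
    by_text = text.split("\t")
    return by_bytes == by_text


# Hypothetical samples, one per encoding (each must be encodable in it).
SAMPLES = {
    "utf-8": "naïve\t日本語\tend",
    "latin-1": "naïve\tcafé\tend",
    "shift_jis": "ソ\t表\t能",  # second byte of each character is 0x5C
}

for encoding, text in SAMPLES.items():
    print(encoding, split_matches(text, encoding))
```

This only covers encodings that keep the ASCII range intact; UTF-16 and UTF-32 are not in that family, so the check is not meant to apply to them.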
If you're decoding a chunk of text whose encoding is not determined a priori, then you need to parse that text to determine the correct encoding. If, as part of that, the system allows you to treat the string as if it were ASCII, that really doesn't seem like a problem to me.
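As a rough illustration of that kind of parsing, the sketch below scans the raw bytes as if they were ASCII, looks for an encoding declaration, and only then decodes. The declaration format is an assumption, loosely modelled on Python's own "coding:" source cookie, and the names `sniff_encoding` and `decode_text` are hypothetical.

```python
import re

# Assumed declaration format, loosely modelled on Python's "coding:" cookie.
# The pattern is a bytes pattern, so it only ever matches ASCII characters.
_COOKIE = re.compile(rb"coding[:=][ \t]*([-\w.]+)")


def sniff_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Scan the first two lines, treated as ASCII bytes, for a declaration."""
    for line in raw.split(b"\n")[:2]:
        match = _COOKIE.search(line)
        if match:
            return match.group(1).decode("ascii")
    return default


def decode_text(raw: bytes) -> str:
    """Decode raw bytes using whatever encoding the text declares."""
    return raw.decode(sniff_encoding(raw))


# Usage: the declaration is readable before the real encoding is known.
original = "# coding: shift_jis\nソ連\t表\n"
raw = original.encode("shift_jis")
print(sniff_encoding(raw))
print(decode_text(raw) == original)
```

The point of the design is that the sniffing step never decodes anything; it relies only on the ASCII range meaning the same thing in every candidate encoding.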
Copyright © 2017, Eklektix, Inc.