Those arguments could equally be made against UTF-8, where the same code point can be written as different byte sequences that some UTF-8 parsers will consider equal while others will reject as invalid (e.g. the overlong encoding of '\u0000' as '\xC0\x80'). The solution to this problem is to require that inputs be in a canonical form.
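To make the '\xC0\x80' case concrete, here is a rough Python sketch. The strict path uses Python's built-in UTF-8 decoder, which rejects overlong forms; `decode_two_byte_lenient` is a hypothetical helper I made up to mimic a parser that skips the shortest-form check, not anything from a real library.

```python
OVERLONG_NUL = b"\xc0\x80"   # two-byte overlong form of U+0000
CANONICAL_NUL = b"\x00"      # the only shortest-form encoding of U+0000


def decode_strict(data: bytes):
    """Decode with Python's UTF-8 codec; return None if the bytes are invalid."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_two_byte_lenient(data: bytes) -> int:
    """Hypothetical lenient parser: unpack a 110xxxxx 10xxxxxx pair
    without checking that the result needed two bytes."""
    lead, cont = data[0], data[1]
    return ((lead & 0x1F) << 6) | (cont & 0x3F)


strict_canonical = decode_strict(CANONICAL_NUL)        # accepted
strict_overlong = decode_strict(OVERLONG_NUL)          # rejected (None)
lenient_overlong = decode_two_byte_lenient(OVERLONG_NUL)  # yields code point 0
```

So the lenient parser sees U+0000 where the strict one sees garbage, which is exactly the disagreement that requiring canonical (shortest-form) input removes.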
Of course, once you start working with Unicode it isn't really enough to just require a unique representation for each code point. You can have multiple sequences of Unicode code points that have the same meaning. So you really want a normalised code point sequence encoded in a canonical form.
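As an illustration of "normalise, then encode canonically", here is a small Python sketch using the standard `unicodedata` module. Picking NFC is my assumption; the point holds for any normalisation form as long as everyone agrees on one. `canonical_bytes` is just an illustrative name.

```python
import unicodedata

composed = "\u00e9"      # 'é' as a single precomposed code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT


def canonical_bytes(text: str) -> bytes:
    """Normalise to NFC (an assumed choice), then encode as UTF-8."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


# The raw strings differ as code point sequences...
raw_equal = composed == decomposed
# ...but agree byte-for-byte once normalised and encoded.
canonical_equal = canonical_bytes(composed) == canonical_bytes(decomposed)
```

Comparing or hashing the output of something like `canonical_bytes` rather than the raw input is what gives you one byte sequence per meaning.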