UTF-8 actually specifies only one valid byte sequence for a given sequence of code points; while some parsers will accept other sequences (overlong encodings, for example), only one is valid and therefore canonical. UTF-7, on the other hand, doesn't have a single valid byte sequence, and doesn't seem to have any obvious canonical form.
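To make that concrete, here is a minimal sketch using Python's built-in codecs. The helper name `is_valid_utf8` and the sample byte strings are my own choices for illustration; the point is only that a strict UTF-8 decoder rejects the overlong form, while a UTF-7 decoder accepts two different byte sequences for the same code point.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts these bytes."""
    try:
        data.decode("utf-8")  # Python's decoder is strict by default
        return True
    except UnicodeDecodeError:
        return False


# "/" (U+002F) has exactly one valid UTF-8 encoding: the single byte 0x2F.
canonical_slash = b"/"
# The two-byte "overlong" form encodes the same code point but is invalid.
overlong_slash = b"\xc0\xaf"

# UTF-7 lets "A" (U+0041) be written directly or as a base64-shifted run.
utf7_direct = b"A"
utf7_shifted = b"+AEE-"

print(is_valid_utf8(canonical_slash))
print(is_valid_utf8(overlong_slash))
print(utf7_direct.decode("utf-7") == utf7_shifted.decode("utf-7"))
```

So comparing UTF-8 bytes is safe once invalid input has been rejected, whereas comparing UTF-7 bytes tells you nothing until you decode.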
The code point sequence issue is real (which is why I was careful not to say "character" anywhere), and unfortunately, there are multiple possible normalization forms (NFC, NFD, NFKC, NFKD). So not only do you need a normalized code point sequence, you need one in a particular normalization form that everything will agree on. (Also, since the availability of characters may affect the normalization, you might in principle have to specify the version of Unicode, although I think they're careful not to introduce new ways of getting the same character.) And, of course, you have to avoid using Apple products, because they silently rename your files to have a different normalization from what everybody else uses.
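A small sketch of the normalization problem, using Python's standard `unicodedata` module. The helper `same_text` and the choice of NFC as the default agreed-upon form are assumptions for illustration, not a recommendation from any spec.

```python
import unicodedata

# Two code point sequences that render as the same "é".
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def same_text(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to one agreed form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


# Without normalization the strings, and their UTF-8 bytes, differ.
print(composed == decomposed)
print(composed.encode("utf-8"), decomposed.encode("utf-8"))

# With an agreed form they compare equal.
print(same_text(composed, decomposed))

# The forms themselves disagree: NFC keeps the "fi" ligature, NFKC expands it.
ligature = "\ufb01"
print(unicodedata.normalize("NFC", ligature) == unicodedata.normalize("NFKC", ligature))
```

Both sides have to pick the same form; normalizing one side to NFC and the other to NFD just recreates the mismatch.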