Anybody who thinks strlen(utf8) should return anything other than the number of bytes in the string does not know what they are talking about. Sorry.
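To make that concrete, here is a minimal sketch of the most common reason anyone calls strlen at all: sizing a buffer. The helper name utf8_dup is made up for illustration, and it assumes a NUL-terminated string. Nothing in it needs to know how many characters there are.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a standard function): duplicate a
   NUL-terminated UTF-8 string. The only size it needs is the
   byte count that strlen already returns. */
char *utf8_dup(const char *s) {
    size_t bytes = strlen(s);        /* bytes, not characters */
    char *copy = malloc(bytes + 1);  /* +1 for the terminating NUL */
    if (copy == NULL) {
        return NULL;                 /* allocation failed */
    }
    memcpy(copy, s, bytes + 1);      /* copies the NUL as well */
    return copy;
}
```

For "na\xC3\xAFve" (the word "naive" with a diaeresis on the i), strlen reports 6, which is exactly the number the allocation needs, even though a reader sees 5 letters.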
UTF-8 is TRIVIAL if people would just WAKE UP and realize that it *is* trivial. The ONLY people who need to care where the character boundaries are, are the people writing low-level rendering routines that have to look up font glyphs.
But for some reason the fact that a byte array represents a series of characters causes otherwise intelligent programmers to turn into complete morons. It suddenly becomes IMPOSSIBLE to work with the bytes, just because of the type of data in the string!
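As a rough sketch of what "working with the bytes" looks like, here is substring search. The helper name utf8_contains is hypothetical, and it assumes both arguments are valid, NUL-terminated UTF-8. Because lead bytes and continuation bytes occupy disjoint ranges in UTF-8, a byte-wise match of a valid needle can only begin on a character boundary, so plain strstr does the job.

```c
#include <string.h>

/* Hypothetical helper (not a standard function): report whether
   haystack contains needle. Assumes both are valid UTF-8. The
   search is purely byte-wise; no decoding is done. */
int utf8_contains(const char *haystack, const char *needle) {
    return strstr(haystack, needle) != NULL;
}
```

Searching "caf\xC3\xA9 au lait" for "\xC3\xA9 au" succeeds, and searching a lone "\xC3\xA9" for an ASCII "e" fails, because no byte of a multi-byte sequence is ever in the ASCII range.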
Here is a thought experiment: how in the world are we capable of making files containing English text when all the *words* are different sizes? Why, it must be impossible! Counting words will be so slow and inefficient! How could the programs ever work?
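The thought experiment can be run literally. Below is a minimal word counter that scans bytes and never decodes a character. The name count_words is made up, and the sketch assumes words are separated only by ASCII space, tab, newline, or carriage return (it does not treat non-ASCII spaces such as U+00A0 as separators). It gives the same answer for UTF-8 as for plain ASCII, because every byte of a multi-byte sequence is 0x80 or above and can never be mistaken for ASCII whitespace.

```c
#include <stddef.h>

/* Hypothetical helper (not a standard function): count words in a
   NUL-terminated string, where words are runs of bytes separated
   by ASCII whitespace. Works unchanged on UTF-8 input. */
size_t count_words(const char *s) {
    size_t words = 0;
    int in_word = 0;  /* nonzero while scanning inside a word */

    for (; *s != '\0'; ++s) {
        unsigned char b = (unsigned char)*s;
        int is_space = (b == ' ' || b == '\t' || b == '\n' || b == '\r');

        if (is_space) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;  /* first byte of a new word */
            ++words;
        }
    }
    return words;
}
```

Variable-size words never stopped anyone from writing this loop for English text, and variable-size characters do not stop it either.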