Filesystems and case-insensitivity
Posted Nov 28, 2018 21:06 UTC (Wed) by smurf (subscriber, #17840)
In reply to: Filesystems and case-insensitivity by perennialmind
Parent article: Filesystems and case-insensitivity
It's easy to encode invalid byte sequences so that they survive a round trip through Unicode / UTF-8 – you misappropriate the surrogates. The actual higher-level semantics of that, though, are fraught with corner cases you *really* don't want to deal with.
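For illustration, one existing instance of this surrogate trick is Python's "surrogateescape" error handler (PEP 383); the comment does not name it, so treat the choice of Python as an assumption. The sketch below uses a made-up file name and hypothetical helper names to show both the round trip and one of the corner cases: the resulting string is not valid Unicode and cannot be encoded strictly.

```python
# Sketch only: the file name and helper names are made up for illustration.

def to_text(raw: bytes) -> str:
    # Each byte that is invalid as UTF-8 becomes a lone low surrogate
    # in the range U+DC80..U+DCFF (byte 0xE9 -> U+DCE9).
    return raw.decode("utf-8", errors="surrogateescape")

def to_bytes(text: str) -> bytes:
    # The same handler turns those lone surrogates back into the raw bytes.
    return text.encode("utf-8", errors="surrogateescape")

def encodes_strictly(text: str) -> bool:
    # Corner case: a string carrying escaped bytes is not well-formed
    # Unicode, so a strict UTF-8 encode refuses it.
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False

raw_name = b"caf\xe9.txt"   # Latin-1 bytes, not valid UTF-8
text_name = to_text(raw_name)
```

Here to_bytes(text_name) gives back raw_name unchanged, but text_name can no longer be handed to anything that insists on well-formed Unicode, which is where the corner cases start.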
Basically IMHO there are two sane choices – (a) the current situation: the kernel does not attach any semantics to any bytes other than '/' and '\0' (thus there is no chance for case insensitivity beyond ASCII), or (b) you use clean and preferably pre-normalized UTF-8 on the userspace/kernel boundary, outlaw anything nonconforming, and do everything else in userspace. Anything else is a recipe for long-term disaster.
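A rough sketch of what the check in option (b) could look like, written in Python for brevity rather than as kernel code. The function name is hypothetical, and NFC is an assumed choice: the comment only says "pre-normalized" without picking a normalization form.

```python
import unicodedata

def is_acceptable_name(name: bytes) -> bool:
    """Hypothetical boundary check for a single path component."""
    # The two bytes that already have meaning today stay off-limits.
    if not name or b"/" in name or b"\0" in name:
        return False
    # "Clean" UTF-8: a strict decode rejects invalid sequences,
    # overlong forms and encoded surrogates.
    try:
        text = name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # "Pre-normalized": assumed here to mean the name is already in NFC.
    return unicodedata.normalize("NFC", text) == text
```

Under these assumptions a name with a precomposed "é" passes, while the same name spelled as "e" plus a combining acute accent is refused and has to be normalized in userspace first.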
