Case-insensitive filesystem lookups
Posted May 24, 2018 17:45 UTC (Thu) by excors (subscriber, #95769)
In reply to: Case-insensitive filesystem lookups by epa
Parent article: Case-insensitive filesystem lookups
In many cases, I think filenames really are a UI concept that is being used directly as a core part of the filesystem (the disk format plus the associated APIs and protocols like SMB), which feels like a serious layering violation. When a user saves a document, they give it a human-readable name so they can find it later in a list of all their saved documents. They don't care whether it's stored with that name as its filename, or stored as "cff5f247-64bd-4066-ab2f-66ff8aed2322.doc" with the name in some metadata, or stored in a special database and not as a separate file at all; the UI could be the same for all of those. But since we choose to implement it with human-readable filenames, the UI is complicated by filesystem restrictions (why can't the user put "/" in a document name?), and the filesystems (and their APIs, protocols, etc.) are complicated by UI issues (Unicode, case sensitivity, locale dependence, etc.). It seems particularly bad given that Unicode changes over time and locales differ between users, while filesystems are persistent and shared, so there's a fundamental mismatch.
Surely there must be a better way to design the system, if legacy compatibility didn't matter, where the implementation details of storing and referencing files are more cleanly separated from the UI concept of naming files? (Though of course legacy compatibility does matter more than almost anything else, so this is hypothetical and probably pointless.)
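To make the idea a bit more concrete, here's a rough Python sketch of that kind of separation. It isn't how any real filesystem or application works; DocumentStore, index.json and the method names are all made up for illustration. Files are stored under opaque UUIDs, and the human-readable name lives only in a metadata index that the UI layer is free to interpret however it likes.

```python
import json
import uuid
from pathlib import Path


class DocumentStore:
    """Hypothetical store: opaque on-disk IDs, display names kept as metadata."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # Assumed layout: one JSON index mapping document ID -> display name.
        self.index_path = self.root / "index.json"
        if self.index_path.exists():
            self.index = json.loads(self.index_path.read_text(encoding="utf-8"))
        else:
            self.index = {}

    def save(self, display_name, data):
        """Store bytes under a fresh UUID; the name never touches the filesystem."""
        doc_id = str(uuid.uuid4())
        (self.root / doc_id).write_bytes(data)
        self.index[doc_id] = display_name
        self.index_path.write_text(
            json.dumps(self.index, ensure_ascii=False), encoding="utf-8"
        )
        return doc_id

    def find(self, query):
        """UI-level lookup. Case-insensitive matching is a policy choice made
        here, not a property of the storage layer."""
        key = query.casefold()
        return [d for d, name in self.index.items() if name.casefold() == key]

    def load(self, doc_id):
        """Storage-level lookup by stable identifier."""
        return (self.root / doc_id).read_bytes()
```

With this layout a name like "Budget 2018/Q2" is perfectly legal, two documents whose names differ only in case can coexist, and changing the matching rules (a different locale, a newer Unicode version) only touches find(), not the stored data.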
(There are other cases where filenames aren't UI, they're well-known identifiers like "/etc/passwd" or "c:\autoexec.bat" - the name is needed as a portable way for programs to refer to a particular file. But they have very different requirements to user-chosen document names, e.g. ASCII is probably fine, and it's not obvious that the same solution should be used for them.)
