Case-insensitive filesystem lookups

Posted May 24, 2018 17:45 UTC (Thu) by excors (subscriber, #95769)
In reply to: Case-insensitive filesystem lookups by epa
Parent article: Case-insensitive filesystem lookups

> I sympathize with the view that these things should be handled in the user interface, not the filesystem. If you have to put locale code in the filesystem itself you've surely taken a wrong turning.

In many cases, I think filenames really are a UI concept that is being used directly as a core part of the filesystem (the disk format plus the associated APIs and protocols like SMB), which feels like a serious layering violation. When a user saves a document, they give it a human-readable name so they can find it later in a list of all their saved documents. They don't care if it's stored with that name as its filename, or if it's stored as "cff5f247-64bd-4066-ab2f-66ff8aed2322.doc" and the name is in some metadata, or if it's stored in a special database and not as a separate file at all - the UI could be the same for all of those. But since we choose to implement it with human-readable filenames, the UI is complicated by filesystem restrictions (why can't the user put "/" in a document name?), and the filesystems(/APIs/protocols/etc) are complicated by UI issues (Unicode, case sensitivity, locale dependence, etc). It seems particularly bad given that Unicode changes over time, and locales differ between users, while filesystems are persistent and shared - there's a fundamental mismatch there.

Surely there must be a better way to design the system, if legacy compatibility didn't matter, where the implementation details of storing and referencing files are more cleanly separated from the UI concept of naming files? (Though of course legacy compatibility does matter more than almost anything else, so this is hypothetical and probably pointless.)

(There are other cases where filenames aren't UI, they're well-known identifiers like "/etc/passwd" or "c:\autoexec.bat" - the name is needed as a portable way for programs to refer to a particular file. But they have very different requirements to user-chosen document names, e.g. ASCII is probably fine, and it's not obvious that the same solution should be used for them.)


Case-insensitive filesystem lookups

Posted May 25, 2018 19:03 UTC (Fri) by drag (guest, #31333) [Link]

> Surely there must be a better way to design the system, if legacy compatibility didn't matter, where the implementation details of storing and referencing files are more cleanly separated from the UI concept of naming files?

Since Unix filenames can be arbitrary strings, just use a hash of the file as the name it is stored under in the file system. Then you manage the human-readable names at the application layer by providing a handy dandy API for everybody to use.
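As a rough illustration of that idea, here is a minimal Python sketch. It assumes "a hash of the file" means a SHA-256 of the contents, and the `NameStore` class, the `blobs` directory and the `names.json` index are all hypothetical names invented for this example, not an existing API.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class NameStore:
    """Hypothetical store: blobs live under content hashes, names in a side index."""

    def __init__(self, root: Path) -> None:
        self.blobs = root / "blobs"
        self.blobs.mkdir(parents=True, exist_ok=True)
        self.index_path = root / "names.json"
        self.index = {}
        if self.index_path.exists():
            self.index = json.loads(self.index_path.read_text(encoding="utf-8"))

    def save(self, display_name: str, data: bytes) -> str:
        # The on-disk filename is the hex digest, so it is always plain ASCII.
        digest = hashlib.sha256(data).hexdigest()
        (self.blobs / digest).write_bytes(data)
        # The user-visible name is only ever stored as data in the index.
        self.index[display_name] = digest
        self.index_path.write_text(
            json.dumps(self.index, ensure_ascii=False), encoding="utf-8"
        )
        return digest

    def load(self, display_name: str) -> bytes:
        return (self.blobs / self.index[display_name]).read_bytes()


with tempfile.TemporaryDirectory() as tmp:
    store = NameStore(Path(tmp))
    # A display name containing "/" is fine: it never reaches the filesystem.
    digest = store.save("Q3/Q4 report: final?", b"hello")
    on_disk = [p.name for p in store.blobs.iterdir()]
    loaded = store.load("Q3/Q4 report: final?")
```

Note that this only moves the naming policy (case folding, normalization, locale) into whatever code looks things up in the index; and keying on a content hash means an edited file gets a new on-disk name, which a real design would have to account for.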

Because just imagine that instead of one locale you have to make insensitivity work for ALL locales. A lot of Linux file systems house data that is globally sourced, with names in dozens, if not hundreds, of different languages.

Good luck making that work on a file-system layer.

I mean: what are you going to do?

The only way to have even a remote chance of making case insensitivity work is to embed the locale of each file right there in the file system's metadata, so the name can be handled the way it was intended. And then what are you going to do when an English-speaking user from North America edits a file that somebody in Greece created? Change the locale? Make the insensitivity work differently, or force the English user to understand the character set used by the person from Greece? How are you going to deal with file names that don't conflict in the original locale, but do after somebody edits them?
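To make the conflict problem concrete, here is a small Python sketch. It uses the Turkish dotted/dotless "i" rather than Greek, simply because that is the best-known case; `fold_english` and `fold_turkish` are hypothetical helpers written for this example, and the Turkish one is a hand-rolled approximation, not a real locale implementation.

```python
def fold_english(name: str) -> str:
    # Locale-independent lowercasing: "I" becomes "i".
    return name.lower()


def fold_turkish(name: str) -> str:
    # Approximation of Turkish rules: "I" lowercases to dotless "ı",
    # and dotted capital "İ" lowercases to "i". Done before lower()
    # so the default mapping does not get there first.
    return name.replace("I", "ı").replace("İ", "i").lower()


a, b = "TITLE.txt", "title.txt"

# Same pair of names, two different answers to "are these the same file?"
collide_en = fold_english(a) == fold_english(b)
collide_tr = fold_turkish(a) == fold_turkish(b)
```

Here `collide_en` is true and `collide_tr` is false, so whether the two names can coexist in one directory depends on which folding rule the filesystem applies.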

So the choice is really:

1. Have a case sensitive file system that always works under all circumstances that is simple, robust, and fast.

2. Have a case insensitive system with massive amounts of extra code and logic that will never actually have a chance of working.

YES, having a case-sensitive file system is bad UI. But it's impossible to make it actually work otherwise.

Therefore: if you are looking for a very good user interface, exposing a Unix-style file system to the user is not a good solution. You have to do something else.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds