|
|
Subscribe / Log in / New account

Rethinking fsinfo()

Rethinking fsinfo()

Posted Aug 25, 2020 19:02 UTC (Tue) by mathstuf (subscriber, #69389)
In reply to: Rethinking fsinfo() by excors
Parent article: Rethinking fsinfo()

> Most code that processes the path should treat it as an opaque blob or decode it to wchar_t*, and wouldn't need to care about Unicode or surrogates etc.

I agree that just stuffing paths into binary storage is the best solution. However, usually paths need displayed or the storage you're using has a human caring about it at some point in its lifetime. Especially if you're using a container format like JSON. It's nice and all, but a way to store arbitrary binary data without having to figure out how to encode it so that it is Unicode safe would have been much appreciated. (No, BSON don't fix this; they just change the window dressing from `{:"",}` into type-and-length-prefixed fields or type-and-NUL-terminated sequences). CBOR has binary data, but then library support is more widely lacking.

FWIW, I've spent a lot of time thinking about how to stuff paths into JSON: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p...


to post comments

Rethinking fsinfo()

Posted Aug 26, 2020 19:43 UTC (Wed) by unilynx (guest, #114305) [Link] (2 responses)

I hope for a future where someone introduces a 'sane-names' filesystem mount option, which will forbid the use of invalid UTF8, filenames starting with a dash or space, containing dollar signs, and all the other funny things that make processing filenames hard or dangerous. Spaces in filenames we probably have to live with.

Distributions might slowly make that option the default for new systems, sysadmins can opt-in faster themselves, unless they really have to deal with those few applications (which will hopefully disappear or become obsolete fast) that really, really want to create weird filenames.

Rethinking fsinfo()

Posted Aug 27, 2020 4:48 UTC (Thu) by neilbrown (subscriber, #359) [Link]

Requiring valid UTF-8 is probably sensible for a new filesystem.
Excluding end-of-line characters is probably justifiable too. (or any control char ... I don't think we need TAB or DEL).
Anything else is parochial.
When I'm choosing a name to save my document from my GUI, why should I care about your inability to write safe shell scripts, or even have any understanding that "the shell" exists.
It is bad enough that I cannot put a '/' in my file names, why would you prevent me using '$'??

Rethinking fsinfo()

Posted Aug 29, 2020 11:29 UTC (Sat) by flussence (guest, #85566) [Link]

We're slowly getting there, the kernel has Unicode normalisation for filesystems at long last. I think we could live without ASCII control chars next, though I don't agree that we should forbid filenames from containing strings that an average person at a regular keyboard could type. Computers are meant to serve people, not the other way around.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds