
A report from the documentation maintainer


Posted Nov 2, 2016 22:44 UTC (Wed) by nybble41 (subscriber, #55106)
In reply to: A report from the documentation maintainer by farnz
Parent article: A report from the documentation maintainer

> ... the Unicode solution is based on the principle that it's better for the algorithm to match things it shouldn't, than it is for it to miss things it should match.

That's debatable, and it really depends on what the glob pattern is being used for. When deleting files, for example, it would generally be better to match conservatively so that you don't remove files which the user didn't expect to match. In most cases it's much easier to clean up any files which were missed than it is to restore ones which were unexpectedly removed. The same would apply to any operation which modified files in place (e.g. perl -i).

Personally, I just set LC_COLLATE=C and compare all strings bytewise--which does not preclude the use of Unicode filenames. I find this far less surprising than any of the locale-specific options, and think it would be a safer default, especially for scripts. Interactively, absent some indication of the user's intent, perhaps the shell should evaluate the glob pattern both ways and generate an error if the results do not agree.
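As a rough illustration of the "evaluate both ways and complain if they disagree" idea, here is a minimal Python sketch. It does not call any real shell or libc globbing code: `bytewise_key` stands in for LC_COLLATE=C, `interleaved_key` is a hypothetical, much-simplified stand-in for a locale that collates aAbBcC..., and `safe_match_range` and `AmbiguousGlob` are names invented for this example. It only handles a pattern of the form `[lo-hi]*`.

```python
class AmbiguousGlob(Exception):
    """Raised when the two collation orders give different matches."""


def bytewise_key(ch: str) -> bytes:
    # C-locale style: compare the raw encoded bytes.
    return ch.encode("utf-8")


def interleaved_key(ch: str) -> tuple:
    # Hypothetical stand-in for a locale collating aAbBcC...
    # Real locale collation is far more involved; illustration only.
    return (ch.lower(), ch.isupper())


def match_range(names, lo, hi, key):
    """Return the names whose first character lies in [lo-hi] under `key`."""
    return [n for n in names if n and key(lo) <= key(n[0]) <= key(hi)]


def safe_match_range(names, lo, hi):
    """Match `[lo-hi]*` both ways; refuse to answer if the results differ."""
    strict = match_range(names, lo, hi, bytewise_key)
    loose = match_range(names, lo, hi, interleaved_key)
    if strict != loose:
        differing = sorted(set(strict) ^ set(loose))
        raise AmbiguousGlob(
            f"[{lo}-{hi}]* is ambiguous; differing matches: {differing}"
        )
    return strict
```

With the names `apple`, `Banana`, and `cherry`, the range `[a-z]` matches `Banana` only under the interleaved order, so the sketch raises instead of silently picking one interpretation; with only lowercase names the two orders agree and the list is returned.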



A report from the documentation maintainer

Posted Nov 2, 2016 23:13 UTC (Wed) by mstone_ (subscriber, #66309)

The solution here is simply not to blindly delete files using glob patterns. If you need this for some reason you'd darn well better set your locale to C, but you should probably just come up with a safer solution.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds