Wheeler: Fixing Unix/Linux/POSIX Filenames
Posted Mar 26, 2009 2:44 UTC (Thu) by explodingferret (guest, #57530)
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames
freenode, so I have to deal with a lot of issues to do with quoting,
word splitting, etc.
Here are some problems I noticed in your article:
1) "These restrictions only apply to Windows - Linux, for example, allows
use of " * : < > ? \ / | even in NTFS." -- is "/" supposed to be in that list?
2) You state that changing IFS and banning newlines and tabs in filenames would make things like 'cat $file' safer, but you should also state that the shell glob characters (namely *?[]) would need to be banned too, because an unquoted $file still undergoes pathname expansion after word splitting.
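To illustrate point 2, here is a minimal sketch. The scratch directory and file names are made up for the demo, and it assumes mktemp -d is available (it is common but not required by POSIX). IFS is restricted to newline and tab, yet the unquoted variable still expands as a glob:

```shell
#!/bin/sh
# Work in a scratch directory so the demo cannot touch real files.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Two ordinary files plus one whose whole name is a glob character.
touch a.txt b.txt '*'

# Restrict IFS to newline and tab, as the article proposes.
IFS=$(printf '\n\t')

file='*'

# Unquoted: nothing is split, but pathname expansion still happens,
# so the single name turns into every file in the directory.
set -- $file
unquoted_count=$#

# Quoted: the name is passed through as one literal word.
set -- "$file"
quoted_count=$#

echo "unquoted: $unquoted_count word(s), quoted: $quoted_count word(s)"

cd / && rm -rf "$demo_dir"
```

So a command like 'cat $file' would read all three files here, not just the one named '*'.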
3) You state (or at least imply) that there is no way to reliably use filenames from find, but there is a POSIX compliant and known portable method:
find . -type f -exec somecommand {} \;
or for more complex cases:
find . -type f -exec sh -c 'if true; then somecommand "$1"; fi' -- {} \;
For xargs fans, this works for all filenames except those containing newlines:
find . -type f | sed -e 's/./\\&/g' | xargs somecommand
Backslash escaping is a feature of xargs and is specified by POSIX. Escaping every character this way avoids various quoting problems with xargs that you don't mention.
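Both approaches from point 3 can be tried with a rough sketch like the following. The file names are hypothetical, mktemp -d is assumed to exist, and a tiny inline script stands in for "somecommand": it prints one x per argument that names an existing file, so the byte count is the number of names that arrived intact.

```shell
#!/bin/sh
# Scratch directory with deliberately awkward (made-up) file names.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch 'two words' "it's" 'say "hi"' 'back\slash' 'star*'

# find -exec: each name becomes "$1" of the inner shell; the word
# right after the script text ("sh" here) is only used as $0.
exec_count=$(find . -type f \
  -exec sh -c 'test -f "$1" && printf x' sh {} \; |
  wc -c | tr -d ' ')

# xargs: backslash-escape every character so that blanks, quotes and
# backslashes in a name lose their special meaning to xargs.
xargs_count=$(find . -type f | sed -e 's/./\\&/g' |
  xargs sh -c 'for f in "$@"; do test -f "$f" && printf x; done' sh |
  wc -c | tr -d ' ')

echo "find -exec: $exec_count of 5, xargs: $xargs_count of 5"

cd / && rm -rf "$demo_dir"
```

None of these names contains a newline, which is the one case the xargs variant cannot handle.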
4) Your setting of IFS to a value of tab and newline is overly complicated. Simply use IFS=`printf '\n\t'`. Command substitution removes only trailing newlines, so putting the tab last keeps both characters. If the different behaviour this causes with "$*" (which joins its arguments with the first character of IFS) is not desired, one can set IFS=`printf '\t\n\t'`. I know of no tool or POSIX restriction that says characters may not be repeated in IFS.
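A small sketch of the two effects mentioned in point 4; the variable names are only for illustration. It shows that only a trailing newline is stripped by command substitution, and that "$*" joins on the first character of IFS:

```shell
#!/bin/sh
# Command substitution strips trailing newlines, so a newline placed
# last is lost while a newline placed first survives.
lost=$(printf '\t\n')    # only the tab is left
kept=$(printf '\n\t')    # newline and tab both survive
echo "tab-newline length: ${#lost}, newline-tab length: ${#kept}"

# "$*" joins the positional parameters with the FIRST character of IFS.
set -- a b

IFS=$(printf '\n\t')
joined_nl="$*"           # a, newline, b

IFS=$(printf '\t\n\t')
joined_tab="$*"          # a, tab, b

printf '%s\n' "$joined_nl" "$joined_tab"
```

Repeating the tab at both ends keeps tab as the "$*" separator while still protecting the newline from being stripped.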
Otherwise great article! It really would be so nice to use line-separated commands in `` and not have to worry about things breaking. And although most of the thoughts expressed here are well known to me, the idea of getting the kernel to check the validity of UTF-8 filenames is fantastic!
