Control characters in file names

Posted Nov 25, 2010 19:15 UTC (Thu) by jengelh (subscriber, #33263)
In reply to: Control characters in file names by Spudd86
Parent article: Ghosts of Unix past, part 4: High-maintenance designs

>do you know how man[y] lines of shell it takes

Aha. So... everybody knows it is possible to have files with odd filenames, and everybody keeps on using shells or shell constructs that cannot deal with this properly? I can see the flaw in that.

>something that can iterate over files and run one command on each when it must handle files that have those in the name?

for i in *; do cmd "$i"; done;
find . -whatever -exec cmd {} \;
find . -whateverelse -print0 | xargs -0 cmd;

There are so many safe ways available. I am really not responsible for people doing UUOC or the like.


Control characters in file names

Posted Nov 25, 2010 20:31 UTC (Thu) by Spudd86 (guest, #51683)

The for loop won't work... the find examples only work if you want to run a single, non-shell command.
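
One possible way around the single-command limit, as a minimal sketch: let find start a small inline shell script with -exec sh -c. The function name each_file and the two commands in the loop body are hypothetical placeholders, not something proposed in this thread. The file names reach the inner shell as arguments, so they are never word-split or re-parsed.

```shell
# Hypothetical helper: report every non-empty regular file under directory $1.
# find hands the names to the inner shell as positional parameters, so spaces,
# newlines, and other control characters arrive intact in "$f".
each_file() {
    find "$1" -type f -exec sh -c '
        for f do
            [ -s "$f" ] || continue      # first command: skip empty files
            printf "non-empty file\n"    # second command: report the rest
        done
    ' sh {} +
}
```

Called as each_file /some/dir, this prints one line per non-empty file. The second "sh" only fills in $0 for the inner script. A starting directory whose own name begins with a dash would still need a ./ prefix.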

Control characters in file names

Posted Nov 25, 2010 21:48 UTC (Thu) by jengelh (subscriber, #33263)

In which case will the for loop not work? (Other than * not globbing files starting with a dot.)

Control characters in file names

Posted Nov 25, 2010 22:00 UTC (Thu) by Spudd86 (guest, #51683)

If a file name starts with - or contains any sort of control character, it will break.

See here: http://www.dwheeler.com/essays/filenames-in-shell.html and here: http://www.dwheeler.com/essays/fixing-unix-linux-filename... For some reason I remember it being much worse than that, although being correct everywhere in your script could eventually be a pain.

Control characters in file names

Posted Dec 2, 2010 19:19 UTC (Thu) by Ross (subscriber, #4065)

Are you proposing to remove hyphens from filenames too, or is this getting off-topic? :)

Control characters in file names

Posted Nov 25, 2010 23:30 UTC (Thu) by cmccabe (guest, #60281)

> In which case will the for loop not work? (Other than * not globbing files
> starting with a dot.)

The for loop should be

for i in *; do cmd "./$i"; done;

In case one of the filenames begins with a dash.
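
A small sketch of why the ./ prefix matters, using cat as a stand-in for cmd. The function name cat_all is hypothetical. For a file named -n, a plain cat "$i" would pass -n as an argument that cat may take for an option, while cat "./$i" passes an argument that cannot start with a dash.

```shell
# Hypothetical helper: print the contents of every regular file in the
# current directory, including files whose names begin with a dash.
cat_all() {
    for i in *; do
        # The glob stays literal if nothing matches, so check the file exists.
        [ -f "./$i" ] || continue
        # "./$i" can never be mistaken for an option such as -n.
        cat "./$i"
    done
}
```

As noted earlier in the thread, the * glob still skips names that start with a dot.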

Control characters in file names

Posted Nov 26, 2010 10:28 UTC (Fri) by Yorick (subscriber, #19241)

Of course file names can be handled safely in most languages, but that's not the point. Wheeler describes it better and in more detail, but briefly, the aim is:
  • Make it harder to write mistaken, brittle and/or exploitable code. Even flawless programmers are affected by other people's errors.
  • Eliminate a dangerous class of control character exploits, mainly when displaying file names on terminals.
  • Allow for more design options. Remember, restricting data formats can be a way to give the programmer more freedom, not less.

To illustrate the last point: The only possible delimiter for file names is currently the null byte, which is not very practical in many languages and in shell scripting in particular. Linefeeds would be much more natural and are supported by many more tools.
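
To make the delimiter problem concrete, here is a minimal sketch that counts files two ways. The function names count_by_lines and count_by_nul are hypothetical, and the sketch assumes a find that supports -print0 (GNU and BSD find do). With a single file whose name contains a linefeed, the line-based count reports two files, while the NUL-based count reports one.

```shell
# Hypothetical helpers: count the regular files under directory $1.

# Linefeed-delimited: natural, but a name containing a newline
# is counted as two entries.
count_by_lines() {
    find "$1" -type f | wc -l | tr -d ' '
}

# NUL-delimited: correct for any name, but needs an extra step,
# here deleting every byte except the NUL terminators before counting.
count_by_nul() {
    find "$1" -type f -print0 | tr -dc '\0' | wc -c | tr -d ' '
}
```

The trailing tr -d ' ' only strips the padding that some wc implementations add.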

The benefits are clear, and the costs appear to be very low. The only serious objection I have seen so far concerns existing file names using an ISO 2022-based encoding. There are several possible solutions: allowing the control character restriction to be lifted as a per-mount option (possibly only allowing ESC, SI and SO), or a mount option that recodes into UTF-8.

Control characters in file names

Posted Nov 29, 2010 16:30 UTC (Mon) by nix (subscriber, #2304)

The xargs only works if you have at least one matching file. You want -0r. (Of course this is totally GNU-only.)
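
A short sketch of the -0r variant, assuming GNU findutils. The function name run_on_matches and its two parameters are hypothetical, and echo stands in for cmd. With -r, xargs does not run the command at all when find produces no names.

```shell
# Hypothetical helper: run a command once over all files under $1 whose
# names match the pattern $2, and not at all if nothing matches.
run_on_matches() {
    # -print0 / -0 keep odd file names intact;
    # -r (GNU extension) suppresses the run on empty input.
    find "$1" -type f -name "$2" -print0 | xargs -0r echo matched:
}
```

For example, run_on_matches /some/dir '*.txt' prints nothing when no .txt files exist. Without -r, GNU xargs would still run the command once with no file arguments.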

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds