Safename: restricting "dangerous" file names
There are few restrictions on file names in Linux—essentially just two: no "/" and no "\0"—but that freedom can lead to various problems, including security problems. Vulnerabilities like arbitrary file deletion and denial of service have resulted from programs mishandling file names with unexpected characters, for example. Most users and administrators do not use file names with control characters or other oddities, to the point where some may not even realize it is possible to construct such file names. Protecting those users from these kinds of unexpected problems and vulnerabilities is the target of the Safename Linux security module (LSM) that is being proposed by David A. Wheeler.
There are a myriad of ways that file names can go "wrong" on Linux. Consider a file name that begins with a "-"; if that name ends up on the command line (perhaps via a shell glob pattern), the file could be interpreted as a command-line switch. File names containing newlines or other control characters can also lead to unexpected results—and output. Beyond that, file names that are illegal in the system encoding (e.g. UTF-8) cannot be displayed sensibly.
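The leading-hyphen hazard is easy to demonstrate in a scratch directory; this sketch uses `mktemp` just to keep the demo contained:

```shell
# A file whose name begins with "-" is indistinguishable from an
# option once a glob expands it onto rm's command line.
demo=$(mktemp -d)
cd "$demo"
touch -- '-rf' victim   # "--" ends option parsing, so touch can create it
# "rm *" here would expand to "rm -rf victim": the file becomes a switch.
# Anchoring the glob defuses it, since "./-rf" cannot look like an option:
rm ./*
ls -A "$demo"           # prints nothing: both entries were removed as files
```

The `./` prefix is the conventional defense that comes up repeatedly in the comments below.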
The problems that come from unexpected file names are described on a web page that Wheeler maintains. On that page, he suggests that allowing system administrators to restrict the kinds of file names that can be created would alleviate a whole raft of problems. Safename would provide a mechanism to do just that.
As Wheeler notes on that page (and in the patches), POSIX defines "portable" file names that are quite a bit more restrictive (only ASCII alphanumeric characters, period, underscore, and hyphen if it is not the first character of the name). Other operating systems and some filesystems on Linux already impose more restrictions on file names, including disallowing space and control characters or mandating a particular encoding.
There have been security vulnerabilities caused by unexpected file names, including a denial of service in logrotate caused by newlines or backslashes in file names (CVE-2011-1155) and a remote arbitrary file deletion vulnerability in uscan caused by white-space characters in file names (CVE-2013-7085). Undoubtedly, others lurk in various programs, but the bigger problem is probably contained in scripts and other "one-off" programs that administrators write to solve a problem quickly—without considering the ways that "strange" file names can result in bugs, especially when run on user-controlled directories.
In the patch posting, Wheeler outlines three ways that these potentially dangerous file names might come about. A malicious user or application could directly create a file that is then used by some other non-malicious application leading to an exploit. Or a non-malicious, unprivileged application could be tricked by an attacker into creating a dangerous file name, which could then lead to an exploit when some non-malicious, but buggy, script or program uses the file. Similarly, a privileged application could be tricked into creating one of these file names, which could lead to an exploit when some other code handles the file name—which means that administrators may want a way to stop even privileged code from creating them.
Safename will help administrators avoid these kinds of problems by restricting the kinds of file names that can be created. Notably, it does not enforce any restrictions on existing file names, though that could be added as an (expensive) operation at mount time. It uses the LSM hooks for any operation that can create a new file name (file creation, hard or symbolic link creation, rename, and directory or special-file creation) and enforces a set of restrictions on them.
The behavior of Safename is governed by a number of control files that currently live under /proc/sys/kernel/safename, but will be moving to /sys/fs/safename based on a suggestion from Casey Schaufler. Enabling the feature for unprivileged users is done using the mode_for_unprivileged file, while privileged users' file name creation is governed by mode_for_privileged. Currently, "privileged" means having CAP_SYS_ADMIN, though that will change to CAP_MAC_ADMIN, also at Schaufler's suggestion, since it is less likely to be given to a process for other purposes (CAP_SYS_ADMIN is something of a catch-all).
There are two settings available that can be combined and written to the two mode files. They are implemented as two bits that govern whether the rules are enforced and whether illegal file names are logged (using printk_ratelimited()). The low-order bit is for the enforcement setting and the other is for the logging setting. So, zero means no enforcement or logging, one is for enforcement without logging, two for logging without enforcement, and three for both actions. For both modes, the default value is zero, which means no enforcement and no reporting (effectively the same as a kernel running without the module loaded).
In addition, there are configuration files to alter the rules for file names. The boolean utf8 file governs whether the file names must be valid UTF-8; it defaults to zero, which turns off UTF-8 checking. There are also three files that govern the character values allowed in various parts of the file name: first character, last character, and the characters in between. Those files are:
- permitted_bytes_initial: The permitted set of characters for the first byte of the file name; the default is 33-44,46-125,128-254, which omits control characters, space, hyphen, tilde, delete (0x7f), and 0xff.
- permitted_bytes_middle: The permitted set for the characters of the file name that are not the first or last (so file names of one or two characters are not subject to these requirements). By default, the value is 32-126,128-254, which leaves out control characters, delete, and 0xff.
- permitted_bytes_final: The set of characters allowed for the last byte of a file name (a one-character file name must pass the initial and final tests). The default is 33-126,128-254, which removes control characters, space, delete, and 0xff.
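Putting the control files together, a hypothetical configuration session might look like the following. The paths are those from the posted patches (they are slated to move under /sys/fs/safename), and the exact syntax accepted by the range files is an assumption based on the defaults shown above:

```shell
# Bit 0 = enforce, bit 1 = log (via printk_ratelimited()).
echo 3 > /proc/sys/kernel/safename/mode_for_unprivileged  # enforce + log
echo 2 > /proc/sys/kernel/safename/mode_for_privileged    # log only
echo 1 > /proc/sys/kernel/safename/utf8                   # require valid UTF-8

# Byte-range lists as described above; dropping 32 from the middle
# set would additionally forbid spaces inside file names:
echo '33-126,128-254' > /proc/sys/kernel/safename/permitted_bytes_middle
```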
The comments on the patches have been fairly sparse to date, but the proposal is an indicator that the security module stacking feature is leading to more special-purpose LSMs being developed. When the single LSM slot was generally occupied by one of the monolithic LSMs (e.g. SELinux, AppArmor, Smack), there was little point in creating smaller modules that catered to a specific security concern. With the ability to add multiple LSMs that came with module stacking, efforts like LoadPin and Safename will be able to offer specialized tools for administrators who want them.
| Index entries for this article | |
|---|---|
| Kernel | Modules/Security modules |
| Kernel | Security/Security modules |
| Security | Linux Security Modules (LSM) |
Posted May 12, 2016 6:35 UTC (Thu)
by ttonino (guest, #4073)
[Link] (60 responses)
Now, the names are generated by circumstances or by attackers.
- attacker submits a web form. Broken server side script splits name in 2 and executes half of it as a command.
The second problem would be foiled, because that file could not have been created in the first place. The first problem still exists.
A risk that file name filtering can trigger is seen in a recent Windows trojan, which installs a directory with a reserved name (I think com1: or similar), with the result that the directory cannot be deleted. But this sounds like less of a risk if the filtering is done cleanly.
Posted May 12, 2016 7:58 UTC (Thu)
by NAR (subscriber, #1313)
[Link]
Posted May 12, 2016 12:54 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (5 responses)
This security module should be much better, as it only disallows creation of reserved names. It does not forbid unlink() or open(), so you cannot have files that cannot be accessed or deleted.
Posted May 12, 2016 13:07 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (4 responses)
To be fair to DOS here, it's back-compat with DOS 1.0 that gives it this problem. In DOS 1.0, you didn't have directories, thus you could not have a /dev equivalent. When DOS 2.0 added directories, you could not assume that the programs users ran were directory-aware; if I did B: CHDIR AWESOME then A:\SAUCE, there was no way to tell whether an access to "COM1" was intended to be a DOS 1.0-style access to the serial port, or a DOS 2.0-style access to B:\AWESOME\COM1. DOS chose to assume that you always meant a DOS 1.0-style access to the device, and Windows chose to keep that constraint.
Posted May 12, 2016 13:40 UTC (Thu)
by khim (subscriber, #9252)
[Link] (3 responses)
The fact that MS DOS 2.0 kept all these special files accessible from all places is just laziness, plain and simple. MS DOS 2.0 could have fixed that problem easily. Now, with MS DOS 3.0 or MS DOS 5.0 it would have been problematic, because at that point there were programs which used the old behavior - but that's a different story. What MS DOS 2.0 did and what Windows did was clearly a mistake which haunts us to this day. That OS also made another mistake: it kept "/" as the command-line switch character by default. Sure, it introduced the ability to change it, but because it kept "/" as the default, we are still struggling with the crazy "/" vs "\" mixup. Now, it's easy to say that all these things were clearly mistakes - but they kept MS-DOS usable back then, and that's why we fight them now. In an alternate universe a "proper" MS-DOS would have just died and we would have blamed some other OS for our miserable present.
Posted May 12, 2016 14:16 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (2 responses)
At least one program I used (and ported to Linux) used functions 0Fh/14h/15h for all file access - it had been written for DOS 1.0, and a small amount of DOS 2.0 functionality was added as later extensions, using function 3Bh to change directory, but still using the old (working) core to access files. It massively reduced the scope of the rework required if 0Fh still worked as the "DOS 2.0" file access function - in this particular case, it meant that the program worked on both DOS 1.0 and DOS 2.0, but the extra functionality for DOS 2.0 machines simply didn't work (the program ate the errors).
And note that internally, this program was messy (as so many in-house programs are); once DOS 2.0 was deployed company-wide, later functionality was added using function 3Dh/3Fh/40h to access files, except where it was interfacing with the old core, where it used 0Fh/14h/15h. As a result, we needed an access to a file via 3Dh/3Fh/40h to access the same thing that 0Fh/14h/15h would, because otherwise whether (e.g.) COM2 was the second serial port or a file would depend on whether you were using the old bit of the codebase or the new...
Windows maintaining this in the Win16/Win32 API, however, has no excuse.
Posted May 17, 2016 14:15 UTC (Tue)
by khim (subscriber, #9252)
[Link] (1 responses)
I'm not exactly sure what you mean by that sentence. Did people stop writing messy programs when Windows was introduced? Or did they stop using old code? Win16 programs had access to the very same DOS functions in Windows 1.0/2.0, you know. And even with 3.x many INT21 functions were available. If it made no sense to break compatibility with MS DOS 2.0, then I don't see how Windows could have made any other choice. You mean your code was so convoluted and obfuscated that it was impossible to call a wrapper which would give you a "compatible" interface on top of the 3Dh/3Fh/40h functions? Were you writing your programs in Malbolge? Yes, I can easily understand why some programs needed that kludge. They could have easily added it to their own codebase - as a temporary one or a permanent one - but it would have been their choice. MS DOS 2.0 had no need for it; it could have easily fixed that problem, but it chose to be compatible. Windows suffered the same fate for the same reason.
Posted May 17, 2016 14:26 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
I mean that when you're redoing the program to use Windows file APIs, instead of calling the DOS file APIs, you can lose the back compat behaviour - you're having to do major surgery to bring in a GUI instead of a CLI or TUI anyway, so this is just one more "port from DOS to a GUI" tax that shouldn't cost you much, relative to the huge cost of having to redo every user interface decision in the program to use Win16 calls instead of DOS calls; after all, when I rewrote the program for Linux (in C, instead of a mix of 8080 CP/M assembler that went through an automated translator to get 8086 DOS 1.0 code, hand-written 8086 machine code using DOS APIs - co-worker who thought that assemblers were likely to introduce bugs to his code - and 8086 assembler written for DOS native APIs), I had to find and change all these API calls anyway. In contrast, for a lot of apps, the move from DOS 1.0 to DOS 2.0 was supposed to just bring in directory support, nothing else.
And yes, we could have added it internally - but then we wouldn't have bothered adding in directory support (costs too much for the value provided), and would have stuck with the DOS 1.0 API until we ported to another platform. Arguably, had they done that, we'd have ported to a UNIX earlier than we did - when we looked at adding a GUI, the decision was made to move to Linux/X11 instead of Windows, because we already had X11 servers running on client machines, and we could thus run the program on controlled Linux servers instead of having users able to copy it to floppies and ruin our careful deployment strategy.
Posted May 12, 2016 13:27 UTC (Thu)
by ksandstr (guest, #60862)
[Link] (2 responses)
Indeed. This is the same issue as UTF-8 filenames that encode slashes and zeroes without using either the byte for the forward slash, or the zero: programs that consume the name by decoding it to latin1 or normalized UTF-8 end up with slashes and early terminators in paths they generate.
I don't see how filtering the bytes could fix this, nor how the UTF-8 case could be addressed without decoding UTF-8 within the kernel (either to return EINVAL, or to cause a program to be unable to refer to a denormal name it knows from having created it).
Posted May 12, 2016 18:00 UTC (Thu)
by mbunkus (subscriber, #87248)
[Link]
Posted May 12, 2016 18:03 UTC (Thu)
by david.a.wheeler (subscriber, #72896)
[Link]
Just to be clear, safename does counter overlong encodings; just turn on the UTF-8 checking. If someone tries to create an overlong version (e.g., of '/') it will be rejected. That means that safename has to check every byte in the string, but it's quite fast. It only needs to check the new filename (not the whole path), and only when the filename is created (once it's created you're fine).
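To illustrate what that check catches, here is the classic overlong sequence; this is only a byte-level demonstration, not a claim about safename's internal implementation:

```shell
# The classic overlong encoding: '/' (0x2F) smuggled as the invalid
# two-byte sequence 0xC0 0xAF. A lenient decoder yields '/', so a name
# containing it can turn into a path separator after decoding; strict
# UTF-8 validation rejects any 0xC0 or 0xC1 lead byte outright.
printf 'evil\300\257passwd' | od -An -tx1
# the dump shows ... c0 af ... between the "evil" and "passwd" bytes
```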
Posted May 12, 2016 17:59 UTC (Thu)
by david.a.wheeler (subscriber, #72896)
[Link]
That's true to an extent, but Unix was created in 1971, and we still haven't managed to train users and developers to never make a mistake. Constructs like the glob "*.c" are actually dangerous in some circumstances; you're supposed to use ./*.c instead, but people still use the dangerous versions (many don't even know that they are dangerous). Since people still make mistakes, including mistakes caused by their failure to be omniscient, let's allow people to configure their system so the mistake can't have any bad effects.
Obligatory car example: Cars designed around the same time that Unix was designed ran just fine when people didn't make mistakes. However, those cars were death traps in an accident. Modern cars are designed to reduce damage when the inevitable accident occurs. I'm trying to bring this kind of thinking to Linux. This module means that when people make mistakes, the system is no longer a death trap; instead, it actively works to reduce or prevent the damage.
Exactly!
Hopefully that makes it clear why people might want this.
Correct, safename doesn't prevent that problem. But no single countermeasure solves all problems. What you really need is a set of countermeasures, each of which reduces risks.
There are other countermeasures for accidental name-splitting, e.g., modifying shell scripts to eliminate space from IFS. But that IFS-based countermeasure doesn't handle leading "-" in filenames, while safename does. Combining countermeasures, such as safename and removing space from IFS, eliminates both of the problems you listed.
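The IFS countermeasure can be sketched in a few lines of bash; the deliberately unquoted expansion shows the effect:

```shell
# With space removed from IFS, an unquoted expansion no longer splits
# "two words.txt" into two arguments (newline and tab still separate).
IFS=$'\n\t'
f='two words.txt'
set -- $f        # unquoted on purpose, to show the splitting behavior
echo "$#"        # one argument, despite the embedded space
# But IFS does nothing for a leading "-": that still needs "--", a ./
# prefix, or safename blocking the name's creation in the first place.
```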
That's a Windows-specific problem (that I do discuss on my page!). It's a nasty problem, but one that Unix and Linux don't have. I'll let Microsoft figure out how they want to deal with that problem in Windows :-).
Posted May 12, 2016 19:40 UTC (Thu)
by dlang (guest, #313)
[Link] (48 responses)
Posted May 12, 2016 20:26 UTC (Thu)
by viro (subscriber, #7872)
[Link] (47 responses)
Posted May 12, 2016 20:32 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (5 responses)
Posted May 12, 2016 20:46 UTC (Thu)
by viro (subscriber, #7872)
[Link] (4 responses)
Incidentally, I would not assume that this EPERM/EINVAL/whatnot will be gracefully handled by the same kind of code. As to which class of misbehaviour ends up nastier... If you have any data (as opposed to anecdotes) concerning that, I'd love to see it.
Posted May 14, 2016 16:46 UTC (Sat)
by nix (subscriber, #2304)
[Link] (3 responses)
Posted May 14, 2016 17:33 UTC (Sat)
by viro (subscriber, #7872)
[Link] (2 responses)
Posted May 18, 2016 22:24 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted May 18, 2016 22:24 UTC (Wed)
by nix (subscriber, #2304)
[Link]
Posted May 12, 2016 20:40 UTC (Thu)
by roc (subscriber, #30627)
[Link] (40 responses)
Posted May 12, 2016 20:42 UTC (Thu)
by roc (subscriber, #30627)
[Link]
Posted May 12, 2016 22:12 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
Well, we could take the opposite approach and create a kernel option to force *unsafe* filenames, randomly adding leading hyphens and embedded control characters, spaces, and other problematic byte sequences. That would quickly demonstrate which programs have issues in their filename-handling logic, perhaps allowing us to finally get to the root of the issue.
Only half joking. I have no problem with the "safename" concept so long as it isn't enabled by default, or used by applications as an excuse for poor filename hygiene.
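A crude, manual version of that half-joking test idea is easy to build: populate a scratch directory with deliberately nasty names and point your scripts at it.

```shell
# Create a menagerie of "unsafe" names for exercising scripts.
dir=$(mktemp -d)
touch -- "$dir/-rf" \
         "$dir/ leading space" \
         "$dir/trailing space " \
         "$dir/new
line" \
         "$dir/"$'\t'"tab"
ls -b "$dir"     # -b prints backslash escapes for the control characters
```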
Posted May 13, 2016 14:52 UTC (Fri)
by dgm (subscriber, #49227)
[Link] (37 responses)
If you want to check file names for safety, then you have to check close to where the problem is. In the example about rm and a file called "-R", the right place to check is the shell. No other place makes sense: rm cannot do it (too late), and neither can the kernel (which cannot know about all the possible kinds of applications and their name restrictions).
So no, a half-assed solution is not better than no solution at all. Bugs must be fixed at the right place or they really are not.
Posted May 13, 2016 18:24 UTC (Fri)
by nybble41 (subscriber, #55106)
[Link] (36 responses)
Why would the shell know that "-R" has special meaning to the "rm" command? Some commands (like "echo") would treat "-R" as a literal string. Also consider that commands exist which take options which do not start with a hyphen; for example, xterm gives special significance to arguments starting with "+", "%", and "#", in addition to "-".
The safest thing to do would be to require all pathnames to start with either "./" or "/", to distinguish them from non-pathnames.
Posted May 13, 2016 18:59 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (29 responses)
Ultimately, the problem comes from two design decisions interacting:

- the shell, not the individual command, expands globs, so a command sees only the resulting strings in its argv;
- options and filenames share that same flat argv, distinguished only by the leading-"-" convention.

Neither of them is wrong, but when you put them together, 'rm' has no idea that the user typed "rm *" and that it expanded to "rm -fr foom"; similarly, the shell does not know that '-fr' is a modifier from rm's point of view.
Posted May 14, 2016 8:50 UTC (Sat)
by dlang (guest, #313)
[Link] (20 responses)
Posted May 17, 2016 12:01 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (19 responses)
Spaces get escaped, yes, but (for example), there's nothing that tells rm that '-fr' in argv[1] was from the user typing '-<TAB>', and getting tab completion on the filename -fr, or that '--no-preserve-root' in argv[2] came from a completion function and is meant to be a command argument, not a filename.
Posted May 17, 2016 12:30 UTC (Tue)
by anselm (subscriber, #2796)
[Link] (18 responses)
I consider that a feature, not a bug.
Posted May 17, 2016 12:40 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (17 responses)
It's both; it means that the shell and rm cannot communicate about how a particular element in argv[] came to be, so there's no way for rm to know how to disambiguate things that are both options and filenames. This, in turn, means that the command touch -- '-fr' leads to user surprise, as it means that rm * is now recursive.
This could have been avoided (back in deep UNIX history) by convention: options begin with '-', and filenames begin with '.' (with tab-completion of filenames thus producing './file' instead of 'file'). This would, however, have entailed the early shell authors deciding to enforce that convention; it's now long enough ago that muscle memory and existing scripts have cemented the current behaviour.
Posted May 17, 2016 18:47 UTC (Tue)
by flussence (guest, #85566)
[Link] (16 responses)
You demonstrated this isn't the case only half a sentence later...
Posted May 17, 2016 19:12 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (15 responses)
I didn't. -fr is both a legitimate filename and an option to rm. The user can disambiguate them for rm by using a ./ prefix to say definitely a filename, but there is no equivalent for options, and the prefix is not required (nor is it put in there by shell expansions).
Thus the user can choose to be unambiguous, but if they are ambiguous, rm can't tell what they really meant - did I get -fr into argv[1] via glob expansion or tab completion (probably a filename) or by typing a literal -fr (probably an option).
Posted May 17, 2016 20:44 UTC (Tue)
by hummassa (guest, #307)
[Link] (14 responses)
You inadvertently came up with a nice and neat (IMHO) solution. Just adjust your shell so that * and .* globs expand to ./* and ./.*, respectively
Posted May 17, 2016 20:51 UTC (Tue)
by dlang (guest, #313)
[Link]
Posted May 18, 2016 9:59 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (10 responses)
That's not enough; you need anything that the shell creates as a "filename" to expand to ./result; otherwise rm -* (typo) is also ambiguous. Plus, you need to do this before the current behaviour gets set in historic tradition, so that people don't write scripts containing things like rm -{f,r,v}
Posted May 18, 2016 10:39 UTC (Wed)
by itvirta (guest, #49997)
[Link] (9 responses)
Uh, that's hideous. Luckily rm -frv usually works, and even rm -f -r -v is probably easier to write than that.
Posted May 18, 2016 10:48 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (8 responses)
Unfortunately, because this has historically worked, it's now expected behaviour; I've worked with people who consider it the "definitive" style for options, and who prefer to do things like rm --{force,recursive,verbose} because they think that's clearer than rm --force --recursive --verbose. If that's deep in a script, my modified shell is going to break.
Hence coming back to my original point; had early UNIX authors foreseen this gotcha, they could have required relative paths to begin ./, and thus avoided all this pain down the line, because rm would never have been passed a filename without the ./. Now, though, it's too late - too much legacy to fix.
Posted May 18, 2016 13:05 UTC (Wed)
by tao (subscriber, #17563)
[Link]
tar, ar, ps, etc. don't even prefix their options with a dash (though you can).
Posted May 19, 2016 9:57 UTC (Thu)
by gmatht (subscriber, #58961)
[Link] (6 responses)
There are other ways adding ./ could break existing scripts though. For example, the following would no longer give you a list of naughty users:
I guess "bash -n" could warn about unsafe use of "*", warn interactive users, and we could scan all packages for unsafe use of *.
Posted May 19, 2016 9:59 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (5 responses)
You missed case 3 - Bash did not have to scan the file system, but the user's intent was to match a file. For example rm log-{user,system}.txt. There's no way for Bash to detect that sanely without adding in a file system scan - but then the file system scan can cause bash to do the wrong thing for someone who does rm --{force,recursive}.
Posted May 19, 2016 11:18 UTC (Thu)
by anselm (subscriber, #2796)
[Link] (2 responses)
Of course brace expansion, by definition, has nothing to do with existing files. There are lots of legitimate cases for brace expansion where the expansion results are file names that a file system scan won't uncover, because they don't exist. Consider
Doing a file system scan here to “validate” the expansion results would be completely pointless if not counterproductive.
Posted May 19, 2016 11:31 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
Indeed; hence me saying that this isn't actually fixable now. The chance to insist that paths, other strings, and options were distinct entities (with a - hint as first character of an option, and a . hint as the first character of a path, thus not needing a binary protocol to provide the hints) has long since gone away, especially since there are now programs like ps which use the presence or absence of the - to choose between different option parsers.
Posted May 19, 2016 12:55 UTC (Thu)
by tao (subscriber, #17563)
[Link]
touch a b
Posted May 20, 2016 4:22 UTC (Fri)
by gmatht (subscriber, #58961)
[Link] (1 responses)
We don't know that log-{user,system}.txt is an option. We do know that log-{user,system}.txt is a fixed expansion that doesn't directly depend on any untrusted filenames. So either way we can pass it directly to the application and let it handle this ambiguity without worrying too much about the existence of files with malicious names tricking the application.
Posted May 20, 2016 8:15 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
Exactly; if I meant it to be a filename, bash can't tell the application anything that hints that that was my intent. Equally, if I meant it to be an option, bash can't tell the application anything that hints that that was my intent.
Thus, this is currently an insoluble problem without going back in time and changing the idioms for filenames and options such that filenames *always* began with . or / (which then makes log-{user,system}.txt clearly not a filename, as it starts with "l"), and options always began with -; then, you reserve all other characters for parameters that are neither filenames nor options (e.g. PIDs and IP addresses). This is about 40 years too late now (I wasn't even born when the decisions were being made), so it's an insoluble problem because any shell trying to enforce this needs to cope with the legacy that's already out there.
Posted May 25, 2016 17:19 UTC (Wed)
by Wol (subscriber, #4433)
[Link] (1 responses)
And then you do what we did, and blow up your system (well, we didn't exactly, but our backup system went mad...)
We installed a system ported across from Pr1mos. It used * everywhere as part of a filename ...
(* at the start of a name was sort of the equivalent of .exe in Windows. I'm not even sure it was stored in the directory tree on Pr1mos, but because they couldn't do whatever they did on Pr1mos, they put it in the directory tree on *nix ...)
Cheers,
Posted May 25, 2016 20:55 UTC (Wed)
by tao (subscriber, #17563)
[Link]
In the same manner I wouldn't limit filenames to 8.3 when porting from DOS to Linux, or allow backslash in filenames when porting the other way around...
Posted May 17, 2016 6:00 UTC (Tue)
by felix.s (guest, #104710)
[Link] (7 responses)
If I were to design an operating system, I'd have the shell interpret command line arguments and pass them to programs as flags, abstract objects representing file paths (something like O_PATH file descriptors) and opened files (ordinary file descriptors). Pipelines would be based around passing abstract objects, not meaningless blocks of bytes. Incidentally, this would also improve security (because programs no longer need to access arbitrary file system locations) and solve so many other big and little problems (say, <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=19165>; the compiler would no longer output text, it'd output objects representing diagnostics).
I think some of the advantages of this approach may be actually attainable on Linux today, by means of passing file descriptors over sockets and memfd. But to actually get to the root of the problem, I think you'd need to redesign Unix from scratch.
Bah, I got a bit carried away. But I had to let off this steam somewhere.
Posted May 17, 2016 6:37 UTC (Tue)
by jem (subscriber, #24231)
[Link]
Posted May 17, 2016 6:51 UTC (Tue)
by viro (subscriber, #7872)
[Link] (5 responses)
Posted May 17, 2016 7:54 UTC (Tue)
by dlang (guest, #313)
[Link]
Posted May 17, 2016 9:18 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted May 17, 2016 14:56 UTC (Tue)
by edgewood (subscriber, #1123)
[Link] (1 responses)
Posted May 17, 2016 15:07 UTC (Tue)
by rahulsundaram (subscriber, #21946)
[Link]
Posted May 17, 2016 17:46 UTC (Tue)
by dlang (guest, #313)
[Link]
Posted May 17, 2016 11:56 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (5 responses)
Posted May 17, 2016 14:56 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link] (4 responses)
To the shell, any non-builtin command (including "rm") just takes a list of strings. The shell doesn't (and shouldn't) have any knowledge of what these external commands do, or how they interpret their arguments. The convention about "special" arguments starting with a hyphen would be a heuristic at best, one with both false positives and false negatives. Prompts might be useful for an interactive shell, but would not be appropriate for most scripts—which are where the majority of the problem lies. Non-fatal warnings would either be a recurring nuisance when the arguments are known to be safe for a particular command or would come too late, informing the user of the issue after the damage is already done.
There is one safe and accurate solution: On the command side, implement the "--" protocol for marking the end of the "special" options; and when invoking commands through the shell, always use "--" and quote your variables. Use bash arrays and the "${name[@]}" syntax rather than relying on IFS if you need to manipulate lists of files. E.g.:
> # Instead of rm -f $(list-generator-command)
This won't handle filenames containing newlines. I do not know of any way to read NUL-delimited input (as in "find -print0" or "xargs -0") from a shell script, which would be the most obvious way around that restriction. However, it should be proof against anything else that might occur inside a filename.
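A sketch of that array-plus-"--" pattern, with a stand-in function in place of the (hypothetical) list-generator command:

```shell
# list_generator stands in for whatever command produces the file list.
list_generator() { printf '%s\n' '-rf' 'plain name.txt'; }

files=()
while IFS= read -r line; do
    files+=("$line")
done < <(list_generator)

# In real use this would be:  rm -f -- "${files[@]}"
# "--" ends option parsing, and the quoted array expansion keeps each
# name as exactly one argument, even "-rf" or names containing spaces.
printf 'would remove: <%s>\n' "${files[@]}"
```

As the comment notes, the newline-delimited read still cannot represent names that themselves contain newlines.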
Posted May 18, 2016 12:22 UTC (Wed)
by NAR (subscriber, #1313)
[Link] (2 responses)
Actually the shell does have some of this knowledge, used for completion, as at https://github.com/scop/bash-completion.
Posted May 18, 2016 18:14 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link]
The shell provides a framework for programmable completion as an aid to interactive users. It doesn't have any of that knowledge built in, the database may be incomplete or inaccurate, and even in an interactive setting the programmable completion feature may not be enabled. (I personally find it more annoying than helpful and prefer the more predictable, command-independent traditional completion, but tastes vary.) It also isn't generally available to scripts, which is where the problem lies.
Posted May 19, 2016 11:11 UTC (Thu)
by anselm (subscriber, #2796)
[Link]
The shell knows a few things about replacing bits of text with other bits of text. It doesn't really know (or care) what these bits of text mean, and what little knowledge it does have is only used for constructing command lines, not for making sense of them once they're there, which is what this discussion seems to be about.
Posted May 18, 2016 18:07 UTC (Wed)
by flussence (guest, #85566)
[Link]
Just as one example, imagine `find` having an -execv [envvar_name] option instead of having to deal with its horrible inline syntax...
Posted May 12, 2016 9:21 UTC (Thu)
by dlang (guest, #313)
[Link] (11 responses)
What happens if you are trying to get access to the contents of a tar or zip file and some file in it has a bad name?
so many ways this can go wrong..
Posted May 12, 2016 12:47 UTC (Thu)
by matthias (subscriber, #94967)
[Link] (10 responses)
Regarding your examples:
If you are accessing the contents of a tar, then either you untar the archive, in which case the module would prevent the creation of those files, or you access the contents by means of read() or mmap() and some library for decompressing the contents in RAM. In that case, the filenames usually do not end up being added to some command line via shell expansion.
Posted May 12, 2016 17:37 UTC (Thu)
by dlang (guest, #313)
[Link] (9 responses)
The world is not all UTF-8, let alone UTF8-en.
Posted May 12, 2016 17:40 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (8 responses)
Posted May 12, 2016 19:42 UTC (Thu)
by dlang (guest, #313)
[Link] (7 responses)
The issue isn't just with what you choose to use, but with what everyone else you deal with chooses to use.
Posted May 12, 2016 19:53 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (6 responses)
Posted May 12, 2016 20:01 UTC (Thu)
by dlang (guest, #313)
[Link] (5 responses)
I've dealt with "unable to display" filenames with tab completions and wildcards in the past. It's not fun, but it works. And the programs really don't care what the byte-string that is the filename looks like.
Posted May 12, 2016 20:07 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (4 responses)
Posted May 12, 2016 20:12 UTC (Thu)
by dlang (guest, #313)
[Link] (3 responses)
This includes things like Python 3, which decides that all strings must be UTF-8
Posted May 12, 2016 20:15 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (2 responses)
That doesn't alter the fact that they exist, and as such there's no expectation of interoperability between archives with legacy encodings in filenames and modern UTF-8 systems.
Posted May 12, 2016 20:23 UTC (Thu)
by dlang (guest, #313)
[Link] (1 responses)
Except that users expect things to work. They don't expect that their new update will make it impossible to work with others.
UTF-8 is supposed to make things easier, not cut them off.
The "modern UTF-8 systems" are the ones with the problems. Thank goodness there are still options and not everyone ignores backwards compatibility.
Posted May 12, 2016 20:29 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link]
Posted May 12, 2016 12:08 UTC (Thu)
by ssl (guest, #98177)
[Link] (4 responses)
Oh yeah, this will break if any Japanese, Egyptian or Georgian (just as examples) users would like to name files using their native scripts.
Posted May 12, 2016 12:25 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link] (3 responses)
How? The filtered bytes are never used in UTF-8 representations of any characters other than space, hyphen and tilde.
Posted May 12, 2016 17:14 UTC (Thu)
by sfeam (subscriber, #2841)
[Link] (2 responses)
Posted May 12, 2016 17:32 UTC (Thu)
by mjg59 (subscriber, #23239)
[Link]
Posted May 12, 2016 17:39 UTC (Thu)
by david.a.wheeler (subscriber, #72896)
[Link]
Safename requires that new filenames pass all the tests, and the system admin can configure the tests. If you use iso-8859-5 as your filename encoding, then you should leave the UTF-8 checking off, and possibly change the set of allowed bytes (as noted in the article, you can configure which bytes are allowed at the beginning, middle, and end).
I would encourage you to migrate to UTF-8 for filename encoding; it works everywhere, and desktop environments are increasingly requiring it. I believe the vast majority of users already encode filenames using UTF-8; in those cases, they can turn on the UTF-8 encoding checking, and be sure that the filenames really are valid UTF-8. There are even tools to help you automatically transition to UTF-8.
Safename doesn't require that you use UTF-8, though. Its byte checking is there specifically so it supports checking non-UTF-8 values.
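As a user-space sketch of what the UTF-8 validity check amounts to (my choice of iconv as the validator; Safename itself does its checking in the kernel):

```sh
# List file names under the current directory that are not valid UTF-8
# (assumes bash and GNU iconv; a user-space sketch, not Safename code).
find . -print0 | while IFS= read -r -d '' f; do
    printf '%s' "$f" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 \
        || printf 'not UTF-8: %s\n' "$f"
done
```

Names flagged this way are the ones that would be rejected at creation time with the "utf8" test enabled.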
Posted May 12, 2016 12:38 UTC (Thu)
by robbe (guest, #16131)
[Link] (3 responses)
I found two with CR at the end, and two HTML files from boost's documentation begin with a tilde (destructors?). Pretty harmless.
File names with a dash as their first character are by far the most numerous. systemd's usage of -.mount and -.slice will run into trouble under this module.
Posted May 12, 2016 19:55 UTC (Thu)
by error27 (subscriber, #8346)
[Link] (2 responses)
I get 169 files.
The systemd files you mentioned.
I'm in favour of this change, but it could be slightly annoying to transition. Except for the minus-one files, all the bad names look to be auto-generated. Some are generated by Linux programs, which can be fixed, but a lot were auto-generated on someone else's system.
Posted May 12, 2016 22:38 UTC (Thu)
by joey (guest, #328)
[Link] (1 responses)
Posted May 21, 2016 20:30 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
PI-Open used to create directories (which the OS was not supposed to muck about with the contents of) which contained a whole bunch of files, whose name format was <space><backspace><number>.
It did that because Pr1mos (from which it was ported) had "segmented directories". Basically, a directory with no space for the filename, so files had to be referenced by offset number. For the most part, those directories were meant for programs - each file was mapped to a memory segment when the program was loaded (hence "segmented directory"), but as programmers will, plenty of us found other good uses for them :-)
Cheers,
Posted May 12, 2016 20:10 UTC (Thu)
by flussence (guest, #85566)
[Link] (2 responses)
Posted May 12, 2016 21:32 UTC (Thu)
by wahern (subscriber, #37304)
[Link] (1 responses)
Expecting user software to get it right is absolutely necessary. It's not sufficient by itself, but it's by far the most productive strategy available.
Opinions on mitigations can reasonably vary, especially because not all mitigations are equal. But I simply cannot abide an opinion which discounts the value and imperative of encouraging correctly written software, insofar as correctness is definable. Fortunately, there are absolutely correct ways to securely handle filenames, in large part because the vulnerabilities are so well known and identifiable, at every layer and dimension of the software stack. And there are plenty of best practices to draw upon, albeit some more debatable than others.
Nobody can prevent someone from writing bad software, but experienced engineers can certainly shame and cajole and encourage both the implementation of and informed selection of correct software. Just look back to the quality of software circa 2000 to today. Worlds apart. Still horribly bug ridden, but at least you can cobble together small network services with a much higher degree of confidence. Remote exploits are much less common in low-level software. Good luck finding an RCE in a modern TCP/IP stack. I'm sure they're there, but not as many as there used to be even though the stacks are much more complex. And that's because of a push for correctness, not mitigations. We're incalculably more secure because of that approach.
Posted May 13, 2016 3:59 UTC (Fri)
by david.a.wheeler (subscriber, #72896)
[Link]
Clearly it's not one or the other. It's not possible to mitigate all possible problems. But when errors have a common pattern, a countermeasure can be helpful.
No, it's because people have worked on both correctness and mitigations. My guess is that more of the improvements have come from platform mitigations, not from efforts focused on correctness. Many programmers still have no idea how to write correct programs. But we have tools that mitigate common errors. Most programming languages prevent buffer overflows. Many web application frameworks now automatically protect against SQL injection and cross-site scripting. Address space layout randomization and stack canaries also counter many buffer overflows. These changes have little to do with developers knowing how to write correct programs, and everything to do with embedding mitigations into the system. Don't get me wrong, I think it is extremely important to focus on writing correct programs. This is not an either/or situation! But no one is perfect, not even people who know what they're doing, so it's important to also have the underlying system mitigate common problems when it is reasonable to do so.
Posted May 12, 2016 21:33 UTC (Thu)
by micka (subscriber, #38720)
[Link] (12 responses)
Posted May 12, 2016 21:38 UTC (Thu)
by sfeam (subscriber, #2841)
[Link] (10 responses)
Posted May 12, 2016 22:33 UTC (Thu)
by joey (guest, #328)
[Link] (7 responses)
Posted May 13, 2016 4:04 UTC (Fri)
by david.a.wheeler (subscriber, #72896)
[Link]
Posted May 13, 2016 14:42 UTC (Fri)
by nybble41 (subscriber, #55106)
[Link] (5 responses)
$ x='test'
I can see some logic in blocking control codes, leading and trailing whitespace, and non-UTF-8 filenames (for systems with UTF-8 locale), but would stop short of trying to restrict all potential shell metacharacters.
Note that effectively detecting whitespace and control codes in UTF-8 filenames is significantly more complicated than just matching certain bytes; the filter will need to be Unicode-aware.
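As an illustration of why a byte filter is not enough: U+00A0 (NO-BREAK SPACE) encodes in UTF-8 as the bytes 0xC2 0xA0, both of which lie inside the default permitted set of 32-126,128-254, yet the decoded character is whitespace.

```sh
# U+00A0 encodes as 0xC2 0xA0; both bytes pass the default byte-level
# filter, but the name still contains (Unicode) whitespace.
name=$(printf 'report\302\240final')
printf '%s' "$name" | od -An -tx1 | tr -d ' \n'
# prints 7265706f7274c2a066696e616c
```

Every byte shown is in the permitted ranges; only a Unicode-aware check can see the embedded space.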
Posted May 15, 2016 9:51 UTC (Sun)
by neilbrown (subscriber, #359)
[Link] (4 responses)
Not quite. I agree that leading "~" and "{a,b}" are not expanded after variable substitution, but '*' and '?' and '[...]' are.
At least, that is the case for "bash". For "csh" the rules are different:
% set a="~neilb"
Not that anyone would actually write a script using csh would they! Would they??
Posted May 15, 2016 18:33 UTC (Sun)
by nybble41 (subscriber, #55106)
[Link] (3 responses)
Good catch. I really didn't expect to find the shell performing pathname expansion when all the other forms were inhibited, but a more careful reading of the manual page shows that this occurs last, after word-splitting. Tilde expansion occurs in the same pass as parameter expansion, as do arithmetic expansion, command substitution, and process substitution, while brace expansion occurs earlier. Only word-splitting and pathname expansion come afterward.
This can be controlled with "set -f" in bash (disabling pathname expansion), but then the script can't use glob patterns at all. It would be nice to have an explicit pathname expansion syntax that worked independently of "set -f", but this doesn't appear to exist, so scripts have to choose between safe(r) parameter expansion and the ability to use glob patterns. Of course, the safest thing is to only expand parameters inside quoted strings, which prevents both word-splitting and pathname expansion, but this requires extra effort in the normal case rather than the exceptional case.
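The behavior described above can be seen directly in bash:

```sh
# Pathname expansion still applies to an unquoted variable expansion;
# quoting or "set -f" suppresses it.
cd "$(mktemp -d)" && touch a.txt b.txt
pat='*.txt'
echo $pat      # unquoted: expands to "a.txt b.txt"
echo "$pat"    # quoted: literal "*.txt"
set -f
echo $pat      # noglob: literal "*.txt" even unquoted
set +f
```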
> Not that anyone would actually write a script using csh would they! Would they??
Hopefully no one is writing *new* scripts for csh, but I have seen a few in legacy environments.
Posted May 16, 2016 14:50 UTC (Mon)
by cortana (subscriber, #24596)
[Link] (2 responses)
So 'rm *' removes one file with the name '*' but 'glob rm *' expands to 'glob rm foo bar baz' and so on.
You could have different builtins for different kinds of expansions (or flags to the glob built-in).
Posted May 17, 2016 18:37 UTC (Tue)
by flussence (guest, #85566)
[Link]
Posted May 20, 2016 22:11 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
This sounds like Pr1mos ... :-)
The Pr1mos shell had expansion and completion, and it also had ways for programs to communicate to/from the shell. It's so long ago, I've forgotten the details, but there was "-verify" and "-confirm" or something like that. So if I typed the command
DELETE @@ -NO_CONFIRM
the @@ would expand to all the files in the directory. DELETE told the shell that its defaults were both verify and confirm, so as the shell expanded the @@, it would ask me to verify the expansion. But because I'd said no_confirm, the shell would then execute the command without asking.
Okay, it relied on the guy who wrote DELETE to get it right, but because you set flags in the executable that the shell picked up, it was pretty flexible. Obviously, something non-dangerous like COPY would default to no_verify no_confirm.
I've always felt that Unix completion is crippled compared to Pr1mos. But then, it is Eunuchs - a castrated Multics. Pr1mos was a Multics-derivative too :-)
Cheers,
Posted May 13, 2016 7:20 UTC (Fri)
by micka (subscriber, #38720)
[Link]
Posted May 15, 2016 7:55 UTC (Sun)
by robbe (guest, #16131)
[Link]
Because, as others have rightly pointed out, you’d have to block tons of stuff to protect from all shell metacharacters.
Which programs (except the shell) interpret ~user in a special way? I think, compared to hyphen, this is negligible.
Posted May 12, 2016 22:40 UTC (Thu)
by viro (subscriber, #7872)
[Link]
A "tilde-prefix" consists of an unquoted <tilde> character at the beginning of a word, followed by all of the characters preceding the first unquoted <slash> in the word, or all the characters in the word if there is no <slash>. In an assignment (see XBD Variable Assignment), multiple tilde-prefixes can be used: at the beginning of the word (that is, following the <equals-sign> of the assignment), following any unquoted <colon>, or both. A tilde-prefix in an assignment is terminated by the first unquoted <colon> or <slash>.

If none of the characters in the tilde-prefix are quoted, the characters in the tilde-prefix following the <tilde> are treated as a possible login name from the user database. A portable login name cannot contain characters outside the set given in the description of the LOGNAME environment variable in XBD Other Environment Variables. If the login name is null (that is, the tilde-prefix contains only the tilde), the tilde-prefix is replaced by the value of the variable HOME. If HOME is unset, the results are unspecified.

Otherwise, the tilde-prefix shall be replaced by a pathname of the initial working directory associated with the login name obtained using the getpwnam() function as defined in the System Interfaces volume of POSIX.1-2008. If the system does not recognize the login name, the results are undefined.
The pathname resulting from tilde expansion shall be treated as if quoted to prevent it being altered by field splitting and pathname expansion.
Mind you, there are other shell metacharacters they hadn't excluded, but when it comes to security theatre ours is not to wonder why...
Posted May 18, 2016 19:40 UTC (Wed)
by ballombe (subscriber, #9523)
[Link] (15 responses)
Posted May 18, 2016 22:00 UTC (Wed)
by mjg59 (subscriber, #23239)
[Link]
Posted May 19, 2016 11:09 UTC (Thu)
by anselm (subscriber, #2796)
[Link] (12 responses)
Basic compatibility between systems is covered by POSIX, and POSIX makes very restrictive assumptions as far as file names go. As long as you adhere to those (and there isn't really a compelling reason not to, even if you're only targeting Linux), you should be safe.
Posted May 19, 2016 17:40 UTC (Thu)
by micka (subscriber, #38720)
[Link] (1 responses)
Posted May 20, 2016 16:53 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Let us see here now.
Posted May 19, 2016 18:17 UTC (Thu)
by hummassa (guest, #307)
[Link] (9 responses)
Users want to have spaces in their filenames. And accented chars (especially in non-English languages, which need lots of accented vowels). And commas and parentheses. All of those are forbidden by POSIX.
Not only that: even if a certain user (like me) does not like extraordinarily strange chars in his filenames, he will encounter tarballs/zips containing such filenames, or use a mainstream web browser that appends " (N)" to the non-extension part of a filename when a file with the same name already exists in the download directory (if 'file.txt' already exists, it will try to create 'file (1).txt'). Etc, etc.
Posted May 20, 2016 7:35 UTC (Fri)
by jem (subscriber, #24231)
[Link] (8 responses)
Hear, hear! The proposal to enforce stricter limits on file names would severely limit, say, how a user can name documents produced with a word processor. And yet it is not the word processor's fault; the blame lies mostly with an unrelated program: the shell. We need to go to the root of the problem: what we need is a new shell. A shell that can handle file names as a collection of names without the risk of the names being split up just because they contain spaces. A shell where the collection of names is just that, and won't be mixed up with command options. A shell where zero is as good a number as any other, i.e. an empty collection of names does not get special treatment (replaced with "*", for instance).
Another solution would be to invent some new storage for documents which is off limits for the shell.
Posted May 20, 2016 8:06 UTC (Fri)
by NAR (subscriber, #1313)
[Link] (2 responses)
Accented characters are useful in e-mail, web pages, and formatted text files, where the encoding can be specified; filenames lack this information.
Posted May 20, 2016 12:34 UTC (Fri)
by jezuch (subscriber, #52988)
[Link]
Yeah, yeah, yeah. It used to be like that... in the '90s. I mean, I live in a country where a proper encoding is vital to all our ą's and ź's. It's all in the past now, dead and buried. (Except for an occasional system made by occasional clueless uni-lingual Americans ;) )
Posted May 21, 2016 23:23 UTC (Sat)
by mathstuf (subscriber, #69389)
[Link]
Posted May 25, 2016 17:07 UTC (Wed)
by nix (subscriber, #2304)
[Link] (4 responses)
Posted May 25, 2016 18:12 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Jun 1, 2016 20:08 UTC (Wed)
by nix (subscriber, #2304)
[Link] (1 responses)
Posted Jun 1, 2016 22:52 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
Posted May 26, 2016 12:52 UTC (Thu)
by madscientist (subscriber, #16861)
[Link]
Posted May 20, 2016 17:07 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Safename: restricting "dangerous" file names
- user tries to delete a file named -R and encounters surprising results.
if I did B: CHDIR AWESOME then A:\SAUCE, there was no way to tell whether an access to "COM1" was intended to be a DOS 1.0-style access to the serial port, or a DOS 2.0-style access to B:\AWESOME\COM1
Really? The fact that a DOS 1.0 program would use function 0Fh and a DOS 2.0 program would use function 3Dh is not a big enough clue?
Windows maintaining this in the Win16/Win32 API, however, has no excuse.
As a result, we needed an access to a file via 3Dh/3Fh/40h to access the same thing that 0Fh/14h/15h would, because otherwise whether (e.g.) COM2 was the second serial port or a file would depend on whether you were using the old bit of the codebase or the new...
This is the same issue as UTF-8 filenames that encode slashes and zeroes without using either the byte for the forward slash, or the zero: programs that consume the name by decoding it to latin1 or normalized UTF-8 end up with slashes and early terminators in paths they generate.
I don't see how filtering the bytes could fix this, nor how the UTF-8 case could be addressed without decoding UTF-8 within the kernel (either to pop EINVAL, or to cause a program to be unable to refer to a denormal name it knows from having created it).
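A concrete illustration of such an encoding (a sketch; assumes glibc iconv as the strict decoder): the byte pair 0xC0 0xAF is an "overlong" two-byte encoding of '/'. Neither byte is 0x2F, so a filter that only looks for the slash byte passes it, but RFC 3629 requires strict UTF-8 decoders to reject the sequence.

```sh
# 0xC0 0xAF: an overlong encoding of '/'.  No byte equals 0x2F, yet a
# lenient bit-combining decoder would yield a path separator; a strict
# decoder (glibc iconv here) must reject it instead.
if printf '\300\257' | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; then
    echo "accepted"
else
    echo "rejected"
fi
```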
I feel that having the name on disk is not the real problem: the problem occurs when the name is processed or used and triggers a bug.
user tries to delete a file named -R and encounters surprising results.... would be foiled, because that file could not have been created in the first place.
attacker submits a web form. Broken server side script splits name in 2 and executes half of it as a command... [this] problem still exists.
A risk that file name filtering can trigger is seen in a recent Windows trojan, which installs a directory with a reserved name (I think com1: or similar), with the result that the directory cannot be deleted. But this sounds less of a risk if the filtering is done cleanly.
b) regression testing, including "how does it handle weird names" (c.f. the story about a directory on a Bell Labs system that kept acting as a minefield for all kinds of tree-walking code - very valuable thing, that).
c) don't use such names yourself outside of regression testing.
rm -{f,r,v}
and expect it to work.
(Exactly the same number of characters too.) So hopefully there's no need for that.
Some commands require -- for long options, others don't.
Some commands enforce (or at least warn about) the ordering of options (find, for instance), others don't.
Simple fix: Only add ./ if bash knows it is a file
1) Bash knows that the glob is a file, because it had to scan the file system
2) Bash didn't have to scan the file system, so it is safe to omit the './'
cd /home; find * -name naughty.jpg | sed s,/.*,,g | sort -u
$ mkdir -p quarterly-results/201{5,6,7}q{1,2,3,4}
dash$ ls {a,b}
ls: cannot access '{a,b}': No such file or directory
bash$ ls {a,b}
a b
Case 3=2b
Wol
Unix was a mistake
And it's so nice that MS decided to create a Linux emulation layer so that developers could run bash.
> files=()
> while read -r; do files+=("$REPLY"); done < <(list-generator-command)
> rm -f -- "${files[@]}"
Usually, a bad guy should not be able to add filesystems to the system. If he can, you probably have other problems.
permitted_bytes_middle: The permitted set for the characters of the file name that are not the first or last (so file names of one or two characters are not subject to these requirements). By default, the value is 32-126,128-254, which leaves out control characters, delete, and 0xff.
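As an illustration, a hypothetical user-space version of this middle-byte check might look like the following (middle_ok is an invented helper name; Safename's in-kernel code will differ):

```sh
# Hypothetical sketch of the middle-byte check: succeeds unless some
# byte other than the first and last falls outside the default
# permitted set 32-126,128-254.  Not Safename's actual code.
middle_ok() {
    local bytes b
    bytes=$(printf '%s' "$1" | od -An -tu1)
    set -- $bytes            # one positional parameter per byte
    shift                    # skip the first byte
    while [ $# -gt 1 ]; do   # stop before the last byte
        b=$1; shift
        { [ "$b" -ge 32 ] && [ "$b" -le 126 ]; } \
            || { [ "$b" -ge 128 ] && [ "$b" -le 254 ]; } || return 1
    done
}
```

A name of one or two bytes has no middle bytes, so middle_ok accepts it, matching the exemption described above.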
It is not clear to me from the article whether this filter is only applied after testing the "utf8" Boolean or if it is an alternative mechanism. If the filter is applied to non-UTF-8 encodings, it will definitely cause problems. For example, 0xff is the representation of a Cyrillic character in the encoding iso-8859-5. I learned this the hard way when trying to pin down the cause of bug reports of premature file-termination errors on read. And yes, there apparently are Linux users in that part of the world whose native environment is iso-8859-5 rather than UTF-8.
Some downloaded JavaScript from when I saved a web page in Firefox
A bunch of files in format NAME-000-999 where the NAME part is missing.
2 auto-generated files from running trinity
A few files called "-1.jpg" or -1.png
-13-degree-weather-has-brought-chicagos-ohare-airport-to-a-n.jpg
A -.orig file that patch created by accident
13 files that originally came from Windows and start with ~$ like "~$hool Policies Manual.doc"
Wol
Expecting user software to get it right is absolutely necessary. It's not sufficient by itself, but it's by far the most productive strategy available.
that's because of a push for correctness, not mitigations. We're incalculably more secure because of that approach.
Presumably because the shell would expand ~foo to the login directory of user "foo".
Why not other chars?
$ a='~user bad$x'
$ echo $a
~user bad$x
$ eval "echo $a"
/home/user badtest
% a='*'
% echo $a
% echo $a
/home/neilb
% echo "$a"
~neilb
It's already doable:
$ set -f; glob() { set +f; $@; set -f; }
$ ls -d /proc/sys/*
ls: cannot access '/proc/sys/*': No such file or directory
$ glob !!
glob ls -d /proc/sys/*
/proc/sys/abi /proc/sys/dev /proc/sys/kernel /proc/sys/vm
/proc/sys/debug /proc/sys/fs /proc/sys/net
The question is, whether enough people could be convinced to work this way to make a difference.
Wol
=========
2.6.1 Tilde Expansion
=========
Safename: breaking compatibility between systems
No accented character.
No, thank you.
No accented characters?
No, thank you, kind sir.
In my experience using accented characters always creates problems down the line. There's the problem of different encodings, so if the file was created using ISO-8859-2 encoding, but the terminal/application uses Unicode, the characters won't show properly. If the file was created using Unicode, it won't show properly when the application uses Code Page 852. If the file was created using Code Page 852, it won't show properly when the terminal is set to ISO-8859-2. Sometimes the language-specific encoding is not even available (or badly configured), so users get "inventive" and use õ or ô instead of ő and of course it won't show properly in ISO-8859-2. And then there's the problem when the given computer does not provide ways to enter accented characters, so one can't even type the name of that file.