Posted Aug 4, 2008 19:30 UTC (Mon) by jreiser (subscriber, #11027)
> Why can't one close the fd's in the child before exec()?
The problem is knowing/deciding/communicating which fd should be closed, and getting everybody to do the correct close().
> Once fork() occurs, shouldn't no more fd's be added to the child?
Redirecting stdin/stdout/stderr to an explicit path adds a new fd to the child.
File descriptor handling changes in 2.6.27
Posted Aug 5, 2008 1:18 UTC (Tue) by mheily (guest, #27123)
Here is a comment that I posted on udrepper's journal:
This seems like an overly complex solution to the race conditions you describe. It is the responsibility of the application to close all extraneous file descriptors after calling fork(2) but before calling execve(2). Currently, Linux/GNU doesn't offer an easy way to do this. I feel most programs will want to close all file descriptors except for the standard I/O (fd 0, 1, and 2).
Have you considered implementing the closefrom(3) function as found in NetBSD, OpenBSD, and Solaris? It closes every open file descriptor greater than or equal to its argument, so passing 3 leaves only the standard I/O descriptors open. The code to accomplish this would look something like this:
if (fork() == 0) {
    /* child: fork() returns 0 here */
    closefrom(3);
    execve(...);
}
There are a couple of different ways to implement closefrom(3). Within the kernel, you add a flag named F_CLOSEM to the fcntl(2) system call. A few systems (such as AIX and IRIX) do not have closefrom(3), but allow you to call fcntl(2) with F_CLOSEM to achieve the same effect.
Another simpler approach, but not as efficient, is to iterate over /proc/$$/fd and close all of the open file descriptors listed there.
For maximum portability, here is what an application developer might write:
if (fork() == 0) {
    /* Child: close all open file descriptors other than standard I/O */
#if HAVE_FCNTL_F_CLOSEM
    fcntl(3, F_CLOSEM);
#elif HAVE_CLOSEFROM
    closefrom(3);
#else
#warning This platform has no secure close-before-exec support
#endif
    execve(...);
}
--
"Make everything as simple as possible, but not simpler."
- Albert Einstein
Posted Aug 5, 2008 1:42 UTC (Tue) by bojan (subscriber, #14302)
> if (fork() == 0) {
>     closefrom(3);
>     execve(...);
> }
Doing this would mean that you always control the fork()/execve(). But what if the
fork()/execve() is inside a plug-in you don't control (i.e. you can't fix its source)? With
O_CLOEXEC flag, you still get your fd closed. Without it, you don't. No?
Posted Aug 5, 2008 6:07 UTC (Tue) by dark_knight (subscriber, #47846)
Actually, no. While you're still between fork() and execve(), the child is already running,
but still as a copy of the parent (simplifying, same code and data segment), so you can call
whatever function you like. You lose the control of the child process only after the execve()
function, as the image of the process (again, simplifying, the code and data segment) is
substituted by the ones of the new executable file.
Posted Aug 5, 2008 6:35 UTC (Tue) by bojan (subscriber, #14302)
> While you're still between fork() and execve()
This is what I'm trying to say: if this fork() and execve() is not your code, but something
from a proprietary plugin for which you have no source, you cannot control it at all. In other
words, you cannot call any function there, including closefrom(). Hence, marking the fd
O_CLOEXEC on open() makes sure that this fd is closed on execve().
Posted Aug 5, 2008 10:17 UTC (Tue) by dark_knight (subscriber, #47846)
Well, if the fork() is in alien code, and the fd is still open when you enter into the closed
source code, the information leakage potentially occurs even before the fork() ;)
Posted Aug 5, 2008 22:05 UTC (Tue) by bojan (subscriber, #14302)
I don't think anyone's trying to claim that O_CLOEXEC will address every possible security
issue.
Posted Aug 5, 2008 13:33 UTC (Tue) by mheily (guest, #27123)
> This is what I'm trying to say: if this fork() and execve() is not your code, but something
> from a proprietary plugin for which you have no source, you cannot control it at all.
True. But consider the converse of your argument: what if the proprietary plugin has called
open() on several files, but failed to set the O_CLOEXEC flag? When you then go to call fork()
and execve() in your own code, you might accidentally leak file descriptors to the child
without even knowing it.
That is why it would be good to also have a closefrom() system call. This function ensures
that the only descriptors inherited are stdout, stdin, and stderr.
Posted Aug 5, 2008 22:29 UTC (Tue) by bojan (subscriber, #14302)
Yeah, different problems require different solutions.
Posted Aug 9, 2008 23:12 UTC (Sat) by jlokier (guest, #52227)
So true. But as noted, you can read /proc/self/fd/ on Linux to achieve the same thing -
writing your own closefrom(). Or if you're prepared to just call close() a lot, up to the
maximum number of open files in this process, you can do that too.
Posted Mar 2, 2011 19:22 UTC (Wed) by gps (subscriber, #45638)
But you can't do that... When a multi-threaded program has fork()ed but has not yet called exec(), the only functions it can safely call are async-signal-safe ones. opendir() and readdir() are not on that list. Linux is left with no way to close all open file descriptors other than calling close() on all _possible_ file descriptors (a real slow pain in the ass when your max fd limit is in the millions) in between the fork() and exec() to deal with ones that may not have been opened with CLOEXEC.
Posted Aug 5, 2008 7:50 UTC (Tue) by tialaramex (subscriber, #21167)
So, if I've opened fd 16 and I'm about to run a sub-process which will use this new descriptor
(perhaps I thought to pass '16' as an argument to exec), then I should...
1) Use dup to copy fd 3 somewhere else, and ensure all of my code can cope with this, perhaps
by entirely replacing file descriptors and anything that uses them (FILE * etc.) with my own
private indirection...
2) Close the now unused fd 3 and replace it with a copy of fd 16 using dup2, then close that
too, incurring all the above problems again
3) Call closefrom(4)
Are you sure this is simpler than fixing the design by adding close-on-exec as a potential
property of all descriptors from birth ?
Your approach clearly solves only a fraction of the problem (it doesn't consider fork + exec
by sub-routines you didn't write, e.g. in libraries, which is the most serious problem CLOEXEC
fixes), yet it incurs most of the same costs as the fix that's already been chosen and pushed
into 2.6.27.
That's not to say that closefrom() isn't an interesting API, and one which might be welcome in
Linux, but just that it doesn't actually appear to be a simpler solution, just an incomplete
one.
Posted Aug 5, 2008 13:48 UTC (Tue) by mheily (guest, #27123)
> Your approach clearly solves only a fraction of the problem
If by a "fraction" you mean 9/10ths of the problem... The only fraction that Uli's solution
fixes that cannot be fixed using closefrom() is the case involving interaction with
proprietary plugins. I agree with an earlier poster; once you run a proprietary binary
program, you can never be sure that your data is safe. The possibility of leaking file
descriptors across an execve() call is the least of your worries.
The typical use case is a program calling fork() and then execve() to run an external program
that only needs access to stdin/stdout/stderr. Linux/GNU developers should optimize for making
it easy to close all other file descriptors. Instead, they have optimized for running
proprietary binary plugins. Yuck.
Posted Aug 5, 2008 19:49 UTC (Tue) by nix (subscriber, #2304)
closefrom() would only work if
1) it was a system call and thus could enforce atomicity
2) glibc took out a lock also taken by open(), dup(), et al, which means
yet more locking around those functions, harming performance
If you're adding a new system call anyway, why not adjust things so that
the *already existing* close-on-exec flag works properly, rather than
adding more band-aids atop the system to compensate for the unreliability
of the existing flag?
Posted Aug 6, 2008 21:58 UTC (Wed) by quotemstr (subscriber, #45331)
No!
Why would closefrom() need to be atomic? There is no race. After fork(), the new process only
has a *single* thread running.
Any race that required closefrom() to be atomic would also be a problem between the
closefrom() and the execve().
So no, closefrom does not need to be atomic.
Posted Aug 6, 2008 22:37 UTC (Wed) by nix (subscriber, #2304)
Yeah, sorry, missed that. Still *everyone* who forks off a child needs to
do it.
Posted Mar 2, 2011 19:25 UTC (Wed) by gps (subscriber, #45638)
But closefrom() does need to be async-signal-safe so that it can safely be called after a fork().
Posted Aug 5, 2008 19:53 UTC (Tue) by strcmp (guest, #46006)
as tialaramex said, it is not only std* you want to inherit. you could want to inherit the
endpoints of a pipe, or some network sockets (plural...), but you may want to have the child's
output on the same tty for debugging purposes. as you don't always have control over the
order of fd-s (some gui library might open sockets depending on what the user clicked) you
will have to renumber painfully instead of just passing /dev/fd/1234 on the command line. and
even then your order of fd-s is just an implicit CLOEXEC flag, and as this flag already
existed it was simpler to just close the last holes than to implement a new, but more
primitive and more burdensome solution and still maintain the flag for backwards
compatibility.
Posted Aug 6, 2008 9:16 UTC (Wed) by tialaramex (subscriber, #21167)
I never mentioned proprietary plugins. We have these things called utility libraries. They use
fork+exec to run helper binaries among other reasons.
The utility library is not a trust boundary. If you think it is, then you've already screwed
up. However the resulting exec() is a trust boundary, the kernel provides for the executed
binary to receive different security privileges to the calling process.
If you think about it a little it's obvious that closefrom() isn't the appropriate interface
on its own because it requires you to export all the information about close-on-exec rules
into some arbitrary global structure and then have all utility libraries co-operate to use and
update that structure, whereas CLOEXEC pushes the relevant /security critical/ information
into each individual file descriptor. Sure enough the other systems you're talking about all
have CLOEXEC for exactly this reason. They just haven't fixed it yet and Linux has.
Posted Aug 6, 2008 13:20 UTC (Wed) by mheily (guest, #27123)
The addition of O_CLOEXEC to the open(2) system call is a good idea, and is part of the
solution to the problem of "secure file descriptor handling" as udrepper calls it. I didn't
mean to imply that the O_CLOEXEC flag is useless, or that closefrom(3) is the be-all-end-all
solution to the problem. They are both tools that programmers need to make software more
secure. My point is that closefrom() is a simpler solution and is applicable to the majority of
the cases where fd leakage is a concern.
A utility library is the perfect example of where closefrom() is needed. Since a library runs
in the same process context as the program that it is linked against, it inherits all open
file descriptors after calling fork(2). These descriptors have meaning to the overall program,
but are meaningless to the library. The library should, on principle, close all unneeded
descriptors prior to calling execve(), but it cannot guarantee that the O_CLOEXEC flag has
been enabled on all of the descriptors. As an extra precaution, and a simple way of making
*certain* that there is no leakage, the library can call closefrom() to exclude an entire
range of descriptors.
Posted Aug 5, 2008 9:00 UTC (Tue) by danpb (subscriber, #4831)
Surely if you wanted maximum portability you'd just use the POSIX standard
sysconf(_SC_OPEN_MAX) rather than a non-standard OS specific API, and iterate closing all FDs
& ignoring EBADF for those which aren't open. Obviously not as efficient as closefrom(), but
achieves the same end goal without having to #ifdef for OS specific APIs.
int open_max = sysconf(_SC_OPEN_MAX);
for (int i = 3; i < open_max; i++)
    close(i);
In any case, neither this nor closefrom() really addresses the use case Uli was targeting. The
core issue is you can't guarantee that everywhere in your program will take care to close()
FDs between fork & exec. Or the more complex scenario where you /want/ to let certain FDs
propagate to the child - the logic of which FDs to propagate to be kept at the place where the
FD is created - likely a totally different piece of code to that which does the fork/exec. Use
of O_CLOEXEC gives the maximum flexibility and control over FD handling.
Posted Aug 6, 2008 22:06 UTC (Wed) by quotemstr (subscriber, #45331)
> the logic of which FDs to propagate to be kept at the place where the
> FD is created - likely a totally different piece of code to that which does the fork/exec
The code that calls fork()/exec() ought to be what decides what file descriptors the child inherits. Since that code is also responsible for closing file descriptors, there is no problem.
Conversely, code in some random library should not rely on its open file descriptors being inherited by child processes created by a completely different part of the code. That's spooky action at a distance.
Programs using fork()/exec() should always take care to close unneeded file descriptors. It's good programming. While numerous techniques to do that have been described in this thread, a closefrom() library call would go a long way toward making sure programs actually did what they were supposed to.
Remember: generally, the only way to make people do the Right Thing is to make the Right Thing the Easy Thing.
Posted Aug 12, 2008 14:05 UTC (Tue) by endecotp (guest, #36428)
Be aware that _SC_OPEN_MAX could be a very large number. For example, I've seen the Apache
module portion of the subversion server get very slow because it does this. I would
definitely advocate first trying the /proc/self/fd method, and only falling back to _SC_OPEN_MAX
if there's no alternative.
Posted Aug 5, 2008 19:37 UTC (Tue) by njs (guest, #40338)
> Another simpler approach, but not as efficient, is to iterate over /proc/$$/fd and close all
> of the open file descriptors listed there.
I believe that this is a standard approach, and indeed one that udrepper advocates in other
cases. (Arguably a single-syscall approach would be cleaner, to avoid the dependency on
having /proc mounted at a well-known location, but it's hard for me to imagine that efficiency
is really an issue here -- it's not like reading /proc will hit the disk, so the overhead is
just a few extra syscall entries.) Certainly this is useful functionality to have.
But you seem to be arguing that -- since we have this other useful functionality -- close-on-exec
becomes a useless feature that would be better to ignore than to fix, while I would tend to
think that working close-on-exec and efficient closefrom are both valuable. It would be
easier to evaluate your argument if you addressed this point directly. The nominal benefit of
close-on-exec is that it allows locality of control -- the code that creates the fd is (often)
the code that is best prepared to know whether it should be kept local to the process or not.
If you don't have close-on-exec, then working out *which* descriptors should remain open and
which should be closed requires long-distance coupling between the fork/exec code and all
code which creates file descriptors. Do you disagree?
Posted Aug 5, 2008 21:45 UTC (Tue) by zlynx (subscriber, #2285)
Performance *does* matter. I was working on speeding up gnome-terminal start. First I used
my own version of readahead (renamed to readlock) to mlock all required files into RAM. Then
it was still slower than xterm so I began stracing everything and I discovered that
gnome-terminal calls something like gnome-pty-helper, and that it does fork, then close 3-4096
(4096 was my max fd number), then pty-helper did it *again*.
I removed the close loop from pty-helper and also cut my max fds down to 256 and it was
noticeably faster to start.
At any rate, using /proc and readdir to close only open fds is probably much faster than
blindly closing fds 3-256, let alone 3-4096.
Posted Aug 6, 2008 22:13 UTC (Wed) by quotemstr (subscriber, #45331)
> the nominal benefit of
> close-on-exec is that it allows locality of control -- the code that creates the fd is (often)
> the code that is best prepared to know whether it should be kept local to the process or not.
I disagree with the locality of control argument. In a well written program, any code which creates a file descriptor to be inherited across an exec boundary ought to be intimately tied to that exec: consider shell pipeline setup. A piece of code unrelated to that exec (say, X11, or the DNS resolver) should not expect its file descriptors to propagate across an exec.
However, not all libraries will hygienically mark their internal file descriptors as close-on-exec. So, in a well-written program:
1) Code unrelated to an exec SHOULD mark internal file descriptors with O_CLOEXEC.
2) fork/exec code MUST close all extraneous file descriptors, as not every library will obey rule #1.
Posted Aug 7, 2008 6:48 UTC (Thu) by njs (guest, #40338)
So it sounds like you're arguing that close-on-exec should be the default -- and if backwards
compatibility forbids it *actually* being default, then we should write code in such a way
that it becomes the default. I tend to agree. The original argument that I was responding
to, though, was suggesting that it didn't much matter if close-on-exec were broken, which
seems like the opposite of your point...
Posted Aug 15, 2008 17:27 UTC (Fri) by sethml (subscriber, #8471)
How about a call similar to closefrom(), but which takes a list of fds not to close, and closes
all fds but those in the list? This avoids the brain-dead assumptions about fd ordering which
closefrom() makes, but makes it easy to leave just a select few fds open for the child. As the
parent comment points out, any code which relies on leaving fds other than stdout/stdin/stderr
open for the child probably knows exactly which fds the child will need.
Posted Mar 2, 2011 20:53 UTC (Wed) by nybble41 (subscriber, #55106)
There are perfectly legitimate situations in which the code which calls exec() may not know which file descriptors need to be open. For example, let's say you have a shell script which takes a filename parameter and passes it to some other executable for processing. You run it as follows: "./my_script <(some_cmdline)". This causes the (bash) shell to create a pipe file descriptor, say FD 3, and pass it as "/dev/fd/3" to the script. In order for this to work, the script *must* preserve FD 3 when calling exec() for the lower-level executable so that the child's open() call can access the original descriptor. However, without knowing how the script will be called there is no way to know that FD 3 will even be open, much less that the executable will need access to it after the exec() call.
This can come up not only in shell scripts, but in any case where you might pass a filename received on the command-line to a child process. I would say that the current Linux model of marking file descriptors as "current process only" or "inheritable" in the open() call is the correct one, apart from the choice of default. Once an FD has been designated for use by child processes it should remain open by default across fork()/exec() calls, unless there is a compelling reason to close it. (I would, however, be in favor of a safe and simple way to explicitly close all but a designated set of descriptors without performing a close() syscall for potentially millions of possible FDs.)
Posted Mar 2, 2011 21:12 UTC (Wed) by foom (subscriber, #14868)
That's not true: bash knows (or could easily keep track of) which fds the shell script it's running has requested be opened, and thus which it should pass to future executed programs. (since the semantics of the shell scripting language are that all the opened fds get passed to all programs you run).
That doesn't imply that all *other* non-shell-script-requested execs that get called from bash (e.g. execing a program from a NSS plugin) should also pass those same FDs! The right place really would've been for the list to be specified in exec. But...it's too late for that.
Posted Mar 2, 2011 23:18 UTC (Wed) by nybble41 (subscriber, #55106)
The top-level interactive instance of bash knows, but the script *doesn't*, at least not without parsing the filename it was given. (I assume you agree that it would normally be a bad idea for programs to assign meaning to specific filename patterns?)
The first exec() is not the problem; as you say, bash knows that it opened a certain FD to pass to the script and would avoid closing it. The issue arises when the script tries to pass the /dev/fd/N filename it received to some other command. If the script closes all the file descriptors apart from stdin/stdout/stderr and any others *it* knows about--which would not include the FD opened by its parent process--the child process will either receive an error, or even duplicate an unrelated FD, when attempting to open the original path.
Keep in mind that this is a simple case; there could be any number of levels of fork()/exec() between that interactive session and the actual user(s) of the /dev/fd/N path; only the first is likely to be aware of the need to preserve the associated file descriptor.
I agree that there are cases (such as your NSS helper example) where it makes sense to close most or all file descriptors between fork() and exec(). However, at the very least, any time you pass on a filename received directly or indirectly from a parent process you should also pass on any file descriptors which were open when your process was started; anything less risks breaking the ability to use <(...) or >(...) from the shell in place of a regular file (among other uses).
Posted Mar 2, 2011 23:40 UTC (Wed) by foom (subscriber, #14868)
Ah, indeed. I had forgotten about that evil little non-portable hack. Well, if you instead use the temporary fifo implementation of <() (which bash already supports), you won't have that problem. And since we're talking in hypotheticals here (it's not like exec is actually going to change), I declare that a perfectly acceptable solution to the issue.
Posted Aug 9, 2008 23:43 UTC (Sat) by jlokier (guest, #52227)
> the code that creates the fd is (often)
> the code that is best prepared to know whether it should be kept local to the process or not.
I would modify that in multi-threaded programs. Code is best prepared to know whether its descriptors should be kept local to the process or passed to child processes it creates itself. Other threads, which may run unrelated code also doing fork+exec at the same time, should not pass the same descriptors to their child processes.
Any code (say in a utility library that you don't control yourself) that does fork+exec, may create a pipe or something to pass to its child process. It knows the descriptor should not be local to the process.
Trouble is, another thread can be doing something completely unrelated. So to be safe, all code including utility libraries must use O_CLOEXEC (or equivalent) for every descriptor they create, and later clear FD_CLOEXEC with fcntl(F_SETFD) inside the child after fork() to turn off close-on-exec, the opposite of what's normally done.
The other alternative is to have a global lock around all calls which create file descriptors and fork(). That's fine in code you control, and completely portable. But you can't expect all utility libraries to cooperate. Even gethostbyname() won't cooperate.
Another alternative is to close all possible file descriptors after fork() except those being inherited. But that's slow, sometimes very slow, and you still need all utility libraries which use fork() to do that themselves.
It's ugly however you look at it. Utility libraries are unlikely to do the right thing for a long time, if ever. You certainly can't trust them to do the right thing unless they explicitly document that they do, or unless you know for sure they don't create file descriptors.
(Btw, what I do in my "big server" app is a combination of the above: close unknown descriptors, but keep track of calls into utility libraries, assume a limit on the number of descriptors they each open at a time, and using knowledge of the POSIX first-free-number rule, close that many total descriptors that my app doesn't know about explicitly, so it's not too slow, and use O_CLOEXEC or lazy FD_CLOEXEC to manage descriptors that the app does know about explicitly. It's ugly, but wraps into a tidy enough API and scales well.)