wait() on a PID that is not your child

Posted Oct 8, 2011 0:27 UTC (Sat) by Richard_J_Neill (subscriber, #23093)
Parent article: A Plumber's Wish List for Linux

There are lots of uses for this! It's usually possible to work around, but sometimes not. A contrived example: a user wants kdialog to tell him as soon as apache quits; at the moment, it's necessary to poll.


wait() on a PID that is not your child

Posted Oct 8, 2011 1:33 UTC (Sat) by HelloWorld (guest, #56129) [Link] (10 responses)

That is inherently racy. In order to do that, you first need to find out the relevant process's PID, and then start waiting for it. But by the time you start waiting, the process may already have exited and a new process may have been given its PID, so you may end up waiting for a different process from the one you intended. To fix this, larger PIDs would be necessary, so that every new process gets a PID that was never used before.

wait() on a PID that is not your child

Posted Oct 8, 2011 3:10 UTC (Sat) by neilbrown (subscriber, #359) [Link] (9 responses)

I would suggest that the best way to do this is not to use 'wait' at all but to support 'poll' on some file in /proc/$pid.
I would suggest /proc/$pid/status. It wouldn't be hard to get a poll on this file to report POLLERR when the process dies or changes state.
For extra points you could add an 'exit status' field that only appears after the process has exited - not sure if that is necessary though.

wait() on a PID that is not your child

Posted Oct 8, 2011 11:08 UTC (Sat) by HelloWorld (guest, #56129) [Link] (8 responses)

That doesn't fix the problem I pointed out.

wait() on a PID that is not your child

Posted Oct 9, 2011 3:10 UTC (Sun) by neilbrown (subscriber, #359) [Link] (2 responses)

You are correct, it doesn't. But it could.

I was thinking that /proc/$PID was somehow linked to the actual process, so that when the process died, that directory would become empty and stay empty. However, it isn't.
/proc/$PID is linked to $PID, so if a new process appeared with the same PID, its details would appear in the same directory.
That is, if you "cd /proc/$PID" and then "kill -9 $PID", the directory will appear empty (or give an error on readdir), but if another process is given $PID, "ls ." will start showing things again.

However, this could easily be "fixed", for example by using a generation number similar to that used by NFS. Each new process gets a random generation number, and when you open /proc/$PID that number is copied into the inode that is created. Accesses to a process through that inode then always check that the generation number matches as well as the PID. About a dozen lines of code.
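The generation-number idea can be illustrated with a toy user-space model. This is only a sketch of the check described above, not kernel code: `ToyProcessTable`, `open_handle`, and `is_same_process` are hypothetical names invented for the illustration, and a plain tuple stands in for the inode.

```python
import secrets


class ToyProcessTable:
    """User-space stand-in for the kernel's PID table (illustration only)."""

    def __init__(self):
        # pid -> generation number of the process currently holding that pid
        self._generation = {}

    def spawn(self, pid):
        """A new process takes `pid` and gets a fresh random generation."""
        generation = secrets.randbits(64)
        self._generation[pid] = generation
        return generation

    def exit(self, pid):
        """The process holding `pid` dies; the pid becomes reusable."""
        self._generation.pop(pid, None)

    def open_handle(self, pid):
        """Like opening /proc/<pid>: copy the current generation into the handle."""
        return (pid, self._generation[pid])

    def is_same_process(self, handle):
        """Valid only if both the pid and the generation still match."""
        pid, generation = handle
        return self._generation.get(pid) == generation
```

With this model, a handle taken before the process exits stays invalid even after the same PID is handed out again, which is the property the comment is after.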

With that in place, your race would be trivial to avoid. Just "chdir" to the /proc/$PID directory, check again that this is the process that you are interested in, then open "status" and 'poll' for POLLERR.

wait() on a PID that is not your child

Posted Oct 9, 2011 7:25 UTC (Sun) by ebiederm (subscriber, #35028) [Link] (1 responses)

/proc/$PID/ is linked to the process and it very much becomes empty when a process dies.

If a new process gets the same pid a different directory is created.

The tricky bit is actually the way process death updates are implemented internally. The data structures are backwards and need to be turned around so that poll on the file descriptor can be implemented.

There is still the race from the top of this thread when changing into the directory, but PID reuse is typically slow enough that the race should be hard to hit.

wait() on a PID that is not your child

Posted Oct 9, 2011 10:54 UTC (Sun) by neilbrown (subscriber, #359) [Link]

hmm... I must have missed an important piece in the code. I just tested and the old empty directory definitely stays empty as you said. Thanks for the correction.

I don't think the race at the top is real. Whatever mechanism was used to determine which pid to wait for can be repeated after the "cd /proc/$pid" to see if it is still the same. If it is the same, then it is perfectly safe to wait for files to disappear (if/when there is a mechanism to do that).
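The sequence described here (pin the directory, repeat the identity check, then wait) might look roughly like the sketch below. It makes several assumptions that are not in the thread: the "mechanism used to determine which pid to wait for" is reduced to comparing the command name in `comm`, the waiting is done by re-reading `stat` in a sleep loop because no poll support is assumed, and a zombie (state `Z`) is treated as already exited. The function names are hypothetical.

```python
import os
import time


def read_comm(dir_fd):
    """Return the command name seen through the pinned directory, or None."""
    try:
        fd = os.open("comm", os.O_RDONLY, dir_fd=dir_fd)
    except OSError:
        return None
    try:
        return os.read(fd, 256).decode(errors="replace").strip()
    except OSError:
        return None
    finally:
        os.close(fd)


def is_gone(dir_fd):
    """True once the pinned process has exited (or is an unreaped zombie)."""
    try:
        fd = os.open("stat", os.O_RDONLY, dir_fd=dir_fd)
    except OSError:
        return True  # the old directory has become empty
    try:
        data = os.read(fd, 4096).decode(errors="replace")
    except OSError:
        return True  # the process died between open and read
    finally:
        os.close(fd)
    # The state letter is the first field after the parenthesised command name.
    fields = data.rsplit(")", 1)[-1].split()
    return not fields or fields[0] in ("Z", "X")


def wait_for_exit(pid, expected_comm, interval=0.05, timeout=None):
    """Pin /proc/<pid>, re-check identity, then wait for the process to go away.

    Returns True when the process has exited, False if `timeout` elapsed first.
    """
    # Equivalent of "cd /proc/$pid": the fd keeps referring to this process.
    dir_fd = os.open(f"/proc/{pid}", os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Repeat the identity check *after* pinning the directory.
        if read_comm(dir_fd) != expected_comm:
            raise LookupError(f"pid {pid} is not {expected_comm!r}")
        deadline = None if timeout is None else time.monotonic() + timeout
        while not is_gone(dir_fd):
            if deadline is not None and time.monotonic() >= deadline:
                return False
            time.sleep(interval)
        return True
    finally:
        os.close(dir_fd)
```

Because every later read goes through the directory descriptor rather than through the numeric PID, a new process that reuses the PID is never mistaken for the original one; the cost is that this version still polls.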

wait() on a PID that is not your child

Posted Oct 10, 2011 17:21 UTC (Mon) by mezcalero (subscriber, #45103) [Link] (4 responses)

I wonder if this could be fixed by actually having 64-bit PIDs. It's not that easy to make things overrun 2^64.

wait() on a PID that is not your child

Posted Oct 10, 2011 19:01 UTC (Mon) by HelloWorld (guest, #56129) [Link] (2 responses)

64-bit PIDs ought to be enough for anybody. If the system creates 10000 processes per second, the PIDs would overflow after 2^64/(60*60*24*365*10000) = 58494241 years.
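The arithmetic is easy to reproduce. The snippet below is a small check of the figure above using the same 365-day year and integer division; the comparison against a 32768-entry PID space is an added assumption (a commonly cited default for `/proc/sys/kernel/pid_max`), not something stated in the thread.

```python
SECONDS_PER_YEAR = 60 * 60 * 24 * 365


def years_until_wrap(pid_bits, spawns_per_second):
    """Whole years before a PID space of 2**pid_bits values is exhausted."""
    return 2 ** pid_bits // (SECONDS_PER_YEAR * spawns_per_second)


def seconds_until_wrap(pid_space, spawns_per_second):
    """Seconds before a PID space of `pid_space` values is exhausted."""
    return pid_space / spawns_per_second


print(years_until_wrap(64, 10_000))       # 58494241
print(seconds_until_wrap(32768, 10_000))  # 3.2768 (assumed small pid_max)
```

At the same spawn rate, the assumed small PID space wraps in a few seconds, which is why reuse is a practical concern today and would not be with 64-bit PIDs.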

wait() on a PID that is not your child

Posted Oct 12, 2011 22:15 UTC (Wed) by Baylink (guest, #755) [Link] (1 responses)

> should be enough for anybody.

You didn't *really* expect to get away with that, here, did you?

:-)

wait() on a PID that is not your child

Posted Oct 12, 2011 22:49 UTC (Wed) by HelloWorld (guest, #56129) [Link]

It was worth a try.

wait() on a PID that is not your child

Posted Oct 11, 2011 1:13 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

They would look quite ugly, though.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds