Race-free process creation in the GNU C Library
Posted Sep 1, 2023 22:29 UTC (Fri) by alkbyby (subscriber, #61687)
Parent article: Race-free process creation in the GNU C Library
Perhaps it would help if someone could elaborate on exactly which valid or semi-valid uses are currently racy (or the article could be updated).
I.e. classic posix_spawn and wait should just work. Don't wait{,pid} for your child until you've grabbed its pidfd and you have no race.
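As an illustration of the "classic" pattern described above, here is a minimal sketch in Python, whose os module wraps the same posix_spawn() and waitpid() calls. The function name spawn_and_wait is made up for this example and does not come from the thread or from glibc.

```python
import os


def spawn_and_wait(argv):
    """Spawn argv[0] with posix_spawn() and reap exactly that child."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    # Until we reap it, even a dead child keeps its pid reserved as a zombie,
    # so waiting on this specific pid cannot hit an unrelated process.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # negative value: killed by a signal
```

Calling spawn_and_wait(["/bin/sh", "-c", "exit 7"]) should hand back the child's exit status, assuming nothing else in the process reaps the child first.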
I can see only one special case: if the parent ignores SIGCHLD, then the child's exit status is collected automatically, so wait{,pid} won't see it. There is no zombie stage and there is no pid to find. And then, indeed, we could use the pidfd bits, including this new API, to handle this case, which would otherwise be racy. I am not sure how much demand there is for this case, since it "breaks" wait{,pid} anyway.
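The SIGCHLD special case can be observed directly. The following is a rough sketch in Python, assuming Linux semantics for an ignored SIGCHLD; the helper name is hypothetical, and signal.signal() has to be called from the main thread.

```python
import os
import signal


def status_lost_when_sigchld_ignored(argv):
    """Return True if waitpid() fails with ECHILD while SIGCHLD is ignored."""
    old = signal.signal(signal.SIGCHLD, signal.SIG_IGN)  # main thread only
    try:
        pid = os.posix_spawn(argv[0], argv, os.environ)
        try:
            # Blocks until the child is gone, then fails: there is no zombie
            # left to collect a status from.
            os.waitpid(pid, 0)
        except ChildProcessError:  # errno ECHILD
            return True
        return False
    finally:
        # Restore whatever disposition was in place before.
        signal.signal(signal.SIGCHLD, old if old is not None else signal.SIG_DFL)
```

In this situation the pid is free for reuse as soon as the child exits, which is the window a pidfd obtained at spawn time would close.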
Or am I missing anything?
Posted Sep 1, 2023 23:28 UTC (Fri) by bluca (subscriber, #118303)
Posted Sep 2, 2023 0:26 UTC (Sat) by alkbyby (subscriber, #61687)
But my point is that as long as we're able to guarantee that the child's pid is not reused, there is no race if/when the parent calls whatever set_xyz on the child's pid (it may find the child dead, but it'll never confuse this child with another process). And the classic zombie mechanism gives us exactly that: the child's pid won't get reused until the parent collects the child's status.
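The zombie argument can be checked with a small experiment. This is only a sketch, in Python, with a made-up function name; it assumes a platform that provides waitid() with WNOWAIT (Linux does).

```python
import os


def zombie_pins_pid(argv):
    """Show that an exited but unreaped child still owns its pid."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    # Wait for the exit but leave the child unreaped (WNOWAIT): a zombie.
    os.waitid(os.P_PID, pid, os.WEXITED | os.WNOWAIT)
    try:
        os.kill(pid, 0)  # signal 0 only checks that the pid exists
        still_ours = True
    except ProcessLookupError:
        still_ours = False
    # Only after this call may the kernel recycle the pid.
    _, status = os.waitpid(pid, 0)
    return still_ours, os.WEXITSTATUS(status)
```

The pid stays valid between the child's death and the final waitpid(), so a pid-based call made in that window can fail or find a dead process, but cannot land on a stranger.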
P.S. Also, I was under the impression that a lot/most of those "many things" (setsid, unshare, etc.) are typically what the child does for itself (after clone with CLONE_VFORK but before exec, for which posix_spawn has numerous attributes).
Posted Sep 2, 2023 10:17 UTC (Sat) by bluca (subscriber, #118303)
Posted Sep 2, 2023 0:39 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
First, it's not possible to handle SIGCHLD meaningfully in many environments (e.g. in a lot of scripting languages). Second, even with SIGCHLD handlers, you have to walk a tightrope to have truly race-free code. You can only wait() on processes in exactly one thread (likely in the main event loop), which has to execute exclusively with respect to any other code that might operate on processes. So the only thing your handler can do safely is to kick the event loop to perform a waitid()/waitpid() check.
And forget about multithreading and composability. It's simply impossible to write fully correct multithreaded process management code.
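For readers unfamiliar with the "handler only kicks the event loop" pattern mentioned above, a minimal self-pipe sketch follows. It is written in Python, all names are invented for the example, and it assumes it runs in the main thread of a process that owns all of its children.

```python
import os
import select
import signal


def run_children(commands):
    """Spawn every argv in `commands`; return {pid: exit_status}."""
    rfd, wfd = os.pipe()
    os.set_blocking(rfd, False)
    os.set_blocking(wfd, False)

    def on_sigchld(signum, frame):
        # Do nothing here except wake the loop; a full pipe is harmless.
        try:
            os.write(wfd, b"\0")
        except BlockingIOError:
            pass

    old = signal.signal(signal.SIGCHLD, on_sigchld)  # main thread only
    try:
        pending = {os.posix_spawn(c[0], c, os.environ) for c in commands}
        results = {}
        while pending:
            # The timeout is only a safety net for this sketch.
            select.select([rfd], [], [], 1.0)
            try:
                os.read(rfd, 4096)  # drain accumulated wake-ups
            except BlockingIOError:
                pass
            # All reaping happens here, in the loop, for known pids only.
            for pid in list(pending):
                reaped, status = os.waitpid(pid, os.WNOHANG)
                if reaped == pid:
                    pending.discard(pid)
                    if os.WIFEXITED(status):
                        results[pid] = os.WEXITSTATUS(status)
                    else:
                        results[pid] = -os.WTERMSIG(status)
        return results
    finally:
        signal.signal(signal.SIGCHLD, old if old is not None else signal.SIG_DFL)
        os.close(rfd)
        os.close(wfd)
```

The design relies on a single owner of the SIGCHLD disposition, which is exactly the composability problem being pointed out: two libraries cannot both install this handler.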
Posted Sep 2, 2023 1:41 UTC (Sat) by alkbyby (subscriber, #61687)
We might be misunderstanding each other somehow, but what you said is untrue. A thread can easily posix_spawn a sub-process and waitpid for it, even from inside a library. Yes, if the process does a blanket wait() in some other thread it won't work, but that seems like a borked design to me. (Is that one of the use cases quoted by the article? Are there non-trivial programs or libraries doing such a thing?)
There are definitely libraries that spawn sub-processes. E.g. I recently learned that TensorFlow does so to compile some hw accelerator code.
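A sketch of the per-thread pattern being described, in Python with hypothetical names: each thread spawns its own child and waits only for that pid. It assumes no other code in the process performs a blanket wait().

```python
import os
import threading


def spawn_in_threads(commands):
    """Run each argv in its own thread; return exit statuses in order."""
    results = [None] * len(commands)

    def worker(index, argv):
        pid = os.posix_spawn(argv[0], argv, os.environ)
        # Waiting on our own pid never touches another thread's child.
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status):
            results[index] = os.WEXITSTATUS(status)
        else:
            results[index] = -os.WTERMSIG(status)

    threads = [
        threading.Thread(target=worker, args=(i, argv))
        for i, argv in enumerate(commands)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```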
Posted Sep 2, 2023 2:15 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
However, _any_ other wait()/waitid() in the process can reap it; waits are not thread-scoped. So you can't have anybody in the process calling them. And if you ONLY do waitpid() calls, it might even be composable.
Except... you do have to call wait() periodically to avoid zombies, because your spawned process can die and reparent its children into your process.
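The "stolen child" hazard can be reproduced in a few lines. This is a sketch in Python with an invented function name; it assumes the process has no other children at the time, so the blanket wait() can only pick up the one spawned here.

```python
import os


def blanket_wait_steals(argv):
    """Show a blanket wait() taking a status its owner then cannot get."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    # "Somebody else" in the process reaps whichever child exits first.
    stolen_pid, _ = os.wait()
    try:
        os.waitpid(pid, 0)  # the rightful owner arrives too late
        owner_got_status = True
    except ChildProcessError:  # errno ECHILD: already reaped
        owner_got_status = False
    return stolen_pid == pid, owner_got_status
```

Once the blanket wait() has returned, the pid is free for reuse, so any later pid-based call by the owner is where a race could appear.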
Posted Sep 3, 2023 9:50 UTC (Sun) by roc (subscriber, #30627)
Posted Sep 4, 2023 3:32 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
Posted Sep 2, 2023 2:55 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
Then they are quite likely unsafe, though in practice they would work fine in the vast majority of cases because typical race windows are pretty narrow. You really need malicious input and/or users to exploit that.
Posted Sep 2, 2023 4:07 UTC (Sat) by alkbyby (subscriber, #61687)
Well, your comment above about reparenting is only right for pid 1. And I am not sure how much software there is that "steals" other modules'/libraries' dead kids. My impression is that there shouldn't be much.
I quickly inspected libuv for sub-process spawning; they don't steal. Same for glib. They also do the right thing (even using pidfd when available, since a pidfd can be nicely polled).
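To show what "a pidfd can be nicely polled" means in practice, here is a hedged sketch in Python. It does not reproduce libuv's or glib's code; the function name is made up, and it assumes Linux 5.3+ and Python 3.9+ for os.pidfd_open(), falling back to a plain blocking waitpid() when that call is unavailable.

```python
import os
import select


def wait_via_pidfd(argv, timeout_ms=5000):
    """Spawn a child, poll its pidfd for exit, then reap it by pid."""
    pid = os.posix_spawn(argv[0], argv, os.environ)
    # Safe to open by pid here: the child cannot be reaped by the kernel
    # behind our back while we have not waited for it (SIGCHLD not ignored).
    try:
        fd = os.pidfd_open(pid)
    except (AttributeError, OSError):
        fd = None  # assumption: old kernel or old Python, skip the poll step
    if fd is not None:
        try:
            poller = select.poll()
            poller.register(fd, select.POLLIN)  # readable once the child exits
            poller.poll(timeout_ms)
        finally:
            os.close(fd)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

The pidfd fits into an ordinary poll/epoll loop next to sockets and timers, which avoids the process-wide SIGCHLD handler altogether.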
With all that, I am still curious what the use cases might be that people are trying to fix with the proposed pidfd_spawn API. So far we've established it could be:
a) when a process breaks wait{,pid} by ignoring SIGCHLD
b) when a process has things that steal dead kids
But perhaps there are more. And I am curious how common those "bad" cases might be.