LWN: Comments on "Completing the pidfd API" https://lwn.net/Articles/794707/ This is a special feed containing comments posted to the individual LWN article titled "Completing the pidfd API". en-us Wed, 15 Oct 2025 18:14:26 +0000 Wed, 15 Oct 2025 18:14:26 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Completing the pidfd API https://lwn.net/Articles/994535/ https://lwn.net/Articles/994535/ jengelh <div class="FormattedComment"> <span class="QuotedText">&gt;one can write "pthread_detach(pthread_self())" to detach yourself, whereas presumably detaching yourself via a pidfd you got from "yourself" would be a no-op</span><br> <p> In glibc-nptl, pthread_self and pthread_detach are functions that involve just userspace. There is not going to be a deadlock/deadlock-avoiding-noop as you envisioned.<br> </div> Thu, 17 Oct 2024 00:34:45 +0000 Completing the pidfd API https://lwn.net/Articles/943711/ https://lwn.net/Articles/943711/ bartoc <div class="FormattedComment"> Does this get rid of some shared data-structure in the kernel that maps PIDs to the actual resources of the process? I ask because I noticed that with pthreads one can write "pthread_detach(pthread_self())" to detach yourself, whereas presumably detaching yourself via a pidfd you got from "yourself" would be a no-op as any resources would be held open by the other pidfd opened by your parent thread. I noticed this when implementing pthreads/C11 threads on Windows, where threads are represented by HANDLES that are refcounted and work similarly to files. 
On Windows I came to the conclusion that allowing threads to detach themselves would require a shared data-structure holding the mapping of PIDs to HANDLES.<br> </div> Wed, 06 Sep 2023 16:59:41 +0000 Completing the pidfd API https://lwn.net/Articles/934703/ https://lwn.net/Articles/934703/ Cyberax <div class="FormattedComment"> Version 1 and 2 UUIDs include system time, so they can't collide unless the kernel is compromised.<br> <p> But more practically, your system with time-based IDs is just ugly, just like UUIDs.<br> <p> And pidreserve() won't help against targeted wraparound attacks. <br> </div> Thu, 15 Jun 2023 01:42:35 +0000 Completing the pidfd API https://lwn.net/Articles/934693/ https://lwn.net/Articles/934693/ jredfox_ <div class="FormattedComment"> UUIDS MS creation time is way over 700! Also, UUID collisions can and have occurred. currentMS states the current ms at which the creation time was fetched, and the creation time of the process never changes. UUIDs are problematic. <br> </div> Wed, 14 Jun 2023 23:23:04 +0000 Completing the pidfd API https://lwn.net/Articles/934568/ https://lwn.net/Articles/934568/ Cyberax <div class="FormattedComment"> <span class="QuotedText">&gt; you should have in the arguments PID-CREATIONTIME where creation time is MS preferably. I don't like JIFFIES or UNIX seconds it's not precise enough.</span><br> <p> Why not UUIDs then?<br> <p> And pidreserve doesn't prevent all race attacks.<br> </div> Wed, 14 Jun 2023 05:07:56 +0000 Completing the pidfd API https://lwn.net/Articles/934565/ https://lwn.net/Articles/934565/ jredfox_ <div class="FormattedComment"> This is a quick fix for an invalid solution. The proper solution is the PID solution. You should have in the arguments PID-CREATIONTIME, where creation time is MS preferably. I don't like JIFFIES or UNIX seconds; they're not precise enough.<br> <p> Or, as an even better solution, create a call named reservePID(unsigned long PID). 
This will reserve the PID until the process that called it exits. For security reasons it should limit the number of reserves a process can use to about 200 PIDs for IPC (unrelated, non-child processes) and allow an unlimited number for child processes. <br> </div> Wed, 14 Jun 2023 02:44:42 +0000 Completing the pidfd API https://lwn.net/Articles/813851/ https://lwn.net/Articles/813851/ re:fi.64 <div class="FormattedComment"> Whoops, minor amendment: ECHILD, not ESRCH<br> </div> Thu, 05 Mar 2020 03:50:05 +0000 Completing the pidfd API https://lwn.net/Articles/813850/ https://lwn.net/Articles/813850/ re:fi.64 <div class="FormattedComment"> <font class="QuotedText">&gt; Beyond the ability to unambiguously specify which process should be waited for, this change will eventually enable another interesting feature: it will make it possible to wait for a process that is not a child — something that waitid() cannot do now. </font><br> <p> I think this is false; at least when I tried it, waitid always returned ESRCH.<br> </div> Thu, 05 Mar 2020 03:46:06 +0000 Completing the pidfd API https://lwn.net/Articles/801597/ https://lwn.net/Articles/801597/ rvk <div class="FormattedComment"> Would be nice to get this working with process events connector.<br> </div> Tue, 08 Oct 2019 02:45:37 +0000 Completing the pidfd API https://lwn.net/Articles/797463/ https://lwn.net/Articles/797463/ nix <div class="FormattedComment"> The fact that it's modelled on waitid() suggests not. 
waitid() throws away some of the ptrace()-necessary info waitpid() packs into its return value, so you can't use it if you're doing ptrace monitoring (though this is nowhere documented that I can see: you have to reverse-engineer it from the code and from the fact that the ptrace documentation never once mentions waitid).<br> </div> Tue, 27 Aug 2019 19:17:49 +0000 Completing the pidfd API https://lwn.net/Articles/795155/ https://lwn.net/Articles/795155/ flussence <div class="FormattedComment"> We should probably replace CAP_SYS_ADMIN programs (e.g. ffmpeg kmsgrab without running explicitly as root) with IPC first. setuid is less subversive, as at least it's visible in ls. <br> </div> Fri, 02 Aug 2019 09:14:57 +0000 Completing the pidfd API https://lwn.net/Articles/795047/ https://lwn.net/Articles/795047/ mezcalero <div class="FormattedComment"> I think the lesson of this is probably not to introduce any new setuid programs anymore, and do privilege elevation only by IPC.<br> </div> Thu, 01 Aug 2019 07:16:46 +0000 Completing the pidfd API https://lwn.net/Articles/794874/ https://lwn.net/Articles/794874/ cyphar <div class="FormattedComment"> We need to be very careful about adding read()/write() support to control-related fds -- because you can always spawn a setuid program with a different set of stdio fds and potentially trick it into reading/writing something that was not intended to the control fd (and if the permission checks aren't done at open() time then you have just created a security bug).<br> </div> Tue, 30 Jul 2019 09:21:03 +0000 Completing the pidfd API https://lwn.net/Articles/794817/ https://lwn.net/Articles/794817/ wahern <div class="FormattedComment"> FreeBSD has had pdfork for almost 8 years (9.0 released Jan 2012): <a href="https://www.freebsd.org/cgi/man.cgi?query=pdfork&amp;sektion=2">https://www.freebsd.org/cgi/man.cgi?query=pdfork&amp;sekt...</a><br> <p> The real dilemma after this is how to acquire process fds when children fork. 
The BSD kqueue framework has permitted tracking forks and exits of descendants since almost the beginning[1], though there's still no mechanism to acquire a process fd for them.<br> <p> I mention this because there's no grand theory for a better process model, unless you count Capsicum from whence pdfork came. But in the Capsicum security model forking is normally disabled in descendants. Arguably one of the reasons it's taken Linux so long to get a process fd is precisely because of all the open ended questions about where to go next, which while unanswered have the effect of casting doubt on the utility of process fds, notwithstanding that most people agree that in the abstract they're a great idea.<br> <p> [1] Sometime between 1999, when kqueue was originally merged, and 2003, the earliest hit I got with a naive Google search.<br> <p> </div> Mon, 29 Jul 2019 10:19:22 +0000 Completing the pidfd API https://lwn.net/Articles/794812/ https://lwn.net/Articles/794812/ naptastic <div class="FormattedComment"> I'm really excited to see this work happening, even though most of my work is far removed from the kernel.<br> <p> I think BSD saw the (valid, real) problems with /proc and took the wrong lesson, where Linux is now converging on something smarter: providing an even more UNIXy interface ("a process is now also a file") to the process space. I'm looking forward to using this functionality, even if only indirectly.<br> </div> Sun, 28 Jul 2019 22:34:47 +0000 Completing the pidfd API https://lwn.net/Articles/794780/ https://lwn.net/Articles/794780/ clugstj <div class="FormattedComment"> Oh, I see now. Just poll() for whatever processes/sockets you want. When the poll() returns saying the process has exited, use pidfd_wait() to get the result.<br> </div> Sun, 28 Jul 2019 01:02:20 +0000 Completing the pidfd API https://lwn.net/Articles/794779/ https://lwn.net/Articles/794779/ quotemstr <div class="FormattedComment"> This is not a useful comment. 
<br> </div> Sun, 28 Jul 2019 00:53:59 +0000 Completing the pidfd API https://lwn.net/Articles/794778/ https://lwn.net/Articles/794778/ doublez13 <div class="FormattedComment"> Thank you! :)<br> </div> Sun, 28 Jul 2019 00:47:52 +0000 Completing the pidfd API https://lwn.net/Articles/794771/ https://lwn.net/Articles/794771/ brauner <div class="FormattedComment"> Not sure about the actual LMKD work but the backports for the kernels at least do exist:<br> <a href="https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.9+backport%22">https://android-review.googlesource.com/q/topic:%22pidfd+...</a><br> <a href="https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.14+backport%22">https://android-review.googlesource.com/q/topic:%22pidfd+...</a><br> <a href="https://android-review.googlesource.com/q/topic:%22pidfd+polling+support+4.19+backport%22">https://android-review.googlesource.com/q/topic:%22pidfd+...</a><br> </div> Sat, 27 Jul 2019 20:17:01 +0000 Completing the pidfd API https://lwn.net/Articles/794770/ https://lwn.net/Articles/794770/ doublez13 <div class="FormattedComment"> Can we get a link to the patches/RFCs for the Android work mentioned? Thanks.<br> </div> Sat, 27 Jul 2019 20:07:14 +0000 Completing the pidfd API https://lwn.net/Articles/794750/ https://lwn.net/Articles/794750/ ale2018 <div class="FormattedComment"> Since it is an fd, it would seem natural to expect to be able to read or write to it. Reading a bit when the process exits is not quite managing, say, a pipe. A pipe?! Hm... stdpid?<br> <p> How do I know if the process is busy crunching, sleeping, or waiting for input?<br> <p> Just fooling...<br> </div> Sat, 27 Jul 2019 11:47:06 +0000 Completing the pidfd API https://lwn.net/Articles/794741/ https://lwn.net/Articles/794741/ quotemstr <div class="FormattedComment"> Spelling differences. 
Doesn't change the model.<br> </div> Sat, 27 Jul 2019 03:28:50 +0000 Completing the pidfd API https://lwn.net/Articles/794739/ https://lwn.net/Articles/794739/ Cyberax <div class="FormattedComment"> Well, yes. But we're getting a waitid() flag instead.<br> </div> Sat, 27 Jul 2019 03:19:49 +0000 Completing the pidfd API https://lwn.net/Articles/794738/ https://lwn.net/Articles/794738/ quotemstr <div class="FormattedComment"> Sure, but doesn't pidfd_wait serve the role of read?<br> </div> Sat, 27 Jul 2019 03:15:52 +0000 Completing the pidfd API https://lwn.net/Articles/794737/ https://lwn.net/Articles/794737/ Cyberax <div class="FormattedComment"> But not read() afterwards.<br> </div> Sat, 27 Jul 2019 03:01:31 +0000 Completing the pidfd API https://lwn.net/Articles/794736/ https://lwn.net/Articles/794736/ quotemstr <div class="FormattedComment"> poll on pidfds already works<br> </div> Sat, 27 Jul 2019 02:45:47 +0000 Completing the pidfd API https://lwn.net/Articles/794734/ https://lwn.net/Articles/794734/ roc <div class="FormattedComment"> The discussion does not mention how this interacts with ptrace. rr could potentially benefit from the ability to hand ptrace control of a traced task from one ptracer process to another. I guess even if some pidfd-based API let other processes read wait statuses, those other processes still wouldn't be able to execute ptrace() commands because they're not the (sole) ptracer of the traced tasks.<br> <p> Another question is whether this new API follows the ptrace/waitpid behavior, i.e. each ptraced thread of a process reports exit independently and is independently reaped. 
I really want that to be true, because that would give us a sane and reliable way to wait for some specific subset of all traced threads to exit, which is currently impossible.<br> </div> Fri, 26 Jul 2019 22:53:57 +0000 Completing the pidfd API https://lwn.net/Articles/794733/ https://lwn.net/Articles/794733/ roc <div class="FormattedComment"> Yes, that seems like an obvious thing to want.<br> </div> Fri, 26 Jul 2019 22:42:56 +0000 Completing the pidfd API https://lwn.net/Articles/794729/ https://lwn.net/Articles/794729/ clugstj <div class="FormattedComment"> I think it would be better to extend poll() to allow it to receive the exit information of the process (maybe read the exit info from the pidfd). That way, a thread could wait for process termination and socket activity at the same time.<br> </div> Fri, 26 Jul 2019 21:10:50 +0000