Cook: security things in Linux v5.1

Posted May 29, 2019 21:51 UTC (Wed) by rweikusat2 (subscriber, #117920)
In reply to: Cook: security things in Linux v5.1 by Cyberax
Parent article: Cook: security things in Linux v5.1

What's your problem with this simple fact? Using your pseudocode as example, it's impossible to determine if a pid in the processes list still refers to the process found by listProcesses by the time of the open.

Putting this in another way, processes with the given name which are started after listProcess returned and before the loop terminates will only be killed if they happen to reuse a pid which hasn't already been used in the loop.

Cook: security things in Linux v5.1

Posted May 29, 2019 21:54 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (11 responses)

> What's your problem with this simple fact?
Because you're stupidly wrong?

My code does exactly what the contract of pkill is: killing processes by name.

> Putting this in another way, processes with the given name which are started after listProcess returned and before the loop terminates will only be killed if they happen to reuse a pid which hasn't already been used in the loop.
This is true with ANY variant of pkill. A matching process can be started at the moment pkill exits and it'll be missed.

Yet my code guarantees that any matching processes that existed before the pkill launch will be killed.

Cook: security things in Linux v5.1

Posted May 29, 2019 23:50 UTC (Wed) by ms-tg (subscriber, #89231) [Link] (9 responses)

I’m wondering why the race condition question here is attracting such uncollegial responses (“obsession”, “stupidly”)?

My understanding of the race condition being asked about:
1. listProcesses is called. Imagine it atomically returns a set of (name, pid) tuples alive at the time of the call

2. a process with name: “foo” and pid: 42 is filtered by the lambda from that list

3. This process exits. A new process named “dont_kill_me” is started and received recycled pid 42

4. The pid fd for 42 is opened. It refers to dont_kill_me

5. This new process is killed

I’d be super interested in learning more about:

(a) does the pseudocode as presented actually avoid suffering from this race, and how?

(b) is there something offensive about asking about this?

Thanks!

Cook: security things in Linux v5.1

Posted May 29, 2019 23:57 UTC (Wed) by ms-tg (subscriber, #89231) [Link] (1 responses)

Also: my understanding of why parent process was mentioned by the original poster, is that an assumption is being referred to that only the parent process can racelessly acquire the pid. Any other process would be racing to get the pid, and open the pidfd from it, before it could die and be recycled.

Is that context correct? At least that’s the inference I made about the relevance of a parent process.

Cook: security things in Linux v5.1

Posted May 30, 2019 12:55 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link]

Yes, this is correct.

[verbal redundancies due to LWN blocking 'yes' as an answer ...]

Cook: security things in Linux v5.1

Posted May 30, 2019 0:05 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> I’m wondering why the race condition question here is attracting such uncollegial responses (“obsession”, “stupidly”)?
Because rweikusat2 seems obsessed with insisting that the old Unix API is the only source of truth.

> (a) does the pseudocode as presented actually avoid suffering from this race, and how?
Yes. Imagine that one of the initially matched processes dies and is replaced by another process.

In this case the pid_open() call will return the handle to that incorrect process. However, this is fine because the next check for checkProcess() will fail and nothing is going to be killed.

If this check does succeed then there are two options:
1) The new process also matches the target name and so it's OK to kill it (according to pkill's contract).
2) The new process has actually died in the time between the pid_open and checkProcess, and it's been replaced by a new process that matches the target. In this case the further pidfd_kill() call will be a no-op (since its target is dead), but this is still fine within the pkill's contract.

> (b) is there something offensive about asking about this?
No.

Cook: security things in Linux v5.1

Posted May 30, 2019 9:36 UTC (Thu) by moltonel (guest, #45207) [Link]

> Imagine that one of the initially matched processes dies and is replaced by another process.
>
> In this case the pid_open() call will return the handle to that incorrect process. However, this is fine because the next check for checkProcess() will fail and nothing is going to be killed.
>
> If this check does succeed then there are two options:
> 1) The new process also matches the target name and so it's OK to kill it (according to pkill's contract).
> 2) The new process has actually died in the time between the pid_open and checkProcess, and it's been replaced by a new process that matches the target. In this case the further pidfd_kill() call will be a no-op (since its target is dead), but this is still fine within the pkill's contract.

Might as well not do the initial matching then ? Just go through the unfiltered process list, open /proc/<pid>, match using content of /proc/<pid>/, and kill if it matches ? I imagine performance cost of opening /proc/<pid> is dwarfed by the general cost of process matching ?

The commit message doesn't say clearly whether "the entire procfs directory remains visible or just one entry" when /proc/<pid> is opened (it says "same as happens right now" but I don't know what that is). I'm tempted to understand that the whole directory remains visible (otherwise the patch only barely reduces the RC window), but at the same time the message says that opening /proc/<pid> doesn't reserve the pid, so I wonder.

Cook: security things in Linux v5.1

Posted May 30, 2019 12:07 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link] (4 responses)

There's actually a third option: The program running in the process which matched the search criteria during the second check does an exec and thus, turns into a process which shouldn't have been killed, before the signal is being sent. This causes the contents of a /proc/$pid directory to change in-place, as can be trivially tested by doing an exec of 'something suitable' from a shell.

It's probably possible to prevent this somehow but at least, this wasn't mentioned so far.

Cook: security things in Linux v5.1

Posted May 30, 2019 16:12 UTC (Thu) by FLHerne (guest, #105373) [Link] (3 responses)

I don't see that as an issue - the process should have been killed at the time `pkill` was started; it might as well have been killed moments *before* changing, e.g. with scheduling differences or if `pkill` was a little faster.

Cook: security things in Linux v5.1

Posted May 30, 2019 17:21 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link] (2 responses)

The process should have been killed if the pkill algorithm would work. Which would require freezing the system until the program is done examining the process list. As it cannot do this, it's just another process pkill shouldn't have killed as it didn't match the selection criteria: This utility is inherently broken/ unreliable.

Cook: security things in Linux v5.1

Posted May 30, 2019 22:57 UTC (Thu) by cyphar (subscriber, #110703) [Link] (1 responses)

pkill today might kill a process like this. You're describing a fundamental race condition in the contract of pkill -- "kill all processes with this name" obviously has a probabilistic chance of killing a process that is in the middle of changing its name. I highly doubt that any user of pkill wants the contract that you're claiming they want.

Also, pkill is a toy example. The real users of this syscall are more complicated systems which are sure that the pidfd is correct, and need to store it somewhere so they can detect the process dying properly (and send it signals if its not dead). Container runtimes and init systems are obvious contenders.

Cook: security things in Linux v5.1

Posted May 31, 2019 17:28 UTC (Fri) by rweikusat2 (subscriber, #117920) [Link]

I didn't write anything about "processes changing their name" (that's a prctl operation) or "processes in the middle of something". I wrote about pkill killing a process it shouldn't have killed because the program running in this process changed after pkill had selected it for killing but before sending a signal to it. That's a second TOCTOU race which is indeed -- as I already wrote - inherent in the pkill algorithm[*]. For a reliable pkill, ensuring that the pid is still valid isn't sufficient because the process can change as well.

I also didn't come up with the pkill example, I made a general statement about pidfd open calls in absence of some guarantee that the process the pid originally referred to didn't terminate in the meantime. pkill was presentend as an example where one can claim that the race wouldn't matter in practice for more or less good reasons.

[*] Both the "process terminates and pid gets reused" and the "process starts to execute a different program" races could be avoided without any new system calls by sending a SIGSTOP to a process before checking its attributes, BTW.

Cook: security things in Linux v5.1

Posted May 30, 2019 0:38 UTC (Thu) by rweikusat2 (subscriber, #117920) [Link]

I do not dispute your unrelated statement.