Hiding a process's executable from itself
The 2019 incident, which came to be known as CVE-2019-5736, involved a sequence of steps culminating in the overwriting of the runc container-runtime binary from within a container. That binary should not even have been visible within the container, much less writable, but such obstacles look like challenges to a determined attacker. In this case, the attacker was able to gain access to this binary via /proc/self/exe, which always refers to the binary executable for the current process.
Specifically, the attack opens the runc process's /proc/self/exe file, creating a read-only file descriptor — inside the container — for the target binary, which lives outside that container. Once runc exits, the attacker is able to reopen that file descriptor for write access; that descriptor can subsequently be used to overwrite the runc binary. Since runc runs with privilege outside of the container, this becomes a compromise of the host as a whole; see the above-linked article for details.
This vulnerability was closed by having runc copy its binary image into a memfd area and sealing it; control is then passed to that image before entering the container. Sealing prevents modifying the image, but even if that protection fails, the container is running from an independent copy of the binary that will never be used again, so overwriting it is no longer useful. It is a bit of an elaborate workaround, but it plugged the hole at the time.
Scrivano is proposing a different solution to the problem: simply disable access to /proc/self/exe as a way of blocking image-overwrite attacks of this type. Specifically, his patch adds a new prctl() command called PR_SET_HIDE_SELF_EXE that can be used to disable access to /proc/self/exe. Once this option has been enabled, any attempt to open that file from within the owning process will fail with an ENOENT error — as if the file simply did not exist at all. Enabling this behavior is a one-way operation; once it has been turned on, it cannot be disabled until the next execve() call, which resets the option to its disabled state.
This behavior is necessarily opt-in; any program that wants to have its executable image protected from access in this way will have to request it specifically. The intent, though, is that this simple call will be able to replace the more complicated workarounds that are needed to prevent this sort of attack today. A prctl() is a small price to pay if it eliminates the need to create a new copy of the executable image every time a new container is launched.
This new option is thus a simple way of blocking this specific attack, but it leads to some related questions. Hiding the container-runtime binary seems like a less satisfying solution than ensuring that this binary cannot be modified regardless of whether it is visible within a container. It seems to close off one specific path to a compromise without addressing the underlying problem.
More to the point, perhaps, is the question of just how many operations the kernel developers would like to add to prevent access to specific resources that might possibly be misused. There is, conceivably, no end to the files (under /proc and beyond) that might be useful to an attacker who is determined to take over the system. Adding a new prctl() option — and the necessary kernel code to implement it — for every such file could lead to a mess in the long term. There comes a point where it might make more sense to use a security module to implement this sort of policy.
If the development community feels that way, though, it's not saying so — or much of anything else. The patch set has been posted three times and has not received any substantive comments on any of those occasions. There will, presumably, need to be an eventual discussion that decides whether this type of mechanism is the best way of protecting systems against attacks using /proc/self/exe. For the moment, though, it would appear that this change is simply waiting for the wider community to take notice of it.
| Index entries for this article | |
|---|---|
| Kernel | /proc |
| Kernel | Security/Security technologies |
Posted Jan 23, 2023 17:07 UTC (Mon)
by TheGopher (subscriber, #59256)
[Link] (26 responses)
Posted Jan 23, 2023 17:09 UTC (Mon)
by dskoll (subscriber, #1630)
[Link] (2 responses)
Lots of programs assume that /proc is mounted; not mounting it would cause all kinds of annoying failures like the inability to usefully run ps inside a container, for example.
Posted Jan 23, 2023 18:10 UTC (Mon)
by TheGopher (subscriber, #59256)
[Link] (1 responses)
Posted Jan 24, 2023 8:46 UTC (Tue)
by LtWorf (subscriber, #124958)
[Link]
Posted Jan 23, 2023 17:21 UTC (Mon)
by nickodell (subscriber, #125165)
[Link] (1 responses)
Posted Jan 23, 2023 21:10 UTC (Mon)
by jccleaver (guest, #127418)
[Link]
I think a bigger point is that we should probably start swinging back to running more lightweight VMs and fewer containers.
Posted Jan 23, 2023 17:24 UTC (Mon)
by ejr (subscriber, #51652)
[Link] (17 responses)
Posted Jan 24, 2023 1:32 UTC (Tue)
by ringerc (subscriber, #3071)
[Link] (15 responses)
ps is uselessly container-ignorant at the moment. It has no support for filtering by pid namespace, for example. It does not know how to use the NSpid fields in /proc/$pid/status to show a process's pid in another pid namespace, or to display a process identified by a pidns plus a pid in that namespace. It can't even be told to display a tree of processes starting at a specific parent pid; you have to use the otherwise more limited pstree command for that.
And that's just ps. Tools like gdb are completely incapable of functioning usefully across container boundaries and rely on having a gdbserver injected into the target container. Which in turn requires access to /proc, a session with CAP_SYS_PTRACE, etc.
It's a nightmare. Container runtime tools are utterly primitive and provide no assistance whatsoever.
As far as I can tell, believers in os-less containers don't actually think interactive debugging is relevant. I guess you're supposed to use printf() debugging, tracing/APM, and psychic powers.
Posted Jan 24, 2023 8:33 UTC (Tue)
by patrakov (subscriber, #97174)
[Link] (14 responses)
Posted Jan 24, 2023 10:38 UTC (Tue)
by tpo (subscriber, #25713)
[Link] (11 responses)
Objection. It is relevant. I have to do it all the time.
Posted Jan 24, 2023 15:11 UTC (Tue)
by khim (subscriber, #9252)
[Link] (10 responses)
That just means that you don't have a production environment, then. I recommend you get a separate environment to run production in.
Posted Jan 25, 2023 5:20 UTC (Wed)
by ringerc (subscriber, #3071)
[Link] (9 responses)
Sometimes you just have to debug what's in prod, because that's where the user/customer is hitting an issue.
You definitely don't want to have to re-deploy a bunch of new container versions with printf-style debugging hacked in, different compile flags with debuginfo, etc as you try to understand the problem.
Prod should be debuggable. Run services with minimal privileges, in bare-bones containers, yes. But have a way to fetch the debuginfo for compiled executables when required, inject an elevated process with a debugger, etc, so you can debug-in-place if and when required.
Posted Jan 25, 2023 12:00 UTC (Wed)
by khim (subscriber, #9252)
[Link] (8 responses)
If your prod is debuggable then it just means your company is not large enough to have a prod. Oh, sure. There are lots of stories like these. But they are not a reason to give someone access to prod with customers' data. Precisely because they come and go, they are annoying (and sometimes take years to find and fix), but they are not enough to permit some random developer access to customers' data. Not only do you have to do that, but you have to ensure that all your temporary logging changes are properly vetted and approved by a security team. Or else regulators would eat you alive. As I have said: if you are allowed to do that, then you don't have a prod but a testing environment exposed to customers. While valuable in many cases, this requires their explicit assent, and it's not a prod. You can afford to do things differently in a testing environment.
Posted Jan 25, 2023 12:22 UTC (Wed)
by tpo (subscriber, #25713)
[Link] (4 responses)
Posted Jan 25, 2023 17:28 UTC (Wed)
by khim (subscriber, #9252)
[Link] (3 responses)
Nope. It's not about context. It's about definition. If you can ssh to your “production” system and do something there, then it's an “advanced testing” environment or something else, but that's not a “production environment”. Most startups don't have production and that's normal. But that doesn't mean you need gdb access in production. You don't. It's not production if you need it.
Posted Jan 25, 2023 18:05 UTC (Wed)
by tpo (subscriber, #25713)
[Link] (2 responses)
Posted Jan 25, 2023 18:09 UTC (Wed)
by tpo (subscriber, #25713)
[Link] (1 responses)
Posted Jan 25, 2023 18:54 UTC (Wed)
by ejr (subscriber, #51652)
[Link]
Posted Jan 25, 2023 16:14 UTC (Wed)
by paulj (subscriber, #341)
[Link] (2 responses)
If you think it is possible to have a full replica in test of your prod, then it just means you haven't worked at the largest tech companies.
Posted Jan 25, 2023 17:35 UTC (Wed)
by khim (subscriber, #9252)
[Link] (1 responses)
What gave you that illusion? Yes, you cannot test everything before deployment, but that just means that your logging and telemetry become important. Especially if you deploy not just to your servers but to millions of devices around the world. But this still doesn't change the fact that production is where you cannot run gdb. It's just the definition of “production”. It deals with sensitive customer data. That's why gdb is not welcome there.
Posted Jan 25, 2023 17:46 UTC (Wed)
by paulj (subscriber, #341)
[Link]
It's a bit nonsensical to think that banning gdb from prod will protect customer data from the people who have access to either the running instance or the code.
What does happen is that you can only access instances of the software that you are responsible for. Everything is as compartmentalised as possible. (Though, this still has limits, given there are groups of people responsible for system-level software on various classes of the fleet).
Posted Jan 24, 2023 20:47 UTC (Tue)
by geofft (subscriber, #59789)
[Link]
I don't dispute that there are some use cases in some industries where specific regulations in particular jurisdictions may prohibit this, but it's certainly not universal (and of course there are plenty of use cases that legitimately count as production that aren't in regulated industries).
Posted Jan 27, 2023 15:00 UTC (Fri)
by jwarnica (subscriber, #27492)
[Link]
If you need to do interactive debugging in, er, some highly constrained environment, then you have two bugs: one is that your system isn't verbose enough and doesn't have good enough logging and metrics, and the other is whatever the original bug is.
Posted Jan 25, 2023 18:13 UTC (Wed)
by ejr (subscriber, #51652)
[Link]
There are environments where having *any* access to the client's / outside data is forbidden.
There are environments where the deploy-ers can have access to anything to keep the deployed working.
And there are those at various levels between. The number of possibilities, each quite correct for its environment, argues to me against any OS taking "sides." I'm not sure the levels of access (MLS, etc.) are sufficiently well-defined for a general OS kernel to mediate that access.
Whether or not the kernel can be configured to an outside model unfortunately is not separate. Unless maybe to the extremes? And let another level / ring dictate access? I'm just a silly library / application developer.
Posted Jan 24, 2023 9:41 UTC (Tue)
by smcv (subscriber, #53363)
[Link]
Even if you don't want the ability to run tools like ps inside the container, increasingly much lower-level library functionality relies on having /proc mounted, mainly for /proc/self/fd, which is used to emulate fd-relative I/O (for example fexecve(3) was unimplementable without /proc mounted until execveat(2) was added to Linux), and is still necessary even on the latest kernels if you want to pass a fd-relative path to a subprocess that expects a filename as a command-line option.
Posted Jan 24, 2023 12:08 UTC (Tue)
by rcampos (subscriber, #59737)
[Link]
Posted Jan 24, 2023 12:12 UTC (Tue)
by judas_iscariote (guest, #47386)
[Link]
Posted Jan 23, 2023 17:53 UTC (Mon)
by jepler (subscriber, #105975)
[Link] (1 responses)
Posted Jan 23, 2023 18:50 UTC (Mon)
by gscrivano (subscriber, #74830)
[Link]
static const char *
proc_map_files_get_link(struct dentry *dentry,
			struct inode *inode,
			struct delayed_call *done)
{
	if (!checkpoint_restore_ns_capable(&init_user_ns))
		return ERR_PTR(-EPERM);
	return proc_pid_get_link(dentry, inode, done);
}
where checkpoint_restore_ns_capable is defined as:
static inline bool checkpoint_restore_ns_capable(struct user_namespace *ns)
{
	return ns_capable(ns, CAP_CHECKPOINT_RESTORE) ||
	       ns_capable(ns, CAP_SYS_ADMIN);
}
So you must either have CAP_SYS_ADMIN in the initial user namespace, or have CAP_CHECKPOINT_RESTORE in the user namespace.
Posted Jan 23, 2023 18:30 UTC (Mon)
by nomeata (subscriber, #16315)
[Link] (3 responses)
“Once runc exits, the attacker is able to reopen that file descriptor for write access”
Isn't that the fundamental problem? Why is it ok to turn a read-only file descriptor into a read-write file descriptor?
Posted Jan 23, 2023 18:53 UTC (Mon)
by floppus (guest, #137245)
[Link] (2 responses)
As the linked LWN article says, this is an issue specifically for "containers that run with access to the host root user ID (i.e. UID 0), which, sadly, covers most of the containers being run today." That's the real problem.
Posted Jan 24, 2023 12:11 UTC (Tue)
by rcampos (subscriber, #59737)
[Link] (1 responses)
But this is taking some years, and probably some more to be enabled by default. Furthermore, it will not completely eliminate the need to run _some_ things as root on the host.
Therefore, it is still very useful.
Posted Jan 31, 2023 5:31 UTC (Tue)
by donald.buczek (subscriber, #112892)
[Link]
Posted Jan 23, 2023 19:10 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (5 responses)
In principle, couldn't the kernel simply allow processes to call unlink(2) on proc files (which would make those files vanish from *that process's* view of procfs, but would not affect any other process's view of procfs)? This would presumably require some bookkeeping on the part of the kernel, but I can't imagine it would be all that onerous.
Posted Jan 24, 2023 12:13 UTC (Tue)
by rcampos (subscriber, #59737)
[Link] (4 responses)
Posted Jan 24, 2023 16:07 UTC (Tue)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
I dunno, maybe the parent has to resolve the symlink, for symmetry with the "real filesystem" case?
Posted Jan 24, 2023 19:20 UTC (Tue)
by rcampos (subscriber, #59737)
[Link] (2 responses)
Posted Jan 24, 2023 19:56 UTC (Tue)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Posted Jan 25, 2023 0:27 UTC (Wed)
by rcampos (subscriber, #59737)
[Link]
This is how the CVE is exploited: you run your command so that it opens /proc/self/exe, and you get that pointing to the runc binary.
See this blog post for an example exploit of the CVE: https://kinvolk.io/blog/2019/02/runc-breakout-vulnerabili...
IMHO what seems buggy is that you can open /proc/self/exe and that is the runc binary. I don't know now if that is something we can avoid, though. Probably not, for some reason I don't remember now.
Posted Jan 23, 2023 20:41 UTC (Mon)
by developer122 (guest, #152928)
[Link] (7 responses)
Posted Jan 24, 2023 5:37 UTC (Tue)
by developer122 (guest, #152928)
[Link] (6 responses)
Posted Jan 24, 2023 10:25 UTC (Tue)
by matthias (subscriber, #94967)
[Link] (4 responses)
Usually it is opt-in to have any files inside the container, as the filesystem namespace for the container is explicitly created. The problem is that procfs includes the executables, via /proc/pid/exe, even if they should not be accessible from within the container. The solution is this prctl that hides the executables again. Not mounting /proc could be another solution. Unfortunately, many programs rely on /proc being available.
And the executable is not outside of the container. As soon as /proc is mounted, it is inside the container, in much the same way as any executable that is accessible through bind mounts. The executable should be outside of the container, but without this new prctl there was only an ugly workaround to achieve this when /proc has to be mounted.
Posted Jan 25, 2023 2:26 UTC (Wed)
by developer122 (guest, #152928)
[Link] (3 responses)
That executable was effectively linked into the container when runc was commanded to launch itself, with /proc/self/exe effectively acting like a convoluted symlink. The runc outside was told to run /proc/self/exe, which pointed to its own executable (presumably in the host's /bin), and upon launching it, the /proc/self/exe of the runc instance inside the container continued to point to the same location, all the way back to the runc binary on the host.
If runc will only launch executables that are already inside the container, no such reference can be brought inside.
Posted Jan 25, 2023 6:10 UTC (Wed)
by matthias (subscriber, #94967)
[Link] (2 responses)
And it is quite difficult to test whether someone is tricking you into executing /proc/self/exe. It is not just symlinks. Think of an executable starting with #!/proc/self/exe. The problem is that /proc/self/exe is there in the first place, not that runc can be tricked into executing it. This is just one exploit. There can be others.
Also running docker exec would probably always have a race window, where runc is available inside the container. A malicious process inside the container can just wait for any docker exec to happen.
Posted Jan 25, 2023 22:00 UTC (Wed)
by developer122 (guest, #152928)
[Link] (1 responses)
Fair enough, but the vulnerable executable is only visible to runc.
> It is not just symlinks. Think of an executable starting with #!/proc/self/exe.
Executing it in that way would almost certainly refer to the shell binary inside the container.
Posted Jan 25, 2023 22:37 UTC (Wed)
by matthias (subscriber, #94967)
[Link]
> Fair enough, but the vulnerable executable is only visible to runc.
Yes, indeed. runc sets the non-dumpable attribute which should prevent other processes from playing with /proc/pid/exe. However, this gets reset when runc exec()s itself. Therefore the self-exec in the exploits.
>> It is not just symlinks. Think of an executable starting with #!/proc/self/exe.
> Executing it in that way would almost certainly refer to the shell binary inside the container.
As a container does not need to have a shell, runc executes commands directly. And if runc does exec() an executable with the shebang inside, then the proc entry refers to the runc binary.
The key point is that all changes to runc to avoid being tricked into execing itself are just treating the symptoms. The underlying problem is the runc executable being part of the container, and this should just be avoided.
Posted Jan 31, 2023 0:32 UTC (Tue)
by vinipsmaker (guest, #126735)
[Link]
Posted Jan 24, 2023 0:15 UTC (Tue)
by josh (subscriber, #17465)
[Link] (4 responses)
Posted Jan 24, 2023 0:38 UTC (Tue)
by walters (subscriber, #7396)
[Link] (3 responses)
Posted Jan 24, 2023 1:54 UTC (Tue)
by josh (subscriber, #17465)
[Link] (2 responses)
Posted Jan 24, 2023 12:28 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (1 responses)
Posted Jan 24, 2023 15:31 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Obviously the solution is to not write/release exploitable software in the first place, but we (as a species) seem incapable of doing that. Or at least unwilling to pay the price it would take to do that (longer launch runways, better code review, etc.).
Posted Jan 24, 2023 6:02 UTC (Tue)
by rambolized (guest, #160860)
[Link] (2 responses)
Posted Jan 24, 2023 10:16 UTC (Tue)
by matthias (subscriber, #94967)
[Link] (1 responses)
Posted Jan 24, 2023 12:44 UTC (Tue)
by rambolized (guest, #160860)
[Link]
Posted Jan 30, 2023 22:47 UTC (Mon)
by vinipsmaker (guest, #126735)
[Link]
Lately I've been developing my own sandboxing solution making use of Linux namespaces (for those interested: <https://emilua.gitlab.io/docs/api/0.4/tutorial/linux_name...>). However the use case is not containers. The use case is compartmentalised application development. Long story short, one should be able to make use of Linux namespaces to spawn actors (Lua VMs in my project) inside isolated processes (the goal is to bring a model closer to Capsicum and actor systems to Linux).
In this scenario, the cost of creating a new actor is cheap:
1. fork()'ing a process that was created near main() with very few allocations and fds open.
2. Initial setup for the new namespaces (e.g. mount() calls).
Containers will usually have extra steps (that I skip in my project):
3. Mount an image for a whole Linux mini-distro.
4. exec() into some binary.
However, the lack of an exec() call at the end means that I must be careful not to leak resources from the host, as exec() is the only call that flushes the address space. /proc/self/exe is one of the things I must be careful about.
The beauty of PR_SET_HIDE_SELF_EXE is that it will protect not only containers that exec() at the end; it can also be used in projects such as the one I'm developing, where no exec() call at the end ever happens. I really hope this patch gets merged.
It seems like a game of whack-a-mole to me.
Hide /proc?
The middle-ground is really annoying for anyone who has to live-debug something that was built by an OS-oblivious Dev somewhere who felt that no one would ever have to, say, run a ping command to validate network paths and was super-excited to show how they just saved 28K.
O_BENEATH
Good interface