O_MAYEXEC — explicitly opening files for execution

By Jonathan Corbet
May 11, 2020

Normally, when a kernel developer shows up with a proposed option that doesn't do anything, a skeptical response can be expected. But there are exceptions. Mickaël Salaün is proposing the addition of a new flag (O_MAYEXEC) for the openat2() system call that, by default, will change nothing. But it does open a path toward tighter security in some situations.

Executing a file on a Unix-like system requires that said file have an applicable execute-permission bit set. The file must also not reside on a filesystem that has been mounted with the noexec option. These checks can prevent the execution of unwanted code on a tightly controlled system, but there is a significant hole in this protection: interpreters that will happily read and execute code found in a file. If a file contains Perl code, for example, it cannot be executed by typing its name if it fails either of the above two tests. If an attacker is able to pass that file as an argument to the perl command, though, its contents will still be executed.
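The hole described above is easy to reproduce. What follows is a minimal sketch, assuming a Linux-like system and using Python as the interpreter in place of Perl; the file name `payload.py` and the function name are made up for the illustration.

```python
import os
import stat
import subprocess
import sys
import tempfile


def demo_interpreter_bypass():
    """Show that a file without execute permission still runs via an interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "payload.py")  # hypothetical file name
        with open(script, "w") as f:
            f.write('print("executed anyway")\n')
        os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)  # mode 0600: no execute bit

        # Direct execution is refused by the kernel.
        try:
            subprocess.run([script], check=True)
            direct_ok = True
        except PermissionError:
            direct_ok = False

        # Handing the same file to an interpreter works, because the
        # interpreter only needs read access to it.
        result = subprocess.run([sys.executable, script],
                                capture_output=True, text=True, check=True)
        return direct_ok, result.stdout.strip()
```

The first attempt fails on the missing execute bit, while the second succeeds because, from the kernel's point of view, the interpreter is simply reading a file.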

The new O_MAYEXEC flag is a way for language interpreters (or other programs, such as dynamic linkers, that execute code) to indicate to the kernel that a file is being opened with the intent of executing its contents. This flag is totally ignored by open() which, because it never checked for invalid flags, is difficult to extend in general. The newer openat2() system call, instead, does fail when unknown flags are passed to it; it has been extended to recognize O_MAYEXEC. But, by default, nothing will change if that flag is present.

The patch set also adds a new sysctl knob called fs.open_mayexec_enforce that can bring about a change in behavior. Its default value (zero) naturally preserves current behavior so that nobody's system is broken by mistake. If, instead, bit 0 is set, an openat2() call with O_MAYEXEC will fail if the filesystem holding the target file was mounted with the noexec option. Bit 1 will cause such an open to fail if the file lacks execute permission. Setting both bits will thus cause O_MAYEXEC opens to fail in the situations where a direct attempt at execution would also fail.
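The two bits can be summarized as a small decision function. This is only a user-space model of the behavior described above, not kernel code; the names `ENFORCE_MOUNT`, `ENFORCE_FILE`, `Target`, and `mayexec_open_allowed` are invented for this sketch.

```python
from dataclasses import dataclass

# Hypothetical names for the two bits of fs.open_mayexec_enforce.
ENFORCE_MOUNT = 1 << 0  # bit 0: refuse files on noexec mounts
ENFORCE_FILE = 1 << 1   # bit 1: refuse files lacking execute permission


@dataclass
class Target:
    """The two properties of a file that the sysctl bits look at."""
    on_noexec_mount: bool
    has_exec_permission: bool


def mayexec_open_allowed(enforce: int, target: Target) -> bool:
    """Model whether an O_MAYEXEC open would succeed for a given sysctl value."""
    if enforce & ENFORCE_MOUNT and target.on_noexec_mount:
        return False
    if enforce & ENFORCE_FILE and not target.has_exec_permission:
        return False
    return True
```

With a value of zero every open is allowed, which matches the default; with a value of three the function refuses exactly the files that could not be executed directly.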

Integrity measurement is another subsystem that can benefit from O_MAYEXEC. The kernel's integrity-measurement subsystem can be configured to block the execution of files that do not meet the integrity criteria but, once again, passing a file directly to an interpreter can bypass this check. This patch set adds a hook by which files opened with O_MAYEXEC can be passed to the integrity-measurement code for vetting before an open is allowed to succeed.

Finally, as one might expect, security modules can also note the existence of this flag and respond accordingly. It would be relatively straightforward to write a policy for SELinux or Smack that prevents execution-by-interpreter of files that lack a certain label (or to prevent such execution entirely, of course).

The above discussion skips over one little detail, though: this mechanism will only work if the programs that execute code from files cooperate and provide the O_MAYEXEC flag. That would require getting patches into various language interpreters, linkers, etc. to properly mark the opening of any files that might lead to the execution of code. Actions such as opening files passed on the command line, importing code in modules, and more would need this flag. Getting all of the commonly installed interpreters patched is likely to be a project that takes some time, even if all of the relevant projects go along with the idea.

The good news is that some projects, at least, are aware of the issue. The Python project, for example, has been working since (at least) 2017 to provide audit information to the underlying operating system; that work is currently formalized as PEP 578 ("Python runtime audit hooks"), which was approved in May 2019 and shipped as part of the Python 3.8 release. Simply supporting O_MAYEXEC doesn't require the addition of an entire new subsystem, though, so adding this support to other interpreters need not be a multi-year project.
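For readers who have not met the audit hooks, here is a minimal sketch of what PEP 578 makes possible, assuming Python 3.8 or later. It only records which paths the built-in open() is asked to open; it does not involve O_MAYEXEC, and the names `opened` and `record_open` are illustrative.

```python
import sys
import tempfile

opened = []  # paths seen by the hook, in order


def record_open(event, args):
    # The built-in open() raises an "open" audit event whose
    # arguments are (path, mode, flags).
    if event == "open":
        opened.append(args[0])


# Note: audit hooks cannot be removed once installed.
sys.addaudithook(record_open)

with tempfile.NamedTemporaryFile("w", suffix=".py") as tmp:
    target = tmp.name
    with open(target) as f:
        f.read()
```

After this runs, `target` appears in `opened`. A hook like this observes opens from inside the interpreter, whereas O_MAYEXEC would let the kernel apply its own policy to the same opens.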

This patch set is in its fifth revision as of this writing. It has changed considerably as the result of review comments. The original version, posted at the end of 2018, predated openat2() and relied on the Yama security module for enforcement. Developers seem relatively happy with the current version, though, so this feature may be getting close to being ready to enter the mainline. Only then can the task of adding support to various interpreters begin.


O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 4:07 UTC (Tue) by pabs (subscriber, #43278) [Link] (9 responses)

Could this lead to a way to block people from running `curl | bash`?

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 4:17 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

What's wrong with "curl | bash"?

How is it different from "pip install"?

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 4:31 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (4 responses)

pip traditionally downloads from PyPI, although you can feed it a raw URL if you enjoy living dangerously. PyPI is not categorically immune from unintentionally hosting malware, of course, but it's a poor choice for intentional malware distribution, because it can be audited by security researchers et al. upon request. Most malware tries to obfuscate itself to make it harder to block, and a well-known host which complies with legal requests is a poor vector for this and other reasons.

Having said all that, bash does not call openat2() on stdin (nor should any remotely reasonable program), so I'm going to say this whole question is moot.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 4:37 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

PyPI not only can be used to distribute actual malware, but it HAS been used to do that more than once. Ruby gems and JS NPMs also have been used for this purpose.

Running "curl | bash" is at least honest about its possible impact.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 5:27 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (1 responses)

> PyPI not only can be used to distribute actual malware, but it HAS been used to do that more than once.

I don't dispute that. My point was merely that a smart malware author would probably choose a different host, in practice, most of the time. That is a vastly different claim from "PyPI software is always safe," which I certainly did not say. Rather, my claim is more focused on the possible remediation after a malware event. If you know you got it from PyPI, you can pass that information on to security researchers and authorities, who can then study the malware and make recommendations to others. If you got it from curl | bash, who's to say the site is even still there?

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 5:34 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

A PyPI package can trivially do an equivalent of "system('curl | bash')". There really is no difference here.

Pretty much the only semi-reliable package sources are native Linux distribution packages. And even that is likely borderline.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 9:41 UTC (Tue) by ballombe (subscriber, #9523) [Link]

Agreed. PyPI/NPM provide a false sense of security while being very convenient, which is a combination that leads to disasters.
curl | bash provides a correct sense of insecurity while being rather inconvenient. Much less likely to lead to a disaster.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 8:43 UTC (Thu) by mina86 (guest, #68442) [Link] (2 responses)

The one real issue with ‘curl | bash’ is an interrupted connection. If the script being downloaded isn’t written properly, this can lead to execution of corrupted script and e.g. ‘rm -rf /’ instead of ‘rm -rf /tmp/my-temp-file’.

For those wondering, the way to guard against that is to wrap the entire script with ‘_() { …; }; |’.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 16:23 UTC (Thu) by dkg (subscriber, #55359) [Link] (1 responses)

I'm pretty sure you mean to wrap the entire script in:

_() {
    […]
}
_

the trailing pipe doesn't make sense.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 16:28 UTC (Thu) by mina86 (guest, #68442) [Link]

Yes, sorry, a typo. I meant underscore, not a pipe.
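The effect of the wrapper can be checked without running anything dangerous. The sketch below assumes a POSIX `sh` is on the path and uses its `-n` option, which parses a script without executing it; a harmless `echo` stands in for the `rm` command, and the variable names are made up.

```python
import subprocess


def syntax_ok(script: str) -> bool:
    """Ask sh to parse, but not run, the script (-n disables execution)."""
    proc = subprocess.run(["sh", "-n"], input=script, text=True,
                          capture_output=True)
    return proc.returncode == 0


# Harmless stand-in for a cleanup command such as "rm -rf /tmp/my-temp-file".
plain = "echo would-remove /tmp/my-temp-file\n"
wrapped = "_() {\n" + plain + "}\n_\n"

# Simulate a connection that drops right after the first "/".
truncated_plain = plain[:plain.index("tmp")]        # "echo would-remove /"
truncated_wrapped = wrapped[:wrapped.index("tmp")]  # function body never closed
```

The truncated plain script is still valid shell, so the shell would run the shortened command. The truncated wrapped script ends inside an unfinished function definition, which is a syntax error, so nothing runs.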

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 5:59 UTC (Tue) by jezuch (subscriber, #52988) [Link] (2 responses)

Me wonders, what would happen if the Java VM was modified to use this flag, then some distribution flicked the switch and everyone was forced to chmod +x all their .jar files? ;)

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 16:10 UTC (Tue) by l0kod (subscriber, #111864) [Link]

General purpose distros will not enforce anything by default but will let sysadmins tune their system (according to its mount points and files). Only tailored security distros will embed such a security policy.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 18:42 UTC (Tue) by error27 (subscriber, #8346) [Link]

I think you're right that that's what will happen. But you can disable it with a sysctl until everything is fixed. It's just like SELinux and people will eventually get used to it.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 10:42 UTC (Tue) by dullfire (guest, #111432) [Link] (6 responses)

While the idea of improving security measures is commendable, I am not sure that this idea actually buys us a whole lot.
For example "echo 'print("Hello foobar")' |python" will still work, and as somebody else has already rightly pointed out: this can't be used to stop interpreters from using stdin (and in fact attempting to do so sounds like a terrible idea).
Furthermore, it sounds like this is attempting to create a link between kernel semantics (is the kernel allowed to create a process image out of this file), and pure user-land semantics (can any program, anywhere, use this file as flow control). If that's the case, does that mean all html, and js (that want to be executed/viewed locally rather than served) potentially need to be marked as executable?

If we want to lock it down that badly, we probably also need to disable access to /proc/self. I'm pretty sure I could use a bash shell script to do some ROP in the very same shell to get the shell to MMAP some code as executable and then dd in w/e data I wanted.... now I have code execution. Seccomp could probably stop that (unless bash actually uses mmap, not sure).

My point is while the goals are laudable, I don't think this approach is a very well thought out way to attempt to tackle this problem.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 13:24 UTC (Tue) by smurf (subscriber, #17840) [Link] (3 responses)

I don't think teaching interpreters to not execute non-interactive stdin is a bad idea. No sane system should require that.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 13:49 UTC (Tue) by dskoll (subscriber, #1630) [Link]

The only way to detect "non-interactive" stdin is to check if it's a tty, and that's easily spoofed with a pseudoterminal.
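A short sketch of that point, assuming a Unix-like system where pseudoterminals are available: a pipe fails the tty test, while the slave side of a freshly created pseudoterminal passes it even though no human is attached. The function name is illustrative.

```python
import os
import pty


def looks_interactive(fd: int) -> bool:
    """The usual heuristic: treat the descriptor as interactive if it is a tty."""
    return os.isatty(fd)


# A pipe, as in "cat abc | python", is not a tty.
read_end, write_end = os.pipe()
pipe_is_tty = looks_interactive(read_end)

# The slave side of a pseudoterminal is a tty, although any program
# holding the master side can feed it arbitrary input.
master, slave = pty.openpty()
pty_is_tty = looks_interactive(slave)

for fd in (read_end, write_end, master, slave):
    os.close(fd)
```

An interpreter that refused to read code from non-tty input would therefore stop the pipe but not a program driving it through a pseudoterminal.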

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 9:18 UTC (Wed) by ballombe (subscriber, #9523) [Link]

What ? We do this all the time in HPC.

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 17:29 UTC (Wed) by ianmcc (subscriber, #88379) [Link]

It's a little difficult to parse that quadruple negative statement. Is something of the form 'cat file | interpreter' an example of the interpreter reading a non-interactive stdin? I'd have thought this kind of thing was ubiquitous on unix systems.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 16:37 UTC (Tue) by l0kod (subscriber, #111864) [Link]

As explained in the documentation patch, the ability to restrict code execution must be thought of as a system-wide policy: https://lwn.net/ml/linux-kernel/20200505153156.925111-6-m...
As with other security policies, enforcing such execution prevention does not make sense on all system installations, especially developers' ones.

O_MAYEXEC is only one part of the solution. According to your threat model, using stdin (or other ways) to push code to interpreters may be legitimate or not. O_MAYEXEC doesn't help for this problem, but there are other solutions (which don't require kernel modification). You can get inspiration from CLIP OS 4: https://github.com/clipos-archive/clipos4_portage-overlay...

The difference between code and data is relative. According to your threat model, one way to draw a line is to identify which kind of input (Python, JavaScript, HTML, CSS…) can do system calls (which could lead to kernel attacks) or can have fine control of CPU instructions (which could lead to side channel attacks): https://lore.kernel.org/lkml/d1a81d06-7530-1f2b-858a-e42b...

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 21:01 UTC (Tue) by edeloget (subscriber, #88392) [Link]

Honestly, that's a feature that I'm going to use -- I've been trying to find a way to close the loophole of scripts that can be executed from a noexec filesystem for a while, and I'm quite happy with it :)

Of course it's still possible to pipe a string to an interpreter but the change is still interesting to me.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 10:47 UTC (Tue) by LiPo (guest, #129784) [Link] (3 responses)

Suppose that Python is updated. If I cannot do "python abc", what prevents me from doing "cat abc | python" or "python < abc"?

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 11:59 UTC (Tue) by dskoll (subscriber, #1630) [Link]

For the latter case, you'd need a fcntl informing the system of your intent to execute stdin and it could do its checks. For the former case, I guess you'd have to fail the fcntl if stdin does not refer to a plain file. That would be a big and backwards-incompatible change.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 15:56 UTC (Tue) by matthias (subscriber, #94967) [Link]

This mechanism should prevent python from accidentally executing code from the data filesystem instead of the directory where your approved scripts are. It can step in if other measures have failed and an attacker tricked python to open the wrong script, e.g., because some user input on a web server was not properly checked/escaped. If an attacker is able to execute arbitrary shell commands like "cat abc | python" it is way too late. The chances are quite bad to stop an attacker at that point.

This mechanism is not meant to stop local users. If you have an interactive shell you can just start python and type any command that is inside abc. In many attack scenarios, the attacker has very little control over what commands can be executed and needs some vector to get arbitrary commands into the system. Not everyone, who can trick python to execute the wrong script (e.g., a script in some data directory) also has the ability to control the stdin of the python process.

O_MAYEXEC — explicitly opening files for execution

Posted May 12, 2020 22:52 UTC (Tue) by embe (subscriber, #46489) [Link]

From the patch series description:

Additional Python security improvements (e.g. a limited interpreter
without -c, stdin piping of code) are on their way.

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 5:59 UTC (Wed) by kalvdans (subscriber, #82065) [Link] (1 responses)

There is already a system call for this single purpose: access(filename, X_OK). Passing it as a flag to open() brings atomicity, but to me none of the use cases discussed in the Linux mailing list or the PEP need atomicity.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 8:51 UTC (Thu) by mina86 (guest, #68442) [Link]

Time-of-check/time-to-use is why atomicity is required. Security checks without atomicity are pointless.
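The race can be made concrete with a deterministic sketch. It assumes a Unix-like system with symbolic links; the `between` callback plays the attacker who re-points a path in the window between check and use. The second function checks the file that was actually opened, which is the property an open-time flag gives; it is not O_MAYEXEC itself, and testing for any execute bit is a simplification of a real permission check. All names are made up for the illustration.

```python
import os
import tempfile


def check_then_open(path, between=lambda: None):
    """Racy: the check and the open may see two different files."""
    if not os.access(path, os.X_OK):
        raise PermissionError(path)
    between()  # window in which the path can be re-pointed
    with open(path) as f:
        return f.read()


def open_then_check(path, between=lambda: None):
    """Check the file that was actually opened, via its descriptor."""
    with open(path) as f:
        between()  # swapping the path no longer changes what was opened
        mode = os.fstat(f.fileno()).st_mode
        if not mode & 0o111:  # simplification: any execute bit counts
            raise PermissionError(path)
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        approved = os.path.join(tmp, "approved.py")
        hostile = os.path.join(tmp, "upload.txt")
        link = os.path.join(tmp, "current")
        for name, text, mode in ((approved, "approved", 0o755),
                                 (hostile, "hostile", 0o644)):
            with open(name, "w") as f:
                f.write(text)
            os.chmod(name, mode)

        def swap():
            os.remove(link)
            os.symlink(hostile, link)

        os.symlink(approved, link)
        racy = check_then_open(link, swap)
        os.remove(link)
        os.symlink(approved, link)
        safe = open_then_check(link, swap)
        return racy, safe
```

In the first call the check passes on the approved file but the content read is the hostile one; in the second, the swap happens too late to matter.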

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 10:54 UTC (Wed) by mst@redhat.com (subscriber, #60682) [Link]

The flag makes sense. The sysctl knob is weird;
why not do it in a security module? Ideas?

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 12:41 UTC (Wed) by mirabilos (subscriber, #84359) [Link] (12 responses)

There’s an O_MAYEXEC already in existence, from CLIP OS: https://lwn.net/Articles/768819/

I already support that in mksh, so adding it elsewhere MUST NOT change semantics!

O_MAYEXEC — explicitly opening files for execution

Posted May 13, 2020 12:46 UTC (Wed) by mirabilos (subscriber, #84359) [Link]

(That is, passing it as flag to open(2), or’d with O_RDONLY, because that’s what CLIP OS documents.)

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 2:09 UTC (Thu) by dvdeug (guest, #10998) [Link] (10 responses)

It's a suggestion to be consistent with previous semantics, but minor operating systems implementing things do not bind major operating systems to do things the same way, and to do so would be pretty crippling on the ability to innovate.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 16:06 UTC (Thu) by mirabilos (subscriber, #84359) [Link] (9 responses)

Not exactly, sure, but if they use the same name *and* are aware of prior art, the semantics should be the same, as otherwise userspace is going to break ;-)

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 17:21 UTC (Thu) by mjg59 (subscriber, #23239) [Link] (7 responses)

Userspace that depends on the semantics of a patch that isn't in upstream doesn't get to complain if upstream ends up choosing different semantics.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 22:36 UTC (Thu) by mirabilos (subscriber, #84359) [Link] (3 responses)

Erm… please start brain before commenting.

This userspace uses O_MAYEXEC, which was defined by CLIP OS, if O_MAYEXEC is present, and using the CLIP OS semantics.

This userspace isn’t even natively developed on/for Linux, it’s just one of dozens of ports. It even runs on DEC Ultrix.

So, no, I don’t see why it’s the fault of this userspace if Linux were to decide to implement a functionality that already exists in another operating system, but with the same name but different semantics.

Also, this isn’t fault-assigning; merely making people aware of the CLIP OS prior art and requesting to be compatible.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 22:40 UTC (Thu) by mjg59 (subscriber, #23239) [Link] (1 responses)

CLIP OS is a Linux derivative. If they choose to add new semantics without discussion with Linux upstream, you don't get to expect those semantics to be consistent. CLIP OS could have namespaced the flag to reduce the probability of conflict but didn't, so here we are.

> merely making people aware of the CLIP OS prior art and requesting to be compatible.

> adding it elsewhere MUST NOT change semantics!

Ok.

O_MAYEXEC — explicitly opening files for execution

Posted May 19, 2020 13:58 UTC (Tue) by robbe (guest, #16131) [Link]

Would it change anything if CLIP OS were a BSD variant?

I’d support the popularity argument – an OS used by n people should not be hobbled by an OS that only n/10000 people use. But genealogy should not play into it.

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 0:36 UTC (Fri) by dvdeug (guest, #10998) [Link]

If it runs on DEC Ultrix, etc., it's well aware of the fact that functionality in different operating systems may have the same name and different semantics. Compatibility has a cost, and it seems too high to pay here for compatibility with an obscure system like CLIP OS and the few people who might have added compatibility with it.

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 8:31 UTC (Fri) by l0kod (subscriber, #111864) [Link] (2 responses)

I'm the main O_MAYEXEC (upstream patch series) developer, and I've developed CLIP OS for more than 10 years. We (CLIP OS developers) wanted to upstream this new flag and the associated security features. Because CLIP OS is fully controlled (all applications are built and may be patched by ourselves), changing the kernel ABI (even if not desirable) is not a major issue. This is the reason why we were able to implement and maintain such a non-upstream flag. The upstream process has the benefit of creating discussions, but upstreaming also requires dealing with compatibility with existing systems. CLIP OS relies on the Linux kernel, and the additional O_MAYEXEC flag (among other things) is, until now, a specificity of our distro, and we have the burden of checking that it works and doesn't break with each system update.

This patch series brings some improvements to our original implementation, and more importantly enables other systems to progressively adopt this new feature. This is why this flag will only be handled by openat2(2) and is controlled by a sysctl, which weren't part of the original implementation because they weren't required thanks to our total control of user space.

The O_MAYEXEC semantic is unchanged (according to the sysctl configuration), and it is highly unlikely that new flags will be added to the (old) open(2) and openat(2). It should therefore not be an issue to keep this specificity of CLIP OS 4. Anyway, none of these compatibility issues of a derivative kernel should concern upstream: it is part of the deal to implement new features without upstreaming them first. Moreover, a new CLIP OS 5 is on its way to replace the fourth version.

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 16:11 UTC (Fri) by mirabilos (subscriber, #84359) [Link] (1 responses)

Thanks for responding here.

Does this mean, then, that I should remove the current O_MAYEXEC support from mksh?

There’s no way it, as a portable application, is going to use a different syscall for opening, though.

I’m getting the hint that CLIP OS 5 would also use openat2() only, then.

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 18:35 UTC (Fri) by l0kod (subscriber, #111864) [Link]

> Does this mean, then, that I should remove the current O_MAYEXEC support from mksh?

Yes, you can.

> There’s no way it, as a portable application, is going to use a different syscall for opening, though.

If you want to use different features from different OSes, then you can use #ifdef and such. There are a lot of applications and libraries that implement similar features differently (e.g. managing temporary files with mkstemp, O_TMPFILE, etc.).
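The same pattern exists outside C. As a rough Python analogue of the #ifdef approach, the sketch below prefers O_TMPFILE where the platform and filesystem offer it and falls back to mkstemp otherwise; the function name `anonymous_tempfile` is made up for the illustration.

```python
import os
import tempfile


def anonymous_tempfile(directory=None):
    """Return (fd, method) for a temporary file that has no name on disk."""
    directory = directory or tempfile.gettempdir()

    # Feature detection: os.O_TMPFILE only exists on platforms that define it.
    o_tmpfile = getattr(os, "O_TMPFILE", None)
    if o_tmpfile is not None:
        try:
            fd = os.open(directory, o_tmpfile | os.O_RDWR, 0o600)
            return fd, "O_TMPFILE"
        except OSError:
            pass  # kernel or filesystem lacks support; use the fallback

    # Portable fallback: create a named file, then remove its name.
    fd, path = tempfile.mkstemp(dir=directory)
    os.unlink(path)
    return fd, "mkstemp"
```

A portable shell could treat an execute-intent flag in the same way: use it where the platform provides it and open normally elsewhere.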

> I’m getting the hint that CLIP OS 5 would also use openat2() only, then.

It is planned to use upstream features as much as possible.

O_MAYEXEC — explicitly opening files for execution

Posted May 14, 2020 19:57 UTC (Thu) by dvdeug (guest, #10998) [Link]

I might buy that argument for FreeBSD, though I also might buy that argument that FreeBSD used such a common name that we're not going to avoid it when doing something similar but slightly different. When you add support for obscure systems like CLIP OS, it's no more reasonable to expect that Linux will follow suit in the exact same way than it is to add support for a feature in C++2x and expect the final version of the standard to support it in the exact same way.

What's the fopen() equivalent?

Posted May 13, 2020 14:34 UTC (Wed) by david.a.wheeler (subscriber, #72896) [Link] (2 responses)

A lot of systems use fopen(), not openat2(). What's the letter equivalent in fopen()? That should be agreed-on soon so everyone can use the same one.

What's the fopen() equivalent?

Posted May 14, 2020 8:21 UTC (Thu) by Karellen (subscriber, #67644) [Link]

Why would that be necessary?

fopen() is the C standard library, lowest-common-denominator function for opening files. It doesn't have many options, in order to be maximally portable. open()/openat() is more powerful, but specific to POSIX (and openat2() to Linux), and there are already a lot of open() flags that don't have equivalents in fopen(), like O_EXCL, O_CLOEXEC, O_NOATIME, etc...

In fact, C doesn't even have exec(), so wouldn't it be a bit odd to define a O_MAYEXEC flag for fopen()?

What's the fopen() equivalent?

Posted May 14, 2020 8:34 UTC (Thu) by dtlin (subscriber, #36537) [Link]

Not necessary,

fdopen(openat2(AT_FDCWD, pathname, O_RDONLY | O_MAYEXEC), "r")

should work fine.
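The same wrapping pattern is available from Python as os.open() followed by os.fdopen(). In the sketch below, O_NOFOLLOW stands in for a flag that the higher-level API cannot express, since Python's os module has no O_MAYEXEC constant; the helper name `open_stream` is hypothetical. Note also that the real openat2() takes a pointer to a struct open_how and its size rather than a bare flags integer, so the one-liner above is shorthand.

```python
import os
import tempfile


def open_stream(path, extra_flags=0):
    """Open with low-level flags, then wrap the descriptor in a file object."""
    fd = os.open(path, os.O_RDONLY | extra_flags)
    return os.fdopen(fd, "r")  # the file object now owns the descriptor


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "script.txt")  # hypothetical file name
    with open(sample, "w") as f:
        f.write("hello")
    with open_stream(sample, os.O_NOFOLLOW) as stream:
        content = stream.read()
```

The buffered-stream API stays unchanged; only the step that obtains the descriptor needs to know about the extra flag.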

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 18:03 UTC (Fri) by amarao (guest, #87073) [Link] (2 responses)

Where is the cut line for this flag? If a browser is reading file://path/script.js, is this a 'may exec'? If so, is an HTML file 'may exec'? They can contain js...

Is an ansible playbook 'exec'? Is an ansible inventory 'exec'? It may contain variables with Jinja templates and the 'command' lookup plugin, which is local code execution by the way.

The border line is extremely blurry. Is mailbox executed by mua? It may contain js...

O_MAYEXEC — explicitly opening files for execution

Posted May 15, 2020 19:34 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

A browser could try to open with O_MAYEXEC, and if that fails fall back to opening without it but disabling the Javascript interpreter.
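That fallback might look like the sketch below. Since no released Python exposes O_MAYEXEC, the execute-intent opener is passed in as a parameter; `open_mayexec`, `load_page`, and the returned `scripts_enabled` flag are all hypothetical names, and a refused open is assumed to surface as PermissionError.

```python
from typing import IO, Callable, Tuple


def load_page(path: str,
              open_mayexec: Callable[[str], IO[str]],
              open_plain: Callable[[str], IO[str]] = open) -> Tuple[str, bool]:
    """Return (content, scripts_enabled) for a local page.

    open_mayexec is a hypothetical opener that declares execute intent
    and raises PermissionError when system policy refuses the open.
    """
    try:
        f = open_mayexec(path)
        scripts_enabled = True
    except PermissionError:
        # Policy says this file must not be executed: still display it,
        # but treat it purely as data.
        f = open_plain(path)
        scripts_enabled = False
    with f:
        return f.read(), scripts_enabled
```

The application, not the kernel, decides what a refused open means; here it means rendering the page with scripting turned off.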

O_MAYEXEC — explicitly opening files for execution

Posted May 16, 2020 5:15 UTC (Sat) by neilbrown (subscriber, #359) [Link]

I see the cut-line a different way.

Unless an application is configured to be careful about what it runs, then it doesn't ever set the flag.
So if /bin/sh will accept "-c code" on the command line, and will execute code read from stdin, then it will equally execute code read from a file opened without O_MAYEXEC, and so never bothers to set O_MAYEXEC.

But if it has been configured - either at compile-time or by some /etc/bash-security.conf file - to disable -c and reading from stdin, then it will set O_MAYEXEC whenever it opens any file to read commands.

Similarly any other app, whether browser or music player or editor, might accept a "be secure" configuration which causes it to start using O_MAYEXEC. If "secure" isn't uni-valued for the particular app, then the configuration will fill in the details.

O_MAYEXEC isn't a tool to enforce security. It is a tool to help people who so desire to build more secure systems. The person who builds the final system decides what O_MAYEXEC means exactly, just as they choose what permissions to put on different files.