
Disallowing perf_event_open()

By Jake Edge
August 3, 2016

An Android security measure that limits the ability of processes to access the perf events subsystem ran up against some, perhaps surprising, resistance when it was recently proposed for the mainline. The patch simply provides another setting for the kernel.perf_event_paranoid sysctl parameter to disallow unprivileged processes from accessing the perf_event_open() system call at all. It is currently used in both Android and Debian kernels, but some kernel developers see it as too much of a "big hammer" approach.

Jeff Vander Stoep posted the patch on July 27. It adds another value that can be set for the sysctl parameter (i.e. kernel.perf_event_paranoid=3) that restricts perf_event_open() to processes with the CAP_SYS_ADMIN capability. Currently, perf_event_paranoid is set to 2 by default, which disallows access to some perf features (raw tracepoint access, CPU event access, and kernel profiling) to processes without the proper capabilities; the patch does not change the default. He also submitted another patch that would allow configuring the kernel to make 3 be the default perf_event_paranoid value.
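The cumulative nature of the setting can be easier to see in code. What follows is a minimal sketch in Python, not kernel code: the function and table names are made up for illustration, and the per-level mapping follows the kernel's sysctl documentation (each higher level adds one restriction), with level 3 modeled the way the patch describes it.

```python
from typing import List

# Illustrative model only; these names are not kernel identifiers.
# Each entry is (minimum paranoid level, feature denied at or above it).
RESTRICTIONS = [
    (0, "raw tracepoint access"),
    (1, "CPU event access"),
    (2, "kernel profiling"),
]


def denied_features(paranoid: int, has_cap_sys_admin: bool) -> List[str]:
    """Return what a process is denied at a given perf_event_paranoid level."""
    if has_cap_sys_admin:
        # Privileged processes are not affected by the setting.
        return []
    if paranoid >= 3:
        # The proposed level: the system call is refused outright.
        return ["perf_event_open() itself"]
    return [name for level, name in RESTRICTIONS if paranoid >= level]
```

Under this model, an unprivileged process at the default level of 2 is denied all three listed features but can still open events on its own user-space code; at level 3 it cannot call perf_event_open() at all.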

In the first patch, he noted five vulnerabilities worthy of CVE numbers that have recently been found in perf and argued that allowing access to it increases the attack surface of the kernel. For production kernels, that may not make sense, so the patch is intended to allow administrators to restrict access to perf, while still providing a means for developers and others to access the tool as needed (by granting CAP_SYS_ADMIN). The patches are based on the grsecurity PERF_HARDEN feature and were first proposed by Ben Hutchings back in January. At that time, he said it had been running in the Debian kernel since August 2015 with no complaints.
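Whether a given process holds CAP_SYS_ADMIN can be checked from user space by looking at the effective-capability mask in /proc/<pid>/status. The sketch below assumes the usual "CapEff:" line format with a hexadecimal mask; the function name is hypothetical, and 21 is the capability number assigned to CAP_SYS_ADMIN in <linux/capability.h>.

```python
CAP_SYS_ADMIN = 21  # capability number from <linux/capability.h>


def has_cap_sys_admin(status_text: str) -> bool:
    """Check the CapEff bitmask in the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            # The mask is printed as a hexadecimal number after the colon.
            mask = int(line.split(":", 1)[1].strip(), 16)
            return bool(mask & (1 << CAP_SYS_ADMIN))
    # No CapEff line found; treat the process as unprivileged.
    return False
```

Passing it the contents of /proc/self/status indicates whether the calling process would still be able to use perf_event_open() with the sysctl set to 3.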

It is a fairly simple and straightforward change, but Peter Zijlstra objected that providing a way to turn off perf because of some bugs was heavy-handed: "We have bugs we fix them, we don't kill complete infrastructure because of them." He also thought that it would inhibit new and innovative uses for the tool:

So the problem I have with this is that it will completely inhibit development of things like JITs that self-profile to re-compile frequently used code.

I would much rather have an LSM hook where the security stuff can do more fine grained control of things. Allowing some apps perf usage while denying others.

Daniel Micay noted that the functionality would still be available to privileged processes and that Android will allow access by unprivileged processes, but that capability must be enabled from the adb shell. Furthermore:

It isn't even possible to disable the perf events infrastructure via kernel configuration for every architecture right now. You're forcing people to have common local privilege escalation and information leak vulnerabilities for something few people actually use.

This patch is now a requirement for any Android devices with a security patch level above August 2016. The only thing that not merging it is going to accomplish is preventing a mainline kernel from ever being used on Android devices, unless you provide an alternative it can use for the same use case.

Micay was skeptical that an LSM-based approach would work, as was Kees Cook, who said: "I'm not against an LSM, but I think it's needless complexity when there is already a knob for this but it just doesn't go 'high' enough." He also noted that bugs live a long time ("an average of 5 years from introduction to fix") and they can last even longer when you take product update lifecycles into account. He argued that administrators need the ability to reduce the attack surface of their systems:

Being able to remove attack surface is a fundamental first step of security defense, and things like perf, user namespaces, and similar APIs, expose a lot of attack surface when they are enabled. And the evidence for this attack surface being a real-world risk is in the history of security vulnerabilities (that we know about!) in these various APIs.

Now, obviously, these API have huge value, otherwise they wouldn't exist in the first place, and they wouldn't be built into end-user kernels if they were universally undesirable. But that's not the situation: the APIs are needed, but they lack the appropriate knobs to control their availability.

The use case Zijlstra mentioned might be a good reason to change the value of the setting, Cook said, but there are other use cases where administrators want to be able to reduce their systems' attack surface when running a pre-built kernel. But Zijlstra disagreed; he is concerned that having this knob available will mean that administrators blindly apply it. That would have the effect of stopping the development of use cases like he described:

Having this knob will completely inhibit development of such applications. Worse it will probably render perf dead for quite a large body of developers.

Cook was undeterred, however, saying that the feature is based on a risk analysis of the attack surface, and that there are "hundreds of millions of end-users for whom perf is not needed". Beyond that, though, Zijlstra's argument assumes that the knob is not available, but that simply isn't true:

I've never suggested it be default disabled: I'm wanting to upstream the sysctl setting that is already in use on distros where the distro kernel teams have deemed this is [a] needed knob for their end-users. All of the objections you're talking about assume that the knob doesn't exist, but it does already. It's just not in upstream.

Ingo Molnar agreed with Zijlstra that the approach was "too coarse". He suggested that perf is not just a "narrow debugging mechanism" that can simply be turned off, but that it is a "wide scope performance measurement and event logging infrastructure that is being utilized not just by developers but by apps and runtimes as well".

Micay pointed out that the wide scope of perf is part of its problem from a security perspective. Because it has been a "frequent source of vulnerabilities", it has been disabled by some distributions. Part of the problem also lies outside of the core kernel, he said: "It's extended by lots of vendor code to specific to platforms too, so it isn't just some core kernel code that's properly reviewed."

The coarseness of the setting also concerned Eric W. Biederman. He suggested that many of the features to reduce the attack surface amount to a "system wide off switch" for features like user namespaces and perf. The result is that new applications cannot take advantage of these features, which turns the attack-surface reduction into "great big denial of service attacks on legitimate users". He also suggested several ideas for ways to make the feature less coarse: "I vote for sandboxes. Perhaps seccomp. Perhaps a per userns sysctl. Perhaps something else."

That's about where things stand at this point. The second patch to allow configuring the kernel to default to denying access to perf has seemingly been dropped. The first will undoubtedly live on in grsecurity, Android, and Debian (at least), which seems to undermine the concern that Zijlstra, Molnar, and Biederman have—as Cook said, the change has already happened in some places. Whether a more fine-grained approach emerges remains to be seen, but it is a little hard to see who would work on it. Distributions already have their solution at this point.


Index entries for this article
Kernel: Performance monitoring
Kernel: Security/Kernel hardening
Security: Linux kernel/Hardening



Disallowing perf_event_open()

Posted Aug 4, 2016 12:52 UTC (Thu) by deater (subscriber, #11746)

Any word on how the CVEs were found? Via a fuzzer or some other way? It is a bit annoying to find internal bugs in your own architectural driver and your solution is to disable a global interface for everyone.

At the same time, the perf_fuzzer can still reliably crash any x86 machine within a few hours despite years of trying to get that fixed, so it probably is a good idea to provide a way to block the syscall.

I see both sides of the issue here, as having perf_event disabled by default is really going to cause annoyance for high-performance computing users (it hasn't started problems yet because most users aren't using Debian). If you start requiring root or sysadmin action in order for perf to work, people are going to go back to skipping perf altogether for tools like LIKWID (they also usually require root or sysadmin intervention).

Disallowing perf_event_open()

Posted Aug 4, 2016 20:01 UTC (Thu) by nix (subscriber, #2304)

Quite. That's not the only use case either. My firewall, in common with much semi-embedded hardware, has no useful PMUs and runs no software that would benefit (and even if it did, I frankly don't care if I break self-tuning JITs on a firewall: it's not doing anything else so the performance of things like that is of minimal interest). But as firewalls tend to be, it is network-exposed to hostile attackers and really should not have huge unused lumps of security-critical code on it at all, let alone exposed to unprivileged users.

I have no idea why there isn't a config knob to compile perf out entirely. There surely should be. Not everything needs it; not everything can benefit; and those things that don't are purely harmed by its distinctly non-zero-sized vulnerability surface.

Disallowing perf_event_open()

Posted Aug 4, 2016 22:51 UTC (Thu) by ballombe (subscriber, #9523)

> (it hasn't started problems yet because most users aren't using Debian)
or maybe because perf is not disabled by default on Debian.

Disallowing perf_event_open()

Posted Aug 5, 2016 0:17 UTC (Fri) by deater (subscriber, #11746)

> or maybe because perf is not disabled by default on Debian.

I always forget that most people do not run unstable.

Disallowing perf_event_open()

Posted Aug 5, 2016 0:57 UTC (Fri) by creemj (subscriber, #56061)

I upgraded to testing two or three weeks ago and had to spend some time working out why my profiling code stopped working. It was a bit annoying to find that perf events was disabled in the kernel from scratch, but I managed to write my first ever systemd init config to write a sensible value to /proc/sys/kernel/perf_event_paranoid on boot up so that I wouldn't be pissed off every time I rebooted.

The pain level has to be quite high before I would report a bug or issue back to Debian --- I hate the reportbug interface and find 'apt-get purge crappy-buggy-package' a much easier and quicker solution in most cases.

I am wondering, however, whether there is a kernel option to change the default? That would be a neater solution.

Disallowing perf_event_open()

Posted Aug 5, 2016 4:36 UTC (Fri) by nybble41 (subscriber, #55106)

> …but I managed to write my first ever systemd init config to write a sensible value to /proc/sys/kernel/perf_event_paranoid on boot up…

Why a systemd init config? Isn't this exactly why we have /etc/sysctl.conf and /etc/sysctl.d/?

# echo "kernel.perf_event_paranoid = 0" > /etc/sysctl.d/perf_events.conf
# update-initramfs -k all -u

Disallowing perf_event_open()

Posted Aug 6, 2016 17:35 UTC (Sat) by anton (subscriber, #25547)

I run stable, and I find that the default restrictions of perf are broken for my uses, so I have this nice line in /etc/rc.local:
echo 0 >/proc/sys/kernel/perf_event_paranoid

Disallowing perf_event_open()

Posted Aug 6, 2016 18:06 UTC (Sat) by spender (guest, #23067)

I like that we have two comments now in this thread alone demonstrating how people have no problem at all with the serious security side-effects of the coarseness of the toggle in the other direction, yet all the complaints are focused on the mere option of an additional, more secure setting. When you consider that even CONFIG_NET can be disabled, but PERF_EVENTS cannot, it's obvious that all the pushback has to do with an arch maintainer also being the maintainer of a pet project he wants to force on everyone, spending more time on adding new bells and whistles and not on (for instance) fixing known issues deater has been talking about here for years now exposed by simple fuzzing, meanwhile offering users no way to mitigate those issues.

-Brad

Disallowing perf_event_open()

Posted Aug 8, 2016 15:09 UTC (Mon) by deater (subscriber, #11746)

And it turns out that 4.8-rc1 falls over even more quickly than normal with the perf_fuzzer.

The fun part is that the serial console only manages to print
[163405.758086] BUG: unable to handle kernel
before completely locking up.

I'll try reporting this to linux-kernel, but the reports are usually ignored and the deep seated problems the fuzzer is tickling are a bit out of my league to track down.

Disallowing perf_event_open()

Posted Aug 9, 2016 15:34 UTC (Tue) by deater (subscriber, #11746)

For those keeping track, I ran the perf_fuzzer on a Haswell machine running 4.8-rc1.

The bug mentioned previously that quickly locked the machine turns out to be a known bug (found with trinity a few weeks ago) with a working patch that hasn't hit upstream yet. Once I patched for the bug I let perf_fuzzer run again.

It lasted 6 hours before completely crashing. Along the way it found:

2 known WARNings
2 new WARNings
3 gpf/slab poisoning warnings

and then it finally crashed hard.

Disallowing perf_event_open()

Posted Aug 15, 2016 13:41 UTC (Mon) by MarcB (subscriber, #101804)

Somehow, I get the impression some kernel developers have disconnected from reality. Or maybe live in a different one, that consists only of developer workstations, back-end systems and maybe super-computing. Obviously, perf is always fine and useful in this reality.

However, there is a significant, other reality that has nasty places like regular end user devices (usually called "smartphones") or even shared hosting environments.

The latter ones face malicious code and malicious, local users on a daily basis. The operating system running them needs to cope with this or it is simply unfit for this purpose.
Nowadays, mainline Linux *is* unfit for such environments - and the unavoidable perf is a major reason for this.

Now, if this were about removing perf from the kernel, I could understand the opposition. But it is simply about giving the option to disable it. Arguing that "bugs will be fixed" is irrelevant: The risk is still increased. Arguing that administrators are clueless is hubris: Experienced administrators are lazy. If the default configuration works for them, they stick with it. If they change it - or even start patching the kernel - there must be a very good reason and kernel developers would be well-advised to try to understand it.


Copyright © 2016, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds