Kernel runtime security instrumentation
Finding ways to make it easier and faster to mitigate an ongoing attack against a Linux system at runtime is part of the motivation behind the kernel runtime security instrumentation (KRSI) project. Its developer, KP Singh, gave a presentation about the project at the 2019 Linux Security Summit North America (LSS-NA), which was held in late August in San Diego. A prototype of KRSI is implemented as a Linux security module (LSM) that allows eBPF programs to be attached to the kernel's security hooks.
Singh began by laying out the motivation for KRSI. When looking at the security of a system, there are two sides to the coin: signals and mitigations. The signals are events that might, but do not always, indicate some kind of malicious activity is taking place; the mitigations are what is done to thwart the malicious activity once it has been detected. The two "go hand in hand", he said.
For example, the audit subsystem can provide signals of activity that might be malicious. If you have a program that determines that the activity actually is problematic, then you might want it to update the policy for an LSM to restrict or prevent that behavior. Audit may also need to be configured to log the events in question. He would like to see a unified mechanism for specifying both the signals and mitigations so that the two work better together. That is what KRSI is meant to provide.
He gave a few examples of different types of signals. For one, a process that executes and then deletes its executable might well be malicious. A kernel module that loads and then hides itself is also suspect. A process that executes with suspicious environment variables (e.g. LD_PRELOAD) might indicate something has gone awry as well.
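As a rough user-space illustration of the first signal (not KRSI itself, which would hook the behavior in the kernel), the sketch below relies on the fact that Linux appends " (deleted)" to the target of `/proc/<pid>/exe` once the executable has been unlinked. The function names are hypothetical and chosen only for this example.

```python
import os

DELETED_SUFFIX = " (deleted)"


def exe_looks_deleted(link_target: str) -> bool:
    """Return True if a /proc/<pid>/exe target is marked as unlinked."""
    return link_target.endswith(DELETED_SUFFIX)


def processes_with_deleted_exe(proc_root: str = "/proc"):
    """List (pid, target) pairs for processes whose executable is gone."""
    hits = []
    if not os.path.isdir(proc_root):
        return hits  # not a Linux-style /proc; nothing to report
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # skip non-process entries such as "meminfo"
        try:
            target = os.readlink(os.path.join(proc_root, name, "exe"))
        except OSError:
            continue  # kernel thread, exited process, or no permission
        if exe_looks_deleted(target):
            hits.append((int(name), target))
    return hits
```

This is only a signal in the sense Singh described: a package upgrade that replaces the binary of a running daemon produces the same marker, so a hit might, but does not always, indicate malicious activity.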
On the mitigation side, an administrator might want to prevent mounting USB drives on a server, perhaps after a certain point during the startup. There could be dynamic whitelists or blacklists of various sorts, for kernel modules that can be loaded, for instance, to prevent known vulnerable binaries from executing, or stopping binaries from loading a core library that is vulnerable to ensure that updates are done. Adding any of these signals or mitigations requires reconfiguration of various parts of the kernel, which takes time and/or operator intervention. He wondered if there was a way to make it easy to add them in a unified way.
eBPF + LSM
He has created a new eBPF program type that can be used by the KRSI LSM. There is a set of eBPF helpers that provide a "unified policy API" for signals and mitigations. They are security-focused helpers that can be built up to create the behavior required.
![KP Singh](https://static.lwn.net/images/2019/lssna-singh-sm.jpg)
Singh is frequently asked why he chose to use an LSM, rather than other options. Security behaviors map better to LSMs, he said, than to things like seccomp filters, which are based on system call interception. Various security-relevant behaviors can be accomplished via multiple system calls, so it would be easy to miss one or more, whereas the LSM hooks intercept the behaviors of interest. He also hopes this work will benefit the overall LSM ecosystem, he said.
He talked with some security engineers about their needs and one mentioned logging LD_PRELOAD values on process execution. The way that could be done with KRSI would be to add a BPF program to the bprm_check_security() LSM hook that gets executed when a process is run. So KRSI registers a function for that hook, which gets called along with any other LSM's hooks for bprm_check_security(). When the KRSI hook is run, it calls out to the BPF program, which will communicate to user space (e.g. a daemon that makes decisions to add further restrictions) via an output buffer.
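That flow can be modeled in user space. The following is a toy Python sketch, not kernel code: `HookChain`, `krsi_hook`, `log_ld_preload`, `deny_listed_binary`, the example paths, and the deque standing in for the output buffer are all hypothetical names invented for illustration. It assumes the usual LSM convention that a hook returns 0 to allow an operation and a negative errno to deny it.

```python
import errno
from collections import deque


class HookChain:
    """Toy model of one LSM hook, e.g. bprm_check_security()."""

    def __init__(self):
        self.callbacks = []

    def register(self, callback):
        self.callbacks.append(callback)

    def call(self, ctx):
        # Each registered LSM gets a turn; the first denial wins.
        for callback in self.callbacks:
            rc = callback(ctx)
            if rc != 0:
                return rc
        return 0


# Stand-in for the buffer that a user-space daemon would read.
output_buffer = deque(maxlen=1024)
# Hypothetical denylist that a daemon could update at runtime.
denied_paths = {"/opt/old/vulnerable"}


def log_ld_preload(ctx):
    """Signal: report LD_PRELOAD, never block."""
    value = ctx["env"].get("LD_PRELOAD")
    if value is not None:
        output_buffer.append({"path": ctx["path"], "LD_PRELOAD": value})
    return 0


def deny_listed_binary(ctx):
    """Mitigation: refuse to execute denylisted binaries."""
    return -errno.EPERM if ctx["path"] in denied_paths else 0


attached_programs = [log_ld_preload, deny_listed_binary]


def krsi_hook(ctx):
    """The function KRSI registers; it runs the attached programs."""
    for program in attached_programs:
        rc = program(ctx)
        if rc != 0:
            return rc
    return 0


def other_lsm_hook(ctx):
    """Placeholder for another LSM on the same hook."""
    return 0


bprm_check_security = HookChain()
bprm_check_security.register(other_lsm_hook)
bprm_check_security.register(krsi_hook)

rc_logged = bprm_check_security.call(
    {"path": "/bin/true", "env": {"LD_PRELOAD": "/tmp/evil.so"}}
)
rc_denied = bprm_check_security.call({"path": "/opt/old/vulnerable", "env": {}})
```

The point of the model is that the signal (`log_ld_preload`) and the mitigation (`deny_listed_binary`) sit behind the same hook and the same interface, which is the unification that the talk argues for.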
The intent is that the helpers are "precise and granular". Unlike the BPF tracing API, they will not have general access to internal kernel data structures. His slides [PDF] had bpf_probe_read() in a circle with a slash through it as an indication of what he was trying to avoid. The idea is to maintain backward compatibility by not tying the helpers to the internals of a given kernel.
He then went through various alternatives for implementing this scheme and described the problems he saw with them. To start with, why not use audit? One problem is that the mitigations have to be handled separately. But there is also a fair amount of performance overhead when adding more things to be audited; he would back that up with some numbers later in the presentation. Also, audit messages have rigid formatting that must be parsed, which might delay how quickly a daemon could react.
Seccomp with BPF was up next. As he said earlier, security behaviors map more directly into LSM hooks than to system-call interception. He is also concerned about time-of-check-to-time-of-use (TOCTTOU) races when handling the system call parameters from user space, though he said he is not sure that problem actually exists.
Using kernel probes (kprobes) and eBPF was another possibility. It is a "very flexible" solution, but it depends on the layout of internal kernel data structures. That makes deployment hard as things need to be recompiled for each kernel that is targeted. In addition, kprobes is not a stable API; functions can be added and removed from the kernel, which may necessitate changes.
The final alternative was the Landlock LSM. It is geared toward providing a security sandbox for unprivileged processes, Singh said. KRSI, on the other hand, is focused on detecting and reacting to security-relevant behaviors. While Landlock is meant to be used by unprivileged processes, KRSI requires CAP_SYS_ADMIN to do its job.
Case study
He then described a case study: auditing the environment variables set when executing programs on a system. It sounds like something that should be easy to do, but it turns out not to be. For one thing, there can be up to 32 pages of environment variables, which he found surprising.
He looked at two different designs for an eBPF helper, one that would return all of the environment variables or one that just returned the variable of interest. The latter has less overhead, so it might be better, especially if there is a small set of variables to be audited. But either of those helpers could end up sleeping because of a page fault, which is something that eBPF programs are not allowed to do.
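The difference between the two designs can be sketched in user space. The code below assumes the environment is laid out as NUL-separated `NAME=value` strings, as it is in `/proc/<pid>/environ`; the function names `get_all_env` and `get_env_var` are hypothetical and are not the helpers from the KRSI patches. It also ignores the page-fault problem entirely, since the whole block is already in memory here.

```python
def iter_entries(block: bytes):
    """Yield each NUL-terminated entry without splitting the whole block."""
    start = 0
    while start < len(block):
        end = block.find(b"\0", start)
        if end == -1:
            end = len(block)
        if end > start:
            yield block[start:end]
        start = end + 1


def get_all_env(block: bytes) -> dict:
    """Design 1: return every variable; cost grows with the whole block."""
    env = {}
    for entry in iter_entries(block):
        name, sep, value = entry.partition(b"=")
        if sep:
            env.setdefault(name, value)  # keep the first occurrence
    return env


def get_env_var(block: bytes, name: bytes):
    """Design 2: return one variable and stop at the first match."""
    prefix = name + b"="
    for entry in iter_entries(block):
        if entry.startswith(prefix):
            return entry[len(prefix):]
    return None


example = b"PATH=/usr/bin\0LD_PRELOAD=/tmp/x.so\0HOME=/root\0"
```

With `example`, `get_env_var(example, b"LD_PRELOAD")` stops after the second entry, while `get_all_env(example)` always walks and copies everything, which is the overhead difference described above.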
Singh did some rough performance testing in order to ensure that KRSI was not completely unworkable, but the actual numbers need to be taken with a few grains of salt, he said. He ran a no-op binary 100 times and compared the average execution time (over N iterations of the test) of that on a few different systems: a kernel with audit configured out, a kernel with audit but no audit rules, one where audit was used to record execve() calls, and one where KRSI recorded the value of LD_PRELOAD. The first two were measured at a bit over 500µs (518 and 522), while the audit test with rules came in at 663µs (with a much wider distribution of values than any of the other tests). The rudimentary KRSI test clocked in at 543µs, which gave him reason to continue on; had it been a lot higher, he would have shelved the whole idea.
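To put those figures side by side, the short calculation below expresses each configuration as a percentage overhead relative to the kernel with audit configured out. It assumes that 518µs and 522µs correspond to the first two configurations in the order listed, which the text implies but does not state outright; the dictionary keys are labels made up for this example.

```python
# Average execution times in microseconds, as reported in the talk.
measurements_us = {
    "audit configured out": 518,  # assumed baseline
    "audit, no rules": 522,
    "audit, execve() rule": 663,
    "KRSI, LD_PRELOAD": 543,
}

baseline = measurements_us["audit configured out"]

# Percentage overhead of each configuration relative to the baseline.
overhead_pct = {
    name: round(100 * (value - baseline) / baseline, 1)
    for name, value in measurements_us.items()
}

for name, pct in overhead_pct.items():
    print(f"{name}: {measurements_us[name]} µs ({pct:+.1f}%)")
```

Under that assumption, the audit rule adds roughly 28% to the run time while the rudimentary KRSI program adds a little under 5%, though, as Singh cautioned, these are rough numbers.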
There are plenty of things that are up for discussion, he said. Right now, KRSI uses the perf ring buffer to communicate with user space; it is fast and eBPF already has a helper to access it. But that ring buffer is a per-CPU buffer, so it uses more memory than required, especially for systems with a lot of CPUs. There is already talk of allowing eBPF programs to sleep, which would simplify KRSI and allow it to use less memory. Right now, the LSM hook needs to pin the memory for use by the eBPF program. He is hopeful that discussions in the BPF microconference at the Linux Plumbers Conference will make some progress on that.
As part of the Q&A, Landlock developer Mickaël Salaün spoke up to suggest working together. He went through the same thinking about alternative kernel facilities that Singh presented and believes that Landlock would integrate well with KRSI. Singh said that he was not fully up-to-speed on Landlock but was amenable to joining forces if the two are headed toward the same goals.
[I would like to thank LWN's travel sponsor, the Linux Foundation, for funding to travel to San Diego for LSS-NA.]
Index entries for this article | |
---|---|
Kernel | BPF/Security |
Kernel | Security/Security modules |
Security | Linux kernel/BPF |
Security | Linux Security Modules (LSM) |
Conference | Linux Security Summit North America/2019 |
Posted Sep 4, 2019 20:40 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
Posted Sep 4, 2019 22:41 UTC (Wed)
by kpsingh (subscriber, #112411)
Also "antivirus" is an overloaded term, I guess you mean it as something that detects and prevents malicious activity based on known signals?
As to what could go wrong? We will find out when we try it :)
Posted Sep 4, 2019 23:23 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
> mitigation based on the audited data
Posted Sep 6, 2019 13:40 UTC (Fri)
by cpitrat (subscriber, #116459)
This could also be useful in honeypots.
Posted Sep 6, 2019 16:24 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
> Having ways to react automatically and limit attacker's possibilities is still useful.
Posted Sep 6, 2019 16:53 UTC (Fri)
by cpitrat (subscriber, #116459)
This is just one scenario. This seems like a flexible solution that allows for some interesting tools.
Posted Sep 6, 2019 16:54 UTC (Fri)
by cpitrat (subscriber, #116459)
Posted Sep 6, 2019 20:33 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
Posted Sep 6, 2019 16:58 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
> This is just one scenario. This seems like a flexible solution that allows for some interesting tools.
Posted Sep 6, 2019 18:19 UTC (Fri)
by cpitrat (subscriber, #116459)
Posted Sep 6, 2019 20:35 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
Posted Sep 7, 2019 16:47 UTC (Sat)
by kpsingh (subscriber, #112411)
Whether you choose to block the specific malicious activity or the host itself is a decision you can make.
Posted Sep 7, 2019 18:44 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
Posted Sep 7, 2019 18:58 UTC (Sat)
by kpsingh (subscriber, #112411)
PS: I don't intend to reply further if your communication stays unprofessional.
Posted Sep 7, 2019 19:07 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
All I see is handwaving like:
> There could be dynamic whitelists or blacklists of various sorts, for kernel modules that can be loaded, for instance, to prevent known vulnerable binaries from executing, or stopping binaries from loading a core library that is vulnerable to ensure that updates are done.
For me personally the last thing I want is more of SELinux-style security theater that _will_ inevitably break in various exciting ways.
Posted Sep 7, 2019 19:21 UTC (Sat)
by kpsingh (subscriber, #112411)
Patching / deleting a binary on a really huge number of servers cannot be done in seconds.
Can you audit environment variables with audit? No you cannot!
What do you need to do to add support? Change a lot of stuff, the policy language, auditd, parsers etc.
The development cycle for adding a new signal and then some new policy based on the signal, e.g. a permission error if you set the same environment variable twice, touches many components and this is an attempt to fix that.
Posted Sep 7, 2019 20:26 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
> Patching / deleting a binary on a really huge number of servers cannot be done in seconds.
> Can you audit environment variables with audit? No you cannot!
> The development cycle for adding a new signal and then some new policy based on the signal, e.g. a permission error if you set the same environment variable twice, touches many components and this is an attempt to fix that.
Posted Sep 7, 2019 20:52 UTC (Sat)
by kpsingh (subscriber, #112411)
>> Patching / deleting a binary on a really huge number of servers cannot be done in seconds.
It's got to do with your statement: "If you have a "vulnerable binary" then why the hell it's not deleted?"
> Do environment variables actually pose a significant threat to warrant a new full-blown, user-controlled arbitrary code injection facility on the critical paths? Can it itself be abused to create livelocks/deadlocks? Can an adversary use it to frustrate efforts to recover? ....
Environment variables is one use case where one needs to use a signal that audit does not currently provide. We are **not** talking about unprivileged eBPF here, it needs CAP_SYS_ADMIN and CAP_MAC_ADMIN. If privileged users want to shoot themselves in their feet, they have plenty other opportunities.
> The development cycle for adding a new signal and then some new policy based on the signal, e.g. a permission error if you set the same environment variable twice, touches many components and this is an attempt to fix that.
I disagree. It's about building defense in depth. The more hoops an attacker has to jump to attack you, the slower and harder it gets for them. Anyways, I am happy to hear if you have a constructive solution. Otherwise, this discussion is simply leading nowhere.
Posted Sep 7, 2019 22:04 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
> It's got to do with your statement: "If you have a "vulnerable binary" then why the hell it's not deleted?"
> Environment variables is one use case where one needs to use a signal that audit does not currently provide.
> I disagree. It's about building defense in depth. The more hoops an attacker has to jump to attack you, the slower and harder it gets for them. Anyways, I am happy to hear if you have a constructive solution. Otherwise, this discussion is simply leading nowhere.
Posted Sep 7, 2019 22:17 UTC (Sat)
by kpsingh (subscriber, #112411)
Feel free to go that route and suggest / make improvements to audit. Audit does not meet our other key requirement of having the MAC and signaling (auditing) possible with a single API, which is something that you are not constrained by (based on your comments)
Posted Sep 11, 2019 5:17 UTC (Wed)
by ssmith32 (subscriber, #72404)
Posted Sep 11, 2019 5:48 UTC (Wed)
by cpitrat (subscriber, #116459)
Otherwise, I can do it too:
Ok you said: "because you're knowingly providing connected resources to a bad actor." But anybody can have connected resources and it's very cheap. Look, I'm using one to answer you.
If you think about a botnet of honeypots, I think you're either overestimating the number of honeypots or their lifespan, or underestimating the number of hosts required for a useful botnet.
Posted Sep 11, 2019 4:54 UTC (Wed)
by ssmith32 (subscriber, #72404)
Yeah, cheap shot, but toooo easy. I'll be quiet now.
Posted Sep 8, 2019 6:48 UTC (Sun)
by jezuch (subscriber, #52988)
Posted Sep 8, 2019 7:08 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
Unfortunately, the patch authors don't seem to have nearly enough experience with that kind of stuff. Modern Windows antiviruses have multiple layers of defenses, they intrude into the very heart of the OS. Windows itself scans and checksums its internal control structures (PatchGuard, CodeIntegrity), and antiviruses tune it up to 11. Which is kinda awe inspiring - it's like watching CoreWar.
Yet it's still not enough. All the OS protections have been bypassed ( https://www.symantec.com/content/dam/symantec/docs/securi... ) and malware now routinely bypasses antiviruses. This is because attacks don't get worse, they always keep getting better.
Posted Sep 11, 2019 5:05 UTC (Wed)
by ssmith32 (subscriber, #72404)
Some rumblings about anti trust later, an API was provided, Symantec realized windows was a dying revenue stream, and you haven't seen much work in the area since. So it's a bit of an unknown.
Posted Sep 11, 2019 4:51 UTC (Wed)
by ssmith32 (subscriber, #72404)
{rant}
In other words, there are a few differences too... ;)
Posted Sep 11, 2019 7:51 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
The only valid mitigation for a detected intrusion is to bring down or isolate the host.
For example?
Then why not do it from the start?
1) A single critical server.
2) Accessible through the Internet.
3) To a script kiddie.
This seems like an overengineered solution for a non-problem.
I didn't say there was a single one. There can be multiple ones and they all get compromised at (more or less) the same time by the same person.
2) Accessible through the Internet.
If the service is available through the Internet, that's unavoidable. The server could have been exploited through the service it provides.
3) To a script kiddie.
See 2)
And so why does this need yet even more eBPF crap?
If you have a "vulnerable binary" then why the hell it's not deleted?
And so will be eBPF. There's also a low-hanging fruit of using BPF to JIT-compile the audit rules.
What does it have to do with audit slowness?
> What do you need to do to add support? Change a lot of stuff, the policy language, auditd, parsers etc.
Do environment variables actually pose a significant threat to warrant a new full-blown, user-controlled arbitrary code injection facility on the critical paths? Can it itself be abused to create livelocks/deadlocks? Can an adversary use it to frustrate efforts to recover? ....
I contend that none of this is even needed, as it's going to be useless and trivial to bypass.
^^^^^^^^^^^^^^^^^^^
We are doing performance comparisons and it's not.
> What does it have to do with audit slowness?
> I contend that none of this is even needed, as it's going to be useless and trivial to bypass.
Then improve it. Translate audit rules into BPF and run them.
How do you recognize that a binary was used for nefarious purposes?
Then extend it, rather than create a completely new system. Is there anything else that is not covered by audit subsystem and that is not a trivial addition?
The constructive solution is simple - improve audit subsystem instead of adding more eBPF.
> Then improve it. Translate audit rules into BPF and run them.
If you don't isolate the host, you're not being a jerk.
On the other hand, KRSI is not attempting to call out to central servers, nor is it maintaining a large and ever-growing blacklist in something referred to as a "memory-mapped" file, which actually refers to home brewed code designed to swap memory contents to disk. A list from which nothing can be removed, even if designed to detect old DOS floppy malware, out of fear of failing some dog & pony show that putatively somehow relates to how effective it is. Nor does it incorporate code from a wide variety of engineers that is completely unreviewed and never QA'd. Nor does it attempt to do whatever it can to insert itself wherever it can in kernel memory. And, oh yeah, is produced and managed solely for profit.
{\rant}
In case of Kaspersky Antivirus this VM actually is (or used to be as of ~5 years ago) position-independent x86 code that is checked to not include loops or external calls. Kinda like JIT-compiled eBPF :)