Who audits the audit code?

By Jonathan Corbet
May 29, 2014

The Linux audit subsystem is not one of the best-loved parts of the kernel. It allows the creation of a log stream documenting specific system events — system calls, modifications to specific files, actions by processes with certain user IDs, etc. For some, it is an ideal way to get a handle on what is being done on the system and, in particular, to satisfy various requirements for security certifications (Common Criteria, for example). For others, it is an ugly and invasive addition to the kernel that adds maintenance and runtime overhead without adding useful functionality. More recently, though, it seems that audit adds some security holes of its own. But the real problem, perhaps, is that almost nobody actually looks at this code, so bugs can lurk for a long time.

The system call auditing mechanism creates audit log entries in response to system calls; the system administrator can load rules specifying which system calls are to be logged. These rules can include various tests on system call parameters, but there is also a simple bitmask, indexed by system call number, specifying which calls might be of interest. One of the first things done by the audit code is to check the appropriate bit for the current system call to see if it is set; if it is not, there is no auditing work to be done.

Philipp Kern recently noticed a little problem with how that code works with the x32 ABI. When code running under that ABI invokes a system call, it does not use the normal system call numbers defined by the x86 architecture; instead, x32 system calls (which require compatibility handling for some parameters) are marked by setting an additional bit (0x40000000) in that number. The audit code fails to remove that bit before checking the system call number in its bitmask; as one might imagine, the results are not as one might wish. Philipp included a patch to strip out the x32 bit, but it turns out that the problem is a bit bigger than that.

Andy Lutomirski, in looking at Philipp's patch, realized that the code wasn't just failing to strip out one bit; there are, in fact, no bounds checks on the system call number at all. User space can pass in any system call number it wants, and the kernel will use that number to index into its bitmask array; the result for a sufficiently large system call number is a predictable kernel oops. Andy also suggested that this failure could be used to determine the value of specific bits in kernel space, leading to an information-disclosure vulnerability.
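
To make the class of bug concrete, here is a minimal user-space sketch of a per-syscall bitmask lookup that strips the x32 marker bit and bounds-checks the result before indexing. It is not the kernel's audit code or the patch that was submitted; the names rule_mask, rule_mask_set, rule_mask_test, X32_SYSCALL_BIT, and the 64-word mask size are assumptions made for illustration. Only the 0x40000000 value comes from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Marker bit for x32 system calls, as described in the article. */
#define X32_SYSCALL_BIT 0x40000000u
/* Hypothetical mask size: 64 words of 32 bits covers 2048 syscall numbers. */
#define MASK_WORDS   64u
#define MAX_SYSCALLS (MASK_WORDS * 32u)

/* Hypothetical stand-in for the "which syscalls are interesting" bitmask. */
struct rule_mask {
    uint32_t words[MASK_WORDS];
};

/* Trusted path: the administrator loads a rule for syscall number nr. */
static void rule_mask_set(struct rule_mask *m, unsigned int nr)
{
    assert(nr < MAX_SYSCALLS);
    m->words[nr / 32u] |= UINT32_C(1) << (nr % 32u);
}

/*
 * Untrusted path: raw_nr comes straight from user space, so it must be
 * normalized and range-checked before it is used as an array index.
 */
static bool rule_mask_test(const struct rule_mask *m, unsigned int raw_nr)
{
    unsigned int nr = raw_nr & ~X32_SYSCALL_BIT;  /* drop the x32 marker */

    if (nr >= MAX_SYSCALLS)                       /* the missing bounds check */
        return false;
    return (m->words[nr / 32u] >> (nr % 32u)) & 1u;
}
```

Without the range check, an arbitrary raw_nr would index far outside words[], which is the out-of-bounds read described above. Whether x32 and native calls should share one table after the bit is stripped is a design question this sketch glosses over.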

Andy submitted a patch to fix this particular problem, but he didn't stop there. He has come to the conclusion that the audit subsystem is beyond repair, so his patch marks the whole thing as being broken, making it generally inaccessible. He cited a number of problems beyond this security issue: it hurts performance even when it is not being used, it is not (in his mind) reliable, it has problems with various architectures, and "its approach to freeing memory is terrifying". All told, Andy said, we're better off without it:

In summary, the code is a giant mess. The way it works is nearly incomprehensible. It contains at least one severe bug. I'd love to see it fixed, but for now, distributions seem to think that enabling CONFIG_AUDITSYSCALL is a reasonable thing to do, and I'd argue that it's actually a terrible choice for anyone who doesn't actually need syscall audit rules. And I don't know who needs these things.

It is unsurprising that Eric Paris, who maintains the audit code, disagrees with this assessment. His point of view is that this is just another bug in need of fixing; it does not indicate any systemic problem with the audit code.

It is telling, though, that this particular vulnerability has existed in the audit subsystem almost since its inception. The audit code receives little in the way of review; most kernel developers simply turn it off for their own kernels and look the other way. But this subsystem is just the sort of thing that distributors are almost required to enable in their kernels; some users will want it, so they have to turn it on for everybody. As a result, almost all systems out there have audit enabled (look for a running kauditd thread), even though few of them are using it. These systems take a performance penalty just for having audit enabled, and they are vulnerable to any issues that may be found in the audit code.

If audit were to be implemented today, the developer involved would have to give some serious thought, at least, to using the tracing mechanism. It already has hooks applied in all of the right places, but those hooks have (almost) zero overhead when they are not enabled. Tracing has its own filtering mechanism built in; the addition of BPF-based filters will make that feature more capable and faster as well. In a sense, the audit subsystem contains yet another kernel-based virtual machine that makes decisions about which events to log; using the tracing infrastructure would allow the removal of that code and a consolidation to a single virtual machine that is more widely maintained and reviewed.

The audit system we have, though, predates the tracing subsystem, so it could not have been based on tracing. Replacing it without breaking users would not be a trivial job, even in the absence of snags that have been glossed over in the above paragraph (and such snags certainly exist). So we are likely stuck with the current audit subsystem (which will certainly not be marked "broken" in the mainline kernel) for the foreseeable future. Hopefully it will receive some auditing of its own just in case there are more old surprises lurking therein.


Fact checking

Posted May 30, 2014 6:50 UTC (Fri) by bnorris (subscriber, #92090) [Link] (10 responses)

> As a result, almost all systems out there have audit enabled

$ grep CONFIG_AUDIT /boot/config-`uname -r`
CONFIG_AUDIT_ARCH=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_AUDIT_WATCH=y
CONFIG_AUDIT_TREE=y

> (look for a running kauditd thread)

None here.

> even though few of them are using it. These systems take a performance penalty just for having audit enabled, and they are vulnerable to any issues that may be found in the audit code.

I'm not an expert on the kaudit subsystem (in fact, I just learned of it), but it looks like kauditd is only spawned in response to a user-space request for it (e.g. from SELinux auditd). See kernel/audit.c:

static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
{
    [...]
    /* As soon as there's any sign of userspace auditd,
     * start kauditd to talk to it */
    if (!kauditd_task) {
        kauditd_task = kthread_run(kauditd_thread, NULL, "kauditd");
    [...]
}

So it looks like my Ubuntu system doesn't have any overhead from kauditd; just the overhead of listening (likely low?).

Now, I don't know what other kinds of overhead CONFIG_AUDIT* might have besides kauditd, but I am at least doubtful of these claims now.

Fact checking

Posted May 30, 2014 13:03 UTC (Fri) by alonz (subscriber, #815) [Link] (1 responses)

According to the posting by Andy, the effect of CONFIG_AUDITSYSCALL is:

It forces all syscalls into the slow path and it can do crazy things
like building audit contexts just in case actual handling of the
syscall triggers an audit condition so that the exit path can log the
syscall.  That's way worse than a single branch.

Try it: benchmark getpid on Fedora and then repeat the experiment with
syscall auditing fully disabled.  The difference is striking.
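
One rough way to run that experiment is a tight loop around the raw getpid system call. The sketch below is an assumption about how such a benchmark might look, not the test Andy ran; the function names ns_per_getpid and print_getpid_cost are made up for illustration. It uses syscall(SYS_getpid) rather than getpid() so that any caching in the C library cannot hide the kernel entry cost.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average wall-clock nanoseconds per raw getpid system call. */
static double ns_per_getpid(long iterations)
{
    struct timespec start, end;

    assert(iterations > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getpid);          /* always enters the kernel */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}

/* Print the measured cost; call this from main() in a small test program. */
static void print_getpid_cost(long iterations)
{
    printf("%.1f ns per getpid\n", ns_per_getpid(iterations));
}
```

Calling print_getpid_cost(10000000) once with syscall auditing active and once with it disabled gives the two numbers to compare. The absolute values depend heavily on the CPU and kernel, so only the difference on the same machine is meaningful.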

Fact checking

Posted May 30, 2014 18:07 UTC (Fri) by bnorris (subscriber, #92090) [Link]

Right, I wasn't really objecting to the statement that the audit subsystem causes syscall overhead by default (I haven't run enough tests to confirm/deny). I was just objecting to the wording of this article; it points users to the 'kauditd' kernel thread, which AFAICT does not necessarily run by default. This is potentially misleading.

Fact checking

Posted May 30, 2014 17:16 UTC (Fri) by luto (guest, #39314) [Link] (7 responses)

Try running your favorite syscall-heavy workload and seeing if __audit_syscall_xyz shows up.

To be clear: I don't really object to CONFIG_AUDIT -- it's just CONFIG_AUDITSYSCALL. Once audit has been enabled, you're stuck with syscall auditing overhead until the next reboot. There's a workaround:

# auditctl -a task,never

I'm currently lobbying for Fedora to turn off syscall auditing in their default configuration:

https://fedorahosted.org/fesco/ticket/1311

Fact checking

Posted Jun 1, 2014 5:13 UTC (Sun) by dirtyepic (guest, #30178) [Link] (6 responses)

Isn't AUDITSYSCALL required for consolekit to work properly?

Fact checking

Posted Jun 1, 2014 5:21 UTC (Sun) by luto (guest, #39314) [Link]

Wow. That's hideous.

I have no clue what loginuid and sessionid (which appears to be completely unrelated to the POSIX session id) have to do with syscall auditing. It would be easy to split that out from CONFIG_AUDITSYSCALL, since it seems to be almost completely unrelated to syscall auditing, other than the fact that syscall auditing logs the loginuid and sessionid.

Fact checking

Posted Jun 3, 2014 19:48 UTC (Tue) by zdzichu (guest, #17118) [Link] (4 responses)

ConsoleKit is so dead and so irrelevant today, it's not worth a dime.

Fact checking

Posted Jun 4, 2014 4:24 UTC (Wed) by dirtyepic (guest, #30178) [Link] (3 responses)

I can't argue that, but it's disappointing. Systemd isn't really an option here (in fact it's proven impossible for us to make any kind of big change - we're still on cvs FFS).

Fact checking

Posted Jun 14, 2014 9:19 UTC (Sat) by Duncan (guest, #6647) [Link] (2 responses)

Systemd requires it too (or at least gentoo's systemd ebuild checks for it and warns if the thing isn't enabled), probably for exactly the same thing consolekit uses it for, if not more.

But no sign of kauditd. I might just try disabling CONFIG_AUDIT or at least CONFIG_AUDITSYSCALL next time I do a kernel build and see if systemd can run properly with it disabled. I've wondered why I actually needed it ever since I first enabled it. At least that way if I have to actually reenable it, I'll know exactly what I was unbreaking by doing so.

Fact checking

Posted Jun 14, 2014 13:56 UTC (Sat) by Duncan (guest, #6647) [Link] (1 responses)

Well, whatever systemd "requires" CONFIG_AUDIT for, doesn't seem to affect me turning it off. I rebuilt without it, rebooted, logged in, started X... launched a terminal window and zgrepped /proc/config.gz for CONFIG_AUDIT to verify I hadn't somehow booted the old kernel... Loaded up firefox and LWN again to write this update...

Still don't know what systemd "requires" it for, but it's off now and doesn't seem to hurt me, so... I'm leaving it off.

Fact checking

Posted Jun 16, 2014 11:13 UTC (Mon) by cortana (subscriber, #24596) [Link]

It's funny because systemd's README explicitly instructs users who want to run systemd inside a container to disable the audit subsystem.

Who audits the audit code?

Posted May 30, 2014 15:12 UTC (Fri) by and (guest, #2883) [Link]

> If audit were to be implemented today, the developer involved would have to give some serious thought, at least, to using the tracing mechanism.

Wouldn't it be "kind of straightforward" to rip out the current audit subsystem and replace it with a compatibility layer that maps everything it exposes to user space onto the tracing subsystem?