BPF will have terrible performance, severe limitations and require some sort of ad-hoc compiler.
Simply allowing kernel modules to filter syscalls is a far simpler and faster approach.
Compilation is not a problem because systems like DKMS can automatically compile kernel modules as needed, and Chromium/Firefox/whatever can just install a DKMS module package.
Not to mention that having a proper unified security and mitigation model would be even better.
If this isn't yet in the mainline kernel, I hope Linus never accepts this.
Posted Mar 26, 2012 16:47 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
So you're going to put GCC and full kernel headers on a ChromeOS netbook? And what about dynamic policies? For example, I want to create a sandbox and allow it to access '/home/myname/workarea'. How would you do it?
Using BPF to filter syscalls is a stroke of genius. BPF is already used in heavy-duty network filtering code (hey, do you think that iptables are slow?) and it has a simple JIT to work even faster.
Besides, in typical seccomp configurations you won't get a lot of syscalls.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 16:59 UTC (Mon) by slashdot (guest, #22014)
Just ship the compiled kernel module in the ChromeOS distribution then.
DKMS is for those building their own kernels.
> For example, I want to create a sandbox and allow it to access '/home/myname/workarea'. How would you do it?
Use filesystem namespaces or chroot, not syscall filtering (that's under "proper unified security model").
Syscall filtering should be used ONLY to mitigate potential kernel bugs by reducing the attack surface, and certainly not for providing security.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 17:02 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
>Use filesystem namespaces or chroot, not syscall filtering (that's under "proper unified security model").
Both require root access to set up. Not acceptable.
Besides, chroot would require a lot of "mount --bind" magic.
>Syscall filtering should be used ONLY to mitigate potential kernel bugs by reducing the attack surface, and certainly not for providing security.
Why? Syscalls are the primary method of talking with the kernel. It's kinda logical to add filtering there, at the topmost level.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 17:08 UTC (Mon) by slashdot (guest, #22014)
> Why? Syscalls are the primary method of talking with the kernel. It's kinda logical to add filtering there, at the topmost level.
Not at all, because syscalls don't directly map to operations to secure.
For example, filtering access to a path requires hooking dozens of syscalls, reconstructing paths in syscalls such as openat(), handling ioctls that might take paths, and so on.
Of course, then there are race conditions if you just filter, and you actually need to "reissue" syscalls somehow, at tremendous performance cost.
There's simply no way to do it properly, and that's why Linux provides the LSM interface to do that.
Using system call filtering to provide security (and not just mitigation of bugs) is not just insane, it's totally broken.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 17:15 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
You don't get it.
Seccomp is not going to be used to sandbox arbitrary executables. It's used to sandbox *your* *own* code, where you *know* which access patterns you'll need to support.
For example, for a JavaScript interpreter it would be "can read anything, but can write only to descriptors passed from the parent". This kind of restriction CAN NOT be expressed using chroot/namespaces. It can be done using SELinux, probably, but only SELinux developers could write policy for this.
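A minimal sketch of what such a rule could look like as a classic BPF program, evaluated here by a tiny user-space interpreter rather than by the kernel. The opcode values, the seccomp_data layout, and the return codes follow the kernel headers; the x86-64 syscall numbers, the three-opcode interpreter, the names run_filter and FILTER, and the descriptor number 5 are assumptions made for illustration. "Read anything" is narrowed here to "read() on any descriptor".

```python
import struct

# Classic BPF opcodes (values as in <linux/filter.h>)
LD_W_ABS = 0x20   # BPF_LD | BPF_W | BPF_ABS: A = 32-bit word at offset k
JEQ_K = 0x15      # BPF_JMP | BPF_JEQ | BPF_K: skip jt insns if A == k, else jf
RET_K = 0x06      # BPF_RET | BPF_K: return the constant k

SECCOMP_RET_KILL = 0x00000000
SECCOMP_RET_ALLOW = 0x7FFF0000
AUDIT_ARCH_X86_64 = 0xC000003E
SYS_read, SYS_write, SYS_exit = 0, 1, 60   # x86-64 syscall numbers

ALLOWED_FD = 5    # hypothetical descriptor passed down by the parent


def seccomp_data(nr, args, arch=AUDIT_ARCH_X86_64, ip=0):
    """Pack what the filter sees: nr, arch, instruction pointer, args[6]."""
    args = list(args) + [0] * (6 - len(args))
    return struct.pack("<iIQ6Q", nr, arch, ip, *args)


def run_filter(prog, data):
    """Interpret the three opcodes above and return the filter's verdict."""
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == LD_W_ABS:
            acc = struct.unpack_from("<I", data, k)[0]
        elif code == JEQ_K:
            pc += jt if acc == k else jf
        elif code == RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


FILTER = [
    (LD_W_ABS, 0, 0, 4),                  # 0: A = arch
    (JEQ_K, 1, 0, AUDIT_ARCH_X86_64),     # 1: expected ABI -> continue at 3
    (RET_K, 0, 0, SECCOMP_RET_KILL),      # 2: unexpected ABI -> kill
    (LD_W_ABS, 0, 0, 0),                  # 3: A = syscall number
    (JEQ_K, 4, 0, SYS_read),              # 4: read  -> allow (insn 9)
    (JEQ_K, 3, 0, SYS_exit),              # 5: exit  -> allow (insn 9)
    (JEQ_K, 0, 3, SYS_write),             # 6: write -> check fd, else kill
    (LD_W_ABS, 0, 0, 16),                 # 7: A = low 32 bits of args[0]
    (JEQ_K, 0, 1, ALLOWED_FD),            # 8: allowed fd -> allow, else kill
    (RET_K, 0, 0, SECCOMP_RET_ALLOW),     # 9
    (RET_K, 0, 0, SECCOMP_RET_KILL),      # 10
]
# A real filter would also compare the upper 32 bits of args[0] (offset 20).
```

With this sketch, run_filter(FILTER, seccomp_data(SYS_write, [ALLOWED_FD])) yields SECCOMP_RET_ALLOW, while a write to any other descriptor, or any syscall not listed, yields SECCOMP_RET_KILL.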
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 22:16 UTC (Mon) by aliguori (subscriber, #30636)
Syscall filtering is extremely hard if you're trying to implement some sort of access control mechanism. SELinux will be much easier to use to implement this.
There are really only two sane ways to use syscall filtering: as a slightly more powerful mode 1 where you allow a few more obviously safe calls, like select(), or as a mechanism to reduce the kernel's attack surface.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 23:17 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
Yup. Seccomp is mostly useful for sandboxing specific pieces of code; it's not useful as a full-system security solution.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 23:24 UTC (Mon) by dpquigl (subscriber, #52852)
For people interested in why this is the case, Robert Watson published a good paper on the topic back in 2007. Link found below [1].
Posted Mar 27, 2012 11:26 UTC (Tue) by Da_Blitz (guest, #50583)
If I have been following correctly, the point of using BPF and doing this in the kernel was to defeat the attack described in the PDF you linked.
The PDF exploits the fact that a syscall wrapper has to do some policy work before the data is copied and the syscall is performed; it relies on another thread changing the data behind the syscall after the check has run but before the syscall is executed.
By doing it on the kernel side, I am assuming that things can't be changed, since the values are passed in registers on most platforms and the BPF checks only look at the syscall's argument values, not at any memory they may point to in the case of pointers.
So it is safe because it is limited in scope (coarse-grained syscall blocking, i.e. specific syscalls and perhaps an arg or two). Section 8.3 also indicates that this attack can be mitigated by using an in-kernel system.
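The race described above can be sketched deterministically, without real threads. This is an illustration under assumptions: the dictionary standing in for user memory, the address 0x1000, the function names, and the x86-64 syscall numbers are all hypothetical, and the "second thread" is just an assignment placed between the check and the use.

```python
# Simulated user memory: one pointer-sized address holding a path string.
memory = {0x1000: "/home/myname/workarea/notes.txt"}
ptr = 0x1000
SYS_open = 2  # x86-64 syscall number, used only as an example


def wrapper_path_ok(mem, p):
    """Wrapper-style policy: dereferences the pointer to inspect the path."""
    return mem[p].startswith("/home/myname/workarea/")


def register_filter(nr, args, allowed=frozenset({0, 1, 2, 60})):
    """seccomp-style policy: looks only at the number and raw argument values."""
    return nr in allowed


# 1. The wrapper checks the path and approves the call.
checked = wrapper_path_ok(memory, ptr)
verdict_before = register_filter(SYS_open, (ptr, 0, 0))

# 2. A second thread rewrites the buffer after the check.
memory[ptr] = "/etc/shadow"

# 3. The kernel copies the path in when the syscall actually runs.
opened = memory[ptr]
verdict_after = register_filter(SYS_open, (ptr, 0, 0))

print("wrapper approved:", checked, "but kernel opened:", opened)
print("register-only verdict unchanged:", verdict_before == verdict_after)
```

The register-only check cannot be raced this way because it never reads the buffer, which is also why it cannot express a path-based policy at all.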
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 19:56 UTC (Mon) by aliguori (subscriber, #30636)
We are planning on using seccomp in QEMU specifically for mitigating against kernel bugs. This is in addition to using SELinux (via sVirt) to provide security (beyond that provided from running as a non-privileged user).
I agree that syscall filtering is strictly to reduce the kernel's attack surface. Access control should be done via an LSM module like SELinux.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 19:16 UTC (Mon) by scientes (guest, #83068)
> BPF will have terrible performance, severe limitations and require some sort of ad-hoc compiler.
Posted Mar 26, 2012 19:50 UTC (Mon) by slashdot (guest, #22014)
There's no chance that an in-kernel JIT compiler for a limited language can produce code comparable to gcc -O2.
Also, BPF programs need to be heavily restricted for security reasons, while kernel modules can look at kernel data structures, keep state, allocate memory, use lookup tables, etc.
Of course, for tcpdump and similar, the need to allow unprivileged users to input arbitrary expressions and instantly get results trumps all other considerations, but sandboxing doesn't have this need.
Cook: seccomp filter now in Ubuntu
Posted Mar 26, 2012 20:12 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)
BPF is a VERY simple language, it translates naturally to machine code. BPF JIT doesn't need to keep state, complex data structures and so on.
And BPF programs _by_ _design_ can not be used to attack the kernel. Simply because they don't allow arbitrary expressions, only a safe verifiable subset.
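To make "a safe verifiable subset" concrete, here is a rough sketch of the kind of static checks that can be run on a classic BPF program before it is accepted. It is loosely modeled on what the kernel does, not a copy of it: the function name check_filter and the restriction to three opcodes are assumptions for illustration. The key property is that jump offsets are unsigned, so control flow only moves forward and every program terminates.

```python
LD_W_ABS = 0x20   # A = 32-bit word at offset k
JEQ_K = 0x15      # conditional forward jump
RET_K = 0x06      # return constant k

BPF_MAXINSNS = 4096        # upper bound on program length
SECCOMP_DATA_SIZE = 64     # nr + arch + instruction pointer + 6 args


def check_filter(prog):
    """Return True if prog passes the static checks in this sketch."""
    if not prog or len(prog) > BPF_MAXINSNS:
        return False
    for pc, (code, jt, jf, k) in enumerate(prog):
        if code == LD_W_ABS:
            # Loads must be word-aligned and stay inside seccomp_data.
            if k % 4 or k + 4 > SECCOMP_DATA_SIZE:
                return False
        elif code == JEQ_K:
            # Offsets are unsigned 8-bit values: forward only.
            if not (0 <= jt <= 255 and 0 <= jf <= 255):
                return False
            # Both targets must land inside the program.
            if pc + 1 + max(jt, jf) >= len(prog):
                return False
        elif code == RET_K:
            pass
        else:
            return False   # anything outside the subset is rejected
    # Falling off the end is impossible if the last instruction returns.
    return prog[-1][0] == RET_K
```

A program that loads the syscall number, compares it, and returns passes; one whose jump lands past the end, or that reads beyond the 64-byte seccomp_data, is rejected before it ever runs.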
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 2:21 UTC (Tue) by kevinm (guest, #69913)
gcc -O2 isn't an option, because what we're talking about here is some arbitrary application, potentially running as a non-root user, telling the kernel "these are the patterns for the system calls I make; don't let me make any others". The application in most cases doesn't even have permissions to load a kernel module.
The idea would be for the original author of the application to write the BPF code, not the system administrator.
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 12:01 UTC (Tue) by slashdot (guest, #22014)
Applications are installed by the package manager running as root, and thus can install kernel modules perfectly fine.
Yes, you lose the ability for the unprivileged user to install random syscall filters, but does it matter?
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 14:51 UTC (Tue) by renox (subscriber, #23785)
> Yes, you lose the ability for the unprivileged user to install random syscall filters, but does it matter?
Yes, it matters if installing an application implies installing a kernel module.
Cook: seccomp filter now in Ubuntu
Posted Mar 28, 2012 17:51 UTC (Wed) by nix (subscriber, #2304)
Yes, definitely. Applications may know that they expect different sets of syscalls at different times. With a BPF syscall filter, they can express this by switching filters at runtime. You cannot possibly expect them to unload and reload kernel modules at runtime (not least because the sorts of programs this is targeted at -- things like, well, Chromium -- aren't going to be running as root anyway.)
Cook: seccomp filter now in Ubuntu
Posted Mar 28, 2012 19:14 UTC (Wed) by dpquigl (subscriber, #52852)
Wait so you're telling me that you can deregister and reregister a filter? What stops me from dropping my own custom filter in an exploit and installing the new filter that says I have everything? This needs to be something that can only be done once per process invocation.
Cook: seccomp filter now in Ubuntu
Posted Mar 28, 2012 19:58 UTC (Wed) by Cyberax (✭ supporter ✭, #52523)
>What stops me from dropping my own custom filter in an exploit and installing the new filter that says I have everything? This needs to be something that can only be done once per process invocation.
The _parent_ process can start children with arbitrary filters. Children can't override filters (in fact, they are _forced_ to have NNP flag set).
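A toy model of why a later "allow everything" filter does not help an attacker, assuming the design in which filters can only be added, are inherited across fork, and are all evaluated with the most restrictive result winning. The Process class and its method names are hypothetical; only the two return values are taken from the kernel headers, where the lower value is the more restrictive one.

```python
SECCOMP_RET_KILL = 0x00000000
SECCOMP_RET_ALLOW = 0x7FFF0000


class Process:
    """Hypothetical model of a task's filter list (append-only)."""

    def __init__(self, filters=()):
        self.filters = tuple(filters)

    def install(self, new_filter):
        # Filters can be added; there is deliberately no removal method.
        self.filters += (new_filter,)

    def fork(self):
        # The child starts with everything the parent had installed.
        return Process(self.filters)

    def verdict(self, nr):
        # Every filter runs; the lowest (most restrictive) value wins.
        return min((f(nr) for f in self.filters), default=SECCOMP_RET_ALLOW)


def only_read_write_exit(nr):
    return SECCOMP_RET_ALLOW if nr in (0, 1, 60) else SECCOMP_RET_KILL


def allow_everything(nr):
    return SECCOMP_RET_ALLOW


parent = Process()
parent.install(only_read_write_exit)
child = parent.fork()
child.install(allow_everything)   # what an exploit in the child might try
```

In this model child.verdict(59) is still SECCOMP_RET_KILL after the second install, because the original filter keeps voting on every call.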
Cook: seccomp filter now in Ubuntu
Posted Mar 28, 2012 21:34 UTC (Wed) by dpquigl (subscriber, #52852)
That doesn't address the issue that if there is an exploit in the parent process, I can have it install a new filter. The process itself is what installs the filter. Also, from your description here it seems that if you put a filter in bash then no process executed from a shell could use filters. Maybe I'm missing something here. The NNP flag seems completely disjoint from seccomp filtering.
Cook: seccomp filter now in Ubuntu
Posted Mar 28, 2012 23:51 UTC (Wed) by khc (subscriber, #45209)
Or you can just run the exploit code in the parent process, if you have already exploited the parent process why bother with the child process?
The assumption is the child process is the one that's loading untrusted data, and so is more likely to be exploitable.
Cook: seccomp filter now in Ubuntu
Posted Mar 29, 2012 0:12 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
khc has already answered about exploiting the parent process.
NNP flag is a prerequisite for BPF filtering to avoid repeating the infamous Sendmail bug.
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 7:40 UTC (Tue) by rvfh (subscriber, #31018)
I think you are trolling and don't know what you are talking about, as shown by the answers you are receiving.
I wish you would stop behaving like this, and stop thinking that other people are stupid and you are so much smarter.
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 7:55 UTC (Tue) by gowen (guest, #23914)
The guy's username is "slashdot". I'd assumed it was a self-evident parody account, sending up someone who's not nearly as well informed as they believe themselves to be.
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 8:45 UTC (Tue) by robert_s (subscriber, #42402)
Ah suddenly it all makes sense.
Cook: seccomp filter now in Ubuntu
Posted Mar 27, 2012 13:27 UTC (Tue) by Jannes (subscriber, #80396)
Finally someone who says it! I was starting to think it was just me going crazy from all the randomly combined computer-related terms used to disguise the weirdest bold assertions.
I hope we don't need a user filter or moderation system.