Capsicum: practical capabilities for UNIX
The Capsicum capabilities framework has been around for a couple of years now, and support for it was added to the recent FreeBSD 9.0 release. Capsicum takes a very different approach from other capabilities systems (like Linux capabilities or POSIX capabilities), and is geared toward sandboxing applications to limit the damage that can be caused by buggy or misbehaving programs. While the FreeBSD support is "experimental", it is available for researchers and others to try out.
Capsicum came out of a collaboration between the University of Cambridge's Computer Laboratory and Google. That resulted in a prototype implementation for FreeBSD along with modification of several different programs to take advantage of Capsicum. One of the main applications of interest is the Chromium web browser, but several FreeBSD utilities (tcpdump, dhclient, and gzip) were also converted, as described in the Capsicum paper [PDF].
The idea behind Capsicum is to extend the standard Unix APIs by adding ways that applications can "self-compartmentalize". Essentially, applications can choose to restrict themselves to a sandbox that will disallow many "dangerous" operations, while still allowing them to get their job done via the capabilities they allow for themselves or those that are passed in using special file descriptors (which are also, perhaps unfortunately, called capabilities). It is, in some ways, conceptually similar to programs that drop their privileges using the setuid() call but, instead of being restricted to what a particular user is allowed to do (which is often far more than the application needs), Capsicum allows much finer-grained control over what restrictions are in place.
The starting point for a Capsicum-enabled process is the new cap_enter() system call. This is a one-way gate that puts a process and any subsequent children into "capability mode". It turns off "ambient authority", which is a term for the normal Unix process model where a process has all of the permissions of the UID it is running as. Capability mode restricts access to any of the global namespaces, like the filesystem namespace, PID namespace, network namespace, and others. Any system calls that operate on these global namespaces are either disallowed entirely, or their arguments are constrained.
For example, the sysctl() call is constrained so that only around 30 of the roughly 3000 system parameters can be examined through it. The shared memory creation call, shm_open(), is only allowed to create anonymous memory objects, while the openat() family is restricted to allow access to files at or below the directory file descriptor passed in (by essentially disallowing "/" or ".." at the start of the path). Capability mode also brings some miscellaneous restrictions, including disallowing the loading of kernel modules and the execution of setuid and setgid binaries.
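The openat() restriction can be pictured with a small user-space model. The Python sketch below is not the Capsicum implementation, which enforces the rule inside the kernel; it only mimics the idea. The helper names (`is_below_directory`, `open_below`) are hypothetical, and the sketch assumes a slightly stricter rule than the one described above: it rejects a ".." component anywhere in the path, not just at the start. It also does nothing about symbolic links.

```python
import os
from pathlib import PurePosixPath


def is_below_directory(path: str) -> bool:
    """Return True if a relative path cannot climb out of its base directory.

    Toy rule (an assumption of this sketch): no absolute paths and no ".."
    components anywhere. Symbolic links are not considered.
    """
    if not path:
        return False
    parsed = PurePosixPath(path)
    if parsed.is_absolute():
        return False
    return ".." not in parsed.parts


def open_below(dir_fd: int, path: str, flags: int = os.O_RDONLY) -> int:
    """Open path relative to dir_fd, refusing paths that fail the toy rule."""
    if not is_below_directory(path):
        raise PermissionError(f"path escapes the directory: {path!r}")
    # os.open(..., dir_fd=...) performs an openat()-style relative lookup.
    return os.open(path, flags, dir_fd=dir_fd)
```

A caller would open the directory once, keep only that descriptor, and route every later lookup through `open_below()`.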
Capsicum wraps normal file descriptors with additional capability information that restricts what can be done with the file. If a capability file descriptor has the CAP_READ capability, that's all that can be done to it, unlike a file descriptor for a file that is opened read-only which can still be used to make metadata changes (via fchmod() for example). In order to change positions in the file, the CAP_SEEK capability is required. A capability file descriptor can also wrap a directory file descriptor, which allows the capability set to be applied to all members of that directory. That would allow Apache to set up workers that only have access to a certain subset of the web directory hierarchy, or for a sandboxed application to access a library path, for example.
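As a rough illustration of how a rights mask on a descriptor behaves, here is a toy Python model. It is not the Capsicum API: the `Rights` and `CapabilityFd` classes are hypothetical, the check happens in user space rather than in the kernel, and `CAP_FCHMOD` is a right name assumed here for the fchmod() case mentioned above. A missing right is reported with `PermissionError`, which is a choice of this sketch.

```python
import enum
import os


class Rights(enum.Flag):
    """Toy rights mask; the names follow Capsicum, the values are arbitrary."""
    CAP_READ = enum.auto()
    CAP_SEEK = enum.auto()
    CAP_FCHMOD = enum.auto()  # assumed name for the fchmod() right


class CapabilityFd:
    """Wrap a file descriptor and check a rights mask before each operation."""

    def __init__(self, fd: int, rights: Rights) -> None:
        self._fd = fd
        self._rights = rights

    def _require(self, needed: Rights) -> None:
        if needed not in self._rights:
            raise PermissionError(f"missing right: {needed!r}")

    def read(self, size: int) -> bytes:
        self._require(Rights.CAP_READ)
        return os.read(self._fd, size)

    def seek(self, offset: int) -> int:
        self._require(Rights.CAP_SEEK)
        return os.lseek(self._fd, offset, os.SEEK_SET)

    def chmod(self, mode: int) -> None:
        self._require(Rights.CAP_FCHMOD)
        os.fchmod(self._fd, mode)
```

With only `Rights.CAP_READ`, `read()` succeeds while `seek()` and `chmod()` are refused, even though the underlying descriptor would allow both.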
The capability file descriptors can be already open at the time that cap_enter() is called (and wrapped by a set of capabilities specified in an earlier cap_new() call) or passed to the process using Unix sockets. That means that a fairly simple program can decrease its ability to cause harm by setting up the file descriptors it needs and then calling cap_enter() before performing more "dangerous" operations. The tcpdump example given in the paper is instructive, as it simply enters capability mode after setting up the packet filter (which is a privileged operation), but before entering the processing loop. That way, errors in the packet decoding code are very limited in the kind of damage they can cause.
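Passing a descriptor over a Unix socket is a standard facility that predates Capsicum. The sketch below shows the mechanism using Python's `socket.send_fds()` and `socket.recv_fds()` (available on Unix from Python 3.9). The function names `send_descriptor` and `receive_descriptor` are hypothetical, and the descriptor sent here is an ordinary one; under Capsicum it would be a capability, which this sketch does not model.

```python
import os
import socket


def send_descriptor(sock: socket.socket, fd: int) -> None:
    """Send one open file descriptor over a connected Unix socket."""
    # At least one byte of ordinary data must accompany the descriptor.
    socket.send_fds(sock, [b"F"], [fd])


def receive_descriptor(sock: socket.socket) -> int:
    """Receive one file descriptor sent by send_descriptor()."""
    _data, fds, _flags, _addr = socket.recv_fds(sock, 1, 1)
    if not fds:
        raise RuntimeError("no descriptor was attached to the message")
    return fds[0]
```

The receiver gets a new descriptor number that refers to the same open file, so a privileged process can open a resource and hand it to a process that could not have opened it itself.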
The simple two-line change to tcpdump did expose a few problems, however. For example, the C library's DNS resolver code requires access to the filesystem (/etc/resolv.conf) and to the network namespace (to talk to the DNS server), which led to reduced functionality. Switching tcpdump to use a lightweight local resolver restored that feature.
In addition to the "raw" Capsicum interface using cap_enter(), the framework provides a libcapsicum that can be used to more thoroughly isolate the sandboxed processes without each application having to do its own start-up management of a sandboxed process. It handles closing all undelegated file descriptors (those that are not meant for the sandbox), forking the new sandboxed process, flushing the address space using fexecve(), and setting up a Unix socket that can be used for communication between the privileged and unprivileged processes. None of the examples in the paper use libcapsicum as it generally requires major changes to the application in order to be used, so it may be more suitable for new development.
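The start-up steps listed above can be sketched in Python with ordinary POSIX calls. This is not the libcapsicum API: `run_sandboxed`, its `worker` callback, and the `MAX_FD` bound are hypothetical names invented for this example. The sketch only covers forking, closing undelegated descriptors, and the socket between the two processes; it does not enter capability mode and does not flush the address space with fexecve().

```python
import os
import socket

MAX_FD = 4096  # assumption: no descriptor of interest is numbered higher


def run_sandboxed(worker, delegated_fds=()):
    """Fork a child that keeps only delegated descriptors and runs worker.

    worker(delegated_fds) must return bytes, which the child sends back
    over a Unix socket. Returns (exit_code, reply_bytes) in the parent.
    """
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        status = 1
        try:
            keep = set(delegated_fds) | {child_sock.fileno()}
            # Close everything that was not explicitly delegated
            # (stdin, stdout, and stderr are left alone).
            for fd in range(3, MAX_FD):
                if fd not in keep:
                    try:
                        os.close(fd)
                    except OSError:
                        pass  # descriptor was not open
            child_sock.sendall(worker(tuple(delegated_fds)))
            status = 0
        finally:
            os._exit(status)  # never return into the parent's code path
    child_sock.close()
    chunks = []
    with parent_sock:
        while True:
            chunk = parent_sock.recv(4096)
            if not chunk:
                break  # child exited and closed its end
            chunks.append(chunk)
    _, wait_status = os.waitpid(pid, 0)
    return os.waitstatus_to_exitcode(wait_status), b"".join(chunks)
```

Descriptors that the parent had open but did not list in `delegated_fds` are simply gone in the child, which is the property the library is described as providing.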
The examples do show that substantial improvements in the security of programs can be had with minimal code changes, though. Roughly 100 new lines of code were all that was required to use Capsicum in Chromium on FreeBSD, largely because the browser was written with privilege separation in mind. Chromium already uses various techniques, depending on the OS, to separate each rendering process from other renderers and from the rest of the browser. That made it fairly straightforward to adapt Chromium, and the paper says that switching to a libcapsicum-based implementation should not be significantly harder.
Capsicum is an interesting idea that bears watching as it rolls out in FreeBSD. The 9.0 release only contains the kernel changes required for Capsicum but doesn't ship any applications that use the facility. 9.1 is slated to have some of those, presumably starting with Chromium. Beyond this brief introduction, those interested should take a look at the paper, this article [PDF] from ;login: magazine, as well as the documentation page.
Posted Feb 23, 2012 18:36 UTC (Thu)
by ThinkRob (guest, #64513)
Kidding aside, this is a fascinating and welcome improvement to FreeBSD. It'll be interesting to see how it stacks up against seatbelt on OS X, particularly with regard to restricting applications that *weren't* written with capability restrictions in mind.
Posted Feb 23, 2012 18:41 UTC (Thu)
by giggls (subscriber, #48434)
Posted Feb 24, 2012 0:13 UTC (Fri)
by tobiasu (subscriber, #72521)
For those interested here is the USENIX talk about Capsicum: https://www.youtube.com/watch?v=raNx9L4VH2k
Posted Jul 8, 2012 1:08 UTC (Sun)
by jbytau (guest, #75301)
https://git.chromium.org/gitweb/?p=chromiumos/third_party...
Posted Feb 27, 2012 2:34 UTC (Mon)
by zooko (guest, #2589)
Posted Feb 12, 2014 21:55 UTC (Wed)
by rdahlgren (guest, #95523)
Additionally, a test-suite is available to verify that your newly built capsicum kernel is functioning as designed -> https://github.com/google/capsicum-test
Finally, a fledgling IRC channel is available on Freenode at #capsicum
Posted May 2, 2015 10:02 UTC (Sat)
by xose (subscriber, #535)