filtering for a syscall looks like a very hard problem (both in what filters make sense and in performance)
it would seem to me that the right approach is to implement a limiting function that can do any of
allow
block
filter
but initially just implement the allow/block modes, and have some sort of experimental loadable module support for the filter mode so that different filters can be experimented with easily
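to make the three modes concrete, here is a rough user-space sketch in C. every name in it (limit_mode, limit_rule, limit_check, only_std_fds) is made up for illustration and is not an existing kernel interface; the filter callback stands in for whatever a loadable module might register, and failing closed when no filter is present is an assumption on my part.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is an existing kernel API. */
enum limit_mode { LIMIT_ALLOW, LIMIT_BLOCK, LIMIT_FILTER };

/* Filter callback, as a loadable module might supply it.
 * Returns nonzero to permit the call. */
typedef int (*limit_filter_fn)(int nr, const long *args);

struct limit_rule {
    enum limit_mode mode;
    limit_filter_fn filter;   /* only consulted in LIMIT_FILTER mode */
};

/* Returns 1 if the call may proceed, 0 if it must be rejected. */
static int limit_check(const struct limit_rule *rule, int nr, const long *args)
{
    switch (rule->mode) {
    case LIMIT_ALLOW:
        return 1;
    case LIMIT_BLOCK:
        return 0;
    case LIMIT_FILTER:
        /* Assumption: with no filter registered, fail closed. */
        return rule->filter ? rule->filter(nr, args) != 0 : 0;
    }
    return 0;
}

/* Example filter: permit only when the first argument is fd 0, 1 or 2. */
static int only_std_fds(int nr, const long *args)
{
    (void)nr;
    return args[0] >= 0 && args[0] <= 2;
}
```

the allow and block cases never touch the callback, so they could ship first and stay cheap while different filters are tried out behind the third case.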
Posted Oct 21, 2011 12:31 UTC (Fri) by davecb (subscriber, #1574)
Another possible approach is to retarget the mechanism once used for SCO emulation to do something quite close to what dw suggested.
If a process is started under a cgroup with syscall control enabled, it gets a different "interpreter" with a different syscall mapping table. Cgroups without syscall limitations get the standard one.
One then has the ability to permit, deny or filter in an arbitrary way the syscalls a given cgroup sees. The management would be in user-space, the implementation a hook and a set of "interpreter" syscall tables in a kernel module. The rest of the interpreter mechanisms would continue unchanged, which is important as they're still used for running alien binaries on Linux.
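A toy user-space model of the idea, in C, might look like the following. It is only a sketch: the three "syscalls", the interp and group structures, and dispatch() are all hypothetical stand-ins, not the kernel's actual personality or cgroup data structures. The point is just that permit, deny and filter each become a different entry in the table a group is bound to.

```c
#include <assert.h>
#include <errno.h>

#define NR_DEMO_CALLS 3

typedef long (*syscall_fn)(long arg);

/* Stand-ins for real syscall handlers. */
static long demo_echo(long arg)   { return arg; }
static long demo_double(long arg) { return 2 * arg; }

/* Replacement entries used by the restricted table. */
static long demo_denied(long arg) { (void)arg; return -EPERM; }
static long demo_double_filtered(long arg)
{
    /* Filter: pass small arguments through to the real handler. */
    return arg > 100 ? -EINVAL : demo_double(arg);
}

/* One "interpreter": a named syscall mapping table. */
struct interp {
    const char *name;
    syscall_fn table[NR_DEMO_CALLS];
};

static const struct interp standard_interp = {
    "standard", { demo_echo, demo_double, demo_double }
};

/* Slot 0 permitted, slot 1 denied, slot 2 filtered. */
static const struct interp restricted_interp = {
    "restricted", { demo_echo, demo_denied, demo_double_filtered }
};

/* Each group points at the table its processes should see. */
struct group { const struct interp *interp; };

static long dispatch(const struct group *g, int nr, long arg)
{
    if (nr < 0 || nr >= NR_DEMO_CALLS)
        return -ENOSYS;
    return g->interp->table[nr](arg);
}
```

Unrestricted groups keep pointing at the standard table, so in this model they pay nothing beyond the indirection that table dispatch already involves.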
--dave
Limiting system calls via control groups?
Posted Oct 22, 2011 17:20 UTC (Sat) by alonz (subscriber, #815)
Unfortunately, the “personality” mechanism (used for SCO emulation) hinges on the difference in syscall ABIs between Linux and SCO (specifically: Linux uses sysenter/syscall instructions, while SCO used lcall7).
The existing seccomp uses the trace path, which is a nice compromise—it requires a single hook in the (performance-critical) system-call-entry code for any non-standard behavior, which translates to either tracing or seccomp-limitation of the system calls. To be workable, any solution will need to maintain this level of performance (= nearly zero impact when disabled).
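The shape of that single hook can be modelled in a few lines of user-space C. This is a simplified sketch, not the kernel's entry code: the flag names, struct task and the "only syscall 0 is allowed" policy are invented for illustration, and a denied call here just returns an error instead of doing what real seccomp does.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-task work flags, loosely modelled on thread flags. */
#define WORK_TRACE   (1u << 0)
#define WORK_SECCOMP (1u << 1)
#define WORK_ANY     (WORK_TRACE | WORK_SECCOMP)

struct task {
    unsigned int work_flags;
    unsigned long slow_path_hits;   /* instrumentation for this demo only */
};

/* Slow path: reached only when some flag is set. */
static long slow_path(struct task *t, int nr)
{
    t->slow_path_hits++;
    /* Toy policy (an assumption): a limited task may only make call 0. */
    if ((t->work_flags & WORK_SECCOMP) && nr != 0)
        return -EPERM;
    return 0;
}

/* Entry point: one combined flag test covers tracing and limiting. */
static long syscall_entry(struct task *t, int nr)
{
    if (t->work_flags & WORK_ANY) {
        long rc = slow_path(t, nr);
        if (rc)
            return rc;
    }
    return 1000 + nr;   /* stand-in for running the real handler */
}
```

A task with no flags set pays for one test and one branch, which is the cost any new limiting scheme would have to match.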