|| ||Pavel Emelyanov <firstname.lastname@example.org>|
|| ||Andrew Morton <email@example.com>|
|| ||[PATCH 0/3]Sysctl: clean the code and prepare for secure use in containers|
|| ||Wed, 27 Feb 2008 16:47:10 +0300|
|| ||Linux Kernel Mailing List <firstname.lastname@example.org>|
Many (most of) sysctls do not have a per-container sense. E.g.
kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget
and so on and so forth. Besides, tuning then from inside a container
is not even secure. On the other hand, hiding them completely from
the container's tasks sometimes causes user-space to stop working.
When developing net sysctl, the common practice was to duplicate
a table and drop the write bits in table->mode, but this approach
was not very elegant, lead to excessive memory consumption and
was not suitable in general.
Here's the alternative solution. To facilitate the per-container
sysctls ctl_table_root-s were introduced. Each root contains a
list of ctl_table_header-s that are visible to different namespaces.
The idea of this set is to add the permissions() callback on the
ctl_table_root to allow ctl root limit permissions to the same
The main user of this functionality is the net-namespaces code,
but later this will (should) be used by more and more namespaces,
containers and control groups.
Actually, this idea's core is in a single hunk in the third patch.
First two patches are cleanups for sysctl code, while the third
one mostly extends the arguments set of some sysctl functions.
Signed-off-by: Pavel Emelyanov <email@example.com>