
Removing binary sysctl

By Jonathan Corbet
November 11, 2009

The "sysctl" mechanism is used by the kernel to export a wide variety of tuning options to user space. Sysctl is actually two interfaces which have been awkwardly joined together: the sysctl() system call and the /proc/sys directory hierarchy. Of the two, /proc/sys is much more widely used, to the point that developers rarely even think about the system call. But the sysctl() implementation is a significant amount of code which suffers from chronic neglect. It has thus been the target of removal attempts for years.
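As a rough illustration of the /proc/sys side of the interface, the following Python sketch maps a dotted sysctl name onto a path under /proc/sys and reads the value as an ordinary file. The helper names `sysctl_path` and `read_sysctl` and the `root` parameter are hypothetical, introduced here only for illustration. The sketch also assumes that no path component contains a dot, which does not hold for every entry (some network interface names do).

```python
from pathlib import Path


def sysctl_path(name, root="/proc/sys"):
    """Map a dotted name such as 'kernel.ostype' to a path under root.

    Simplifying assumption: no component itself contains a dot.
    """
    parts = name.split(".")
    # Reject empty components and anything that could escape the tree.
    if any(p in ("", "..") or "/" in p for p in parts):
        raise ValueError(f"invalid sysctl name: {name!r}")
    return Path(root, *parts)


def read_sysctl(name, root="/proc/sys"):
    """Read one tunable through the file interface, dropping the newline."""
    return sysctl_path(name, root).read_text().rstrip("\n")


if __name__ == "__main__":
    # On a Linux system with /proc mounted, this prints the kernel's OS type.
    print(read_sysctl("kernel.ostype"))
```

Because the value is reached through a plain file read, the same access works from shell scripts and any language with file I/O, which is one reason this side of the interface is the widely used one.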

The problem with removing sysctl(), of course, is that it is part of the kernel ABI. As long as the possibility of broken applications exists, this ABI cannot be removed. So it continues to sit in the kernel, despite the fact that its absence would be noted by few people.

Eric Biederman has come up with a new approach to the problem. His patch set removes the current sysctl() implementation, getting rid of a few thousand lines of unloved code. He then adds back a new wrapper which emulates the sysctl() ABI by way of /proc/sys. So any applications using sysctl() should continue to work, but the code dedicated to making it work is much reduced from what was there before.
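The idea of emulating the binary interface on top of /proc/sys can be sketched in user-space Python as below. This is not the code from the patch set, only an illustration of the general approach under stated assumptions: a table translates each integer in a binary name into a path component, and the value is then read from the resulting file. The function names, the `root` parameter, and the table layout are hypothetical. The numeric constants are a small excerpt recalled from `<linux/sysctl.h>` and should be checked against the header before being relied upon. A real wrapper would also have to convert between the text in /proc/sys and the binary formats the system call returns, which this sketch leaves out.

```python
from pathlib import Path

# Excerpt of a binary-name table: number -> (path component, child table).
# A child table of None marks a leaf (a file rather than a directory).
# Numbers are assumed to match <linux/sysctl.h>; verify before relying on them.
BINARY_NAMES = {
    1: ("kernel", {                 # CTL_KERN
        1: ("ostype", None),        # KERN_OSTYPE
        2: ("osrelease", None),     # KERN_OSRELEASE
        40: ("random", {            # KERN_RANDOM
            5: ("boot_id", None),   # RANDOM_BOOT_ID
            6: ("uuid", None),      # RANDOM_UUID
        }),
    }),
}


def binary_to_path(name, root="/proc/sys"):
    """Translate a sequence of integers into the matching /proc/sys path."""
    table = BINARY_NAMES
    parts = []
    for number in name:
        if table is None or number not in table:
            raise KeyError(f"unknown binary sysctl name: {list(name)}")
        component, table = table[number]
        parts.append(component)
    if not parts or table is not None:
        # Empty name, or the name stops at a directory instead of a file.
        raise KeyError(f"incomplete binary sysctl name: {list(name)}")
    return Path(root, *parts)


def emulated_sysctl_read(name, root="/proc/sys"):
    """Read the raw file contents for a binary name (no format conversion)."""
    return binary_to_path(name, root).read_bytes()
```

With this table, `binary_to_path((1, 40, 6))` resolves to `/proc/sys/kernel/random/uuid`. The only per-variable knowledge left is the name table itself, which gives a sense of why such a wrapper can be much smaller than a separate implementation of every variable.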

The patch set still concerns some developers. The compatibility wrapper has its own configuration option, leading some to worry that distributions might disable it and cause obscure breakage. Going through /proc/sys will also make access to these variables much slower than before, but that should not really be a problem: access to sysctl variables is not normally a performance-critical operation. So there does not appear to be any real obstacle to the merging of these patches; maybe, someday, binary sysctl() will truly vanish into the past.



Removing binary sysctl

Posted Nov 16, 2009 10:56 UTC (Mon) by ebiederm (subscriber, #35028)

Thanks for writing this up.

Removing binary sysctl

Posted Dec 10, 2009 8:45 UTC (Thu) by wahern (subscriber, #37304)

Ugh. Now those who use chroot will have even more headaches to deal with.

For instance, my portable arc4random (which uses KERN_RANDOM) will break. Requiring people to seed before the chroot happens, or requiring users to create device files in the chroot tree doesn't help; those things aren't required on other platforms.

One plus is that there'd be less kernel exposure in a chroot without either /proc or sysctl. And certainly in general removing code is good, though /proc has historically been riddled with kernel exploits; far more than sysctl ever produced. Indeed, the mere existence of /proc outside the chroot has its own problems, like exposing file descriptors--pipes, socketpairs--that would otherwise be unaddressable by other processes. Thus one of the strongest security characteristics--using descriptors as ad hoc "capability" tokens--is totally broken. File permissions aren't nearly as strong a security mechanism as the inability to reference the object.

Removing binary sysctl

Posted Dec 10, 2009 10:02 UTC (Thu) by michich (subscriber, #17902)

"Indeed, the mere existence of /proc outside the chroot has its own problems, like exposing file descriptors--pipes, socketpairs--that would otherwise be unaddressable by other processes."

Would this solve your concern? mount --bind /proc/sys/kernel/random /some/dir/inside/your/chroot

Removing binary sysctl

Posted Dec 10, 2009 15:24 UTC (Thu) by Spudd86 (guest, #51683)

I get the impression that you don't need to have /proc available to the process making the syscall...

Copyright © 2009, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds