Unexporting the system call table

A linux-kernel reader recently complained that Red Hat had applied a patch to the kernel in its 8.0 distribution which made the sys_call_table data structure unavailable to modules. He will not have been pleased with the 2.5.41 kernel release, which did the same thing.

sys_call_table is a special table used to dispatch system calls within the kernel. It is a simple array, indexed by the system call number passed in from user space. The reason for wanting this array to be exported, of course, is to allow modules to add or modify system calls. A classic example is a module implementing the "streams" interface, which is unlikely to ever be part of the mainline kernel. Some users need streams, though; an exported system call table allows them to load a module and have the streams call work as expected.
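The dispatch mechanism described above can be modeled in ordinary user-space C. What follows is only a sketch, not kernel code: the names `call_table`, `dispatch()`, `hook_call()`, `sys_ni()`, and `sys_double()` are hypothetical, and the real table's layout differs from one architecture to the next. It shows an array of function pointers indexed by call number, along with the kind of entry replacement that an exported table makes possible for a module.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model of a system call table; not kernel code. */
#define NR_CALLS 4

typedef long (*syscall_fn)(long arg);

/* Placeholder for an unimplemented call number. */
long sys_ni(long arg)
{
    (void)arg;
    return -ENOSYS;
}

/* A trivial illustrative "system call". */
long sys_double(long arg)
{
    return arg * 2;
}

/* The table itself: a simple array indexed by call number. */
static syscall_fn call_table[NR_CALLS] = {
    sys_ni, sys_double, sys_ni, sys_ni
};

/* Dispatch: index the table with the number supplied by the caller. */
long dispatch(unsigned long nr, long arg)
{
    if (nr >= NR_CALLS)
        return -ENOSYS;
    return call_table[nr](arg);
}

/*
 * What a module with access to the table could do: overwrite an entry.
 * Returns the previous handler, or NULL if nr is out of range.
 * Note that nothing here is locked, so a concurrent dispatch() could
 * race with the update.
 */
syscall_fn hook_call(unsigned long nr, syscall_fn fn)
{
    syscall_fn old;

    if (nr >= NR_CALLS)
        return NULL;
    old = call_table[nr];
    call_table[nr] = fn;
    return old;
}
```

In this model, calling `hook_call(2, sys_double)` changes what `dispatch(2, ...)` does for every caller from that point on, which is exactly the power (and the hazard) under discussion.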

So why would this capability be taken away? The stated reason is that tweaking the system call table is nonportable and unsafe. Each architecture has a different system call table format, so code which wants to be portable has to understand how each architecture does things. There is also no locking mechanism for the system call table, so run-time changes are subject to race conditions. And finally, there are even errata problems on some processors; changing a table used the way sys_call_table is used can have unfortunate and unexpected results.

Many of these problems could be worked around with a bit of coding. But the simple fact is that many kernel developers do not want loadable modules to be able to add or change system calls. Binary modules are tolerated as long as they stick to the "published" interfaces and implement straightforward features (such as device drivers and filesystems). A module which can add or change system calls can go well beyond that interface. Removing access to the system call table keeps modules in their place.

Working around this problem is not all that difficult for modules which need to do so. A patch was quickly posted which made streams work again, for example. The solution is to have a set of stub system calls wired into the kernel; when the associated module is loaded, the stubs can make the appropriate calls with the necessary locks. Otherwise they return an ENOSYS error.
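The stub approach can be sketched in the same user-space style. This is an illustration under stated assumptions, not the patch that was posted: `sys_streams_stub()`, `register_streams_handler()`, `unregister_streams_handler()`, and `example_streams_call()` are hypothetical names, and a pthread mutex stands in for whatever locking the kernel code would really use.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of a stub system call; not the actual kernel patch. */
typedef long (*streams_fn)(long arg);

/* A mutex stands in for the kernel's locking (an assumption). */
static pthread_mutex_t streams_lock = PTHREAD_MUTEX_INITIALIZER;
static streams_fn streams_handler = NULL;

/* Called by the "module" at load time; fails if a handler is already set. */
int register_streams_handler(streams_fn fn)
{
    int ret = 0;

    pthread_mutex_lock(&streams_lock);
    if (streams_handler != NULL)
        ret = -EBUSY;
    else
        streams_handler = fn;
    pthread_mutex_unlock(&streams_lock);
    return ret;
}

/* Called by the "module" at unload time. */
void unregister_streams_handler(void)
{
    pthread_mutex_lock(&streams_lock);
    streams_handler = NULL;
    pthread_mutex_unlock(&streams_lock);
}

/*
 * The stub wired into the table: forward to the module's handler if one
 * is registered, otherwise report that the call is not implemented.
 * Holding the lock across the call keeps the handler from being removed
 * while it is running; it also serializes callers, which is a
 * simplification made for this sketch.
 */
long sys_streams_stub(long arg)
{
    long ret = -ENOSYS;

    pthread_mutex_lock(&streams_lock);
    if (streams_handler != NULL)
        ret = streams_handler(arg);
    pthread_mutex_unlock(&streams_lock);
    return ret;
}

/* Illustrative handler that a module might supply. */
long example_streams_call(long arg)
{
    return arg + 1;
}
```

With this arrangement the table itself never changes at run time; only the handler pointer does, and always under the lock. The module gets its system call without needing access to the table.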

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds