LWN.net

Security modules and ioctl()

By Jonathan Corbet
February 16, 2011
The ioctl() system call has a bad reputation for a number of reasons, most of which are related to the fact that every implemented command is, in essence, a new system call. There is no way to effectively control what is done in ioctl(), and, for many obscure drivers, no way to really even know what is going on without digging through a lot of old code. So it's not surprising that code adding new ioctl() commands tends to be scrutinized heavily. Recently it turned out that there's another reason to be nervous about ioctl() - it doesn't play well with security modules, and SELinux has been treating it incorrectly for the last couple of years.

SELinux works by matching a specific access attempt against the permissions granted to the calling process. For system calls like write(), the type of access is obvious - the process is attempting to write to an object. With ioctl(), things are not quite so clear. In past times, SELinux would attempt to deal with ioctl() calls by looking at the specific command to figure out what the process was actually trying to do; a FIBMAP command, for example (which reads a map of a file's block locations) would be allowed to proceed if the calling process had the permission to read the file's attributes.

There are a couple of problems with this approach, starting with the fact that the number of possible ioctl() commands is huge. Even without getting into obscure commands implemented by a single driver, trying to enumerate them all and determine their effects is a road to madness. But it gets worse, in that the intended behavior of a given command may not match what a specific driver actually does in response to that command. So the only way to really know what an ioctl() command will do is to figure out what driver is behind the call, and to have some knowledge of what each driver does. Simply creating this capability is not a task for sane people; maintaining it would not be a task for anybody wanting to remain sane. So security module developers were looking for a better way.

They thought they had found one when somebody realized that the command codes used by ioctl() implementations are not random numbers. They are, instead, a carefully-crafted 32-bit quantity which includes an 8-bit "type" field (approximately identifying the driver implementing the command), a driver-specific command code, a pair of read/write bits, and a size field. Using the read/write bits seemed like a great way to figure out what sort of access the ioctl() call needed without actually understanding the command. Thus, a patch to SELinux was merged for 2.6.27 which ripped out the command recognition and simply used the read/write bits in the command code to determine whether a specific call should be allowed or not.
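As a rough illustration of that layout, the sketch below pulls the four fields out of a command code using the standard macros from <linux/ioctl.h>. The structure demo_arg, the command DEMO_CMD, and the helper names are made up for this example and do not correspond to any real driver.

```c
#include <linux/ioctl.h>   /* _IOWR, _IOC_DIR, _IOC_TYPE, _IOC_NR, _IOC_SIZE */
#include <stdio.h>

/* Hypothetical argument structure and command, for illustration only. */
struct demo_arg {
    unsigned int flags;
    unsigned int value;
};
#define DEMO_CMD _IOWR('x', 7, struct demo_arg)

/* The four fields packed into an ioctl() command code. */
struct ioctl_fields {
    unsigned int dir;   /* read/write bits: _IOC_NONE, _IOC_READ, _IOC_WRITE, or both */
    unsigned int type;  /* 8-bit "type", approximately identifying the driver */
    unsigned int nr;    /* driver-specific command number */
    unsigned int size;  /* size of the structure passed by pointer */
};

/* Split a command code into its fields. */
struct ioctl_fields decode_ioctl(unsigned int cmd)
{
    struct ioctl_fields f = {
        .dir  = _IOC_DIR(cmd),
        .type = _IOC_TYPE(cmd),
        .nr   = _IOC_NR(cmd),
        .size = _IOC_SIZE(cmd),
    };
    return f;
}

/* Print the decoded fields in a human-readable form. */
void print_ioctl(unsigned int cmd)
{
    struct ioctl_fields f = decode_ioctl(cmd);

    printf("type='%c' nr=%u size=%u read=%d write=%d\n",
           (char)f.type, f.nr, f.size,
           (f.dir & _IOC_READ) != 0, (f.dir & _IOC_WRITE) != 0);
}
```

Calling print_ioctl(DEMO_CMD) reports type 'x', command number 7, the size of struct demo_arg, and both direction bits set, since the command was built with _IOWR.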

That change remained for well over two years until Eric Paris noticed that, in fact, it made no sense at all. Most ioctl() calls involve the passing of a data structure into or out of the kernel; that structure describes the operation to be performed or holds data returned from the kernel - or both. The size field in the command code is the size of this structure, and the read/write bits describe how the structure will be accessed by the kernel. Together, that information can be used by the core ioctl() code to determine whether the calling process has the proper access rights to the memory behind the pointer passed to the kernel.

What those bits do not do, as Eric pointed out, is say anything about what the ioctl() call will do to the object identified by the file descriptor passed to the kernel. A call passing read-only data to the kernel may reformat a disk, while a call with writable data may just be querying hardware information. So using those bits to determine whether the call should proceed is unlikely to yield good results. It's an observation which seems obvious when spelled out in this way, but none of the developers working on security noticed the problem at the time.
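The mismatch can be sketched in a few lines. The function below is a guess at the general shape of a direction-bit heuristic, not the SELinux code; the PERM_* flags, both structures, and both commands are invented for the example. The "wipe" command only returns a status structure, so its code carries just the read bit, while the harmless query passes a request in and gets a result back, so it carries both bits.

```c
#include <linux/ioctl.h>   /* _IO, _IOR, _IOWR, _IOC_DIR, _IOC_READ, _IOC_WRITE */

/* Hypothetical permission flags; not the real SELinux permission names. */
enum { PERM_IOCTL = 1, PERM_READ = 2, PERM_WRITE = 4 };

/* Hypothetical commands for an imaginary driver. */
struct demo_status { int code; };
struct demo_query  { unsigned int start; unsigned int count; };

/* Destroys data on the device, but only copies a status back to user space. */
#define DEMO_WIPE_DEVICE _IOR('x', 1, struct demo_status)
/* Changes nothing; takes a request and fills in the answer. */
#define DEMO_QUERY_MAP   _IOWR('x', 2, struct demo_query)

/*
 * Assumed heuristic: derive the required permissions purely from the
 * read/write bits in the command code, falling back to a generic
 * "ioctl" permission when neither bit is set.
 */
unsigned int perms_from_dir_bits(unsigned int cmd)
{
    unsigned int perms = 0;

    if (_IOC_DIR(cmd) & _IOC_READ)
        perms |= PERM_READ;
    if (_IOC_DIR(cmd) & _IOC_WRITE)
        perms |= PERM_WRITE;
    if (!perms)
        perms = PERM_IOCTL;
    return perms;
}
```

Under this heuristic the destructive DEMO_WIPE_DEVICE needs only read permission, while the read-only DEMO_QUERY_MAP demands write permission: the bits describe the buffer, not the effect on the object behind the file descriptor.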

So that code has to go - but, as of this writing, it has not been changed in the mainline kernel. There is a simple reason for that: nobody really knows what sort of logic should replace it. As discussed above, simply enumerating command codes with expected behavior is not a feasible solution either. So something else needs to be devised, but it's not clear what that will be.

Stephen Smalley pointed out one approach which was posted back in 2005. That patch required drivers (and other code implementing ioctl()) to provide a special table associating each command code with the permissions required to execute the command. The obvious objections were raised at that time: changing every driver in the system would be a pain, ioctl() implementations are already messy enough as it is, the tables would not be maintained as the driver changed, and so on. The idea was eventually dropped. Bringing it back now seems unlikely to make anybody popular, but there is probably no other way to truly track what every ioctl() command is actually doing. That knowledge resides exclusively in the implementing code, so, if we want to make use of that knowledge elsewhere, it needs to be exported somehow.
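To make the table idea concrete, here is a minimal user-space sketch of a per-driver table that associates each command code with the permissions it requires. It is not the 2005 patch: every name in it (the DEMO_* commands and flags, demo_perm_table, demo_required_perms(), demo_ioctl_allowed()) is hypothetical, and denying commands that are missing from the table is a design assumption made for the example.

```c
#include <stddef.h>
#include <linux/ioctl.h>   /* _IO, _IOR */

/* Hypothetical permission flags. */
enum { DEMO_PERM_READ = 1, DEMO_PERM_WRITE = 2 };

/* One table entry: a command code and the permissions it requires. */
struct demo_ioctl_perm {
    unsigned int cmd;
    unsigned int required;
};

/* Hypothetical commands for an imaginary driver. */
struct demo_geom { unsigned int sectors; };
#define DEMO_GET_GEOM _IOR('x', 1, struct demo_geom)  /* query only */
#define DEMO_FORMAT   _IO('x', 2)                     /* destroys data */

/* The driver author states what each command really does to the object. */
static const struct demo_ioctl_perm demo_perm_table[] = {
    { DEMO_GET_GEOM, DEMO_PERM_READ  },
    { DEMO_FORMAT,   DEMO_PERM_WRITE },
};

/* Linear search; returns the required mask, or -1 if the command is unlisted. */
int demo_required_perms(unsigned int cmd)
{
    size_t n = sizeof(demo_perm_table) / sizeof(demo_perm_table[0]);

    for (size_t i = 0; i < n; i++)
        if (demo_perm_table[i].cmd == cmd)
            return (int)demo_perm_table[i].required;
    return -1;
}

/* Returns 1 if 'granted' covers everything the command requires, else 0. */
int demo_ioctl_allowed(unsigned int cmd, unsigned int granted)
{
    int required = demo_required_perms(cmd);

    if (required < 0)
        return 0;   /* assumption: commands missing from the table are denied */
    return ((unsigned int)required & granted) == (unsigned int)required;
}
```

Note that DEMO_FORMAT carries no read/write bits at all, yet the table can still mark it as requiring write permission. The cost is the one raised in the objections: the table is only as accurate as the driver maintainer keeps it.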

Of course, the alternative is to conclude that (1) ioctl() is a pain, and (2) security modules are a pain. Perhaps it's better to just give up and hope that discretionary access controls, along with whatever checks may be built into the driver itself, will be enough. That is, essentially, the solution we have now.


Security modules and ioctl()

Posted Feb 17, 2011 8:41 UTC (Thu) by michaeljt (subscriber, #39183)

> The obvious objections were raised at that time: changing every driver in the system would be a pain, ioctl() implementations are already messy enough as it is, the tables would not be maintained as the driver changed, and so on.

Making ioctls more painful to maintain might encourage people to add fewer new ones though, which would probably make many people happy.

Security modules and ioctl()

Posted Feb 17, 2011 13:31 UTC (Thu) by nix (subscriber, #2304)

Well, one possible fix is to rip the ioctl() and unlocked_ioctl() operations out of file_operations completely, turning them into a mandatory lookup into a (per-driver? per-filesystem?) map from ioctl request to (read_required, write_required, function to call to implement this operation).

Upside: makes it impossible to define a new ioctl() request without specifying whether it is a read or write op. Downside: this is... unlikely to be a nondisruptive change, and it's only really for the benefit of LSMs (since the read/write permission bits on devices supporting ioctl() are not used to validate this sort of thing, though they should be, but that would probably break too much of userspace). Which is probably why nobody's done it already.

Security modules and ioctl()

Posted Feb 17, 2011 10:07 UTC (Thu) by mezcalero (subscriber, #45103)

BTW, the background of this issue is this bug:

https://bugzilla.redhat.com/show_bug.cgi?id=669672

systemd's readahead implementation was triggering write access AVCs due to the wrong ioctl() handling all over the place.

Unexpected bugs

Posted Feb 20, 2011 15:21 UTC (Sun) by man_ls (guest, #15091)

Wow, that particular bug must have been hard to track down. I'm impressed that it took just a few days, considering how unexpected the solution must have been.

Security modules and ioctl()

Posted Feb 18, 2011 2:14 UTC (Fri) by jengelh (subscriber, #33263)

So ioctl is nasty, but alternate communication channels, such as Netlink, would just suffer from the same, would they not?

Security modules and ioctl()

Posted Feb 21, 2011 4:24 UTC (Mon) by Baylink (subscriber, #755)

They would: the problem is, essentially: "Should we allow this interaction, which is between a process and something else outside that process?"

The answer, clearly, depends on what the interaction is, which means that the security module doing the evaluation must *know* all the possible interactions.

That way, clearly, lies madness, as our Esteemed Editor implies.

Your expansion, though, explains why this hasn't been fixed: the problem isn't syntactic. It's semantic. It doesn't really matter how you express it: there needs to be a way to have these conversations, on all but the most trivial implementations, and there's no way to predict what they will be... 10 years from now.

Security modules and ioctl()

Posted Feb 21, 2011 13:21 UTC (Mon) by jthill (guest, #56558)

How does the 2005 patch force anything at all? The ioctl entrypoints drive all the checking themselves, no matter what it seems -- ioctl_perm() is just a linear search, you'd express it directly in C++ with plain find().

It looks like the mistake with the DIR bits was made immediately when the earlier patch was proposed, and the resulting bad patch was just Smalley taking somebody's word for it on what those bits mean.

Aren't ioctl numbers part of the userland ABI, set permanently? If so, how is drift a concern here?

Security modules and ioctl()

Posted Mar 2, 2011 18:21 UTC (Wed) by robbe (subscriber, #16131)

But isn't the same problem already present with DAC? Somebody, somewhere, already has to prevent object-changing ioctls on read-only file descriptors.

What does SELinux want to add to the mix? Is it only so that an object-changing ioctl needs the current role to have {write} rights, while for other ioctls you only need {read}? Does this offer anything in addition to the DAC check above, which is always done anyway?

I think the ideas in this direction are sufficiently vague because ioctls do such a wide range of things.

Let's start with the basics: in which manpage is (for example) the mentioned FIEMAP documented?

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds