The trouble with CAP_SYS_RAWIO

By Michael Kerrisk
March 13, 2013

A February linux-kernel mailing list discussion of a patch that extends the use of the CAP_COMPROMISE_KERNEL capability soon evolved into a discussion of the specific uses (or abuses) of the CAP_SYS_RAWIO capability within the kernel. However, in reality, the discussion once again exposes some general difficulties in the Linux capabilities implementation—difficulties that seem to have no easy solution.

The discussion began when Kees Cook submitted a patch to guard writes to model-specific registers (MSRs) with a check to see if the caller has the CAP_COMPROMISE_KERNEL capability. MSRs are x86-specific control registers that are used for tasks such as debugging, tracing, and performance monitoring; those registers are accessible via the /dev/cpu/CPUNUM/msr interface. CAP_COMPROMISE_KERNEL (formerly known as CAP_SECURE_FIRMWARE) is a new capability designed for use in conjunction with UEFI secure boot, which is a mechanism to ensure that the kernel is booted from an on-disk representation that has not been modified.

If a process has the CAP_COMPROMISE_KERNEL capability, it can perform operations that are not allowed in a secure-boot environment; without that capability, such operations are denied. The idea is that if the kernel detects that it has been booted via the UEFI secure-boot mechanism, then this capability is disabled for all processes. In turn, the lack of that capability is intended to prevent operations that can modify the running kernel. CAP_COMPROMISE_KERNEL is not yet part of the mainline kernel, but already exists as a patch in the Fedora distribution and Matthew Garrett is working towards its inclusion in the mainline kernel.

H. Peter Anvin wondered whether CAP_SYS_RAWIO did not already suffice for Kees's purpose. In response, Kees argued that CAP_SYS_RAWIO is for governing reads: "writing needs a much stronger check". Kees went on to elaborate:

there's a reasonable distinction between systems that expect to strictly enforce user-space/kernel-space separation (CAP_COMPROMISE_KERNEL) and things that are fiddling with hardware (CAP_SYS_RAWIO).

This in turn led to a short discussion about whether a capability was the right way to achieve the goal of restricting certain operations in a secure-boot environment. Kees was inclined to think it probably was the right approach, but deferred to Matthew Garrett, implementer of much of the secure-boot work on Fedora. Matthew thought that a capability approach seemed the best fit, but noted:

I'm not wed to [a capability approach] in the slightest, and in fact it causes problems for some userspace (anything that drops all capabilities suddenly finds itself unable to do something that it expects to be able to do), so if anyone has any suggestions for a better approach…

In the current mainline kernel, the CAP_SYS_RAWIO capability is checked in the msr_open() function: if the caller has that capability, then it can open the MSR device and perform reads and writes on it. The purpose of Kees's patch is to add a CAP_COMPROMISE_KERNEL check on each write to the device, so that in a secure-boot environment the MSR devices are readable, but not writeable. The problem that Matthew alludes to is that this approach has the potential to break user space because, formerly, there was no capability check on MSR writes. An application that worked prior to the introduction of CAP_COMPROMISE_KERNEL can now fail in the following scenario:

  • The application has a full set of privileges.
  • The application opens an MSR device (requires CAP_SYS_RAWIO).
  • The application drops all privileges, including CAP_SYS_RAWIO and CAP_COMPROMISE_KERNEL.
  • The application performs a write on the previously opened MSR device (requires CAP_COMPROMISE_KERNEL).

The last of the above steps would formerly have succeeded, but, with the addition of the CAP_COMPROMISE_KERNEL check, it now fails. In a subsequent reply, Matthew noted that QEMU was one program that was broken by a scenario similar to the above. Josh Boyer noted that Fedora has had a few reports of applications breaking on non-secure-boot systems because of scenarios like this. He highlighted why such breakages are so surprising to users and why the problem is seemingly unavoidable:

… the general problem is people think dropping all caps blindly is making their apps safer. Then they find they can't do things they could do before the new cap was added…

Really though, the main issue is that you cannot introduce new caps to enforce finer grained access without breaking something.
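
The four-step failure scenario described above can be modeled in a few lines of C. This is only a sketch under stated assumptions: the MODEL_CAP_* bit values, the struct names, and the model_* functions are hypothetical stand-ins invented for illustration, not the kernel's msr_open() or its write path. The `patched` flag selects whether the proposed per-write CAP_COMPROMISE_KERNEL check is applied.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's numbering. */
#define MODEL_CAP_RAWIO      (1u << 0)
#define MODEL_CAP_COMPROMISE (1u << 1)

struct task     { uint32_t caps; };   /* effective capability set */
struct msr_file { bool is_open; };    /* stand-in for an open MSR device */

/* Models the open-time check: CAP_SYS_RAWIO is required to open. */
int model_msr_open(const struct task *t, struct msr_file *f)
{
    assert(t && f);
    if (!(t->caps & MODEL_CAP_RAWIO))
        return -EPERM;
    f->is_open = true;
    return 0;
}

/* Models a write. When `patched` is true, the proposed per-write
 * CAP_COMPROMISE_KERNEL check is applied against the caller's
 * capabilities at the time of the write. */
int model_msr_write(const struct task *t, const struct msr_file *f, bool patched)
{
    assert(t && f);
    if (!f->is_open)
        return -EBADF;
    if (patched && !(t->caps & MODEL_CAP_COMPROMISE))
        return -EPERM;
    return 0;
}
```

In this model, a task that opens the device with both capabilities and then clears its capability set can still write when `patched` is false, while the same write returns -EPERM when `patched` is true. That difference is the regression under discussion.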

Shortly afterward, Peter stepped back to ask a question about the bigger picture: why should CAP_SYS_RAWIO be allowed on a secure-boot system? In other words, rather than adding a new CAP_COMPROMISE_KERNEL capability that is disabled in secure-boot environments, why not just disable CAP_SYS_RAWIO in such environments, since it is the possession of that capability that permits compromising a booted kernel?

That led Matthew to point out a major problem with CAP_SYS_RAWIO:

CAP_SYS_RAWIO seems to have ended up being a catchall of "Maybe someone who isn't entirely root should be able to do this", and not everything it covers is equivalent to being able to compromise the running kernel. I wouldn't argue with the idea that maybe we should just reappraise most of the current uses of CAP_SYS_RAWIO, but removing capability checks from places that currently have them seems like an invitation for userspace breakage.

To see what Matthew is talking about, we need to look at a little history. Back in January 1999, when capabilities first appeared with the release of Linux 2.2, CAP_SYS_RAWIO was a single-purpose capability. It was used in just a single C file in the kernel source, where it governed access to two system calls: iopl() and ioperm(). Those system calls permit access to I/O ports, allowing uncontrolled access to devices (and providing various ways to modify the state of the running kernel); hence the requirement for a capability in order to employ the calls.

The problem was that CAP_SYS_RAWIO rapidly grew to cover a range of other uses. By the time of Linux 2.4.0, there were 37 uses across 24 of the kernel's C source files, and looking at the 3.9-rc2 kernel, there are 69 uses in 43 source files. By either measure, CAP_SYS_RAWIO is now the third most commonly used capability inside the kernel source (after CAP_SYS_ADMIN and CAP_NET_ADMIN).

CAP_SYS_RAWIO seems to have encountered a fate similar to CAP_SYS_ADMIN, albeit on a smaller scale. It has expanded well beyond its original narrow use. In particular, Matthew noted:

Not having CAP_SYS_RAWIO blocks various SCSI commands, for instance. These might result in the ability to write individual blocks or destroy the device firmware, but do any of them permit modifying the running kernel?

Peter had some choice words to describe the abuse of CAP_SYS_RAWIO to protect operations on SCSI devices. The problem, of course, is that in order to perform relatively harmless SCSI operations, an application requires the same capability that can trivially be used to damage the integrity of a secure-boot system. And that, as Matthew went on to point out, is the point of CAP_COMPROMISE_KERNEL: to disable the truly dangerous operations (such as MSR writes) that CAP_SYS_RAWIO permits, while still allowing the less dangerous operations (such as the SCSI device operations).

All of this leads to a conundrum that was nicely summarized by Matthew. On the one hand, CAP_COMPROMISE_KERNEL is needed to address the problem that CAP_SYS_RAWIO has become too diffuse in its meaning. On the other hand, the addition of CAP_COMPROMISE_KERNEL checks in places where there were previously no capability checks in the kernel means that applications that drop all capabilities will break. There is no easy way out of this difficulty. As Peter noted: "We thus have a bunch of unpalatable choices, **all of which are wrong**".

Some possible resolutions of the conundrum were mentioned by Josh Boyer earlier in the thread: CAP_COMPROMISE_KERNEL could be treated as a "hidden" capability whose state could be modified only internally by the kernel. Alternatively, CAP_COMPROMISE_KERNEL might be specially treated, so that it can be dropped only by a capset() call that operates on that capability alone; in other words, if a capset() call specified dropping multiple capabilities, including CAP_COMPROMISE_KERNEL, the state of the other capabilities would be changed, but not the state of CAP_COMPROMISE_KERNEL. The problem with these approaches is that they special-case the treatment of CAP_COMPROMISE_KERNEL in a surprising way (and surprises in security-related APIs have a way of coming back to bite in the future). Furthermore, it may well be the case that analogous problems are encountered in the future with other capabilities; handling each of these as a special case would further add to the complexity of the capabilities API.
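
The second of those special cases can be made concrete with a small model. The sketch below is an assumption-laden illustration, not the real capset() interface: model_capset() and the MODEL_CAP_* values are hypothetical, and the function only models dropping capabilities (never raising them). It keeps CAP_COMPROMISE_KERNEL unless that capability is the only one being dropped by the call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's numbering. */
#define MODEL_CAP_RAWIO      (1u << 0)
#define MODEL_CAP_ADMIN      (1u << 1)
#define MODEL_CAP_COMPROMISE (1u << 2)
#define MODEL_CAP_ALL        (MODEL_CAP_RAWIO | MODEL_CAP_ADMIN | MODEL_CAP_COMPROMISE)

/* Returns the capability set that results from asking to keep `requested`
 * out of `current`. A bulk drop that happens to include the compromise
 * capability leaves it untouched; only a call that drops that capability
 * and nothing else actually clears it. */
uint32_t model_capset(uint32_t current, uint32_t requested)
{
    assert((current & ~MODEL_CAP_ALL) == 0);

    uint32_t result  = current & requested;   /* dropping only, no raising */
    uint32_t dropped = current & ~requested;  /* bits this call removes */

    if ((dropped & MODEL_CAP_COMPROMISE) && dropped != MODEL_CAP_COMPROMISE)
        result |= MODEL_CAP_COMPROMISE;       /* special case: keep it */

    return result;
}
```

Under these assumptions, a program that blindly drops everything ends up still holding the compromise capability, which is precisely the kind of surprising behavior the paragraph above warns about.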

The discussion in the thread touched on a number of other difficulties with capabilities. Part of the solution to the problem of the overly broad effect of CAP_SYS_RAWIO (and CAP_SYS_ADMIN) might be to split the capability into smaller pieces—replace one capability with several new capabilities that each govern a subset of the operations governed by the old capability. Each privileged operation in the kernel would then check to see whether the caller had either the old or the new privilege. This would allow old binaries to continue to work while allowing new binaries to employ the new, tighter capability. The risk with this approach is, as Casey Schaufler noted, the possibility of an explosion in the number of capabilities, which would further complicate administering capabilities for applications. Furthermore, splitting capabilities in this manner doesn't solve the particular problem that the CAP_COMPROMISE_KERNEL patches attempt to solve for CAP_SYS_RAWIO.
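
The "old or new" check described above might look something like the following sketch. The names are assumptions made for illustration: MODEL_CAP_SCSI_CMD is a hypothetical narrower capability, and the may_* helpers are invented here rather than taken from any kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CAP_RAWIO    (1u << 0)  /* the old, broad capability */
#define MODEL_CAP_SCSI_CMD (1u << 1)  /* hypothetical narrower capability */

/* True if the caller holds at least one capability in `mask`. */
static bool has_any(uint32_t caps, uint32_t mask)
{
    assert(mask != 0);
    return (caps & mask) != 0;
}

/* A split-off operation accepts either the old or the new capability,
 * so binaries that only know about the old one keep working. */
bool may_send_scsi_command(uint32_t caps)
{
    return has_any(caps, MODEL_CAP_RAWIO | MODEL_CAP_SCSI_CMD);
}

/* An operation that was not split off still demands the old capability. */
bool may_write_msr(uint32_t caps)
{
    return has_any(caps, MODEL_CAP_RAWIO);
}
```

A new binary holding only the narrower capability could issue SCSI commands without being able to write MSRs, while an old binary holding the broad capability would behave as before.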

Another general problem touched on by Casey is that capabilities still have not seen wide adoption as a replacement for set-user-ID and set-group-ID programs. But, as Peter noted, that may well be

in large part because a bunch of the capabilities are so close to equivalent to "superuser" that the distinction is meaningless... so why go through the hassle?

With 502 uses in the 3.9-rc2 kernel, CAP_SYS_ADMIN is the most egregious example of this problem. That problem itself would appear to spring from the Linux kernel development model: the decisions about which capabilities should govern new kernel features typically are made by individual developers in a largely decentralized and uncoordinated manner. Without having a coordinated big picture, many developers have adopted the seemingly safe choice, CAP_SYS_ADMIN. A related problem is that a number of capabilities turn out to allow escalation to full root privileges in certain circumstances. To some degree, this is probably unavoidable, and it doesn't diminish the fact that a well-designed capabilities scheme can be used to reduce the attack surface of applications.

One approach that might help solve the problem of overly broad capabilities is hierarchical capabilities. The idea, mentioned by Peter, is to split some capabilities in a fashion similar to the way that the root privilege was split into capabilities. Thus, for instance, CAP_SYS_RAWIO could become a hierarchical capability with sub-capabilities called (say) CAP_DANGEROUS and CAP_MOSTLY_HARMLESS. A process that gained or lost CAP_SYS_RAWIO would implicitly gain or lose both CAP_DANGEROUS and CAP_MOSTLY_HARMLESS, in the same way that transitions to and from an effective user ID of 0 grant and drop all capabilities. In addition, sub-capabilities could be raised and dropped independently of their "siblings" at the same hierarchical level. However, sub-capabilities are not a concept that currently exists in the kernel, and it's not clear whether the existing capabilities API could be tweaked in such a way that they could be implemented sanely. Digging deeper into that topic remains an open challenge.
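
One way to picture the hierarchical idea is as a bitmask in which the parent capability is simply the union of its children. The sketch below is purely illustrative and assumes a representation that does not exist in the kernel: the HCAP_* names and the hcap_* helpers are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sub-capabilities; the parent is the union of its children. */
#define HCAP_DANGEROUS       (1u << 0)
#define HCAP_MOSTLY_HARMLESS (1u << 1)
#define HCAP_SYS_RAWIO       (HCAP_DANGEROUS | HCAP_MOSTLY_HARMLESS)

/* Raising the parent raises every child; raising a child affects only it. */
uint32_t hcap_raise(uint32_t caps, uint32_t mask)
{
    return caps | mask;
}

/* Dropping the parent drops every child; dropping a child leaves siblings. */
uint32_t hcap_drop(uint32_t caps, uint32_t mask)
{
    return caps & ~mask;
}

/* True only if every bit in `mask` is held. An empty mask would be
 * vacuously satisfied, so it is rejected as a caller error. */
bool hcap_has(uint32_t caps, uint32_t mask)
{
    assert(mask != 0);
    return (caps & mask) == mask;
}
```

In this model, a process that gains the parent and then drops only the dangerous child keeps the harmless one but no longer counts as holding the parent. Whether the existing capabilities API could express such a scheme is, as noted above, unclear.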

The CAP_SYS_RAWIO discussion touched on a long list of difficulties in the current Linux capabilities implementation: capabilities whose range is too broad, the difficulties of splitting capabilities while maintaining binary compatibility (and, conversely, the administrative difficulties associated with defining too large a set of capabilities), the as-yet poor adoption of binaries with file capabilities vis-a-vis traditional set-user-ID binaries, and the (possible) need for an API for hierarchical capabilities. It would seem that capabilities still have a way to go before they can deliver on the promise of providing a manageable mechanism for providing discrete, non-elevatable privileges to applications.


The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 17:21 UTC (Wed) by cesarb (subscriber, #6266) [Link]

For the particular scenario mentioned: couldn't CAP_COMPROMISE_KERNEL be checked at open time (together with CAP_SYS_RAWIO) and set a flag in the kernel open file struct? That way, dropping capabilities after open would not cause any problems, and the application would keep working as it used to.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 17:47 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

The check would be if(!capable(CAP_SYS_RAWIO) || !capable(CAP_SYS_COMPROMISE_KERNEL)) return -EPERM, so anything that drops all capabilities other than CAP_SYS_RAWIO would still fail.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 19:49 UTC (Wed) by cesarb (subscriber, #6266) [Link]

No, the check would be "if (!capable(CAP_SYS_RAWIO)) return -EPERM;" followed by stashing the value of capable(CAP_SYS_COMPROMISE_KERNEL) somewhere. When writing, check the stashed value instead of capable(CAP_SYS_COMPROMISE_KERNEL) directly.
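
A minimal model of this stash-at-open idea, offered only as a sketch: the MODEL_CAP_* values, the may_write field, and the model_* functions are hypothetical names chosen for illustration and do not correspond to actual kernel structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit values for illustration; not the kernel's numbering. */
#define MODEL_CAP_RAWIO      (1u << 0)
#define MODEL_CAP_COMPROMISE (1u << 1)

struct msr_file {
    bool is_open;
    bool may_write;   /* stashed at open time, consulted on every write */
};

/* Open requires the raw I/O capability; the compromise capability is
 * only recorded, not required. */
int model_open(uint32_t caps, struct msr_file *f)
{
    assert(f != NULL);
    if (!(caps & MODEL_CAP_RAWIO))
        return -EPERM;
    f->is_open = true;
    f->may_write = (caps & MODEL_CAP_COMPROMISE) != 0;
    return 0;
}

/* Writes consult the stashed flag, so capabilities dropped after the
 * open no longer matter. */
int model_write(const struct msr_file *f)
{
    assert(f != NULL);
    if (!f->is_open)
        return -EBADF;
    return f->may_write ? 0 : -EPERM;
}
```

In this model, a task that holds both capabilities at open time can write regardless of what it drops later, but a task that holds only the raw I/O capability at open time can never write through that descriptor.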

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 19:59 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

Which would still result in any application that drops CAP_SYS_COMPROMISE_KERNEL before doing anything else being unable to write.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 20:11 UTC (Wed) by smurf (subscriber, #17840) [Link]

No, the check should be

if (reading && !capable(CAP_SYS_RAWIO)) return -EPERM;
if (writing && !capable(CAP_SYS_COMPROMISE_KERNEL)) return -EPERM;

when the device is opened, and anything subsequent checking the file's flags.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 20:20 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

Applications that drop CAP_SYS_COMPROMISE_KERNEL are still unable to write. How does this help?

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 21:25 UTC (Wed) by khim (subscriber, #9252) [Link]

An application that drops CAP_SYS_COMPROMISE_KERNEL will work just fine, because both checks happen in the open(2) syscall. This will break an application that opens the file for reading and writing but then issues only read commands. That can be fixed by changing the logic: if a read/write open(2) request is attempted without CAP_SYS_COMPROMISE_KERNEL, it is silently translated into a read-only open(2) request. Of course, an application that then tries to write to the file will see EBADF, which may crash it, but I'm not sure what can save such an application.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 21:38 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

The current state of things: an application that drops all capabilities other than CAP_SYS_RAWIO can read and write MSRs. The new state of things: an application that drops all capabilities other than CAP_SYS_RAWIO can no longer write MSRs. So, how does that fix anything?

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 22:40 UTC (Wed) by smurf (subscriber, #17840) [Link]

It fixes the "open the device and then drop privileges" usage.

It does not fix the "drop privileges and then open the device" case. I know that. But if userspace runs afoul of this problem, the fix is a simple re-ordering of two lines of code. Or retaining an additional capability.

Is there any real-world program that would run aground at this change, or is this an academic exercise?

Personally I never regarded CAP_SYS_RAWIO (or _ADMIN) as written in stone. It's a sufficiently broad catch-all category that a reasonable programmer should expect that requiring a different, or additional, capability for a task that used to work with only this one right might be in the cards.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 22:48 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

You've still broken userspace. It's still undesirable and it's still a fundamental problem with the capabilities interface. Similar changes in other places in the Fedora kernel have broken various bits of userspace in irritating ways.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 4:50 UTC (Thu) by heijo (guest, #88363) [Link]

The problem is that CAP_SYS_RAWIO, CAP_SYS_ADMIN, and possibly others used to be equivalent and imply the ability to arbitrarily alter the system.

Redefining those to no longer being able to do so is idiotic and breaks compatibility.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 5:04 UTC (Thu) by mjg59 (subscriber, #23239) [Link]

CAP_SYS_RAWIO is not equivalent to CAP_SYS_ADMIN. That's why they're not defined to the same value. It does not imply the ability to arbitrarily alter the system. That's *the entire point* of capabilities.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 19:52 UTC (Thu) by WolfWings (guest, #56790) [Link]

Actually I don't believe the suggested change (move the check for write-modes to the open()) breaks userspace. Not the 'silent downgrade to read-only' which is the entire problem right now and identical to the 'check on write()' problem.

Right now you don't appear to be able to drop-all-caps then open /dev/msr, you need to open it first then drop privs as there already is a check to block so much as reads unless you have the RAWIO cap.

So how does the "read = RAWIO, write = RAWIO && COMPROMISE" check on open() instead of on write() break userspace? Programs would be refused access to /dev/msr and complain about it, same as before, and their existing 'Check your caps!' error messages would still apply.

There's a difference between 'breaking' userspace in a way where existing apps' error messages don't apply, and where apps may not even have error-handling paths for the new issues, and simply enforcing stronger checks in a way compatible with existing error handling.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 20:03 UTC (Thu) by mjg59 (subscriber, #23239) [Link]

A program used to work. After a change, that program no longer works. A non-working program is broken. Breaking that program doesn't add any real additional security in the common (ie, non-Secure Boot) case, and so is undesirable.

The trouble with CAP_SYS_RAWIO

Posted Mar 17, 2013 15:46 UTC (Sun) by mrjk (subscriber, #48482) [Link]

With the suggested change there would be no program that used to work that would not work now that I can see. Every single current program that worked with dropping privileges after an open would still work the exact same way with caching the new capability at open time and using the cached value on those opened channels.

Can you give an example that would now break -- that wouldn't have broken already?

The trouble with CAP_SYS_RAWIO

Posted Mar 17, 2013 17:02 UTC (Sun) by mjg59 (subscriber, #23239) [Link]

Any application that drops all privileges other than CAP_SYS_RAWIO before attempting the open?

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 21:36 UTC (Wed) by kugel (subscriber, #70540) [Link]

Can the cap dropping be changed to happen in a way such that only those capabilities are dropped that existed when the program was compiled? That said, I know nothing about capability APIs or internals.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 21:49 UTC (Wed) by mjg59 (subscriber, #23239) [Link]

Imagine a /dev/halt_catch_fire that requires CAP_SYS_RAWIO. Your application runs as root but drops all capabilities, so you never bother making sure that it doesn't accidentally touch /dev/halt_catch_fire. Later, someone decides to add a more fine-grained capability and now either CAP_SYS_RAWIO *or* CAP_SYS_HALT_CATCH_FIRE is sufficient. Your application was previously incapable of setting the machine on fire, but now it can.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 21:39 UTC (Wed) by ebiederm (subscriber, #35028) [Link]

Just for fun I will suggest changing capable from:

bool capable(int cap)
{
    return ns_capable(&init_user_ns, cap);
}

to:

bool capable(int cap)
{
    if (we_dont_trust_root)
        return false;
    return ns_capable(&init_user_ns, cap);
}

Which is equivalent to running userspace outside the initial user namespace, and trivially gives you an environment that has been audited to work for an untrusted root.

Just a few more things won't work that way but I would not mind a little help flushing out the things that we can trust less than fully privileged users with doing.

As for msrs. Make no mistake someone will eventually implement rdmsr(HALT_AND_CATCH_FIRE). So I can't believe even reading msrs is safe.

The trouble with CAP_SYS_RAWIO

Posted Mar 13, 2013 22:10 UTC (Wed) by spender (subscriber, #23067) [Link]

Get busy fixing this trivial local root vulnerability you introduced in 3.8 first:
http://stealth.openwall.net/xSports/clown-newuser.c

-Brad

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 3:12 UTC (Thu) by shlevy (guest, #87221) [Link]

Yikes!!! I had to disable fs.protected_hardlinks, but I can confirm this exploit works... Has this been reported to the appropriate channels?

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 6:51 UTC (Thu) by kees (subscriber, #27264) [Link]

Yes, and this specific issue has already been fixed:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linu...

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 11:33 UTC (Thu) by spender (subscriber, #23067) [Link]

Key phrase being "this specific issue". Several vulnerabilities have already been introduced by the addition of unprivileged user namespaces. It speaks to the trustworthiness of the code that these were found through casual inspection of a few lines of code and a very dumb fuzzer (trinity) -- it has not been exposed to serious security auditing. The author of the above exploit said it was the easiest he has ever written.

You should also know that the existing kernel exploit payloads for granting root privilege also break out of user namespaces without modification.

So the end result is opening up more attack surface to the most vulnerable part of the system, and soon on all distros you will have no choice but to be exposed to it. It's just broken security design.

-Brad

The trouble with CAP_SYS_RAWIO

Posted Mar 19, 2013 1:57 UTC (Tue) by wahern (subscriber, #37304) [Link]

"So the end result is opening up more attack surface to the most vulnerable part of the system, and soon on all distros you will have no choice but to be exposed to it."

Unfortunately, "less code, simpler code" is not one of the competing security paradigms in Linux Land.

*BSD "securelevel"

Posted Mar 14, 2013 10:03 UTC (Thu) by ewen (subscriber, #4772) [Link]

Reading through this article reminded me of *BSDs "securelevel", which is a one-way ratchet change (ie having changed it, you can't change it back to a less secure level except by rebooting). It controls various "compromise the kernel" like things. The exact set of things it controls is probably not an ideal match, but the idea of a sysctl value which can only ever be changed to "be at least as restrictive of insecure things you can do as now" seems like a fairly good fit. And it would be completely orthogonal to the Linux capabilities, which seems helpful. (As well as being "system wide" which seems desirable in this case -- if you've booted via secure UEFI you probably don't want to end up in a situation where some processes can compromise the kernel and others cannot....)

Ewen
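
The one-way ratchet idea described in this comment can be sketched in a few lines. This is a simplified model under stated assumptions, not the BSD implementation: struct ratchet and ratchet_set() are hypothetical names, and the only rule modeled is that the level can never be lowered short of a reboot.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* System-wide restriction level; higher means more restrictive. */
struct ratchet {
    int level;
};

/* Accepts a change only if it keeps or raises the level. Lowering is
 * refused, so in this model the only way back is to start over from a
 * freshly initialized struct (the analogue of a reboot). */
int ratchet_set(struct ratchet *r, int new_level)
{
    assert(r != NULL);
    if (new_level < r->level)
        return -EPERM;
    r->level = new_level;
    return 0;
}
```

Because the level lives in one system-wide value rather than in each process's capability set, dropping or retaining capabilities in user space has no effect on it.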

*BSD "securelevel"

Posted Mar 14, 2013 11:40 UTC (Thu) by spender (subscriber, #23067) [Link]

Linux already attempts this a bit inconsistently. See /proc/sys/kernel/modules_disabled, which does not allow module loading to be re-enabled once disabled. Yet you can just run: https://grsecurity.net/~spender/msr32.c (before it required no capabilities, now it requires CAP_SYS_RAWIO, but neither matter) and re-enable the supposed securelevel-like value. There is no CVE for this (only for the adding of CAP_SYS_RAWIO) and no patch for the removal of modules_disabled on x86. There's no coherency to upstream security.

-Brad

*BSD "securelevel"

Posted Mar 14, 2013 16:39 UTC (Thu) by ThinkRob (subscriber, #64513) [Link]

At the risk of going slightly off-topic (though not much, given the matter at hand), what do you think of FreeBSD's approach to kernel security? I'd be willing to wager you're a fan of the fact that (it seems like) they're a bit more up-front with their security hole disclosure, but what about things like the "secure level" model, jails, capsicum, and other security-oriented features they've introduced -- do you think they're on the right track?

I know the full answer to this is probably not terribly succinct, but I'm just curious to hear what you think, since kernel security is obviously something you're quite passionate about. :)

*BSD "securelevel"

Posted Mar 20, 2013 9:13 UTC (Wed) by renox (subscriber, #23785) [Link]

Is capsicum actually used in FreeBSD?
AFAIK it's only useful iff programs are ported to use it..

*BSD "securelevel"

Posted Mar 14, 2013 18:35 UTC (Thu) by ewen (subscriber, #4772) [Link]

Which is an important point: if you're going to implement something like this, it needs to be sufficiently encompassing (in the single switch) that it doesn't allow a user (or attacker) to simply undo it by using an alternative mechanism to manipulate the value. The *BSD "securelevel" handles that by having a single switch that cuts off a set of things, including raw memory manipulation and module loading.

It sounds like the "compromise the kernel" flag is also aimed to cut off a more encompassing set of things, so hopefully it'd be less easy to do an end-run around the protection. (And there'd be more incentive to add "you can't do that either" into the set of things turned off as other ways to manipulate it are discovered: Firewire device DMA being one that comes to mind.)

Ewen

*BSD "securelevel"

Posted Mar 19, 2013 22:33 UTC (Tue) by wahern (subscriber, #37304) [Link]

There are enough issues w/ BSD securelevel, too, that it doesn't receive much interest these days.

For example, the immutable files protection can be bypassed by mounting over the directory. It doesn't allow you to change the original file, but allows you to fool other applications at runtime and is thus of little use for, e.g., preventing root kit installation once you've already attained root.

AFAIK nobody has bothered to fix it on systems where it was an issue (NetBSD was immune to this particular attack). The fundamental issue is that even this coarse-grained capabilities system gives a false sense of security. Invariably someone will forget about some corner case, or some new feature is added which allows circumvention of the whole pile of policies.

Fine-grained capabilities systems (both system-level and process-level) are just too brittle, including the policies, the mechanisms, and the actual implementations.

Unix systems have only just recently reached a decent level of correctness and reliability with basic file permissions. Anybody who relies on more sophisticated schemes (or allows them in their kernel) is just begging to be rooted.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 12:35 UTC (Thu) by paulj (subscriber, #341) [Link]

It seems pretty clear from this article that the root of this problem is that the semantic meaning of certain capabilities either was unclear at the start, or was allowed to degenerate from a precise meaning into a mess.

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 12:55 UTC (Thu) by paulj (subscriber, #341) [Link]

Oh, in other fields of computer science, changing semantics of protocols in incompatible ways is often handled through versioning.

I.e. the capability calls need a version flag, perhaps?

The trouble with CAP_SYS_RAWIO

Posted Mar 14, 2013 15:30 UTC (Thu) by paulj (subscriber, #341) [Link]

And looking into this, cap_user_header_t actually has a version field! So this problem may well be fixable without hacks or breakage, if the kernel developers start to consider that the semantic meaning of capabilities, as exposed to user-space, is versioned.

The trouble with CAP_SYS_RAWIO

Posted Mar 22, 2013 4:36 UTC (Fri) by kevinm (guest, #69913) [Link]

It still sounds to me like the simple solution is "remove CAP_SYS_RAWIO from the initial capability set on secure-booted kernels". So you'll lose the ability to perform some iffy SCSI commands - well, you signed up for some bondage and discipline when you asked for a locked-down, secure-booted kernel, didn't you?

The trouble with CAP_SYS_RAWIO

Posted Mar 22, 2013 7:12 UTC (Fri) by dlang (subscriber, #313) [Link]

the problem is that these 'iffy' scsi commands end up being things like the ability to burn CDs, not exactly easy things to do without.

Or in enterprise settings, commands to be able to manipulate tape changers

The trouble with CAP_SYS_RAWIO

Posted Mar 31, 2013 7:12 UTC (Sun) by Duncan (guest, #6647) [Link]

> the problem is that these 'iffy' scsi commands end up being things like the ability to burn CDs, not exactly easy things to do without.

Actually, that's rather less of a problem now, and trending less so, than it was a few years ago when /the/ major form of removable media was optical, CD/DVD. Nowadays, the sub-GB size of a CD looks positively diminutive, and even the near-5-GB size of a standard DVD looks small, compared to the ubiquitous USB thumbdrive of say 8+GB. A Bluray's 25 gigs is a bit better priced media-only (US$6 individual, just over $1/ea in 25-packs, pricewatch.com), but while the stick's a bit more expensive (USB flash: 32G=US$15, 16G=$10) as it ships with its own housing and read/write hardware, it's also CONSIDERABLY less fragile and generally more easily handled, AND direct-block-device read-writable (well, as seen by the OS...).

Additionally, with current inet and smartphone penetrations, people that a few years ago might have used dedicated removable media (either USB sticks or CD/DVD) these days more often either use the inet directly (streaming what might have been on CD a few years ago, or pastebinning it to a friend if not attaching it to an email), or if they do play local media, say in the car, it's from a jacked-in phone more often than a CD.

Unfortunately when I upgraded machines last year I didn't think of that, and bought a blu-ray burner for it. I really haven't used it... Fortunately, it's a USB-based one and wasn't /that/ expensive, so it's usable on my netbook as well should I decide to and not taking any power when it's unplugged (see kernel 3.9's new ZPODD, only I've had that in the form of an unplugged USB-based bluray for a few months now), and being USB, as long as I don't mistreat it it should stay usable for years, so I suppose I'll get some use out of it, over time. But I'd have been better off simply not buying it at all.

So it's considerably easier to do without optical burning than it was even just a few years ago, to the point where many people would miss it about as much as they do their 1.44 MB floppy...

The trouble with CAP_SYS_RAWIO

Posted Mar 22, 2013 23:32 UTC (Fri) by clemenstimpler (guest, #71914) [Link]

It takes a while before curious but uninformed readers learn what this article is about. "CAP_COMPROMISE_KERNEL (formerly known as CAP_SECURE_FIRMWARE) is a new capability designed for use in conjunction with UEFI secure boot, which is a mechanism to ensure that the kernel is booted from an on-disk representation that has not been modified" would have made a great first sentence for this article. It's also nice at the end of the second paragraph, but why not put it in the limelight of the first paragraph, where it rightly belongs? Just saying... ;)

The trouble with CAP_SYS_RAWIO

Posted Mar 25, 2013 13:16 UTC (Mon) by mkerrisk (subscriber, #1978) [Link]

> but why not put it in the limelight of the first paragraph, where it rightly belongs?

Reviewing the article, it's a smart editorial suggestion and I agree with you; and it would have made for a punchier start. Next time...


Copyright © 2013, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds