Post-init read-only memory
The key to a successful exploit is often convincing the kernel to write to an unintended location. See, for example, this recent exploit, which uses a driver bug to overwrite a portion of the vDSO area; that, in turn, enables an attacker to run arbitrary code in kernel mode. One way to defend against such attacks is to minimize, to the greatest extent possible, the memory that the kernel is allowed to write to. A number of techniques, from simply marking data read-only to supervisor-mode access prevention, can be deployed toward that end. There is one class of data, identified by the grsecurity developers, that current techniques overlook, however.
When the kernel boots, it sets up a vast array of data structures describing the hardware it runs on and much more. In many cases, those data structures will never be changed again but, since they are resident in writable memory, they can still be changed by an errant write operation. The post-init read-only memory patch set, as posted by Kees Cook, allows these data structures to be marked with a special __read_only annotation. That will cause them to be placed into a separate ELF section (".data..read_only"). Once the kernel has finished the initialization process, all data found in that section will be marked read-only, never to be changed again. At that point, exploits like the vDSO overwrite linked above will no longer work.
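The "initialize, then seal" idea can be sketched in user space. What follows is an analogy only, not code from the patch set: the kernel changes page-table permissions for an entire ELF section, while this sketch puts one hypothetical structure (struct boot_config) in its own anonymous page and seals it with mprotect(). All of the names here are invented for illustration.

```c
/* User-space analogy only; the kernel patch works on a whole ELF section. */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical configuration that never changes once "boot" is done. */
struct boot_config {
    int nr_cpus;
    unsigned long mem_kb;
};

static struct boot_config *config;  /* lives alone in its own page */
static size_t config_len;

/* "Boot": allocate a writable page and fill it in. Returns 0 on success. */
static int config_init(int nr_cpus, unsigned long mem_kb)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;
    assert((size_t)page >= sizeof(struct boot_config));
    config_len = (size_t)page;
    p = mmap(NULL, config_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    config = p;
    config->nr_cpus = nr_cpus;
    config->mem_kb = mem_kb;
    return 0;
}

/* "End of init": drop write permission on the page. Returns 0 on success. */
static int config_seal(void)
{
    return mprotect(config, config_len, PROT_READ);
}

/*
 * Probe, in a child process, whether a write to the config faults.
 * Returns 1 if the child died with SIGSEGV, 0 if the write succeeded,
 * and -1 on any other outcome.
 */
static int config_write_faults(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        *(volatile int *)&config->nr_cpus = 0;  /* the errant write */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV)
        return 1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller would run config_init(), then config_seal(); after that, reads of config continue to work, while config_write_faults() reports that a write kills the offending process.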
This change seems like an obvious win: unchanging data is marked read-only, blocking known exploits and, perhaps, minimizing the impact of simple bugs as well. As an added bonus, read-only data will be kept together, leading to better cache behavior. It would appear to be an obvious candidate for merging in the near future. That will probably come to pass, but, first, an important question has to be answered: what should happen when the hardware catches an attempt by the kernel to write, after initialization, to memory that had been marked __read_only?
When things go wrong
This question matters because there is a potential hazard whenever a data structure is marked __read_only: the developer involved may have overlooked the one case where, after a rare sequence of events on days with a waxing gibbous moon, that data structure must be changed. Or there may be a case where data structures are modified unnecessarily, perhaps storing data that is already there anyway. Such cases work in current kernels, but would break if the data being written were made read-only. Mathias Krause described one such experience, wherein the system would fail during the resume sequence. As he noted: "Debugging that kind of problem is sort of a PITA, you could imagine."
The ideal solution would be to have the compiler catch attempts to modify __read_only data outside of the initialization sequence, but that is not currently possible. Simply marking the relevant data structures const will not work; those data structures are written to during boot and, as PaX Team pointed out, making them const opens the door to all kinds of surprising, optimization-related behavior from the compiler. Where compilers are involved, surprising behavior is rarely a good thing. As an alternative, Mathias suggested the use of a special-purpose GCC module to detect inappropriate writes. There seems to be agreement that this is a good idea, but no such module exists and it will take time to create one. Holding this patch set until a checker module can be created seems undesirable.
But without such a checker, there will almost certainly be situations where the kernel tries to write to something marked __read_only, either because it was so marked in error or as the result of some other bug. There have been a number of ideas put forward on how such problems could be handled.
The most obvious thing to do is to simply oops the kernel, with the usual results for the process that was running and, perhaps, the machine as a whole. Andy Lutomirski supported this approach, saying: "We failed, we might be under attack, let's oops." The problem with this approach, of course, is that it takes the machine out of commission, possibly with an error that is less than fun to try to track down. Ingo Molnar also worried that the oops information would, in most desktop cases, never be seen by the user and, as a result, would never be reported to developers. That highlights an old problem with presenting such information on desktop systems, but that problem is unlikely to be fixed right now.
The alternative to oopsing the system would be to log the error and somehow try to continue. Ingo suggested simply skipping over the offending instruction and trying to continue, but that idea did not go far; as PaX Team pointed out, simply dropping an intended write operation could create no end of strange problems further down the line and may actually help exploit attempts. Linus suggested, instead, that the kernel could mark the relevant page writable and retry the instruction. That would, of course, remove the read-only protection from that page, but it would allow the system to continue to operate while generating diagnostic information for developers. One would probably not want things to work this way on a production system, but it could be an invaluable option for developers.
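The "mark the page writable and retry" behavior has a rough user-space counterpart, sketched here under several assumptions. A SIGSEGV handler stands in for the kernel's page-fault handler, mprotect() stands in for the page-table update, and the names (ro_page, ro_setup(), ro_value()) are invented for this example. Note also that mprotect() is not on the POSIX list of async-signal-safe functions; calling it from a handler works on Linux because it is a plain system call, but that is an assumption of this sketch rather than a guarantee.

```c
/* User-space analogy of "log, make the page writable, retry the store". */
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static void *ro_page;                    /* the sealed page */
static size_t ro_len;
static volatile sig_atomic_t ro_faults;  /* number of writes caught */

/*
 * If the fault hit the sealed page: log it, restore write permission, and
 * return, which restarts the faulting store. For any other address, reset
 * the default action so that the restarted instruction kills the process.
 */
static void ro_fault_handler(int sig, siginfo_t *info, void *ctx)
{
    static const char msg[] = "write to read-only data; page made writable\n";
    uintptr_t addr = (uintptr_t)info->si_addr;
    uintptr_t start = (uintptr_t)ro_page;

    (void)ctx;
    if (ro_page && addr >= start && addr < start + ro_len &&
        mprotect(ro_page, ro_len, PROT_READ | PROT_WRITE) == 0) {
        ssize_t n = write(STDERR_FILENO, msg, sizeof(msg) - 1);
        (void)n;
        ro_faults++;
        return;
    }
    signal(sig, SIG_DFL);
}

/* Allocate one page, store an initial value, install the handler, seal. */
static int ro_setup(int initial)
{
    struct sigaction sa;
    long page = sysconf(_SC_PAGESIZE);
    void *p;

    if (page <= 0)
        return -1;
    ro_len = (size_t)page;
    p = mmap(NULL, ro_len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    ro_page = p;
    memcpy(ro_page, &initial, sizeof(initial));

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = ro_fault_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return -1;
    return mprotect(ro_page, ro_len, PROT_READ);
}

/* Accessor for the protected integer at the start of the page. */
static volatile int *ro_value(void)
{
    assert(ro_page != NULL);
    return (volatile int *)ro_page;
}
```

After ro_setup(), the first store through ro_value() faults once, is logged, and then completes; later stores go through silently because the page stays writable. That is the trade-off described above: the program keeps running and the developer gets a diagnostic, but the protection for that page is gone.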
The final piece of the puzzle might be to have a kernel command-line option to disable the read-only marking entirely. That would provide an option to users who run into a bug and need to be able to get their work done until a proper fix is available.
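As a rough illustration of how such a switch might be interpreted, the following user-space sketch scans a command-line string for a parameter. The parameter name "rodata=" and its "on"/"off" values are assumptions made for this example, as is the function name; the text above does not say what the option is called, and real kernel code would register a handler through the kernel's own parameter-parsing infrastructure rather than scanning the string by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Decide whether post-init read-only marking should be enabled, given a
 * space-separated command line. The "rodata=" name is hypothetical.
 * Protection defaults to on; unrecognized values are ignored so that a
 * typo cannot silently disable it. The last valid occurrence wins.
 */
static bool rodata_enabled_from_cmdline(const char *cmdline)
{
    static const char key[] = "rodata=";
    const size_t klen = sizeof(key) - 1;
    bool enabled = true;
    const char *p = cmdline;

    assert(cmdline != NULL);
    while (*p) {
        size_t len;

        p += strspn(p, " \t");       /* skip separators */
        len = strcspn(p, " \t");     /* length of this token */
        if (len > klen && strncmp(p, key, klen) == 0) {
            const char *val = p + klen;
            size_t vlen = len - klen;

            if (vlen == 3 && strncmp(val, "off", 3) == 0)
                enabled = false;
            else if (vlen == 2 && strncmp(val, "on", 2) == 0)
                enabled = true;
        }
        p += len;
    }
    return enabled;
}
```

Defaulting to "enabled" and ignoring malformed values keeps the hardened behavior unless the user asks, unambiguously, for it to be turned off.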
Kees has indicated that his current plan is to kill the machine by default. He has already implemented the command-line option, and said that Linus's "mark the page writable" suggestion would not be difficult to add. So the next version of the patch should address most of the concerns expressed so far. Getting it merged may prove to be the easy part, though; the task of identifying and marking truly read-only data could be a long and error-prone affair, even when starting with the work that the grsecurity developers have already done. The good news is that this work should make the kernel more secure, provide a (perhaps imperceptible) performance improvement, and turn up a few bugs along the way.
Index entries for this article | |
---|---|
Kernel | grsecurity |
Kernel | Security/Kernel hardening |
Security | Hardening |
Security | Linux kernel |
Posted Dec 3, 2015 2:46 UTC (Thu)
by spender (guest, #23067)
[Link] (1 responses)
The exploit linked to was just an example of one exploit for an educational kernel vulnerability created as part of a CTF. The linked blog links to another participant's exploit for the same vulnerability that would work regardless of the __read_only changes currently being discussed.
Of note however is that that exploit would be made more difficult (even in the absence of any other grsecurity/PaX features) by RANDSTRUCT. Both exploits also wouldn't work as-is solely due to USERCOPY (another grsecurity feature being discussed recently).
Finally, the initial source of the vulnerability, an overflow in a call to krealloc, is firmly in the class of vulnerabilities PaX's size_overflow GCC plugin was designed to prevent. So regardless of desired exploit method, catching the overflow and terminating the attacking process prevents the attacker from gaining the arbitrary read/write primitive via copy_*_user and thus prevents any exploitation of the vulnerability.
-Brad
Posted Dec 3, 2015 3:00 UTC (Thu)
by spender (guest, #23067)
[Link]
The proposed patches currently don't handle the use of __read_only in modules, they'll simply still be writable.
Grsecurity makes use of __read_only in many places that won't be possible with the reduced infrastructure proposed upstream. Specifically, we are able to use __read_only on data that is writable infrequently even after init (for instance, to protect important sysctl values, or LSM's security_ops struct). It's able to accomplish this on ARM, x86, and x64 through a feature of our KERNEXEC architecture that temporarily allows write access to read-only data for the current CPU in a race-free manner.
-Brad
Posted Dec 3, 2015 5:49 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
I don't have anything more to say.
Posted Dec 3, 2015 8:48 UTC (Thu)
by pabs (subscriber, #43278)
[Link]
Posted Dec 3, 2015 9:10 UTC (Thu)
by petur (guest, #73362)
[Link] (1 responses)
Posted Dec 4, 2015 9:53 UTC (Fri)
by NAR (subscriber, #1313)
[Link]
Posted Dec 3, 2015 16:03 UTC (Thu)
by fandingo (guest, #67019)
[Link]
Posted Dec 4, 2015 5:39 UTC (Fri)
by NCunningham (guest, #6457)
[Link]
All of this is a long way of saying perhaps there's value in making something more generic that could be used for security and incremental hibernation images and whatever else might be able to use it in the future?
Posted Dec 11, 2015 21:13 UTC (Fri)
by fratti (guest, #105722)
[Link] (1 responses)
Such a qualifier might make for either a nice GCC compiler extension or an addition to the next C language specification revision, since (if I'm not mistaken) such a functionality would solve this particular case. The "initialise once, keep around read-only for a long time" paradigm is probably present in a lot of software, so while any language revisions or GCC extensions might be too far away for this Linux patch set, a lot of C code could probably benefit from it.
Posted Dec 11, 2015 23:40 UTC (Fri)
by PaXTeam (guest, #24616)
[Link]
Posted Dec 11, 2015 22:22 UTC (Fri)
by ksandstr (guest, #60862)
[Link] (2 responses)
So the question is: what measures are there for __read_only sections that prevent the compiler from writing the memory willy-nilly? Presumably it's not marked volatile for its performance cost.
[0] wrt TLBs in particular
Posted Dec 11, 2015 23:05 UTC (Fri)
by PaXTeam (guest, #24616)
[Link] (1 responses)
Posted Dec 12, 2015 1:04 UTC (Sat)
by ksandstr (guest, #60862)
[Link]