Increasing the range of address-space layout randomization
Address space layout randomization (ASLR) introduces some randomness into the locations where a process's code and data segments are placed within that process's address space. Two processes running the same program should, under ASLR, see that program placed in different virtual locations. The Linux kernel first implemented ASLR back in 2005. Even then, that implementation was seen as being, at best, a partial solution to the problem.
ASLR works by calculating a random offset to be added to the return value from every mmap() call. Since mmap() is used to place most data and code segments, that causes this offset to apply to most of the process's address space (one exception is the stack, which has a separate offset). On 32-bit systems, the offset is an eight-bit value; the offset is interpreted in pages, so, on these systems, memory areas from mmap() will have a random offset between zero and 255 pages. Note that the lowest bits of the address of any mapping are not random, since mappings are page-aligned.
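The calculation is conceptually simple. The following sketch (simplified, with illustrative names; it is not the kernel's actual per-architecture code) shows how a handful of random bits become a page-aligned offset:

```c
/*
 * A simplified sketch, not the kernel's actual code: how a few random
 * bits can become a page-aligned offset that is then folded into the
 * address at which mmap() places new mappings.  The names here are
 * illustrative only.
 */
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12			/* 4KB pages */

static unsigned int mmap_rnd_bits = 8;	/* the 32-bit x86 value */

static uintptr_t mmap_random_offset(void)
{
	/* Take mmap_rnd_bits bits of randomness... */
	uintptr_t rnd = (uintptr_t)random() & ((1UL << mmap_rnd_bits) - 1);

	/* ...and shift them past the page-offset bits, so the result is a
	 * whole number of pages and mappings stay page-aligned. */
	return rnd << PAGE_SHIFT;
}

static uintptr_t randomize_mmap_base(uintptr_t base)
{
	/* With eight bits, the base moves by somewhere between zero and
	 * 255 pages; the low twelve bits of any mapping never change. */
	return base + mmap_random_offset();
}
```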
This offset will be sufficient to frustrate (almost all of the time) a simple exploit that tries a single, hard-coded address. It is somewhat less effective, though, if the exploit is able to retry with varying guesses of its own. Only 256 attempts are required to explore the entire range of potential offsets; an exploit running locally can probably work through the entire set in well under a second, especially if it is able to avoid crashing the target process in its attempts. ASLR thus does not provide a great deal of additional security against such threats.
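Some quick arithmetic shows just how small that search space is. Using the simple brute-force probability (the chance of success after x distinct guesses out of 2^N equally likely offsets is x/2^N), a few lines of C are enough to compare eight bits with larger offsets; this is purely an illustration:

```c
/*
 * Illustration only: how the brute-force search space grows with the
 * number of random offset bits.  P(x) = x / 2^N is the chance of having
 * found the right offset after x distinct guesses.
 */
#include <stdio.h>
#include <math.h>

int main(void)
{
	const unsigned int bits[] = { 8, 16, 24, 28 };
	const double tries = 256;	/* one full sweep of the 8-bit space */

	for (unsigned int i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
		double space = pow(2.0, bits[i]);	/* possible offsets */
		double p = tries / space;

		printf("%2u bits: %10.0f possible offsets, "
		       "P(success after 256 tries) = %g\n",
		       bits[i], space, p > 1.0 ? 1.0 : p);
	}
	return 0;
}
```

With eight bits, 256 tries cover the whole space; with 28 bits, the same 256 tries succeed with a probability of less than one in a million.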
The problem is that there is only so much address space to play with, especially on 32-bit systems. Shifting the heap and stack areas around will reduce the space between them (traditional layout schemes having been designed to maximize that space); that, in turn, can cause programs needing to make huge allocations to fail. Increasing the amount of randomness in the layout offset might not cause problems on a lot of systems, but a mechanism that goes into everybody's kernel has to be implemented conservatively. That is why 32-bit systems are stuck with an eight-bit value (64-bit systems have more space to play with, so a 28-bit offset is used on x86-64, for example).
One might think that 32-bit systems are on the way out, but there are still a lot of them, including a lot of Android devices. When the "Stagefright" vulnerability, which affects Android systems, was disclosed in July, Mark Brand at Google set out to write an exploit for it. He duly found that ASLR was not a significant obstacle in the face of a brute-force attack; as he put it: "I knew that ASLR on 32-bit was always a bit shaky; but I didn’t think it was this broken."

An easily exploitable vulnerability on vast numbers of Android devices is a bit of a nightmare scenario; it would be nice if the kernel had ways to mitigate exploits in such situations. ASLR, in its current form, turns out not to be one of those ways.
The interesting thing about Android, though, is that it is a relatively controlled environment. It may not always need the same level of generality that a kernel-for-everybody must provide. If it is known that Android systems will not be making huge allocations (and that should be fairly well known), then it may well be safe to increase the randomness of the ASLR mechanism, making brute-force attacks harder to carry out. But current kernels offer no way to increase the amount of randomness used with ASLR.
It would not be out of character for the Android developers to simply patch a higher degree of randomness into the kernel shipped with the Android distribution. But, sometimes, they try to get a general solution into the upstream kernel instead. That is what is happening with Daniel Cashman's patch to provide a customizable random offset range for ASLR.
This patch set replaces the compile-time ASLR offset range with a pair of new sysctl knobs: /proc/sys/vm/mmap_rnd_bits and /proc/sys/vm/mmap_rnd_compat_bits. The first covers normal processes, while the second applies to processes running in compatibility mode (32-bit processes running on a 64-bit kernel, for example). The default values match those used by current kernels, so users will see no change unless they (or their distributor) explicitly request one.
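Since the knobs are ordinary sysctl files, no special interface is needed to use them. A minimal sketch, assuming a kernel carrying this patch, sufficient privileges, and a value the architecture will accept, might look like the following (running "sysctl -w vm.mmap_rnd_bits=16" would have the same effect):

```c
/*
 * Minimal sketch: read the current ASLR offset width and raise it.
 * Assumes a kernel with the patch applied (the file is absent
 * otherwise), sufficient privileges, and a value within the range the
 * architecture accepts; out-of-range values will simply be rejected.
 */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/vm/mmap_rnd_bits", "r+");
	int bits;

	if (!f) {
		perror("/proc/sys/vm/mmap_rnd_bits");
		return 1;
	}
	if (fscanf(f, "%d", &bits) == 1)
		printf("current mmap_rnd_bits: %d\n", bits);

	rewind(f);		/* required between the read and the write */
	fprintf(f, "16\n");	/* 16 is the 32-bit x86 maximum */
	fclose(f);
	return 0;
}
```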
As the names suggest, these knobs control the number of address-space bits used for the ASLR random offset. Each architecture sets minimum and maximum values that make sense, given the available address space. On x86, the value may range between eight and 16 bits (for 32-bit) or 28 and 32 (for 64-bit). The limits for ARM are rather more complicated, depending on the specific subarchitecture and the page size in use. In all cases, though, this patch makes it possible to increase the number of bits used for the random offset. Each additional bit doubles the space that must be searched, making an exploit slower and more likely to be detected. If the Android developers are able to use this feature to increase ASLR randomness, the next Stagefright-like vulnerability will, hopefully, be harder to exploit.
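Conceptually, the range checking amounts to something like the sketch below; the constant names and the handler are illustrative rather than taken from the patch (which wires its check into the sysctl machinery), and the limits shown are the 32-bit x86 values given above:

```c
/*
 * Conceptual sketch of the per-architecture range check: a new value is
 * accepted only if it lies between the architecture's minimum and
 * maximum number of offset bits; otherwise the old setting is kept.
 * The constants are the 32-bit x86 limits mentioned in the article.
 */
#define ARCH_MMAP_RND_BITS_MIN	8
#define ARCH_MMAP_RND_BITS_MAX	16

static int set_mmap_rnd_bits(unsigned int *mmap_rnd_bits, int requested)
{
	if (requested < ARCH_MMAP_RND_BITS_MIN ||
	    requested > ARCH_MMAP_RND_BITS_MAX)
		return -1;	/* rejected; the old value stays in place */

	*mmap_rnd_bits = requested;
	return 0;
}
```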
This patch set has been through a number of revisions (six thus far, with at least one more expected) based on comments. Those comments, though, are concerned with how the patch can be improved; there does not seem to be any real discomfort with the idea in general. So Linux kernels could offer a wider range of address-space layout randomization in the relatively near future.
| Index entries for this article | |
| --- | --- |
| Kernel | Security/Address-space layout randomization |
| Security | Linux kernel/Address-space layout randomization |
Posted Dec 17, 2015 3:36 UTC (Thu)
by spender (guest, #23067)
[Link] (10 responses)
-Brad
Posted Dec 17, 2015 14:36 UTC (Thu)
by corbet (editor, #1)
[Link] (4 responses)
Sorry, Brad. I tend to write about stuff that has been posted on the mailing lists and is intended for the mainline kernel. Deliberately out-of-tree patch sets get less attention. Especially when the maintainers of that patch set respond most scornfully when I suggest it would be good to get at least parts of it upstream.

The discussion on this particular patch set has been going on for a while; I don't recall you having been a part of it. It is certainly your right to remain aloof from the discussion, but there are consequences to that; in particular, it means you tend not to be a part of the discussion. I feel like you're asking me to take your part in a discussion you chose to remain outside of, and that seems a little strange.
Posted Dec 17, 2015 16:15 UTC (Thu)
by spender (guest, #23067)
[Link] (3 responses)
If at any point in the past 15 years upstream had its own useful security ideas, by all means feel free to leave us out of articles discussing those topics, but given that that's clearly not been the case, and especially now with the essential fire sale of grsecurity/PaX features being ripped off by people who self-declare as having no real security or kernel development experience, you're essentially contributing to the rewriting of history and playing into the hands of the corporations that post the WP article want to be the ones recognized for making Linux secure (of course, with no original ideas or implementations of their own).
It's basic journalistic integrity and human decency, respect for the time and effort of others by acknowledging the work and ideas regardless of whether you personally like the person you're ripping off, but it's good that you've laid bare your obvious bias and agenda here for everyone to see. It's clear from your other articles (e.g. https://lwn.net/Articles/256389/, https://lwn.net/Articles/345076/, https://lwn.net/Articles/350100/, and many others) that you have no problem mentioning other "out-of-tree" projects that you don't have a personal vendetta against. It's not informing your audience and it's not journalism.
PS: scornful and truthful are mutually exclusive.
-Brad
Posted Dec 17, 2015 16:23 UTC (Thu)
by andresfreund (subscriber, #69562)
[Link] (1 responses)
Do you actually ever read what you write? What comes across as scornful is a significant portion of your messages, including the one I'm replying to, here and elsewhere. At some point that starts to rub off on the people conversing with you.
Posted Dec 19, 2015 15:57 UTC (Sat)
by nix (subscriber, #2304)
[Link]
I note that in the past Brad et al have attacked the upstream kernel for not including their stuff, and puffed said stuff at every opportunity. Now that there's an attempt actually being made to get some of their stuff in, their response is... to attack the people doing the work, call them incompetent, and claim they are ripping work off. It appears there is *nothing* you can do which won't make Brad et al unhappy with you.
Posted Dec 18, 2015 0:15 UTC (Fri)
by corbet (editor, #1)
[Link]
Well, I have never once in my life claimed to be a journalist, FWIW.

Look, Brad, I think you have done a lot of good work. I certainly have no "personal vendetta" against it. Don't go paranoid on me, that won't help anybody.
I realize that trying to get work into the mainline can be frustrating and infuriating, and that holds doubly true for the kind of stuff that you do. It has taken a very long time for attitudes in the kernel community to catch up to our security problems, and the jury is still out on just how much that has happened even now.
Still, if you want to be a part of the kernel community, you need to be a part of the kernel community. If you work to get your code included just like everybody else does, there should be no issues around credit. If you sit on the sidelines and restrict access to your patches, you will be on the sidelines. Even then, when work clearly derives from what you have done (as with the post-init read-only stuff) the developers involved should credit their sources, and I will certainly try to mirror that. If this simple ASLR patch derives from your work, the author did not say so.
Honestly, Brad, I wish you would engage a bit more in useful forums; LWN, for all that I put into it, may not be the best place to actually get change effected. You have a lot to offer, but, if you want to come in from the cold, I don't think you can really expect me to just make that happen for you.
Posted Dec 17, 2015 16:39 UTC (Thu)
by MattJD (subscriber, #91390)
[Link] (4 responses)
Reading through that document suggests you guys improved on what the kernel does by allowing different amounts of randomization over different parts of the program. That seems like a good idea to move back upstream as well. However, you then mention that by using ASLR you run the risk of exhausting the available virtual memory of a process, which this article mentions as the reason why upstream only does 8 bits of randomization. Are you guys doing something else to minimize this problem? If I increase the number of bits used with upstream's implementation, would it run the same risks, or does your system decrease them further than upstream with a comparable number of bits?
Also, when you talk about probability you use the following formulas:
The probabilities of success within x number of attempts are given by the following formulae (for guessing and brute forcing, respectively):

(1) Pg(x) = 1 - (1 - 2^-N)^x, 0 <= x
(2) Pb(x) = x / 2^N, 0 <= x <= 2^N

where N = Rs-As + Rm-Am + Rx-Ax, the number of randomized bits to find.

(from the above link)
Except, do I have to break all the ASLR (thus use the value of N) for all attacks? For instance, if I'm doing ROP I just need the executable layout as I would only use the stack in a relative fashion and I don't need to care about the data layout. Or once I have ROP working I can just read the mappings anyway.
Furthermore, wouldn't the probability for a section be R/A, not R-A? If I have 8 bits of randomness but I can attack 2 bits at a time, I can get the answer in 2^4 guesses since I can attack a pair of bits at a time. Using your formula, I'd have 2^6 guesses. I'm not sure why I'd have those extra bits to find. And furthermore, if A is 1, I have to find all of R, not R-1.
I'm trying to avoid an NIH mindset and thus understand your system. Since PaX has been around for longer, I wouldn't be surprised if you have answers to these questions already.
Posted Dec 17, 2015 19:55 UTC (Thu)
by thestinger (guest, #91827)
[Link]
I doubt it's up-to-date, it probably dates back to the 2003 era. You're better off looking at the source code. There's newer documentation but it's mostly at https://forums.grsecurity.net/viewforum.php?f=7&sid=0... and doesn't cover those old portions.
Posted Dec 20, 2015 4:12 UTC (Sun)
by PaXTeam (guest, #24616)
[Link] (2 responses)
if you go back to the doc homepage (https://pax.grsecurity.net/docs/) you'll see its last modification time, so no, it's not exactly up-to-date as far as implementation details go but the design hasn't changed since 2001.
> Reading through that document suggest you guys improved on what the kernel does by allowing different amounts of randomization over different parts of the program.
i didn't improve anyone's ASLR, i invented the whole thing ;).
> However, you then mention by using ASLR you run the risk of exhausting the available virtual memory to a process[...]
i don't think i mentioned address space exhaustion anywhere, perhaps you're thinking of entropy pool exhaustion? what (carelessly implemented) ASLR can affect is address space *fragmentation* which is why PaX settles on per-region randomization only to minimize it (another reason is that in practice an attacker only ever really needs to know a single library's address, so there's no point in randomizing per-mmap). that said, fragmentation can cause similar effects to exhaustion (failing mmap requests) but they're different problems.
> [...]Except, do I have to break all the ASLR (thus use the value of N) for all attacks?
it always depends on the situation, this is what the A* bits represent. ad absurdum, you don't need to guess at all because you can learn the needed addresses a priori, or don't need fixed addresses at all, etc.
> For instance, if I'm doing ROP I just need the executable layout as I would only use the stack in a relative fashion and I don't need to care about the data layout.
depends on what 'doing ROP' means. if you mean some full-blown Turing-complete computation then it's not enough to learn the code layout, you obviously have to be able to pass data to the code which means learning data addresses which may be in different regions than the ROP gadgets you found. e.g., if you take a classical stack based buffer overflow and want to call system() then you'll need to learn both stack and mmap addresses (and for general ROP, also the content at code addresses).
> Or once I have ROP working I can just read the mappings anyway.
it depends on what you can do with 'ROP working', whether your gadgets are powerful enough, whether you can access /proc/self/maps or scavenge memory for addresses to other regions, etc.
> Furthermore, wouldn't the probability for a section be R/A, not R-A?
no, it's as written, R-A. perhaps the 'attack the bits' phrase isn't as clear as i intended it, but the example i gave (it's called heap spraying these days) helps understand it: if you can control enough contiguous memory then you can effectively 'erode' the entropy in the lower bits of the randomized addresses. when R=A then you basically control a large enough region so that you can just choose a fixed address and be assured that it'll be mapped with your data during the attack (the maths is then R-A=0 -> 2^0=1 -> one shot exploit, no need to guess anything).
Posted Dec 20, 2015 5:40 UTC (Sun)
by MattJD (subscriber, #91390)
[Link] (1 responses)
>i didn't improve anyone's ASLR, i invented the whole thing ;).
Sorry, I meant to imply how it is better than upstream's implementation. I realize you guys had ASLR first; it wasn't my intention to imply otherwise.
> i don't think i mentioned address space exhaustion anywhere, perhaps you're thinking of entropy pool exhaustion? what (carelessly implemented) ASLR can affect is address space *fragmentation* [...]
I meant address space fragmentation, which you discussed. So, in your experience, you find that by dividing the sections you get a bigger bang for your buck (since you get at least 16 bits of randomness per section, compared to upstream's 8). Is there anything else you do to avoid address space fragmentation? Or does this limit ASLR's effect on fragmentation enough that it is equivalent to running without it?
> depends on what 'doing ROP' means. if you mean some full-blown Turing-complete computation then it's not enough to learn the code layout[...]
Ok, I can see where you are coming from. Basically you are giving the best case scenario, which is the general case. Specific exploits *may* reduce this, depending upon the details.
When you mention that for ROP you need to know the stack address, is that for the case where you don't have a stack overflow that takes over a return instruction? As I understand ROP, isn't that the usual case? And once I get control of it, isn't that enough to execute my code (with your mentioned limitation regarding data/mmap addresses)? Or am I missing something about how ROP works? I do understand why the other addresses matter.
> no, it's as written, R-A. perhaps the 'attack the bits' phrase isn't as clear as i intended it[...]
Ok, that makes sense then. That phrase did confuse me, but your comment clears that up. If I get time to take a whack at it, would a patch against the document be welcome? No guarantees on when I might surface with that, though.
Also, are there any other benefits to your ASLR implementation versus upstream's? That was why I was originally looking for a design document. I recognize I know only a little compared to your team's expertise, so I apologize if I missed any differences, especially if they were mentioned in the document. I'm just trying to understand the potential issues in implementing ASLR.
Posted Dec 20, 2015 13:01 UTC (Sun)
by PaXTeam (guest, #24616)
[Link]
> Specific exploits *may* reduce this, depending upon the details.
yes, i tend to describe the limits of any given defense mechanism, in case of ASLR it's obvious what the lower limit is, the less obvious case is the upper limit (of the entropy a given attack may have to overcome). any real life case will fall somewhere in-between and let's not forget about the attack detection/reaction mechanism that is also part of the ASLR design (which bounds the probability of success) that is mostly omitted in other implementations.
> When you mention for ROP you need to know the stack address, is that for the case where you don't have a stack overflow that takes over a return instruction?
well, when i speak of ROP, i actually mean 'execute existing code out of original program order' (see pax.txt about the threat model) which these days is called 'code reuse attack', i.e., it's the general case, not a specific one for just controlling a return address. in any case, when you think of the stack buffer overflow case, to pass a pointer parameter to a function where the (attacker-provided) data is also on the stack somewhere then you somehow need to learn that stack address. whether that requires learning the stack entropy or just offsetting a register via gadgets depends on the situation.
> If I get time to take a wack at it, would a patch against the document be welcome?
sure but i'd appreciate an update on the implementation details even more ;).
> Also, is there any other benefits to your ASLR implementation versus upstream's?
if there wasn't any, i wouldn't still be maintaining my code after all these years ;). one obvious benefit is the much more concise and understandable implementation, there're basically 4 per-arch constants (PAX_DELTA_*) that will tell you what region gets randomized by how much, no need for reimplementing the effectively same helper functions for each arch as vanilla does. there're also finer details like handling corner cases on address space exhaustion or randomizing PIEs or disallowing the runtime control of ASLR (PF_RANDOMIZE) or safer brk randomization, etc.
Posted Dec 17, 2015 12:29 UTC (Thu)
by wodny (subscriber, #73045)
[Link] (2 responses)
[1] http://wenke.gtisc.gatech.edu/papers/morula.pdf
[2] https://securityintelligence.com/one-class-to-rule-them-a...
Posted Dec 17, 2015 19:57 UTC (Thu)
by thestinger (guest, #91827)
[Link] (1 responses)
Posted Dec 17, 2015 23:03 UTC (Thu)
by wodny (subscriber, #73045)
[Link]
Posted Dec 17, 2015 19:41 UTC (Thu)
by mageta (subscriber, #89696)
[Link] (1 responses)
Posted Mar 14, 2016 15:30 UTC (Mon)
by david.a.wheeler (subscriber, #72896)
[Link]
Kees Cook proposed working on Linux kernel hardening in Nov 2015: https://lwn.net/Articles/663361/ and pointed to a wiki on the "Kernel Self Protection Project" at http://kernsec.org/wiki/index.php/Kernel_Self_Protection_... - that site states, "These kinds of protections have existed for years in PaX, grsecurity, and piles of academic papers. For various social, cultural, and technical reasons, they have not made their way into the upstream kernel, and this project seeks to change that."
kernel-hardening
http://www.openwall.com/lists/kernel-hardening/