A GCC -fstack-protector vulnerability on arm64
Dynamic allocations are just as susceptible to overflows as other locals. In fact, they're arguably more susceptible because they're almost always arrays, whereas fixed locals are often integers, pointers, or other types to which variable-length data is never written. GCC's own heuristics for when to use a stack guard reflect this.
Kees Cook, meanwhile, has pointed out that the kernel no longer uses variable-length arrays, so kernel builds should not be affected by this vulnerability.
Posted Sep 12, 2023 22:21 UTC (Tue)
by dvdeug (guest, #10998)
[Link] (16 responses)
Posted Sep 13, 2023 1:06 UTC (Wed)
by hmh (subscriber, #3838)
[Link]
Posted Sep 13, 2023 5:58 UTC (Wed)
by Vipketsh (guest, #134480)
[Link] (13 responses)
In conclusion: there is no case where VLAs are an advantage.
Posted Sep 13, 2023 7:59 UTC (Wed)
by PengZheng (subscriber, #108006)
[Link] (2 responses)
Large stack allocations are not advisable, especially on embedded systems with swap turned off.
If a VLA could be as large as the stack limit, it should not exist in the first place.
Suppose that we have a function call hierarchy A->B->C, and that each function uses a VLA whose size depends on various factors.
Posted Sep 13, 2023 15:01 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (1 responses)
(Even if the program is processing trusted input - e.g., it's part of something like 'make' whose purpose is running code anyway, and so there cannot really be security vulnerabilities - there still isn't a point in having it conditionally overrun the stack and crash. Just detect when the inputs are too big and conditionally throw an error at the beginning of the program. The effect for the end user is no worse, and probably a bit better really.)
This scenario only makes sense, I think, if you can somehow guarantee that when A makes a large VLA, B and C definitely will not, etc. But I'm having trouble thinking of how you'd end up with code like that. Most of the time, if you are processing large input in one function and call another, that second function is going to also process large input too, or at best process data of constant size. It isn't going to get smaller.
Maybe your logic sometimes does lots of work in B, and sometimes lots of work in C instead, but only in one or the other? But you can solve that by just creating a stack array in B (or A) and passing a pointer to it down to C, instead of doing another allocation in C. Pointers to stack variables remain valid as long as you're somewhere deeper on the stack.
Posted Sep 16, 2023 7:02 UTC (Sat)
by ssmith32 (subscriber, #72404)
[Link]
For most programs, having a very rare, badly-performing worst case is better than having a worst case that always occurs: one that isn't quite as bad, but is still somewhat worse than the common case of the "rarely very bad" approach.
See: the use of quicksort, O(n^2), versus mergesort, O(n log n).
Since quicksort is *usually* faster, it is often the better choice, despite having a much, much worse worst case.
In fact, if one always just allocated the worst case statically, there'd actually be no point for heap memory whatsoever - just allocate for the worst case, for anything.
>There is no benefit in converting a program that unconditionally overruns the stack to one that conditionally overruns the stack.
Yes, there is: if the condition is very rare, it is of course far better to have a program that only (rarely) conditionally overruns the stack than one that always overruns it. The only beneficiary of a program that always runs out of memory is someone selling memory (or someone trying to demonstrate to a customer who needs 99.9999% uptime that the worst case will crash on the given hardware, i.e. as a test program).
In fact, the space of programs where it's not preferred to rarely crash instead of always crash is rather small indeed, I would imagine.
Posted Sep 13, 2023 8:47 UTC (Wed)
by ballombe (subscriber, #9523)
[Link]
(An even longer time, before Linux existed, I tried to copy the screen 32k bytes frame buffer in the C stack. It did not end well.)
Stacks are a quite useful data structure, but for such data one should use a separate stack with a much larger hard limit.
Posted Sep 13, 2023 19:16 UTC (Wed)
by epa (subscriber, #39769)
[Link] (2 responses)
Sure, you might want to set a limit, to catch infinite recursion bugs or unexpectedly high stack memory usage, but that should be something you opt into with ‘ulimit’ or similar.
And if the stack really cannot exceed a few megabytes, the C compiler ought to be capable of allocating variable length arrays somewhere else (and freeing them as the stack unwinds). The classical C alloc() was effectively a stack, as you had to free memory in the reverse order of allocating it.
In kernel space I totally get why the stack size has to be strictly controlled.
Posted Sep 13, 2023 20:06 UTC (Wed)
by cmm (guest, #81305)
[Link]
In a multi-threaded process, stack space of any thread apart from the main one is limited because the stack has to be mmapped at a particular virtual address upon thread creation and cannot move.
Posted Sep 13, 2023 20:40 UTC (Wed)
by excors (subscriber, #95769)
[Link]
I guessed that might cause problems with setjmp/longjmp, which will restore the stack pointer (effectively undoing any stack allocations) but won't know how to deallocate any automatic heap allocations. But it turns out the C standard already says that any VLAs allocated between the setjmp and longjmp may be leaked; apparently the original Cray Research implementation of VLAs used the heap as a fallback when it ran out of stack space (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n317.pdf). So it's okay to use the heap and just tell programmers not to combine VLAs and longjmp.
Posted Sep 17, 2023 4:06 UTC (Sun)
by ianmcc (subscriber, #88379)
[Link] (5 responses)
Posted Sep 17, 2023 10:31 UTC (Sun)
by excors (subscriber, #95769)
[Link]
If you're developing for a single-application embedded environment, then I think it often *is* a good idea to calculate your worst-case heap usage and statically allocate that, so you can be sure the application will meet its specification and won't crash from resource exhaustion when given a valid input. Limited dynamicity can be had with statically-sized pool allocators, where higher-level code allocates a whole complex data structure at once. Allocation failure can then either be prevented (e.g. by verifying the resource requirements of a request before accepting it, or by applying backpressure to a message queue before it gets overloaded) or handled gracefully (unwinding the operation and returning a meaningful error to the user). This is in contrast to a global heap, which might fail in any one of many thousands of low-level std::vector/etc. operations, where it's practically impossible to recover except by crashing and restarting the whole application.
Posted Sep 18, 2023 6:11 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
And a typical system now has gigabytes of RAM.
Posted Sep 18, 2023 8:31 UTC (Mon)
by geert (subscriber, #98403)
[Link]
Posted Sep 18, 2023 9:13 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
Hmm. So 4 pages on typical builds. Is the stack still this small even with something like 64k-sized pages?
Posted Sep 18, 2023 9:57 UTC (Mon)
by geert (subscriber, #98403)
[Link]
Posted Sep 13, 2023 7:40 UTC (Wed)
by linusw (subscriber, #40300)
[Link]
Posted Sep 14, 2023 9:48 UTC (Thu)
by ibukanov (subscriber, #3942)
[Link] (1 responses)
I have always wondered why C could not just use something similar. Instead, the standard came up with a solution with no possibility of checking for stack overflow, making VLAs unusable in practice.
Posted Sep 14, 2023 13:02 UTC (Thu)
by khim (subscriber, #9252)
[Link]
The fact that you could do something doesn't mean that you should. At this point it's simply better to stop pretending that you are using the stack: use the heap where you know that's what you are doing, and don't have the compiler hide these gotchas from you. Back then it was the only sensible way to avoid use of the heap, and since Ada was unable to use the heap safely for decades, it was a valuable tool. But Ada eventually took ideas from Rust, and now safe heap use is possible. At that point, pretending to use the stack while actually using the heap simply stopped being a good idea.
https://stackoverflow.com/questions/14389525/linux-stack-...
If they use constant-sized arrays rather than VLAs as you suggested, then the stack usage is *unconditionally* as large as the sum of the sizes of those constant-sized arrays.
arch/powerpc/Kconfig- int "Thread shift" if EXPERT
arch/powerpc/Kconfig- range 13 15
arch/powerpc/Kconfig- default "15" if PPC_256K_PAGES
arch/powerpc/Kconfig- default "14" if PPC64
arch/powerpc/Kconfig- default "13"
arch/powerpc/Kconfig- help
arch/powerpc/Kconfig- Used to define the stack size. The default is almost always what you
arch/powerpc/Kconfig- want. Only change this if you know what you are doing.
> I always wondered why C cannot just use something similar?