
Shrinking the x86 stack

The kernel stack on x86 systems is two pages - 8KB - in length. This stack area exists for every process on the system; one can easily see that, in a system with a large number of processes, the amount of memory given over to stacks could get large. This memory is unpageable kernel memory; it also requires an "order one" (two contiguous pages) allocation for every new process. As memory becomes fragmented, multi-page allocations get harder to satisfy, and creation of new processes can fail. So there are plenty of reasons for wanting to reduce the size of the kernel stack.

Dave Hansen has posted a patch (originally by Ben LaHaise) which cuts the per-process kernel stack down to a single page. To accomplish that, this patch must do a few things:

  • One, of course, is to provide an option to use the smaller stack. Since there is a very real possibility of overflowing the reduced stack, this option will not be for everybody - at least, until all of the overflows have been found.

  • To help in finding those overflows, the patch includes a debugging option which uses the gcc profiling mechanism to regularly check the state of the stack. If it gets more than half full, a warning is emitted; should the stack overflow, the system will panic immediately. Or, almost immediately - it switches to an "overflow stack" first to give the panic code room to operate in.

  • Interrupt handling puts its own demands on the kernel stack. But handling of interrupts has nothing to do with any particular process, so there is no real need to use a per-process kernel stack. The patch thus sets up a separate, per-CPU stack which is used only for interrupts. Switching stacks when an interrupt happens is easy enough; the only tricky part is copying some information that the rest of the kernel expects to find on the stack - the preempt count and task pointer - when switching from one stack to another.

    Having a separate, per-CPU interrupt stack can also give a small boost to performance through better cache behavior.
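The half-full warning and overflow check described in the list above can be sketched in plain C. This is an illustration only, not code from the patch: `check_stack`, `STACK_SIZE`, the `reserved` parameter (space assumed to be taken at the bottom of the page by the per-task structure), and the enum names are all invented for the example. The real patch runs its check through the gcc profiling mechanism rather than through an ordinary function call.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size: one 4 KB page, as with the reduced stack. */
#define STACK_SIZE 4096u

/* Hypothetical result codes for the check. */
enum stack_state { STACK_OK, STACK_WARN, STACK_OVERFLOW };

/*
 * Classify a stack pointer value.
 *   base     - lowest address of the stack page
 *   sp       - current stack pointer (the stack grows downward)
 *   reserved - bytes at the bottom of the page that the stack must not touch
 */
static enum stack_state check_stack(uintptr_t base, uintptr_t sp,
                                    uintptr_t reserved)
{
    uintptr_t limit = base + reserved;        /* lowest usable stack address */
    uintptr_t usable = STACK_SIZE - reserved; /* bytes available to the stack */

    assert(reserved < STACK_SIZE);

    if (sp < limit)
        return STACK_OVERFLOW;                /* ran into the reserved area */
    if (sp - limit < usable / 2)
        return STACK_WARN;                    /* more than half full */
    return STACK_OK;
}
```

A caller built along these lines would log a warning on `STACK_WARN` and, on `STACK_OVERFLOW`, move to the overflow stack before panicking.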

This patch does not try to address the problem of kernel code which puts large variables on the stack. Heavy stack usage has always been considered poor form, but there are still kernel functions which do it. A smaller kernel stack would, undoubtedly, increase interest in fixing those functions.
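As a rough illustration of the kind of fix involved, the user-space sketch below contrasts a function that keeps a large scratch buffer in its stack frame with one that allocates the buffer dynamically. Everything here is hypothetical: the function names, the 2 KB `SCRATCH_SIZE`, and the use of `malloc()` as a stand-in for a kernel allocator are assumptions made for the example, not code from any kernel function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative size: half of a 4 KB stack in a single frame. */
#define SCRATCH_SIZE 2048u

/* Stack-hungry version: the whole scratch buffer lives in this frame. */
static long sum_via_stack(const unsigned char *src, size_t len)
{
    unsigned char scratch[SCRATCH_SIZE];
    long sum = 0;
    size_t i;

    assert(len <= SCRATCH_SIZE);
    memcpy(scratch, src, len);
    for (i = 0; i < len; i++)
        sum += scratch[i];
    return sum;
}

/* Heap version: the frame holds only a pointer, but allocation can fail. */
static long sum_via_heap(const unsigned char *src, size_t len)
{
    unsigned char *scratch;
    long sum = 0;
    size_t i;

    assert(len <= SCRATCH_SIZE);
    scratch = malloc(SCRATCH_SIZE);
    if (scratch == NULL)
        return -1;                /* caller must handle the failure */
    memcpy(scratch, src, len);
    for (i = 0; i < len; i++)
        sum += scratch[i];
    free(scratch);
    return sum;
}
```

The trade-off is the extra failure path: the heap version must cope with an allocation that does not succeed, which the stack version never has to consider.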

A variant of the smaller-stack patch has been circulated before, but Linus has not commented on it. It is not clear whether this patch, at this time, would pass the "feature freeze" test. The idea probably makes enough sense to be integrated at some point, however, whether in this development series or the next.



Shrinking the x86 stack

Posted Dec 12, 2002 19:06 UTC (Thu) by cpeterso (guest, #305) [Link]


How large is the kernel stack on non-x86 architectures? Does part of that 8 KB (or 4 KB) stack get used up for per-thread data structures?

Shrinking the x86 stack

Posted Dec 15, 2002 11:48 UTC (Sun) by rwmj (subscriber, #5474) [Link]

I can't comment on other architectures, but on the x86, the 8 KB (2 page)
area actually has two purposes. It contains the (fixed-size) task_struct
at the bottom end, and the kernel stack growing down from the top end.

On x86, the pseudo-variable "current" which points to the current process's
task_struct is actually an assembly macro which does (roughly)

%esp & ~(8KB - 1)

The observation is that since every process has an 8 KB block, aligned
to 8 KB, to find the address of task_struct, one just needs to round down
the stack pointer to the nearest 8 KB-aligned address.
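
The rounding-down step can be written out in C as a minimal sketch. The names `STACK_BLOCK_SIZE` and `block_base` are made up for illustration, and the kernel itself does this in assembly against the real stack pointer rather than on a value passed in as an argument.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size and alignment of the per-process block: two 4 KB pages. */
#define STACK_BLOCK_SIZE 8192u

/*
 * Round a stack pointer value down to the start of its aligned block.
 * Hypothetical helper, for illustration only.
 */
static uintptr_t block_base(uintptr_t sp)
{
    /* The mask trick only works when the size is a power of two. */
    assert((STACK_BLOCK_SIZE & (STACK_BLOCK_SIZE - 1)) == 0);

    /* Clear the low 13 bits: size - 1 is 0x1FFF for an 8 KB block. */
    return sp & ~((uintptr_t)STACK_BLOCK_SIZE - 1);
}
```

For example, a stack pointer of 0xC0123F5C maps to a block starting at 0xC0122000. The mask is the complement of the size minus one; complementing the size itself would clear only a single bit.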

This article didn't make clear what happened to task_struct. Is it still
there? Or somewhere else?

Shrinking the x86 stack

Posted Dec 15, 2002 11:53 UTC (Sun) by rwmj (subscriber, #5474) [Link]

To answer my own question: the patch does appear to keep the task structure
along with the reduced stack. Things also seem to have changed with regard
to the "current" variable, which seems to no longer exist (replaced by
GET_THREAD_INFO, I think).

Copyright © 2002, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds