Each process has a stack space allocated for running in kernel mode.
Having a 4K stack means that it only requires one page of memory per
process instead of two.
The difficulty from a kernel developer point of view is insuring that
no code path ever overruns the one page of stack.
The argument against it is that for most workloads, no process usually needs
more than 4K of stack, ie wasted, un-used space.