Such "complicated storage devices" are complicated precisely because they can be layered arbitrarily. Your "simple" example was ext4 directly on a local disk. Fine, but nothing prevents that ext4 from sitting on LVM, on md/RAID-0, on md/RAID-1, on network iSCSI, the sort of case mentioned in the article. Each layer eats additional stack space, and it's exactly this freedom to stack block devices (and, with them, the kernel stack space they consume) that makes them so useful in the first place. ext4 doesn't particularly care whether it's directly on disk, on LVM, on md/RAID, or on anything else, and that flexibility is taken to be a GOOD thing, one that a LOT of folks would object to losing.
OTOH, some of the new generation of filesystems, such as btrfs, commit "layering violations": they know about and account for, at least to some degree, what they're actually sitting on, and optionally include layers such as RAID directly in the filesystem itself. But these are still very new; btrfs is still labeled experimental and is not yet ready for production use.
Really, moving reclaim onto its own stack would seem to be the only reasonably permanent solution, because the root problem is that an arbitrary amount of stack has already been used before reclaim is called, and an arbitrary amount more is needed to guarantee that the call completes. Anything else is only a temporary fix that addresses individual special cases while ignoring the root of the problem.
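To put rough numbers on the argument, here is a toy model in Python. It is only a sketch: the stack limit, the per-layer costs, and the helper names (`writeback_depth`, `reclaim_fits`) are all made up for illustration and are not measurements or kernel interfaces. It just adds up what each layer would cost and checks whether reclaim still fits, first on the caller's stack and then on a fresh one.

```python
# Toy model only: every number below is a made-up placeholder, not a measurement.
STACK_LIMIT = 8192  # hypothetical per-thread kernel stack size, in bytes

# Hypothetical stack cost of pushing one write through each layer.
LAYER_COST = {
    "ext4": 1200,
    "lvm": 700,
    "md-raid0": 600,
    "md-raid1": 600,
    "iscsi": 1500,
}


def writeback_depth(layers):
    """Stack needed to push a page down through the given layers."""
    return sum(LAYER_COST[name] for name in layers)


def reclaim_fits(used_before, layers, own_stack=False):
    """Return True if reclaim writeback stays within STACK_LIMIT.

    used_before: bytes of stack already consumed when reclaim is entered.
    own_stack:   if True, reclaim starts on a fresh stack, so used_before
                 no longer counts against the limit.
    """
    start = 0 if own_stack else used_before
    return start + writeback_depth(layers) <= STACK_LIMIT


simple = ["ext4"]
layered = ["ext4", "lvm", "md-raid0", "md-raid1", "iscsi"]

print(reclaim_fits(4000, simple))                    # ext4 on local disk
print(reclaim_fits(4000, layered))                   # same caller, deeper stack of devices
print(reclaim_fits(4000, layered, own_stack=True))   # reclaim on its own stack
```

With these invented numbers the simple case fits, the layered case overflows when reclaim runs on the caller's stack, and it fits again once reclaim gets a stack of its own. In this model a separate stack removes only the "already used" term; the "needed after" term still depends on how deep the layering goes, but it can then be budgeted independently of whoever happened to trigger reclaim.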