On the proper use of vmalloc()
Erik Jacobson recently found the limits of kmalloc() while querying /proc/interrupts on a very large system. The code implementing /proc/interrupts attempts to allocate a buffer for its output; the size of that buffer depends on the number of processors on the system. On big systems, the required buffer is large and the allocation fails. So Erik submitted a fix that uses vmalloc() to allocate the memory instead.
Linus didn't like it. He pointed out that the seq_file interface should be used instead. Indeed, /proc/interrupts fits naturally into the sort of output seq_file is intended to create, and doing things that way can eliminate the need to allocate a large buffer at all. But Linus also clarified his thoughts on when vmalloc() should be used:
That should be sufficiently clear for most readers; perhaps an entry on vmalloc() needs to be added to the coding style document.
There are a few reasons for this stance. Every call to vmalloc()
requires page table changes and translation lookaside buffer (TLB)
flushes, so it is slow. Memory obtained from vmalloc() lies outside of
the regular kernel address range, which is (on most architectures)
mapped with large page table entries, so extra TLB slots are required
to access it. And, on many architectures, the amount of virtual address
space set aside for vmalloc() is relatively small. For all of these
reasons, use of vmalloc() is discouraged, and patches containing
vmalloc() calls are increasingly unlikely to make it into the
kernel.
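
To make the seq_file suggestion more concrete, here is a minimal user-space sketch of the underlying idea: produce the output one record at a time, so that no buffer sized for the whole table is ever allocated. It is not kernel code and does not use the real seq_file API (which is built around a struct seq_operations with start, next, stop, and show callbacks). The struct irq_table type and the irq_start(), irq_next(), irq_show(), and irq_dump() functions are hypothetical names invented for this illustration, and the output format is simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical table of interrupt counts: nirqs rows, ncpus columns. */
struct irq_table {
    size_t nirqs;
    size_t ncpus;
    const unsigned long *counts;    /* row-major, nirqs * ncpus entries */
};

/* "start": return the position if it names a valid row, NULL otherwise. */
static size_t *irq_start(const struct irq_table *t, size_t *pos)
{
    return *pos < t->nirqs ? pos : NULL;
}

/* "next": advance to the following row; NULL once the table is exhausted. */
static size_t *irq_next(const struct irq_table *t, size_t *pos)
{
    ++*pos;
    return irq_start(t, pos);
}

/* "show": emit a single row. Returns bytes written, or -1 on error. */
static long irq_show(FILE *out, const struct irq_table *t, size_t row)
{
    long total;
    int n = fprintf(out, "%zu:", row);

    if (n < 0)
        return -1;
    total = n;
    for (size_t cpu = 0; cpu < t->ncpus; cpu++) {
        n = fprintf(out, " %lu", t->counts[row * t->ncpus + cpu]);
        if (n < 0)
            return -1;
        total += n;
    }
    if (fputc('\n', out) == EOF)
        return -1;
    return total + 1;
}

/* Driver loop: walks the table row by row, never holding more than one. */
static long irq_dump(FILE *out, const struct irq_table *t)
{
    size_t pos = 0;
    long total = 0;

    assert(out != NULL && t != NULL);
    for (size_t *p = irq_start(t, &pos); p != NULL; p = irq_next(t, p)) {
        long n = irq_show(out, t, *p);
        if (n < 0)
            return -1;
        total += n;
    }
    return total;
}
```

A caller would fill in a struct irq_table and pass it, along with an output stream such as stdout, to irq_dump(). The memory needed at any moment is bounded by a single row rather than by the full output, which is the property that would make a large kmalloc() or vmalloc() buffer unnecessary in the /proc/interrupts case.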
