mmap_min_addr. The recent set of null-pointer vulnerabilities
has not been helped by the confusion around how the mmap_min_addr
knob and security modules interact. The 2.6.31-rc7 kernel will see a
couple of changes intended to clarify and rationalize this interaction.
With these patches, SELinux will no longer bypass the mmap_min_addr
check; any process wanting to map memory below that
address will require the CAP_SYS_RAWIO capability.
SELinux will also implement its own low-memory limit, controlled by a
separate knob in the SELinux policy. As a result, it will be possible to
turn off the mmap_min_addr
protection but still only allow
specific programs to do low-memory mapping.
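For readers who have not seen one, a low-memory mapping attempt from user space might look like the sketch below. The helper name try_low_mapping() is hypothetical, and the exact error code is an assumption: on a system with a nonzero mmap_min_addr, an unprivileged caller would typically see EPERM (or EACCES when a security module denies the request).

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: try to map one anonymous page at a fixed low address.
 * Returns 0 if the mapping succeeded (it is unmapped again before returning),
 * or the errno value reported by mmap() on failure.
 */
int try_low_mapping(unsigned long addr)
{
    long page = sysconf(_SC_PAGESIZE);
    void *p = mmap((void *)addr, (size_t)page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);

    if (p == MAP_FAILED)
        return errno;       /* typically EPERM or EACCES when refused */
    munmap(p, (size_t)page);
    return 0;
}
```

Calling try_low_mapping(0) as an ordinary user is a quick way to see whether the low-address protection is in effect on a given system.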
Module loading security. Another SELinux-related change - not yet
merged into the mainline - adds a new hook to request_module().
The idea here is to try to limit the ability of user-space programs to load
arbitrary modules into a running kernel. In future versions of the SELinux
policy, the ability to trigger module loads is likely to be reduced to a
much smaller set of roles.
HugeTLB mappings. The "hugetlb" feature allows processes to create
pseudo-anonymous memory mappings backed by pages which are larger (perhaps much
larger) than the normal system page size. For certain kinds of
applications, these mappings can improve performance by reducing pressure
on the CPU's translation lookaside buffer (TLB). The kernel code resides
within such a mapping for the same reason. Using hugetlb pages in user
space is a bit awkward, though; it requires mounting the special hugetlbfs
filesystem and mapping files from there.
Eric Munson has put together a patch
implementing an easier way. With this patch, the mmap() system
call understands a new MAP_HUGETLB flag; when that flag is
present, the kernel will attempt to create a mapping backed by huge pages.
Underneath it all, the mapping is still implemented as a hugetlbfs file,
but user space need no longer be aware of that fact.
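A minimal sketch of how the flag would be used follows, assuming a kernel and C library headers which provide MAP_HUGETLB. The helper name map_huge() is hypothetical, and the call will fail if the system has no huge pages available.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: request an anonymous mapping backed by huge pages.
 * `len` should be a multiple of the huge page size (often 2MB on x86-64).
 * Returns the mapping, or NULL if the kernel could not provide one, for
 * example because no huge pages have been reserved.
 */
void *map_huge(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

No hugetlbfs mount or file descriptor appears anywhere in the caller; a program which gets NULL back can simply fall back to an ordinary mmap() call without the flag.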
spin_is_locked(). Kernel code can test the current state of a
spinlock with spin_is_locked(). But what should this function
return on a single-processor system, where spinlocks do not exist at all?
Kumar Gala ran into trouble because one
uniprocessor spin_is_locked() implementation returned zero. The
problem was code in this form:
/* Ensure we have the requisite lock */
So Kumar thinks that the return value should always be true. But there are
other situations where that is just the wrong thing to do; Linus gave an example where code is waiting for a lock to be released.
The real problem is that a predicate like spin_is_locked() simply
lacks a well-defined meaning when the spinlock does not exist. So there is
no way to always give the "right" answer in such situations. What may
happen instead is that, in a future kernel, spin_is_locked() will
be deprecated. Instead, there will be new expect_spin_locked()
and expect_spin_unlocked() primitives for testing the state of a
spinlock. When the code is this explicit about what it is looking for, the
default answer can make sense; both would return true on uniprocessor systems.
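The difference between the predicates can be shown with a toy user-space model of a uniprocessor build. Everything here is an assumption made for illustration: the up_ names and the struct are hypothetical, and this is not kernel code.

```c
#include <stdbool.h>

/* Toy model of a uniprocessor build: the lock carries no real state. */
struct up_spinlock {
    char unused;
};

/* One possible UP stub: there is no lock, so report "not locked". */
bool up_spin_is_locked(const struct up_spinlock *lock)
{
    (void)lock;
    return false;
}

/* The caller expects to hold the lock; with no lock, that holds trivially. */
bool up_expect_spin_locked(const struct up_spinlock *lock)
{
    (void)lock;
    return true;
}

/* The caller expects the lock to be free; again trivially satisfied. */
bool up_expect_spin_unlocked(const struct up_spinlock *lock)
{
    (void)lock;
    return true;
}
```

In this model, an assertion that the lock is held fails when written with up_spin_is_locked() but passes with up_expect_spin_locked(), while a loop waiting on !up_expect_spin_unlocked() terminates immediately. Neither caller needs to know that the lock does not exist.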
localmodconfig. Many kernel testers want to build a kernel which
looks like the kernel shipped with their distribution. But distributor
kernels come with a configuration which builds almost everything. So our
poor tester ends up waiting for a very long time as the system builds a
bunch of modules which will never be used. One could avoid this problem by
creating a new configuration from scratch, but that process can be a little
daunting as well. There are a lot of configuration options in a modern kernel.
Steven Rostedt has posted a build
system change intended to help with this problem. A user who types:
    make localmodconfig
will get a configuration which builds all modules currently loaded into the
running system, but no others. That should be a configuration which
supports the system's hardware, but which lacks hundreds of useless
modules. There is also a localyesconfig option which builds the
required drivers directly into the kernel.
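The list of currently loaded modules is visible to user space in /proc/modules, one module per line with the name as the first field. The sketch below shows how that name might be extracted; it is not how the patch itself is implemented, and the helper name module_name_from_line() is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy the module name (the first whitespace-delimited
 * field of a /proc/modules line) into `out`. Returns 0 on success, or -1 if
 * the line has no name or the name does not fit in `out`.
 */
int module_name_from_line(const char *line, char *out, size_t outlen)
{
    size_t n = strcspn(line, " \t\n");

    if (n == 0 || n >= outlen)
        return -1;
    memcpy(out, line, n);
    out[n] = '\0';
    return 0;
}
```

Feeding this function each line read from /proc/modules with fgets() yields the set of modules that a trimmed-down configuration would need to keep.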