Security holes can sneak into code in surprising ways, even in highly
scrutinized codebases. Perhaps even more surprising is how long they can
persist in something as popular as the Linux kernel before someone
notices. The release of stable
kernels 126.96.36.199 and 188.8.131.52 this week is instructive on both
counts. The bug that led to the releases is fixed by a two-line
patch, but might be exploitable to cause filesystem corruption.
If it were a bug in a driver for an obscure piece of hardware
with relatively few users, it might have been less eye-opening, but it was
in the Virtual File System (VFS) layer of the kernel. VFS is the
abstraction that allows all kernel filesystems to be used identically
regardless of their underlying implementation. The open() system
call is used to open any file on any type of filesystem; VFS is what makes
that possible. In fact, it is the open() path that is affected by the bug.
Due to a faulty test, the bug allows directories to be opened for writing, which is generally a
recipe for disaster. It could also allow a file on a read-only filesystem
to be opened for writing – depending on the underlying filesystem
implementation, that could lead to corruption. Both problems are
only locally exploitable.
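
The directory case can be probed from user space with a few lines of C. This is only a minimal sketch: try_open_for_write() is a hypothetical helper written for this illustration, not anything from the kernel or the patch. On a kernel without the bug, asking for write access to a directory is expected to fail with EISDIR.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper for illustration: try to open `path` write-only.
 * Returns 0 if the open succeeded, otherwise the errno value that
 * open() reported. */
int try_open_for_write(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd >= 0) {
        /* The open was allowed; release the descriptor again. */
        close(fd);
        return 0;
    }
    return errno;
}
```

Calling try_open_for_write("/") and comparing the result with EISDIR gives a quick check; a return value of 0 for a directory would mean the open for writing was allowed.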
The bug was introduced in a change to support NFS in October of 2005 – more
than two years ago; all kernels since 2.6.15 are affected. The change
was aimed at making NFSv4 open calls atomic (because an open is really a
lookup followed by an open), but also did some code reorganization that
changed the semantics of a flag variable. That variable was being used to
determine the access mode for directories and read-only filesystems, so
that change subtly broke the tests.
Part of the problem is that the tests are in a function called
may_open(), which takes two flag parameters:
int may_open(struct nameidata *nd, int acc_mode, int flag)
The incorrect code was using flag in the tests when it should have
been using acc_mode. Each of them is a bitmask of values that, on
first glance, might be easy to confuse – each is related to permissions.
The bit values for each have names like FMODE_WRITE, which would seem
to have a fair amount of overlap. This
may explain why the problem was not spotted at the time it was introduced.
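
The shape of the mistake can be sketched in a few lines. This is not the kernel's code: the function names, the bit values, and the simplified is_dir parameter are all assumptions made for illustration. It assumes a caller whose flag value no longer carries a write bit after the reorganization, while acc_mode still does.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical bit values chosen only for this sketch; they are not
 * the kernel's definitions. */
#define SKETCH_MAY_WRITE   0x2  /* bit in acc_mode: write access wanted */
#define SKETCH_FLAG_WRITE  0x2  /* bit the faulty test looks for in flag */

/* Faulty shape: the directory test consults `flag`. */
int sketch_may_open_buggy(bool is_dir, int acc_mode, int flag)
{
    (void)acc_mode;  /* ignored by the faulty test */
    if (is_dir && (flag & SKETCH_FLAG_WRITE))
        return -EISDIR;
    return 0;
}

/* Corrected shape: the same test consults `acc_mode`. */
int sketch_may_open_fixed(bool is_dir, int acc_mode, int flag)
{
    (void)flag;  /* no longer used for the permission decision */
    if (is_dir && (acc_mode & SKETCH_MAY_WRITE))
        return -EISDIR;
    return 0;
}
```

With is_dir true, acc_mode set to SKETCH_MAY_WRITE, and flag set to 0, the faulty version lets the open through while the corrected one refuses it. Both versions compile cleanly and look equally plausible, which is the point.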
There may be no easy solution to this kind of problem – other than
more scrutiny. Using different types, rather than plain int, for
each flag might have helped, but since the tests were using the right kind
of bit values for flag, that is a somewhat hard sell.
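
For what it is worth, the distinct-types idea might look something like the following sketch. Every name here is hypothetical; the kernel does not define these types. Wrapping each bitmask in its own struct makes the compiler reject a swapped argument, or a call to the access-mode accessor with an open-flag value.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical wrapper types: same underlying int, but not
 * interchangeable as far as the compiler is concerned. */
typedef struct { int bits; } sketch_acc_mode_t;
typedef struct { int bits; } sketch_open_flag_t;

#define SKETCH_MAY_WRITE 0x2  /* illustrative value, not the kernel's */

/* Accessor that only accepts an access mode; passing a
 * sketch_open_flag_t here is a compile-time error. */
static inline bool sketch_wants_write(sketch_acc_mode_t mode)
{
    return (mode.bits & SKETCH_MAY_WRITE) != 0;
}

int sketch_may_open_typed(bool is_dir, sketch_acc_mode_t acc_mode,
                          sketch_open_flag_t flag)
{
    (void)flag;  /* not consulted for the permission decision */
    if (is_dir && sketch_wants_write(acc_mode))
        return -EISDIR;
    return 0;
}
```

The limitation noted above still applies: if a matching accessor existed for the flag type, testing the wrong variable would compile just as happily as before.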
Something unpleasant to consider in all of this is that this may not be the
first time this problem has been noticed. It may just have been the first time
it was noticed by someone who reported it. Folks with malicious intent
are much less inclined to report bugs. This particular bug is not one that
would be particularly useful to attackers, but we would do well to remember
that fixing a two-year-old hole means that systems were vulnerable for all
that time. It is not only the good guys who can read code.