On the kernel page a few weeks ago, we took a look at KSM, a technique to
reduce memory usage by sharing identical pages. Currently proposed for
inclusion in the mainline kernel, KSM implements a potentially
useful—but not particularly new—mechanism. Unfortunately,
before it can be examined on its technical merits, it may run afoul of what
is essentially a political problem: software patents.
The basic idea behind KSM is to find memory pages that have the same
contents, then arrange for one copy to be shared amongst the various
users. The kernel does some of this already for things like shared
libraries, but there are numerous ways for identical pages to get created
that the kernel does not know about directly, and thus cannot coalesce.
Examples include initialized memory (at startup or in caches) from
multiple copies of the same program and virtualized guests that are running
the same operating system and application programs.
Unfortunately, as Dmitri Monakhov points out, the KSM technique
appears to be patented by
VMware. A patent for "Content-based, transparent sharing of memory
units" was filed in July 2001 and granted in September 2004. The abstract
seems to clearly cover the ideas behind KSM:
[...] The context, as opposed to merely
the addresses or page numbers, of virtual memory pages that [are]
accessible to one or more contexts are examined. If two or more context pages are
identical, then their memory mappings are changed to point to a single,
shared copy of the page in the hardware memory, thereby freeing the memory
space taken up by the redundant copies. The shared copy is then preferably
marked copy-on-write. Sharing is preferably dynamic, whereby the presence
of redundant copies of pages is preferably determined by hashing page
contents and performing full content comparisons only when two or more
pages hash to the same key.
It should be noted that the abstract has no legal bearing; that comes from
the (always tortuously worded) claims, which can be seen at the
link above. In this case, as far as
can be determined, the claims and abstract are in close agreement.
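
The mechanism described in the abstract (hash page contents, do a full
comparison only when hashes match, then point duplicate mappings at a single
shared copy-on-write page) can be illustrated with a toy model. The Python
sketch below is not the KSM code, nor anything taken from the patent;
`ToyMemory`, its methods, the choice of SHA-1, and the 4096-byte page size
are all assumptions invented for illustration.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size for this toy model


class ToyMemory:
    """Hypothetical model: virtual pages map to reference-counted frames."""

    def __init__(self):
        self.frames = {}    # frame id -> page contents (bytes)
        self.refs = {}      # frame id -> number of mappings using it
        self.mappings = {}  # (context, virtual page number) -> frame id
        self._next_id = 0

    def _drop_ref(self, fid):
        # Release one mapping; free the frame when nobody uses it.
        self.refs[fid] -= 1
        if self.refs[fid] == 0:
            del self.frames[fid]
            del self.refs[fid]

    def write(self, context, vpn, data):
        """Write a whole page.

        A frame is never modified in place: the writer always gets a
        fresh private frame, which is how this model stands in for
        copy-on-write when the old frame was shared.
        """
        if len(data) != PAGE_SIZE:
            raise ValueError("writes must cover exactly one page")
        key = (context, vpn)
        if key in self.mappings:
            self._drop_ref(self.mappings[key])
        fid = self._next_id
        self._next_id += 1
        self.frames[fid] = bytes(data)
        self.refs[fid] = 1
        self.mappings[key] = fid

    def read(self, context, vpn):
        return self.frames[self.mappings[(context, vpn)]]

    def merge_identical(self):
        """Scan all mappings once; return how many were remapped."""
        by_hash = {}  # digest -> frame ids kept as the shared copy
        remapped = 0
        for key, fid in list(self.mappings.items()):
            data = self.frames[fid]
            digest = hashlib.sha1(data).digest()
            keepers = by_hash.setdefault(digest, [])
            for keeper in keepers:
                if keeper == fid:
                    break  # already points at the shared copy
                if self.frames[keeper] == data:  # full comparison
                    self.mappings[key] = keeper
                    self.refs[keeper] += 1
                    self._drop_ref(fid)
                    remapped += 1
                    break
            else:
                keepers.append(fid)  # first frame seen with this content
        return remapped
```

In this model, two contexts that each write an all-zero page end up
pointing at one frame after `merge_identical()` runs, and a later `write()`
by either context leaves the other's view of the page untouched. A real
implementation would do the remapping in page tables and take a fault on
write; the dictionary bookkeeping here only mimics the visible effect.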
The dates above are rather important because there is some "prior art" to
consider, namely the mergemem patch, which dates from
March of 1998. It is substantially the same as the patented idea: it
looks for identical "context pages", then changes the memory mappings to
point to a single copy-on-write page. This would seem to be a clear
example of the idea being implemented well before the patent was filed, so
it should invalidate the patent. As with everything surrounding
software patents, though, it isn't as easy as that.
In order to invalidate a patent, either a court must rule that way or the
patent office must be convinced to re-examine it, then find that the prior
art makes it invalid. Both of these methods
take time and usually money and lawyers as well. Free software projects
may have time, but the other two are typically out of reach. Alan Cox suggests that "perhaps the
Linux Foundation and
some of the patent busters could take a look at mergemem and
re-examination". While that might eventually resolve the problem,
it is a multi-year process at best.
The folks behind the KSM project are some of the kvm hackers from
Qumranet—which is now part of Red Hat. It is certainly conceivable
that VMware might consider kvm a competitor and try to use this patent as a
"competitive" weapon. That concern is probably enough to keep KSM out of
the mainline until the issue is resolved.
There is a much quicker resolution available, should VMware wish to pursue it.
As IBM has done with the RCU patent, VMware could license its patent for
use in GPL-licensed code. There is much to be gained by doing that, at
least in terms of positive community relations, and there is little to be
lost—unless VMware truly believes that the patent will stand up to
scrutiny. Both VMware and its parent, EMC, are members of the Linux
Foundation, so one could see a role for the foundation in helping to put
that kind of agreement together.
The original mergemem idea did not make it into the kernel, but the code is
still available for those running Linux 2.2.9. It appears that it was not
pushed hard in the face of some security concerns, which will need to be
addressed by KSM as well. Processes could create a page of memory with
known contents then, after waiting for the checker process (or kernel
thread) to run, see if memory usage has increased. Based on that
information, one can determine if other processes have a page with
identical values. It would seem rather difficult to exploit, but clearly
does allow some information to leak.
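
The information leak described above can be modeled in a few lines. This is
a deliberately idealized Python sketch, not an exploit: the function names
are hypothetical, the merge pass is assumed to be perfect (one frame per
distinct page content), and the attacker is assumed to be able to observe
the system-wide frame count and to hold no other copy of the guessed page.

```python
def frames_after_merge(pages):
    """Frames in use once an idealized merge pass has run.

    `pages` maps (process, page number) -> contents; after a perfect
    merge there is exactly one frame per distinct page content.
    """
    return len(set(pages.values()))


def guess_is_shared(pages, attacker, guess):
    """Model the probe: does memory usage grow after adding `guess`?"""
    baseline = frames_after_merge(pages)
    trial = dict(pages)                 # leave the caller's state alone
    trial[(attacker, "probe")] = guess  # attacker creates the known page
    after = frames_after_merge(trial)   # ...and waits for the merge pass
    # No growth means the probe page was merged with an existing copy,
    # so some other process must hold a page with the same contents.
    return after == baseline
```

On a real system the signal would be far noisier than this, since memory
usage changes for many unrelated reasons, which fits the observation that
the leak looks difficult to exploit in practice.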
It will come as no surprise to most LWN readers that software patents are an
increasingly dense minefield that can derail free software projects.
Unfortunately, it is the kind of problem that has no solution in the
technical domain where such projects excel. The political arena is where
any solution will have to come from, though there seems to be some hope
that judicial opinions (like the Bilski decision) may limit the scope of
the damage. It is a problem that we are likely to see more frequently
until there is some kind of resolution.