Maybe SCO had a point
[Posted August 21, 2003 by corbet]
Eric Raymond recently posted his own
analysis of SCO's "stolen code" presentation. One unique thing that
Eric did was to compare the code against the version found in a true SYSV
Unix code base. Eric evidently has such a thing lying around; this is not
a test that LWN was in a position to perform.
Based on his review of the various versions of the code in question and the
comparison with SYSV, Eric came to the conclusion that the Linux version
came from an ancient version of Unix. His reasoning goes:
Given this, there are two pieces of internal evidence that suggest
32V. One is that the function is split in two in SVr4 but single in
32V and Linux. A subtler indication is that one change between SVr4
and Linux would remove a cast (in the second ASSERT call). It is
quite unlikely that a programmer casually copying code would go to
the effort to remove a cast, and a guilty copier wouldn't do it
when there are ways to obscure similarities that are both easier
and less likely to spawn subtle bugs.
However, to many readers, a close look at Eric's posted diff suggests a
different conclusion. The SYSV version contains code in common with the
Linux version (such as the ASSERT() statements) which does not
appear in any of the ancient Unix versions. The reorganizations Eric
mentions are trivial, the sort of thing a programmer might do while making
a piece of code work.
It is not that hard to conclude that the SGI engineer who produced this
code took it from the closest thing he had at hand: a proprietary,
SYSV-based Unix implementation. It is, among other things, the simplest
explanation. SCO, perhaps, had a point. Despite all of
its precursors in ancient versions of Unix, this particular bit of code
appears to have been stolen from a proprietary code base.
The existence of some suspect code is not particularly surprising; LWN raised that possibility back in
May. When you are dealing with as much code as the Linux community now
handles, and with such a large number of contributors, the chances of
something bad slipping through are actually pretty good. Linux is not
alone in this; any other project, including proprietary developments, can
be contaminated with bad code. By some accounts, proprietary code has a
much worse record in this regard.
It looks like time for the community to face up to this fact: this is one
of those times when something slipped through. Somebody (presumably at
SGI) took a short cut, and we got burned. In the 2.6 kernel, the right
thing has already been done: that code is gone. It needs to come out of
2.4 as well.
This development does not really help SCO's case in any way. SCO cannot
use it to go after users - they are not running the code in question, and
they do not have it on their drives. And
it is absolutely irrelevant to SCO's claims against IBM, which have nothing
to do with copyright infringement. This code did not
help Linux achieve its current capabilities in any way. Its removal does
not hurt Linux. It is - if proved to be an infringement - a tiny one which
resulted from carelessness and laziness, not malice. It is not responsible
for SCO's problems.
But we should not gloss over the fact that the contribution of this bit of
code to Linux does appear to have been a copyright infringement. The right
way to deal with such problems is to acknowledge them, and to remove the
bad code immediately. This will not be the last case of plagiarism that
the community has to address; let's try to set a precedent for doing it
right.