Where have the universities gone?
[Posted July 24, 2007 by corbet]
When Greg Kroah-Hartman talked about the provenance of Linux kernel code at
the Ottawa Linux Symposium, one member of the audience asked whether
contributions from universities were tracked. The answer is that
universities were handled like any other source and tracked accordingly.
If code is contributed by somebody who works for the university (a faculty
member, in other words), the university is credited as having supported the
work. Contributions from students tend to be treated as "hobbyist" work,
but there are few significant contributors who fall into this category.
There is, in fact, very little code coming from the university environment
in general. Your editor was able to find exactly five files in the
2.6.23-rc1 kernel tree which contain a 2007 copyright credited to a
university.
It was not always that way; universities used to be heavily involved in the
creation and distribution of free software (though it did not originally
carry that name). The BSD Unix distribution - the first to support virtual
memory and drive VAXen worldwide - came from the University of California
at Berkeley.
Linux became the master's thesis for one Linus Torvalds. The X Consortium
grew out of a project at MIT - it was part of Project Athena, which was the
source of much interesting work. The GNU project has its roots at MIT as
well. Alan Cox did much of his crucial early Linux work while at Swansea
University. Ted Ts'o, another important early contributor, was based at
MIT.
Looking further back,
graybeards among us will remember the influential WATFOR Fortran compiler
from the University of Waterloo. Much interesting work (and code) came
from the Andrew project at Carnegie Mellon University.
Two of your editors got their start at the University of Colorado
working with a project called Toolpack, creating Fortran developer tools;
their names can be found in this
old report [PDF]. The list goes on at some length. Over the years, we
have all been the beneficiaries of a great deal of creativity (and code) to
come out of the university environment.
While there are still interesting projects happening at universities, the
flow of code has nearly stopped.
This seems strange; one need not dig too far into the
curriculum at most computer science departments to find operating systems
classes using Linux as a teaching tool, but these same computer science
departments are, as a whole, not contributing back changes to that tool.
This is a large and rather unremarked-upon change in how free software
works; it would be interesting to understand what force is driving this
change.
Your editor has spent a few weeks querying contacts in the academic world,
but the amount of useful information coming back is surprisingly small. An
"I don't know" answer from a computer science department chair was not
expected. So, rather than provide definitive answers, your editor will
have to engage in some definitive handwaving.
One obvious change is that the amount of code coming from the
corporate environment has grown from nearly zero to something huge.
In earlier decades, as the proprietary software model took over the industry, the idea that a
company would give away its code came to look similar to the notion of
opening up its bank account to all comers. At the same time, individuals
rarely had the resources to develop and contribute code themselves, and the
supporting community was not there. So universities were about the only
real source for freely-circulated software. Thanks to the culture of
openness in academia, passing that code around (and improving it) seemed
like a natural thing to do.
Unfortunately, that culture of openness has suffered somewhat in more recent
times. In many parts of the world, universities are able to privatize and
commercialize interesting work, even if that work was funded by public
money. University researchers have strong incentives to put their energy
(and their code) into startup companies instead of contributing that code
back to the community. Look, for example, at the story of the Stanford
Checker, which was initially built on gcc. Rather than contribute that
code, the developers created a private company (Coverity) to commercialize
it. The community has certainly benefited from Coverity's work, but we
still do not have a static analysis tool with anything near the power of
the erstwhile "Stanford Checker."
The same commercial forces almost certainly have the effect of drawing
effective developers out of the university environment. Talented students
who might once have gone on for advanced degrees or continued to work
within the university are likely to have plenty of more lucrative options
elsewhere. This will be especially true for those who have demonstrated
that they can create useful, production-quality code. So, perhaps, it is
not surprising that many of the most productive free software developers
are no longer found at universities.
Another disincentive for university contributors is that few free software
projects are interested in prototypical or overly experimental code. A
potential kernel contribution must be rock-solid, well-benchmarked, with
well-defined needs and users. A university project may explore an
interesting idea far enough to generate the required publications, but the
resulting code is likely to be far from ready for mainline inclusion. It
may well be that, for many university researchers, there is no real reason
to make the effort to get their code merged, even if the work would be
useful in a more practical environment. Funding agencies and tenure
committees do not normally consider community contributions when making
their decisions.
Code contributed to the community also requires ongoing maintenance,
something which many university environments are not well prepared to
support. Graduate students move on to other challenges, and faculty go on
to the next project. It is hard to write a successful grant application
for maintenance work. So interesting code has a real chance of simply
being dropped once the research objectives have been achieved - or the
funding has run out.
So there are a number of reasons for the reduction in university
participation in the development process. That participation has certainly
not fallen to zero. We can thank the University of Michigan for much of
our NFSv4 code. A lot of USB work has come out of the Rowland Institute at
Harvard. Much of the early eCryptfs work happened at Stony Brook
University. The University of Waikato has contributed to the DCCP protocol
implementation. The Helsinki University of Technology has worked on the IPv6
code, as have the University of Tokyo and Keio University. These are just
a few recent contributions to the kernel; clearly, the scope of university
contributions to the community goes far beyond that. But these
contributions are buried by the code coming from other sources. For better
or for worse, the period when universities were the source of a large
portion of our free software code base would appear to have passed. But
that period left us with a strong foundation on which to build the systems
we have today.