Two GCC stories
Documentation licensing
Back in May, GCC developer Mark Mitchell started a discussion on the topic of documentation. As the GCC folks look at documenting new infrastructure - plugin hooks, for example - they would like to be able to incorporate material from the GCC source directly into the manuals. It seems like an obvious idea; many projects use tools like Doxygen to just that end. In the GCC world, though, there is a problem: the GCC code carries the GPLv3 license, while the documents are released under the GNU Free Documentation License (GFDL). The GFDL is unpopular in many quarters, but the only thing that matters with regard to this discussion is that the GFDL and the GPL are not compatible with each other. So incorporating GPLv3-licensed code into a GFDL-licensed document and distributing the result would be a violation of the GPL.
After some further discussion, Mark was able to get a concession from Richard Stallman on this topic:
This is a severely limited permission in a number of ways. To begin with, it applies only to comments in header files; the use of more advanced tools to generate documentation from the source itself would still be a problem. But there is another issue: this permission only applies to FSF-owned code. As Mark put it:
I find that consequence undesirable. In particular, what I did is OK in that scenario, but suddenly, now, you, are a possibly unwitting violator.
Dave Korn described this situation as being "laden with unforeseen potential booby-traps" and suggested that it might be better to just give up on generating documentation from the code. The conversation faded away shortly thereafter; it may well be that this idea is truly dead.
One might poke fun at the FSF for turning a laudable goal (better documentation) into a complicated and potentially hazardous venture. But the real problem is that we as a community lack a copyleft license that works well for both code and text. About the only thing that even comes close to working is putting the documentation under the GPL as well, but the GPL is a poor fit for text. Nonetheless, it may be the best we have in cases where GPL-licensed code is to be incorporated into documentation.
Anonymous contributions
Ian Lance Taylor recently described a problem which will be familiar to many developers in growing projects:
He also noted that a contributor who goes by the name NightStrike had offered to build a system which would track patches and help ensure that they are answered; think of it as a sort of virtual Andrew Morton. This system was never implemented, though, and it doesn't appear that it will be. The reason? The GCC Powers That Be were unwilling to give NightStrike access to the project's infrastructure without knowing something about the person behind the name. As described by Ian, the project's reasoning would seem to make some sense:
NightStrike, who still refuses to provide that information, was unimpressed:
Awesome or not, this episode highlights a real problem that we have in our community. We place a great deal of trust in the people whose code we use and we place an equal amount of trust in the people who work with the infrastructure around that code. The potential economic benefits of abusing that trust could be huge; it's surprising that we have seen so few cases of that happening so far. So it makes sense that a project would want to know who it is taking code from and who it is letting onto its systems. To do anything else looks negligent.
But what do we really know about these people? In many projects, all that is really required is to provide a name which looks moderately plausible. Debian goes a little further by asking prospective maintainers to submit a GPG key which has been signed by at least one established developer. But, in general, it is quite hard to establish that somebody out there on the net is who he or she claims to be. Much of what goes on now - turning away obvious pseudonyms but accepting names that look vaguely real, for example - could well be described as a sort of security theater. The fact that Ian thanked NightStrike for not making up a name says it all: the project is turning away contributors who are honest about their anonymity, but it can do little about those who lie.
Fixes for this problem will not be easy to come by. Attempts to impose
identity structures on the net - as the US is currently
trying to do - seem likely to create more problems than they solve,
even if they can be made to work on a global scale. What we really need are
processes that are robust in the presence of uncertain identity. Peer
review of code is clearly one such process, as is better peer review of the
development and distribution chain in general. Distributed version control
systems can make repository tampering nearly impossible. And so on. But
no solution is perfect, and these concerns will remain with us for some
time. So we will have to continue to rely on feeling that, somehow, we
know the people we are trusting our systems to.
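
To make the point about distributed version control concrete, here is a minimal sketch of the underlying idea: each commit identifier is a hash over the commit's content and its parent's identifier, so anybody holding a trusted copy of the latest identifier can detect a change anywhere in the history. This is an illustration only, not the object format of Git or any other real system; the names `commit_id`, `build_history`, and `verify_history` are hypothetical.

```python
import hashlib


def commit_id(parent_id, content):
    """Hash the parent's identifier together with this commit's content."""
    h = hashlib.sha256()
    h.update(parent_id.encode("utf-8"))
    h.update(b"\0")  # separator between parent identifier and content
    h.update(content.encode("utf-8"))
    return h.hexdigest()


def build_history(contents):
    """Build a linear history; the first commit has an empty parent."""
    history = []
    parent = ""
    for content in contents:
        cid = commit_id(parent, content)
        history.append({"id": cid, "parent": parent, "content": content})
        parent = cid
    return history


def verify_history(history, trusted_tip):
    """Recompute every identifier and compare the result with a tip
    identifier obtained from a source the verifier already trusts."""
    parent = ""
    for commit in history:
        if commit["parent"] != parent:
            return False
        if commit_id(parent, commit["content"]) != commit["id"]:
            return False
        parent = commit["id"]
    return parent == trusted_tip


history = build_history(["initial import", "fix parser", "add plugin hook"])
tip = history[-1]["id"]  # what every developer's clone remembers

print(verify_history(history, tip))

# An attacker rewrites the first commit and recomputes every identifier.
forged = build_history(["initial import plus backdoor", "fix parser", "add plugin hook"])
print(verify_history(forged, tip))
```

The forged history is internally consistent, but its tip identifier no longer matches the one held by everybody else's clone. That is the sense in which tampering becomes hard to hide; it says nothing about whether the code that was legitimately committed deserves trust.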
