When we last looked in on
GCC, the 4.4 release branch was held up by licensing issues,
specifically the exemption that would allow GCC plugins. Since that time,
things have progressed—there is now a 4.4 release branch—but the
license issue has not yet been completely resolved, though a new revision is
available. But, since the license change is not
needed for the 4.4 release, some GCC hackers are unhappy about the manner
in which the release was held up. It has led to questions about the roles
that the Free Software Foundation (FSF) and the GCC steering committee (SC)
have in controlling GCC development.
Holding up a release for licensing concerns is seen by many as a reasonable
request that the FSF might make. That organization is deeply concerned about
licenses in general, and it certainly considers it important to get the
license right before releasing any code under it. It was the
concerns expressed by GCC developers about the wording of the runtime
library exception that led to the license review. But Richard Stallman,
acting for the FSF, did not directly ask that the release be delayed;
instead, he asked that no release branch be created. The net effect
on the release process is the same (i.e. no release), but the impact on
development of new features destined for GCC 4.5 is a large part of what
irked the GCC developers.
The other piece of the puzzle is that the issue is essentially moot: there
is no plugin API in GCC 4.4 that would require the runtime library
exemption changes. Stallman is adamant that proprietary GCC plugins be
outlawed before such an API gets added. But, since the API isn't present
(yet), the license changes aren't needed yet either. So, while it makes
sense to take whatever time is required to get the license right, many
seem very puzzled as to why that would need to hold up the release
and new development.
There has never been a clear explanation of why the release branch
needed to be delayed, at least not publicly. Either Stallman relented,
perhaps because the license rewording was completed, or the SC eventually
decided to override his request; in any case, the release branch
was created on March 27. But there was clear discontent in the ranks that
the SC could, evidently, be pushed around so easily by the FSF. This led
to questions about the role of the SC, how much control over "technical"
decisions the FSF has, and, in general, how the project is governed. As
Daniel Berlin put it: "Where is the
pushback by the SC onto the FSF?"
Berlin's complaint was that it had taken the FSF too long to resolve the
issue, to the point where it was (and had been for some time) seriously
impacting GCC development. Because the license change is not required for
this release, there is no good reason to delay it. Ian Taylor was fairly blunt:
I'm a strong supporter of the FSF, but I agree with Danny. This has
gone on far too long. Releasing gcc 4.4.0 with the same licensing as
gcc 4.3.0 will do no significant harm. The FSF is acting
inappropriately in restricting us in this way.
In response to Berlin's criticism, SC member David
Edelsohn noted that there were things going
on behind the scenes:
Why do you think that the SC has not pushed back? Not all diplomacy
is best done in public. This is frustrating for all of us.
But a lack of communication from the SC to the greater GCC community is
part of the problem, according to Berlin:
Okay then, as the leadership body of the GCC community, part of your
responsibility is keeping your constituents (the rest of us!) informed
of the status of things troubling them.
I don't believe saying "we have given the FSF a deadline to meet in
the near future" would at all endanger any diplomacy, and i'd love to
see a counter argument that says otherwise.
The discussion then turned to the role that the FSF plays in GCC
development. SC member Joe Buck points out
that the FSF holds the cards: "The problem in this instance is that
the SC has little power; it's
the FSF that's holding things up and I don't know more than you do."
But that doesn't sit well with some. Steven Bosscher asks:
I don't understand this. Why does the SC have little power in this
matter? Surely you could decide to ship GCC 4.4 with the old license,
as the official GCC maintainer? But you *choose* not to use this
power (perhaps for good reasons, but I'm unconvinced).
Others believe that the FSF should be in complete control. Richard Kenner outlines how he sees the relationship:
The matters to which we defer to the FSF are any matters that they *ask*
us to! They own the code. If RMS, for some reason, decides that he doesn't
like the phrasing of a comment somewhere, we have to either convince RMS
he's wrong or change the comment.
As a practical matter, the FSF *delegates* most of their responsibilities
to the maintainer of the package, but they can undo that delegation as to
any matter any time they want.
Berlin would like to see the governance
structure for GCC more clearly
spelled out. The SC web
page seems to indicate that its role is to prevent the project from
being controlled by any party. But whether the SC is supposed to prevent
the FSF from controlling its own project was disputed. Ultimately,
the developers do have some power as Buck states: "There are checks on FSF control
in the sense that the project can be
forked and developers can leave."
That kind of talk inevitably leads some to mention the egcs/GCC split,
and subsequent join, as an example of the power that the development
community has. No one has said they are seriously considering a fork, but
the GPL certainly allows such things if the FSF's hand gets too heavy.
Jeff Law doesn't see it coming to that:
FWIW, I don't think we're anywhere near the same kind of tipping point
we faced a dozen years ago. I believe both sides learned lessons along
the way and I actually see evidence of that in the simple fact that
we've got a license and blessing to go forward with a plugin framework.
Clearly some developers are chafing under what they see as unnecessary
interference in technical issues from Stallman. But as Buck points out, Stallman does not dictate
technical details to GCC. Various decisions (bugtracker, version control,
etc.) have gone against his express wishes. In addition, despite
Stallman's aversion to C++, Taylor has used that language for the gold linker, and currently has a
branch that implements some of GCC in C++.
He sees things this way:
While agreeing that the FSF is the legal owner of the code, I personally
consider the implementation language to be a technical detail which the
FSF has no special control over. We can consider their input, but we
need not follow it. This is distinct from licensing issues, where we
had to either move to GPLv3 or fork into an independent project.
This discussion will hopefully lead to a clearer picture of the governing
structure of GCC. With luck, it may also make Stallman and the FSF more
cognizant of the perception that they are meddling in technical issues to
the detriment of their relationship with at least some in the community.
No one in the rather long thread could come up with any sensible reason
that the release branch should have been held up. At best, that means that
Stallman did not communicate his reasons, which leaves many with the sense
that he is being rather arbitrary.
In the meantime, though, GCC is preparing for the 4.4 release. Release
manager Mark Mitchell created the release branch, Bosscher is rounding up folks to update the 4.4 changes page, and
work is proceeding towards a release. That also allows the changes for 4.5
to be added to the trunk, which puts that release back on track.
With GCC 4.5 there will likely be a new plugin API for which the license
change is needed. On April 1, Edelsohn announced that the revised runtime library
exception had been released. It explicitly allows Java byte code
to be used as input to GCC, making that an acceptable "compilation process"
under the runtime library exception. One of the other concerns, regarding
independent modules, will be addressed in the FAQ,
though that had not yet been done at the time of this writing.
Assuming the new exception passes muster on the gcc-devel list, and no
problems are found that would require adjustments, it will presumably end
up in the 4.4 release. While that should conclude this particular issue,
the overarching governance questions will remain.
Xiph.org achieved a milestone last
week, unveiling the
first public release of its new encoder for Theora video. The new encoder is codenamed
Thusnelda to distinguish it from previous work, and makes several big
improvements, including fixes to constant bitrate and variable bitrate encoding.
Theora is derived from a video codec called VP3 created by On2
Technologies. On2 released the VP3 code to the public under an open
source license in 2001, and agreed to help Xiph.org develop Theora as its
successor. The specification for the Theora codec's format was finalized in
2004, but the reference encoder itself — the actual binary that
converts a video file into Theora format — only reached 1.0 in
November of 2008. Work on Thusnelda began shortly thereafter, spearheaded
by Christopher Montgomery, but was bolstered by a grant
from Mozilla and the Wikimedia Foundation that allowed lead Theora
developer Tim Terriberry to focus on improving the encoder to coincide with
the built-in Theora support slated for Firefox 3.5.
The Thusnelda encoder is denoted 1.1 alpha, and is available
for download from Xiph.org in several formats: source code for the
libtheora library, binaries of the ffmpeg2theora command-line conversion
utility, and even a Mac OS X QuickTime component.
According to Xiph.org's Ralph Giles, the most noticeable improvement in
1.1 is proper rate control, particularly for fixed bit rate encoding, where
the user specifies either the number of bits per second desired in the
output (a common use case for streaming applications), or the desired file
size. "The 1.0 encoder relies a lot on heuristics, instead of trying
to optimize directly the trade-off between quality of the coded images and
the number of bits used to represent them," he said, "More
significantly, the fixed bitrate mode in the 1.0 reference encoder didn't
really work; it just guessed how to meet its target and often missed the
requested bitrate, sometimes by quite a bit, which was a problem for
streaming and fixed-size encodes."
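Rate control itself is, at its core, a feedback loop, even if doing it well is
hard. The toy C program below is not libtheora code, and the relationship it
assumes between the quantizer and the size of an encoded frame is invented
purely for illustration; it only shows the basic idea of tracking how far the
output has drifted from the requested bitrate and adjusting the quantizer for
the next frame accordingly.

    #include <stdio.h>

    #define TARGET_BPS 400000.0  /* requested bits per second */
    #define FPS        25.0      /* frames per second */

    int main(void)
    {
        double budget = TARGET_BPS / FPS;  /* bit budget per frame */
        double debt = 0.0;                 /* bits spent over (+) or under (-) budget */
        int q = 20;                        /* coarse quantizer index, 0..63 */

        for (int frame = 0; frame < 250; frame++) {
            /* Stand-in for a real encoder: a coarser quantizer (higher q)
             * is assumed to produce a smaller encoded frame. */
            double used = budget * (64.0 - q) / 32.0;

            debt += used - budget;

            /* Spend less on the next frame if over budget, more if under. */
            if (debt > budget && q < 63)
                q++;
            else if (debt < -budget && q > 0)
                q--;

            printf("frame %3d: q=%2d bits=%6.0f debt=%7.0f\n",
                   frame, q, used, debt);
        }
        return 0;
    }

A real encoder makes far more sophisticated decisions, but the same
budget-and-feedback structure is at the heart of meeting a fixed bitrate or
file size target.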
But Montgomery's work — supported for a year by his employer Red
Hat — also included extensive refactoring of the code, which should
result in improvements today and allow for easier changes moving
forward. "The older encoder was structured as a bunch of nearly
independent passes," Giles said, "[it] made something like 8
passes over each frame. This made some forms of decision making hard,
i.e. if an earlier decision caused you problems (higher bitrate) in a later
stage you were out of luck. The new encoder collapses most of those passes into a single pass."
The restructuring also allows Thusnelda to take advantage of features in
the Theora specification that had never been implemented before, such as
"4MV" macroblocks, a motion compensation scheme that adaptively chooses
whether to encode motion information for an entire segment of the picture,
for a sub-segment, or for none of the segment. "Theora always breaks
each image up into square blocks," Giles explained, "one of
those blocks then can be split into four motion vectors, or use an average,
and if any of those four don't need to be coded, the alpha encoder can skip
coding a corresponding motion vector. Making a change like that was too
difficult with the 1.0 codebase."
Naturally, real-world performance and not a feature list is the primary
means of assessing an encoder. Theora has been the object of criticism in
years past, especially when compared against proprietary offerings such as
H.264. Reader comments on news stories at Slashdot often dismissed Theora
as a poor alternative, producing larger files than the competition for the
same subjective quality.
Codec testers are always at the mercy of the encoder, however, and as
noted above Theora's 1.0-series encoder had significant flaws, especially
with respect to constant bitrate encoding. In the oft-cited doom9.org 2005 codec
shootout, the Theora encoder performed poorly by failing to meet the
target file size due to poor rate control, precisely the feature targeted in the
1.1 branch. Similarly, Eugenia Loli-Queru's 2007 Theora versus
H.264 test for OSNews repeatedly cited problems with the encoder that
made direct comparison close to impossible.
Both tests pre-date the 2008 release of
the final 1.0 encoder, much less the 1.1 alpha. Shortly after the
Thusnelda alpha was released, Jan Schmidt posted the results of his personal
tests on his blog, indicating a 20% reduction in file size and a 14%
reduction in encoding time over the 1.0 encoder. Those are significant
numbers, even without accounting for better rate control and other encoding
parameter improvements. As commenters to the blog pointed out, Schmidt's
test was not scientific, particularly as it involved re-encoding an H.264
file rather than a lossless original, and showed example still frames
rather than video results.
Video quality is ultimately a subjective, human-centric measure.
Although there are attempts to quantify video encoding quality, such as peak
signal-to-noise ratio (PSNR) and structural similarity index
(SSIM), they rarely replace subjective evaluations of quality. Xiph's
Gregory Maxwell said that Thusnelda improves on Theora's PSNR, but that it
was a mistake to assume that that equated to a subjective improvement for
any particular use case.
To an extent the objective metric problem is
equal to the coding problem. If we had a perfect metric we could probably
make a perfect encoder (ignoring a lot of engineering details) ... If we
could objectively know what 'looks good' then we could make a coder which
uses that metric to decide what to code. Then the problem of coding
largely reduces to efficiently packing information, which is well
understood. So in any case, objective metrics are usually useful for
measuring the results of small changes which are mostly 'objective' in
nature; they aren't very useful for measuring perceptual changes, nor are
they useful for comparing dramatically different codecs.
Terriberry concurred, noting that none of the simple objective metrics
take any kind of temporal effects into account, and they are still less
trustworthy than the processing done in the brain. "Like most
things, it's a matter of knowing what the limitations of your tools
are. PSNR and SSIM are useful for monitoring day-to-day changes in the code
to identify regressions and optimize parameters. But for evaluating
fundamentally different approaches, there's currently no substitute for
using real humans."
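For reference, PSNR (the first of those metrics) has a standard textbook
definition, not anything Theora-specific, in terms of the mean squared error
between an original frame I and its decoded counterpart K of m by n pixels,
with 255 as the maximum sample value for 8-bit video:

    \mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right),
    \qquad
    \mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1}\sum_{j=0}^{n-1}
        \bigl(I(i,j) - K(i,j)\bigr)^2

Higher values mean the decoded frames are numerically closer to the
originals, which, as Maxwell and Terriberry note, is not the same thing as
looking better to a human viewer.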
Theora took hits from critics on subjective quality in the 2005 and 2007
tests, too, points which Montgomery responded to in 2007 with a page on
his personal Web site. Although some subjective quality issues like
discernible blockiness are not the result of problems with the 1.0 encoder,
he argued, many of the most visible problems are, and he urged readers to
watch the progress made in the 1.1 series.
There are several improvements still to come before 1.1 is declared
final, according to the Theora team. Giles said the next major feature
will be per-block quantizers, the functions that simplify a block of input
data into smaller values for output. "[Theora precursor] VP3 used a
fixed set of quantizers, and the "quality" knob was the only way you could
change things. When VP3 became Theora, back in 2004, we added support for
varying those quantizers both per video, and per frame type. The 1.0
encoder was able to support alternate quantizer matrices, because you just
switch them out, but there were some tuning issues."
"1.1alpha1 is still using the same set, but we expect that the
change soon," Giles said. The newly-restructured codebase makes it
easy to vary the quantizer used, not just on a per-file or per-frame basis,
but block-by-block. Terriberry added that the new code will support 4:2:2
and 4:4:4 pixel formats, which will allow higher color quality, and the
ability to use different quantization
matrices for different color channels and frames.
Giles and Terriberry agreed that 1.1 final will be significantly better
than even the current alpha release once all of the changes are
incorporated. Terriberry noted that many of the remaining improvements are
"minor things" but that added together they will be substantial. "And
that's not even mentioning things like speed optimizations, which also have
real practical benefits."
"There are other things still on the docket as well — we're
not done yet!" added Montgomery, "However, we're finally to
the point of putting together a release solidly better than 1.0 in every
way, along with a much higher future ceiling."
Between now and then, the team is soliciting user input from real-world
encoding tests. "We put it out to show what we've been up to, and to
make it easier to give it a try," said Giles. "We're
interested in samples where it really does poorly, especially relative to
1.0, compatibility testing with current decoders, and general build and
integration issues which of course can only be found through people trying
your software in their own environments." He encouraged users to
submit concrete issues through the bug tracker, but to share other
experiences through the project mailing list, or simply to blog about
them for all to read.
Web video is poised to start changing dramatically once Firefox 3.5
ships with a built-in Theora decoder underlying the HTML5 video element.
That makes it all the more important to get the Theora encoder right.
Xiph.org does not have the full-time staff or resources of larger activist
groups like the Free Software Foundation or Creative Commons; it has only
software developers. Consequently, without the support of Red Hat,
Mozilla, and the Wikimedia Foundation, it might not have been able to get up
to speed. It remains to be seen whether the final build of Thusnelda will
beat Firefox 3.5 to release, but the progress made already is encouraging.
The Open Source Business Conference, held at San Francisco's Palace Hotel,
draws a lot of lawyers, from both corporate legal departments and law firms;
Continuing Legal Education (CLE) credit is available. Jeff Norman, a partner
at the law firm of Kirkland & Ellis, delivered a talk on "Shims and
Shams: Firewalling Proprietary Code in a Copyleft Context."
The talk gave some insight into current thinking on how difficult it
can be to create a combined software product using
both copyleft and proprietary code.
Most clients who want to combine GPL and proprietary
code, Norman said, do not have an open source business
model in mind. But creating a mixed
GPL/proprietary software product is difficult and
expensive. Step one for the lawyer is to explore
the reasons behind the idea. The question is:
"Why do you want to do something unusual instead of
complying with open source disclosure requirements?"
Only if the client says, "We can't open source this,"
does Norman recommend what he calls "shimming," which
he defines as "programming practices and architectures
that reduce the risk that independently created
proprietary code might be deemed a derivative work
based upon some other code that is intended to operate
with such proprietary code." Shimming includes both
procedural shims, which are development practices,
and substantive shims, which are design decisions.
The GPL's reciprocity requirements rely not on any
technical criterion, but on the legal definition
of a "derivative work." The definition is actually
consistent across countries. Norman surveyed the law
in the US, Europe, and Asia, and found "little substantial
difference." While the definition is consistent, it's
also broad, often surprisingly broad. Case law shows
that derivative works can come into existence easily,
whenever a pre-existing work is either incorporated into
a new work or modified.
Two cases show the wide reach of the derivative work concept.
A.R.T. Company sold products based on unmodified
postcards, and in one case was found to be creating
a derivative work. (Another case found that
A.R.T.'s product consisting of a postcard mounted
on a tile was not infringing, creating a conflict
between two U.S. circuit courts.) In another case, Midway
Manufacturing Co. v. Artic International, Inc.,
the court found that selling a hardware speed-up
kit for an existing arcade game was creating a
derivative work of the game software, even though
the defendant did not copy or modify the original
game software. Courts use the test of "access and
substantial similarity." If the alleged infringer
had access to the copyrighted work, and if even parts of
the alleged derivative work are substantially similar to it,
then, according to the test, it's a derivative work.
Applying the idea to software, Norman said, "proprietary code may
incorporate non-proprietary code in non-obvious ways."
"There are built in to most computer languages
directives that cause code to be combined," he said.
The C preprocessor's #include directive is one example: "I've seen
thousands of lines of code incorporated with one
include." In one example, a proprietary program used
a widget class's API, and the widget class, using a
header file, incorporated code from a system library.
"This whole thing becomes a derivative work," he said.
Distribution or not?
If a combination of GPL and proprietary code is only
for in-house use, some clients decide simply
not to redistribute it. The GPL's
reciprocity requirements apply only when the code is distributed.
However, distribution could happen in a lot of ways.
Does depositing source code in escrow count as
distribution? How about an acquisition that's
structured as a sale of assets from one company
to another? "Relying on no distribution is very
dangerous. There are a lot of situations where
distribution can happen but you wouldn't think of it
as distribution," Norman said.
The other approach is what Norman recommends.
"Don't create a derivative work and then you won't
have a problem." He said that some open source
advocates say, "You're violating the spirit or the
purpose of the GPL." But in the long run, allowing a
license to reach out too far could enable proprietary
vendors to apply unwanted terms to open source code.
"If the end user is not creating a derivative work,
not only are the license terms not being triggered
but you don't want them to be triggered," he said.
In the years that developers have been using and discussing
the GPL, some have developed a false sense of security
about when they're creating a derivative work.
Just using an API may or may not create a derivative
work. The purely functional aspects of a function
call are not copyrightable. However, Norman says,
even minor non-functional aspects of an API, such as
the sequence of fields in a structure, are probably
copyrightable. And just using an API can result
in bringing thousands of lines of code into your
application. Another fallacy is the often-heard "We
can avoid any problems if we use dynamic rather than
static links." However, dynamic linking by itself does
not automatically avoid creating a derivative work.
Some US circuits ignore extremely small copying under
the so-called "de minimis" exception. However,
Norman said, "it would almost never cover code,"
There's no sure test for what is or isn't de minimis
copying, and one module or section of a program could
be found to be a derivative work. "Even if the whole
project is not substantially similar, one module may
be substantially similar," he added.
The clean room
With the broad standards of what is a derivative
work, combined with the ways that software can
mingle at build time, "derivative works practically
create themselves," Norman said. In order for
a combined product not to be a derivative work,
developers need to take specialized and expensive
measures. Avoiding creating a derivative work is not
something to do in the ordinary course of business.
Shimming requires the same clean-room techniques
that software companies use to protect themselves
when doing reverse engineering. Building a combined
product cleanly "really has to be worth it," Norman
said. "If you have someone who has never done a clean
room before you're going to spend time getting them
up to speed." Besides the development costs, there's
some publicity cost. "In any shimming scenario you
may get some negative PR," he said.
The wrong way to do shimming is simply to build a
wrapper, implementing essentially the same interface
as the GPL software, then communicate with the
wrapper. It doesn't work because the wrapper becomes
a derivative work, then the code that talks to the
wrapper does too. In practice, two development
teams need to work side by side, but not in direct
communication with each other. One team handles GPL
code, the other handles the proprietary code, and the
two communicate only through the legal department,
which acts as filter.
They have to be kept separate, Norman says, because,
"Programmers like to borrow just like lawyers.
How often do you borrow from somebody else's brief?"
The filtering has to block any knowledge about
creative expression from making its way across
the barrier. Since anyone could have seen the GPL
code, "We have a set of programmers who verify under
affidavit that they've never looked at this code
before." The clean room developers' access to the
Internet has to be monitored, or better yet, blocked.
"It's a fairly cumbersome technique, but most software
companies have done some kind of clean room before,"
is an example of another safe approach, Norman
said. A network device driver originally written
and compiled for Microsoft Windows cannot be a
derivative work of Linux. NDISwrapper itself is
GPL-licensed, and probably a derivative work of Linux (it's very
difficult to make a kernel module that isn't).
The API used in the Windows driver clearly has nothing
to do with the copyleft API.
Another approach is to split the GPL code into
a server process and put the proprietary code in
the client. However, this is unlikely to work if the
server and client are distributed together on the same
CD, relying on what the GPL calls "mere aggregation."
Any "intimate communication" between the server
and client could also create a derivative work,
Norman said. The last approach is to time-shift
the creation of a derivative work to the end user.
An example is the NVIDIA proprietary device drivers
for Linux, which the company distributes separately
from the kernel. This only works for technical
or hobbyist end users, not for an integrated
product. There's also a potential patent problem:
distributing two pieces that, when combined, infringe a
patent constitutes contributory infringement, Norman said.
A problematic shimming scenario is using it to attempt
to undo a previous decision to combine software.
It could be "admitting that what you did was
problematic." If possible, try to buy an exception
from the copyright holder instead, Norman said.
Shimming is possible and might even be necessary,
as in the case of third-party code that can't be
relicensed. But the lesson is that companies will
save time, use fewer developers, make a simpler
product, and avoid legal bills by just sticking with the GPL.