By Jonathan Corbet
November 9, 2011
Back in August, there was a big fight over whether the user-space
"native Linux KVM tool" should be merged into the
mainline kernel repository. One development cycle later, we've had the
same fight with many of the same arguments and roughly the same result.
Sequels are rarely as good as the original; that applies to flame wars as
well as to more creative works. But there is a core issue here that has
relevance well beyond the kernel community: does the separation of projects
help the Linux community more than it hurts it?
The proponents of merging the tool into the kernel make a number of
points. Having the projects in the same repository makes development that
crosses the boundary between the two easier; in particular, it helps in the
creation of APIs that will stand the test of time. The project's overall
standards help to keep the quality of the tools high and the release cycle
predictable. Reuse of code
between user-space and kernel projects gets easier. All told, they say,
having the "perf" tool in the kernel tree has greatly helped its
development; see this message from Ingo
Molnar for a detailed description of the perceived advantages of this
mode of development. Artificial separation of projects, instead, is said
to have high costs; Ingo went so far as to claim that Linux lost the desktop market as
the result of an ill-advised separation of projects.
Opponents, instead, say that putting the kernel and the tools in the same
tree makes it easier to create API regressions for out-of-tree tools. The
reason that perf has a relatively good record on this front, Ted Ts'o said, has more to do with the
competence of the developers involved than its presence in the kernel
tree. Adding user-space tools bloats the kernel source distribution, puts
competing out-of-tree projects at a disadvantage, and, Ted said, creates a number of difficulties for
distributors.
The one concrete end result of the discussion was that the pull request for
the KVM tool was passed over by Linus who, feeling that he had enough stuff
for this development cycle already, did not want to wander into this
particular disagreement. It is not hard to imagine that he will get
another chance in a future development cycle; it does not seem that any
minds have been changed by the discussion so far.
In the middle of this discussion, it was asked whether it would make sense
to bring other projects into the kernel - GNOME, for example. It was
pointed out that BSD-based systems tend to be developed in this mode - an
existence proof that operating system development can work that way. Ted
responded (in the message linked above) as follows:
[T]here has speculation that this was one of many contributions to
why they lost out in the popularity and adoption competition with
Linux. (Specifically, the reasoning goes that the need to package
up the kernel plus userspace meant that we had distributions in the
Linux ecosystem, and the competition kept everyone honest. If one
distribution started making insane decisions, whether it's forcing
Unity on everyone, or forcing GNOME 3 on everyone, it's always
possible to switch to another distribution. The *BSD systems
didn't have that safety valve....)
One could note that BSD does have one safety valve: to fork the entire
system. That has happened a number of times in the history of BSD;
pointing this out, though, only serves to reinforce Ted's point.
Distributors play a crucial role in the Linux ecosystem; they function as
the middleman between most development projects and their users. Most of
us, most of the time, do not obtain the software we run directly from those
who wrote it; it comes, instead, nicely packaged from our distributor. As
they ponder each package, distributors (the successful ones, at least) will
be keeping their users' needs in mind. If the package has obnoxious
anti-social features or security problems, the distributors will either fix
it or leave the package out altogether. The recent Calibre mess is a prime example; aware
distributors had already eliminated the worst problems before they were
generally known.
Distributors make it possible to change the source of your operating system
without having to stop running Linux. Anybody who has been working with
Linux long enough has almost certainly switched distributions at least once
during that time; the process is not without its disruptions, but the
amount of pain is usually surprisingly low. The lack of lock-in in the
Linux world has improved life for users and, at the same time, given
distributors an incentive to improve the Linux experience for everybody.
The role of the distributors is made possible by the boundaries between the
projects. If the entire system were integrated into a single source tree,
there would be little space for the distributors to do their own
integration work. The lack of independent *BSD distributions makes this
point clear. That suggests that too much integration at the project level
might not be a good thing for Linux.
So one could make an argument that bringing GNOME into the kernel source
tree is probably a bad idea for this reason alone; Linux as a whole may be
better served by having the kernel and the desktop environments be separate
components that can be combined (or not) at will. That makes it clear (if
it wasn't before - your editor can be slow at times, please bear with him)
that there is a line to be drawn somewhere; bringing some projects into the
kernel source tree may be harmful for Linux even without considering the
effects on the kernel itself. But separating the kernel from some
user-space projects may have costs that are just as high. There is no
consensus, currently, on what those costs are or where the line should be
drawn.
All of this implies that the debate over the inclusion of the KVM tool has
an importance that goes beyond the fate of that one project. Does (as some
allege) the integration between perf and the kernel impede the development
of alternatives and hurt the performance tooling ecosystem as a whole?
Would the integration of the KVM tool put QEMU at the mercy of a
fast-changing, regression-prone API over which its developers have no
control? Are we better served by a fence between the kernel and user space
that is as well defined at the project level as it is at the API level?
Or, on the other hand, does keeping the KVM tool out of the kernel
repository slow its growth and hurt the capability and usability of Linux
tooling as a whole? And, importantly, what does the reasoning that leads to
an answer to these questions tell us about which other projects should - or
should not - find a home in the kernel tree?
These issues arise at a number of levels; some distributors, for example,
are increasingly taking control of parts of the system through
tightly-controlled in-house projects. Android is an extreme example of
this approach, but it can be found in more traditional distributions as
well. There are clear advantages to doing things that way, but it is worth
asking whether that behavior is good for Linux in the long term and just
where the line should be drawn. The fences between our projects may have
played an important role in both the successes and failures of Linux;
decisions on whether to strengthen them or tear them down need some serious
thought.
By Jake Edge
November 9, 2011
While the talks at the 2011 GStreamer conference mostly focused on the
multimedia framework itself—not surprising—there were also some
that looked at the wider multimedia ecosystem. One of those was Christopher "Monty"
Montgomery's presentation about Xiph.org, and its
work to promote free and open source multimedia. Xiph is known for its
work on the Ogg container format (and the Vorbis and Theora codecs), but
the organization has worked on much more than just those. In addition,
Montgomery outlined a new strategy that Xiph is trying out to combat one of the
biggest problems in the free multimedia world: codec patents.
Xiph was founded in 1994, originally as a for-profit company (Xiph.com)
that was set up to sell codecs. These days, it is a non-profit that
consists of various "loosely grouped" codec projects. All of the members
are volunteers, and
various FOSS companies pay the salaries of some of the members as donations
to Xiph.org. For example, Red Hat pays Montgomery's salary to allow him to
work on Xiph projects. The organization is "like a
coffee shop where skilled codec developers hang out", Montgomery
said.
Beyond Ogg, Vorbis, and Theora, there are a number of different projects
under the Xiph umbrella, Montgomery said. The cdparanoia compact disc
ripper program and library was something he wrote as a student that is now
part of Xiph. The Icecast streaming media server is another Xiph project,
he said, as are various codecs including Speex, FLAC, the new Opus
audio codec, and "a whole bunch of codecs that no one
remembers".
Xiph does hold "intellectual property", Montgomery said, and that is one of
the reasons it exists. Non-profits have an advantage when it comes to
patents because the board gets to decide what happens to the patents if the
organization goes out of business. That's different from for-profit
companies that go bankrupt, he said, because whoever buys the assets gets
the patents free of any promises or other entanglements (at least those
that aren't legally binding, like licenses). If the original company
promised not to assert some patents (e.g. for free software implementations or
to implement a standard), a new owner may
not be bound by that promise. A non-profit's board can ensure that
any patents end up with a like-minded organization, he said.
Codec news
The biggest Xiph news in the recent past is that Google chose Vorbis as the
audio codec for WebM. Montgomery said that he is very happy to see Vorbis
included in WebM, but is also glad to see that Google is stepping up to
help the cause of free codecs. Xiph has been trying to "hold the
line on free codecs", mostly by themselves, he said. He is hopeful
that Google picking up some of that will allow Xiph to "go back to
what we are actually good at", which is codec development.
Xiph will be continuing to do more codec development because the members
enjoy doing so, Montgomery said. Revising the Ogg container format is one
thing that's on the plate now. That is not something that Xiph wanted to
do while Ogg was part of its effort to hold the free codec line. With
the advent of WebM, which uses the Matroska container format, some of the
"legitimate complaints" about Ogg can now be addressed.
FLAC is now finished, he said. It is stable and mature with good
penetration; it is essentially the standard for lossless audio codecs, and
one that Apple has been unable to overturn, Montgomery said. He also noted
that there were plans for a Theora 1.2 release that never happened, partly
because "everyone went to work on VP8 and Opus". He believes
that the release will still happen at some point, but that the pressure is
off because of the existence of WebM.
Opus is a new audio codec that incorporates pieces from Xiph's CELT codec
and Skype's SILK codec. Opus is designed for streaming voice or other audio
over the internet, and is the subject of an IETF Internet-draft. As is usual
for such documents, Intellectual Property Rights (IPR) disclosures were made
by various parties who believed they had IP (e.g. patents) that is required
to implement the proposed standard. Qualcomm has filed such a disclosure for
Opus, but, unlike the other disclosing organizations, Qualcomm has not
offered its patents under a royalty-free license.
Patent strategy
Montgomery was clear that he wasn't singling out Qualcomm in his talk,
because what it has done is "business as usual" in the industry, and
Qualcomm is "not in any sense alone" in making these kinds of
claims. But it has recently led Xiph to spend almost as much time on patent
strategy as on writing code. Part of the problem is that these IPR
disclosures are immediately assumed to be valid by everyone, whether they
know something about patents in that space or not. The
presumption is that Qualcomm would never have made the claims without doing
a great deal of research.
But Montgomery is not convinced that there is much of substance to
Qualcomm's claims. The patent game is essentially a protection racket, he
said, and those who are trying to do things royalty-free are messing things
up for those who want to collect tolls. "The industry is pissed at
Google because they won't play the protection racket game", he
said. Qualcomm and others just list some
patents that look like they could plausibly read on a royalty-free codec,
because it doesn't cost them anything.
That leaves Xiph with few options, though. There is the
"thermonuclear option" of going to court and getting a
declaratory judgement, but there are some major downsides to pursuing that
strategy. It
will take a lot of time and money to do so and "no one will use it
while the litigation is going on". Montgomery's original
inclination was to pursue a declaratory judgement, to "bash in some
teeth" and "show that Xiph.org is not to be trifled
with". But even if Xiph won, it would only impact those few patents
listed by Qualcomm. What is needed is a way to "change 'business as
usual'", he said.
Companies "have figured out how to fight 'free'",
Montgomery said, by making it illegal. In order to fight back through the
courts, there would be an endless series of cases that would have to be
won, and each of those wins would not hurt the companies at all. There is
a "presumption of credibility" when a patent holder makes a
claim of infringement, and the press "plays along with that",
he said. But Eben Moglen has pointed out that an accusation of
infringement has no legal weight, so there is no real downside to making
such a claim.
One way to combat that is to document why the patents don't apply.
Basically, Xiph did enough research to show why the Qualcomm patents don't
apply to Opus and it is planning to release that information. It is a
dangerous strategy at some level because it gives away some of the defense
strategy, he said, but Xiph has to try something. By publishing the results
of the research, Xiph will be "giving away detailed knowledge of the
patents" and may be called to testify if those patents ever do get
litigated, but it should counter the belief that the Qualcomm patents cover
Opus.
Qualcomm could respond to the research in several different ways. It could
ignore it, respond to it, or come back with more patents. It could also
formally abandon the claim. If Qualcomm doesn't respond,
Montgomery said, that does have some legal weight. One advantage of this
approach is that regardless of how Qualcomm responds, Xiph has something
concrete (i.e. the research) for the money that it has spent, which is not
really the case when taking the declaratory judgement route.
New codecs
Montgomery called Opus a "best in class codec" that Xiph would
like to see widely used. Hardware implementations of Opus have been
considered, but have not been done yet, he said. Finishing the Opus rollout and
"responding to patent claims" have been higher on the list,
but Xiph will get to the hardware work eventually.
He mentioned two other codecs that Xiph will be working on, including
Ghost,
which splits audio into two components: strong tones and everything
else. Each of the components will be processed separately, much like what
the ears do, he said. Both can be represented compactly, but the same
transforms don't work on them, so representing them separately may make
sense. There was a need to "invent some amount of math for all of
this", he said. In addition, Xiph will be working on a new video
codec that is being done as part of a "friendly rivalry with
On2" (makers of the VP8 codec in WebM).
Montgomery painted a picture of an organization that is doing a great deal
to further the cause of
free multimedia formats. There are lots of technical and political
battles to fight, but Xiph.org seems to be up to the task. It will be
interesting to see how Qualcomm responds to the Opus research, and
generally how the codec patent landscape plays out over the next few
years. The battle is truly just beginning ...
[ I'd like to thank the Linux Foundation for helping with travel expenses
so that I could attend the GStreamer conference. ]
By Jake Edge
November 9, 2011
One of the outcomes from the kernel.org compromise is the
increased use of GPG among kernel developers. GPG keys are now required to
get write access to the kernel.org Git repositories, and folks are starting
to think about how to use those keys for other things. Authenticating the
pull requests that kernel hackers send to Linus Torvalds is one possible
use. But, as the discussion on the linux-kernel mailing list shows, there
are a few different use-cases that
might benefit from cryptographic signing.
Most of the code that flows into the kernel these days comes from
Git trees that various lieutenants or maintainers manage. During the merge
window (and at other times), Torvalds is asked to "pull" changes from these
trees via an email from the maintainer. In the past, Torvalds has used some
ad hoc heuristics to determine whether to trust that the request (and the
tree) are valid, but, these days, stronger assurances are needed.
That's where GPG signing of commits and tags may be able to help.
Conceptually, the idea is simple: the basic information required to do a
pull (location and branch of the Git tree along with the commit ID of its
head) could be signed by the developer requesting the pull. Torvalds could
then use GPG with
his keyring of kernel developer public keys to verify that the signature is
valid for the person who sent the request. That would ensure that the pull
request is valid. It could all be done manually, of course, but it could
also be automated by making some changes to Git.
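As a rough illustration of the manual version of that idea, here is a minimal Python sketch. It is not the format or tooling that the kernel developers settled on: the three-field layout, the function names, and the example repository URL are all assumptions made for the example. It relies only on the standard gpg command line (--clearsign, --verify, and the --status-fd machine-readable output, where a VALIDSIG line carries the fingerprint of the signing key).

```python
import subprocess
from typing import Optional


def format_pull_request(url: str, branch: str, commit: str) -> str:
    """Build a small, fixed-format description of what should be pulled.

    The field names and ordering are hypothetical; the point is only that
    the signed text is short and has a predictable layout.
    """
    return f"repository: {url}\nbranch: {branch}\ncommit: {commit}\n"


def clearsign(text: str) -> str:
    """Sign text with the requester's default key (needs gpg and a secret key)."""
    result = subprocess.run(
        ["gpg", "--clearsign"],
        input=text.encode(), capture_output=True, check=True,
    )
    return result.stdout.decode()


def signer_fingerprint(status_output: str) -> Optional[str]:
    """Pull the key fingerprint out of gpg's --status-fd output, if present."""
    for line in status_output.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0] == "[GNUPG:]" and parts[1] == "VALIDSIG":
            return parts[2]
    return None


def verified_signer(signed_text: str) -> Optional[str]:
    """Return the signing key's fingerprint, or None if verification fails.

    Verification uses whatever public keys are in the integrator's keyring.
    """
    result = subprocess.run(
        ["gpg", "--status-fd", "1", "--verify"],
        input=signed_text.encode(), capture_output=True,
    )
    if result.returncode != 0:
        return None
    return signer_fingerprint(result.stdout.decode())
```

The requester would run something like clearsign(format_pull_request("git://example.org/linux.git", "for-linus", commit_id)) and paste the result into the mail; the integrator would then compare the fingerprint returned by verified_signer() against the key expected for that sender. A good signature alone only says that some key in the keyring signed the text, so the fingerprint comparison is the step that ties the request to a person.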
The discussion on how to do that automation started after a signed pull
request for libata updates was posted by Jeff Garzik. The entire pull request
mail (some 3200+ lines including the diffs and diffstat) was GPG signed,
which mangled the diff output as Garzik noted. Beyond that,
though, it is unwieldy for Torvalds to check the signature, partly because
he uses the GMail web interface. In order to check it, he has to cut and
paste the entire message and feed it to GPG, which is labor intensive and
risks mangling the message (through white-space or other changes) in a way
that would lead to a false-negative signature verification. As
Torvalds noted: "We need to automate this some sane way, both for the
sender and for the recipient."
The initial goal is just to find a way to ensure that Torvalds knows who
the pull
request is coming from and where to get it, all of which could be handled
outside of Git. Rather than signing the entire pull request email, just a
small, fixed-format piece of that mail could be signed. In fact, Torvalds
posted a patch to git-request-pull
to do just that. It still leaves the integrator (either Torvalds or a
maintainer who is getting a pull request from another developer) doing a
cut-and-paste into GPG for verification, however.
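The cut-and-paste step is the sort of thing a small script can take over. The following sketch is only an illustration, not part of Torvalds's patch: it assumes the signed piece is an ordinary OpenPGP cleartext-signed block, and the function name is made up. It picks such blocks out of a mail body so that only the small signed part, and not the diffstat around it, needs to be handed to GPG.

```python
from typing import List

# Armor lines that delimit an OpenPGP cleartext-signed message.
BEGIN = "-----BEGIN PGP SIGNED MESSAGE-----"
END = "-----END PGP SIGNATURE-----"


def extract_signed_blocks(mail_body: str) -> List[str]:
    """Return every cleartext-signed block found in mail_body.

    Lines inside a block are kept as they appear in the mail; everything
    outside the blocks (greeting, shortlog, diffstat) is ignored.
    """
    blocks = []
    current = None  # lines of the block being collected, if any
    for line in mail_body.splitlines():
        if line.rstrip() == BEGIN:
            current = [line]
        elif current is not None:
            current.append(line)
            if line.rstrip() == END:
                blocks.append("\n".join(current) + "\n")
                current = None
    return blocks
```

Each returned block can be piped to "gpg --verify" unchanged. Keeping the signed part small also shrinks the amount of text that a mail client has the opportunity to rewrap or otherwise alter.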
There are others who have an interest in a permanent trail of signatures
that could be audited if the provenance of a particular part of the kernel
needs to be traced. That would require storing the signatures inside the
Git tree somehow, so that anyone with a copy of Torvalds's tree could see
any of the commits that had been signed, either by Torvalds or by some
other kernel hacker. But, as Torvalds pointed
out, that information is only rarely useful:
Having thought about it, I'm also not convinced I really want to
pollute the "git log" output with information that realistically
almost nobody cares about. The primary use is just for the person who
pulls things to verify it, after that the information is largely stale
and almost certain to never be interesting to anybody ever again. It's
*theoretically* useful if somebody wants to go back and re-verify, but
at the same time that really isn't expected to be the common case.
Torvalds's idea is that the generation of the pull request is the proper time for a developer
to sign something, rather than having it tied to a specific commit. His
example is that a developer or maintainer may wish to push the tree out for
testing (or to linux-next), which requires that it be committed, but then
request a pull for that same commit if it passes the tests. Signing before
testing has been done is likely to be a waste of time, but signing the
commit later requires amending the commit or adding a new empty commit on
top, neither of which was very palatable. Git maintainer
Junio C. Hamano is not convinced that
ephemeral signatures (i.e. those that only exist for the pull-request) are
the right way to go, though: "But my gut feeling is that 'usually hidden not to disturb normal users,
but is cast in stone in the history and cannot be lost' strikes the right
balance."
The conversation then turned toward tags, which can already be signed with
a GPG key. One of the problems is that creating a separate tag for each
commit that gets signed rapidly becomes a logistical nightmare. If you
just consider the number of trees that Torvalds pulls in a normal merge
window (hundreds), the growth in the number of signed tags becomes
unwieldy quickly. If you start considering all of the sub-trees that get
pulled into the trees that Torvalds pulls, it becomes a combinatorial
explosion of tags.
What's needed is an automated method of creating tag-like entries that live
in a different namespace. That's more or less what Hamano proposed by adding a refs/audit
hierarchy into the .git directory data structures. The audit objects would act much like tags, but
instead carry along information about the signature verification status of
the merges that result from pulls. In other words, a git-pull
would verify the signature associated with the remote tag (often
something like "for-linus" that gets reused over and over) and create an entry
in the local audit hierarchy
that recorded the verification. Since the audit objects wouldn't pollute
the tag namespace, and would be pulled and created automatically, they would
have much less of an impact on users and existing tools. In addition,
the audit objects could then be pushed
into Torvalds's public tree so that audits could be done.
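To make the namespace argument concrete, here is a toy in-memory model in Python. It does not reflect Hamano's actual design or any Git internals: the idea of keying each entry by the merge commit, the record fields, and all of the names are assumptions made purely for illustration. What it shows is that a reused tag name like "for-linus" can be verified many times without any of those verifications colliding or landing under refs/tags.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical record of one verified pull."""
    merge_commit: str  # ID of the merge commit the pull created
    remote_tag: str    # signed tag that was pulled, e.g. "for-linus"
    signer: str        # fingerprint of the key that signed the tag


def audit_ref(entry: AuditEntry) -> str:
    """Name the entry under refs/audit/, keyed by the merge commit (assumed layout)."""
    return f"refs/audit/{entry.merge_commit}"


def record_pull(refs: Dict[str, AuditEntry], entry: AuditEntry) -> str:
    """Store the verification result; refuse to overwrite an existing entry."""
    name = audit_ref(entry)
    if name in refs:
        raise ValueError(f"{name} already recorded")
    refs[name] = entry
    return name
```

Recording two pulls of the same "for-linus" tag from two different maintainers yields two distinct refs/audit/ entries, while the tag namespace stays untouched.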
So far, Hamano has posted a patch set that
implements parts of his proposed solution. In particular, it allows for
signing commits, verifying the signatures, and for pulling signed tags.
Other pieces of the problem are still being worked on.
As is often the case in our communities, adversity results in pretty rapid
improvements. For the kernel, the SCO case brought about the Developer's Certificate of
Origin, the relicensing of BitKeeper gave us Git, the kernel.org
break-in brought about a closer scrutiny of security practices, and the adoption
of GPG keys because of that break-in will likely lead to even better
assurances of the provenance of kernel code. While we certainly don't want
to court adversity, we certainly do take advantage of it when it happens.
Page editor: Jonathan Corbet