Brief items
The current development kernel is 3.11-rc2,
released on July 21. Linus says:
"the O_TMPFILE flag that is new to 3.11 has been going through a few
ABI/API cleanups (and a few fixes to the implementation too), but I think
we're done now. So if you're interested in the concept of unnamed temporary
files, go ahead and test it out. The lack of name not only gets rid of
races/complications with filename generation, it can make the whole thing
more efficient since you don't have the directory operations that can cause
serializing IO etc."
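For readers who want to try the concept from a script, the following is a minimal sketch in Python. It assumes a Linux system where os.O_TMPFILE is defined and a filesystem that supports the flag; the helper name write_unnamed_tempfile is invented for this illustration. Where the flag is unavailable or rejected, the sketch falls back to the standard library's tempfile.TemporaryFile().

```python
import os
import tempfile


def write_unnamed_tempfile(directory: str, payload: bytes) -> bytes:
    """Write payload to an unnamed temporary file and read it back.

    Hypothetical helper for illustration only. Uses O_TMPFILE when the
    platform and filesystem support it, otherwise falls back to
    tempfile.TemporaryFile().
    """
    o_tmpfile = getattr(os, "O_TMPFILE", None)  # only defined on Linux
    if o_tmpfile is not None:
        try:
            # With O_TMPFILE the path names a directory; the new file
            # never gets a name inside it.
            fd = os.open(directory, o_tmpfile | os.O_RDWR, 0o600)
        except OSError:
            fd = None  # older kernel or filesystem without support
        if fd is not None:
            with os.fdopen(fd, "r+b") as f:
                f.write(payload)
                f.seek(0)
                return f.read()
    # Fallback: let the standard library pick its own strategy.
    with tempfile.TemporaryFile(dir=directory) as f:
        f.write(payload)
        f.seek(0)
        return f.read()
```

Either way, no entry should be left behind in the directory once the function returns, which is the point of the exercise.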
Stable updates: 3.10.2, 3.9.11, 3.4.54, and 3.0.87 were released on July 21; 3.9.11
is the last of the 3.9 series.
As of this writing, 3.10.3 and 3.2.49 are in the review process; they can be
expected sometime on or after July 25.
I'm not even going to speculate why people interested in InfiniBand
switches end up buying paper towels.
—
Roland Dreier
I'm cantankerous, and hard to please. Send me too much and I yell,
and send me too little and I yell. Because I'm the Goldilocks of
kernel development, and I want my pull requests "just right".
—
Linus Torvalds
Though I must confess that I have shifted from being mostly worried
about people yelling at me to being mostly worried about my own
code yelling at me. Either way, I do find that being worried about
some consequence or another does help me get a better result.
—
Paul McKenney
Kernel development news
By Jonathan Corbet
July 24, 2013
Kernel developers working on the x86 architecture are spoiled; they develop
for hardware that, for the most
part, identifies itself when asked, with the result that it is usually easy
to figure out how a specific machine is put together. Other architectures
— most notably ARM — are rather messier in this regard, requiring the
kernel to learn about the configuration of the hardware from somewhere
other than the hardware itself. Once upon a time, hard-coded "board files"
were used to build ARM-system-specific kernels; more recently, the
device tree mechanism
has emerged as the preferred way to describe a system to the kernel. A
device tree file provides the enumeration information that the hardware
itself does not, allowing the kernel to understand the configuration of the
system it is running on. The device tree story is one of success, but,
like many such stories, success is bringing on some growing pains.
A device tree "binding" is the specification of how a specific piece of
hardware can be described in the device tree data structure. Most drivers
meant to run on platforms where device trees are used include a
documentation file describing that driver's bindings; see Documentation/devicetree/bindings/net/can/cc770.txt
as a randomly chosen example. The kernel contains nearly 800 such files,
plus hundreds more ".dts" files describing complete
system-on-chips and boards, and the number is growing rapidly.
Maintenance of those files is proving to be difficult for a number of
reasons, but the core of the problem can be understood by realizing that a
device tree
binding is a sort of API that has been exposed by the kernel to the world.
If a driver's
bindings change in an incompatible way, newer kernels may fail to boot on
systems with older device trees. Since the device tree is often buried in
the system's firmware somewhere, this kind of problem can be hard to fix.
But, even when the fix is easy, the kernel's normal API rules should apply;
newer kernels should not break on systems where older kernels work.
The clear implication is that new device tree bindings need to be reviewed
with care. Any new bindings should adhere to existing conventions, they
should describe the hardware completely, and they should be supportable
into the future. And this is where the difficulties show up, in a couple
of different forms: (1) most subsystem maintainers are not device tree
experts, and thus are not well equipped to review new bindings, and
(2) the maintainers who are experts in this area are overworked
and having a hard time keeping up.
The first problem was the subject of a
request for a Kernel Summit discussion with the goal of educating
subsystem maintainers on the best practices for device tree bindings.
One might think that a well-written document would suffice for this
purpose, but, unfortunately, these best practices still seem to be in the
"I know it when
I see it" phase of codification; as Mark Brown put it:
At the minute it's about at the level of saying that if you're not
sure or don't know you should get the devicetree-discuss mailing
list to review it. Ideally someone would write that document,
though I wouldn't hold my breath and there is a bunch of convention
involved.
Said mailing list tends to be overflowing with driver postings, though,
making it less useful than one might like. Meanwhile, the best guidance,
perhaps, came from David Woodhouse:
The biggest thing is that it should describe the *hardware*, in a
fashion which is completely OS-agnostic. The same device-tree
binding should work for Solaris, *BSD, Windows, eCos, and
everything else.
That is, evidently, not always the case at present; some device tree
bindings can be strongly tied to specific kernel versions. Such bindings
will be a maintenance problem in the long term.
Keeping poorly-designed bindings out of the mainline is the responsibility
of the device tree maintainers, but, as Grant Likely (formerly one of those
maintainers) put it, this maintainership
"simply isn't working right now." Grant, along with Rob
Herring, is unable to keep up with the stream of new bindings (over 100 of
which appeared in 3.11), so a lot of substandard bindings are finding their
way in. To address this problem, Grant has announced a "refactoring" of
how device tree maintainership works.
The first part of that refactoring is Grant's own resignation, with lack of
time given as the reason. In his place, four new maintainers (Pawel Moll,
Mark Rutland, Stephen Warren and Ian Campbell) have been named as being
willing to join Rob and take responsibility for device tree bindings;
others with an interest in this area are encouraged to join this group.
The next step will be for this group to figure out how device tree
maintenance will actually work; as Grant noted, "There is not yet any
process for binding maintainership." For example, should there be a
separate repository for device tree bindings (which would make review
easier), or should they continue to be merged through the relevant
subsystem trees (keeping the code and the bindings together)? It will take
some time, and possibly a Kernel Summit discussion, to figure out a proper
mechanism for the sustainable maintenance of device tree bindings.
Some other changes are in the works. The kernel currently contains
hundreds of .dts files providing complete device trees for
specific systems; there are also many .dtsi files describing
subsystems that can be included into a complete device tree. In the short
term, there are plans to design a schema that can be used to formally
describe device tree bindings; the device tree compiler utility
(dtc) will then be able to verify that a given device tree file
adheres to the schema. In the longer term, those device tree files are
likely to move out of the kernel entirely (though the binding documentation
for specific devices will almost certainly remain).
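To give a feel for what schema-based checking might involve, here is a toy sketch in Python. It is not the schema format under discussion (none has been designed yet) and has nothing to do with how dtc works internally; the binding layout, the "vendor,example-can" compatible string, and the check_node() helper are all assumptions made up for this illustration.

```python
from typing import Any, Dict, List

# Hypothetical binding description; NOT a real kernel or dtc format.
EXAMPLE_BINDING: Dict[str, Any] = {
    "compatible": ["vendor,example-can"],
    "required": ["compatible", "reg", "interrupts"],
    "types": {"compatible": str, "reg": list, "interrupts": list},
}


def check_node(node: Dict[str, Any], binding: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the node conforms."""
    problems = []
    # Every property the binding requires must be present.
    for name in binding["required"]:
        if name not in node:
            problems.append(f"missing required property '{name}'")
    # Every property present must be known and have the expected type.
    for name, value in node.items():
        expected = binding["types"].get(name)
        if expected is None:
            problems.append(f"unknown property '{name}'")
        elif not isinstance(value, expected):
            problems.append(f"property '{name}' should be {expected.__name__}")
    # The node must claim a compatible string that this binding covers.
    if node.get("compatible") not in binding["compatible"]:
        problems.append("compatible string not covered by this binding")
    return problems
```

A node that omits "interrupts" or supplies a string where a cell list is expected would be reported here, which is the sort of mistake an automated check could catch before a human reviewer ever sees the patch.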
All told, the difficulties with device trees do not appear to be anything
other than normal growing pains. A facility that was once only used for a
handful of PowerPC machines (in the Linux context, anyway) is rapidly
expanding to cover a sprawling architecture that is in wide use. Some
challenges are to be expected in a situation like that. With luck and a
fair amount of work, a better set of processes and guidelines for device
tree bindings will result from the discussion — eventually.
By Jonathan Corbet
July 24, 2013
The exFAT filesystem is a
Microsoft product, designed for flash media. It lacks support in the Linux
kernel; as a proprietary, heavily patented filesystem, it is not the sort
of thing one would expect to see free support for. Still, when the
exfat-nofuse repository
showed up on GitHub, some dared to hope that Linux would gain exFAT support
after all. Instead, what we appear to have gained is an ugly licensing
mess and code that is best avoided.
From what can be determined by looking at the repository, the code appears
to work. It was originally written by Samsung, it seems, and was shipped
with one or more Android devices. The problem is that, as far as anybody
can tell, Samsung never intended to distribute this code under the GPL.
Instead, a GitHub user who goes by "rxrz" somehow came by a copy of the
code, removed the original proprietary licensing headers, and inserted a
GPL license declaration into the code. The code claimed to have a GPL
license, but the copyright owner never released the code under that
license.
On July 9, another GitHub user filed a bug noting
that the license declaration was incorrect and suggesting a removal of the
repository. The entity known as rxrz was not impressed, though, saying:
It's a leaked code of a proprietary exfat driver, written by
Samsung, Inc. It works, you can use it. What else do you want, a
signed paper from your parents on whether you can or can not use
it? I'm a programmer, not a lawyer. You got the code, now decide
what to do with it, it's up to you.
The code has since been edited to remove the GPL declaration and restore
the proprietary license, but it
remains available on GitHub and rxrz evidently feels that nothing wrong was
done by posting it there. It also appears that GitHub has no interest in pulling
down the repository in the absence of an explicit takedown notice from
Samsung, so this "leaked" driver may remain available for some time.
This whole episode seems like a fairly straightforward case of somebody
trying to liberate proprietary code by any means available. There are some
interesting questions
raised by all of this, though. The first of those is: what if somebody had
tried to merge this code into the mainline kernel? The immediate answer is
that they would have been chased off the list once developers actually had
a look at the code, which, to put it gently, does not much resemble Linux
kernel code. In the absence of this obvious barrier, one can hope that our
normal review mechanisms would have kept this code from being merged until
the developer was able to provide a satisfactory explanation of where it
came from.
But it is not clear that all of our code is reviewed to that level, so it
is hard to be sure. An exFAT implementation is likely to attract enough
attention to ensure that the right questions are asked. Had the code in
question been a driver for a relatively obscure piece of hardware, instead,
it might not have been looked at very closely.
Then, one might ask: why is Samsung shipping this as a proprietary module
in the first place? After all, Samsung appears to have figured out how
Linux kernel development works and has made a solid place for itself as one
of the largest contributors to the kernel. One can only guess at the
answer, but it likely has to do with claims that Microsoft makes over the
exFAT format. Microsoft has shown itself to be willing to assert patents
on filesystem formats, so taking some care with an implementation of a new
Microsoft filesystem format would seem like an exercise in basic prudence.
Whether this exercise led to ignoring the GPL in an imprudent
manner is the subject of another debate entirely.
Similarly, some prudence would be advisable for anybody thinking to use
the code
as a reverse-engineering tool for a new exFAT implementation. It is hard
to reverse-engineer one's way around patent problems. exFAT may well be a
format that is best left alone.
Finally, for those who have been in this community for a long time, the
attitude revealed by a number of participants in the GitHub issue thread
may be surprising. Licensing, GPL or otherwise, appears not to matter to
many of these people. All that matters is that the code can be downloaded
and that it works. This attitude can be found elsewhere on GitHub; indeed,
many have complained that GitHub itself seems to be indifferent at best to
the licensing of the code it distributes.
Perhaps we are heading into some sort of post-copyright era where licensing
truly no longer matters. But it would not be surprising if those who are
interested in copyright resist that future for a while yet. We are not
just talking about the entertainment industry here; the simple
fact of the matter is that anybody who values the provisions of the GPL is
indeed interested in copyright. It is hard to demand respect for the GPL
while refusing to respect the terms of other licenses.
Among other things, that means that the kernel community must continue to
be careful not to incorporate code that has not been contributed under a
suitable license. So code that shows up on the net must be looked at
carefully, no matter how useful it appears to be. In this case, there was
no danger that the exFAT code would ever be merged; nobody even suggested
that it should be. But there will be other modules of dubious provenance
in the future, some of which may seem more legitimate at first glance. Even
then, though, our processes should be good enough to find the problems and
avoid a merger that we will later regret. Hopefully.
(Thanks to Armijn Hemel for the heads-up).
By Jonathan Corbet
July 24, 2013
Tens of thousands of changes make their way into the mainline kernel every
year. For most of those changes, the original motivation for the work is
quickly forgotten; all that remains is the code itself and the changelog
that goes with it. For this reason, kernel maintainers tend to insist on
high-quality changelogs; as Linus recently put it,
"We have a policy of good commit messages in the
kernel." Andrew Morton also famously pushes developers to document
the reasons explaining why a patch was written, including the user-visible
effects of any bugs fixed. Kernel developers do not like having to reverse
engineer the intent of a patch years after the fact.
With that context in mind, and having just worked through another merge
window's worth of patches, your editor started wondering if our changelogs
were always as good as they should be. A bit of scripting later, a picture
of sorts has emerged; as one might expect, the results were not
always entirely encouraging.
Changelogs
A patch's changelog is divided into three parts: a one-line summary,
a detailed change explanation, and a tags section. For the most
trivial patches, the one-line summary might suffice; there is not much
to add to "add missing include of foo.h", for example. For anything else,
one would expect a bit more text describing what is going on. So patches
with empty explanation sections should be relatively rare.
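The three-part division can be sketched in Python as follows. This is not the script used to produce the numbers in this article; the tag-matching pattern (anything ending in "-by:", plus "Cc:") and the names split_changelog and Changelog are assumptions made for the illustration, and a real survey would need to handle more corner cases.

```python
import re
from typing import List, NamedTuple

# Heuristic: trailer lines look like "Signed-off-by: ...", "Acked-by: ...",
# "Reported-and-tested-by: ...", or "Cc: ...". This is an assumption.
TAG_RE = re.compile(r"^(?:[A-Z][a-z]+(?:-[a-z]+)*-by|Cc):\s")


class Changelog(NamedTuple):
    summary: str
    explanation: str
    tags: List[str]


def split_changelog(message: str) -> Changelog:
    """Split a commit message into summary, explanation, and tags."""
    lines = message.strip().splitlines()
    summary = lines[0] if lines else ""
    body = lines[1:]
    tags: List[str] = []
    # Peel tag lines (and blank lines) off the end of the message.
    while body:
        last = body[-1].strip()
        if last and not TAG_RE.match(last):
            break
        body.pop()
        if last:
            tags.append(last)
    tags.reverse()  # restore the original top-to-bottom order
    explanation = "\n".join(body).strip()
    return Changelog(summary, explanation, tags)
```

A patch has an "empty explanation" in the sense used here when the explanation field comes back as an empty string.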
As of this writing, just under 70,000 non-merge changesets have been pulled
into the mainline repository since the release of the 3.5 kernel on
July 21, 2012. Of those, 6,306 had empty explanations — 9% of the
total. Many of them were as trivial as one might expect, but others were
rather less so.
Some developers are rather more laconic than others. In the period since
3.5, the developers most inclined to omit explanations were:
| Developer | Count |
| Al Viro | 570 |
| Ben Skeggs | 224 |
| Mark Brown | 213 |
| Hans Verkuil | 204 |
| Andreas Gruenbacher | 143 |
| Axel Lin | 130 |
| Philipp Reisner | 126 |
| Antti Palosaari | 118 |
| James Smart | 107 |
| Alex Deucher | 85 |
| Laurent Pinchart | 84 |
| Kuninori Morimoto | 75 |
| Eric W. Biederman | 75 |
| Pavel Shilovsky | 72 |
| Rafał Miłecki | 72 |
| David S. Miller | 65 |
| David Howells | 61 |
| Peter Meerwald | 61 |
| Maxime Ripard | 55 |
| YOSHIFUJI Hideaki | 51 |
For the curious, a page listing the
no-explanation patches merged by the above developers is available. A
quick look shows that a lot of patches with empty explanations find their way
into the core virtual filesystem layer; many of the rest affect graphics
drivers, audio drivers, Video4Linux drivers, and the DRBD subsystem. But
they can be found anywhere; of the 1,065 changes that touched the
mm/ subdirectory, 46 lacked an explanation, for example.
If one believes that there should be fewer patches with empty explanations
going into the kernel, one might be inclined to push subsystem maintainers
to be a bit more demanding in this regard. But, interestingly, it has
become much harder to determine which maintainers have had a hand in
directing patches into the kernel.
Signoffs
The Signed-off-by line in the tags section is meant to document the
provenance of patches headed into the mainline. When a developer submits a
patch, the changelog should contain a signoff certifying that the patch can properly
be contributed to the kernel under a GPL-compatible license. Additionally,
maintainers who accept patches add their own signoffs documenting that they
handled the patch and that they believe it is appropriate for submission to
the mainline. In theory, by following the sequence of Signed-off-by lines,
it is possible to determine the path that any change followed to get to
Linus's tree.
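Following the chain amounts to reading the Signed-off-by lines in order. A minimal Python sketch, with the function name signoff_chain and the sample names chosen purely for illustration:

```python
from typing import List


def signoff_chain(message: str) -> List[str]:
    """Return the Signed-off-by entries of a commit message, in order.

    By convention the first entry is normally the author and each later
    entry is a maintainer who handled the patch on its way upstream.
    """
    chain = []
    for line in message.splitlines():
        line = line.strip()
        if line.lower().startswith("signed-off-by:"):
            # Keep everything after the first colon: "Name <address>".
            chain.append(line.split(":", 1)[1].strip())
    return chain


# Hypothetical commit message used only to exercise the function.
EXAMPLE = """net: fix thing

Explanation of the fix.

Signed-off-by: Author One <author@example.com>
Acked-by: Reviewer <rev@example.com>
Signed-off-by: Subsystem Maintainer <maint@example.com>
"""
```

Calling signoff_chain(EXAMPLE) yields the author followed by the subsystem maintainer; the Acked-by line is not part of the chain.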
The truth is a little bit more complicated than that. To begin with, of
the changes merged since 3.5, 79 had no signoffs at all. Roughly half of
those were commits by Linus changing the version number; he does not apply
a signoff to such changes, even for those
that contain added data beyond the version number update. The rest are
all almost certainly mistakes; a handful are the result of obvious
formatting errors. See the full
list for details. The mistakes are innocent, but they do show a
failure of a process which is supposed to disallow patches that have not
been signed off by their authors.
Arguably, there is another class of patches that is more interesting: those
that contain a single Signed-off-by line. Such patches have, in theory,
been managed by a single developer who wrote the patch and got it into the
mainline unassisted. One might think that only Linus is in a position to
do any such thing; how could anybody else get a change into the mainline on
their own?
In fact, of the 70,000 patches pulled into the mainline during the period
under discussion, 16,651 had a single signoff line. Of those, 11,527 (16%
of the total) had no other tags, like Acked-by, Reviewed-by, or Tested-by,
that would indicate attention from at least one other developer. For the
purposes of this discussion, only the smaller set of patches has been
considered. The most frequent committers of single-signoff patches are:
| Developer | Count |
| Al Viro | 891 |
| Takashi Iwai | 525 |
| Mark Brown | 492 |
| Johannes Berg | 414 |
| Alex Deucher | 391 |
| Mauro Carvalho Chehab | 389 |
| Ben Skeggs | 362 |
| Greg Kroah-Hartman | 292 |
| Trond Myklebust | 279 |
| David S. Miller | 264 |
| Felipe Balbi | 259 |
| Tomi Valkeinen | 258 |
| Arnaldo Carvalho de Melo | 172 |
| Eric W. Biederman | 147 |
| Josef Bacik | 145 |
| Shawn Guo | 142 |
| J. Bruce Fields | 141 |
| Ralf Baechle | 132 |
| Arnd Bergmann | 131 |
| Samuel Ortiz | 129 |
(See this page for a list of the
single-signoff patches merged by the above developers).
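The classification behind these numbers can be approximated with a few lines of Python. This is a sketch, not the script actually used for the article; in particular, the set of "other tags" checked here (Acked-by, Reviewed-by, Tested-by) is an assumption based on the examples named in the text, and the category labels are invented.

```python
# Tags taken to indicate attention from another developer (an assumption).
REVIEW_TAGS = ("acked-by:", "reviewed-by:", "tested-by:")


def classify(message: str) -> str:
    """Place a commit message into one of four signoff categories."""
    lines = [line.strip().lower() for line in message.splitlines()]
    signoffs = sum(1 for line in lines if line.startswith("signed-off-by:"))
    reviewed = any(line.startswith(REVIEW_TAGS) for line in lines)
    if signoffs == 0:
        return "no-signoff"
    if signoffs > 1:
        return "multiple-signoffs"
    # Exactly one signoff: was anybody else visibly involved?
    return "single-signoff-with-tags" if reviewed else "single-signoff-only"
```

Only patches landing in the "single-signoff-only" category correspond to the smaller set counted in the table above.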
The prevalence of single-signoff patches is, of course, a consequence of the use of Git combined with the
no-rebasing rule. Once a patch has been
committed to a public repository, it becomes immutable and can never again
acquire tags like Signed-off-by. To pick one example from the list above,
wireless developer Johannes Berg maintains his own tree for mac80211
changes; when he commits a patch, it will carry his signoff. Changes flow
from that tree to John Linville's wireless tree, then to Dave Miller's
networking tree, and finally to the mainline repository. Since each of
those moves is done with a Git "pull" operation, no additional signoffs
will be attached to any of those patches; they will arrive in the mainline
with a single signoff.
One might contend that patches become less subject to review once they
enter the Git stream; they can be pulled from one repository to the next
sight-unseen.
Indeed, early in the BitKeeper era, developers worried that pull requests
would be
used to slip unreviewed patches into the mainline kernel. Single-signoff
patches might be an indication that this is happening. And, indeed,
important patches like the
addition of the O_TMPFILE option went to the mainline with, as
far as your editor can tell, no public posting or review (and no
explanation in the changelog, for that matter). It also seems
plausible that single-signoff patches merged into the sound subsystem or
the Radeon driver (to name a couple of examples) have not been reviewed in
detail by anybody other than the author; there just aren't that many people
with the interest and skills to review that code.
Without a chain of signoff lines, we lose more than a picture of which
maintainers might have reviewed the patches; we also lose track of the path
by which
a patch finds its way into the mainline. A given changeset may pass
through a number of repositories, but those passages leave no mark on the
changeset itself. Sometimes that path can be worked out from the mainline
repository history, but doing so can be harder than one might imagine, even
in the absence of "fast-forward merges" and other actions that obscure that
history. Given that the Signed-off-by line was introduced to document how
patches get into the kernel, the loss of this information may be a reason
for concern.
The kernel community prides itself on its solid foundation of good
procedures, including complete changelogs and a reliable signoff chain.
Most of the time, that pride is entirely justified. But, perhaps, there
might be room for improvement here and there — that is unsurprising when
one considers that no project that merges 70,000 changes in a year can be
expected to do a perfect job with every one of them. Where there is
imperfection, there is room for improvement — though improving the signoff
chain will be difficult as long as the tools do not allow it. But even a
bit more verbiage in commit messages would be appreciated by those of us
who read the patch stream.
Page editor: Jonathan Corbet