KS2011: Patch review
Ted Ts'o talked about the plight of developers who get a lot of
detail-oriented review which they duly address. Only later does a
subsystem maintainer come by and say that the basic premise of the patch is
wrong and that the previous revisions were for nothing. It would be
better, he said, if maintainers could perform high-level review of patches
quickly to help developers avoid useless work. Ingo Molnar disagreed,
saying that reviewer bandwidth is, in general, the limiting factor in
kernel development. So he insists that the code be clean and free of
stylistic problems before he will review it at all.
Another issue raised by Ted is ensuring that problems raised in previous reviews have been addressed in a new revision of a patch set. Tools like Gerrit, he said, can be most helpful in this regard. With Gerrit, it is possible to look at the changes in a patch over time and to track the comments that were made. At his work, use of Gerrit is mandated by policy, and it helps a lot.
Johannes Berg said that there is more review happening than people think, it's just that a lot of it is invisible. He looks at almost every wireless patch that goes by but, if he sees nothing to complain about, he'll usually just move on quietly. Responding to every patch, even if he has no objections, would increase the amount of email flying around and slow things down.
Peter Zijlstra complained that he has a whole mailbox full of patches that he needs to review. Alan Cox corrected him: what Peter really has is a mailbox of patches whose developers need to find reviewers for. Trying to do all the review himself is never going to scale; that is a lesson that has been learned many times in the kernel's history. It is necessary to find a way to spread out the work.
Andrew Morton said that the quality of the reviews and Acked-by responses is not always that great. Sometimes a patch will go out, only to be followed by several Acked-by responses within a few minutes; he knows that all of those people are not reviewing that patch in that time. Some of those responses, he said, are clearly bogus. Peter suggested that "corporate interest" can be behind that kind of coordinated response. But Dave Airlie pointed out that quick acks can also happen if the patch has already been through internal review and the developers involved already know that they are happy with it. James Bottomley said that he'll often ignore reviews that come from the same company as the original patch; he will also ignore a Reviewed-by response that did not contain any actual comments.
Alan said that part of the problem comes from development that has been contracted out to other companies - often located in China or India - where the contract requires that the resulting code be merged upstream. Ted noted that this is actually an improvement - upstreaming code was once seen as unimportant. But, he said, it would be good to get any in-house review done in public rather than behind closed doors; he acknowledged that this idea can be a hard sell in some companies. Alan observed that many of those patches, in their early versions, are not something that most people would want to see.
Rafael concluded that if there is a review problem, it is specific to some subsystems. The maintainers of those subsystems will need to figure things out. But, Ted said, a related problem is large patch sets that touch many subsystems. Those patch sets may never get in at all, or they may see review only for some parts of the set.
Andrew complained about the acceptance of entirely new features into the kernel. Those features often land on his doorstep without much justification, forcing him to ask the developers to explain their motivations. The kernel community, he complained, is not supporting him well. Who can tell him if a given patch makes sense? Mistakes have been made in the past; bad features have been merged and good stuff has been lost. How, he asked, can he find people who know better about the desirability of specific patches?
Grant Likely suggested the possible creation of an architectural review group that could consider patches as a whole. Andrew said that might help, but it would have to meet frequently enough to respond to patches quickly. Nobody took responsibility for organizing this group, so it would not be surprising if the idea goes no farther.
When "no" becomes "yes"
Alan said that, once upon a time, Linus used to simply say "no" to a lot of patches. If the distributors shipped it anyway for long enough, he would usually give in and merge it anyway. It is not an infallible method, but it provided a good indication of whether a patch is worthwhile or not. Chris Mason observed that the distributors have changed; they no longer want to ship out-of-tree code in their kernels, so this method has lost some of its efficacy.
Inevitably somebody pointed out an important example of just this process in action, though. Many of the changes in the Android kernel have been blocked from merging into the mainline, but Android (which certainly qualifies as a sort of distribution) shipped it anyway, and it now has many millions of users. What does that say about how we should treat the Android code?
As one might imagine, the discussion became rather unfocused and fragmented
for a while. It came back together when Linus took the microphone and
stated that, simply, code that actually is used is the code that is
actually worth something.
The Android code is certainly being used; the in-kernel code aimed
at the same problems is just a vague idea that is worthless in comparison.
We should, he said, consider merging suspend blockers as a real option.
Even if it truly is crap, we've had crap in the kernel before. The code
does not get any better out of tree. Alan Cox agreed that it is probably a
good idea to merge that code. The interface is important and has a lot of
users; getting the code merged is the best way to fix the implementation.
Ingo also agreed, saying that when code has millions of users, we have to
say "yes" to it.
Tim Bird said that there is a project underway in the Consumer Electronics Linux Forum to fix up the Android patches and resubmit them for merging. That news came as a surprise to the others in the room; it seems that there is an independent effort in Linaro to do the same thing. Hopefully one result of this discussion is that these people will find each other and join forces.
Ted said that, as kernel developers, we often have an inflated idea of our ability to block things by saying "no." We can't stop things that way; they will happen anyway; Jesse Barnes added that we don't have the power to stop product development. Ben Herrenschmidt (and others) noted that, for a lot of hardware vendors, the requirement now is to provide an Android kernel for their products - not a mainline kernel. There is a Linux fork out there, and it is threatening to take over. We are, Ben said, shooting ourselves in the foot by keeping that code out.
Linus said that it comes down to a marketing issue. Once upon a time, the traditional Linux distributors used to ship a lot of out-of-tree code in their quest to have the best kernel. They have gotten past that phase now and can pat themselves on the back, but they did the same things that Android is doing now. We should, he said, calm down about the whole thing.
There were no hard conclusions from the session. Given the way things went, though, it would be seriously surprising not to see the Android code seriously considered for merging in the 3.3 or 3.4 time frame. Android has shown staying power, and much of the opposition appears to have moderated over time. That is a surprising conclusion from a session on patch review, but a worthwhile one nonetheless.
Next: Development process issues
Index entries for this article | |
---|---|
Kernel | Android |
Kernel | Development model/Code review |
Posted Oct 25, 2011 0:59 UTC (Tue)
by neilbrown (subscriber, #359)
[Link] (2 responses)
This makes no sense to me. As defined in SubmittingPatches, the Signed-off-by: line says nothing about the quality of the patch, only the origin.
So it seems to me that if the original statement is true, then people are interpreting SOB wrongly.
SubmittingPatches gives a "Reviewer's statement of oversight" which defines what Reviewed-by: is meant to mean, and it sounds like it should carry a whole lot more weight...
Do we need to revise SubmittingPatches ??
Posted Oct 26, 2011 3:32 UTC (Wed)
by tytso (subscriber, #9993)
[Link]
(1) when a patch is written by the original author and submitted for consideration. At that point, the author had better be convinced that it is Correct and Sane, since he or she is putting their name and reputation on the line with that patch. In some cases patches are submitted for comments, and in that case sometimes the signed-off-by line is not present (because the author knows some changes might be necessary), or to make the point that the patch really isn't suitable for inclusion, might even have a NOT-Signed-off-by: line to make that point clear.
(2) when the patch is copied from e-mail into a git repository. If the person with the git repository then sends a pull request to Linus, it had better mean that she has put her name and reputation on the line, and is confident that the patch is good. It may not mean that the person has reviewed it; it may be that he is depending on his reviewers or sub-maintainers to have done a good job. But ultimately, it is the maintainer's reputation which is on the line. If the maintainer isn't sure about a sub-maintainer's competence in review and good taste, then she should be doing spot checks or even review every single commit before accepting a pull from the sub-maintainer --- or not stop pulling form the sub-maintainer altogether.
This also implies that in practice, a commit should generally only have two signed-off-by lines. There is an exception, which is if there is a sub-maintainer who isn't using git, so a patch gets forwarded outside of git an additional time. An example of this Andrew Morton's mm tree, which is maintained as a quilt patch series outside of git. But in general, patches are reviewed on e-mail, and once they hit git, they're not going to accumulate any more signed-off-by lines. As those git commits get merged into more senior trees, the maintainer who accepts the pull request is effectively putting their imprimatur on them. So they had better have reviewed the patch, or be willing to put their reputation on the line based on their trust of their sub-maintainers.
Posted Oct 26, 2011 22:32 UTC (Wed)
by BenHutchings (subscriber, #37955)
[Link]
Posted Oct 25, 2011 3:55 UTC (Tue)
by dlang (guest, #313)
[Link] (8 responses)
in many cases there is a driver for a device in a new kernel, but since that's not an 'android' kernel, people spend a lot of time duplicating or backporting drivers.
I'm not saying that the kernel.org kernel needs to accept all the implementation details of the android kernels, but they do need to accept enough that the interfaces can be made available and the result at least run.
in the case of wakelocks, on my tablet I don't have all the different weird things going on that can happen on a phone, so I would even be happy with a noop version of wakelocks that didn't actually do anything beyond allowing the android userspace to run without complaining.
Posted Oct 25, 2011 7:33 UTC (Tue)
by swetland (guest, #63414)
[Link] (7 responses)
To be very clear, we (Google/Android) are not opposed to some level of compromise, incremental shifts, and migration of APIs. We just find the "no you should throw it all away and rewrite your userspace to be different" approach to be something we're not interested in doing.
I suspect there's a middle ground somewhere between "put all of the Android changes in the kernel unmodified right now" and "delete them forever because they suck". Here's hoping we find it.
Posted Oct 25, 2011 8:00 UTC (Tue)
by dlang (guest, #313)
[Link] (6 responses)
and then of those interfaces, which ones need to actually be something other than a stub?
for example, wakelocks could probably be a stub, but an inter-process communications would have to actually work (I seem to remember reading that Android had a kernel assisted IPC protocol)
Posted Oct 25, 2011 8:08 UTC (Tue)
by swetland (guest, #63414)
[Link] (5 responses)
Some of the notable ones off the top of my head:
Where possible we try to use standard interfaces -- for example, the power supply framework for battery charging and status, a lot of assorted interfaces via /sys, etc.
Posted Oct 25, 2011 8:17 UTC (Tue)
by dlang (guest, #313)
[Link] (2 responses)
as for ION, since there is no code available yet, it's hard to talk about it.
Posted Oct 25, 2011 8:23 UTC (Tue)
by swetland (guest, #63414)
[Link] (1 responses)
We're still getting back online after the kernel.org downtime, so I don't have a handy URL for browseable source. Before kernel.org went down the ION kernel side sources were being developed in the open in the kernel repositories at android.git.kernel.org.
Hopefully it'll all be sorted out in the near future alongside the ICS userspace source release.
Posted Oct 25, 2011 18:01 UTC (Tue)
by dlang (guest, #313)
[Link]
I am thinking that the interfaces needed to just make things 'run' on a kernel.org kernel is much smaller than what people have been trying to get submitted. And then once this smaller set is in place, things can be worked on much more easily.
while I have your attention, did you see Neil Brown's proposal for a userspace wakelock-type daemon? how hard would it be to modify the Android userspace to work with his proposal? As I understand it, there's only one daemon that actually talks to the kernel, everything else in the Android userspace talks to that daemon.
Posted Oct 25, 2011 9:31 UTC (Tue)
by xav (guest, #18536)
[Link] (1 responses)
[1] https://lwn.net/Articles/462524/
Posted Oct 26, 2011 15:37 UTC (Wed)
by a0273185@ti.com (guest, #74477)
[Link]
dmabuf just deals with buffer sharing.
I imagine CMA + dmabuf would displace ION.
Posted Nov 1, 2011 18:55 UTC (Tue)
by badcow (guest, #71514)
[Link] (1 responses)
Posted Nov 8, 2011 18:58 UTC (Tue)
by wookey (guest, #5501)
[Link]
KS2011: Patch review
It represents a "Developer's Certificate of Origin" and is meant to address questions of copyright, not questions of value.
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
- wakelocks
- binder (IPC subsystem) -- /dev/binder -- self-contained, in theory one could replace it with an entirely userspace implementation, in practice probably a lot of work
- logger driver -- /dev/logs/* -- multiple ring buffers for logging state without a lot of context switch overhead -- back when running on virtually tagged ARM9s this was very useful
- ION -- /dev/ion -- a unified approach to buffer management and sharing between display, GPU, camera, codecs, etc, new in Ice Cream Sandwich
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review
KS2011: Patch review