
Tightening the merge window rules

By Jonathan Corbet
September 9, 2008
The 2005 kernel summit included a discussion on a recurring topic: how can the community produce kernels with fewer bugs? One of the problems which was identified in that session was that significant changes were often being merged late in the development cycle with the result that there was not enough time for testing and bug fixing. In response, the summit attendees proposed the concept of the "merge window," a two-week period in which all major changes for a given development cycle would be merged into the mainline. Once the merge window closed, only fixes would be welcome.

Three years later, the merge window is a well established mechanism. Over that time, the discipline associated with the merge window has gotten stronger; it is now quite rare that significant changes go into the mainline outside of the merge window. The one notable exception is that new drivers can be accepted later in the cycle, based on the reasoning that a driver, being completely new and self-contained functionality, cannot cause regressions. Even then, there are hazards: the UVC webcam driver, merged quite late in the 2.6.26 cycle (in 2.6.26-rc9), brought a security hole with it.

The merge window rule is often expressed as "only fixes can go in after the -rc1 release." Recent discussions have made it clear, though, that Linus is starting to develop a rather more restrictive view of how development should go outside of the merge window. The imminent 2008 kernel summit may well find itself taking on this topic and making some changes to the rules.

In short, Linus has concluded that "fixes only" is not disciplined enough; a lot of work characterized as a "fix" can, itself, be a source of new regressions. So here's how Linus would like developers to operate now:

Here's a simple rule of thumb:
  • if it's not on the regression list
  • if it's not a reported security hole
  • if it's not on the reported oopses list
then why are people sending it to me?

There can be no doubt that the tighter rules have come as a surprise to a number of developers - if nothing else, the frequency with which Linus has found himself getting grumpy with patch submitters makes that clear.

And, the truth of the matter is that Linus has not enforced anything like the above rule in the past. Beyond new drivers, post-merge-window changes have typically included things like coding style and white space fixups, minor feature enhancements, defconfig updates, documentation updates, annotations for the sparse tool, and so on. Relatively few of these changes come equipped with an entry on the regression list.

To look at this another way, here's a table which appeared in the 2.6.26 development statistics article, updated with 2.6.27 (to date) information:

Release     Changesets merged
            For -rc1    after -rc1
2.6.23        4505        2570
2.6.24        7132        3221
2.6.25        9629        3078
2.6.26        7555        2577
2.6.27*       7733        2451

* (Through September 9).

2.6.27 appears to be following the trend set by previous kernels: on the order of 25% of the total changesets will be merged outside of the nominal merge window. The most recent 2.6.27 regression summary shows a total of 150 regressions during this development cycle, of which 33 were unresolved. That suggests that, even if every listed regression were fixed by a separate post-rc1 patch, at least 2300 of the 2451 patches merged since 2.6.27-rc1 were not fixes for listed regressions.
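
For readers who want to reproduce a rough version of these numbers, the split can be approximated directly from a kernel git tree. This is only an illustrative sketch; the counts it produces will be close to, but not identical with, those in the table above, since the handling of merge commits and the exact cutoff points may differ:

    # Changesets merged during the 2.6.26 merge window (v2.6.25 to v2.6.26-rc1):
    git rev-list --no-merges v2.6.25..v2.6.26-rc1 | wc -l

    # Changesets merged after the merge window closed (v2.6.26-rc1 to v2.6.26):
    git rev-list --no-merges v2.6.26-rc1..v2.6.26 | wc -l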

So the "regression fixes only" policy is truly new - and not really effective yet. Should this policy hold, it could have a number of interesting implications including, perhaps, an increase in the number of non-regression fixes shipped in distributor kernels. It might make developers become more diligent about reporting regressions so that the associated fix can be merged. With fewer changes going in later in the cycle, development cycles might just get a little shorter, perhaps even to the eight weeks that was, once, the nominal target. And, of course, we might just get kernel releases with fewer bugs, which would be a hard thing to complain about. In the short term, though, expect more grumpy emails to developers who are still trying to work by the older rules.



Tightening the merge window rules

Posted Sep 11, 2008 1:10 UTC (Thu) by modernjazz (guest, #4185) [Link]

As a user, I think this is a promising development.

Tightening the merge window rules

Posted Sep 11, 2008 3:49 UTC (Thu) by dilinger (subscriber, #2867) [Link]

As a developer, I also think this is a promising development.

Tightening the merge window rules

Posted Sep 11, 2008 4:23 UTC (Thu) by jwb (guest, #15467) [Link]

Mozilla.org has a good way to handle this. There is a "sheriff" -- more often than not, this is an entire IRC channel rather than an individual -- and the sheriff can close the tree to check-ins if they feel that the work is not moving in the right direction. For instance, the tree might be closed in order to land a particularly hairy patch, or it might be closed except for patches addressing a certain issue, or patches which fix a given bug or set of bugs. And if the tree is "on fire", if it cannot be built or doesn't pass automated regression tests, then everybody who checked in recently is "on the hook" for fixing it.

I know this wouldn't work for Linux, because people would just work off their private trees forever and never send patches to Linus if the tree were really closed. I just wanted to pass along the management process used by a different, equally large project.

Tightening the merge window rules

Posted Sep 11, 2008 22:41 UTC (Thu) by wingo (guest, #26929) [Link]

That's interesting, thanks for the insight.

Tightening the merge window rules

Posted Sep 11, 2008 8:02 UTC (Thu) by dlang (✭ supporter ✭, #313) [Link]

the real cutoff isn't at the end of -rc1; Linus frequently pulls things in from Andrew in -rc2, and sometimes puts patches that he thinks are likely to conflict with other changes in -rc1 so that people get a cheap bisect-like test to see which one causes the problem

it's usually around -rc3 that the changes really stop.

the other thing is that not all regressions get documented as such.

documented regressions are usually cases where a person finds the problem and needs to report it for others to research. if the developer finds the problem themselves, they just send a patch and explain why it's a regression; they don't waste their time submitting a separate regression report.

breakout by -rc number?

Posted Sep 12, 2008 1:35 UTC (Fri) by kirkengaard (subscriber, #15022) [Link]

Good point -- it might be instructive to see a chart like the one in the article for -rc1/post-rc1, broken down by specific -rc number. If -rc1 and -rc2 contain what one might expect, with occasional lapses into -rc3, that might be a more manageable result to deal with.
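
As a rough illustration (not something from the article or the comment above), such a per-rc breakdown could be pulled out of a kernel git tree with a short shell loop; this sketch assumes the usual vX.Y.Z-rcN tag names and a sort that understands version ordering (GNU "sort -V"):

    release=v2.6.26
    prev=${release}-rc1
    for tag in $(git tag -l "${release}-rc*" | sort -V); do
        # start counting from -rc1 onward; skip the -rc1 tag itself
        [ "$tag" = "${release}-rc1" ] && continue
        echo "$prev -> $tag: $(git rev-list --no-merges "$prev..$tag" | wc -l) changesets"
        prev=$tag
    done
    # ...and finally from the last -rc to the release proper
    echo "$prev -> $release: $(git rev-list --no-merges "$prev..$release" | wc -l) changesets"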

Tightening the merge window rules

Posted Sep 11, 2008 13:57 UTC (Thu) by NAR (subscriber, #1313) [Link]

There are a lot of complaints from kernel developers when someone dumps a whole lot of code on the kernel for merging (a new filesystem, driver, etc.). The kernel people usually say that these developers should have worked with them from the moment the first line of the new code was typed.

Of course, if development starts in the open, there is a push to merge it early into the mainline (so that if someone changes yet another internal API, the code will be automatically updated). However, code released early tends to contain lots of bugs that can be easily fixed - hence there's a constant flow of small fixes and new features, and I guess it is hard to stop this flow.

So I think this problem is inherent in the current development process.

Tightening the merge window rules

Posted Sep 12, 2008 10:28 UTC (Fri) by mingo (subscriber, #31122) [Link]

The thing is, every new kernel project should start small and should aim for constant stability.

Dropping a large amount of code on upstream with a large number of open problems means the project has been done wrong from the get-go.

If a project starts small in the upstream kernel, it is not a problem at all to have a constant flow of updates - as long as they are stabilized and are merged in the merge window only. That's how the kernel evolves, gradually.

A project that is in a constant state of breakage makes little sense.

Project flow

Posted Sep 13, 2008 22:53 UTC (Sat) by man_ls (guest, #15091) [Link]

Of course this is good engineering practice, but you will appreciate that it is not how software projects are usually managed. The usual process starts with a blueprint, then goes through to analysis and development and finally testing (at which point it's a huge mess of code which doesn't work at all). It takes months to get things working again.

It has taken decades for a few people to value constant stability, and even so most of the world isn't there yet. So it is not strange that it should take a couple of years to get used to such a process.

Project flow

Posted Sep 14, 2008 13:10 UTC (Sun) by mingo (subscriber, #31122) [Link]

> It has taken decades for a few people to value constant stability, and even so most of the world isn't there yet. So it is not strange that it should take a couple of years to get used to such a process.

Yes, and even for the kernel it has taken almost a decade to reach that state. (Btw., the technological trigger was Git - it enabled the new, distributed, "evolving" workflow.)

So shouting at folks for not getting it right would be rather hypocritical, and in practice upstream is rather flexible about it all.

The comment i replied to claimed that there was a problem with the kernel's development process. I disagree with that, and i think it's natural to expect that if some code wants to reach upstream ASAP it should try to follow and adapt to its development flow.

I.e. new projects should 'become upstream' well before they touch upstream (they should adopt similar principles) - that way there will be a lot less friction after the merge point as well.

Project flow

Posted Sep 15, 2008 14:43 UTC (Mon) by etienne_lorrain@yahoo.fr (guest, #38022) [Link]

> The usual process starts with a blueprint, then goes through to
> analysis and development and finally testing

For my small project, I just coded so that I knew exactly what I wanted to do. Then I rewrote most of the stuff nearly from scratch, keeping just the lower-layer functions and reorganising the whole code.

The problem is that you have something working before the "rewrite", but nobody else would understand it - and you cannot submit a patch before the complete reorganisation, a patch which would be huge, moving stuff around, renaming, factoring...

After the rewrite, people complain that they have not been involved in the design, but you just know they would have complained even more before the reorganisation.

That is just life (of code)...

Etienne.

Tightening the merge window rules

Posted Sep 11, 2008 14:32 UTC (Thu) by iabervon (subscriber, #722) [Link]

It's worth noting that the theoretical policy is not that Linus won't take new stuff after -rc1; it's that he won't take new stuff sent to him after -rc1. In order to make things easier to debug, he doesn't merge everything at once, and makes releases with only some of the changes, but (at least in theory) he's going through his backlog of things he got during the merge window for the major changes.

Tightening the merge window rules

Posted Sep 15, 2008 10:42 UTC (Mon) by dgm (subscriber, #49227) [Link]

Note how the rule ends:

> then why are people sending it to me?

It's a question, not a command. The clever thing is not to find a process that always works (there's no such thing), but one that works more often than not, and to keep an eye out for the need to make an exception.

Tightening the merge window rules

Posted Sep 18, 2008 19:06 UTC (Thu) by anton (guest, #25547) [Link]

> And, of course, we might just get kernel releases with fewer bugs [...]
If only regression fixes are accepted, I would expect fewer regressions, but more other bugs. Still, that might be experienced as fewer bugs by most people.

BTW, in my experience the kernel has no bugs, and all the problems I have come from other components.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds