PostgreSQL's commitfest clog

By Jonathan Corbet
August 12, 2021

While it may seem like the number of developers would be the limiting factor in a free-software project, the truth of the matter is that, for all but the smallest of project, the scarcest resource is reviewer time. Lots of people like to crank out code; rather fewer can find the time to take a close look at somebody else's patches. Free-software projects have taken a number of different approaches to address the review problem; the PostgreSQL developer community is currently struggling with its review load and considering changes to its commitfest process in response.

Part of the review problem is clerical in nature: patches must be tracked along with their review status. Some projects, like the Linux kernel, take a distributed approach; review status is tracked in the patches themselves and subsystem maintainers are expected to keep up with which patches are ready to be merged. PostgreSQL developers, naturally, prefer to keep that information in a central database. Roughly every other month, outstanding patches are gathered for a month-long commitfest, during which the project makes a decision on the fate of each one of them. Each commitfest has a designated manager who is responsible for ensuring that all patches have been dealt with by the end of the commitfest.

That is the intended result, anyway. What actually happens, as Simon Riggs recently pointed out on the PostgreSQL Hackers mailing list, is that a lot of patches languish in the queue with no firm decision being made; this can happen as the result of a lack of reviews or a failure of the author to respond, among other reasons. Riggs noted that the 2021-09 commitfest, which is scheduled for September, has 273 patches queued (since increased to 279): "Of those, about 50 items have been waiting more than one year, and about 25 entries waiting for more than two years". The community has been working hard to clear the queue during each commitfest, Riggs said, but still "it's overflowing".

A look at past commitfests (which can be viewed at commitfest.postgresql.org) shows that a great many patches are "dealt with" by deferring them to the next commitfest. The recently concluded 2021-07 commitfest considered 342 patches; of those, 233 (just over 2/3) were deferred to the next commitfest. When the punting rate is that high, actually clearing out the commitfest queue becomes a distant prospect at best.

There is a longstanding expectation within the PostgreSQL project that anybody submitting a patch for consideration in a given commitfest should take the time to review somebody else's patch, preferably one of similar complexity. In theory, that would balance the numbers of submitters and reviewers; in practice it does not seem to be getting the job done. Part of the problem, almost certainly, is that some submitters just never quite get around to fulfilling that side of the bargain; life is busy after all. One of the commitfest manager's jobs is to encourage developers to do reviews; that is, needless to say, a task that is even less fun than patch review. In the discussion, Noah Misch suggested an obvious technical solution: track each submitter's review balance in the database to make it clear who is not living up to expectations. But, as Tomas Vondra pointed out, there are a lot of subjective questions about what constitutes a review, equivalent complexity, and more.

Even if the rule were fully observed, though, it seems unlikely that the problem could go away. Many patches need input from multiple reviewers; they may also go through the review process many times with changes in response to feedback in between. Thus, the number of needed reviews is sure to exceed the number of submitted patches by a significant margin.

Bruce Momjian suggested that part of the problem, in the last year at least, is the complete lack of in-person developer meetings. Greg Stark agreed: "Every year there are some especially contentious patches that don't get dealt with until the in-person meeting which pushes people to make a call". He also noted, though, that the number of patches in this category isn't sufficient to explain the size of the backlog. Michael Banck suggested holding virtual meetings to make decisions on patches; there are few developers out there who are clamoring for more online meetings, but enough might be convinced to attend one to make some progress possible.

One thing that almost everybody seemed to agree on is that many of the patches that slide from one commitfest to another simply should not be there. According to Tom Lane:

As a community, we don't really have the strength of will to flat-out reject patches. I think the dynamic is that individual committers look at something, think "I don't like that, I'll go work on some better-designed patch", and it just keeps slipping to the next CF.

Robert Haas expressed a similar point of view:

I think [commitfest managers] have been pretty timid about bouncing stuff that isn't really showing signs of progress. If a patch has been reviewed at some point in the past, and say a month has gone by, and now we're beginning or ending a CommitFest, the patch should just be closed. Otherwise the CommitFest just fills up with clutter.

Improving the project's ability to close dead-end patches might not be easy. Arguably, it is up to the commitfest manager to make that decision; Riggs suggested that the manager only be allowed to defer consideration on 50% of the submitted patches. But Lane (in the message quoted above) noted that only managers who are "assertive enough and senior enough" have been able to "kill off patches that didn't look like they were going to go anywhere" and said that perhaps this should not be the commitfest manager's job in any case. Certainly disappointing dozens of submitters over the course of a month — and, undoubtedly, hearing their thoughts on the matter — is not going to make the job of commitfest manager more appealing.

One other idea that came up a few times was to place a limit on the number of commitfests that a patch could be allowed to slide through before being removed. This approach has the advantage of being relatively automatic and objective; nobody would have to step up as the evil maintainer who decided to reject a whole pile of patches. It would also bring a natural end to patches that nobody can find a way to care about.

The conversation wound down without reaching any solid conclusions. Perhaps this issue, too, will be deferred to the next commitfest for a decision. But the discussion may at least motivate some developers to put more time into cleaning out the queue and reducing the backlog — in the short term, at least. In the longer term, the PostgreSQL will have to continue with a shortage of reviewers, just like many other projects.

PostgreSQL's commitfest clog

Posted Aug 12, 2021 15:44 UTC (Thu) by kleptog (subscriber, #1183) [Link] (2 responses)

"Preferably of similar complexity" is difficult in another way. At $DAYJOB I found an issue with Postgres and spent more than a week digging through code I'd never seen before, debugging it and eventually coming up with a solution, after some emails to the mailing list for clues. What is then "similar complexity"? Another patch that would take another week to dig into the code to understand what it does before you feel qualified to have an opinion?

For people who work on Postgres for a lot of time the cost is lower because they already know the code, and they already know many of the other core developers so know which bit they have to pay attention to.

At $DAYJOB code needs to be reviewed but there's a different dynamic in play. If you don't immediately understand something you simply talk to the person who wrote it. By asking questions you get an idea of (a) why was this solution chosen (b) what are the risks (c) what's the skill level so I know which kind of bugs I need to watch for. Also get a feeling of the urgency. A patch that has been in the queue for a year might be useless, or the writer has had it running in production that whole time and is confident it works. You can do this via email but, well, for complex patches that doesn't cut it.

Not sure if there's any easy solution, but I can imagine that for complex/controversial patches that a quick chat between the writer and a reviewer can help make a lot more progress than a commitfest manager sending emails. And I have no doubt that in-person conferences in the past helped immensely here.

PostgreSQL's commitfest clog

Posted Aug 13, 2021 13:06 UTC (Fri) by taladar (subscriber, #68407) [Link] (1 responses)

Agreed. The idea that someone who contributes a single patch for a problem they themselves encountered in production and who has otherwise never looked at the code base should review a patch of similar complexity is ridiculous.

Reviews always require context and someone who never looked at the code base simply does not have that context.

PostgreSQL's commitfest clog

Posted Aug 13, 2021 13:14 UTC (Fri) by andresfreund (subscriber, #69562) [Link]

I don't think small bug fixes or very simple one-off improvements are a significant part of the problem. They're easily enough committed and reviewed. What tends to clog up things are patches there's no agreements for and large patchsets that aren't quite there, but good enough to not just want to throw them out.

PostgreSQL's commitfest clog

Posted Aug 12, 2021 18:40 UTC (Thu) by iabervon (subscriber, #722) [Link] (6 responses)

Maybe the need is for reviewers to make a list of the patches that they tried to look at and didn't get anywhere on reviewing? That would distinguish patches that nobody noticed from ones that people saw, but which have problems that they didn't feel able to express tactfully. The commitfest manager could then try to come up with suggestions on what would make reviewers more likely to get through reviewing the patch next time, which is an easier context in which to be tactful, I think.

In any case, having a patch that came from a previous commitfest only go into a third one if it gets updated somehow would make sense. If it was already there and nobody reviewed it, it's unlikely that anyone will review it in the future unless something gets done to change the situation.

PostgreSQL's commitfest clog

Posted Aug 13, 2021 9:18 UTC (Fri) by k3ninho (subscriber, #50375) [Link] (2 responses)

Contemporary team practices talk about 'definition of done' where the community agree on the checklist of things to mark a task complete so it can be made generally available to users.

I like the 'attempted review' idea, in any case this needs extra effort in release management. I'd aim for making visual the amount of stuff at the review stage -- so that a contemporary approach like Kanban can stop people working on new items until the tasks In Review are down to some acceptable number (Toyota car manufacturing, where this came from, set this to zero for producing no more parts until stock is consumed).

You could visualise the 'attempted review' stage, too, for stuff that's failing to garner adequate technical expertise to move to a state where they're 'signed off'. I have an addiction to sticky notes on whiteboards, but online visual dashboards also exist.

K3n.

PostgreSQL's commitfest clog

Posted Aug 13, 2021 14:05 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (1 responses)

> (Toyota car manufacturing, where this came from, set this to zero for producing no more parts until stock is consumed)

My understanding is that Kanban started at Toyota, but was misunderstood and simplified by other manufacturers into being just-in-time manufacturing (where stocks are kept lean as an end rather than as a side-effect of the underlying Kanban principles). Toyota themselves understood that some buffers are necessary and is why they fared better than other car manufacturers during shortages. This video[1] explains some of the backstory here.

[1] https://www.youtube.com/watch?v=b1JlYZQG3lI

PostgreSQL's commitfest clog

Posted Aug 19, 2021 9:12 UTC (Thu) by k3ninho (subscriber, #50375) [Link]

>stocks are kept lean as an end rather than as a side-effect of the underlying Kanban principles

I don't disagree -- I think that buffers are essential to flow (the real goal) but they need a whole-system view to find the not-too-little, not-too-much Goldilocks zone. An example discussed here is Bufferbloat with some WiFi devices having too large a set of buffers which caused problems with the dumb flow control of TCP sessions.

K3n.

PostgreSQL's commitfest clog

Posted Aug 13, 2021 14:28 UTC (Fri) by intelfx (subscriber, #130118) [Link] (2 responses)

> If it was already there and nobody reviewed it, it's unlikely that anyone will review it in the future unless something gets done to change the situation.

Auto-dropping patches if they don't get attention sounds contrary to the idea of commitfests. What do I do as an author of the patch if it just gets ignored through no fault of my own? What *can* I do to "change the situation"?

PostgreSQL's commitfest clog

Posted Aug 14, 2021 1:08 UTC (Sat) by iabervon (subscriber, #722) [Link]

Talk about its benefits on the mailing list? Add explanation as to how it's organized? Ask if there's a better way to design it? Ideally, the commitfest manager would have some suggestions particular to the patch about what might be putting reviewers off.

Often a patch will get the code exactly right, but the commit message doesn't explain enough about why it's right and give reviewers enough information to check that the author understands the implications entirely. It's almost always possible to add more to the presentation of the patch such that the code ends up the same but the review is easier to do or, especially, easier to get started on. A patch that touches several files will generally have the diffs ordered alphabetically and that's likely not the order in which a logical argument for making the change would go. Beyond the point where the patch is correct and applying it would be the right thing to do, there's a variety of ways that a patch author can make it more obvious that that is the case, and it may be necessary to make the patch more approachable in non-technical ways in order to motivate reviewers.

PostgreSQL's commitfest clog

Posted Aug 15, 2021 18:03 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

I wonder what would happen if they adopted the opposite policy: If nobody reviews it, that means nobody has any objections to it, so let's just merge it with no review.

This would, at the very least, provide a stronger motivation for more senior developers to review old patches that are close to the limit (i.e. "you had better review it now, or else it might just get merged by default"). However, I don't imagine it would be appropriate to make such a policy retroactive, and there would probably need to be some kind of limiting principle to prevent abuse (e.g. only if you've sent at least X accepted patches in the past, only if you have fewer than X other patches outstanding, etc.).

PostgreSQL's commitfest clog

Posted Aug 15, 2021 16:04 UTC (Sun) by intgr (subscriber, #39733) [Link]

If a community wants more reviewers, the community should work on making the process easier and more familiar for new people.

I think part of the problem is the antiquated requirement of doing reviews by email. I used to submit and review some simpler Postgres patches way back.

After using the review process in GitHub and GitLab in various settings for years, going back to an email client feels so inconvenient that I'm demotivated from even trying.

PostgreSQL's commitfest clog

Posted Aug 19, 2021 8:25 UTC (Thu) by alsuren (subscriber, #62141) [Link] (1 responses)

> Michael Banck suggested holding virtual meetings to make decisions on patches;

We have a technique at my current workplace for speeding up the review of complex patches: "mob reviews". The author of the patch sets up a video call and posts a link to it in the team channel. When one or more reviewer joins, the author goes through the patch in the order that they think makes sense, and writes down any feedback from the reviewers on the pull request. They then immediately go away and fix the feedback, and the second round of reviews is usually just a quick approval.

This pushes some of the mental load onto the patch author.

PostgreSQL's commitfest clog

Posted Aug 21, 2021 12:12 UTC (Sat) by halla (subscriber, #14185) [Link]

That sounds like a good idea... Maybe we should give this a try for Krita, too.