Removing the PostgreSQL contrib tree
The PostgreSQL contrib tree contains a number of useful tools and other features that are not part of the database core for various reasons. In some ways, the contrib tree can act something like the kernel's staging tree, providing a way to try out and improve features before they get "promoted" into the PostgreSQL core. But the tree is shipped in parallel with the core, adding to the maintenance burden for the core developers, so a reorganization of the contents of contrib has been proposed—with the idea of eliminating the contrib tree itself entirely.
Joshua D. Drake raised the idea of eliminating contrib in a post to the pgsql-hackers mailing list. In his post, he prefaced his proposal by noting the explanation from the documentation of why the contents of the contrib directory are not part of the core: "mainly because they address a limited audience or are too experimental to be part of the main source tree". That, however, "does not preclude their usefulness", it continues. In addition, Drake said, the contrib tree has long been considered to be the place to keep features that are eventually destined for the core.
The discussion of an auditing extension (pg_audit) that was committed to contrib by Stephen Frost in mid-May (which has since been reverted) was the proximate cause of Drake's proposal. The discussion started in the pgsql-committers mailing list but soon moved to pgsql-hackers. Some developers were not pleased with the code and documentation quality of pg_audit, were concerned about potential security holes it introduced, and were unhappy with how it got committed, which they thought circumvented the normal PostgreSQL community development process.
If pg_audit (and, by extension, other similar efforts) were not committed into the contrib tree in parallel with the core, some of these problems would go away, Drake argued. The code and documentation quality questions, along with any circumvention of the normal development process, would be non-issues if these new features were put into some other repository. That would also reduce the amount of code that the core developers need to maintain and test. He suggested going through the list of 45 modules to determine which should move into the core and which should move to either a new project (perhaps called "contrib") elsewhere or, if they are extensions, to the PostgreSQL Extension Network (PGXN). He suggested that the criteria for where a contrib module ends up come down to whether it has a visible community (so narrowly focused features would not make the cut) and whether it has been included in contrib for at least two releases.
Frost agreed that some clarification of the mission of contrib is probably in order, but he noted that there is no location in the existing core tree where extensions could be placed. An "extensions" directory could be created, of course. He also wondered how the new contrib project would differ from PGXN. In addition, Peter Eisentraut concurred with the general idea, but suspected it would be a big undertaking.
But Drake didn't think the problem was all that large and claimed that it should be "obvious" to simply include many of the contrib modules into core. Another option might be to freeze contrib: "What is in there now, is all there will ever be in there and the goal is to slowly reduce it to the point that it doesn't matter."
Frost was not in favor of the freezing approach. But he did go through the list making recommendations on which modules belonged in the core. For his part, Drake largely agreed with Frost's suggestions. But others were not so sure. Fabien Coelho wondered if there was enough benefit to go through the exercise: "Reaching a consensus about what to move here or there will consume valuable time that could be spent on more important tasks... Is it worth it?" Jeff Janes was also concerned about how users would decide which modules and extensions to trust.
But Drake would be willing to see all of contrib move into the core and be installed by default as part of the standard installation. He simply doesn't think that the contrib distinction in the main tree is useful:
Contrib made sense years ago. It does not any longer. Let's put the old horse down and raise a new herd of ponies on a new pasture.
On the other hand, though, Robert Haas thinks that it doesn't make sense to talk about getting rid of contrib when each new PostgreSQL release adds new modules to it. Those features are generally "pretty good stuff". There needs to be a place for that code: "We wouldn't have been better off rejecting it, and we wouldn't have been better off putting it into the main tree." Contrib has already been cleaned up along the way, he said, and simply renaming contrib is not particularly productive either. Even just categorizing the modules may not be as straightforward as it seems:
things we want to include in the core distribution without baking them irrevocably into the server".
Trying again, Drake restated his position, trying to clarify some of the points that Haas and others had disagreed with. But Andres Freund didn't agree with Drake's reasoning, nor did Haas.
One of the main reasons Drake cites for the change is the disagreement about pg_audit, which is, he said, the same argument that has come up frequently over the last fifteen years. But both Freund and Haas see that issue differently. Moving things into core will just move the argument from what goes into contrib to what goes into core, Freund said. Haas was more specific:
As might be guessed, Frost did not agree with that characterization. By then, however, he had already reverted the change.
In another sub-thread, Jim Nasby elaborated on the concern raised by Janes: users (and distribution packagers) tend to trust the contrib tree because it comes with PostgreSQL. But that could all be made more explicit, Nasby said:
Personally, I'd rather we publish a list of formally vetted and approved versions of PGXN modules. There are many benefits to that, and the downside of not having that stuff as part of make check would be overcome by the explicit testing we would need to have for approved modules.
There was general agreement with Nasby's idea; a vetted list of extensions and other modules would be highly useful. In fact, Neil Tiffin said that he would never install anything from PGXN because it is hard to tell "what is actively maintained and tested, and what is an abandoned proof-of-concept or idea". Furthermore, it is not clear what version of PostgreSQL a PGXN module runs on or has been tested on, he said. That is a problem for open-source software in general, David E. Wheeler pointed out. Beyond that, the PGXN Tester does provide some of the information Tiffin is seeking.
Overall, it would seem that few other core developers see the problems the way that Drake does. The changes he suggested may not even address the main problem he sees. But there clearly are some changes that could be made, with Nasby's suggestion chief among them. Whether a list of vetted and "blessed" extensions becomes a reality is unclear; no one seemed to volunteer for that particular mission. On the flip side, though, it is clear that not everyone is on the same page about the purpose of the contrib tree; that is probably worth clarifying one way or another.
