Python ponders release numbering
Release engineering for a large project is always a tricky task. Balancing the needs of new features, removing old cruft, and bug fixing while still producing releases in a timely fashion is difficult. Python is currently struggling with this as it is trying to determine which things go into a 3.0.1 release versus those that belong in 3.1.0. The discussion gives a glimpse into the thinking that must go on as projects decide how, what, and when to release.
It is very common to find bugs shortly after a release that would seem to necessitate a bug fix release. Ofttimes these are bugs that would have been considered show-stopping had they been found before the release. But what about features that were supposed to be dropped, after having been deprecated for several releases, but were mistakenly left in? That is one of the current dilemmas facing Python.
One of the changes made in Python 3.0 was a change to comparisons and, in particular, removing the cmp() function. That function takes two arguments, returning -1, 0, or 1 based on whether the first argument was less than, equal to, or greater than the second. Python 3.0 set out to clean up some of the "warts" of the language; cmp() could be handled in other, more efficient ways. The only problem is: cmp() didn't really get removed from the Python 3.0 release in December.
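As a rough illustration of those "other ways" (a minimal sketch, not code from the discussion): the three-way result that cmp() returned can be rebuilt from the ordinary comparison operators, and sorting code that once passed a comparison function can usually pass a key function instead. The name `three_way_compare` and the sample data below are hypothetical, chosen only for this example.

```python
def three_way_compare(a, b):
    """Hypothetical stand-in for the removed built-in cmp().

    Returns -1, 0, or 1 when a is less than, equal to, or greater
    than b. Booleans subtract as integers, so the expression below
    yields exactly those three values.
    """
    return (a > b) - (a < b)


# Sorting with a key function instead of a comparison function.
words = ["pear", "Apple", "fig"]
by_length = sorted(words, key=len)               # shortest first
case_insensitive = sorted(words, key=str.lower)  # ignore letter case

print(three_way_compare(1, 2), three_way_compare(2, 2), three_way_compare(3, 2))
print(by_length)
print(case_insensitive)
```

A key function is called once per element, whereas a comparison function is called once per pair of elements compared, which is one reason the key-based style tends to be cheaper.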
It was recognized quite quickly (the bug report shows it being reported three days after the release), but it wasn't exactly clear what to do about it. There may now exist "valid" Python 3.0 programs that use cmp() and function correctly. This led Guido van Rossum to say: "Bah. That means we'll have to start deprecating cmp() in 3.1, and won't be able to remove it until 3.2 or 3.3. :-)" He seems to have only been half-serious, as the smiley might indicate, eventually concluding: "OK, remove it in 3.0.1, provided that's released this year." Unfortunately, the "this year" he was referring to is 2008.
Because Python 3 was such a major shift in the language, the 2to3 tool was created to help convert old code to work with the new interpreter. But 2to3 did not change calls to cmp(), so converted code that still calls cmp() will run in Python 3.0. That makes for a bit of a tangle as van Rossum explains:
As of this writing, Python 3.0.1 is intended for release on February 13 with the removal of cmp(). There seem to be a number of reasons that the release slipped into 2009, not least the holiday season, which tends to eat up a fair chunk of December. But removing cmp() was also more complicated than it first appeared. Several standard library modules and tests were still using it, as were Python internals that still referred to it. Inevitably, as those things were getting worked out, other problems cropped up.
There are some fairly serious performance problems with the new I/O library, with some users experiencing read performance three orders of magnitude slower on Python 3.0. There are also problems with chunked HTTP responses when using urllib. Both of these require fairly extensive fixes, though, which in turn require lots of testing. It all adds up to a lot of work, so folks have started to wonder whether much or all of it should be pushed into the 3.1 release, which is targeted at an April or May time frame.
There are others who argue that the 3.0 series should be abandoned entirely in the near term. Rather than have a 3.0.1 with substantial changes from 3.0—including the incompatible removal of cmp()—3.1 should be released quickly so that it is the release targeted by developers. As Raymond Hettinger put it:
There are some fairly important new features (notably moving the new I/O to C for performance reasons) that will not make it for a release in February, though. Since a 3.2 release would be quite a ways off, those features would languish for too long. 3.1 release manager Benjamin Peterson would rather see an immediate 3.0.1 release:
There are also concerns that an immediate release called 3.1 might lead to confusion and unhappiness for users. Martin Löwis voiced those fears to general agreement:
Part of the problem is the "no new features" rule for bug fix releases (those that are typically numbered by bumping the third digit of the version number). Python established that rule in the 2.x series, to try to protect the "most conservative users" as van Rossum points out. Those users have not moved to Python 3 yet, so van Rossum argues that the rule can be suspended:
This argument seemed to help crystallize a consensus of sorts. There were some other discussions of exactly which "features" should make an appearance in 3.0.1, but the push for numbering the bug fix release as 3.1 seemed to fade. The 3.0.1 release is currently scheduled for February 13th, while other new features—undoubtedly along with additional fixes—will come with the 3.1 release in April or May.
Part of what was considered in the deliberations was the impact on users and what they will expect from how the releases are numbered. It is a difficult problem, as KDE found out a year ago. Users have certain expectations based on release numbering, which are largely outside of a project's control. But, some kinds of changes, especially those that are not backward compatible, necessitate a "large enough" numeric change to indicate that.
It is a fine line, which is why Python has struggled with it. One hopes that any development for Python 3—a large, incompatible language overhaul itself—avoided using cmp(), and will then be unaffected. If not, the relatively small window in time should keep the number of affected programs to a minimum.
