
Python and "dead" batteries

By Jake Edge
June 12, 2019

Python is, famously, a "batteries included" language; it comes with a rich standard library right out of the box, which makes for a highly useful starting point for everyone. But that does have some downsides as well. The standard library modules are largely maintained by the CPython core developers, which adds to their duties; the modules themselves are subject to the CPython release schedule, which may be suboptimal. For those reasons and others, there have been thoughts about retiring some of the older modules; it is a topic that has come up several times over the last year or so.

It probably had been discussed even earlier, but a session at the 2018 Python Language Summit (PLS) is the starting point this time around. At that time, Christian Heimes listed a few modules that he thought should be considered for removal; he said he was working on a PEP to that end. PEP 594 ("Removing dead batteries from the standard library") surfaced in May with a much longer list of potentially dead batteries. There was also a session at this year's PLS, where Amber Brown advocated moving toward a much smaller standard library, arguing that including modules in the standard library stifles their growth. Some at PLS seemed to be receptive to Brown's ideas, at least to some extent, though Guido van Rossum was apparently not pleased with her presentation and "stormed from the room".

PEP 594

After PLS, Heimes posted the first draft of PEP 594 to the python-dev mailing list. It is a much more ambitious list than the one he made back in 2018; there are 30 modules listed, but four of those are modules that were once proposed for retirement and that the PEP now recommends keeping. The modules are scattered throughout the standard library, but: "the majority of modules are for old data formats or old APIs. Some others are rarely useful and have better replacements on PyPI [the Python Package Index]."

The PEP lists alternatives from PyPI for most of the modules, along with a justification for their removal (or, in those few cases, their retention). In addition, the proposed deprecation schedule is described; the modules agreed upon will be documented as "deprecated" in the upcoming 3.8 release and may issue a PendingDeprecationWarning. In 3.9, the modules will start issuing DeprecationWarning warnings, and in 3.10 they will be removed along with their tests and documentation. Given the Python support window, the modules will still be supported by the core team until the end of life for Python 3.9, which is estimated to occur in 2026.
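
To make that schedule a bit more concrete, here is a minimal sketch of how a module might emit the two warning categories and how a test could observe them. The helper names (load_legacy_module, collect_warnings) and the message text are hypothetical and are not taken from the PEP or from CPython; only the warnings module calls are real.

```python
import warnings


def load_legacy_module(name, stage="pending"):
    """Hypothetical stand-in for the warning a deprecated module could emit on import."""
    # "pending" models the 3.8 stage, anything else models the 3.9 stage.
    category = PendingDeprecationWarning if stage == "pending" else DeprecationWarning
    warnings.warn(f"the {name} module is deprecated", category, stacklevel=2)


def collect_warnings(stage):
    """Return the names of the warning categories emitted for the given stage."""
    with warnings.catch_warnings(record=True) as caught:
        # Both categories are hidden by most default filters, so show everything.
        warnings.simplefilter("always")
        load_legacy_module("nntplib", stage)
    return [w.category.__name__ for w in caught]
```

Since both categories are largely filtered out by default, affected users would most likely only notice them when running with something like "python -W default" or "python -W error".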

The modules listed to be removed in the first draft were:

Type            Modules
Data encoding   binhex, uu, and xdrlib
Multimedia      aifc, audioop, colorsys, chunk, imghdr, ossaudiodev, sndhdr, and sunau
Networking      asynchat, asyncore, cgi, cgitb, smtpd, and nntplib
OS interface    crypt, macpath, nis, and spwd
Miscellaneous   fileinput, formatter, imp, msilib, and pipes

As might be guessed, the PEP posting set off a bit of a storm, with suggestions for other modules to consider for removal as well as concerns about some of the targeted modules. In particular, Andrew Svetlov suggested that socketserver might be a good candidate for removal; Heimes said he had decided against it because it is used by http.server and others. But that led Glenn Linderman to suggest removing http.server as well. He pointed out that it lacked a lot of functionality (e.g. HTTPS support) and that the PEP suggests removing the cgi module, further reducing its utility.

Van Rossum and others saw the value in http.server, though. It is used by other tools and modules, for one thing, but it is also useful as a quick and dirty local HTTP server—remembering how to configure and run a full web framework is too heavyweight for that kind of task. Linderman thought that the Bottle web framework might make a reasonable alternative for a simple, local server, but it turns out that Bottle uses http.server as well.
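
For readers who have not used it that way, the "quick and dirty" case looks roughly like the sketch below. The function names (start_local_server, fetch_status) are made up for illustration; the sketch assumes Python 3.7 or later, where SimpleHTTPRequestHandler accepts a directory argument, and it binds only to the loopback address.

```python
import functools
import http.client
import http.server
import threading


def start_local_server(directory="."):
    """Serve `directory` on 127.0.0.1 using a free port chosen by the OS."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Port 0 asks the operating system to pick any unused port.
    server = http.server.HTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_status(server, path="/"):
    """Request `path` from the running server and return the HTTP status code."""
    port = server.server_address[1]
    connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()
```

The same thing is available without writing any code by running "python -m http.server" in the directory to be served, which is presumably the use that people have in mind when they call the module handy.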

Several argued against removing specific modules. One of the more controversial choices that Heimes made was nntplib, which provides client-side code for accessing Network News Transfer Protocol (NNTP) services. Antoine Pitrou raised an objection to removing nntplib among others he thought were dubious choices (cgitb for generating tracebacks for web pages and crypt for interfacing to the crypt() one-way hash function). "NNTP is still quite used (often through GMane, but probably not only) so I'd question the removal of nntplib."

André Malo agreed with Pitrou about nntplib; he wondered how much of a maintenance burden it would be given how old the protocol is. But Victor Stinner pointed out that the "maintenance burden is real even if it's not visible"; for example, there are a number of sporadic test failures from nntplib in the continuous integration (CI) system and "nobody managed to come with a fix.. in 6 years". Beyond that, the administrator of the server used in some of those tests has asked about support for the NNTP compression extension, so there are still features that nntplib may need.

Giampaolo Rodolà thought that if nntplib was on the chopping block, telnetlib should probably join it, though he was not actually advocating that:

> Overall, I think the bar for a module removal should be set very high, especially for “standard” things such as these network protocols, that despite being old are not likely to change. That means that also the maintenance burden for python-dev will be low or close to none after all.

Heimes replied that he had missed telnetlib (it has since been added), but that nntplib does have a high maintenance burden because it has no maintainer, outstanding bugs, and missing features. Rodolà also argued against removing crypt and spwd (which provides access to the shadow password file). He noted that the reasons behind removing those two were security related, which makes handling their removal different from that of the others on the list; since it may be useful to be able to work with passwords on Unix systems, having something available to do so out of the box would be good. But Heimes said that those two modules have some serious security problems that make them "very dangerous batteries".

The nature of the PEP as an omnibus including many different modules means that objections should be handled differently, Van Rossum said. The PEP will eventually either need to be accepted or rejected as a whole, which means that any particular modules eliciting complaints should probably simply be dropped off the list:

> In order to get a consensus to pass the PEP, it may be necessary to compromise. IOW I would recommend removing modules from the PEP that bring up strong opposition, *even* if you yourself feel strongly that those modules should be removed.
>
> The vast majority of modules on the list hasn't elicited any kind of feedback at all -- those are clearly safe to remove (many people are probably, like myself, hard-pressed to remember what they do). I'm not saying drop anything from the list that elicits any pushback, but once the debate has gone back and forth twice, it may be a hint that a module still has fans.

Not dead

Several commenters objected to the name of the PEP, arguing that "dead" was not an accurate description of the state of many of the modules. Stinner said: "A module is never 'dead', there are always users, even if there are less than 5 of them." He was generally in favor of the overall plan, even though he had multiple concerns and questions. Steven D'Aprano was also unhappy with the name; the batteries are working, he said, they are just unloved. But he was worried about users who will not find it easy to add the batteries back into their Python after they get removed.

> Many Python users don't have the privilege of being able to install arbitrary, unvetted packages from PyPI. They get to use only packages from approved vendors, including the stdlib, what they write themselves, and nothing else. Please don't dismiss this part of the Python community just because they don't typically hang around in the same forums we do.

The current thinking seems to be that many of the modules that get removed will move over to PyPI in some form; it is possible they could even use their existing names because PyPI disallows module-name collisions with the standard library, so those names should still be unclaimed. Or, at least, the "useful" modules will move. How that will work, exactly, and how to make it easy for users affected by the module removal to fix it, are still up in the air, though it has been discussed in a thread on the Python Discourse forum. The only firm position that the PEP takes is that the core developers would stop maintaining the modules once they are removed (and the relevant Python versions reach their end of life). While moving modules to PyPI and providing some path for users to start picking them up from there is attractive, it has some possible downsides too; there are concerns that doing so could lead to another incident like the one that hit npm's event-stream package.

Meanwhile, Barry Warsaw was a bit worried that the PEP didn't go far enough toward solving the longstanding tension between various goals for the standard library:

> We have two competing pressures, one to provide a rich standard library with lots of useful features that come right out of the box. Let's not underestimate the value that this has for our users, and the contribution such a stdlib has made to making Python as popular as it is.
>
> But it's also true that lots of the stdlib don't get the love they need to stay relevant, and a curated path to keeping only the most useful and modern libraries. I wonder how much the long development cycle and relatively big overhead for contributing to stdlib maintenance causes a big part of our headaches with the stdlib. Current stdlib development processes also incur burden for alternative implementations.
>
> We've had many ideas over the years, such as stripping the CPython repo stdlib to its bare minimum and providing some way of *distributing* a sumo tarball. But none have made it far enough to be adopted. I don't have any new bright ideas for how to make this work, but I think finding a holistic approach to these competing pressures is in the best long term interest of Python.

Heimes posted a second draft of the PEP on python-dev, then moved the discussion to a Discourse thread, perhaps in the interest of involving those who are not inclined toward the mailing list. For the most part, the responses were similar to the earlier ones, mostly pleas to keep certain modules, though it is clear that some are not really aware of the maintenance burden that is borne by the core developers in keeping things going for the standard library. It is also clear from the whole discussion that there are multiple places where the core developers simply have not been able to keep up with the maintenance duties.

While moving some set of modules out of the standard library will certainly help alleviate the maintenance headaches for those modules, it will not magically grow new maintainers for them. It is possible that it might cause some interested parties to step up to fix, maintain, and, even, develop new features for some of the modules, especially if the tie to the CPython release schedule and development process has been holding some back from those chores.

In the end, it is a tricky balancing act to provide enough "batteries included" that Python is useful out of the box without overburdening the core maintainers—or constraining potentially better solutions. One of the complaints about the standard library is that it locks users into a particular approach that may be suboptimal. For example, by most accounts the Requests package provides a much more rational approach to using HTTP from Python, but users often opt for the standard library "equivalents" because they come with Python. Moving these standard libraries to PyPI might allow other modules to rise to the occasion.

On the flip side, of course, environments that are limited in their choices of third-party packages (e.g. due to policy or internet connectivity) will have a starting point with fewer features. For the most part, the absence of the modules under consideration here is unlikely to be a real barrier for the users who need them. Some programs will suddenly stop working but, with luck, a path toward minimizing even that will be found.
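
One way a project could find out ahead of time whether it would be affected is to check which of the candidate modules the running interpreter can still locate. The sketch below is only an illustration: the list is copied from the first-draft table above, the function name missing_modules is invented here, and some entries (ossaudiodev or msilib, for example) are platform-specific and may be reported as missing for that reason alone.

```python
import importlib.util

# Module names from the first-draft removal table in this article.
FIRST_DRAFT_CANDIDATES = [
    "binhex", "uu", "xdrlib",
    "aifc", "audioop", "colorsys", "chunk", "imghdr",
    "ossaudiodev", "sndhdr", "sunau",
    "asynchat", "asyncore", "cgi", "cgitb", "smtpd", "nntplib",
    "crypt", "macpath", "nis", "spwd",
    "fileinput", "formatter", "imp", "msilib", "pipes",
]


def missing_modules(names):
    """Return the names that the running interpreter cannot find, without importing them."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(FIRST_DRAFT_CANDIDATES):
        print(f"{name}: not available in this Python")
```

Because find_spec() only locates a top-level module rather than importing it, the check does not itself trigger any deprecation warnings.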


Index entries for this article
Python: Deprecation
Python: Python Enhancement Proposals (PEP)/PEP 594
Python: Standard library



Python and "dead" batteries

Posted Jun 12, 2019 16:47 UTC (Wed) by ikm (guest, #493) [Link] (10 responses)

Yet another reason to stick to 2.7.

Python and "dead" batteries

Posted Jun 12, 2019 17:13 UTC (Wed) by mirabilos (subscriber, #84359) [Link] (1 responses)

Hehe… nice one.

“chunk” is a PITA… I tried to make https://evolvis.org/plugins/scmgit/cgi-bin/gitweb.cgi?p=u... extend it first, but it’s only for reading, *and* it’s buggy, so I gave up and implemented the whole shit myself, using “chunk” only as a rough guideline.

“uu” is a mandatory piece of Unix history (and still in use) though… so it must not go. I cannot imagine this to be a maintenance burden, either… either it works or it’s trivial to fix and never worked. It’s not like uuencode format changes now.

Python and "dead" batteries

Posted Jun 13, 2019 7:14 UTC (Thu) by fyrchik (guest, #124371) [Link]

It is still possible to work with uuencode format using `binascii` module.
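
A rough sketch of what that could look like, assuming only the line-level helpers in binascii (b2a_uu() accepts at most 45 bytes per call). The function names are invented for illustration, and the "begin"/"end" framing that the uu module adds around the body is deliberately left out.

```python
import binascii

UU_CHUNK = 45  # maximum number of input bytes binascii.b2a_uu() accepts per line


def uu_encode_body(data: bytes) -> list:
    """Encode `data` as a list of uuencoded lines (each ends with a newline)."""
    return [
        binascii.b2a_uu(data[offset:offset + UU_CHUNK])
        for offset in range(0, len(data), UU_CHUNK)
    ]


def uu_decode_body(lines) -> bytes:
    """Decode lines produced by uu_encode_body() back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)
```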

Python and "dead" batteries

Posted Jun 12, 2019 17:18 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (2 responses)

After Python 3.x drops support for these modules, there is nothing stopping you from pulling their source code out of the Git history.

I mean, it obviously wouldn't have upstream support, but at the end of this year, neither will 2.7.

Python and "dead" batteries

Posted Jun 12, 2019 18:18 UTC (Wed) by pgdx (guest, #119243) [Link]

You will also have the option of installing legacylib [1].

[1] https://github.com/tiran/legacylib

Python and "dead" batteries

Posted Jan 25, 2021 10:32 UTC (Mon) by ceplm (subscriber, #41334) [Link]

There is hope that these modules would shift to independent PyPI packages, so you can get it from there.

Python and "dead" batteries

Posted Jun 12, 2019 20:18 UTC (Wed) by MortenSickel (subscriber, #3238) [Link]

Maybe for you. Which of the listed modules do you need? Are you standing up to maintain them?

Python and "dead" batteries

Posted Jun 13, 2019 17:12 UTC (Thu) by logang (subscriber, #127618) [Link] (3 responses)

I know this comment is a bit of a troll but I think it gets at the heart of an important issue. People choose to use standard libraries largely because they have a high chance of being maintained and available for the foreseeable future. People avoid PyPI modules because the maintenance and quality story is often questionable.

Python caused a lot of headaches for their users with the 3.0 transition and I don't think they can afford pushing more pain their users' way. If they are going to remove modules, they should use a multiple-cycle process that starts with deprecation in the documentation, a couple of cycles later starts printing warnings, and a couple of cycles after that finally removes them. The only code they should remove quickly is code that they can prove no one is using because it's obviously broken -- and I don't think many of the candidates that were proposed fit that bill. The argument that a module could be made available through PyPI so it can be quickly removed from the standard library is a bad one.

Some of the candidate modules don't seem like a large maintenance burden anyway compared to the pain that would be caused for potential users. For example: colorsys, fileinput, imghdr, pipes, and probably others seem like relatively small helper functions that shouldn't take a lot of effort to maintain but would cause anyone who chose to use them big pains if they were to suddenly disappear. Also, based on its documentation, imghdr shows improvements have been made in recent versions so, today, it seems well maintained and appropriate to use.

Python and "dead" batteries

Posted Jun 13, 2019 18:50 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (2 responses)

> If they are going to remove modules they should use a multiple-cycle process that starts with deprecation in the documentation a couple cycles later start printing warnings and a couple cycles after that finally removing.

They are proposing a multi-cycle process. They're suggesting only one cycle each for "pending deprecation" and "deprecation" rather than several, but according to the article that would still leave them available until 2026, so they're not disappearing overnight by any means. And presumably they could be rescued from deprecation if a maintainer were willing to step forward and promise to support them moving forward.

Python and "dead" batteries

Posted Jun 13, 2019 19:10 UTC (Thu) by logang (subscriber, #127618) [Link] (1 responses)

I don't think the 2026 number is valid and it still should be slower. At this rate, users who are affected are forced to update their code or not support versions past 3.9. Just because 3.9 is supported until 2026 doesn't really help the issue. Unsuspecting developers could be writing new code today that won't be supported on 3.10 when it's released around 2021. That's effectively only two years between looking perfectly safe to use and requiring it to be changed; and only one year before their users start complaining about warnings. The python team should mark the documentation as deprecated for a *lot* longer than that so users writing new code have a lot longer to notice the deprecation and stop using those features.

This is one of the things I think Linus gets right with Linux: his uncompromising stance toward breaking users and the general policy of very long deprecation cycles -- typically features aren't removed until there is a reasonable argument that nobody is using them. If more libraries and programming languages took that stance we'd be in a much better position. Too many projects just break things and foist the pain and responsibility on their user base.

Python and "dead" batteries

Posted Jun 14, 2019 10:53 UTC (Fri) by NAR (subscriber, #1313) [Link]

"Unsuspecting developers could be writing new code today that won't be supported on 3.10 when it's released around 2021."

If those unsuspecting developers will maintain their code in 2021, then they will update their code. If they don't maintain it, their users are screwed anyway.

Python and "dead" batteries

Posted Jun 12, 2019 20:30 UTC (Wed) by roc (subscriber, #30627) [Link] (5 responses)

More proof, if any were needed, that modern package management with some curation beats "batteries included".

Python and "dead" batteries

Posted Jun 13, 2019 8:47 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (4 responses)

I don't think PyPI is that modern. For example, there is no way to take ownership of modules if the original author disappears. So on such occasions, forks are required, and of course developers won't know of the fork, and so on…

Python and "dead" batteries

Posted Jun 13, 2019 9:37 UTC (Thu) by dottedmag (subscriber, #18590) [Link] (1 responses)

I remember taking over an abandoned module by sending an e-mail to the PyPI maintainers and answering some questions.

Python and "dead" batteries

Posted Jun 13, 2019 12:09 UTC (Thu) by cesarb (subscriber, #6266) [Link]

Was this before or after the npm event-stream incident? I expect taking over abandoned modules to have become harder after that.

Python and "dead" batteries

Posted Jun 13, 2019 11:19 UTC (Thu) by roc (subscriber, #30627) [Link]

Yeah, I'm not saying PyPI is great.

Python and "dead" batteries

Posted Jun 25, 2019 23:17 UTC (Tue) by HelloWorld (guest, #56129) [Link]

> For example there is no way of take ownership of modules if the original author disappears.
That sounds like a feature to me, not a bug. How would you protect users from wrongdoers who seize unmaintained modules and then ship malware that way?

Python and "dead" batteries

Posted Jun 12, 2019 21:02 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (1 responses)

> While moving some set of modules out of the standard library will certainly help alleviate the maintenance headaches for those modules, it will not magically grow new maintainers for them.

This seems like the key point to me. If the packages are unmaintained, that's the problem, not whether they're distributed with the standard library or through PyPI. The only way this improves things is if being moved out of the standard library makes it easier to attract developers. That seems like something that needs to be discussed.

Python and "dead" batteries

Posted Jun 13, 2019 2:04 UTC (Thu) by NYKevin (subscriber, #129325) [Link]

I think it depends on your perspective. One possible perspective:

  • This code is useful.
  • Useful code should be maintained.
  • Therefore, somebody should maintain the code, regardless of whether it continues living in the stdlib.

Another possibility:

  • No core developer wants to maintain this code.
  • Unmaintained code should not live in the stdlib.
  • Therefore, we should remove the code from the stdlib, regardless of whether some other developers want to maintain it elsewhere.

These perspectives are not in conflict with one another. In fact, they are entirely orthogonal. You can agree with either or both of them without contradiction. However, I feel obligated to point out that the first perspective does not identify who, precisely, ought to maintain the code. Unless someone volunteers, that perspective is rather academic.

Equally obvious is that you can disagree with either or both of these perspectives, of course. For example, I have no evidence whatsoever that "no developer wants to maintain this code" is in fact true.

Python and "dead" batteries

Posted Jun 12, 2019 23:49 UTC (Wed) by ms-tg (subscriber, #89231) [Link] (2 responses)

I’d like to remind the members of the Python community that hang out on LWN of the nearly identical problem domain encountered somewhat earlier by the Ruby community, and that the solution path chosen by the Ruby community is probably also ultimately going to be what’s most attractive for the Python community.

Out of a similar set of options, it was decided to strip many libraries over time out of the stdlib, and move them to Ruby gems distributed on rubygems.org (equivalent of PyPI).

As mentioned near the end of this article as a “if only we could” suggestion, this separation of project and code maintenance of each library is decoupled from the pace of their removal from Ruby releases, by including a set of “bundled” and “default” gems with each of the forthcoming Ruby releases. I imagine a Python solution will at some point adopt the same decoupled structure.

For more information on the Ruby process here, please see the official website for this effort:

https://stdgems.org/

Python and "dead" batteries

Posted Jun 13, 2019 8:23 UTC (Thu) by smcv (subscriber, #53363) [Link]

Perl has had the same thing for a long time ("dual-life" modules that exist both bundled with Perl and on CPAN). It seems like a good approach.

Python and "dead" batteries

Posted Jun 20, 2019 7:13 UTC (Thu) by njs (subscriber, #40338) [Link]

There's definitely been discussion of this option – it's basically how Python has been shipping 'pip' for a few years now, and one of the outcomes of Amber's talk at the language summit was the idea to start experimenting with doing the same for IDLE. There's also a discussion here that goes into more detail: https://discuss.python.org/t/1738

Python and "dead" batteries

Posted Jun 13, 2019 0:04 UTC (Thu) by JohnVonNeumann (guest, #131609) [Link] (3 responses)

> The modules are scattered throughout the standard library, but: "the majority of modules are for old data formats or old APIs. Some others are rarely useful and have better replacements on PyPI [the Python Package Index]."

See I get what's going on and that the community doesn't have enough maintainers, but IMO if I can do something with the stdlib I'll most likely do it with the stdlib, I try to avoid calling in 3rd party deps if I can. Naturally, this only works for smaller pieces of work but I disagree with the idea that just because there is something better as a 3rd party dep, the stdlib version should be killed off. I'm slightly simplifying their argument though.

Python and "dead" batteries

Posted Jun 13, 2019 0:35 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (2 responses)

> I disagree with the idea that just because there is something better as a 3rd party dep, the stdlib version should be killed off.

If anything, this seems like an argument in favor of replacing the current stdlib version with the better version from PyPI, not removing the feature from stdlib completely.

It also seems to me that this might be a sign that there needs to be a different relationship between the core developers and PyPI. Instead of having a separate set of modules that are developed in stdlib, all of the stdlib modules could be developed in PyPI. Long term support versions could be branched to include in the stdlib, but users who wanted the newest features could replace it with the latest, greatest version if they chose. It would require changing the PyPI rules for those modules, but it would provide a useful compromise between development speed and stability.

Python and "dead" batteries

Posted Jun 13, 2019 1:45 UTC (Thu) by ms-tg (subscriber, #89231) [Link] (1 responses)

> If anything, this seems like an argument in favor of replacing the current stdlib version with the better version from PyPI, not removing the feature from stdlib completely.

> It also seems to me that this might be a sign that there needs to be a different relationship between the core developers and PyPI. Instead of having a separate set of modules that are developed in stdlib, all of the stdlib modules could be developed in PyPI. Long term support versions could be branched to include in the stdlib, but users who wanted the newest features could replace it with the latest, greatest version if they chose. It would require changing the PyPI rules for those modules, but it would provide a useful compromise between development speed and stability.

Is this project similar to what you're thinking of?
https://stdgems.org/

Python and "dead" batteries

Posted Jun 13, 2019 19:32 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link]

That's not exactly what I had in mind, but I like it. The basic approach of bundling libraries with the distribution that aren't organizationally part of the core is a good compromise. It avoids the complaint that moving things out of the core makes them unavailable to people with strict install policies while still getting the benefit of developing them outside the core.


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds