Unplugging old batteries
Python is famous for being a "batteries included" language—its standard library provides a versatile set of modules with the language—but there may be times when some of those batteries have reached their end of life. At the 2018 Python Language Summit, Christian Heimes wanted to suggest a few batteries that may have outlived their usefulness and to discuss how the process of retiring standard library modules should work.
The "batteries included" phrase for Python came from the now-withdrawn PEP 206 in 2000. That PEP argued that having a rich standard library was an advantage for the language since users did not need to download lots of other modules to get real work done. That argument still holds, but there are some modules that are showing their age and should, perhaps, be unplugged and retired from the standard library.
![Christian Heimes](https://static.lwn.net/images/2018/pls-heimes-sm.jpg)
For example, Heimes listed several obsolete modules. He included the uu module, which implements encoding and decoding for uuencoded files. That standard is from 1980 and long predates MIME, he said. Three separate ancient media format libraries were also on the list: chunk (for IFF data), aifc (an Amiga audio format), and sunau (Sun's au audio format). The final module in this list was nis, which implements Network Information Service (NIS, also known as "yellow pages"); its successor, NIS+, has been around since 1992, he said. No one spoke up to oppose retiring those modules.
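For readers who have never met uuencoding: it turns arbitrary bytes into lines of printable ASCII, at most 45 input bytes per line. The sketch below uses the lower-level `binascii` helpers rather than the uu module itself; `uu_roundtrip` is a hypothetical name chosen for illustration.

```python
import binascii


def uu_roundtrip(payload: bytes) -> bytes:
    """Encode one chunk (at most 45 bytes) as a uuencoded line, then decode it."""
    line = binascii.b2a_uu(payload)  # printable ASCII, terminated by a newline
    return binascii.a2b_uu(line)     # back to the original bytes


# The first character of the line encodes the payload length (3 -> '#').
print(binascii.b2a_uu(b"Cat"))  # b'#0V%T\n'
```

Whole files add "begin" and "end" framing lines around a sequence of such chunks, which this sketch does not attempt to handle.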
The motivation to retire these (and other) modules is to reduce the cruft in the standard library. That will lead to a leaner standard library so there isn't such a "huge long list" of modules that greets new developers. It will also reduce the maintenance burden. Beyond that, there are almost certainly security flaws that exist in some of these older, largely unloved and undeveloped modules.
He has started drafting a PEP, which is about 80% done, but there are still open questions. There are also quite a number of debatable modules that might be considered for retirement, such as sndhdr and imghdr, which try to determine the type of a sound or image file but are woefully out of date. Several web modules had fixes and enhancements proposed for Python 2.1 in PEP 222, but that never happened. The old import library imp has been deprecated since 3.4; Brett Cannon said that it would be moving out of the standard library in 2020 when the Python 2.x series reaches its end of life. And so on.
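Type detection of the kind imghdr attempts generally comes down to comparing the first bytes of a file against known signatures. The following is a minimal sketch of that idea, not imghdr's actual implementation; `sniff_image_type` and the deliberately tiny `SIGNATURES` table are hypothetical.

```python
# Well-known leading byte sequences ("magic numbers") for a few image formats.
# This table is illustrative only and far from complete.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"\xff\xd8\xff", "jpeg"),
]


def sniff_image_type(header: bytes):
    """Return a format name if the header matches a known signature, else None."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None
```

A table like this only knows the formats someone remembered to add, which is one way such a module can fall out of date.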
Even if the list of modules to retire was agreed upon, there are still questions about how that process should work. Would the modules simply be removed with the hope that someone would pick them up and maintain them in the Python Package Index (PyPI)? Or would PyPI modules be created for them? Would they live in a single "dead battery" namespace, or would each keep its existing module name? PyPI does not allow standard library module names for submitted modules, so it should be possible to use the existing names in PyPI if that is deemed desirable.
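Whichever packaging route is chosen, code that relies on a retired module would have to cope with it being absent from a given installation. One possible pattern is sketched here, assuming a hypothetical `load_optional` helper and top-level module names only.

```python
import importlib
import importlib.util


def load_optional(name: str):
    """Return the named top-level module if it can be found, else None."""
    if importlib.util.find_spec(name) is None:
        return None  # not in this installation's standard library or site-packages
    return importlib.import_module(name)


nis = load_optional("nis")
if nis is None:
    print("nis is not available here; a separately installed package would be needed")
```

Checking up front lets a tool print a useful hint instead of failing with a bare ImportError.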
Ned Deily was not sure that a "ten minute discussion" at the summit was the right way to decide which modules to remove. He suggested that the process needed more visibility throughout the community, perhaps via a poll. There are probably uses for these modules that attendees are completely unaware of, he said. Posting the PEP will help raise the visibility, an attendee said; it will get discussed more widely at that point.
Index entries for this article | |
---|---|
Conference | Python Language Summit/2018 |
Python | Standard library |
Posted Jun 5, 2018 18:24 UTC (Tue)
by smoogen (subscriber, #97)
NIS+ and NIS are as similar as Java and JavaScript. Every time I think that no one is going to use it, someone sees that it makes accounts, hosts, mail, etc. simply available and deploys it again. That said, I expect that the sites using Python and NIS are limited, and they will be using Python 2.7 until the end of Unix time anyway.
Posted Jun 7, 2018 13:56 UTC (Thu)
by paulj (subscriber, #341)
Even today, I still see NIS on some networks (madness, given there are now good "out of the box / batteries included" LDAP server solutions and client support).
Posted Jun 7, 2018 13:57 UTC (Thu)
by paulj (subscriber, #341)
Posted Jun 7, 2018 16:22 UTC (Thu)
by smoogen (subscriber, #97)
Posted Jun 8, 2018 7:40 UTC (Fri)
by epa (subscriber, #39769)
Posted Jun 8, 2018 15:22 UTC (Fri)
by nix (subscriber, #2304)
Posted Jun 8, 2018 15:43 UTC (Fri)
by paulj (subscriber, #341)
Posted Jun 8, 2018 17:59 UTC (Fri)
by nix (subscriber, #2304)
Posted Jun 6, 2018 1:41 UTC (Wed)
by pabs (subscriber, #43278)
Posted Jun 6, 2018 8:01 UTC (Wed)
by k8to (guest, #15413)
(Not familiar with yaml.open. Are there problems with this beyond the whole unsafe parser problem?)
Posted Jun 6, 2018 8:50 UTC (Wed)
by smcv (subscriber, #53363)
In PyYAML, the function that only constructs "safe" types like lists, dicts, strings, booleans etc., analogous to json.load(), is yaml.safe_load().
(PyYAML isn't a standard library module anyway, though, so it isn't directly relevant here.)
Posted Jun 7, 2018 0:39 UTC (Thu)
by cjwatson (subscriber, #7322)
Posted Jun 7, 2018 15:55 UTC (Thu)
by k8to (guest, #15413)
Posted Jun 6, 2018 6:12 UTC (Wed)
by eru (subscriber, #2753)
Posted Jun 6, 2018 7:41 UTC (Wed)
by mpr22 (subscriber, #60784)
Posted Jun 6, 2018 7:51 UTC (Wed)
by ringerc (subscriber, #3071)
Deprecated: majority of user base pays attention.
Obsolete it, add warnings: If the warnings are enabled by default, users complain unless you offer a way to turn it off, at which point devs turn it off by default. If they're for a maintainer mode, devs never turn it on. Nobody sees or pays attention to the warnings.
Remove it: oh my god it's the apocalypse, $everything was using that, bring it back!
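In Python terms, the middle step usually means a DeprecationWarning, which the default warning filters hide in most contexts, so callers have to opt in to see it. A small sketch of that behaviour follows; `old_api` is a hypothetical stand-in for a deprecated interface.

```python
import warnings


def old_api():
    """Hypothetical function standing in for a deprecated interface."""
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_and_collect():
    """Call old_api() with all warnings enabled and return what was raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # opt in: record every warning
        result = old_api()
    return result, [str(w.message) for w in caught]
```

Running a test suite with `python -W error::DeprecationWarning` is one way to make such warnings impossible to miss.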
Posted Jun 6, 2018 9:12 UTC (Wed)
by mbunkus (subscriber, #87248)
Well, it does, more or less. Perl does it all the time. They often remove packages from the core distribution and make them available as separate CPAN packages users can install if they want to. That includes often-used (though ugly) things like the CGI module (removed in 5.22) as well as development-only things (Module::Build, Devel::DProf) and more obscure ones (Log::Message, Text::Soundex).
They're not shy about this either. You can use the "corelist" tool in order to get an idea of how many packages have been removed since, say, 5.10.0 (current release is 5.26.0):
> corelist --diff 5.10.0 5.26.0 | grep -E '\(absent\)$'
Posted Jun 12, 2018 18:05 UTC (Tue)
by jezuch (subscriber, #52988)
[1] That somebody being Stuart Marks https://twitter.com/DrDeprecator
Posted Jun 6, 2018 8:55 UTC (Wed)
by eru (subscriber, #2753)
In the case of a published, popular language or API, there is no practical way to find out if there are approximately zero users. I find it hard enough even in the case of the in-house language and tools I maintain.
Posted Jun 6, 2018 9:32 UTC (Wed)
by smurf (subscriber, #17840)
Anybody who still uses these modules is unlikely to update their Python installation, much less upgrade to Python3. So why should we still carry them around?
Posted Jun 6, 2018 9:45 UTC (Wed)
by dottedmag (subscriber, #18590)
The only valid reason to remove anything from a standard library is security. Unmaintained parsers for old file formats definitely are a security problem, so ejecting them is not so bad.
What would be the reason for removing colorsys or pipes? Mark them as obsolete, tuck them into a separate section in the documentation, and forget them until the time comes for the next epoch change.
Posted Jun 6, 2018 15:15 UTC (Wed)
by excors (subscriber, #95769)
The only way to avoid security problems is to actually fix the bugs, then make sure the fixed version is more easily accessible than old buggy versions.
Posted Jun 6, 2018 18:09 UTC (Wed)
by epa (subscriber, #39769)
I suggest that if the objective is to eliminate bugs (in general, not just security bugs) it is just as acceptable to remove the buggy feature as to modify it to be safe and bug-free. It's better to have a smaller body of code which you can stand behind and maintain actively, with some degree of confidence that it works as described. Of course you do have to consider whether anyone is using that feature in practice; but if they aren't, kill it.
Also, if a library is included in the standard library with Python, there is the expectation that it's somehow 'blessed' by the core developers and actively maintained.
Posted Jun 30, 2018 20:21 UTC (Sat)
by ssmith32 (subscriber, #72404)
The problem with keeping them in standard libraries is what makes standard libraries nice: I can be a little lazier with them, because I assume a certain level of quality, review, and maintenance. When that level cannot be maintained, best to throw things out.
Posted Jun 6, 2018 12:02 UTC (Wed)
by eru (subscriber, #2753)
Actually, in my workplace there is a big set of servers used for product development where NIS is used. Old-school NIS. Now I don't know if there is any local Python code that would depend on the nis package because of this, but I would not bet against it. Lots of groups make little tools for their own use.
File format parsers also should be kept around, even for formats no longer used much, to help fight the "digital amnesia" phenomenon.
Posted Jun 6, 2018 18:38 UTC (Wed)
by khim (subscriber, #9252)
Unfortunately these are also a very real security hazard (remember these NES files?).
Very few guys even thought about security issues when they wrote these old parsers for obsolete formats. So it's a very real danger of "digital amnesia" vs. an also very real danger of "0day exploits". Hard to pick a side there, really.
Posted Jun 7, 2018 7:42 UTC (Thu)
by epa (subscriber, #39769)
To avoid digital amnesia, I might try running older, unsafe C code using an interpreter like <https://github.com/kframework/c-semantics>. It will be a thousand times slower than compiled code but that shouldn't matter much given how much faster machines are now than when these old formats were created.
Posted Jun 7, 2018 15:03 UTC (Thu)
by zlynx (guest, #2285)
Posted Jun 7, 2018 15:58 UTC (Thu)
by k8to (guest, #15413)
I'm sure I can work around the problem though, as I only run the code locally.
Posted Jun 6, 2018 12:58 UTC (Wed)
by k3ninho (subscriber, #50375)
> you create needless extra work for someone with 100% certainty
Regular refactoring is a normal part of software maintenance. Your interfaces are separate from your implementation (because SOLID is a list of good practices) and so maintaining access via old interfaces isn't drudgery -- instead it's good separation of concerns. (You can even write programs which list and manipulate what those interfaces are, to facilitate integration testing.) Retiring old/unused interfaces is normal and healthy.
K3n.
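As a rough illustration of a program that lists an interface, here is a sketch that enumerates a module's public callables so that two versions could be compared in a test. `public_interface` is a hypothetical helper, and json is used only because it is always importable.

```python
import json  # any importable module would do; json is just an example


def public_interface(module):
    """Return the sorted public callable names of a module.

    Honours __all__ when the module defines it, otherwise falls back to
    every attribute name that does not start with an underscore.
    """
    names = getattr(module, "__all__", None)
    if names is None:
        names = [n for n in dir(module) if not n.startswith("_")]
    return sorted(n for n in names if callable(getattr(module, n, None)))


print(public_interface(json))
```

Storing such a list alongside the tests makes an accidental removal show up as a diff rather than as a surprise for users.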
Posted Jun 7, 2018 16:48 UTC (Thu)
by eru (subscriber, #2753)
Your bullet points make sense in the context of a professional programmer or advanced hobbyist developing software as the main task. However, a lot of programs have been created for specific jobs, and their users and authors prefer not to touch them unless necessary for those jobs. The author may well be someone whose interests and expertise are not in building software, and who is therefore not following what features get deprecated in the language the program is written in. Until the day comes when some small change has to be made, or perhaps a port to a new computer is needed, and this turns into a bigger than expected job, because the language has changed.
(Often at this point some bright fellow comes along, and proposes a rewrite in the language du jour, which is supposedly faster to do, because the new language is more powerful, and has these nice development tools. After using twice the time promised (if lucky), the resulting program does most of the same things as the old, but using 10x more memory and CPU... </old-git-mode>)
Posted Jun 6, 2018 10:00 UTC (Wed)
by dgm (subscriber, #49227)
Posted Jun 6, 2018 14:28 UTC (Wed)
by jond (subscriber, #37669)
Posted Jun 8, 2018 9:49 UTC (Fri)
by awilfox (guest, #124923)
Bad idea
You can make changes like that only in languages where the user base is small and you can reach them all, or languages clearly still in development phase. Otherwise backward-incompatible changes are just plain rudeness.
The drudgery of maintaining backward-compatibility is the price of success. (At least if you want to be considerate towards your users).
Or, for features which genuinely have approximately zero users or which need to die (o hai gets() strcpy() scanf() fscanf(); sscanf() gets to stay because it doesn't have the conceptual problems of scanf()/fscanf()), you can follow the well-established "deprecate, declare obsolete, remove" sequence.
Bad idea? Great idea!
for features which genuinely have approximately zero users
NIS is long deprecated
Perhaps we're victims of bad design and bad habits
If you're a user of a deprecated language feature, you're either:
* not doing work to maintain your code, so 'don't change what isn't broken' across the platform as well as your code package, or
* doing work to maintain your code and are a legitimate user of the deprecated functionality, so can adopt it and keep it alive, or
* doing work to maintain your code and can't adopt the deprecated functionality, so need to refactor your program.
(I'll be explicit about security concerns causing a need to upgrade: this is work to maintain your codebase.)
Funnily, I found this Stack Overflow question about how to use the uu module, dated in 2015.