
Unplugging old batteries

By Jake Edge
June 5, 2018

Python Language Summit

Python is famous for being a "batteries included" language—its standard library provides a versatile set of modules with the language—but there may be times when some of those batteries have reached their end of life. At the 2018 Python Language Summit, Christian Heimes wanted to suggest a few batteries that may have outlived their usefulness and to discuss how the process of retiring standard library modules should work.

The "batteries included" phrase for Python came from the now-withdrawn PEP 206 in 2006. That PEP argued that having a rich standard library was an advantage for the language since users did not need to download lots of other modules to get real work done. That argument still holds, but there are some modules that are showing their age and should, perhaps, be unplugged and retired from the standard library.

[Christian Heimes]

For example, Heimes listed several obsolete modules. He included the uu module, which implements encoding and decoding for uuencoded files. That standard is from 1980 and long predates MIME, he said. Three separate ancient media format libraries were also on the list: chunk (for IFF data), aifc (an Amiga audio format), and sunau (Sun's au audio format). The final module in this list was nis, which implements Network Information Service (NIS, also known as "yellow pages"); its successor, NIS+, has been around since 1992, he said. No one spoke up to oppose retiring those modules.
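
For readers who have not met uuencoding, a minimal sketch of the round trip may help. It deliberately uses binascii.b2a_uu() and binascii.a2b_uu() rather than the uu module itself, so it does not depend on uu being present; the helper names uu_encode_lines and uu_decode_lines are made up for this illustration.

```python
import binascii

CHUNK = 45  # b2a_uu() accepts at most 45 bytes per call, one encoded line each


def uu_encode_lines(data: bytes) -> list:
    """Split data into 45-byte chunks and uuencode each chunk as one line."""
    return [binascii.b2a_uu(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]


def uu_decode_lines(lines) -> bytes:
    """Decode uuencoded lines (as produced above) back into the original bytes."""
    return b"".join(binascii.a2b_uu(line) for line in lines)


if __name__ == "__main__":
    encoded = uu_encode_lines(b"Cat")
    print(encoded)                   # one line: a length character, then the payload
    print(uu_decode_lines(encoded))  # the original bytes again
```

This only covers the per-line encoding; a real uuencoded file also carries "begin" and "end" lines with the file mode and name, which the sketch leaves out.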

The motivation to retire these (and other) modules is to reduce the cruft in the standard library. That will lead to a leaner standard library so there isn't such a "huge long list" of modules that greets new developers. It will also reduce the maintenance burden. Beyond that, there are almost certainly security flaws that exist in some of these older, largely unloved and undeveloped modules.

He has started drafting a PEP, which is about 80% done, but there are still open questions. There are also quite a number of debatable modules that might be considered for retirement, such as sndhdr and imghdr, which try to determine the type of a sound or image file but are woefully out of date. Several web modules had fixes and enhancements proposed for Python 2.1 in PEP 222, but that never happened. The old import library imp has been deprecated since 3.4; Brett Cannon said that it would be moving out of the standard library in 2020 when the Python 2.x series reaches its end of life. And so on.
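
Whether a given interpreter still ships one of these modules can be checked without importing it. The sketch below uses importlib.util.find_spec(); the CANDIDATES list simply collects module names mentioned in this article, and the function name available_modules is made up for illustration.

```python
import importlib.util

# Module names taken from the article; not an official retirement list.
CANDIDATES = ["uu", "chunk", "aifc", "sunau", "nis", "sndhdr", "imghdr", "imp"]


def available_modules(names):
    """Map each top-level module name to True if this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, present in available_modules(CANDIDATES).items():
        print(f"{name:8} {'present' if present else 'missing'}")
```

The output depends on the Python version and platform in use, so no particular result should be expected.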

Even if the list of modules to retire were agreed upon, there are still questions about how that process should work. Would the modules simply be removed with the hope that someone would pick them up and maintain them in the Python Package Index (PyPI)? Or would PyPI modules be created for them? Would they live in a single "dead battery" namespace, or would each keep its existing module name? PyPI does not allow standard library module names for submitted modules, so it should be possible to use the existing names in PyPI if that is deemed desirable.
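
If modules did move to PyPI, code that needs one of them could try the standard library first and fall back to an external package. What follows is a rough sketch, assuming a replacement package exists at all; the name "dead_batteries.uu" in the usage comment is purely hypothetical, loosely following the namespace idea above, and import_with_fallback is an invented helper.

```python
import importlib


def import_with_fallback(stdlib_name: str, fallback_name: str):
    """Import stdlib_name if this interpreter has it, else fallback_name."""
    try:
        return importlib.import_module(stdlib_name)
    except ModuleNotFoundError as exc:
        # Only fall back when the requested module itself is missing, not
        # when it exists but one of its own imports failed.
        if exc.name != stdlib_name:
            raise
        return importlib.import_module(fallback_name)


# Hypothetical usage; "dead_batteries.uu" is an invented package name:
# uu = import_with_fallback("uu", "dead_batteries.uu")
```

If neither module can be found, the ModuleNotFoundError from the fallback import propagates to the caller, which is probably what most programs would want.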

Ned Deily was not sure that a "ten minute discussion" at the summit was the right way to decide which modules to remove. He suggested that the process needed more visibility throughout the community, perhaps via a poll. There are probably uses for these modules that attendees are completely unaware of, he said. Posting the PEP will help raise the visibility, an attendee said; it will get discussed more widely at that point.



Unplugging old batteries

Posted Jun 5, 2018 18:24 UTC (Tue) by smoogen (subscriber, #97) [Link] (7 responses)

So this caught my eye as one of those "Kids these days.. not knowing the difference between a 56 and a 58 Chevy" where I need to find my bottle of Geritol after posting.

> The final module in this list was nis, which implements Network Information Service (NIS, also known as "yellow pages"); its successor, NIS+, has been around since 1992, he said. No one spoke up to oppose retiring those modules.

NIS+ and NIS are as similar as Java and JavaScript. Every time I think that no one is going to use NIS, someone sees that it makes accounts, hosts, mail, etc. simply available and deploys it again. That said, I expect that the sites using Python and NIS are limited and they will be using python2.7 until the end of Unix time anyway.

Unplugging old batteries

Posted Jun 7, 2018 13:56 UTC (Thu) by paulj (subscriber, #341) [Link] (6 responses)

I've never seen a NIS+ deployment. There were never any (decent/complete anyway) implementations for non-Solaris; indeed, I'm not even sure there was much in the way of solid NIS+ client support outside of Solaris. So anyone who wanted any kind of interoperability across Unix boxes kept using NIS or went to LDAP.

Even today, I still see NIS on some networks (madness, given there are now good "out of the box / batteries included" LDAP server solutions and client support).

Unplugging old batteries

Posted Jun 7, 2018 13:57 UTC (Thu) by paulj (subscriber, #341) [Link] (5 responses)

tl;dr: Between NIS and NIS+ modules, /NIS+/ is the one that could be safely retired.

Unplugging old batteries

Posted Jun 7, 2018 16:22 UTC (Thu) by smoogen (subscriber, #97) [Link] (4 responses)

Agreed. The issue with LDAP is that the out-of-the-box implementations are rarely 'run make in this directory and everyone now has updated hosts, groups, services, etc.' from whatever you have on your master box. It is usually: set up this service, go to this web service, add this thing, etc.

Unplugging old batteries

Posted Jun 8, 2018 7:40 UTC (Fri) by epa (subscriber, #39769) [Link] (3 responses)

It all seems a lot of complexity when you can just push out the new passwd and groups files (etc) with scp.

Unplugging old batteries

Posted Jun 8, 2018 15:22 UTC (Fri) by nix (subscriber, #2304) [Link] (2 responses)

Why is there never any love for Hesiod? All you need for that is a nameserver! (Is it just that it's so simple that there's no need for supporting tooling, so there's no need for elaborate marketing campaigns? Also it was written decades ago so it must be obsolete...)

Unplugging old batteries

Posted Jun 8, 2018 15:43 UTC (Fri) by paulj (subscriber, #341) [Link] (1 responses)

Never realised Hesiod was that easy to set up. Downside might be that DNS can be harder to secure?

Unplugging old batteries

Posted Jun 8, 2018 17:59 UTC (Fri) by nix (subscriber, #2304) [Link]

Yeah, Hesiod dates from a kinder era -- you'd need local DNSSEC if you wanted to be secure against local-network DNS spoofers. However, for a lot of us that is a non-issue: anyone who could spoof my DNS could also interfere with my NFS traffic, etc...

Unplugging old batteries

Posted Jun 6, 2018 1:41 UTC (Wed) by pabs (subscriber, #43278) [Link] (4 responses)

Can we retire hard edges like os.system/os.popen/yaml.open too?

Unplugging old batteries

Posted Jun 6, 2018 8:01 UTC (Wed) by k8to (guest, #15413) [Link] (3 responses)

You mean traps for the unwary?

(Not familiar with yaml.open. Are there problems with this beyond the whole unsafe parser problem?)

Unplugging old batteries

Posted Jun 6, 2018 8:50 UTC (Wed) by smcv (subscriber, #53363) [Link] (2 responses)

It's a recurring (anti-)pattern in YAML libraries (in several languages!) that a function called something like yaml.load() executes arbitrary code (by passing arbitrary fields from the YAML to object constructors, which a creative attacker can turn into arbitrary code execution).

In PyYAML, the function that only constructs "safe" types like lists, dicts, strings, booleans etc., analogous to json.load(), is yaml.safe_load().

(PyYAML isn't a standard library module anyway, though, so it isn't directly relevant here.)

Unplugging old batteries

Posted Jun 7, 2018 0:39 UTC (Thu) by cjwatson (subscriber, #7322) [Link]

The PyYAML case will finally be fixed in the next release, although no doubt code depending on it will still need to be careful for a while if it needs to support old versions: https://github.com/yaml/pyyaml/pull/74

Unplugging old batteries

Posted Jun 7, 2018 15:55 UTC (Thu) by k8to (guest, #15413) [Link]

Yes, that would be the problem I was referring to. I'm very sad this "feature" exists.

Bad idea

Posted Jun 6, 2018 6:12 UTC (Wed) by eru (subscriber, #2753) [Link] (17 responses)

In any language, a standard library module is effectively part of the language. If you remove it or change it in an incompatible way, you create needless extra work for someone with 100% certainty. This is especially true in a successful language like Python.
You can make changes like that only in languages where the user base is small and you can reach them all, or languages clearly still in development phase. Otherwise backward-incompatible changes are just plain rudeness.
The drudgery of maintaining backward-compatibility is the price of success. (At least if you want to be considerate towards your users).

Bad idea

Posted Jun 6, 2018 7:41 UTC (Wed) by mpr22 (subscriber, #60784) [Link] (14 responses)

Or, for features which genuinely have approximately zero users or which need to die (o hai gets() strcpy() scanf() fscanf(); sscanf() gets to stay because it doesn't have the conceptual problems of scanf()/fscanf()), you can follow the well-established "deprecate, declare obsolete, remove" sequence.

Bad idea

Posted Jun 6, 2018 7:51 UTC (Wed) by ringerc (subscriber, #3071) [Link] (2 responses)

Problem is, that sequence doesn't work.

Deprecated: majority of user base pays attention.

Obsolete it, add warnings: If the warnings are enabled by default, users complain unless you offer a way to turn it off, at which point devs turn it off by default. If they're for a maintainer mode, devs never turn it on. Nobody sees or pays attention to the warnings.

Remove it: oh my god it's the apocalypse, $everything was using that, bring it back!

Bad idea

Posted Jun 6, 2018 9:12 UTC (Wed) by mbunkus (subscriber, #87248) [Link] (1 responses)

> Problem is, that sequence doesn't work.

Well, it does, more or less. Perl does it all the time. They often remove packages from the core distribution and make them available as separate CPAN packages users can install if they want to. That includes often-used (though ugly) things like the CGI module (removed in 5.22) as well as development-only things (Module::Build, Devel::DProf) and more obscure ones (Log::Message, Text::Soundex).

They're not shy about this either. You can use the "corelist" tool in order to get an idea of how many packages have been removed since, say, 5.10.0 (current release is 5.26.0):

> corelist --diff 5.10.0 5.26.0 | grep -E '\(absent\)$'

Bad idea? Great idea!

Posted Jun 12, 2018 18:05 UTC (Tue) by jezuch (subscriber, #52988) [Link]

Java will do this too. There are many things that have been deprecated for 20 years and finally somebody[1] said "enough is enough!". And so now there is an official process for removing old junk, that everybody hopes will be followed without mercy :)

[1] That somebody being Stuart Marks https://twitter.com/DrDeprecator

Bad idea

Posted Jun 6, 2018 8:55 UTC (Wed) by eru (subscriber, #2753) [Link] (10 responses)

> for features which genuinely have approximately zero users

In the case of a published, popular language or API, there is no practical way to find out if there are approximately zero users. I find it hard enough even in the case of the in-house language and tools I maintain.

Bad idea

Posted Jun 6, 2018 9:32 UTC (Wed) by smurf (subscriber, #17840) [Link] (9 responses)

You can approximate this, though. NIS is long deprecated. So are various obsolete file formats, and/or incomplete methods of detecting them.

Anybody who still uses these modules is unlikely to update their Python installation, much less upgrade to Python3. So why should we still carry them around?

Bad idea

Posted Jun 6, 2018 9:45 UTC (Wed) by dottedmag (subscriber, #18590) [Link] (3 responses)

Why _not_ carry it around?

The only valid reason to remove anything from a standard library is security. Unmaintained parsers for old file formats definitely are a security problem, so ejecting them is not so bad.

What would be the reason for removing colorsys or pipes? Mark them as obsolete, tuck them into a separate section in the documentation, and forget them until the time comes for the next epoch change.

Bad idea

Posted Jun 6, 2018 15:15 UTC (Wed) by excors (subscriber, #95769) [Link] (2 responses)

Why would ejecting them help security? If a developer wants to parse an old file format, they'll use the module in the standard library, and if it's not there they'll find the identical one that's been moved to PyPI, else they'll download some random code from GitHub. They'll be exposed to the same security issues in any case. And if a developer doesn't want to parse that old file format, they'll never use the module even if it's in the standard library, so its presence doesn't hurt anything.

The only way to avoid security problems is to actually fix the bugs, then make sure the fixed version is more easily accessible than old buggy versions.

Bad idea

Posted Jun 6, 2018 18:09 UTC (Wed) by epa (subscriber, #39769) [Link]

By that argument, most of the cleanup work in LibreSSL (where they ripped out dozens of crufty ciphers and obsolete options from the OpenSSL code) is wasted. If someone wants to turn on some wacko crypto feature, they'll end up using random code from GitHub (or OpenSSL itself) if the feature is no longer supported in LibreSSL. And if the option is never used, it doesn't hurt security to have it. What is wrong with this argument?

I suggest that if the objective is to eliminate bugs (in general, not just security bugs) it is just as acceptable to remove the buggy feature as to modify it to be safe and bug-free. It's better to have a smaller body of code which you can stand behind and maintain actively, with some degree of confidence that it works as described. Of course you do have to consider whether anyone is using that feature in practice; but if they aren't, kill it.

Also, if a library is included in the standard library with Python, there is the expectation that it's somehow 'blessed' by the core developers and actively maintained.

Bad idea

Posted Jun 30, 2018 20:21 UTC (Sat) by ssmith32 (subscriber, #72404) [Link]

Practically speaking, it raises the level of effort required to use said bad code, hopefully discouraging its use by the lazy, and encouraging some thought about how to contain the security implications by those that are not.

The problem with keeping them in standard libraries is what makes standard libraries nice: I can be a little lazier with them, because I assume a certain level of quality, review, and maintenance. When that level cannot be maintained, best to throw things out.

Bad idea

Posted Jun 6, 2018 12:02 UTC (Wed) by eru (subscriber, #2753) [Link] (3 responses)

> NIS is long deprecated

Actually, in my workplace there is a big set of servers used for product development where NIS is used. Old-school NIS. Now I don't know if there is any local Python code that would depend on the nis package because of this, but I would not bet against it. Lots of groups make little tools for their own use.

File format parsers also should be kept around, even for formats no longer used much, to help fight the "digital amnesia" phenomenon.

Bad idea

Posted Jun 6, 2018 18:38 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

> File format parsers also should be kept around, even for formats no longer used much, to help fight the "digital amnesia" phenomenon.

Unfortunately, these are also a very real security hazard (remember these NES files?).

Very few guys even thought about security issues when they wrote these old parsers for obsolete formats. So it's the very real danger of "digital amnesia" vs. the also very real danger of "0day exploits". Hard to pick a side there, really.

Bad idea

Posted Jun 7, 2018 7:42 UTC (Thu) by epa (subscriber, #39769) [Link] (1 responses)

A lot of the dangers in writing a file format parser are avoided by using a safe language like Python instead of C.

To avoid digital amnesia, I might try running older, unsafe C code using an interpreter like <https://github.com/kframework/c-semantics>. It will be a thousand times slower than compiled code but that shouldn't matter much given how much faster machines are now than when these old formats were created.

Bad idea

Posted Jun 7, 2018 15:03 UTC (Thu) by zlynx (guest, #2285) [Link]

Until someone writes an optimized parser in C. Python is full of these things.

Bad idea

Posted Jun 7, 2018 15:58 UTC (Thu) by k8to (guest, #15413) [Link]

For a silly data point, I still have code that I still run that processes Amiga data files that I use in an archival project. I'm a little sad about some of the modules I use being removed from Python.

I'm sure I can work around the problem though, as I only run the code locally.

Perhaps we're victims of bad design and bad habits

Posted Jun 6, 2018 12:58 UTC (Wed) by k3ninho (subscriber, #50375) [Link] (1 responses)

We're thinking about software maintenance and backwards-compatibility wrong. The system as a whole -- not just the code, I'm including stuff as far as the other users it has and the other networked devices with which it might interact -- is always a progressing and developing thing. Think 'life moves forward'. There is always a need to budget for time and effort to respond to this ongoing change. We can't assume that the job is done once the product is shipped and the upstream contributions tucked away in their code repo (ha!).

> you create needless extra work for someone with 100% certainty
If you're a user of a deprecated language feature, you're either:
* not doing work to maintain your code, so 'don't change what isn't broken' across the platform as well as your code package, or
* doing work to maintain your code and are a legitimate user of the deprecated functionality, so can adopt it and keep it alive, or
* doing work to maintain your code and can't adopt the deprecated functionality, so need to refactor your program.
(I'll be explicit about security concerns causing a need to upgrade: this is work to maintain your codebase.)

Regular refactoring is a normal part of software maintenance. Your interfaces are separate from your implementation (because SOLID is a list of good practices) and so maintaining access via old interfaces isn't drudgery -- instead it's good separation of concerns. (You can even write programs which list and manipulate what those interfaces are, to facilitate integration testing.) Retiring old/unused interfaces is normal and healthy.

K3n.

Perhaps we're victims of bad design and bad habits

Posted Jun 7, 2018 16:48 UTC (Thu) by eru (subscriber, #2753) [Link]

Your bullet points make sense in the context of a professional programmer or advanced hobbyist developing software as the main task. However, a lot of programs have been created for specific jobs, and their users and authors prefer not to touch them unless necessary for those jobs. The author may well be someone whose interests and expertise are not in building software, and who is therefore not following what features get deprecated in the language the program is written in. Then the day comes when some small change has to be made, or perhaps a port to a new computer is needed, and this turns into a bigger job than expected, because the language has changed.

(Often at this point some bright fellow comes along, and proposes a rewrite in the language du jour, which is supposedly faster to do, because the new language is more powerful, and has these nice development tools. After using twice the time promised (if lucky), the resulting program does most of the same things as the old, but using 10x more memory and CPU... </old-git-mode>)

Unplugging old batteries

Posted Jun 6, 2018 10:00 UTC (Wed) by dgm (subscriber, #49227) [Link]

Funnily, I found this Stack Overflow question about how to use the uu module, dated in 2015.

Unplugging old batteries

Posted Jun 6, 2018 14:28 UTC (Wed) by jond (subscriber, #37669) [Link]

Since one of the reasons for doing this is the perceived maintenance burden of keeping them in, the actual, measurable maintenance burden, per-module, should be a factor in the decision. I suspect "uu", for example, requires almost zero maintenance.

Unplugging old batteries

Posted Jun 8, 2018 9:49 UTC (Fri) by awilfox (guest, #124923) [Link]

Funny. I just used the sunau module two months ago, from 3.6, to have a tinker and learn about how sound works from a digital I/O perspective. Really simple interface and really educational. Ah well, I'm sure there's other little modules about that can take its place (or maybe it can go to PyPI).


Copyright © 2018, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds