Python "standard" library

By Jake Edge
July 24, 2019

Python is often mentioned in the same breath with the phrase "batteries included", which refers to the breadth of its standard library. But there is an effort underway to trim back the standard library by removing some unloved modules. In addition, there has been persistent talk of a major restructuring of the library, into a fairly minimal core as described in Amber Brown's talk at this year's Python Language Summit, or in other ways as discussed on the python-dev mailing list in January (though it has come up many times before that as well). A mid-July python-ideas mailing list thread picked up on some of that; it ended up showing, once again, that there is no real consensus on what the standard library is—or should be.

A fairly simple idea for a Python enhancement was posted by Abdur-Rahmaan Janhangeer; the discussion likely went in directions he was not expecting. He suggested adding a stdlib module, akin to the existing builtins module, that would provide a way to discover all of the modules in the standard library. Later in the thread, he expanded on the idea:

Like right now if you want to know the stdlib functions, you have to go through the docs. While teaching programming, I find the builtins module very convenient as students can discover by themselves. Similarly, a stdlib module when inspected, shows what you can import right out of the box. Inspection goes a long way in making learning fun, going through the docstrings etc.
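
The kind of inspection Janhangeer describes can be sketched as follows. This is only an illustration, not part of his proposal: it relies on sys.stdlib_module_names, which was added in Python 3.10 (well after this discussion), and falls back to an empty set on older interpreters.

```python
import builtins
import sys

# Names a student can discover by inspecting the builtins module.
builtin_names = [name for name in dir(builtins) if not name.startswith("_")]
print(len(builtin_names), "public builtins, for example:", builtin_names[:5])

# sys.stdlib_module_names (Python 3.10+) is a frozenset of top-level
# standard library module names; older versions get an empty set here.
stdlib_names = getattr(sys, "stdlib_module_names", frozenset())
public_modules = sorted(n for n in stdlib_names if not n.startswith("_"))
print(len(public_modules), "public stdlib modules, for example:", public_modules[:5])
```

Note that this only lists names; it says nothing about whether a given module is actually installed on the running system.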

Andrew Barnert thought the suggestion had merit, but that it could go even further:

Is it just the names that are in stdlib, or some kind of lazy imports for the modules themselves, so you can do "from stdlib import pprint"? The latter is a bit more complicated, but it seems like it would make the feature a lot more useful—it's a way to guarantee that you get the standard pprint even if it's been shadowed by a local file, a way to make sure you get an early error if your distributor decided to leave pprint out of the distribution, and so on. And that would be similar to the builtins module (builtins.print is the builtin print, even if you've shadowed it with a module global).
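
A minimal sketch of the lazy-import variant Barnert describes might look like the following. The LazyStdlib class and the stdlib object are hypothetical names for illustration, and membership is assumed to come from sys.stdlib_module_names (Python 3.10+). It gives an early, explicit error for a missing module, but it uses the normal import machinery, so it does not solve the shadowing case he mentions.

```python
import importlib
import sys


class LazyStdlib:
    """Hypothetical namespace that imports stdlib modules on first access."""

    # Assumption: sys.stdlib_module_names (Python 3.10+) defines membership.
    # On older versions the set is empty and no membership check is done.
    _known = getattr(sys, "stdlib_module_names", frozenset())

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, i.e. on first access.
        if self._known and name not in self._known:
            raise AttributeError(f"{name!r} is not a standard library module")
        try:
            module = importlib.import_module(name)
        except ImportError as exc:
            # For example, a distributor left the module out.
            raise AttributeError(f"{name!r} is not available: {exc}") from exc
        setattr(self, name, module)  # cache so later lookups skip __getattr__
        return module


stdlib = LazyStdlib()
pprint = stdlib.pprint  # the import happens here, not when stdlib was created
pprint.pprint({"lazy": True})
```

Supporting "from stdlib import pprint" would require a real module rather than an instance, for example one that defines a module-level __getattr__.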

He pointed Janhangeer at his stdlib project from a year ago, which allows getting names from standard library modules without remembering which modules define them:

    >>> import stdlib
    >>> stdlib.ETree
    <module 'xml.etree.ElementTree' from ... >

Barnert suggested Janhangeer implement the from stdlib ... feature as a Python Package Index (PyPI) module so that people could try it out. Christopher Barker agreed, but wanted to consider going even further: "This would be an opportunity to clearly define the 'standard library' as something other than 'all the stuff that ships with cPython'".

Steven D'Aprano wondered if there was a real use case for a stdlib module, however. builtins is often used to ensure that the code refers to the built-in function and not some shadowed name. For example, a module that defines its own open() would use builtins.open() to access the "real" function. Shadowing standard library names is probably not used all that often, so the need for stdlib is somewhat dubious: "[...] it isn't clear to me that shadowing parts of the std lib is useful or common (except by accident, which is a problem to fix not a feature to encourage)".
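
The builtins use case D'Aprano describes can be illustrated with a short sketch. The wrapper below is a made-up example (defaulting text files to UTF-8 is an assumption chosen for illustration); the point is only that builtins.open still refers to the original function after the module defines its own open().

```python
import builtins
import os
import tempfile


def open(path, mode="r", **kwargs):
    """Module-level open() that shadows the builtin; text files default to UTF-8."""
    if "b" not in mode:
        kwargs.setdefault("encoding", "utf-8")
    # builtins.open is the original function, whatever the global name is bound to.
    return builtins.open(path, mode, **kwargs)


path = os.path.join(tempfile.mkdtemp(), "demo.txt")
with open(path, "w") as f:  # calls the wrapper defined above
    f.write("hello")
with open(path) as f:
    text = f.read()
print(text, open is builtins.open)
```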

It turns out that there are a lot of corner cases in any real definition of what is contained in the standard library, however, and thus what would appear in a hypothetical stdlib module (or namespace). Barnert asked a series of questions about whether to include platform-specific modules (e.g. framebuf for MicroPython or Apple's PyObjC shipped with Python on macOS), platform-specific removals (e.g. Linux distributions that ship separate packages for parts of the "standard library"), language-internal modules (e.g. __future__, and C-language "modules" like _datetime or _compression), and more. The overarching question would seem to be: is the standard library the same everywhere or is it tuned to a particular environment?
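
To see how much the answer depends on the environment, one can probe for a few of the kinds of modules mentioned above. This is a rough sketch: the candidate list is an arbitrary sample chosen for illustration, and the results will differ by platform, Python version, and distribution.

```python
import importlib.util

# Arbitrary sample: cross-platform, Windows-only, Unix-only,
# language-internal, C accelerator, and MicroPython-specific names.
candidates = ["json", "winreg", "termios", "__future__", "_datetime", "framebuf"]


def is_importable(name):
    """Return True if the import system can locate a top-level module."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


availability = {name: is_importable(name) for name in candidates}
for name, found in availability.items():
    print(f"{name:12} {'available' if found else 'missing'}")
```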

Barker had a set of answers for those questions but, as might be guessed, others differed. Beyond that, though, D'Aprano objected to the "scope-creep" inherent in any attempt to define the standard library more precisely. CPython provides the reference implementation for the language and other implementations should, in general, strive to ship everything that comes with CPython—unless there is a good reason not to, he said. "'Standard' doesn't mean 'available everywhere, in every version of every implementation'." He noted that getting a PEP written and approved for a stdlib namespace would be "hard enough" without adding other battles into the process.

D'Aprano had a fairly straightforward definition of what should be in the stdlib namespace: everything that is documented on the "Python Standard Library" page. But even that has exceptions in his mind:

If there's some special case (let's say, the HovercraftFullOfEels module, which we want to document but for some reason we don't want to be accessible in the stdlib namespace), then it's fine for it to be left out (and documented as such).

But Barker is not afraid of widening the scope to refine the definition of the standard library. He sees it as an opportunity, and one that would be "far more useful than simply providing a new namespace for all the cruft that is already in there". Reusing the name "standard library" may not be the best way forward; he suggested using "common library" to denote the subset of what ships with CPython that is expected to be available "everywhere". It is a subject with lots of potential arguments, however:

As someone said -- there is room for a lot of bike shedding around the edges -- so be it. If someone takes up the mantle and makes a test implementation and starts a PEP, then we will have a framework for that bike shedding.

That "someone" will not be him and Janhangeer has not indicated any interest in the larger idea. Overall, Barker's goal seems to be clearly delineating the "things that are really designed to be generically useful and counted on everywhere" versus those that are simply shipped with CPython. He noted the discussions about removing modules from the standard library and thought his idea might provide a middle ground of sorts; things could be moved from the common library to the standard library as part of the deprecation-signaling process. "The fact is that Python is almost entirely platform agnostic, which is a really great feature -- I'm suggesting it would be a tad better if the non-standard parts were more clearly labelled, that's all."

Chris Angelico pointed out some areas where it would be difficult to tease apart the platform specificity in the standard library and that's about where the conversation ended. Part of the problem with any attempt to tackle the standard library is the historical accretion it has undergone. If Python were somehow magically rebooted today, the standard library would likely look much different. It might well, in fact, look a lot like what Barker is advocating, though probably even more minimal than that. Would the Python core developers still want to design a "batteries included" standard library if they started over?

The other place where reworking the standard library runs aground is the problem of backward compatibility. After the huge mess that the Python 3 transition caused, which is, of course, still being felt today and likely will be for the next five years or more, one would guess the core developers will be extremely careful about breaking things moving forward. That makes it hard to do much more than slowly deprecate a fairly small number of modules over the coming years; a true rework of the standard library seems like something we will not be seeing anytime soon, if ever.


Index entries for this article
Python: Standard library



Posted Jul 24, 2019 21:12 UTC (Wed) by Kamilion (subscriber, #42576) [Link] (2 responses)

This is one of those spots where MicroPython really shines. For some reason, the unix/windows port doesn't have the help command defined like the embedded images do. It's not hard to turn on: pop into ports/unix/mpconfigport.h and look for the MICROPY_PY_BUILTINS section:
#define MICROPY_PY_BUILTINS_HELP            (1)
#define MICROPY_PY_BUILTINS_HELP_MODULES    (1)
This'll enable the enumeration of the built-in upy modules available:
MicroPython v1.11-167-g331c224e0-dirty on 2019-07-24; linux version
Use Ctrl-D to exit, Ctrl-E for paste mode
>>> help('modules')
__main__          micropython       uheapq            uselect
_thread           sys               uio               usocket
array             termios           ujson             ussl
btree             ubinascii         umachine          ustruct
builtins          ucollections      uos               utime
cmath             ucryptolib        upip              utimeq
ffi               uctypes           upip_utarfile     uwebsocket
gc                uerrno            urandom           uzlib
math              uhashlib          ure
Plus any modules on the filesystem
>>>

Posted Jul 25, 2019 14:41 UTC (Thu) by nowster (subscriber, #67) [Link] (1 responses)

Did you know that help("modules") is available in most versions of Python?

Also that help() in an interactive session opens a help utility?

Posted Jul 25, 2019 22:35 UTC (Thu) by Kamilion (subscriber, #42576) [Link]

I did not!
(It sure took a while, and spewed a bunch of errors in the process, modules trying to load C extensions and such...)

Posted Jul 25, 2019 6:33 UTC (Thu) by oussoren (subscriber, #6039) [Link] (1 responses)

A minor nit: it is not “Apple’s PyObjC”. An old version of the project is shipped by Apple with macOS, but that’s it.

Posted Jul 25, 2019 13:46 UTC (Thu) by jake (editor, #205) [Link]

> it is not “Apple’s PyObjC”. An old version of the project is shipped by Apple with macOS

ah, ok, thanks for the correction ... i amended the text to hopefully clear that up ...

jake


Copyright © 2019, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds