Postponing some feature removals in Python 3.9

By Jake Edge
February 4, 2020

Python 2 was officially "retired" on the last day of 2019, so no bugs will be fixed or changes made in that version of the language, at least by the core developers—distributions and others will continue for some time to come. But there are lots of Python projects that still support Python 2.7 and may not be ready for an immediate clean break. Some changes that were made for the upcoming Python 3.9 release (which is currently scheduled for October) are causing headaches because support for long-deprecated 2.7-compatibility features is being dropped. That led to a discussion on the python-dev mailing list about postponing those changes to give a bit more time to projects that want to drop Python 2.7 support soon, but not immediately.

There will actually be one final release of Python 2, Python 2.7.18, in April. It is something of a celebratory release that will be made in conjunction with PyCon. There were some fixes that accumulated in the branch between the 2.7.17 release in October and the end of the year, so those fixes will be flushed and the branch retired. Other than the release itself, no other changes will be allowed for that branch in 2020.

Compatibility

Fedora has recently started testing the Python 3.9 alpha releases in preparation for shipping 3.9 with Fedora 33, which is due around the end of the year. Victor Stinner and Miro Hrončok reported that a lot of packages will not build with the new version due to dropping the compatibility with Python 2.7. When that was posted on January 23, there were more than 150 packages broken by these changes, so they suggested that a handful of the ones causing the most problems be reverted.

Miro and me consider that Python 3.9 is pushing too much pressure on projects maintainers to either abandon Python 2.7 right now (need to update the CI, the documentation, warn users, etc.), or to introduce a *new* compatibility layer to support Python 3.9: layer which would be dropped as soon as Python 2.7 support will be dropped (soon-ish).

As described in the message, many existing Python packages have code to handle differences between Python 2.7 and Python 3.x, but the changes currently in 3.9 might require another layer of compatibility code—one that may be short-lived. Instead of requiring that all of these other developers make those changes immediately, they said that it makes more sense for the core language to maintain the compatibility for one more release:

While it's certainly tempting to have "more pure" code in the standard library, maintaining the compatibility shims for one more release isn't really that big of a maintenance burden, especially when comparing with dozens (hundreds?) of third party libraries essentially maintaining their own.

One of the examples provided is for the Sequence abstract base class (ABC) that can be used to create types similar to lists and tuples. In Python 2.7, it lives in the top level of the collections module, but for Python 3.3, the ABCs moved into collections.abc, though the top-level aliases were maintained. Those aliases are slated to be removed for 3.9, which will mean adding code like the following for projects that are still supporting the older version:

    try:
        from collections.abc import Sequence
    except ImportError:
        # Python 2.7 doesn't have collections.abc
        from collections import Sequence

If the developers of the third-party packages end up dropping support for 2.7 over the next year or so, then they will have needlessly added code like that when they could simply have put in the .abc wherever it was needed. While fixes are in progress for many of the projects, it will take time to get them out there, so Python 3.9 support could lag, which is probably not what the core developers want. Toward the bottom of the message, they list five incompatible changes (from the list in the in-progress Python 3.9 release notes); delaying those would eliminate most of the problems.
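To make the cost concrete, here is a minimal sketch of a project that defines its own list-like type and wants to run on both 2.7 and 3.x. The `Playlist` class and its contents are hypothetical; only the import guard comes from the example above.

```python
try:
    from collections.abc import Sequence
except ImportError:
    # Python 2.7 doesn't have collections.abc
    from collections import Sequence


class Playlist(Sequence):
    """Hypothetical read-only, list-like container."""

    def __init__(self, tracks):
        self._tracks = list(tracks)

    # Sequence only requires __getitem__ and __len__ ...
    def __getitem__(self, index):
        return self._tracks[index]

    def __len__(self):
        return len(self._tracks)


playlist = Playlist(["intro", "verse", "outro"])
# ... and supplies mixin methods such as index(), __contains__, and __reversed__.
print(playlist.index("verse"), "outro" in playlist, list(reversed(playlist)))
```

Once 2.7 support is gone, the whole try/except collapses to the single `collections.abc` import, which is why the guard is likely to be short-lived.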

Reaction

The reaction was somewhat mixed. Steering council member Barry Warsaw was in favor of postponing those changes. Since Python has recently moved to a one-year cadence (down from 18 months), that shortened the time frame for projects to make these kinds of changes, he said. "And if it helps with the migration off of Python 2.7, then +1 from me."

Others were concerned that doing so would simply add another year to a longstanding plan to get rid of the compatibility cruft once 2.7 reached end-of-life. Ivan Levkivskyi thought that it was important to stick to the schedule that was established, but he also was not sure that delaying a year is truly beneficial: "For example, importing ABCs directly from collections was deprecated 8 years ago, what would 1 extra year change?" Eric V. Smith was also unsure about adding another year:

I think the concern is that with removing so many deprecated features, we're effectively telling libraries that if they want to support 3.9, they'll have stop supporting 2.7. And many library authors aren't willing to do that yet. Will they be willing to in another year? I can't say.

But Hrončok saw things differently:

The concern is not that they don't want to drop 2.7 support, but that it is a nontrivial task to actually do and we cannot expect them to do it within the first couple weeks of 2020. While at the same time, we want them to support 3.9 since the early development versions in order to be able to detect regressions early in the dev cycle.

Given that, Smith said that he was not opposed to a postponement as a one-time thing to help those projects in the interim. He was guessing that deprecation warnings from Python 3 had been ignored in order to continue supporting 2.7, so a transition period is needed. It should be noted that deprecation warnings were not as prominent prior to the Python 3.7 release in 2018. Council member Brett Cannon concurred with postponing the removals:

I'm also okay with a one-time delay in removals that are problematic for code trying to get off of Python 2.7 this year and might not quite cut it before 2021 hits. I'm sure some people will be caught off-guard once 3.9b1 comes out and they realize that they now have to start caring about deprecation warnings again. So I'm okay letting 3.9 be the release where we loudly declare, "deprecation warnings matter again! Keep your code warnings-free going forward as things will start being removed in 3.10".
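For projects that want to keep their code "warnings-free", one option is to promote deprecation warnings to errors in the test suite. The following is only a sketch using the standard warnings module; `legacy_helper()` and `call_strictly()` are hypothetical names standing in for a deprecated API and a project's own test helper.

```python
import warnings


def legacy_helper():
    # Hypothetical function standing in for a deprecated library API.
    warnings.warn("legacy_helper() is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def call_strictly(func):
    """Run func with DeprecationWarning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return func()


try:
    call_strictly(legacy_helper)
    outcome = "clean"
except DeprecationWarning as exc:
    outcome = "deprecated: {}".format(exc)

print(outcome)
```

The same filter can be applied to a whole run without touching the code, for example with `python -W error::DeprecationWarning` or the PYTHONWARNINGS environment variable.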

Paul Moore was concerned that it will just give projects more time to procrastinate: "I think that far more people will see this as yet another delay before 2.7 dies, and treat it as one more reason to do nothing". He was not opposed to the specific changes suggested, however. But Serhiy Storchaka thought that postponing would just lead to problems when it came time to put together 3.10, 3.11, and beyond. Also, the deprecations will help surface code that is no longer maintained:

I consider breaking unmaintained code is an additional benefit of removing deprecated features. For example pycrypto was unmaintained and insecure for 6 years, but has 4 million downloads per month. It will not work in 3.9 because of removing time.clock. Other example is nose, unmaintained for 5 years, and superseded by nose2.

Guido van Rossum cautioned against that approach, though he admitted to making similar statements in the past:

I now think core Python should not be so judgmental. We've broken enough code for a lifetime with the Python 2 transition. Let's be *much* more conservative when we remove things from Python 3.

Overall, the sense is that the specific reversions are reasonable, even if there are some concerns about delaying them for another release—thus sending the wrong signal to some projects. Stinner is also a steering council member, so it would seem there is a majority on that body in favor of pushing them back to 3.10, which will presumably come out in late 2021. It's hard to see things pushing out further than that, though, so projects should definitely take the opportunity to work out their strategy with regard to 2.7 compatibility.

Though Python 2.7 is "dead" in some sense, there are differing ideas of what that means in practice. There will certainly still be an enormous amount of Python 2 code running at the end of 2020; in fact, there will undoubtedly still be a lot of that code running at the end of the 2020s, but presumably much less at that point. The core development team has long been ready to leave that legacy behind, but smoothing the path for those who are not quite ready to come to terms with the demise of the Norwegian Blue ("beautiful plumage") is seemingly for the best.





Postponing some feature removals in Python 3.9

Posted Feb 6, 2020 17:40 UTC (Thu) by logang (subscriber, #127618)

I really wish more projects would take a harder stance against breaking users. I very much appreciate Linus's stance on this. You can't remove a feature until you can reasonably make the case that nobody is using it.

Furthermore, making a change to intentionally break "unmaintained" projects is insane. Imagine if the Linux Kernel did that: "oh your software doesn't run anymore? I guess it's unmaintained and you should stop using it." Madness.

Postponing some feature removals in Python 3.9

Posted Feb 6, 2020 21:09 UTC (Thu) by amarao (guest, #87073)

I am definitely a slowpoke on py2/3 migration, but after I migrated a medium-sized project I was amazed at how many small things were broken. max() has different behavior for None, ValueError is no longer a child of StandardError, exceptions no longer have .message, argparse now makes subparsers optional... They aren't big things, but annoying for real.

At the same time python 3 still confuses bools and ints. This code wasn't fixed in py3:

>>> a = {1: 'first', True: 'second'}
>>> a[1]
'second'

Postponing some feature removals in Python 3.9

Posted Feb 7, 2020 16:03 UTC (Fri) by anselm (subscriber, #2796)

There's nothing to fix because Python does not have a separate Boolean type. bool is a subclass of int which coerces values to True or False according to the rules, as in

>>> print(bool(333))
True
>>> print(bool([]))
False
>>> print(bool({'a': 'b'}))
True

But True and False are just aliases for 1 and 0 that look different when you print them but still work as integers, hence

>>> print(True + 2)
3

(see The Python Library Reference).

In your dict example, Python works exactly as the documentation promises:

If a comma-separated sequence of key/datum pairs is given, they are evaluated from left to right to define the entries of the dictionary: each key object is used as a key into the dictionary to store the corresponding datum. This means that you can specify the same key multiple times in the key/datum list, and the final dictionary’s value for that key will be the last one given.
Python Language Reference, section 6.2.7 (Dictionary Displays)

Since 1 and True both evaluate to 1, the second key/datum pair wins.
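The behavior described above can be checked directly. This is a small sketch; the tuple-keyed dictionary at the end is an assumption about one possible workaround, not something proposed in this thread.

```python
# True and 1 compare equal and hash identically, so a dict treats them
# as one key.
assert True == 1 and hash(True) == hash(1)
assert isinstance(True, int)

a = {1: 'first', True: 'second'}
print(a)        # a single entry: the first key object, the last value
print(len(a))

# If the two must stay distinct, one option (an assumption, not part of
# the discussion above) is to key on the type as well as the value.
b = {(int, 1): 'first', (bool, True): 'second'}
print(b[(int, 1)], b[(bool, True)])
```

The key that survives is the first one written (the integer 1), while the value is the last one given.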

Postponing some feature removals in Python 3.9

Posted Feb 6, 2020 21:21 UTC (Thu) by kjp (guest, #39639)

Maybe people just need to accept that python is an *unstable* *scripting language* and need to stop writing business critical applications in it. Or anything critical that needs to stay working in another year, I guess. You get what you pay for, I guess.

Postponing some feature removals in Python 3.9

Posted Feb 6, 2020 23:03 UTC (Thu) by logang (subscriber, #127618)

If they keep making decisions like this then, yes, I agree!

Shell keeps working

Posted Feb 9, 2020 0:06 UTC (Sun) by david.a.wheeler (subscriber, #72896)

Meanwhile, shell scripts written in the 1980s still work unchanged.

This is madness. The point of a programming language is to solve problems, not to impose pain on its users.

The fact that something was "deprecated 8 years ago" is irrelevant. Transitioning from Python 2 to Python 3 wasn't viable for years, and even now it's a huge cost with *no* user benefits (massive increased speed or huge new functionality).

Python3 by itself is a fine language, but the 2/3 transition was a complete botch. I think the developers of Python need to give the users of Python more time & tools to actually do the transition. There's still a huge amount of Python2-only code, *because* the developers of Python made it hard to transition. Making the transition harder does not help, it just gives a reason to stay on Python2. And yes, many projects are (still) staying on Python2.

Shell keeps working

Posted Feb 9, 2020 0:57 UTC (Sun) by NYKevin (subscriber, #129325)

Frankly, if you want to write a new shell script like they did "in the 1980s," it's quite painful (unless you are referring to 1989 specifically, which was the only year "in the 1980s" when bash existed). Here are some things you can't use because they're not in POSIX:

- find -print0 (and xargs -0)
- find -exec
- Extended globbing
- Any other way of reliably walking a file tree (except for file in "$1"/* in a recursive shell function)
- The [[ conditional ]] syntax (have to use [ this style ] or test(1) instead, which is less powerful and more error-prone)
- Arrays
- Type safety (everything is a string)
- PCRE anything (grep, etc.)

Yes, backcompat is important. But tools have gotten far better since the 80s. It is disingenuous to claim that there are "no benefits" to Python 3; a massive amount of new functionality (since 2.7) is recorded at https://docs.python.org/3/whatsnew/index.html, and most of it has not been backported (or has only been backported as an external third-party dependency which you have to manage). Of course, much of that functionality may be irrelevant to your particular interests, but judging by https://python3statement.org/, a large number of library authors either disagree with you or just don't care.

Shell keeps working

Posted Feb 9, 2020 15:41 UTC (Sun) by rgmoore (✭ supporter ✭, #75)

> Frankly, if you want to write a new shell script like they did "in the 1980s," it's quite painful

You know what else is really painful? Having to rewrite working code because language maintainers decided to stop supporting a language feature it depended on. That's the complaint, here, not that it will be harder to write new code going forward. BASH is great not just because it added a bunch of new features but because it maintains backward compatibility. For example, if you invoke it as /bin/sh rather than /bin/bash it tries to mimic the behavior of older shells just to make it easier to use old code. That's the kind of thing programming languages ought to do. Instead, Python is deliberately creating a burden on users by breaking backward compatibility.

Shell keeps working

Posted Feb 11, 2020 1:50 UTC (Tue) by NYKevin (subscriber, #129325)

Well, you got Python for free, and if you want to keep supporting it yourself, you can do so. Microsoft is now turning down support for Windows 7, and people *paid* for that. There's been some grumbling,* but nothing like the vitriol that I've seen from the Python 2-ists.

Moral: If you don't want random people second-guessing your every move, write closed-source software and charge for it.

* In the circles where Microsoft products are actually used, which I imagine has relatively little overlap with the typical LWN crowd.

Shell keeps working

Posted Feb 11, 2020 4:37 UTC (Tue) by Cyberax (✭ supporter ✭, #52523)

The thing is, Win7 to Win10 upgrade is like Py2.6 to Py2.7 upgrade. Sure, some small stuff breaks but you don't need a wholesale rewrite of software.

Shell keeps working

Posted Feb 17, 2020 16:59 UTC (Mon) by david.a.wheeler (subscriber, #72896)

> Frankly, if you want to write a new shell script like they did "in the 1980s," it's quite painful (unless you are referring to 1989 specifically, which was the only year "in the 1980s" when bash existed). Here are some things you can't use because they're not in POSIX ...

You're proving my point for me.

New capabilities are routinely added to the Linux kernel, and shell, that make life more pleasant. But notice that all those cases did *not* require rewriting the universe of all existing software.

If you have an *already* *existing* shell script, there's no need to waste time rewriting the code because POSIX and various implementations added new capabilities. New versions of the Linux kernel, and new implementations of shell, are deliberately nice to real users with existing code.

Compare that with Python3. Shell existed long before Python existed, and code written in shell still works... while code written afterwards in Python cannot.

The Python2/3 transition debacle has probably cost over $10 billion in lost productivity, because people have to rewrite code for no functional benefit. "Running on Python3 instead of 2" is not a benefit; they already had working code. "Massively improved speed due to parallelism or JIT compilation" would be a benefit, oh whoops, you don't get those with Python3.

I *like* Python3, and use it. But this hostility to end-users with existing code is not acceptable. People have actual problems to solve. Climate change is a real problem for example, and people need to focus on building on existing programs to solve them, not waste their time converting programs that already work. People don't need a rat-race of backwards incompatibility.

The transition to Python3 would have mostly happened 10 years ago if there had been some simple imports that made it easy to start using Python3 within Python2 code, enabling a slow transition. Indeed, it's not clear that there was a need to *ever* drop support for Python2 from Python3.

Shell keeps working

Posted Feb 10, 2020 11:49 UTC (Mon) by anselm (subscriber, #2796)

> a huge cost with *no* user benefits (massive increased speed or huge new functionality)

That depends. Recent versions of Python 3 are noticeably faster than Python 2.7 in various important respects, and one could argue that some of the new functionality in Python 3, like async support, isn't entirely insignificant, either.

Shell keeps working

Posted Feb 10, 2020 15:22 UTC (Mon) by ibukanov (subscriber, #3942)

Python3 is still slower on startup than Python2 making justification to transition to Python3 even harder.

Shell keeps working

Posted Feb 10, 2020 15:32 UTC (Mon) by anselm (subscriber, #2796)

Many of my Python programs are long-running servers. If they take a little longer starting up this is irrelevant in the face of various runtime speed and memory-use improvements in Python 3 that apply for the whole duration that they are running. (YMMV.)

Shell keeps working

Posted Feb 10, 2020 17:35 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)

Py3's speed became competitive with Py2 when they switched to utf-8 for internal string representation. Now it's about even with Py2 (usually a bit faster), but I haven't seen any drastic improvements in real-world programs.

Shell keeps working

Posted Feb 10, 2020 20:31 UTC (Mon) by rschroev (subscriber, #4164)

Python doesn't use utf-8 for its internal string representation. PEP 393 [1] describes the representation in use since 3.3:
"The Unicode string type is changed to support multiple internal representations, depending on the character with the largest Unicode ordinal (1, 2, or 4 bytes). This will allow a space-efficient representation in common cases, but give access to full UCS-4 on all systems.
...
With the proposed approach, ASCII-only Unicode strings will again use only one byte per character; while still allowing efficient indexing of strings containing non-BMP characters (as strings containing them will use 4 bytes per character)."

[1] https://www.python.org/dev/peps/pep-0393/
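The effect of the flexible representation can be observed from Python itself. The sketch below assumes CPython 3.3 or later; sys.getsizeof() reports implementation-specific sizes, so only the relative ordering is meaningful, and the three sample strings are arbitrary choices for illustration.

```python
import sys

n = 1000
ascii_text = "a" * n             # all code points below 128
bmp_text = "\u20ac" * n          # euro sign, needs 2 bytes per character
astral_text = "\U0001F40D" * n   # snake emoji, outside the BMP

# Same length in characters, but different per-character storage widths.
for label, text in [("ascii", ascii_text), ("bmp", bmp_text), ("astral", astral_text)]:
    print(label, len(text), sys.getsizeof(text))
```

All three strings have the same length, yet the reported size grows with the widest code point each one contains.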

Shell keeps working

Posted Feb 10, 2020 21:34 UTC (Mon) by Cyberax (✭ supporter ✭, #52523)

Yes, I meant this change. Thanks for correction!

Shell keeps working

Posted Feb 17, 2020 17:12 UTC (Mon) by david.a.wheeler (subscriber, #72896)

> Recent versions of Python 3 are noticeably faster than Python 2.7 in various important respects, and one could argue that some of the new functionality in Python 3, like async support, isn't entirely insignificant, either.

Improved speed is nice, but you normally wouldn't choose Python if speed matters anyway. Python is dog-slow; the only fast things are the libraries written in other languages. That's fine for many applications, since many applications don't really need top speed or they can be plugged in.

Async is also a minor nice improvement. But async is a single-threaded, single-process design. Which makes sense, because CPython still has the GIL. BTW, there are async libraries for Python2, so async is not really a big reason to go to Python3. Indeed, it's even *harder* to switch existing Python2 with async to Python3.

If Python3 had real guaranteed parallelism with threads, or performance to compete with Go or Rust or C, that'd be really something. But it doesn't. It's a minor incremental improvement with a massive transition cost.

Again, I *like* Python3, and I use it. The problem isn't the destination. The problem was the botched transition process. If the Python developers want more people to transition from Python2 to Python3, they need to (1) make it easy to transition (by making it easy to write code that runs in both), and (2) provide a *reason* other than "3 is larger than 2". Dramatic speed improvements, real threading, or other serious functionality would be argument #2. But frankly, what's more important is making it easy to transition. Even if there's a reason, that may not be enough. If I have to spend $500,000 to convert a Python2 program to work on Python3, I bet I can find a long list of better uses of the money. Many organizations will have to spend over $10 million if they want to transition their code from Python2 to Python3, and that's not a price everyone can afford.

Shell keeps working

Posted Feb 17, 2020 17:12 UTC (Mon) by david.a.wheeler (subscriber, #72896)

Quick note: the $10 million is a guess, I don't have real figures. But I think it's an informed guess.

Shell keeps working

Posted Feb 17, 2020 23:30 UTC (Mon) by anselm (subscriber, #2796)

Transitioning from Python 2 to Python 3 is only a problem if you actually started out on Python 2. At work we converted a reasonably big code base from Python 2 to Python 3 during the course of last year; sure, it wasn't a pleasant experience, but on the other hand it wasn't an insurmountable obstacle either. But at the same time we have been writing new code for different projects in Python 3 for years, and that works for us just fine.

Samba's Python2 -> Python3 costs

Posted Feb 21, 2020 8:00 UTC (Fri) by abartlet (subscriber, #3928)

This is how I feel about it also.

Samba converted to Python3 because Red Hat, SuSE and (my employer) Catalyst were generous enough to put engineering resources into it.

That time spent did mean that Samba improved - we upgraded to a new version of our build system (Waf) and with my Samba hat on I insisted that untested subsystems be tested before being converted. New developers also grew in experience through the time spent.

But it was a lot of time, and while there are a few nice new Python3 features we could take advantage of, essentially up to now Samba has had no specific benefit from Python3 in particular over what was working before.

I've said for a while that I think at least four engineer-years were sunk into the transition, particularly but I'm sure not exclusively from the companies listed above. I do wonder what else we could have done with that time.

Postponing some feature removals in Python 3.9

Posted Feb 10, 2020 23:35 UTC (Mon) by ras (subscriber, #33059)

> I really wish more projects would take a harder stance against breaking users. I very much appreciate Linus's stance on this.

Linus's stance on the syscall ABI being backward compatible makes Linux a usable operating system. I hate to think what life would be like without it.

That sounds extreme, but it's like that because the kernel devs refuse to support a stable kernel space ABI. That essentially means there are no 3rdparty drivers [0], and combine that with nobody backporting drivers to older kernels and it means when you upgrade your hardware, you must upgrade your kernel to a version that supports it. We acquire new hardware all the time, so while we run the same user space programs on that hardware the kernel changes often.

In my experience an upgrade to a new version of software takes enormous amounts of my effort. If I had to go through that effort every time I chose a new kernel, it would be crippling. But I don't, because the kernel is syscall API backward compatible. I can happily slide between versions and only focus on how well the hardware does or doesn't work with that version, safe in the knowledge the binaries running on it won't notice.

The kernel preserves syscall API compatibility to an extent that is almost mind boggling. Once I had some young 20 yo tell me he had to upgrade his 32bit userspace to a 64bit userspace because he wanted to run a 64bit kernel. That's rubbish of course, and I told him so. After hearing that, the look on his face said this grey beard had clearly lost more than just his hair (and the beard). So I pulled down the oldest 32 bit version of Debian I could find (Woody or something), debootstrap'ed it to a chroot running on a shiny new 64bit kernel, and of course a creaky version of vim almost as old as me sprang into life like some newborn, working as well as it always did.

That degree of backward compatibility creates an aura that is similar to the airlines' weapon ban including nail clippers. You don't care about nail clippers themselves, but the signal that banning nail clippers sends is that we can be absolutely certain anything they do care about won't be slipping through. And equally, now that the 20yo has seen a 64bit Linux kernel run Woody, he will never, ever be doubting Linux's commitment to syscall ABI compatibility while Linus is at the wheel guarding it, and he too will now flip between kernel versions to get the hardware support he needs without a care in the world.

Although they are not obviously connected, without the absolutely stable syscall ABI I don't think the kernel's current insistence on absolute freedom to change the kernel ABI at will and without notice, and the heavy distaste of out of tree code would be sustainable.

[0] That's gilding the lily, as we have 3rdparty drivers like NVidia. But they are not popular, and if I do use NVidia drivers they are invariably the thing that breaks when I move between kernel versions.

Postponing some feature removals in Python 3.9

Posted Feb 20, 2020 22:47 UTC (Thu) by abartlet (subscriber, #3928)

I strongly agree.

We don't mind making modern Samba versions do the right thing in python, but what is harder is changing history. With these changes (open(): remove 'U' mode, deprecated since Python 3.3) old Samba versions won't build on modern systems.

This makes Samba development harder, as it makes it more painful to bisect back to a failing revision.

The worst part of this is that working python 3.x code is no longer working Python 3.x code, let alone the assumption that there is no need to be at all compatible with Python 2.7 any more. Samba still builds on Python 2.6 and is likely to at least build on Python 2.7 for quite some time, so to suggest that just because upstream support ended in 2020 that the use case has gone away feels quite wrong. (Samba only operates at runtime with Python 3.5, but building eg smbd, written in C, uses a python build system).
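For readers hitting the same open() change, here is a hedged sketch of code that reads a file with universal newlines without the 'U' flag. It is an illustration only, not how Samba's build system handled it; the file name and contents are made up, and io.open() is used on the assumption that the code must also run on Python 2.6 and 2.7, where it provides the Python 3 semantics.

```python
import io
import os
import tempfile

# Write a small file with mixed line endings to read back.
path = os.path.join(tempfile.mkdtemp(), "mixed.txt")
with io.open(path, "wb") as handle:
    handle.write(b"one\r\ntwo\rthree\n")

# Was: open(path, "rU"). In text mode, io.open() with the default
# newline=None already translates "\r\n" and "\r" to "\n".
with io.open(path, "r", encoding="ascii") as handle:
    text = handle.read()

print(repr(text))
```

On Python 3 the built-in open() is the same function as io.open(), so the 'U' flag was already redundant there; fixing new code is the easy part, and it does nothing for old revisions that still pass the flag.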


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds