
Fedora ponders the Python 2 end game

By Jonathan Corbet
August 1, 2017
Deadlines have a way of sneaking up on people. For example, not everybody is ready for the fact that, sometime in 2020, support for the Python 2 language will come to an end. This deadline is not exactly news; it was established in 2014 (having been moved back five years from its original 2015 date). Even so, some developers may not appreciate how close that date is. Work that is being done in the Python community and the Fedora distribution shows that even the developers behind the change haven't entirely figured out how the transition will play out.

On July 27, Miro Hrončok approached the Fedora community with a draft plan for finalizing Fedora's switch to Python 3. While Fedora ostensibly switched to Python 3 as the default version of the language with the Fedora 23 release in 2015, in practice this switch left a lot of work undone. In particular, despite the "default" status of Python 3, the unversioned term "python" still means Python 2, even in the upcoming Fedora 27 release. If one asks the packaging system to install, say, the python-pip package, the Python 2 version will be installed, and typing "python" gets the Python 2 interpreter. Python 3 may be the "default", but Python 2 is still very much present.

Fedora's rules say that packages with "python" in their name should use "python2" or "python3" explicitly, with some RPM macro magic causing "python2-whatever" to also appear as "python-whatever". Dependencies listed within packages are also supposed to use version-explicit names. Progress has been made in that direction, but there are numerous packages that do not yet follow these guidelines. Indeed, it would seem that there are nearly 1,000 packages that are still out of compliance. The first phase of the transition plan involves fixing all of those packages, as well as ensuring that all Python scripts use an explicit version number in their "shebang" lines. Completion of this phase is planned for early 2019, meaning that packages will need to be fixed at a rate of roughly two every day.
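The shebang-fixing part of phase one can be illustrated with a small sketch (the function name and regex here are illustrative, not Fedora's actual tooling): flag any script whose first line invokes an unversioned python, directly or via env.

```shell
#!/bin/sh
# check_shebang: warn if a script's first line invokes an unversioned
# "python". Illustrative only; the real Fedora work happens in packaging
# guidelines and review tooling, not a script like this.
check_shebang() {
    head -n 1 "$1" | grep -qE '^#!.*/(env +)?python([^0-9]|$)' &&
        echo "$1: unversioned python shebang"
}
```

Run over a file starting with "#!/usr/bin/python" it prints a warning; a file starting with "#!/usr/bin/python2" passes silently.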

Thus far, what is described is simply work; somebody has to do it, and there is a lot of it, but it's not particularly controversial. The second phase, intended for the Fedora 32 release in early 2020, has raised a few eyebrows instead. On a current Fedora system, /usr/bin/python, if it exists at all, will run the Python 2 interpreter. As of Fedora 32, it will be changed to run Python 3, with the idea that Python 2 will be removed altogether shortly thereafter. This is a change with more user-visible effects than simply fixing some package names and dependencies.

Redirecting /usr/bin/python to Python 3 will break any scripts starting with "#!/usr/bin/python" that only work with Python 2. The plan, of course, is that no such scripts should exist by 2020, but the real world has a discouraging record of ignoring such plans. So it is unsurprising that some commenters see this change as an undesirable compatibility break. Colin Walters pointed at ansible in particular, saying that this change would break centralized system administration across multiple types of host. He went on to suggest that the /usr/bin/python change should not happen "until RHEL7 is near EOL". That, according to the posted road map, is expected to happen in mid-2024, assuming one doesn't count the "extended" support period. Fedora seems unlikely to want to wait that long.

An alternative to redirecting /usr/bin/python would be to simply not provide that link at all and require that all scripts explicitly invoke python2 or python3. Fedora partially implements that approach now, in that a system without Python 2 will not have a /usr/bin/python link. The problem with that approach is that it, too, breaks all scripts with a /usr/bin/python shebang line, even those that would have otherwise worked. As Nick Coghlan put it:

It's only /usr/bin/python itself that still presents an unsolved problem, since the status quo (not providing it at all) is even more user hostile than pointing it at a modern version of Python 3 that includes the various changes aimed at increasing the size of the common subset of Python 2 & 3 (e.g. explicit unicode literals in 3.3, binary codecs in 3.4, binary mod-formatting in 3.5, Fedora's backport of implicit locale coercion to 3.6).

In other words, pointing /usr/bin/python to Python 3 improves the chances that something will work, especially if the script has been updated or its usage is addressed by the compatibility measures that have been added to Python 3 over the years.

The other reason to keep /usr/bin/python around is the plethora of books and other materials telling readers to simply type "python" to get an interpreter. Having that command actually work is friendlier to people trying to learn the language, and there is value in having it invoke the current version of Python.

As it happens, there is a Python Enhancement Proposal (PEP) describing how the python command should work: PEP 394. Coghlan is currently reworking that PEP with the final days in mind. In this PEP, /usr/bin/python can still point to either version of the language, but it should be either Python 2.7 or at least 3.6 so that the bulk of the compatibility features are present. If /usr/bin/python points to Python 3, it should be possible for the system administrator to redirect it back to Python 2 without breaking the system. Script authors are advised to be explicit about which version of Python they want.

The PEP makes it clear that its advice will change in the future:

It is anticipated that once the Python 2.7 branch is no longer receiving even security updates, we will actively recommend against platforms providing a Python 2.7 stack at all, let alone as the default target of the unqualified "python" command.

A draft version of the new PEP is available for those who would like to read the whole thing.

It is probably fair to say that nobody expected the Python 3 transition to be as long or as difficult as it turned out to be. But that transition is happening, and increasing numbers of programs and libraries have made the switch. Most distributors have been laying the groundwork for the transition for some time and are now starting to think about how to finish the job. Most users will have no choice but to follow — once the deadline gets close enough.



Fedora ponders the Python 2 end game

Posted Aug 1, 2017 16:34 UTC (Tue) by fratti (guest, #105722) [Link] (28 responses)

For what it's worth, Arch Linux made the switch of pointing "python" to "python3" years ago, and while that did break many things and annoy many users, the rest of the distributions benefited from people already being aware that "python" does not always mean "python2", and from having fixed their scripts for that case.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 16:44 UTC (Tue) by josh (subscriber, #17465) [Link] (25 responses)

The problem is that that makes it much harder to write a portable Python script. Arch points /usr/bin/python at Python 3, so you can't use "#!/usr/bin/python"; some other distributions don't provide "python2" and "python3" names, so you can't always use "#!/usr/bin/python2" (or for that matter "#!/usr/bin/python3"), either.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 16:45 UTC (Tue) by josh (subscriber, #17465) [Link]

(This also applies when trying to invoke Python, such as from a Makefile or other build script.)

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 16:55 UTC (Tue) by fratti (guest, #105722) [Link] (8 responses)

As far as I know, all distributions aside from CentOS 5 (which is end of life) provide a python2 symlink. This may have been an argument years ago, but nowadays as far as I know, #!/usr/bin/env python2 is a safe bet.
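A belt-and-braces variant of that bet (the helper below is a hypothetical sketch, not something from the thread) is to pair the env shebang with a runtime check, so that a mispointed "python" fails loudly instead of half-working:

```python
#!/usr/bin/env python
# Sketch: source that parses on both Python 2 and 3, failing fast when
# the interpreter that the shebang resolved to is not a supported one.
import sys

def supported(version_info, majors=(2, 3)):
    """Return True if the interpreter's major version is in `majors`."""
    return version_info[0] in majors

if not supported(sys.version_info):
    sys.stderr.write("unsupported Python version: %d.%d\n"
                     % sys.version_info[:2])
```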

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:10 UTC (Tue) by josh (subscriber, #17465) [Link] (2 responses)

Windows has the same problem: the current Python 2 installer for Windows doesn't include a "python2.exe", and as far as I know the Python 3 installer for Windows doesn't include "python3.exe" either.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:12 UTC (Tue) by fratti (guest, #105722) [Link] (1 responses)

Windows also doesn't have shebang lines from what I can tell, so this is a moot point.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:19 UTC (Tue) by josh (subscriber, #17465) [Link]

It still runs makefiles, build scripts, and other things that may need to invoke Python.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 12:50 UTC (Wed) by ballombe (subscriber, #9523) [Link] (4 responses)

Alas, it is not a safe bet. There are more and more systems where only /bin/env exists.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 14:55 UTC (Wed) by josh (subscriber, #17465) [Link] (3 responses)

Wait, what? Which systems install env in /bin?

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 20:17 UTC (Thu) by hmh (subscriber, #3838) [Link] (2 responses)

The ones I know of have either /usr -> / or /usr/bin -> /bin symlinked, so /usr/bin/env will just work...

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 20:38 UTC (Thu) by karkhaz (subscriber, #99844) [Link] (1 responses)

Arch does it the other way (/bin is a link to /usr/bin)

$ find / -maxdepth 1 -type l -exec ls -l {} \;
lrwxrwxrwx 1 root root 7 Mar 26 22:57 /lib64 -> usr/lib
lrwxrwxrwx 1 root root 7 Mar 26 22:57 /lib -> usr/lib
lrwxrwxrwx 1 root root 7 Mar 26 22:57 /sbin -> usr/bin
lrwxrwxrwx 1 root root 7 Mar 26 22:57 /bin -> usr/bin

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 5:33 UTC (Fri) by josh (subscriber, #17465) [Link]

That's how many current distributions do it these days, including Fedora. And Debian has an option to do that, if you install the "usrmerge" package.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:00 UTC (Tue) by karkhaz (subscriber, #99844) [Link] (7 responses)

So don't write #!/usr/bin/python, either with a digit or without one. Use /usr/bin/env instead. As far as I know, this should always work (even on the distributions that don't symlink python to python2---are there really modern, not EOLed distros that don't do this?)

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:12 UTC (Tue) by josh (subscriber, #17465) [Link] (6 responses)

env doesn't help with this problem; it just does a PATH search, but you still have to tell it which binary name to invoke. That helps if python isn't in /usr/bin; it doesn't help if the name "python" invokes the wrong version of python.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:18 UTC (Tue) by liw (subscriber, #6379) [Link] (4 responses)

An env shebang also doesn't help in the case where your system has some system software installed (such as a backup application, ahem) that is written in Python, and a user writes their own Perl interpreter, names it "python", and puts it before the system Python in their $PATH. Unless the author of the backup application has foreseen this and written a Python2/Python3/Perl polyglot script, the user has shot off their entire leg. We try to make it safe for users to play with siege cannons.

System-installed software shouldn't break when a user installs things early on in their $PATH.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 17:22 UTC (Tue) by josh (subscriber, #17465) [Link]

System-installed software shouldn't use env, sure; it also knows where Python is, so it shouldn't waste the time searching for it.

But there's a limit to how much you can protect the user from their own attempts to break things.

(That said, I've certainly seen more than a few reports of brokenness that ultimately got tracked down to a user's local installation of python in /usr/local or $HOME, so I mostly agree.)

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 23:39 UTC (Tue) by flussence (guest, #85566) [Link] (2 responses)

Gentoo occasionally takes this problem to new levels of insanity:

The package manager is /usr/bin/emerge. That's a wrapper script (written in python, invoked by a C binary that chases symlinks to figure out which version of python to call it with) that chases symlinks to figure out where the real program is (which is also a python script, but can be installed for one or more of python2/3.x/pypy simultaneously).

Now and again there are support requests from users who mess up their systems by manually installing python packages as root outside the OS's control, and have ended up with something in the dependency chain installing a /usr/local/bin/emerge (also a python script) that does something completely different, but is higher priority in $PATH, and gives vague errors that don't make it immediately obvious that the wrong thing is being run.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 3:38 UTC (Wed) by wahern (subscriber, #37304) [Link] (1 responses)

POSIX created the getconf(1) utility just so you could do PATH="$(getconf PATH)" to get the default system-provided PATH that is guaranteed to provide the POSIX utilities. It's one of the rare instances where they didn't just codify existing practice. Perhaps for that reason it's an underused and virtually unknown command. Theoretically, Linux distributions could extend it so that something like getconf PYTHON2 prints the default python2 interpreter (if available). That doesn't by itself solve the problem of a simple shebang line, but I think it's the least distributions could do to meet developers half-way.[1]
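The getconf idea can be sketched in a few lines of shell; only `getconf PATH` is standard here, and the `getconf PYTHON2` extension mentioned above remains hypothetical:

```shell
#!/bin/sh
# Resolve a utility using only the vendor-default PATH from getconf,
# ignoring anything the user has prepended to their own $PATH.
resolve_default() {
    PATH="$(getconf PATH)" command -v "$1"
}

# e.g. resolve_default sh prints /bin/sh or /usr/bin/sh, depending on
# the system, even if a user has shadowed "sh" earlier in $PATH.
```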

For Lua scripts I often abuse a coincidence of shell and Lua syntax so I can do

 #!/bin/sh
_=[[
  # shell script code to locate a Lua interpreter
]]
-- Lua code

where _=[[ begins a multiline string in Lua but is a harmless assignment in shell. Because the semantics of shell syntax effectively require parsing the file line-by-line, the shell code never reaches the terminating ]] or subsequent Lua code as long as you invoke exec or exit before reaching ]]. I'm not familiar with Python syntax but perhaps something similar can be done.
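For what it's worth, the same trick does exist for Python: a header that shell parses as commands but Python parses as a throwaway triple-quoted string. A minimal sketch (the interpreter search loop is illustrative):

```python
#!/bin/sh
''':'
# The shell sees the ":" no-op above, then runs this loop; Python
# instead sees the start of one triple-quoted string covering the
# whole shell block, which it evaluates and discards.
for interp in python3 python2; do
    command -v "$interp" >/dev/null 2>&1 && exec "$interp" "$0" "$@"
done
echo "no Python interpreter found" >&2
exit 1
':'''
import sys
print("found Python %d.%d" % sys.version_info[:2])
```

As with the Lua version, the shell never reaches the closing quotes because exec (or exit) runs first.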

[1] Distributions could standardize on the "python2" command name, but you still run into the problem of a bad PATH variable. Ideally you'd be able to do something like #!/usr/bin/env command -p python2, where the -p switch to the command utility only searches the default PATH from getconf PATH. And where env(1) locates command(1), as command(1) is usually a built-in and even when it's available as an independent utility I don't know if you can expect it to be in /bin or /usr/bin--not in the same way that env(1) is reliably at the absolute path /usr/bin/env. But Linux is one of the few Unix systems that concatenate all the shebang command argument words before invoking the interpreter, so the above will work on macOS and some other kernels, but not on Linux.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 8:26 UTC (Wed) by itvirta (guest, #49997) [Link]

> Ideally you'd be able to do something like #!/usr/bin/env command -p python2

Except that you can't do that in Linux, the kernel only passes one argument from the hashbang line. I.e. the line above would run "/usr/bin/env" with "command -p python2" as one argument (plus the name of the script as the second argument). Or well, you could, if this "env" would split the argument to pieces itself.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 15:21 UTC (Wed) by drag (guest, #31333) [Link]

> That helps if python isn't in /usr/bin; it doesn't help if the name "python" invokes the wrong version of python.

It's only part of the solution. If you hardcode the paths, then it pretty much forces users to edit your program to suit their environment.

$ which python2
/home/user/.pyenv/shims/python2

$ which python3
/home/user/.pyenv/shims/python3

The situation here is that distributions really don't solve these issues for end users. If users want to be able to take full advantage of the python ecosystem for writing and running their software, they are going to have to use an alternative solution.

If you only really need to write complex python software for a particular Linux distribution (for an enterprise environment, for example), then trying to use only dependencies from the distribution is fine and is very likely to work. But if your goal is to have something that works on a wide variety of Linux distributions, then you end up stuck with a more 'DIY' approach. Making your software available via pip is really what is going to make it easiest for people to use your software at this point.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 15:10 UTC (Wed) by drag (guest, #31333) [Link] (4 responses)

> The problem is that that makes it much harder to write a portable Python script.

This is one of the reasons why people who care about portability between Linux distributions and want to be productive end up using things like pip, pyvenv, and pyenv instead of distribution-provided dependencies for anything complex.

> so you can't always use "#!/usr/bin/python2" (or for that matter "#!/usr/bin/python3"), either.

Unless you are writing scripts meant to be shipped as part of the operating system, you really, really need to avoid using any sort of hard-coded paths to python interpreters. There are absolutely meaningful differences you need to take into consideration between this sort of 'part of the OS' and 'on the OS' software. If you are writing as part of the OS distribution, use hard-coded paths. If you are writing programs as a user of the OS, then use '#!/usr/bin/env python'.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 15:39 UTC (Wed) by karkhaz (subscriber, #99844) [Link] (3 responses)

I'm now wondering whether the collective wisdom in this thread is codified anywhere? Somebody pointed out that my comment further up was wrong, and I subsequently looked for guidance. There seems to be no shortage of style guides for python code, but _distribution_ information seems to be sparse. I haven't found a document advising on the shebang line anywhere.

The Debian python packaging guidelines don't mention it, and neither does the Arch Linux one---remarkable because /usr/bin/python on Arch is a symlink to /usr/bin/python3 (which itself is a symlink to /usr/bin/python3.6). There was a bit of breakage back when Arch did that, with similar issues, I suppose, to when Debian switched their /bin/sh to a proper POSIX shell.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 16:42 UTC (Wed) by liw (subscriber, #6379) [Link] (1 responses)

I think that this is covered in the Debian Python policy: https://www.debian.org/doc/packaging-manuals/python-polic... last paragraph in the section: " Maintainers should not override the Debian Python interpreter using /usr/bin/env name. This is not advisable as it bypasses Debian's dependency checking and makes the package vulnerable to incomplete local installations of Python. "

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 18:44 UTC (Wed) by drag (guest, #31333) [Link]

Yep. This makes a lot of sense for python programs you are shipping as part of the distribution. You definitely want hard coded paths.

You want to avoid having these programs be broken by user's shell environments as much as possible.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 12:45 UTC (Thu) by ewan (guest, #5533) [Link]

> I'm now wondering whether the collective wisdom in this thread is codified anywhere?

It's somewhat codified in PEP-0394, which advises on how python should be installed (in short, don't make 'python' point to Python 3) and how to cope with the fact that it might (the less-than-completely-helpful suggestion to make anything that just asks for 'python' be compatible with both Python 2 and Python 3).

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 4:12 UTC (Fri) by anatolik (guest, #73797) [Link] (1 responses)

> some other distributions don't provide "python2" and "python3" names

Which ones do not do it, and why? There is a PEP that says all python distributions should provide a versioned python binary. Even MacPorts has pythonX nowadays.

Fedora ponders the Python 2 end game

Posted Aug 22, 2017 8:21 UTC (Tue) by mgedmin (subscriber, #34497) [Link]

Older versions of Debian and Ubuntu did not have a python2, just a python and a python3.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 6:33 UTC (Wed) by joib (subscriber, #8541) [Link] (1 responses)

The python3 version of anaconda (Anaconda3) also installs a python3 binary under the name "python". Which breaks all code that assumes "/usr/bin/env python" gets them a python2 interpreter. Very annoying.

Fedora ponders the Python 2 end game

Posted Aug 13, 2017 14:59 UTC (Sun) by Wol (subscriber, #4433) [Link]

I think the default on gentoo is now python3. I know I've been using one of the MD scripts, and have had to edit the shebang specifically to call python2.

It would certainly make life easier if all distro-supplied stuff used a shebang explicitly to call the version it wanted. But that's a lot of work catching all instances ...

Cheers,
Wol

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 18:29 UTC (Tue) by smoogen (subscriber, #97) [Link] (1 responses)

> It is probably fair to say that nobody expected the Python 3 transition to be as long or as difficult as it turned out to be.

I am going to be the exception here. It took over two decades before K&R C was no longer allowed in many C compilers, because such code was still used in a lot of deep-down infrastructure. It took over four decades for Fortran IV code to finally be end-of-lifed in various engineering suites. There are still quite a few perl4-compatible scripts in places, and the fact that every time /bin/sh doesn't act like the standard from 1984 every distributor gets reams of bug reports says how long people expect something to work.

Yes, in some cases it is because learning new code is hard and people want their 40-year-old stuff to work like it always did. However, a lot of the time there are outside factors requiring that code to act that way without change. Scientific and engineering experiments have to be rerun to make sure that the new code works the way the old code did. That might cost tens of millions (or billions, in the case of a 747 safety check) where keeping the old code running costs thousands. This means that computer languages that get used for infrastructure are going to become living fossils whether the upstream developers want it or not.

[And the upstream python developers aren't the first to complain about how slow the real world works.. every computer language writer has dealt with this over and over again.. somehow expecting that this time it won't be a problem. I expect that the only ones where it isn't a problem are the languages where the usage of it never got outside of a limited environment.]

Fedora ponders the Python 2 end game

Posted Aug 11, 2017 16:27 UTC (Fri) by Wol (subscriber, #4433) [Link]

And silly little changes to the spec cause major difficulties with the code. For example, try the following fortran code ...

      DO 10 I = 1, 0
         (do something)
   10 CONTINUE

How many times will that "do something"? If it's standards-compliant FORTRAN, it'll do it once. If it's standards-compliant Fortran, it won't do it at all.

Upgrading code between language specifications can be a nightmare of booby traps ...

Cheers,
Wol

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 19:28 UTC (Tue) by togga (guest, #53103) [Link] (60 responses)

2020 could be a good year to have migrated away from Python for new projects. I have never seen a more disruptive and painful version change in any major language, with many of the changes made for no apparent reason. Also, python3, for me, falls short in the "just works" department where python2 shined. The question is: what platform is the future for quick-and-dirty, "just works" productivity, without the performance issues of Python?

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 20:46 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (35 responses)

I know of several large Python2 codebases that are migrating to Google Go.

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 22:57 UTC (Tue) by iabervon (subscriber, #722) [Link] (19 responses)

This is the perfect time to migrate from Python 2 to Go. You'll be done just in time to find out firsthand whether Go 2 actually meets their goal of not splitting the Go ecosystem...

Fedora ponders the Python 2 end game

Posted Aug 1, 2017 23:52 UTC (Tue) by dgm (subscriber, #49227) [Link] (2 responses)

So, can we say it's time to Go, but not to Go 2? What would Dijkstra say about that?

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 3:36 UTC (Wed) by tome (subscriber, #3171) [Link] (1 responses)

He'd say ouch but he'd be unharmed.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 3:49 UTC (Wed) by cry_regarder (subscriber, #50545) [Link]

That is a fairly considered opinion.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 6:04 UTC (Wed) by togga (guest, #53103) [Link]

Fair point. Maybe a more robust solution is not a single platform but rather a "floating" set of them with some common, stable but flexible base?

Although Python was stable for a "good while", giving some nice years of productivity, it seems smart to stick with something that won't intentionally or unintentionally screw you over after a number of years due to, for instance, one person's or company's decision (I recall having seen this before...).

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 16:46 UTC (Wed) by khim (subscriber, #9252) [Link] (14 responses)

It's not that hard to "not split the Go ecosystem". FORTRAN did it (and they changed the way the language is used in pretty significant ways over time), C did it (with the transition to C99), even C++ did it (twice: with the transition to C++98 from C, and with the transition to C++11, which is NOT 100% compatible with C++98). Only the Python developers decided they didn't have to use the tried-and-true approach and instead would force everyone to rewrite everything.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 18:48 UTC (Wed) by drag (guest, #31333) [Link] (13 responses)

You pointed out several examples of programs being broken by language versioning in Fortran and C/C++ and then in the same paragraph claimed that the python developers are the only ones that are guilty of this.

That's very contradictory.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 19:53 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (10 responses)

There are degrees of breakage - nobody expects software to be 100% perfect. But C++ tried hard to preserve compatibility with C, and while it was not perfect, it was close enough that incompatibilities could be fixed easily. The gains were also quite major in the case of Fortran and the C->C++ transition.

Py3 did a huge compatibility break that required major changes and a lot of unexpected breakages. And for no real gain.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 23:19 UTC (Wed) by anselm (subscriber, #2796) [Link] (9 responses)

I don't know. For example, IIRC there are all sorts of subtle differences between C and C++, to a point where a valid C program doesn't necessarily work the same way in C++. By contrast, it is possible to write code that works in both Python 2.7 and 3.x, and the Python developers have made changes in recent versions of Python 3.x that improve compatibility even more.

Personally I prefer Python 3 because, among other things, strings work a lot better than they used to in Python 2. Making the transition is a one-time hassle but as far as I'm concerned it is worth it.
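As a small illustration of that one-time hassle, here is the sort of source that runs unchanged on 2.7 and 3.x once the `__future__` imports are in place (a generic sketch, not code from the article):

```python
from __future__ import print_function, unicode_literals

# With unicode_literals, this is a unicode string on both versions;
# encoding it yields the bytes type (called str on 2.7, bytes on 3.x).
greeting = "gr\u00fc\u00df dich"
encoded = greeting.encode("utf-8")

print(len(greeting))  # 9: character count, not byte count, on both versions
print(len(encoded))   # 11: the two non-ASCII characters take two bytes each
```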

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 0:05 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

> For example, IIRC there are all sorts of subtle differences between C and C++, to a point where a valid C program doesn't necessarily work the same way in C++

That's why even today there are C parsers in gcc and clang, and you can link together modules written in C and C++.

> By contrast, it is possible to write code that works in both Python 2.7 and 3.x

By contrast? "Normal" C code is also C++ code; all the changes and possible incompatibilities are explicitly listed in the C++ standard (annex C), and in general the "natural" case is that code written for the old version works in the new version; only some odd corner cases are broken!

Compare to python, where "normal" python 3 code is completely incompatible with python 2, and where the compatibility features that allow one to write 2/3 code only arrived later, when the developers "suddenly" discovered that people are just not in a hurry to spend countless hours doing pointless work for no good reason.

The Python 2 to Python 3 transition may not be the worst transition of its kind (the PHP 6 and Perl 6 transition attempts were even worse), but it's certainly the worst one that hasn't killed the language (PHP 6 died and Perl 6 didn't; in both cases the original implementations survived and remain in wide use).

> the Python developers have made changes in recent versions of Python 3.x that improve compatibility even more

Sure, but all that work was an obvious afterthought, while it's certainly the most important part of any such transition.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 9:31 UTC (Thu) by mpr22 (subscriber, #60784) [Link]

"Normal" C code is emphatically not C++ code, because normal C code often uses malloc() and seldom explicitly casts the return value to the desired pointer type because C's implicit conversion rules say that void * is implicitly castable to any other pointer type and programmers are often lazy. C++ discarded that implicit conversion.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 13:25 UTC (Thu) by niner (subscriber, #26151) [Link]

How can Perl 6 be worse when it's quite possible to do a piecemeal upgrade of a codebase from Perl 5 to Perl 6? If such a change is even desired, instead of just combining the best parts of both languages. No Perl 5 developer was left out in the rain, and those who want to can use Perl 6. What about this was in any way worse than leaving people who cannot upgrade behind, and forcing countless pointless man-years of effort on the rest, including distributions?

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 0:06 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

C vs. C++ breakage happened gradually, early C++ versions were pretty much "C with classes". I've seen million line C applications being translated into C++ by simply renaming the files and doing a few easy changes. And C is a compiled language, so that helps a lot.

And if everything else fails, you can always #include a C-based API with minimum fuss, even in modern C++, using 'extern "C" {}' blocks.

There's nothing comparable in the Python world. The transition was abrupt, it required quite a lot of changes, and, Python being an interpreted language, you actually have to test everything.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 0:10 UTC (Thu) by khim (subscriber, #9252) [Link] (4 responses)

> Personally I prefer Python 3 because, among other things, strings work a lot better than they used to in Python 2.

Actually, the situation with strings in python2 is awful, and python3 made it even worse. Why do you think WTF-8 was added to Rust? Why do you think Go still considers strings a sequence of bytes, with no strings attached? A world where only nice unicode strings exist is a utopia! That's why they were forced to throw away the notion that file names are strings and introduce path-like objects! And I'm sure it's not the end.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 0:37 UTC (Thu) by anselm (subscriber, #2796) [Link] (2 responses)

I can only speak for myself, but I'm way happier with strings (and byte sequences) in Python 3 than I used to be with strings (and Unicode strings) in Python 2. They pretty much seem to do what I expect them to do, and given a little care it is reasonably easy to write programs that work. Of course I may not be clever enough to appreciate how “awful” Python's string handling really is.

OTOH, I don't really care about WTF-8 in rust nor what Go considers a string because (so far) I'm not using either of those languages, and have no plans to do so in the foreseeable future.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 2:20 UTC (Thu) by khim (subscriber, #9252) [Link] (1 responses)

> Of course I may not be clever enough to appreciate how “awful” Python's string handling really is.

My favorite example was Anaconda a few years back. Pick the text installer (because you are dealing with a small VM), pick the Russian language, and do everything. On the very last screen it tries to show you the "everything is done" message, which is in KOI8-R instead of UTF-8, with an exception being thrown and the whole installation rolled back. Just PERFECT handling of strings.

> OTOH, I don't really care about WTF-8 in rust nor what Go considers a string because (so far) I'm not using either of those languages, and have no plans to do so in the foreseeable future.

That's OK. If your goal is scripts which kinda-sorta-work-if-you-are-lucky, then Python or, heck, even bash will work. If you want robustness, then Python is not for you.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 15:24 UTC (Thu) by intgr (subscriber, #39733) [Link]

People seem to have the wrong assumption about paths in Python 3. Python does actually properly handle filenames that aren't valid UTF-8; they are escaped with certain Unicode codepoints: https://www.python.org/dev/peps/pep-0383/ (I guess that's like WTF-8 in Rust). I think that's a pretty good compromise: it does the right thing with properly encoded paths (nearly all paths are) but still remains functional with paths that aren't.
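
For the curious, the PEP 383 mechanism can be demonstrated directly with the 'surrogateescape' error handler, which os.fsdecode()/os.fsencode() use under the hood; the byte values below are just an illustrative example:

```python
# PEP 383: undecodable bytes are smuggled into str as lone surrogate
# code points (U+DC80..U+DCFF) and round-trip back to the same bytes.
raw = b"caf\xe9"                          # Latin-1 'café'; 0xE9 is invalid UTF-8
name = raw.decode("utf-8", "surrogateescape")
assert name == "caf\udce9"                # the bad byte became U+DCE9

restored = name.encode("utf-8", "surrogateescape")
assert restored == raw                    # lossless round trip
```

This is how Python 3 can hand you every filename as a str while still letting you open files whose names were never valid UTF-8 in the first place.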

> On the very last screen it tries to show you "everything is done" message which is in KOI8-R instead of UTF-8 - with exception being thrown and whole installation rolled back. Just PERFECT handling of strings.

Yes, that's exactly the behavior I want. There was a bug in the program (or translation), and a good programming environment should immediately throw an error rather than proceed with some unexpected behavior. Even environments that used to play very fast and loose with types and ignore errors, like MySQL and PHP, have recently become significantly stricter. Otherwise, in complex programs, you will end up with latent errors that are much harder to debug, and often with data loss.

> If you want robustness then python is not for you.

Erm, in one breath you complain about Python being too strict and now you complain that it's not robust?

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 7:51 UTC (Thu) by roc (subscriber, #30627) [Link]

WTF-8 was not really "added" to Rust. There's a crate for it, that's all.

OTOH OsString has been part of Rust for a long time exactly because sometimes you need to deal with weird non-Unicode platform strings.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 23:48 UTC (Wed) by khim (subscriber, #9252) [Link]

Sorry. Thought it would be obvious from the context, but perhaps not. I mean: Mixed programs, in which packages written in Go 2 import packages written in Go 1 and vice versa, must work effortlessly during a transition period of multiple years.

Fortran 90 introduced free-form source input and arrays were redesigned from scratch (and ended up pretty bad: they were designed with Cray CPUs in mind and are not a good fit for modern CPUs) - yet old code could still be compiled with a Fortran 2015 compiler! The same goes for C, C++, and other languages: "old style" is just a switch away and, most importantly, mixed programs, in which packages written in XXX import packages written in YYY and vice versa, must work effortlessly during a transition period of multiple years.

Fortran developers certainly learned from experience (Fortran 77 was not widely used for many years after its introduction because it didn't support some features of old Fortran 66, and thus old modules were not intermixable with new ones), and C and C++ developers (and many, many, many others) learned from it too (e.g. Delphi introduced new class-style types and new strings, but the old ones remained available for years). That experience was certainly well known to the Python community - they just chose to ignore it.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 16:18 UTC (Thu) by smoogen (subscriber, #97) [Link]

I expect the point dropped is that every compiler toolkit will compile both for N years. The C compilers would work with K&R C, ANSI C, and C99 while emitting warnings. They would then drop K&R C or make it require an explicit flag. Then they would only emit warnings for mixing ANSI and C99. Then ANSI needed an explicit flag, and then it was C99 only. All in all, it pretty much took 15 years from C99 for the various compilers to transition.

The same with Fortran. You could mix and match Fortran IV and 77 code. Then you needed a flag, then you needed a flag for F90, etc. And the same for C++ code. The transitions for each were slow, and Azathoth knows how many compiler writers went mad having to support that kind of code. Which I expect Guido was trying to avoid.


Fedora ponders the Python 2 end game

Posted Aug 2, 2017 4:11 UTC (Wed) by wahern (subscriber, #37304) [Link] (7 responses)

At work the recent Dirty Cow kernel patches broke the JVM. Something similar could happen with Go some day (or maybe already has?), which could be much worse as breaks in backward compatibility would require rebuilding every single application. That could prove a nightmare.

I know Linux tries to maintain strong compatibility guarantees, but both the kernel and, more typically, distributions fall short of that goal. For example, the deprecation of the sysctl(2) syscall broke descriptor-less acquisition of system randomness. (And because getrandom(2) wasn't added until recently, several long-term RHEL releases suffer from a flat-out regression in behavior and sandboxing capabilities.) For all their stodginess, commercial Unices like Solaris had good track records in this regard. (Perhaps this was their downfall!) On balance, Go's static compilation model works well for most teams and is in fact a unique advantage, but this could change.

External interpreters are more resilient in this regard. There's a reason the ELF specification is so complex, and why enterprise Unix systems evolved toward aggressive dynamic linking; namely, so you could modify and upgrade discrete components in a more fine-grained manner. On a platform like OpenBSD, which ruthlessly changes its ABI, a static compilation model is much more costly. As a recent DEF CON presentation suggests, improved security might require more frequent breaks in backward compatibility as a practical matter.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 7:14 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> At work the recent Dirty Cow kernel patches broke the JVM.
The mainline kernel has never broken the JVM in released versions. Linus Torvalds would eat anybody who tried to do this on purpose. And Go has a critical mass of users, so any kernel-level breakage will be obvious. Go's static linking and independence from libc is a stroke of genius in my opinion. You just drop a binary and it's immediately usable.

This would also be a problem for Python, if you're using a C-based module.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 9:43 UTC (Wed) by wahern (subscriber, #37304) [Link] (5 responses)

> Go's static linking and independence from libc is a stroke of genius in my opinion.

It's a return to historic Unix. Plan 9 intentionally rejected dynamic linking, too.

I agree that in current production Linux environments static linking provides a significant net benefit. Static linking decouples components, allowing you to iterate on development more easily. But the spirit behind static linking is predicated on your codebase being well maintained, your programmers being responsive to changes in interfaces, and operating system interfaces being minimal and extremely stable. This is how things _should_ be, ideally. But if this were true in reality, then CentOS and RHEL wouldn't dominate on the backs of their ABI compatibility guarantees, and people wouldn't sweat kernel upgrades or even deploying alternative operating systems. "Well maintained" does not describe most code bases; nor does "responsive" describe most engineers. Have you upgraded your vendor kernels (which would have broken the JVM), or recompiled all your C and cgo programs, to mitigate Stack Clash?

It's no coincidence that Plan 9 represents almost every resource as a file or set of files. A file-based object contract needs only four interfaces at the language level to manipulate: open, read, write, and close. Moreover, in Plan 9 control channels rigorously restrict themselves to ASCII-based string protocols. That provides an incredibly stable contract for process-process and kernel-process integration, providing opportunities for rich and deep code reuse without sacrificing reliable backward and forward compatibility. Theoretically, Plan 9 is a finished operating system: you aren't going to need to wait around for the kernel to provide improved sandboxing or entropy gathering, as it already provides all the interfaces you need or will ever get.

But Linux isn't Plan 9. New interfaces like seccomp aren't generally implemented via file APIs, for very practical reasons. Even interfaces like epoll() were a regression in this regard, as an early predecessor of epoll() had you open /dev/poll rather than use a new syscall. Linux added getrandom(2) to supplement /dev/urandom because the Unix filesystem namespace model is fundamentally limited when it comes to achieving "everything is a file" semantics--unprivileged, unlimited namespace manipulation breaks deeply embedded assumptions, making it a security nightmare and necessitating difficult trade-offs when you wish to leverage namespace manipulation for, e.g., sandboxing. I could go on endlessly about how Linux extensions subtly break the object-as-file interface contract.

There are similar problems wrt resource management when you rely heavily on multi-threaded processes. Go isn't designed simply for spinning up thousands of goroutines in a single process; both Go and Plan 9 were designed to make it easy (and are predicated upon the ability) to scale a service across thousands of machines, in which case you really don't care about the resiliency of a single process or even a single machine.[1] But that kind of model doesn't work for enterprise databases or IoT devices, or in situations where communicating processes implicitly depend on the resiliency of a small set of processes (local or remote) for persistence. Do you implement two-phase commit every time you perform IPC? That kind of model for achieving robustness is neither universally applicable nor even universally useful, and it makes demands of its own. In practice execution is never perfect even if you try, but that's just as true when writing a highly distributed system as when trying to achieve resiliency using more typical approaches.

As I said before, dynamic linking and complex ABIs didn't arise in a vacuum. Torvalds's guarantee about backward compatibility is meaningful precisely because it permits static compilation of user processes, so you can decouple kernel upgrades from application process upgrades. But that guarantee wouldn't be required for different operating system models. It's not an absolute guarantee, especially given that almost nobody runs mainline kernels directly--you'd be an idiot to do so from a security perspective. And the kernel-process boundary isn't the only place where you care about independently upgrading components.[2] If we don't appreciate why all this came about and assume that returning to a static linking model will solve everything, we're doomed to recapitulate all our previous choices. Instead, we need to learn to recognize the many contexts where the static model breaks down, and continue working to improve the utility and ease of use of the existing tools.

[1] For example, Go developers harbor no pretense about making OOM recoverable, and AFAICT generally believe that enabling strict memory accounting is pointless. That makes it effectively impossible to design and deploy high-reliability Go apps for a small number of hosts that don't risk randomly crashing the entire system under heavy load, such as during a DDoS. (Guesstimating free memory is inherently susceptible to TOCTTOU races. And OOM can kill any number of processes, ultimately requiring, directly or indirectly, restarting the host or at least restarting all the daemon processes which shared persistent state, dropping existing connections.) It's one thing to say that you're _usually_ better off designing your software to be "web scale". I wholeheartedly agree with that approach, at least in the abstract. It's something else entirely to bake that perspective into the very bones of your language, at least insofar as you claim that the language can be used as a general alternative to other systems languages like C. (Which AFAIK is not actually something the Go developers claim.)

[2] Upgrading the system OpenSSL is much easier than rebuilding every deployed application. Go already had its Heartbleed: see the Ticketbleed bug at https://blog.filippo.io/finding-ticketbleed/

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 19:40 UTC (Wed) by drag (guest, #31333) [Link]

> "Well maintained" does not describe most code bases; nor does "responsive" describe most engineers.

If your developers suck and your projects have bad habits, then static vs. dynamic dependencies are not going to save you. Things will suck in slightly different ways, but they will still suck.

> There are similar problems wrt resource management when you rely heavily on multi-threaded processes. Go isn't designed for simply spinning up thousands of goroutines in a single process; both Go and Plan 9 were designed to make it easy (and predicated upon the ability to) scale a service across thousands of machines, in which case you really don't care about the resiliency of a single process or even a single machine.

Multi-threaded processes are for performance within an application. Dealing with resiliency is an entirely separate question.

If you have an application or database that isn't able to spread across multiple machines and can't deal with hardware failures and process crashes, then you have a bad architecture that needs to be fixed.

Go users shoot for 'stateless applications', which allows easy scalability and resiliency. Getting fully stateless applications is not easy and many times not even possible, but when it is possible it brings tremendous benefits and cost savings. It's the same type of mentality that says that functional programming is better than procedural programming. It's the same idea that is used for RESTful applications and APIs.

Eventually, however, you need to store state and for that you want to use databases of one type or another.

The same rules apply, however, when it comes to resiliency. You have to design the database architecture with the anticipation that hardware is going to fail, databases crash, and file systems corrupt. It's just that the constraints are much higher and thus so are the costs.

It really has nothing to do with static vs. dynamic binaries, however, except that Go's approach makes it a lot easier to deploy and manage services/applications that are not packaged or tracked by distributions. The number of things actually tracked and managed by distributions is just a tiny fraction of the software that people run on Linux.

> For example, Go developers harbor no pretense about making OOM recoverable, and AFAICT generally believe that enabling strict memory accounting is pointless. That makes it effectively impossible to design and deploy high reliability Go apps for a small number of hosts that don't risk randomly crashing the entire system under heavy load, such as during a DDoS.

When you are dealing with an OOM'd app, it is effectively blocked anyway. It's not like your Java application is going to be able to serve up customer data when it's stuck using 100% of your CPU and 100% of your swap, furiously garbage-collecting to try to recover in the middle of a DDoS.

In many situations the quickest and best approach is just to 'kill -9' the processes or power-cycle the hardware. Having an application that is able to just die and restart quickly and automatically is a big win in these sorts of situations. As soon as the DDoS ends, you are right back to running fresh new processes. With big applications and thousands of threads and massive single processes eating up gigabytes of memory... it's a crap shoot whether or not they are going to be able to recover or enter a good state after a severe outage. Now you are stuck tracking down bad PIDs in a datacenter with hundreds or thousands of them.

> And the kernel-process boundary isn't the only place where you care about independently upgrading components.

No, but right now with Linux distributions it's the only boundary that really exists.

Linux userland exists as one big spider web of interconnected dependencies. There really are no 'layers' there. There really aren't any 'boundaries' there... it's all one big mush. Some libraries do a very good job of offering strong ABI assurances, but for the most part it doesn't happen.

Whether it's easier to just upgrade a single library, or whether that's even possible, or whether it's easier to recompile a program... all of this is very much 'it depends'. It really, really depends. It depends on the application, how it's built, how the library is managed, and thousands and thousands of other factors.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 7:10 UTC (Thu) by mjthayer (guest, #39183) [Link] (2 responses)

The thought that always goes through my head when I hear this discussion is "why not link statically but shell out to openssl(1) for encrypted connections". I am sure there is a good reason, which wiser people than me (I am no openssl expert, neither the library nor the command line tool) will tell me. This is not specific to openssl of course. It applies just as much to image format translation.

Generally, there must be a security risk threshold below which static linking can make sense. If you are at risk from a vulnerability in a dependency, you are presumably at risk from vulnerabilities in your own code too, so you have to be vigilant and to do updates from time to time anyway. The point where most updates are due to security issues in components is probably the point where the threshold has been passed.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 7:46 UTC (Thu) by mjthayer (guest, #39183) [Link] (1 responses)

Having posted that and tried to anticipate the responses, I have to admit that I don't know whether shelling out to openssl(1) is really such a big gain over dynamic linking in most Linux use cases. The main case I see is when you are shipping a binary outside of the distribution and don't want to support multiple openssl shared-library ABI versions.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 13:45 UTC (Thu) by niner (subscriber, #26151) [Link]

Funny that you bring this up in a news post that's dealing with "python2/3" vs. "python" command name. What do you hope to gain by shelling out to openssl instead of dynamically linking? The binary still has an interface that may or may not change in incompatible ways. At least for dynamic libraries there's support for versioning. As the article demonstrates, there is no such thing for system commands.

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 1:23 UTC (Fri) by lsl (subscriber, #86508) [Link]

> Go already had it's Heartbleed: see the Ticketbleed bug at https://blog.filippo.io/finding-ticketbleed/

Uhm, that is not a Go issue at all, but a bug in the TLS implementation of some F5 load-balancer appliances. The Go TLS library could merely trigger the bug client-side, because it uses TLS session identifiers of a different size than OpenSSL (which was probably the only thing the vendor tested with).

When you sent a 1-byte session ID to these F5 appliances, they still assumed it was 32 bytes long (as that's what OpenSSL and NSS happen to use) and echoed back your session ID plus the 31 bytes immediately following it in memory.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 5:51 UTC (Wed) by togga (guest, #53103) [Link] (6 responses)

How does Go stand as glue language?

What would it entail to set up an interactive environment integrating with for instance current scientific python 3rd party modules like Numpy/SciPy/Matplotlib?

[Go] >> pyimport("matplotlib.pyplot", "plt")
[Go] >> plt.ion()
[Go] >> plt.figure()
[Go] >> x = get_perf_data().view(np.recarray)
[Go] >> plt.plot(x.timestamp, x.value, 'x')

Does Go perhaps have a better ecosystem than old Python for integrating with web applications, opening up a more portable platform?

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 7:08 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Go is decent, but there aren't that many glue libraries available in Go (yet) or stuff like Jupyter.

However, Go nicely scales upward - you can write million-line-scale maintainable systems in it, more easily than in Python.

For your plotlib example: https://github.com/gonum/plot/wiki/Example-plots

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 8:08 UTC (Wed) by wahern (subscriber, #37304) [Link] (4 responses)

Go relies on its own calling conventions in order to implement dynamically growable stacks at the machine-code level. Invoking C functions (or C ABI-compatible functions) requires acquiring and jumping to a specially reserved stack. This can be problematic for many reasons, especially when you make heavy use of FFI.

Also, Go uses automatic garbage collection, but it doesn't provide anchoring primitives or similar interfaces that allow you to express complex cross-language ownership relationships in a way that is visible to the garbage collector. In a language like Perl or Lua I can easily create Lua objects which reference C objects which in turn reference Perl objects, persist these references past function invocation, and in a well-defined manner that cooperates with Perl's and Lua's garbage collectors.[1] I'm less familiar with Python but I presume it's relatively easy to do this as well, to some significant degree.

In short, for threading, stack management, and garbage collection--the three most fundamental aspects of any language--Go performs a lot of the work at the compilation phase, and in a way that conflicts with the C ABI, which is the de facto plane for cross-language interfacing. This was a very deliberate design choice, as for something like dynamically growable stacks (necessary for goroutines, which are absolutely fundamental to Go) you must eschew easy compatibility. Likewise, defining stable garbage collector interfaces for FFI is very difficult if you wish to preserve opportunities for refactoring or optimizing your implementation. Basically, Go is perhaps the one language least suitable as a glue language.
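
By way of contrast, here is a minimal sketch of what "glue" looks like in a dynamically typed language: Python's stdlib ctypes module calling straight into libc on a POSIX system (the library lookup is platform-dependent, and strlen is just an illustrative target).

```python
import ctypes
import ctypes.util

# Python as glue: load the C library and call a function directly,
# with no compile step or binding generator. find_library() may return
# None on minimal systems; dlopen(NULL) (the current process, which
# already links libc) serves as a fallback there.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)
libc.strlen.argtypes = [ctypes.c_char_p]   # declare the C signature
libc.strlen.restype = ctypes.c_size_t
assert libc.strlen(b"hello") == 5
```

Nothing comparable exists for Go precisely because of the calling-convention and stack constraints described above; cgo imposes a stack switch on every such call.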

For a good overview of the issues one needs to take into account when designing a good glue language, read the short paper, "Passing a Language through the Eye of a Needle: How the embeddability of Lua impacted its design", at http://queue.acm.org/detail.cfm?id=1983083

[1] Cyclical references can be problematic. But Lua, at least, provides support for anchoring cyclical references in a way that the garbage collector can safely detect and break. In particular, either by making sure cycles pass through the uservalue stash of userdata objects, or for other anchoring strategies (e.g. anchors created in the global registry table indexed by C values) by making use of ephemeron tables.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 9:04 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Yes. However, most of Go's interfacing is one-way - calling C libraries from Go, and it's very easy to do with cgo. You can just import C headers and call functions right away.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 10:44 UTC (Wed) by wahern (subscriber, #37304) [Link] (1 responses)

A glue language, as I understand it, is a language which makes it easy to assemble an application where much of the heavy lifting (complex low-level logic and resource management, but often not high-level business logic, policy, or orchestration) is implemented by external components loaded into the runtime process. Being able to easily invoke C routines is only the most minimal requirement; necessary but hardly sufficient.

A strongly typed language usually makes for a poor glue language, IME, in terms of productivity. Where a glue language makes sense, it's because strict typing (or the particular characteristics of the language's typing) is a poor fit for the whole application, but you still want the benefit of that stricter typing (early bug detection, performance) for certain components. Because the types of one language almost never map cleanly to the types of another, you usually want the freedom that looser typing can provide in the glue language. And dynamically typed languages usually make introspection and dependency injection much easier; certainly more natural. That makes it easier to generate or instantiate clean bindings to complex interfaces (i.e. interfaces to software worthwhile to reuse from a different language), to write regression tests, and to refactor and iterate faster at the higher level of the application stack.

None of that describes Go. Rather, with Go you would usually solve your problems using the unique tools that Go provides. For example, to achieve good dependency injection you either leverage its duck typing or switch to CSP-style patterns by interfacing across channels. In other words, because Go provides relatively unique features for a systems language--for example, lexical closures that bind dynamically-allocated mutable objects--when Go is your primary language you should have less reason to resort to another language. Also, given how much people appreciate the freedom that Go's static compilation model provides--something which you've expressed in this thread--utilizing libraries via FFI would seem especially costly relative to most other languages.

Certainly languages other than Go are more suitable for particular tasks, and existing library implementations sufficiently useful to sometimes be worth the cost of binding. But the cost+benefit doesn't make Go a very good glue language as a general matter. The cost is higher because of the otherwise unnecessary constraints imposed when doing FFI, and the general mismatch between Go's unique runtime and every other runtime; and the benefit relatively less because Go has really strong language constructs often lacking in even dynamically typed languages. And this is generally what I hear from Go developers--to avoid cgo; that you quickly run into headaches for anything non-trivial; that if you find yourself relying on cgo too much you're probably doing it wrong. That's not something you should be hearing about a glue language, and not something you hear nearly as often in the context of Lua, Perl, or Python.

Fedora ponders the Python 2 end game

Posted Dec 8, 2017 22:41 UTC (Fri) by togga (guest, #53103) [Link]

With numpy announcing that it will drop Python 2 support, etc., this topic is hotter than ever.

"A strongly typed language usually makes for a poor glue language, IME, in terms of productivity."

My thinking is that this might be compensated for by Go's quick compile times, autogenerating the wrapping layer. This wrapping layer would be orders of magnitude faster than Python's ctypes. With available debug data (introspection of C interfaces) it should be feasible?

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 19:33 UTC (Wed) by bokr (guest, #58369) [Link]

You may be interested to read what guile 2.2 does with multiple languages:

https://www.gnu.org/software/guile/manual/html_node/index...

Specifically the following teaser is more than a teaser for 2.2, and
you can find much more elsewhere (presumably documentation will
catch up and have some direct links from here ;-)
E.g.,

https://www.gnu.org/software/guile/manual/html_node/Compi...

Various formats for the guile 2.2 manual can be had via
https://www.gnu.org/software/guile/manual/

BTW, note that calling things by unconfusing names was recognized
as important pretty long ago (400 BC+- ;-)
https://en.wikipedia.org/wiki/Rectification_of_names

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 6:40 UTC (Wed) by joib (subscriber, #8541) [Link]

> The question is what platform is the future in the quick and dirty, "just works" productivity-department without the performance issues of Python?

Haskell?

No, I'm sort of serious.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 8:55 UTC (Wed) by rsidd (subscriber, #2582) [Link]

«The question is what platform is the future in the quick and dirty, "just works" productivity-department without the performance issues of Python?»

Julia. For scientific computing, anyway (I'm a scientist). Just as easy to write as Python, and if you're careful about declaring types etc. when required, blazing fast. It has most of the math stuff from numpy etc. built in, has a Jupyter front-end, can use matplotlib, and so on. But it has some startup overhead (being LLVM-based), so it's probably not suitable for scripting.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 9:37 UTC (Wed) by mikapfl (subscriber, #84646) [Link] (21 responses)

I think python3 is a safe bet for the quick-and-dirty, "just works" productivity department. Also, it actually performs faster than python2. I don't really see where you need "quick and dirty" and "high performance" together, but if you need quick-and-dirty first and then want to improve performance, I'd say python3+numba is a great combination.
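
To make that "quick and dirty first, fast later" workflow concrete, here is a minimal sketch: write plain Python, then apply numba's njit to the hot function. Note that numba is a third-party package and is treated as optional here, falling back to the interpreted version when it is not installed.

```python
def dot(xs, ys):
    # Plain, "quick and dirty" Python: an explicit loop.
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total

try:
    from numba import njit       # optional JIT: same code, compiled
    fast_dot = njit(dot)
except ImportError:
    fast_dot = dot               # numba not installed: keep the slow path

assert abs(fast_dot([1.0, 2.0], [3.0, 4.0]) - 11.0) < 1e-9
```

The appeal of this combination is that the source stays ordinary Python either way; the JIT is bolted on after the code already works.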

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 11:00 UTC (Wed) by niner (subscriber, #26151) [Link] (20 responses)

How on earth can Python 3 be considered "just works" when Python 3 was the thing that broke everything? How should one ever trust the Python core developers again after this massive screw-up? Especially since they have not shown any sign of even acknowledging their mistake.

No, the answer is "it depends". Some names have already been mentioned. For "quick and dirty", "just works", and "takes backwards compatibility really seriously", I'd personally just use Perl.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 11:55 UTC (Wed) by Kamilion (subscriber, #42576) [Link] (17 responses)

Huh.
*Holds up a copy of Perl 6 and Parrot Essentials (2003)*

*connects to VM*
~$ perl --version

This is perl 5, version 22, subversion 1 (v5.22.1) built for x86-linux-gnu-thread-multi (with 58 registered patches, see perl -V for more detail)
Copyright 1987-2015, Larry Wall

*scratches head*

Sooooooo, where's the perl6 I tried to learn 14 years ago, before resorting to learning python?

I mean, yeah, python3's taken a few years, but when you say you'd personally just use Perl right after saying "Especially since they have not shown any sign of even acknowledging their mistake", I find myself feeling somewhat amused and confused at the same time.

On the flip side, I try to take a lot of care that python code I touch runs in both 2.7 and 3.4+.
https://github.com/kamilion/customizer/commit/f0b5521ef30...

Fixing the compatibility issues between the versions using u'{}'.format(data) to ensure all data was unicode, even under python 2, wasn't really terribly difficult.
I've had more trouble trying to bring some old python 2.3 code back to life.
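A sketch of the approach described above, assuming the usual __future__ imports: force everything through u'{}'.format() so the result is unicode text on Python 2 and str on Python 3 (the helper name here is made up for illustration).

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals

def as_text(data):
    # bytes must be decoded explicitly on both interpreters;
    # everything else is formatted into a text string
    if isinstance(data, bytes):
        data = data.decode('utf-8')
    return u'{}'.format(data)

print(as_text(b'm\xc3\xb6p'))   # möp on both 2.7 and 3.x
print(as_text(42))              # 42
```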

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 12:05 UTC (Wed) by lkundrak (subscriber, #43452) [Link]

Perl 5 and (perhaps unfortunately named) Perl 6 are different languages. Unlike Python 2, Perl 5 is actively maintained and doing well.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 12:08 UTC (Wed) by niner (subscriber, #26151) [Link] (15 responses)

Perl 6 is a different language than Perl 5. It's a clean break, but unlike the Python 2 -> 3 transition, it at least brings real improvements: being able to use multiple CPU cores in Perl 6 code (no GIL), state-of-the-art Unicode support rivaled only by Swift, reactive programming, real grammars instead of just regular expressions, and lots more.

And again in contrast to Python 2 -> 3, Perl 5 code can still be used via https://github.com/niner/Inline-Perl5/
There's no need to port whole code bases to Perl 6. Perl 5 and Perl 6 code can live side by side in the same program, allowing for a piecemeal transition if so desired.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 12:23 UTC (Wed) by anselm (subscriber, #2796) [Link] (14 responses)

So the Perl 6 developers did everything right except when they tried to figure out a name for their new language.

Frankly, I think that the issues with the Python-2-to-3 transition have been wildly overhyped. It is quite possible to write code that runs with both Python 2.7 and Python 3.x, and there are automated tools to help with that. The main mistake the Python developers made was to underestimate the time it would take to adapt various popular third-party libraries, but by now this is pretty much finished. I personally went over to Python-3-only for new code a while ago and haven't looked back.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 12:30 UTC (Wed) by niner (subscriber, #26151) [Link] (4 responses)

At work we sit on roughly 1.5 million Python 2 expressions baked into this lovely templating language called DTML [1]. There's no way anyone would pay for porting those and no existing tool will help us with that. So come 2020, we can either just continue to use an unsupported Python 2.7, probably compiled from source, or try to get rid of Python entirely. Our system already compiles most of those to Perl code before executing. For getting rid of Python 2 completely however, we'll have to be able to compile whole Python scripts including function definitions (but luckily no class definitions). Lots of work, but at least then we will be able to stay backwards compatible forever. Both with old code and with existing know-how.

[1] http://docs.zope.org/zope2/zope2book/DTML.html#using-pyth...

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 14:10 UTC (Wed) by anselm (subscriber, #2796) [Link] (1 responses)

Oh. DTML. That sucks. I feel your pain :^(

As far as I'm concerned, Zope did look like a good idea in the early 2000s or so but quickly became a liability. Fortunately I managed to move off it soon afterward. I'm using Django today, which by now works great with Python 3.

Incidentally, there seem to be enough people in situations similar to yours that Python 2.7 support might not go away completely in 2020. It's just that the head Python guys said it won't be them providing that support. In effect, starting today you people have more than two years to pool your money and get Python 2.7 LTS organised. It's not as if the code base required loads of TLC to keep running, so this may be cheaper in the end than moving millions of lines of code to something else.

DTML

Posted Aug 2, 2017 14:21 UTC (Wed) by corbet (editor, #1) [Link]

Way back around then, when I was working to replace the initial PHP-based LWN site, I did a prototype in Zope and DTML. Then I got distracted for a few months. When I came back to it I couldn't understand anything I'd done and had to start over from the beginning trying to figure out what all the weird wrapper layers did. I concluded that it would always be that way and started over with something else... I've never regretted that decision.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 16:35 UTC (Wed) by smoogen (subscriber, #97) [Link]

Or someone will offer a commercial branch of python-2.7 which has security updates applied to it for another decade. I mean that was the major reason for getting a paid version of Sun Fortran or C in the late 90's.. to keep that K&R and Fortran IV going :).

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 18:54 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Python 2 will be maintained past 2020, there are too many projects dependent on it. RHEL 7 will be supported until 2027 at least and it contains Py2.

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 1:49 UTC (Fri) by lsl (subscriber, #86508) [Link] (8 responses)

> The main mistake the Python developers made was to underestimate the time it would take to adapt various popular third-party libraries, but by now this is pretty much finished.

I don't think so. I still see Python programs (allegedly supporting Python 3) heavily shitting themselves upon encountering data that cannot be decoded to Unicode. We're talking external data here, like stuff coming in over the network or user-created file names.
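To be fair, Python 3 does ship an escape hatch for exactly this situation: the 'surrogateescape' error handler smuggles undecodable bytes through str and back out intact (os.fsdecode()/os.fsencode() use it for file names on POSIX). Programs blow up when they decode external data with the default strict handler instead. A small sketch:

```python
raw = b'caf\xe9.txt'                 # latin-1 bytes, invalid as UTF-8

# strict decoding -- what the broken programs effectively do -- raises
try:
    raw.decode('utf-8')
except UnicodeDecodeError:
    pass

# surrogateescape keeps the bad byte as a lone surrogate, U+DCE9
name = raw.decode('utf-8', errors='surrogateescape')
assert '\udce9' in name

# and it round-trips back to the original bytes, so rename/unlink still work
assert name.encode('utf-8', errors='surrogateescape') == raw
```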

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 4:23 UTC (Fri) by smckay (guest, #103253) [Link] (7 responses)

Does Python 3 make it hard to handle malformed text? I mainly use Java and write backend code so the complaint about non-Unicode filenames is hard to understand. Does that happen a lot? Is it a client-side issue? For me the solution would be to tell ops to stop being cute and use normal filenames. :)

It does sound like the Unicode codecs have significant problems if exceptions are part of the default behavior. It's not like you can tell the socket/pipe/file to stop pulling your leg and cough up the <i>good</i> data. Rule #1 of text handling: do the best you can with what you're given and hope no one notices the ÃÆ.

Wrong file name encoding made easy

Posted Aug 4, 2017 5:17 UTC (Fri) by mbunkus (subscriber, #87248) [Link] (2 responses)

> I mainly use Java and write backend code so the complaint about non-Unicode filenames is hard to understand. Does that happen a lot?

It's really trivial to recreate. Take a non-ASCII file name on Windows, e.g. "möp.txt". Compress it into a ZIP, copy it to Linux and unzip it there. You'll end up with:

$ unzip the-zip.zip
Archive: the-zip.zip
inflating: mp.txt
$ ls -l
total 28
-rw-rw-r-- 1 mosu vj 18617 Aug 3 20:27 'm'$'\366''p.txt'
-rwxr--r-- 1 mosu vj 4272 Aug 4 07:07 the-zip.zip
$ rm m$'\366'p.txt
$ 7z x the-zip.zip
…snipped output…
$ ls -l
total 28
-rw-rw-r-- 1 mosu vj 18617 Aug 3 20:27 'm'$'\302\224''p.txt'
-rwxr--r-- 1 mosu vj 4272 Aug 4 07:07 the-zip.zip

The reason is simple: stupid file formats. There are no specs for file name encoding in ZIPs. There's no file name encoding indicator either. One could always use 7z, of course, where the specs state that file names be encoded in one of the Unicode encodings, but Windows doesn't come with support for 7z out of the box, but for ZIP. Aaaaand try getting people to use some new(ish) format and see how much success you have :(
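The two mangled names in the transcript above can be reproduced with a few decode/encode calls. This is only a guess at which code page each tool assumed, but the bytes line up: the ZIP spec's default code page is CP437, where 'ö' is the single byte 0x94.

```python
stored = b'\x94'                                  # 'ö' in CP437

assert stored.decode('cp437') == 'ö'              # the intended name

# unzip appears to have mapped CP437 to Latin-1: 'ö' becomes byte 0xF6,
# which ls displays as $'\366'
assert 'ö'.encode('latin-1') == b'\xf6'

# 7z appears to have treated the raw 0x94 as Latin-1/Unicode and encoded
# it as UTF-8, giving control character U+0094 -- ls shows $'\302\224'
assert stored.decode('latin-1').encode('utf-8') == b'\xc2\x94'
```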

Another thing that happens more often than I'd like is some mail program or other getting MIME wrong somehow, so that an attachment with non-ASCII characters in its file name is saved with the wrong encoding, resulting in similar brokenness.

Wrong file name encoding made easy

Posted Aug 7, 2017 20:15 UTC (Mon) by cesarb (subscriber, #6266) [Link] (1 responses)

> There are no specs for file name encoding in ZIPs. There's no file name encoding indicator either.

Actually, there is. See appendix D of the ZIP spec at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT which says:

- If bit 11 is set, the filename is in UTF-8
- If bit 11 is unset, the filename is in CP437
- The UTF-8 filename can also be in extra record 0x7075
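Python's own zipfile module exposes that flag, so a consumer can at least tell which case it is dealing with. A sketch of checking bit 11 on an archive written in memory:

```python
# If bit 11 of flag_bits is set the name is UTF-8, otherwise CP437.
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('möp.txt', b'hello')      # non-ASCII name: zipfile sets the flag

with zipfile.ZipFile(buf) as zf:
    info = zf.infolist()[0]
    is_utf8 = bool(info.flag_bits & 0x800)   # bit 11
    print(info.filename, 'utf-8' if is_utf8 else 'cp437')
```

Of course, as the follow-up notes, this only helps with archives whose creators actually set the flag correctly.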

Wrong file name encoding made easy

Posted Aug 8, 2017 2:40 UTC (Tue) by smckay (guest, #103253) [Link]

Ah, so when we run into a non-compliant zip file it is time to barf and die because we are running Python 3. I think I understand now.

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 8:36 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

The right idea is to avoid Unicode decoding altogether. Just treat input as a stream of bytes for as long as you can.

For example, I have recently struggled with a Py3-based proxy server that failed with one broken client that sends non-ASCII header names.
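A sketch of that "stay in bytes" approach for the header case (the header name here is invented): parse the structure on bytes, and only decode individual values when you must, with an error handler so one broken client can't take the whole thing down.

```python
raw_header = b'X-Weird-\xc3\x28-Name: some value\r\n'   # not valid UTF-8

# structural work (splitting, routing, comparisons) stays at the byte level
name, _, value = raw_header.rstrip(b'\r\n').partition(b': ')

if name.lower().startswith(b'x-'):
    # decode only for display, and never let it raise
    print(name.decode('utf-8', errors='replace'), '=', value.decode('ascii'))
```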

Fedora ponders the Python 2 end game

Posted Aug 5, 2017 21:14 UTC (Sat) by flussence (guest, #85566) [Link] (2 responses)

A lot of modern languages try to force the Unicode issue by trying to hide the data from the programmer with a stone wall of abstraction. I've learned from trying to use several of them that there's just no sane default behaviour for all strings. Latin-1 is bad because it forces you to jump through hoops to handle human-readable text correctly, Unicode is *worse* because it forces you to jump through hoops to handle machine-readable text correctly (humans are generally more forgiving parsers).

The most programmer-abusive thing I've tried to use lately is actually perl6's binary types. There's no concept of endianness so the 16/32 bit wide variants are completely unusable for I/O… and they *only* work with I/O functions. You can't pack, unpack, recast to array or any other kind of high level operation on them. The rest of the language is (mostly) sane, but this forgotten corner has all the readability and portability of disassembled SIMD code with the performance of said asm being emulated in a high level language.

Fedora ponders the Python 2 end game

Posted Aug 7, 2017 11:12 UTC (Mon) by niner (subscriber, #26151) [Link] (1 responses)

"and they *only* work with I/O functions. You can't pack, unpack, recast to array or any other kind of high level operation on them."

That's not exactly true:
> perl6 -e '"Ödögödöckö".encode.say'
utf8:0x<c3 96 64 c3 b6 67 c3 b6 64 c3 b6 63 6b c3 b6>

> perl6 -e '"Ödögödöckö".encode.List.say'
(195 150 100 195 182 103 195 182 100 195 182 99 107 195 182)

> perl6 -e 'use experimental :pack; pack("NNN", 1, 2, 3).say'
Buf:0x<00 00 00 01 00 00 00 02 00 00 00 03>

> perl6 -e 'use experimental :pack; pack("NNN", 1, 2, 3).unpack("NNN").say'
(1 2 3)

Fedora ponders the Python 2 end game

Posted Aug 9, 2017 20:58 UTC (Wed) by flussence (guest, #85566) [Link]

Those are the 8-bit variants, of course they work. People notice when something as fundamental as UTF-8 breaks (usually!)

I'm talking about things like this:
> perl6 -e 'use experimental :pack; buf32.new(0x10203040, 1, 2, 3, 4).unpack("N").say'
4538991231697411

Or, here's a basic “real world” example I just made up: Read two equal-size image files in farbfeld format, alpha composite them, and write out the result to a new file… what would idiomatic perl6 code for that look like? Probably shorter than this comment if these bits of the language worked, but they don't.

Sorry for what looks like nitpicking some obscure corner of the language, but I've seen a few too many people get burned out exploring these dark corners; they receive the silent treatment when they point out the language is getting in their way, and subsequently ragequit. There's a lot of this broken window syndrome outside of the cool-oneliner-demo APIs, and it's been like this since forever.
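For comparison, a byte-level sketch of that made-up task in Python, assuming farbfeld's documented layout (8-byte magic "farbfeld", 32-bit big-endian width and height, then 16-bit big-endian RGBA components per pixel):

```python
import struct

def read_ff(data):
    assert data[:8] == b'farbfeld'
    w, h = struct.unpack('>II', data[8:16])
    return w, h, list(struct.unpack('>%dH' % (w * h * 4), data[16:]))

def write_ff(w, h, px):
    return (b'farbfeld' + struct.pack('>II', w, h)
            + struct.pack('>%dH' % len(px), *px))

def over(top, bot):
    # Porter-Duff "over" compositing on 16-bit components, per RGBA pixel
    out = []
    for i in range(0, len(top), 4):
        a = top[i + 3] / 65535.0
        for c in range(3):
            out.append(round(top[i + c] * a + bot[i + c] * (1.0 - a)))
        out.append(round(65535 * (a + bot[i + 3] / 65535.0 * (1.0 - a))))
    return out

# two 1x1 test images: half-transparent red composited over opaque blue
red  = write_ff(1, 1, [65535, 0, 0, 32768])
blue = write_ff(1, 1, [0, 0, 65535, 65535])
w, h, top = read_ff(red)
_, _, bot = read_ff(blue)
print(over(top, bot))  # [32768, 0, 32767, 65535]
```

Whether this is shorter or longer than the hypothetical idiomatic perl6 version, the pack/unpack layer at least behaves predictably.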

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 10:03 UTC (Thu) by Otus (subscriber, #67685) [Link] (1 responses)

> How on earth can Python 3 be considered "just works" when Python 3 was the thing that broke everything?

Python 3 just works if you are starting from scratch.

Had Python 2 not had as much adoption as it did, the breakage would have been a good idea. As is, they should have deprecated one thing at a time, in a manner that would have been backwards compatible for a couple of releases. (And left out the meaningless changes, some of which have since been rolled back.)

> How should one ever trust the Python core developers again after this massive screw up?

That is the clincher. I hope they've learned the lesson.

Fedora ponders the Python 2 end game

Posted Dec 22, 2017 20:30 UTC (Fri) by togga (guest, #53103) [Link]

No. Python3 is the core problem here (along with end of life for Python2). At least for me, developing in python3 is cumbersome and full of practical issues. Python3 is just not a productive language for me, and therefore I'm seeking alternatives.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 17:23 UTC (Wed) by bandrami (guest, #94229) [Link] (4 responses)

I'm sure this is stepping in something, but what's the point of ripping out working pieces of distro infrastructure just because there's a newer and not-quite-compatible version of the interpreter available? As an admin rather than a developer, this desire has always confused me. Hrončok's document as far as I can tell justifies all the labor by saying it furthers the goal of making python3 the default python on Fedora, but that kind of begs the question (to use that phrase correctly for once).

I mean, I get that upstream doesn't work on the interpreter anymore, but I'm old enough to remember when that meant software was "completed" rather than "dead"...

Fedora ponders the Python 2 end game

Posted Aug 11, 2017 6:12 UTC (Fri) by dvdeug (guest, #10998) [Link] (3 responses)

As an admin, you're thrilled to provide support for an endless variety of systems? You don't mind installing ratfor and dealing with the problems every time you upgrade the Fortran compiler? How about a Modula-3 compiler? Support for Java 1.0 class files that won't run on modern JVMs?

Old enough to remember when software was completed? When was this? When code was written for the Commodore 64 or some other long dead platform? Of what use is code that can't handle Unicode (or even 8-bit characters), or can't handle JPEGs and PNGs or can't handle IPv6? At no point in the history that I'm familiar with has stable software been the norm. The vast majority of software lived and died on short-lived platforms, and even well-written programs that hit the language/OS/platform lottery, if they didn't adapt to new file formats and new demands, died.

Fedora ponders the Python 2 end game

Posted Aug 11, 2017 22:08 UTC (Fri) by nix (subscriber, #2304) [Link] (2 responses)

Old enough to remember when software was completed? When was this?
I can think of two pieces of software that were completed. TeX, because Knuth decreed it (and bugfixes didn't stop even then), and BBC B Elite, which must be considered to eventually have been completed because there was literally no more RAM in which to fix the single remaining known bug: IIRC, it needed three bytes, and after many sweeps through the code, optimizing each time, there was no more fat to be wrung out of it.

Fedora ponders the Python 2 end game

Posted Aug 21, 2017 21:46 UTC (Mon) by pboddie (guest, #50784) [Link] (1 responses)

Do you have any more information about this Elite bug? Some enquiries have produced an account of another bug, but also some scepticism about one which needed three unavailable bytes to be fixed. Someone even asked Ian Bell for his recollections!

Fedora ponders the Python 2 end game

Posted Sep 1, 2017 15:24 UTC (Fri) by nix (subscriber, #2304) [Link]

I said 'IIRC' because this too was from vague recollections. I'll ask around. :)

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 14:04 UTC (Thu) by rossburton (subscriber, #7254) [Link]

For anyone wondering what other people are up to, I've just written up my plans for the Yocto Project at https://wiki.yoctoproject.org/wiki/NoMorePython2.

Because Python 2 is on the way out, and there's really no point in building multiple versions of Python if we can avoid it, we're trying to remove all use of Python 2 from OpenEmbedded Core sooner rather than later.

Thanks to Arch forcing the issue and installing Python 3 as /usr/bin/python, most packages are happy to use either 2 or 3 these days. There are just a few tools we use that still need Python 2: Qemu, APR, and bmaptools come to mind.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 14:09 UTC (Thu) by ballombe (subscriber, #9523) [Link]

> It is anticipated that once the Python 2.7 branch is no longer receiving even security updates, we will actively recommend against platforms providing a Python 2.7 stack at all, let alone as the default target of the unqualified "python" command.

The only problem with python3 is the attitude of the python3 developers toward python2. That they do not want to maintain it beyond 2020 is quite understandable. But trying to prevent other people from using it and maintaining it is not in the spirit of free software.

Remove that, and everybody would applaud python3 as the new interesting language to look at instead of the thing which is forced down their throat.

Fedora ponders the Python 2 end game

Posted Aug 6, 2017 16:50 UTC (Sun) by bluebirch (guest, #58264) [Link]

Maybe python2 (and optionally /usr/bin/python) should point to pypy's python2.7. Most scripts, especially the ones used by administrators, should work fine with it. And pypy won't have a problem with security patches.


Copyright © 2017, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds