Hovmöller: Moving a large and old codebase to Python3
"Our philosophy was always to go py2 -> py2/py3 -> py3 because we just could not realistically do a big bang in production, an intuition that was proven right in surprising ways. This meant that 2to3 was a non-starter, which I think is probably common. We tried for a while to use 2to3 to detect Python 3 compatibility issues but quickly found that untenable too. Basically it suggests changes that will break your code in Python 2. No good. The conclusion was to use six, which is a library to make it easy to build a codebase that is valid in both Python 2 and 3."
Posted Feb 21, 2018 2:13 UTC (Wed)
by vstinner (subscriber, #42675)
[Link] (28 responses)
I explained that it took the Python community a few years to understand that 2to3 was a bad idea. Dropping Python 2 support immediately is simply not possible for many good reasons. Adding Python 3 support now seems obvious, but it wasn't the first approach promoted in the community.
It's not just the 2to3 tool, but also Python 2.7 and 3.x which were designed to force developers to drop Python 2 support at once. Again, it took a few years to adjust Python 2.7 and 3.x to ease the migration to Python 3.
Posted Feb 21, 2018 7:47 UTC (Wed)
by niner (subscriber, #26151)
[Link] (27 responses)
Posted Feb 21, 2018 9:09 UTC (Wed)
by smurf (subscriber, #17840)
[Link]
Also, in Perl the internals can easily discover which environment they're currently executing in (as opposed to the environment they have been created in). That's far more difficult (and thus inefficient) in Python, but would have been required for str-vs.-bytes, semantics of bytes/bytearray, iterators, …
Posted Feb 21, 2018 10:22 UTC (Wed)
by vstinner (subscriber, #42675)
[Link] (10 responses)
I may be wrong, but I understood that hacks like Inline-Perl5 to run Perl5 in Perl6 use two processes. The Perl6 process spawns a Perl5 process, then makes RPC calls and exchanges data. It works, but it's maybe not the perfect solution. There are many implementations of Perl6, but none were forked from Perl5. So Perl6 breaks the full C API, and is very different from Perl5. I don't know Perl well, but this recent open letter gives an idea of the situation:
If you read my slides, Python 3 was forked from Python 2 on purpose. Backward compatibility and reducing the number of differences between Python 2 and Python 3 were part of the Python 3 design. Technically, it's possible to write code that runs on both Python 2 and Python 3 with a few changes (see the six module). I consider that this part is a success.
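To make the "few changes" concrete, here is a minimal stand-alone sketch of the kind of shim that six provides. The names PY2, text_type and binary_type mirror names six exposes, but this block does not import six; to_text is a hypothetical helper added only for illustration.

```python
import sys

# True only when running under a Python 2 interpreter.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (name exists only on Python 2)
    binary_type = str
else:
    text_type = str
    binary_type = bytes


def to_text(value, encoding="utf-8"):
    """Return value as text, decoding it first if it is a byte string.

    Hypothetical helper: the same body runs unchanged on 2.7 and 3.x.
    """
    if isinstance(value, binary_type):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    raise TypeError("expected text or bytes, got %r" % type(value))
```

The rest of the code base then refers to text_type and binary_type instead of naming str, unicode or bytes directly, which keeps the interpreter-specific branch in a single place.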
> What I do not understand - to this day is why make it so hard on the users in the first place? Why not let programmers use both Python 2 and Python 3 in the same program (...)
Honestly, I don't think that it's for a technical reason. Technically, you can do many things. In practice, Python is written by a small team of developers working in their free time. Developers prefer to work on a fresh code base without all the design issues of Python 2. But many changes were made in Python 2.7 and Python 3.x to make the migration simpler.
Python prefers to regularly make tiny backward-incompatible changes (prepared with deprecation warnings in the previous release) to reduce the technical debt inside Python. Perl works differently. In Perl, you can ask to get the old behaviour in one file (e.g. run as Perl 5.14 in Perl 5.20).
IMHO it's more a deliberate choice to regularly polish the language and its standard library.
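As a small illustration of the deprecation-warning pattern described above, the sketch below keeps an old function name working for one more release while pointing callers at its replacement. old_name and new_name are hypothetical names, not anything from CPython itself.

```python
import warnings


def new_name(value):
    """The replacement API that callers should migrate to."""
    return value * 2


def old_name(value):
    """Deprecated alias kept for one release before removal."""
    warnings.warn(
        "old_name() is deprecated; use new_name() instead",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not this wrapper
    )
    return new_name(value)
```

Running a test suite with `python -W error::DeprecationWarning` turns such warnings into exceptions, which is one way to find affected call sites before the old name actually disappears.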
It's not easy to compare Python and Perl because of these differences. But I let you decide which one succeeded, Perl6 or Python3. Hint: Perl developers like to repeat that Perl6 is a different language that should have a different name.
Perl5 continues to evolve, whereas the Python community decided to "un-schedule" Python 2.8.
Posted Feb 21, 2018 11:32 UTC (Wed)
by niner (subscriber, #26151)
[Link] (9 responses)
For compatibility between Python 2 and Python 3 code, embedding is not an option since ironically their internals are too similar, i.e. they share symbol names. But combining the two is still possible using two processes and IPC, like you imagined Inline::Perl5 would work. That would have been an option since day one of Python3K. Of course it's not perfect and performance would be an issue. But there are countless cases where something like this could have spared users a lot of pain and made it easier for people to start using Python 3, even though some essential 3rd party libraries had not migrated yet. And it'd have made it easier for library authors to upgrade without losing their user base. It could even still help today.
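A rough sketch of the two-process idea, under stated assumptions: the parent starts a second interpreter and exchanges one JSON message per line over pipes. RemoteInterpreter, WORKER_SOURCE and the two exposed functions are hypothetical names. The default executable is the current interpreter only so that the example runs anywhere; a real bridge would pass the path of the other Python version.

```python
import json
import subprocess
import sys

# Source of the worker that would run under the *other* interpreter.
# It reads one JSON request per line and writes one JSON reply per line.
WORKER_SOURCE = r'''
import json, sys
FUNCTIONS = {"upper": lambda s: s.upper(), "add": lambda a, b: a + b}
while True:
    line = sys.stdin.readline()
    if not line:
        break
    request = json.loads(line)
    result = FUNCTIONS[request["func"]](*request["args"])
    sys.stdout.write(json.dumps({"result": result}) + "\n")
    sys.stdout.flush()
'''


class RemoteInterpreter(object):
    """Hypothetical bridge that calls functions living in a child process."""

    def __init__(self, executable=sys.executable):
        # Assumption: `executable` would point at e.g. a Python 2 binary.
        self.proc = subprocess.Popen(
            [executable, "-c", WORKER_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            universal_newlines=True,
        )

    def call(self, func, *args):
        request = json.dumps({"func": func, "args": list(args)})
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        return json.loads(self.proc.stdout.readline())["result"]

    def close(self):
        # Closing stdin makes the worker see EOF and leave its loop.
        self.proc.stdin.close()
        self.proc.wait()
        self.proc.stdout.close()
```

Only JSON-serialisable values cross the boundary here, and every call pays for a round trip through the pipes, which matches the caveat above that performance would be an issue.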
As the author of Inline::Perl5 I can tell you that getting it to a point where I could use Perl 5's database support in Perl 6 was an effort of two fun afternoons of hacking. Once you've got a basic communications channel open, it's really just a matter of making it more comfortable to use and blurring the borders between languages.
Regarding the success of Python 3 vs. Perl 6 I can share a user's perspective: my company is stuck on Python 2 with no sane way to upgrade while we are already using some Perl 6 code in production. And the sole reason for the latter is the ability to combine both Perls in a system.
Posted Feb 22, 2018 6:59 UTC (Thu)
by mb (subscriber, #50428)
[Link] (8 responses)
I don't think this is the case.
It is easily possible to have a program that runs on 3.x and 2.7, as long as new 3.x-only features and old deprecated pre-2.7 features are not used.
Posted Feb 22, 2018 7:41 UTC (Thu)
by niner (subscriber, #26151)
[Link] (6 responses)
Posted Feb 22, 2018 8:35 UTC (Thu)
by rahulsundaram (subscriber, #21946)
[Link] (5 responses)
2.7 didn't exist 10 years back, nor did the 3.x releases that bridged the gap between 2.x and 3.x. If 3.0 had had all of the relevant functionality of the latest 3.x release, if 2.7 had been released before 3.0, and if libraries like six or future had been released along with 3.0, the transition would have taken less time. But the Python community didn't have all the insight into this that they have now.
Posted Feb 22, 2018 18:36 UTC (Thu)
by raven667 (subscriber, #5198)
[Link] (4 responses)
I'm not sure of that. I imagine that there were Python developers who were pointing out the transition difficulties at the time, who saw the engineering and social challenges, but were ignored by the leadership. It might be good for someone more familiar with the details to go back and figure out who was right and who was wrong, to have some accountability for those decisions, to identify who has a good intuition for likely future outcomes and who doesn't. One of humanity's super-powers over other living things is the ability to accurately predict the future, to envision consequences. But at the time decisions get made you don't know for certain what the outcome is likely to be, and without looking back you can't assign weight to conflicting opinions and worldviews, so every decision is like it's being made for the first time.
Posted Feb 23, 2018 20:20 UTC (Fri)
by rahvin (guest, #16953)
[Link] (3 responses)
Posted Feb 26, 2018 21:46 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Feb 26, 2018 22:16 UTC (Mon)
by raven667 (subscriber, #5198)
[Link]
Posted Mar 6, 2018 11:02 UTC (Tue)
by dgm (subscriber, #49227)
[Link]
Who got it right is important, but only so much. You have to take into account the broken clock effect (you know, even a broken clock gives the right time twice a day), so better start listening to people with future-viewing super powers once they have proved themselves, at least, a couple of times.
Posted Feb 26, 2018 21:49 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 21, 2018 10:49 UTC (Wed)
by ceplm (subscriber, #41334)
[Link] (14 responses)
Only during porting did we find out that our glorious test suite was actually completely useless for supporting really large refactoring.
And especially, those of us who were not blessed/cursed by having a native language which fits in the ISO-8859-1 encoding were forced, for the first time in their lives, to stop pretending that the whole world fits into that encoding. They had to read https://wp.me/p83KNI-eH for the first time, and many of them have never forgiven it for the pain it caused them (https://is.gd/vZMzJ0 ;)).
And yes, I think people in ISO-8859-1 languages outside of US-ASCII are even worse than Americans. The latter are (sometimes, rarely) a bit humble because they know their native character encoding is insufficient. French, Germans and other ISO 8859-1 natives look down on Americans, smugly persuaded they know all about foreign languages, and yet they know nothing.
And yes, a bit of self-promotion. I have just finished porting M2Crypto to dual py2k/py3k. It is, with a bit of marketing hype, the most complete Python bindings for OpenSSL (especially useful if you want more than getting 's' in your https). It is available on PyPI, and the GitLab repo with all issues, merge requests and all that jazz is at https://gitlab.com/m2crypto/m2crypto/ .
Posted Feb 22, 2018 15:12 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link] (9 responses)
French needs Œ/œ, which requires at least iso-8859-15 (and it was always easier to migrate directly to UTF-8 than get iso-8859-15 properly supported by US-centric tools)
German also needs capitalized ss nowadays
Replace French and German people by "French and German programmers using Windows with CP 1252 and pretending it is ISO-8859-1" and you may be closer to the truth.
Posted Feb 22, 2018 15:44 UTC (Thu)
by ceplm (subscriber, #41334)
[Link]
Posted Feb 22, 2018 22:52 UTC (Thu)
by anselm (subscriber, #2796)
[Link] (6 responses)
Do we, now? I live in Germany and I don't think I've ever seen a capitalized ß anywhere except in press releases from the Unicode committee.
Posted Feb 23, 2018 4:39 UTC (Fri)
by spaetz (guest, #32870)
[Link] (1 responses)
Posted Feb 23, 2018 11:45 UTC (Fri)
by vstinner (subscriber, #42675)
[Link]
Python 3.7 currently uses Unicode 10.0:
haypo@selma$ ./python
Python 3.7.0a0 (heads/master:b903067, Jun 30 2017, 11:49:25)
It seems like Unicode 10 still uses "SS":
>>> 'ß'.upper()
The German government has to change the Unicode standard :-) Please report this "bug" to the Unicode standard :-) So I close this issue as "third party".
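For anyone who wants to check this locally, here is a short sketch assuming a Python 3 interpreter. It only uses the standard unicodedata module and str methods; the constant names are made up for the example.

```python
import unicodedata

CAPITAL_SHARP_S = u"\u1e9e"  # the capital sharp s discussed in this thread
SMALL_SHARP_S = u"\u00df"    # the ordinary lowercase sharp s

# Version of the Unicode database bundled with this interpreter.
print(unicodedata.unidata_version)

# The capital letter exists as its own code point ...
print(unicodedata.name(CAPITAL_SHARP_S))

# ... and lowercases to the small sharp s, but uppercasing the small
# letter still goes through the two-letter "SS" mapping.
print(CAPITAL_SHARP_S.lower() == SMALL_SHARP_S)
print(SMALL_SHARP_S.upper() == u"SS")
```

So the asymmetry is in the case mappings the Unicode data defines, not in something Python decided on its own.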
Posted Feb 23, 2018 7:00 UTC (Fri)
by tdz (subscriber, #58733)
[Link] (2 responses)
Posted Feb 25, 2018 0:46 UTC (Sun)
by anselm (subscriber, #2796)
[Link] (1 responses)
You don't use “ß” when typesetting in small caps.
Posted Feb 26, 2018 9:54 UTC (Mon)
by smurf (subscriber, #17840)
[Link]
Posted Feb 23, 2018 10:03 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
(That's where the smug people ceplm complains about come from, they will pretend their native language only needs ISO-8859-1, and work in an en_US locale with a qwerty keyboard, in the hope it convinces @boss to postpone i18n work. They mostly only end up convincing themselves.)
IIRC capital ss adoption was not uniform in German-speaking countries, with some sold on it and others more cautious.
Posted Feb 28, 2018 10:19 UTC (Wed)
by tedd (subscriber, #74183)
[Link]
Oh God, too much Reddit.
I thought there was a WWII joke in there somewhere.
Posted Feb 23, 2018 0:05 UTC (Fri)
by raiph (guest, #89283)
[Link] (3 responses)
* is unaware that Unicode is the planet's standard for text characters (seems unlikely); or
* has decided to permanently limit Python's built-in character handling to work only with the subset of human languages represented by the most politically powerful within the tech community in the 90s (if true, seems incredible); or
* is aware that Python 3 ignores this issue and is contemplating a Python 4 that will radically alter its character handling.
Does anyone know which of these three applies?
Posted Feb 23, 2018 8:59 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (2 responses)
My opinion: better late than never.
Posted Feb 23, 2018 10:58 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Posted Feb 23, 2018 20:26 UTC (Fri)
by rahvin (guest, #16953)
[Link]
Posted Feb 21, 2018 7:13 UTC (Wed)
by theophrastus (guest, #80847)
[Link]
There were more than a few of us who were ready to throw up our hands and stick with python2. But we persisted, and I think we've got our in-house data harvesting all converted to python3 now. But I've never encountered a process where the solution felt more Pyrrhic. Too bad there aren't merit badges for this sort of thing.
Posted Feb 21, 2018 14:43 UTC (Wed)
by jhoblitt (subscriber, #77733)
[Link] (3 responses)
Posted Feb 21, 2018 21:52 UTC (Wed)
by ceplm (subscriber, #41334)
[Link] (2 responses)
A nice thing about six is that it is so simple, a single file, so it doesn't have to be a dependency of your project; it can simply be bundled and updated whenever you feel like it.
Posted Feb 21, 2018 22:54 UTC (Wed)
by jhoblitt (subscriber, #77733)
[Link]
Posted Feb 23, 2018 10:05 UTC (Fri)
by jengelh (guest, #33263)
[Link]
Posted Feb 21, 2018 20:53 UTC (Wed)
by togga (subscriber, #53103)
[Link] (7 responses)
Posted Feb 21, 2018 22:06 UTC (Wed)
by ceplm (subscriber, #41334)
[Link] (6 responses)
* How well are you doing with your Perl 6 code? (Not to mention some nasty incompatibilities between some minor versions of Perl 5, e.g., in relation to Unicode.)
* Ruby 1.8 -> 1.9? Do I need to say more? (That's a minor version again.) And of course, 1.9 -> 2.0 wasn't without its pains either.
* NodeJS? v0.10.* is from 2013 and it is now EOS. All versions are based on V8, with various levels of incompatibility.
* Java 5, 6, or sorry 7, and whatever version it is now? Each one of them needs substantial fixes and corrections.
And of course, PHP managed to be incompatible between its major versions in a similar manner (<5, 5+) even without resolving the difficult problems py3k resolved (ehm, PHP still doesn't have proper Unicode support, does it?).
The lesson is simple: do have your test suite ready and be prepared to spend some time before upgrading.
Actually, ignoring the py2k->py3k transition, I would argue that Python is way more stable than others: programs from Python 1.5 (that's December 31, 1997) usually run quite well on 2.7.
Posted Feb 21, 2018 22:46 UTC (Wed)
by sfeam (subscriber, #2841)
[Link]
Posted Feb 21, 2018 23:08 UTC (Wed)
by jhoblitt (subscriber, #77733)
[Link]
Early perl5's Unicode handling problems were fairly minor and string handling generally just worked. py3 still suffers from encoding problems even when everything is marked as `u` strings and all file I/O has an explicit encoding declared. Often, an encoding exception bubbles up from some library with no indication of the file being read or the offending string.
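One way to soften the "no indication of the file" problem in your own code is to wrap reads so that the path and the offending bytes travel with the error. This is only a sketch: read_text_with_context is a hypothetical helper, and it does nothing for exceptions raised deep inside third-party libraries.

```python
import io


def read_text_with_context(path, encoding="utf-8"):
    """Read a text file, naming the file if its bytes cannot be decoded."""
    try:
        # io.open takes an explicit encoding on both Python 2.7 and 3.x.
        with io.open(path, "r", encoding=encoding) as handle:
            return handle.read()
    except UnicodeDecodeError as exc:
        # Slice out the bytes the codec rejected so they show up in the message.
        bad_bytes = exc.object[exc.start:exc.end]
        raise ValueError(
            "%s: cannot decode as %s (%s, offending bytes %r)"
            % (path, encoding, exc.reason, bad_bytes)
        )
```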
There is no ambiguity with `use v6;`, which does not even compare to the wheel-of-fish that `/usr/bin/python` has been even after PEP394, which did not come out until after every distro had done something different.
My impression is that the ruby 1.8.x -> 1.9.x migration was bumpy because many of the new 1.9 idioms were not supported by 1.8. This led to folks bending over backwards to keep gems compatible with 1.8. Other than C code, can you give an example of 1.8 code that's broken under 2.4? AFAIK, Ruby has never had a world-breaking change.
Posted Feb 22, 2018 1:02 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Perl5/6 is the only comparable break, but unlike Python 2, Perl5 is well supported and is still being developed. There are ways to integrate both languages in one process. And last but very much not least, Perl6 doesn’t have a GIL.
I don’t know much about Ruby or NodeJS, but Java is almost perfectly backwards compatible. I’m still using a library compiled for Java5 on the newest Java9. They are also code-compatible.
Posted Feb 22, 2018 5:49 UTC (Thu)
by jezuch (subscriber, #52988)
[Link] (2 responses)
It doesn't. Upgrading Java is a headache for devops but not for programmers. (Changes in GC algorithm behaviour are much more risky than any other.)
And your snipe about "whatever version it is now" is misguided (it's 9 btw). Java was famously slow-moving but that's changing now.
Posted Feb 22, 2018 15:50 UTC (Thu)
by ceplm (subscriber, #41334)
[Link] (1 responses)
Posted Feb 25, 2018 9:50 UTC (Sun)
by cpitrat (subscriber, #116459)
[Link]
2to3 was a bad idea
2+3 in the same codebase
https://www.perl.com/article/an-open-letter-to-the-perl-c...
So in the transition period it's just needed to first port to 2.7 and then to 3.x.
Code embedding is not needed at all, because the syntax is fully compatible. The rest is handled by the highly dynamic nature of the language. You can just do if 3: foo; else bar for API differences in the libraries.
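A minimal sketch of that kind of runtime branch, using one standard-library module that moved between 2.x and 3.x. host_of is a hypothetical helper; the two import paths are the real locations of urlparse on each major version.

```python
import sys

# Pick the right import location once, at module load time.
if sys.version_info[0] >= 3:
    from urllib.parse import urlparse
else:
    from urlparse import urlparse  # Python 2 location of the same function


def host_of(url):
    """Return the network location part of a URL on either interpreter."""
    return urlparse(url).netloc
```

Everything after the branch uses urlparse without caring which interpreter it is running on, which is the pattern that libraries such as six package up for many modules at once.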
> It might be good for someone more familiar with the details to go back and figure out who was right and who was wrong, to have some accountability for those decisions, to identify who has a good intuition for likely future outcomes and who doesn't.
I don't think pointing fingers will get you anywhere. Regardless of who was responsible, they appear to have mostly figured it out at this point. People hopefully learned from the mistake. Besides, it's probably more than one or even two people that made the initial decision. IME pointing fingers is literally the worst thing you can do. It's a destructive process to attempt to assign blame, and it often destroys team cohesion.
Python doesn't do politics :-) Python implements the Unicode standard.
>>> unicodedata.unidata_version
'10.0.0'
'SS'
"I would argue that Python is a way more stable than others: program from Python 1.5 (that's December 31, 1997) run usually quite well on 2.7". That is very much not my experience. Incremental porting to 2.3 2.4 2.5 2.7 has each time been a major headache, to the point that much older code is never ported at all. This leads to production machines with parallel installation of all python versions back to 2.4 just so that existing 3rd-party applications can continue to run.
