
Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 20:49 UTC (Mon) by ehiggs (subscriber, #90713)
In reply to: Szorc: Mercurial's Journey to and Reflections on Python 3 by HelloWorld
Parent article: Szorc: Mercurial's Journey to and Reflections on Python 3

Meanwhile much of the Java ecosystem is stuck on Java 8.

Python's migration problem arose because Python 3 code could not load and run Python 2 bytecode, or otherwise use Python 2 files, until they were all ported, and vice versa. This meant that any project that wanted to migrate had to wait until 100% of its dependencies were already on Python 3. And any library faced a huge window in which it had to stay compatible with both 2 and 3 (so no one could take advantage of new features).

Without this migration path, developers had to perform a big-bang migration. The article calls it a "flag day" migration.

This is discussed in the section labelled "Commentary on Python 3" which is well written and easy to follow.
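
As a rough illustration of why a Python 3 interpreter cannot simply load unported Python 2 files, here is a minimal sketch (the sample sources and the helper name `compiles_on_this_interpreter` are made up for this example). It asks the running interpreter to compile a Python 2 style `print` statement and its Python 3 equivalent:

```python
# Two spellings of the same one-line program.
PY2_SOURCE = "print 'hello'\n"     # Python 2 print statement
PY3_SOURCE = "print('hello')\n"    # Python 3 print function call


def compiles_on_this_interpreter(source):
    """Return True if the running interpreter can compile the given source."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print("py2 source compiles:", compiles_on_this_interpreter(PY2_SOURCE))
    print("py3 source compiles:", compiles_on_this_interpreter(PY3_SOURCE))
```

On a Python 3 interpreter the first source is rejected at compile time, so a dependency written this way cannot even be imported, let alone called.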



Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 21:28 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (27 responses)

> Meanwhile much of the Java ecosystem is stuck on Java 8.
That's not quite true. Java 9 introduced modules, which many projects are cheerfully ignoring. But most Java 8 code works just fine in Java 9, without being module-aware.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:09 UTC (Mon) by ehiggs (subscriber, #90713) [Link] (9 responses)

>That's not quite true

It is absolutely true. Libraries target JDK 8 because that's what Android is stuck on. It's still a hassle and Java's type system didn't save it from the problems.

Java 9 was EOL in March 2018. Java 10 was EOL in September 2018. And if you're a commercial user of Java 8 and don't have a license with Oracle or anyone else, support was EOL in January 2019.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:12 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

> It is absolutely true. Libraries target JDK 8 because that's what Android is stuck on. It's still a hassle and Java's type system didn't save it from the problems.
The thing is, it's easy to have a library targeting JDK 8 to work on JDK 11. I have several packages that are doing that. You basically need to refrain from using JDK>8 features and you'll be fine.

This didn't work with Python, the transition from 2 to 3 required massive rewrites.

> And if you're a commercial user of Java 8 and don't have a license with Oracle or anyone else, support was EOL in January 2019.
Just use https://aws.amazon.com/corretto/ , it'll be supported for a loooong time.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:22 UTC (Mon) by ehiggs (subscriber, #90713) [Link] (1 responses)

>You basically need to refrain from using JDK>8 features and you'll be fine.

Indeed and this is not the desired state of affairs. And Java's type system did not save it.

> This didn't work with Python, the transition from 2 to 3 required massive rewrites.

Indeed and this is not the desired state of affairs. And Python's type system did not cause it.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:30 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> Indeed and this is not the desired state of affairs. And Java's type system did not save it.
It did. I can run JDK 8 code in JDK 11 without any modifications, mixing and matching it freely with newer versions.

> Indeed and this is not the desired state of affairs. And Python's type system did not cause it.
Yes, it did. The string type was fundamentally changed, along with a significant chunk of the API.
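
To make the string-type change concrete, here is a small sketch of Python 3 behaviour (the sample text and the helper names `describe` and `try_concat` are arbitrary choices for illustration). Text (`str`) and binary data (`bytes`) are distinct types that cannot be mixed implicitly, whereas Python 2's `str` served both roles:

```python
def describe(value):
    """Return the type name and length of a text or bytes value."""
    return type(value).__name__, len(value)


def try_concat(a, b):
    """Concatenate two values, reporting a TypeError instead of raising it."""
    try:
        return a + b
    except TypeError:
        return "TypeError"


text = "na\u00efve"            # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: a sequence of integers in range 0..255

if __name__ == "__main__":
    print(describe(text))          # length counted in code points
    print(describe(data))          # length counted in encoded bytes
    print(try_concat(text, data))  # mixing str and bytes is refused
    print(data[0])                 # indexing bytes yields an int
```

Code that relied on the old implicit mixing has to decide, at every boundary, whether a value is text or bytes.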

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 14, 2020 3:11 UTC (Tue) by cesarb (subscriber, #6266) [Link] (1 responses)

> The thing is, it's easy to have a library targeting JDK 8 to work on JDK 11. [...] You basically need to refrain from using JDK>8 features and you'll be fine.

Unless your package, or one of its dependencies, does bytecode manipulation and uses an old version of the bytecode manipulation library, which chokes on classes compiled for a newer JDK. Or your package depends on one of the several J2EE libraries which were removed by JDK 11 (some of them having no replacement outside of the JDK). Or your package, or one of its dependencies, chokes on the replacement of one of the several J2EE libraries which were removed by JDK 11, because it uses an old version of the bytecode manipulation library, and the replacement J2EE library was compiled for a newer JDK.

As late as the end of 2019, some packages were still announcing Java 9 compatibility fixes. For some reason, Java 9 had more compatibility issues than usual, and Java 11 made it worse by completely removing components first deprecated in the short-lived Java 9 release.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 14, 2020 6:12 UTC (Tue) by ssmith32 (subscriber, #72404) [Link]

Apache Beam is a reasonably popular library stuck on 8, for similar reasons.

Of course, it is doing some pretty wacky stuff. But it's the only option for some things (e.g. GCP Dataflow).

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 19:52 UTC (Wed) by nim-nim (subscriber, #34454) [Link] (3 responses)

And do you really think Android getting stuck has no relationship at all with Google getting sued and redirecting its investments elsewhere?

The Java leadership has been busy making itself irrelevant by alienating most of the rest of the IT world.

Though I wonder where that will leave all the Apache foundation Java projects. They can’t survive in a closed circuit loop forever. Scala is not the solution, its adoption outside the existing Java world is nonexistent.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 19:54 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Google is moving towards Kotlin, which runs fine on JVM 8. There are also third parties maintaining JVM forks (Amazon is one, with its Corretto project). Java will be fine.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 9:09 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (1 responses)

That means some more years fighting on the governance of Java, on what is a real Java implementation, what is not, what APIs can/should be used or not, etc.

Who wants to deal with this crap forever? Easier to port to another language and let someone else fatten lawyers.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 16:20 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Who cares? OpenJDK is under GPL so Amazon can freely maintain its fork. They just need to avoid calling it "Java" to avoid trademark issues.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:16 UTC (Mon) by HelloWorld (guest, #56129) [Link] (16 responses)

> That's not quite true. Java 9 introduced modules which many projects are cheerfully ignoring.
Well, that might be because they suck. They've been working on this stuff for years and yet it's still not possible to have multiple versions of the same library in a single program. So if you want to use two different libraries that both depend on a third library but in different versions, you lose. Unless of course you use OSGi which has been around for, what, 20 years now and already solved this problem when it first came out.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 13, 2020 23:22 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> Well, that might be because they suck.
Indeed they do.

However, I can't fault the way Snoracle introduced them - none of my module-unaware code broke during JDK9 migration.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 14, 2020 1:44 UTC (Tue) by Conan_Kudo (subscriber, #103240) [Link] (14 responses)

> They've been working on this stuff for years and yet it's still not possible to have multiple versions of the same library in a single program.

Oh God, no. I like my sanity, thank you very much. I am *totally* OK with that restriction and I would rather nobody ever lifted it.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 14, 2020 6:58 UTC (Tue) by HelloWorld (guest, #56129) [Link]

I see, apparently you don't like modularity.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 14, 2020 19:02 UTC (Tue) by HelloWorld (guest, #56129) [Link] (12 responses)

If I use library X, and X uses library Y internally but doesn't expose its types in its API, then Y is an implementation detail of X. Now if I also happen to use Y for some unrelated purpose, then upgrading either X or Y might break my application. IOW, I am now affected by X's implementation details. That is the antithesis of modularity, because modularity means that the implementation details of a module don't affect its users. So, could you explain why you believe this breach of modularity is a good thing?

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 8:26 UTC (Wed) by NYKevin (subscriber, #129325) [Link] (11 responses)

It's a bad thing on the development side. But it's (arguably) a good thing on the packaging and deployment side. As an SRE, I want to deploy one, complete, and entirely self-contained artifact into production, and ideally, I want it to behave exactly the same every time (because I'm going to deploy it to N machines, and I don't want half of them to explode and the other half to melt). It'd also be swell if that artifact would produce logs such as "This version of Y is 1.2.3 and it was pulled in by X, don't confuse it with the freestanding Y that is version 1.3.2." And, of course, I want some kind of ironclad guarantee that the two versions of Y "can't see each other" and therefore won't interfere in unexpected ways at runtime.

Even given all of that, I'm still not thrilled with this idea, because I *know* that sooner or later, some part of X is going to somehow call the "wrong Y," and I'll get paged at 3 AM when it crashes in production. And I'm sure that someone will have written a comment somewhere long ago, assuring the reader that, no, of course it's "impossible" for X to call the wrong Y, don't be silly, you see, they are entirely separate, there's no possible way for them to interact. Except for that one obscure side channel the SWE forgot about, where on alternate Thursdays when the moon is full, the software briefly tries to bind TCP port 12345 at exactly the stroke of midnight, in order to practice speaking a profane and blasphemous protocol. A protocol defined only in an RFC that the IETF subsequently declared Librorum Prohibitorum, and which must now be obtained by special dispensation from Vint Cerf. Why does it do this? Because some client asked for it five years ago and everyone has now bleached that contract from their collective unconscious. Anyway, the second version of Y fails to bind the port on EADDRINUSE, and the error gets swallowed because don't you know, in a containerized setup, you're not supposed to get EADDRINUSE, so obviously it's a /* can't happen */ situation. Then X, blissfully unaware, connects to the port and talks to the wrong Y, and the wrong Y does something subtly different from what X wanted, and if you're very lucky, this merely causes the app to crash.

In theory, those are mostly solvable problems. In practice, the language is not actually in a position to solve them (Are you really going to stop the two versions of Y from interacting with any kind of global state, including the filesystem? Unless your language is Haskell, or perhaps an extremely locked down dialect of JavaScript, that isn't realistic as a language-level restriction.). They are institutional problems, and require institutional solutions. As it turns out, one of those solutions can be* "library versions are bumped on a fixed schedule, keep up or else your code stops building and we stop deploying it." A "one library version per process" rule is a straightforward way of enforcing that, but of course you could just as easily attach some kind of custom restriction to the build process instead. It's just a matter of convenience.

* "Can be," not "has to be." There are other solutions, with various advantages and tradeoffs, which are beyond the scope of this discussion.
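
The port collision described above can be reproduced in a few lines. This is only a sketch under stated assumptions: the helper `try_bind` is hypothetical, the loopback address and an OS-assigned port stand in for "TCP port 12345", and both sockets live in one process to play the role of two copies of Y:

```python
import errno
import socket


def try_bind(port):
    """Bind and listen on localhost; return (socket, None) or (None, errno)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen(1)
        return sock, None
    except OSError as exc:
        sock.close()
        return None, exc.errno


# The first copy of the library gets a free port chosen by the OS (port 0).
first, first_error = try_bind(0)
port = first.getsockname()[1]

# The second copy asks for the same port and is refused.
second, second_error = try_bind(port)

first.close()

if __name__ == "__main__":
    print("second bind failed with EADDRINUSE:", second_error == errno.EADDRINUSE)
```

If the caller discards `second_error`, any client connecting to that port silently reaches the first copy, which is the failure mode the story is about.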

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 9:10 UTC (Wed) by roc (subscriber, #30627) [Link] (2 responses)

Are such environmental collisions any more of a problem in practice for multiple versions of the same library coexisting than for different libraries coexisting?

I'm familiar with libraries stomping on each other at run-time, e.g. with races around fork(). I'm familiar with C libraries stomping on each other with symbol collisions during linking. I've even had listening port collisions with two components in the same container. But my Rust project links multiple versions of some libraries (which libraries, and which versions, change over time), and I haven't had any problems with that so far in practice.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 21:23 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

It is much more likely that two versions of the same library will share symbols than different libraries. The latter can probably be avoided by not exporting all symbols by default or adding a few strategic `static` keywords. The former…well, Python2 and Python3 can't be mixed in the same process because they share symbol names. Same with glib and other projects that tend to be "good citizens" with their ABI management.

Rust avoids the symbol problems, but still has woes with libraries trying to control any global resource (signal handlers, environment variables, etc.).

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 21:38 UTC (Wed) by roc (subscriber, #30627) [Link]

Yes I understand that for C-like linkage multiple versions of the same library are a disaster.

Signal handlers are just a massive problem in general --- for different libraries as well as for two versions of the same library. For that and other reasons I have not encountered any Rust crates (other than tokio-signal) that set signal handlers. Likewise setting environment variables is a minefield libraries should avoid under any circumstances.

So I agree that two versions of the same library are more likely to hit these issues than two different libraries, but I'm not convinced that *in practice* it's really worse, for Rust.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 9:48 UTC (Wed) by HelloWorld (guest, #56129) [Link] (2 responses)

There is no global state in Java, “static” variables are per-classloader, and of course you can't load two classes of the same name in a single class loader, so different versions of the same class will have their own state. And if they interact through the file system or a TCP port, then two separate processes using different versions of the same library would also be affected, so such a library would be broken whether or not you use two different versions in a single process.

Besides, it's not like Java will detect that there are two different versions of a library on the classpath unless you use special measures to prevent that. It'll just crash later when something tries to call a method that isn't there any more or something like that, so it's not like your approach of “let's just forbid it” prevents anything.

So yeah, I'm not buying it. Libraries should be isolated from each other, anything else just doesn't scale…
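
The claim about per-classloader state is specific to Java, but the idea can be sketched with a loose Python analogue (this is not Java's class loading, and the module source, the aliases and the helper `load_copy` are all invented for the example). The same source file is loaded twice under different names, and each loaded copy keeps its own module-level state:

```python
import importlib.util
import os
import tempfile

# A tiny hypothetical library with module-level ("static") state.
SOURCE = (
    "counter = 0\n"
    "\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)


def load_copy(path, alias):
    """Load the file at `path` as an independent module object named `alias`."""
    spec = importlib.util.spec_from_file_location(alias, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "libcounter.py")
    with open(path, "w") as handle:
        handle.write(SOURCE)
    copy_one = load_copy(path, "libcounter_v1")
    copy_two = load_copy(path, "libcounter_v2")

# Each copy mutates only its own counter.
copy_one.bump()
copy_one.bump()
copy_two.bump()

if __name__ == "__main__":
    print(copy_one.counter, copy_two.counter)
```

The two copies are isolated as far as language-level state goes; anything they do to the process or the operating system is a separate question.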

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 1:57 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (1 responses)

> There is no global state in Java, “static” variables are per-classloader, and of course you can't load two classes of the same name in a single class loader, so different versions of the same class will have their own state. And if they interact through the file system or a TCP port, then two separate processes using different versions of the same library would also be affected, so such a library would be broken whether or not you use two different versions in a single process.

You see, this kind of clever thinking on the SWE side is why the average SRE has a drinking problem. I told you that there would be a comment to that effect, and you *actually wrote it for me,* apparently in all seriousness believing it would change my mind.

Realistically, every major operating system has process-wide mutable parameters (the working directory, the umask, UID/GIDs, stdin/out/err redirection, etc.), and while Java may try very hard to sandbox those parameters, you can always call out to native code and manipulate them anyway.
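
A minimal sketch of one such process-wide parameter, the working directory, shown in Python rather than Java (the two "libraries" are hypothetical functions invented for illustration). One piece of code changes the working directory, and another piece of code in the same process sees its relative paths resolve differently:

```python
import os
import tempfile


def library_a_setup(workdir):
    """Hypothetical library A: changes the process working directory."""
    os.chdir(workdir)


def library_b_resolve(relative_name):
    """Hypothetical library B: resolves a relative path against the cwd."""
    return os.path.abspath(relative_name)


original_cwd = os.getcwd()
before = library_b_resolve("data.txt")

with tempfile.TemporaryDirectory() as tmp:
    try:
        library_a_setup(tmp)
        after = library_b_resolve("data.txt")
    finally:
        # Restore the cwd before the temporary directory is removed.
        os.chdir(original_cwd)

if __name__ == "__main__":
    print("same path before and after:", before == after)
```

No language-level isolation between the two functions would change this outcome, because the working directory belongs to the process, not to either library.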

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 9:01 UTC (Thu) by HelloWorld (guest, #56129) [Link]

> apparently in all seriousness believing it would change my mind.
Yes, I thought that arguments could convince someone to change their opinion, silly me. Apparently this is more of a religious thing for you...

Of course, 99% of libraries *don't* call out to native code, so what you're saying is that we can't have the solution for the 99%, because it might not work for the 1%, and of course you don't have a solution for the remaining cases either.

Besides, all this stuff about the cwd, the uid/gid, stdio redirection etc. is complete hogwash, because these are shared among *different* libraries as well, and therefore any library that relies on any of these to be in any particular state is broken to begin with, whether or not you allow multiple versions to be loaded. It's a completely unrelated problem.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 15, 2020 19:58 UTC (Wed) by nim-nim (subscriber, #34454) [Link] (4 responses)

It’s not a good thing on the deployment side either. It leads to the slow accumulation of multiple versions of the same thing, which is bad for resource use and performance, and leads to death marches when a problem affecting a wide range of versions is found (as is always eventually the case).

At that point, the pyramid crumbles under the weight of its technical debt and wipes out all past “savings”.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 9:09 UTC (Thu) by HelloWorld (guest, #56129) [Link] (3 responses)

Now that you've explained all the problems, it's time to start talking about solutions. Say I depend on two different libraries that I don't have control over and that depend on incompatible versions of a third library. Now what?

The fact of the matter is that this problem doesn't go away on its own, and if the platform doesn't solve it, people come up with other solutions. For Java that is JarJar Links...

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 9:33 UTC (Thu) by nim-nim (subscriber, #34454) [Link] (2 responses)

Now the correct software engineering solution is to help port whichever part of the stack depends on a lagging version of the lib to a common supported version (or help port it to something else if the third party lib is so broken that porting is more expensive than dropping it).

That’s the inherent cost of using third party code. Don’t like it? Write your own code.

Engineering means delivering reliable solutions. Not letting problems fester in dark places.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 14:19 UTC (Thu) by HelloWorld (guest, #56129) [Link]

Bullshit! What happens in practice in larger projects isn't that the whole stack is migrated to the new version, but that people stick with the status quo, because there's no way to do the migration piecemeal and it's too large a disruption to do it in one step. I've seen this happen even in relatively small projects (< 50,000 LOC), and it's bound to be much worse in larger ones.

Szorc: Mercurial's Journey to and Reflections on Python 3

Posted Jan 16, 2020 14:22 UTC (Thu) by HelloWorld (guest, #56129) [Link]

Besides, “letting problems fester in dark places” isn't a solid technical argument; rather, it's just rhetoric. Perhaps a new, incompatible release of a library only improves matters in a way that isn't relevant for my particular use case, and in that case, the correct response is “if it ain't broke don't fix it”.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds