
The Managed Runtime Initiative

The Managed Runtime Initiative

Posted Sep 6, 2010 22:20 UTC (Mon) by Blaisorblade (guest, #25465)
In reply to: The Managed Runtime Initiative by k8to
Parent article: The Managed Runtime Initiative

Nah, the behavior is quite different. Sun JVM seems very good at keeping a large number of objects in memory and stalling reclaiming their space until it can do a large number at once, causing stalls, and wasting memory. Other systems, like Python, seem to reclaim the objects much more incrementally which might not be as efficient in a long term view.

As just said elsewhere by Nix, since Python uses standard reference counting, it is not efficient even in the short term view, because copying a pointer to the stack causes a heap mutation, even to pass a parameter to a procedure. That's why trying to support multithreading gave a 2x slowdown. Given that other portions of code have been optimized, I believe the slowdown nowadays would be bigger. And the slowdown you get with Java and a smaller heap is probably still not comparable to the one you get in Python (no less than 10x).
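
To make the heap-mutation point concrete, here is a minimal sketch, assuming CPython (where sys.getrefcount exposes the per-object counter). The helper name refcount_delta_after_aliasing is made up for illustration. Every extra pointer stored to the object rewrites the count kept in the object's own header, even though the object itself is never modified by the program:

```python
import sys


def refcount_delta_after_aliasing(n_aliases):
    """Return (growth, leftover) of an object's reference count when
    n_aliases extra pointers to it are stored and then dropped again.

    Assumes CPython-style reference counting; other implementations
    (e.g. tracing-GC based ones) may report different numbers.
    """
    payload = object()
    before = sys.getrefcount(payload)

    # Copying the pointer n_aliases times: each copy increments the
    # counter stored inside `payload`, i.e. a write to the heap.
    aliases = [payload] * n_aliases
    after = sys.getrefcount(payload)

    # Dropping the copies decrements the same counter again.
    del aliases
    restored = sys.getrefcount(payload)

    return after - before, restored - before


grew, leftover = refcount_delta_after_aliasing(5)
print("count grew by", grew, "and ended", leftover, "above the start")
```

The measurement is taken the same way before and after, so the temporary reference that sys.getrefcount itself holds cancels out of the differences.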

Python apps seem to have decent performance when they do little and the rest is written in C, but as soon as you try to actually do something in Python code, you lose. I don't get how the same community, which prefers C to Java for performance reasons, can even mention Python. I hope it's not the same people at least.

There are many realtime GCs, and each of them is better than Python's. In particular, Cliff Click described the pauseless GC, with its amazing performance and small overhead, somewhere on this blog, which I recommend to those interested in the field (even if it is quite technical). He describes it running on their special CPU, but it seems they could port it to x86, and the code is in this release.

For Python, refcounting was just a bad choice in the beginning, and it's now impossible to get rid of it without rewriting everything - and they have neither the manpower nor the will. And all of this was well known: people implementing Lisp, Smalltalk and Self have known it for the last 20 years, together with a number of other techniques.



The Managed Runtime Initiative

Posted Sep 6, 2010 23:13 UTC (Mon) by nix (subscriber, #2304)

since Python uses standard reference counting, it is not efficient even in the short term view, because copying a pointer to the stack causes a heap mutation, even to pass a parameter to a procedure. That's why trying to support multithreading gave a 2x slowdown.
Oh no, there are much more appalling reasons why Python's multithreading is awful. The GIL acquisition macros were never designed with multiple CPUs in mind: on a one-thread-of-execution machine (what we used to call 'one CPU'), it all works fine, but if there's more than one, they race with each other and often end up bouncing ownership back and forth and both blocked for astounding periods of time. There's an awesome presentation on the subject, strongly recommended.
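
A minimal sketch of how one might observe this, assuming a GIL-enabled CPython build. The function names and the loop size are made up for illustration, and the numbers depend heavily on the interpreter version and the machine, so none are quoted here. It runs the same CPU-bound work twice in one thread, then once in each of two threads, and reports the wall-clock time of both:

```python
import threading
import time


def countdown(n):
    # Pure-Python, CPU-bound loop: it never blocks on I/O, so it gives up
    # the GIL only when the interpreter forces a switch.
    while n > 0:
        n -= 1


def time_sequential(n):
    """Wall-clock seconds to run countdown(n) twice in the calling thread."""
    start = time.perf_counter()
    countdown(n)
    countdown(n)
    return time.perf_counter() - start


def time_threaded(n):
    """Wall-clock seconds to run countdown(n) once in each of two threads."""
    threads = [threading.Thread(target=countdown, args=(n,)) for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    N = 2_000_000  # arbitrary size, chosen only to keep the run short
    print("sequential: %.3f s" % time_sequential(N))
    print("threaded:   %.3f s" % time_threaded(N))
```

With a GIL the two threads cannot execute bytecode at the same time, so the threaded run is not expected to be faster; whether it is actually slower, and by how much, is what the contention described above determines.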

The Managed Runtime Initiative

Posted Sep 10, 2010 21:07 UTC (Fri) by Blaisorblade (guest, #25465)

Ah, that's right, I even knew of that presentation (and watched it again). I kind-of guessed that people were working on that (without any evidence other than "somebody noticed"), and I had no idea about how to fix that.

Moreover, there was work (from Antoine Pitrou IIRC) to change the policies of the GIL - it was/is released/reacquired every 100 opcodes (which might be ridiculously small or too large, depending on the opcodes), while A. Pitrou wanted to have some saner scheduling.

Releasing it every timeslice (i.e. ~80/100 ms IIRC) would probably help with the problems of that presentation - I should have another look to know if this makes sense. A simple user-level scheduler for GIL acquisition would maybe be needed in the worst case, but hopefully not.
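
For experimenting with such policies: CPython 3.2 and later replaced the opcode-count check with a time-based switch interval, exposed as sys.getswitchinterval() and sys.setswitchinterval(). A minimal sketch, assuming such an interpreter; the helper name with_switch_interval and the 0.08 s value are only illustrations of the "release every timeslice" idea, not a recommended setting:

```python
import sys


def with_switch_interval(seconds, fn, *args):
    """Call fn(*args) with the GIL switch interval temporarily set to
    `seconds`, restoring the previous value afterwards.

    Hypothetical helper for experiments; requires CPython 3.2 or later.
    """
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        return fn(*args)
    finally:
        # Always restore, even if fn raises.
        sys.setswitchinterval(previous)


# Example: observe the interval that is in effect inside the call.
seen = with_switch_interval(0.08, sys.getswitchinterval)
print("interval inside the call:", seen)
print("interval afterwards:", sys.getswitchinterval())
```

The interval is only a request for the running thread to give up the GIL; which thread gets it next is still decided by the operating system scheduler.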

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds