LWN: Comments on "Memory use in CPython and MicroPython" https://lwn.net/Articles/725508/ This is a special feed containing comments posted to the individual LWN article titled "Memory use in CPython and MicroPython". en-us Sat, 08 Nov 2025 14:18:53 +0000 Sat, 08 Nov 2025 14:18:53 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Memory use in CPython and MicroPython https://lwn.net/Articles/726974/ https://lwn.net/Articles/726974/ pboddie <div class="FormattedComment"> That's a very interesting reference you provide. It is disappointing that the response was more or less "we don't like people misusing pointers because we mess around with pointers a lot ourselves".<br> <p> It's true that CPython (at least 2.x) can get a lot of performance benefits from just special-casing integer operations. The virtual machine effectively has explicit tests for integers in all the operator bytecode handlers, and upon finding integer operands, just executes the fast integer-only operation (with overflow tests, of course). (It is amusing that people in the referenced discussion questioned the need for integer-related optimisations given the presence of these existing ones in CPython.)<br> <p> I found it disheartening when evaluating integer-heavy benchmark performance of a Python-like language compiler/translator that I've written, testing it against CPython, because CPython's "fast path" trickery does remove a lot of the normal dispatch overhead. I had originally only intended to have a faster generic dispatch mechanism and had assumed that it would outperform that of CPython in most cases.<br> <p> So I introduced tagged integers in order to avoid allocation overhead for integer objects, which CPython addresses by employing lists of preallocated objects, and by also short-circuiting the usual dispatch mechanism. Well, why not play by the same rules?<br> </div> Sun, 02 Jul 2017 16:07:30 +0000 Memory use in CPython and MicroPython https://lwn.net/Articles/726612/ https://lwn.net/Articles/726612/ foom <div class="FormattedComment"> I actually implemented tagged integers in CPython a decade ago on a whim. The performance improvement seemed relatively promising, IMO.<br> <p> <a href="https://mail.python.org/pipermail/python-dev/2004-July/046139.html">https://mail.python.org/pipermail/python-dev/2004-July/04...</a><br> <p> It was very nearly a trivial change. It didn't add much complexity to CPython, and only _barely_ broke the C API -- in basically the same manner that CPython broke it for other reasons in the meantime: "foo-&gt;ob_type" and "foo-&gt;ob_refcnt" should no longer be accessed directly, but through a macro.<br> <p> That global search&amp;replace of direct field access to use of a macro was almost the entire extent of the diff.<br> <p> More recent python verisons already introduced macros for those (Py_REFCNT and Py_TYPE), which are used in most of the codebase, because the definition of PyObject_HEAD changed incompatibly a while back. So, making this change is probably even easier nowadays. It's conceivable it's more valuable, too, given the removal of the int type. But I don't really track python development anymore.<br> <p> <p> </div> Wed, 28 Jun 2017 05:01:24 +0000 Memory use in CPython and MicroPython https://lwn.net/Articles/726584/ https://lwn.net/Articles/726584/ lhastings <div class="FormattedComment"> CPython 2 had two integer types: the "int" and the "long". An "int" was a fast machine-native integer; a "long" was an arbitrary-precision integer. 
This bifurcation resulted in occasional dumb bugs (isinstance(x, int) would return False if x was a long).

For CPython 3, Guido decreed that "long" was the new "int". The machine-native integer type was removed, and "int" became the arbitrary-precision integer. This was proposed and discussed in PEP 237:

https://www.python.org/dev/peps/pep-0237/

Tagged ints would add an enormous amount of complexity to CPython, and I suspect they would break the C API. In general, CPython's goal is to make life nicer for the programmer, and if that means spending more CPU or memory to get there, so be it. I've never asked Guido about tagged ints in CPython, but I think we have prima facie evidence that he's not interested -- if he were, they probably would have gone in at some point over the last 25 years.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726381/
Posted by eklitzke on Sat, 24 Jun 2017 00:12:22 +0000:

Small integers are interned, although obviously that's quite different from tagging.

Mainly this is due to how the C API works. Everything in Python is represented as a PyObject, which makes working with the C API very uniform and easy. Implementing optimizations like tagged integers would improve the speed of the interpreter, but at the cost of complexity in the C API. The issue with reference counting is similar: it's uncommon for new interpreters to use reference counting (rather than mark-and-sweep), but working with a reference-counted GC is much easier to do in C than working with a more advanced GC.

There's a reason why Python has such an extensive ecosystem of C libraries. This is both its greatest strength and its greatest weakness. Python has a really amazing ecosystem of data-science libraries (NumPy, SciPy, Pandas, TensorFlow, etc.), which is largely due to how easy it is to port existing C and Fortran libraries to Python (in many cases it can be trivially automated). There are other places where C bindings are used very extensively, again because the C API is so easy to use. On the other hand, the heavy reliance on C extensions makes it difficult for alternative Python implementations to gain widespread adoption.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726373/
Posted by ignacio.hernandez on Fri, 23 Jun 2017 17:56:31 +0000:

Nice work!

I had some demo code that brute-forces prime numbers, and this interpreter cuts that (horrible) test's running time from 7.5s to 3.3s. I would think having better baseline performance is a welcome feature, especially on small systems.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726198/
Posted by ejr on Thu, 22 Jun 2017 11:50:06 +0000:

I'd argue that the L1 and L2 caches are more tied to the feeds and speeds of the cores and hence should stagnate. The TLB is an excellent example of using space more effectively. I do think that memory optimizations are important, but less so for CPython than any parallelization story.
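For a rough sense of the per-object cost involved (figures from a typical 64-bit CPython 3 build; they vary across versions and builds):

    >>> import sys
    >>> sys.getsizeof(1)        # a boxed int, versus 8 bytes for a machine word
    28
    >>> sys.getsizeof(10**100)  # arbitrary precision grows with the value
    72
    >>> sys.getsizeof([])       # even an empty list carries header overhead
    56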
I'm still not a fan of Python (particularly with the intentional lack of floating-point semantics), but it has wide adoption.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726190/
Posted by pbonzini on Thu, 22 Jun 2017 09:54:31 +0000:

It depends. L2 cache and TLB sizes have not increased as fast as main memory; L3 caches have, but they are shared among the many cores in the same package. In particular, better octet packing leads to more efficient GC, and often to less pointer chasing, which also improves sequential performance.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726169/
Posted by flussence on Thu, 22 Jun 2017 01:19:34 +0000:

I'm not a Python person, but those numbers are pretty amazing! I don't think even QBasic was that memory-efficient.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726148/
Posted by SEJeff on Wed, 21 Jun 2017 20:25:40 +0000:

MicroPython runs fantastically well on the teensy ESP8266 and ESP32 microcontrollers. Huge fan of the work they've put into this.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726147/
Posted by ejr on Wed, 21 Jun 2017 20:21:34 +0000:

MicroPython's techniques harken back to days long before CPython. Once upon a time, people used machines with kilobytes of memory, much like the smallest microcontrollers (which are being phased out because it's cheaper to include more memory now). Implementations of dynamic languages like Lisp on those machines employed many fun memory-saving techniques. Smalltalk and ZIL added yet more fun memory-saving bits for different object models.

The difference is that CPython was targeting environments that have, or had, ever-increasing memory and sequential computational power. The latter has flat-lined, but memory has continued to increase. Their effort is likely best spent addressing the lack of sequential performance increases... Fine-tuning the octet packing would be more relevant for using vector units in current cell-phone-class and larger processors.

Memory use in CPython and MicroPython
https://lwn.net/Articles/726145/
Posted by pbonzini on Wed, 21 Jun 2017 19:52:37 +0000:

I am very surprised to hear that CPython does not tag small integers. Does it have other optimizations to speed up arithmetic?
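Small integers do at least appear to be cached, which is easy to see from the interpreter, though caching is not the same thing as tagging; the cached range (roughly -5 to 256) is a CPython implementation detail rather than a language guarantee:

    >>> a = 256
    >>> b = 256
    >>> a is b      # both names refer to the single cached 256 object
    True
    >>> a = 257
    >>> b = 257
    >>> a is b      # outside the cached range, typically distinct objects
    False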