
Speeding up CPython

Posted Dec 17, 2020 19:20 UTC (Thu) by Bluehorn (subscriber, #17484)
In reply to: Speeding up CPython by quietbritishjim
Parent article: Speeding up CPython

> The GIL affects fewer Python programs than you might think.

I have fallen for this argument. Unfortunately.

This has been the argument for the GIL for as long as I can remember: just optimize the inner loop in something like C++. This works if you have a tool that does some costly computation (solving linear equations, FFT, scene rendering, ray tracing) at the core, with some Python code around it doing the work that is not time critical.

That is definitely a big domain and includes a lot of scientific computing (where Python seems to shine).

Unfortunately, I maintain an application written in Python that does not have such a core but deals with lots and lots of small objects representing parts of an FE model. And it is slow. When we started, we had small datasets, so we did not worry too much; we figured we would just rewrite the relevant implementation details in C.

Only after I hit the performance wall did I notice that I basically had to rewrite most of the classes in C to replace the slow Python code. But since everything is held together by Python collections, manipulating those objects would require either holding the GIL or doing our own locking. Also, we would lose our database integration (which uses SQLAlchemy; I have no idea how to adapt this to C++ objects).

So we tried to use caching to avoid repeated computations, which meant we spent a lot of development time on fixing cache invalidation problems. Still, the system was slow, and customers (with 24-core systems) told us that they could not understand how the software could perform that badly while not even loading the system. Just parallelize...
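To illustrate the kind of cache invalidation problem meant here, a minimal sketch follows. The `Node` class and its attributes are hypothetical and not taken from the actual application; the point is only that every mutation path has to remember to drop the cached value.

```python
class Node:
    """Hypothetical FE-model node; all names are illustrative only."""

    def __init__(self, x, y):
        self._x = x
        self._y = y              # kept read-only to keep the sketch short
        self._norm_cache = None  # cached derived value
        self.computations = 0    # counts how often norm() really computes

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
        # Forgetting this line anywhere a field changes gives stale results.
        self._norm_cache = None

    def norm(self):
        # Recompute only when no valid cached value exists.
        if self._norm_cache is None:
            self.computations += 1
            self._norm_cache = (self._x ** 2 + self._y ** 2) ** 0.5
        return self._norm_cache
```

With one cached value and one setter this is trivial. With many interdependent small objects, each derived value needs its own invalidation hook on every input it depends on, which is where the development time goes.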

As rewriting in another language was out of the question, we implemented multi-process concurrency. At last this improved things a bit (in particular, the user interface stays responsive even while the system is doing something), but at a hefty price. multiprocessing was unusable because this is a GUI application: using fork wrecks the X11 connection, and the forkserver approach was not available in Python 2 when we started. So we wrote our own multi-process IPC library.
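For comparison, on Python 3.4 and later the standard library lets you choose a start method that avoids forking the GUI process directly. The following is only a sketch of that approach, not what we built; `element_volume` and `compute_volumes` are made-up names standing in for real per-object work.

```python
import multiprocessing as mp


def make_context():
    """Return a multiprocessing context that avoids a plain fork."""
    methods = mp.get_all_start_methods()
    # "forkserver" is only offered on some Unix platforms;
    # "spawn" is available everywhere.
    name = "forkserver" if "forkserver" in methods else "spawn"
    return mp.get_context(name)


def element_volume(dims):
    # Stand-in for a small per-object computation (hypothetical).
    a, b, c = dims
    return a * b * c


def compute_volumes(all_dims, workers=4):
    """Map element_volume over all_dims in a pool of worker processes."""
    ctx = make_context()
    with ctx.Pool(processes=workers) as pool:
        # Large chunks reduce the per-item pickling and IPC overhead.
        return pool.map(element_volume, all_dims, chunksize=256)
```

Call `compute_volumes` from under an `if __name__ == "__main__":` guard in a real script, since both start methods need to import the main module in the worker. Note that every object still has to be pickled and sent to the worker, so for many tiny objects the IPC cost can eat much of the gain.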

I still love Python for writing scripts; the language is extremely expressive. But I am now wary of using it for applications that have to do simple actions on many diverse and small objects.




Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds