Making multiple interpreters available to Python code
It has long been possible to run multiple Python interpreters in the same process — via the C API, but not within the language itself. Eric Snow has been working to make this ability available in the language for many years. Now, Snow has published PEP 734 ("Multiple Interpreters in the Stdlib"), the latest work in his quest, and submitted it to the Python steering council for a decision. If the PEP is approved, users will have an additional option for writing performant parallel Python code.
Snow's work on this topic began in 2015 with a post to the python-ideas mailing list. He followed that up in 2017 by writing PEP 554 (also titled "Multiple Interpreters in the Stdlib"). He later gave a talk at the 2018 Python Language Summit to gather support for the idea. By 2020, he was optimistic about the possibility of seeing PEP 554 approved for Python 3.10. Ultimately, it was delayed to focus on prerequisite work in the form of ensuring that each Python interpreter uses a separate global interpreter lock (GIL). In 2023, he gave a talk at PyCon about the status of the work so far, and what would be necessary to push it over the finish line.
Python already has several ways to run code in parallel. The threading library allows running tasks in different threads, although only one thread can run Python code at any one time because of the GIL. The multiprocessing library avoids contending for the GIL by running tasks in separate Python processes. These processes can communicate via shared memory, but they cannot share Python objects directly; instead they rely on queues that copy objects, or on proxy objects. The overhead involved in sharing complex objects can make multiprocessing a poor fit for some applications.
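The copy semantics are easy to demonstrate: a multiprocessing queue pickles objects on the way in and unpickles them on the way out, so what comes back is a new object, even within a single process. A minimal sketch:

```python
import multiprocessing as mp

# A multiprocessing.Queue serializes (pickles) whatever is put on it,
# so the object that comes out the other end is a copy, not the original.
q = mp.Queue()
original = [1, 2, 3]
q.put(original)
received = q.get()      # blocks until the feeder thread delivers the copy

received.append(99)
# The original list is untouched: the two objects share no state.
print(original)         # [1, 2, 3]
print(received)         # [1, 2, 3, 99]
```

This is the "fork"-like behavior discussed below: once an object has crossed the queue, mutations on one side are invisible on the other.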
When circulating the initial version of PEP 554 for comment in 2017, Snow summed up the purpose of the work by saying: "The project is partly about performance. However, it's also particularly about offering [an] alternative concurrency model with an implementation that can run in multiple threads simultaneously in the same process."
PEP 734 proposes adding a new module — interpreters — that uses Python's longstanding support for multiple interpreters (sometimes called subinterpreters) to permit running independent Python interpreters, each with its own GIL, in different threads of the same process. It used to be the case that only one interpreter could run at a time, but a previous proposal from Snow, PEP 684 ("A Per-Interpreter GIL"), fixed that by giving each interpreter its own GIL in Python 3.12. This could allow multiple interpreters to offer a substantial performance boost, because separate threads could actually run in parallel.
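As a rough sketch of what this looks like in practice: the names below follow the PEP and should be treated as assumptions, since the module only became generally available later (as concurrent.interpreters, in Python 3.14), so the import is guarded and the example degrades to a no-op where the module is absent.

```python
# Hedged sketch of the PEP 734 API; module location and method names
# are assumptions based on the PEP text and may differ from what ships.
try:
    from concurrent import interpreters  # assumed final location (3.14+)
except ImportError:
    interpreters = None

ran = False
if interpreters is not None:
    interp = interpreters.create()   # a new interpreter with its own GIL
    interp.exec("x = 40 + 2")        # runs in the subinterpreter's __main__
    interp.close()
    ran = True

# Either way, nothing leaks into this interpreter's namespace:
assert "x" not in globals()
```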
Another PEP may have had much the same effect; PEP 703 ("Making the Global Interpreter Lock Optional in CPython"), which was proposed and accepted in 2023, makes the GIL optional entirely. That change could make the performance advantages of having multiple interpreters less competitive. On the other hand, the GIL remains enabled by default in CPython releases, since it permits better single-threaded performance and remains a requirement for many extension modules, which may make multiple interpreters a practical option for environments where only a stock Python interpreter can be used.
Sharing the same process allows data to be passed back and forth between the interpreters by sharing the underlying memory, without having to copy it into an area of shared memory or incur the cost of interprocess communication. PEP 734 introduces a new type of Queue for sending objects between interpreters.
The PEP is clear that Python objects themselves are still not shared between interpreters. Instead, immutable types (such as str or bytes) can share their underlying storage directly. Small types such as int or float can also be shared directly, as can tuples of shareable objects. The only mutable objects that can currently be shared are Queue and memoryview objects, but the PEP promises that "there is no restriction against adding support for more types later".
Still, this new queue offers noticeably different semantics than multiprocessing queues, which can appear to "fork" objects, since objects are marshalled and copied when sent through the queue. When discussing the desired semantics of the newly introduced queue and the relationship between objects on one side of the queue and the other, Snow said: "My preference for that relationship is 'they may not be the exact same object, but they might as well be'."
Queue feedback
Despite extensive previous discussion, the updated PEP still drew additional comments. Antoine Pitrou expressed concern about the restrictions on what types can be passed between interpreters in order to preserve Snow's desired semantics: "I think interpreters.Queue deviating from the threading and multiprocessing queue semantics by only allowing shareable objects will be annoying [to] users." He went on to suggest that the queues could use the out-of-band buffers available in pickle (Python's object-serialization mechanism) protocol version 5 to send immutable types with the same efficiency the current design allows, while also permitting other objects. Snow agreed that this was an interesting possibility, but is "still ruminating over the potential consequences of using pickle by default".
Pitrou suggested separating out a LowLevelQueue that allows shareable objects only, and then a regular Queue built on top of that. Steve Dower concurred, but noted that "I think we're still at the stage where we want 3rd party packages to design the Queue object". Guido van Rossum agreed that it made sense to have a queue which only accepts some types:
I think there will always be a notion of shareable objects — though that's a poor name, it's really about things that have value semantics. And the interpreters module can have a Queue that only allows values. Over time the definition of "value" can be adjusted.
Shared memory
Ran Benita suggested that it might make sense to consider the design of transferable objects in JavaScript, which can be sent between contexts and are "'hollowed out' on the sending side". Transferable objects are mutable objects that are safe to send between threads, because they are made unusable by the sending thread in the process of being transferred. Code written using transferable objects can be sure that only one thread will try to mutate them at a time, while avoiding the overhead of copying large objects between threads.
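The "hollowing out" idea is straightforward to sketch in pure Python. Here Transferable is a hypothetical wrapper (not part of any proposal) whose payload can be claimed exactly once; after the transfer, the sender's handle is empty:

```python
class Transferable:
    """Hypothetical wrapper mimicking JavaScript's transferable objects:
    claiming the payload hollows out the sending side's reference."""
    _EMPTY = object()

    def __init__(self, payload):
        self._payload = payload

    def transfer(self):
        # Atomically swap the payload out and leave the handle empty.
        payload, self._payload = self._payload, self._EMPTY
        if payload is self._EMPTY:
            raise ValueError("object was already transferred")
        return payload

buf = Transferable(bytearray(b"large data"))
data = buf.transfer()   # the receiver now owns the buffer...
# ...and a second buf.transfer() would raise ValueError
```

A real implementation would need the runtime's cooperation to make the invalidation airtight, but the single-owner discipline is the same.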
Benita also said: "The reason I'm bringing up Transferable Objects is not that it should be a part of the PEP, but that I think it would be good to either make sure the design does not preclude it as a future enhancement, or that it explicitly does preclude it in case it's not relevant for Python." Snow agreed that it was a "cool idea" and, while it should not be part of the initial PEP, it had been part of previous discussions.
Benita also questioned what would happen if code in multiple interpreters wrote to a shared memoryview object without synchronizing that access via a queue. The answer would appear to be a data race. Pitrou noted that users already need to worry about this possibility when using multiprocessing shared memory.
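The situation is the same one users of multiprocessing.shared_memory face today: two handles onto the same block see each other's writes immediately, and nothing stops concurrent conflicting writes, so synchronization is the caller's job. A small single-process illustration:

```python
from multiprocessing import shared_memory

# Two independent handles onto the same block of memory. Writes through
# one are immediately visible through the other; with no locking, two
# concurrent writers would race, just as with a shared memoryview.
a = shared_memory.SharedMemory(create=True, size=4)
b = shared_memory.SharedMemory(name=a.name)
try:
    a.buf[0] = 7
    seen = b.buf[0]   # 7 -- same memory, no copy involved
finally:
    b.close()
    a.close()
    a.unlink()        # creator is responsible for freeing the block
```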
Next steps
Now that Snow has submitted PEP 734 to the steering council for consideration, there is a good chance of actually seeing this work merged for Python version 3.13, expected in October 2024. The council is likely to make a pronouncement one way or another in time for the first beta release (and feature freeze) in May. Even if it does approve the PEP, however, there is still more work to be done before the interpreters module will be generally available.
Index entries for this article: Python/Subinterpreters
Posted Mar 5, 2024 7:42 UTC (Tue)
by rrolls (subscriber, #151126)
[Link] (2 responses)
I'm quite fond of Pony's refcap model (especially having not seen _any_ other language do it) and I think Python (and any language, really) could stand to learn a lot from reading up on it. I'm not saying they should adopt the entire model (it would be better to just use Pony, at that point), although it would be kind of interesting if it could be implemented within type-checkers somehow.
But, the relevance here, is that the "kind" of types which Snow and GvR describe are exactly Pony's `val` refcap, or in Python terms any object which is completely "frozen". We have frozenset, and with tuple we kind of have frozen-list but it's not quite the same (one can write typing.List[Foo] and it means "a list of elements all of type Foo", but with tuple one has to write typing.Tuple[Foo, ...]; for a single element, one can write [x] but one has to write (x,) for a tuple; debuggers, code assist and other kinds of inspection will produce often-inconveniently different-looking outputs depending on whether something is a list or a tuple, etc.); many of us (myself included) have independently implemented frozen-dict in our own codebases because it has so many use cases...
It would be a game-changer if Python implemented a builtin, say `frozen`, which allowed one to write x = frozen(set)(some_set) or x = frozen(list)(some_list) or x = frozen(dict)(some_dict) or x = frozen(SomeDataclassType)(some_dataclass_instance) and end up with frozen semantics for any arbitrary type of object without having to implement it separately every time. From reading around, I don't believe I'm the only one who'd like to see this. And, while incredibly useful on its own, if implemented with entirely strict semantics so that there's no workarounds to unfreeze the reference somehow (which there always are whenever you try to make any kind of frozen class in current pure Python), that would give you exactly the `val` refcap from Pony, and thus a very simple way to say "if it's `frozen`, you can send it via `interpreters.Queue`.".
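There is a partial precedent in the standard library today: types.MappingProxyType gives a read-only view of a dict. But, as noted above about workarounds, it only freezes the handle you give out, not the underlying object:

```python
from types import MappingProxyType

d = {"a": 1}
frozen_view = MappingProxyType(d)

# Reads work as usual; mutation through the proxy is rejected.
value = frozen_view["a"]
rejected = False
try:
    frozen_view["b"] = 2
except TypeError:
    rejected = True

# But this is a view, not true `val` semantics: the original dict can
# still change underneath it, which is exactly the loophole in question.
d["a"] = 99
assert frozen_view["a"] == 99
```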
(Yes, in Pony, there's `#send` to represent sendable objects, meaning not just `val` but also `iso` and `tag`. But they aren't really relevant here. And of course, you can easily argue that `Queue` itself is sendable and is sent more like a `tag`, but you can also argue that it's sent like a `val` and just happens to contain a description of some shared resource; my point is that we only need Python to provide opt-in `val` semantics to score a big win here.)
With all that said, `interpreters` is going to be fantastic regardless. I'm certainly excited for it to become available in standard Python.
Posted Mar 5, 2024 9:32 UTC (Tue)
by LtWorf (subscriber, #124958)
[Link] (1 responses)
You are misunderstanding. It's the comma that makes the tuple, not the parenthesis.
You can write
a=1,
Posted Mar 5, 2024 14:03 UTC (Tue)
by rrolls (subscriber, #151126)
[Link]
Posted Mar 5, 2024 8:49 UTC (Tue)
by npws (subscriber, #168248)
[Link] (5 responses)
I had a medium-sized code base of maybe 20kloc written in Python. I initially chose Python because it was easy to build a first prototype, then stuck with it. After a while, getting at least halfway decent performance felt like a constant battle against the interpreter. Every single one of Python's concepts for adding parallelism feels weird: multiprocessing, as an actually working alternative to threading, requires object marshalling with (at the time) a lot of limitations in the pickle module; numpy requires back-and-forth conversion to a completely different representation; various "compiled Python" extensions similarly bring their own sets of limitations; there are strange interactions between modules using threading, multiprocessing, etc.
18 months ago I finally ditched it, rewrote the entire codebase in Go, and it has grown to 120kloc since then, as I don't have to waste my time fighting the interpreter anymore. I hope, for the people still using Python, that this will finally provide a more natural way to achieve parallelism, but I'm quite doubtful: it may just be yet another way that brings its own set of unnatural restrictions and interacts poorly with all the other methods.
Posted Mar 5, 2024 9:34 UTC (Tue)
by LtWorf (subscriber, #124958)
[Link]
I used fork(), so there was no marshalling needed to pass objects, they were already there.
The marshalling was only needed to return the objects, and only there the usual limitations applied.
Posted Mar 5, 2024 15:02 UTC (Tue)
by intelfx (subscriber, #130118)
[Link] (3 responses)
I'm not sure why the "still" here; does it imply that all contemporary uses of Python are strictly due to inertia? That is very much not so.
Posted Mar 5, 2024 18:06 UTC (Tue)
by atnot (subscriber, #124910)
[Link] (1 responses)
Posted Mar 20, 2024 11:48 UTC (Wed)
by sammythesnake (guest, #17693)
[Link]
I used to work almost exclusively in PHP[1] and was delighted when I got to leave it behind in favour of python, having got so sick of its unnecessary sharp corners (obligatory reference to http://phpsadness.com/)
These days, though, I'm in the camp of wishing python were somewhat different than it is - notable wishes include of course a better story for parallelism, but also better options for type safety[2]. I'm completely over the concept of duck typing - I want to be able to know there are no type mistakes in my code at "compile" time, thanks.
For certain stuff, I'm starting to use Julia, and loving most of what I see so far, though it's not a silver bullet, sadly.
I've also recently embarked on yet another attempt at getting my head around functional programming languages. I'm constantly seeing things that other languages have tried to mimic without really getting all the benefits because they're so religiously paradigm agnostic.
I've been writing in a more and more declarative/functional style in various languages for quite some time, and the benefits are palpable, even if it's a bit of a learning curve for somebody whose previous experience has leaned heavily toward procedural & OO styles/languages.
[1] Well, obviously *also* JavaScript/html/SQL/XML/Uncle Tom Cobley...
[2] Runtime enforcement, better expressivity, checkers that don't need nasty workarounds... This is motivated not least by the horrible job IDEs can do of bog-standard autocomplete with the meagre information available - even with highly verbose annotations. Also, some kind of linting to help measure/improve coverage & specificity of type annotations. Other languages just make complete type information a necessary part of the language, and *so much time* is saved by having the computer keep track of it all, that it feels like a no-brainer to me.
Posted Mar 5, 2024 19:25 UTC (Tue)
by npws (subscriber, #168248)
[Link]