LWN.net Logo

Rigo: Multicore Programming in PyPy and CPython

Rigo: Multicore Programming in PyPy and CPython

Posted Aug 10, 2012 16:13 UTC (Fri) by drag (subscriber, #31333)
In reply to: Rigo: Multicore Programming in PyPy and CPython by njs
Parent article: Rigo: Multicore Programming in PyPy and CPython

> I have no idea why this is the goal, because shared-everything is horrible for programmers and implementors alike. I guess that makes it a fun engineering problem. It's like if Oulipo designed runtime environments.

Isn't the major advantage of threading versus forking in Linux the ability to have low-overhead IPC over things like shared memory and the such?

Otherwise what is the advantage over forking?

Just curious.


(Log in to post comments)

Rigo: Multicore Programming in PyPy and CPython

Posted Aug 10, 2012 18:08 UTC (Fri) by njs (guest, #40338) [Link]

If we're using terms like BSP, STM, shared-everything, etc., then we're talking about what semantics your runtime provides, not how it's implemented in terms of OS primitives.

Right now Python on Linux has two easy options for parallelism. You can fork(), which gives you shared-nothing parallelism (which is great, very easy to reason about) with somewhat cumbersome and expensive IPC (serialized objects copied explicitly over sockets or shared memory, explicit chunk-of-shared-bytes via mmap, that sort of thing). Or you can use "threads", and get shared-everything parallelism (which is horrible) with fast IPC, except the GIL kind of kills that.

But you can imagine PyPy implementing fork()-like shared-nothing semantics *with* fast IPC. It'd use pthreads underneath, but with each thread having a totally different Python namespace, and when you passed an object to another thread it would just mark it copy-on-write instead of actually making a copy. This is how languages designed for parallelism, like Erlang or Rust, are intended to work.

Implementing this on top of CPython would be harder, and you'd have to have a whole transition period while people audited their existing C modules to adapt them to handle GIL-less operation and the new copy-on-write stuff, but it'd be possible. And seems more likely to me than designing a new C-compatible language like Armin is suggesting. I doubt it's worth it for a relatively minor IPC speedup, though...

Rigo: Multicore Programming in PyPy and CPython

Posted Aug 16, 2012 14:55 UTC (Thu) by martin.langhoff (subscriber, #61417) [Link]

Yes! Having an Erlang-ish map() in Python, that transparently dispatches the work to a number of processes/threads would be fantastic.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds