LWN: Comments on "An introduction to asynchronous Python" https://lwn.net/Articles/726600/ This is a special feed containing comments posted to the individual LWN article titled "An introduction to asynchronous Python". en-us Sun, 31 Aug 2025 16:31:40 +0000 Sun, 31 Aug 2025 16:31:40 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net An introduction to asynchronous Python https://lwn.net/Articles/727390/ https://lwn.net/Articles/727390/ HelloWorld <div class="FormattedComment"> The real question is why two concurrent threads would mess around with the same mutable data structure anyway. And most of the time they really don't, and when you stop doing that, things become much easier.<br> </div> Fri, 07 Jul 2017 17:16:01 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726927/ https://lwn.net/Articles/726927/ neilbrown <div class="FormattedComment"> <font class="QuotedText">&gt; Given that essentially no human beings are capable of understanding synchronisation perfectly in any non-trivial cases, </font><br> <p> That's a bold claim!<br> I think it much more likely that we don't have, or are not using, suitable semantic tools to enable us to think about synchronization in a reliable way.<br> By "semantic tools" I mean things like "loop invariants", which I personally find make it much easier to think accurately about loops.<br> I think a lot of synchronization errors come about because people are either not informed about the locking requirements, or think they can take a shortcut without justifying it. 
This suggests that it isn't a lack of capability, but a lack of tools and training.<br> <p> Your point still stands, though, that it may be easier to train people in asynchrony than in synchrony.<br> </div> Sat, 01 Jul 2017 00:25:40 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726857/ https://lwn.net/Articles/726857/ excors <div class="FormattedComment"> <font class="QuotedText">&gt; There are not many things that are better for async in that comparison. The main advantage to asynchronous programs for Python is the massive scaling they allow, Grinberg said.</font><br> <p> I think the main advantage of the async model over threading may be that you don't have to understand synchronisation - it avoids all those mutexes, semaphores, events, conditions, atomics, implicitly atomic operations in a particular interpreter implementation, GIL, ... Instead, all your code is guaranteed to run atomically except where there's a clearly-visible "await". Given that essentially no human beings are capable of understanding synchronisation perfectly in any non-trivial cases, that's a substantial benefit.<br> </div> Fri, 30 Jun 2017 12:43:03 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726840/ https://lwn.net/Articles/726840/ epa <div class="FormattedComment"> Ironically, if you use a worker process model you may not need reference counting or other forms of garbage collection at all. Each process can do some work and then exit, freeing its memory then. This suggests some global flag to turn off reference counting and garbage collection would help here - and perhaps also help non-forking programs which nonetheless do one thing and then exit. <br> <p> A more subtle tweak would be to set the reference counts on all objects to some magic value like -1 which marks an object as used and stops the count being updated further. The parent process could call that as a one-off just before forking the workers. 
Then all existing objects would stay shared, but allocations made in the children (or further things allocated in the parent) would be garbage collected as usual. <br> </div> Fri, 30 Jun 2017 05:50:01 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726833/ https://lwn.net/Articles/726833/ excors <div class="FormattedComment"> I think std::shared_ptr can sort of do this already - you can pass the constructor a raw pointer (allocated however you want) plus an allocator object that will be used for internal allocations (i.e. refcount storage etc), and you could make that allocator use a separate pool to keep all the refcounts together. Then wrap it all in a custom type so users don't have to think about it.<br> <p> (Apparently constructing a shared_ptr via std::make_shared is special - that does a single allocation to contain both the refcount and the object, which is usually a good idea, but in this case you'd need to implement it differently, which should be easy enough.)<br> </div> Thu, 29 Jun 2017 22:49:02 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726815/ https://lwn.net/Articles/726815/ epa <div class="FormattedComment"> Interesting point. This suggests that if you are going to use reference counting, the counts should be stored separately from the object, in some kind of global array or global lookup of address to count. Then they will be in their own pages, while the objects themselves stay mostly read-only.<br> </div> Thu, 29 Jun 2017 19:51:32 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726803/ https://lwn.net/Articles/726803/ dtlin <div class="FormattedComment"> If you start up new processes, each Python library loaded (including the standard library) will not be shared. 
That far outweighs the text size of the interpreter itself.<br> <p> If you fork off of a main process after loading libraries, reference counting unshares the data pretty quickly.<br> </div> Thu, 29 Jun 2017 17:48:33 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726796/ https://lwn.net/Articles/726796/ zlynx <div class="FormattedComment"> Interpreters like Perl and Python which use reference counts are terrible about data sharing. Even if a variable is read-only, the reference-count bumping as it is passed around unshares the pages.<br> I noticed this on SpamAssassin on 256 MB boxes years ago. It used an initialize-and-fork model, obviously copied from a C application, perhaps Apache. It should have been very memory efficient. However, as soon as a SA worker began to work, its memory quickly unshared and started to overload the box.<br> <p> And of course there are C++ apps using shared_ptr and std::string which do just as badly at this.<br> </div> Thu, 29 Jun 2017 16:58:59 +0000 An introduction to asynchronous Python https://lwn.net/Articles/726790/ https://lwn.net/Articles/726790/ willy <div class="FormattedComment"> <font class="QuotedText">&gt; Running multiple processes means having multiple copies of Python, the application, and all of the resources used by both in memory, so the system will run out of memory after a fairly small number of simultaneous processes (tens of processes are a likely limit), Grinberg said.</font><br> <p> Ooh, no. There will only be one copy of the Python interpreter text segment in memory. There will be separate data segments for each invocation, of course. And maybe that's what he meant. I'm not entirely sure what is meant by "resources", but if that's (for example) a read-only data file being processed, then there's only one copy of that too (unless python does something awful like read() it into a userspace buffer instead of mmap() it). 
Either way, dozens of processes being the limit seems unlikely.<br> <p> </div> Thu, 29 Jun 2017 16:45:18 +0000
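The preload-then-fork pattern that dtlin, zlynx, and epa discuss above can be sketched in Python. This is a minimal illustration, not anyone's production code: `serve_requests` and the worker count are hypothetical stand-ins. Note that `gc.freeze()` (added in Python 3.7, after this discussion) only stops the cyclic collector from dirtying shared pages; it does not freeze reference counts themselves. epa's "magic -1 refcount" idea corresponds to the immortal objects later specified by PEP 683 (CPython 3.12).

```python
import gc
import os

def serve_requests(worker_id):
    """Hypothetical worker body; a real app would handle requests here.
    Merely touching shared objects bumps their refcounts, which dirties
    (and so unshares) the copy-on-write pages those objects live in."""
    pass

# Pre-fork model: import and initialize everything in the parent so the
# children's pages start out shared with it.
import json  # stand-in for the application's real imports

# gc.freeze() (Python 3.7+) moves every existing object into a
# "permanent generation" that the cyclic collector never scans, so the
# collector itself stops writing to shared pages.  It does NOT stop
# reference-count updates on those objects.
gc.freeze()

children = []
for worker_id in range(4):
    pid = os.fork()
    if pid == 0:                  # child process
        serve_requests(worker_id)
        os._exit(0)               # exit without running parent cleanup
    children.append(pid)

for pid in children:              # reap the workers
    os.waitpid(pid, 0)
```

Each child still unshares pages as it manipulates objects, which is exactly the SpamAssassin behavior zlynx describes; freezing the collector only removes one source of copy-on-write faults.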