
Python "newthreading" proof of concept released

From:  John Nagle <nagle-AT-animats.com>
To:  comp-lang-python-announce-AT-moderators.isc.org
Subject:  [ANN]: "newthreading" - an approach to simplified thread usage, and a path to getting rid of the GIL
Date:  Fri, 25 Jun 2010 11:15:02 -0700
Message-ID:  <4C24F226.2050607@animats.com>

We have just released a proof-of-concept implementation of a new
approach to thread management - "newthreading".  It is available
for download at

     https://sourceforge.net/projects/newthreading/

The user's guide is at

     http://www.animats.com/papers/languages/newthreadingintro...

This is a pure Python implementation of synchronized objects, along
with a set of restrictions which make programs race-condition free,
even without a Global Interpreter Lock.  The basic idea is that
classes derived from SynchronizedObject are automatically locked
at entry and unlocked at exit. They're also unlocked when a thread
blocks within the class.  So two threads can never be active in
such a class at the same time.
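
As a rough illustration of the lock-at-entry, unlock-at-exit behaviour
described above, here is a minimal sketch. It is not the newthreading
code: the `synchronized` decorator and the `Counter` class are names
invented for this example, and it does not model the unlocking that
happens when a thread blocks inside the class.

```python
import functools
import threading


def synchronized(method):
    """Hold the object's lock for the duration of the wrapped method."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:  # locked at entry, unlocked at exit
            return method(self, *args, **kwargs)
    return wrapper


class Counter:
    """Hypothetical stand-in for a class derived from SynchronizedObject."""

    def __init__(self):
        # Reentrant, so one synchronized method may call another.
        self._lock = threading.RLock()
        self._value = 0

    @synchronized
    def increment(self):
        self._value += 1

    @synchronized
    def value(self):
        return self._value


def hammer(counter, times):
    for _ in range(times):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=hammer, args=(counter, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value())
```

Because every increment runs under the object's own lock, no update
can be lost, whichever thread gets there first.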

In addition, only "frozen" objects can be passed in and out of
synchronized objects.  (This is somewhat like the multiprocessing
module, where you can only pass objects that can be "pickled".
But it's not as restrictive; multiple threads can access the
same synchronized object, one at a time.)
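
To give a feel for what "frozen" could mean in practice, the sketch
below builds an immutable copy of a nested structure using only the
standard library. The `freeze` helper is hypothetical and written for
this example; it should not be read as the function the newthreading
package provides.

```python
from types import MappingProxyType


def freeze(obj):
    """Return an immutable copy of obj (hypothetical helper)."""
    if isinstance(obj, (list, tuple)):
        return tuple(freeze(item) for item in obj)
    if isinstance(obj, (set, frozenset)):
        return frozenset(freeze(item) for item in obj)
    if isinstance(obj, dict):
        # Read-only view over a fresh dict that nothing else references.
        return MappingProxyType({k: freeze(v) for k, v in obj.items()})
    if isinstance(obj, (int, float, complex, str, bytes, type(None))):
        return obj  # already immutable
    raise TypeError("cannot freeze " + type(obj).__name__)


config = freeze({"hosts": ["a", "b"], "ports": {80, 443}})
```

After this call, attempts such as `config["hosts"] = ()` raise
TypeError, so the value can be handed to another thread without that
thread being able to modify it.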

This pure Python implementation is usable, but does not improve
performance.  It's a proof of concept implementation so that
programmers can try out synchronized classes and see what it's
like to work within those restrictions.

The semantics of Python don't change for single-thread programs.
But when the program forks off the first new thread, the rules
change, and some of the dynamic features of Python are disabled.

Some of the ideas are borrowed from Java, and some are from 
"safethreading".  The point is to come up with a set of liveable
restrictions which would allow getting rid of the GIL.  This
is becoming essential as Unladen Swallow starts to work and the
number of processors per machine keeps climbing.

This may in time become a Python Enhancement Proposal.  We'd like
to get some experience with it first. Try it out and report back.
The SourceForge forum for the project is the best place to report problems.

				John Nagle
-- 
http://mail.python.org/mailman/listinfo/python-announce-list

        Support the Python Software Foundation:
        http://www.python.org/psf/donations/




Python "newthreading" proof of concept released

Posted Jun 25, 2010 22:17 UTC (Fri) by aigarius (subscriber, #7329) [Link]

This is horribly non-Pythonic. I do hope this never becomes mainline and another, simpler solution is found. For example this implementation completely changes how the language works after a second thread is launched. It explicitly differentiates between methods that have a leading underscore in their names. It makes it impossible to have a global variable containing a list. It makes it impossible to actually share a list between two threads.

This is pointless!

Python "newthreading" proof of concept released

Posted Jun 26, 2010 3:42 UTC (Sat) by rriggs (subscriber, #11598) [Link]

What is "non-pythonic" about this?

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:20 UTC (Sat) by mikov (subscriber, #33179) [Link]

I think it is non-pythonic because it changes the language in a non-backwards-compatible way. To me it seems that, by definition, any implementation of real threading would be non-pythonic, because at the very least it has to get rid of reference counting and thus predictable object lifetimes.

As an observer of Python from the sidelines, I feel that not addressing threading in Python 3000 was a major missed opportunity. Two consecutive non-backward-compatible changes seem like too much. (That said, I don't see any of Python's "competitors" like Perl or Ruby doing any better.)

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:18 UTC (Sat) by shmget (subscriber, #58347) [Link]

"This is pointless!"
The _point_ is to allow for mostly language managed multi-threading, while not imposing undue burden on single-threaded programs.
Have you read http://www.animats.com/papers/languages/pythonconcurrency... ?
and particularly the 'Prior Work' section ?

"It explicitly differentiates between methods that have a leading underscore in their names"
Have you ever written a multi-threaded program, in any language ?

"It makes it impossible to have a global variable containing a list."
as the paper said:
'Global variables must be immutable or synchronized.'
Duh!
You can have a global variable list... until you start doing multithreading. At that point you must have 'frozen' your global list to make it immutable, or you must encapsulate it in a synchronized object.

"It makes it impossible to actually share a list between two threads."
Yes you can. You just have to wrap your list in a proper synchronized object to make it thread-safe. It is not harder to do than to have an external lock object that every user of the list must remember to acquire to access the list.
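
A minimal sketch of that kind of wrapper, using only the standard threading module. `SharedList` and its methods are hypothetical names for this example, not the newthreading API; the point is only that the lock lives inside the object, so callers cannot forget to take it.

```python
import threading


class SharedList:
    """Hypothetical wrapper: the list is reachable only via locked methods."""

    def __init__(self, items=()):
        self._lock = threading.Lock()
        self._items = list(items)

    def append(self, item):
        with self._lock:
            self._items.append(item)

    def snapshot(self):
        # Hand out an immutable copy, never the underlying list itself.
        with self._lock:
            return tuple(self._items)


shared = SharedList()


def worker(tag):
    for i in range(1000):
        shared.append((tag, i))


threads = [threading.Thread(target=worker, args=(n,)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = shared.snapshot()
```

With an external lock, the equivalent code would need a `with lock:` around every access at every call site.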

Python "newthreading" proof of concept released

Posted Jun 26, 2010 7:09 UTC (Sat) by aigarius (subscriber, #7329) [Link]

It is pointless, because the way this proposal works, once you run a thread you lose all the niceties of a dynamic, interpreted language. It is no longer Python. I have read the paper and I reject the assumption that 'it can not be done' in a thread-safe way without completely throwing away the dynamicity of Python.

The point of creating a generic threading mechanism that works without the GIL would be to allow most programs to become multi-threaded. And guess what, most multi-threaded programs are multithreaded from start to finish. So in essence this paper just offers us to gut the Python language completely, because someone is too lazy to write a proper per-object locking implementation.

The choice between Python with GIL and 'newthreading' without Python is no choice at all.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:17 UTC (Sat) by mikov (subscriber, #33179) [Link]

I see a potential problem with sharing immutable objects. The proposal claims that it is safe to share immutable objects between threads, however in general that is _not_ true.

This is a well known problem. If an immutable object is constructed in one thread, there is no guarantee when and whether other threads will have a consistent view of that object unless synchronization (memory barrier) is used in _both_ the constructing thread and other threads.

Was this addressed in the proposal? In fact I saw the claim that read-only objects can be shared without locking, which is definitely false. They can be shared without locking in many cases, but definitely not _in general_.

Worse, I don't actually see a way of implementing this concept in a practical way without horrible overhead (but I might be wrong).

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:24 UTC (Sat) by shmget (subscriber, #58347) [Link]

Maybe I'm confused too, but such an object is either dynamically instantiated or it is statically instantiated.
In the former case other threads cannot see it until it is done being built.
In the latter case it is instantiated before the program goes into 'multi-thread' mode (see the section '"Freezing" of the program' in the proposal).

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:33 UTC (Sat) by mikov (subscriber, #33179) [Link]

The problem is, from a memory-consistency POV there is no "done being built" unless a memory barrier is used in both threads. In general there is no guarantee of when and in what order one thread will see the writes performed by another thread. Specifically, the second thread may see the object as half-initialized.

The only way to prevent this is to use a memory barrier in both threads. In the creating thread that would be acceptable, but in the other threads you'd need a barrier before accessing any object, which is suicide.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 10:05 UTC (Sat) by shmget (subscriber, #58347) [Link]

"Specifically, the second thread may see the object as half-initialized."

Can you give a concrete example of how that would occur (in the context of the creation of a read-only object)?
I can't seem to find one without breaking the read-only condition (i.e. without using some kind of variable visible to both threads while it is being updated... which would then not be a read-only object. Note that the read-only condition is with regard to every thread. The only write occurs at the creation of the object, in a period when no other thread can see the object being created; only the class constructor sees it while it is being initialized.)

Python "newthreading" proof of concept released

Posted Jun 29, 2010 4:53 UTC (Tue) by JohnLenz (subscriber, #42089) [Link]

You have to realize that writes are not ordered between CPUs. With multiple CPUs, there is no guarantee memory stores will be sequential on different threads. Therefore, the memory store which sets the globally visible pointer to the newly created object might be seen before the memory stores which initialized the object.

Here is an example: say in memory location 1000 is a globally accessible pointer which all threads can see. Thread 1 is creating a read only object in memory locations 1-50, and Thread 2 is reading the object using the pointer from memory location 1000. Thread 1 does the following in order:

  • Writes a value to memory location 1
  • Writes a value to memory location 2
  • ...
  • Writes the pointer to memory location 1000

You say to yourself, this is perfectly fine because the store to 1000 happened after the object is initialized. WRONG! The problem is, the CPU gives no guarantee that a different CPU will see the writes in that order. For example (depends on caching and such), Thread 2 might see the writes in this order

  • Sees a write to memory location 1
  • Sees a write to memory location 2
  • Sees the write to memory location 1000 (the object is still not fully initialized at this point)
  • Sees a write to memory location 3
  • ...

The way to prevent this is to insert memory barriers (or to use locks... note locks like posix thread locks imply memory barriers). A more detailed writeup is on wikipedia.
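
In Python terms, the lock-based variant of that fix could look something like the sketch below. All names here are illustrative. Under CPython's GIL the reordering described above is not observable from Python code anyway, so this only shows the pattern a GIL-free interpreter would need to preserve: the object is fully built first, and the reference is both stored and fetched under the same lock.

```python
import threading

_publish_lock = threading.Lock()
_shared = None  # plays the role of "memory location 1000"


def publish(values):
    """Build the object completely, then publish the reference."""
    global _shared
    obj = tuple(values)      # the writes to "locations 1-50"
    with _publish_lock:      # releasing the lock orders the writes above
        _shared = obj


def read():
    """Fetch the reference under the same lock as the writer used."""
    with _publish_lock:
        return _shared


writer = threading.Thread(target=publish, args=(range(50),))
writer.start()
writer.join()
obj = read()
```

A reader that gets a non-None reference this way also sees the object's contents, because both sides went through the lock.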

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:56 UTC (Sat) by mikov (subscriber, #33179) [Link]

On second thought, I see how it can work. The second thread couldn't get to the object in the first place without some sort of synchronization. So all that is needed is a write barrier in the creating thread after an immutable object is created or frozen. That is not bad.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 10:06 UTC (Sat) by shmget (subscriber, #58347) [Link]

Sorry I replied to your prior response before reading this one...

We agree.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 2:36 UTC (Sun) by foom (subscriber, #14868) [Link]

> On second thought, I see how it can work. The second thread couldn't get to the object in the first place without some sort of synchronization. So all that is needed is a write barrier in the creating thread after an immutable object is created or frozen. That is not bad.

Not true in general. Absent a barrier on the reading CPU, there's nothing preventing it from deciding to fetch the cache-line containing the pointer to the object from the other CPU, while not fetching the updated cache-line containing the object's data.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 5:06 UTC (Sun) by mikov (subscriber, #33179) [Link]

It depends. I think in this case it can work.

The second thread must obtain the address of the shared object somehow and the only way to do that (in this implementation, or in any "safe" implementation) is ultimately by using synchronization of some form, which always implies a read barrier.

For example the shared object must be put in a synchronized queue, or it may be accessible from within another synchronized object, etc. It turns out that the semantics of this implementation do not allow the second thread to see the object before having issued a read barrier.

In other words, this implementation does not allow one thread to just share objects by storing them in global variables without any synchronization.

As I said however, the constructing thread must make sure to at least use a write barrier (or release consistency) before sharing the address of the new object.
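
The synchronized-queue case can be sketched with the standard library's queue.Queue, which takes an internal lock on every put and get. This is only an illustration of the hand-off pattern described above, not code from the proposal; the sentinel convention is an assumption of the example.

```python
import queue
import threading

handoff = queue.Queue()  # locked internally, so put/get synchronize
received = []


def producer():
    point = (3, 4)        # immutable, fully built before it is shared
    handoff.put(point)
    handoff.put(None)     # sentinel: nothing more to send


def consumer():
    while True:
        item = handoff.get()  # the only way to learn the object's address
        if item is None:
            break
        received.append(item)


threads = [threading.Thread(target=producer),
           threading.Thread(target=consumer)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The consumer never touches the tuple before `get()` returns, which is the point at which the synchronization happens.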

Python "newthreading" proof of concept released

Posted Jun 30, 2010 9:24 UTC (Wed) by mpr22 (subscriber, #60784) [Link]

> Absent a barrier on the reading CPU, there's nothing preventing it from deciding to fetch the cache-line containing the pointer to the object from the other CPU, while not fetching the updated cache-line containing the object's data.

If a page contains data intended for more than one CPU to access, even L1 hits for cache lines in that page should generate snoop cycles.

Python "newthreading" proof of concept released

Posted Jul 1, 2010 17:40 UTC (Thu) by BenHutchings (subscriber, #37955) [Link]

There is an implicit barrier for data-dependent reads on every architecture that I'm aware of, except Alpha. I think I can live without Alpha support in Python.

Python "newthreading" proof of concept released

Posted Jul 3, 2010 22:35 UTC (Sat) by foom (subscriber, #14868) [Link]

Yup! I just learned that on Friday.

Who knew. :) That's indeed quite a useful property for the CPUs to provide.

Some useful information on the subject:
http://www.kernel.org/doc/Documentation/memory-barriers.txt
http://g.oswego.edu/dl/jmm/cookbook.html

Python "newthreading" proof of concept released

Posted Jun 27, 2010 1:34 UTC (Sun) by ferringb (subscriber, #20752) [Link]

Couple of flaws with your statements:

1) 'sharing immutable objects'; you're actually describing immutable singletons here, which *would* need locking if you wanted a runtime generated singleton, no argument there. Raw immutable objects however in this context don't imply singleton- merely that they aren't modifiable post creation. That creation is synchronous to that thread, so no other thread sees it till creation is finished (someone could violate this via having the constructor itself leak to a separate thread, but that wouldn't exactly be immutable at that point now would it?).

2) immutable objects in this case *can* be shared w/out locking. The reason comes down to the underlying vm- current cpython implementations use a non-atomic incref/decref because the GIL ensures it is impossible to ever have multiple python threads running in parallel (note I said python- you can have c threads going in parallel, but they in effect cannot rely on anything in the VM). If/when you can truly execute two python threads in parallel, the VM would have to use atomic inc/decref (or a delayed sweep) to even allow N threads, regardless of this module. After all, this module's intent is just to try and make access where multiple threads are running consistent, not to make the raw vm itself thread safe.

Roughly, you're thinking of c/c++ runtime in this case- your statements would apply. This however is in reference to the python vm; the rules differ in python vm's.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 5:23 UTC (Sun) by mikov (subscriber, #33179) [Link]

About 1) No, I definitely don't mean singletons, although the problem is intrinsically the same. What the second thread sees depends only on the second thread - it may even see the writes in reverse order. That is unless it uses a read barrier.

So, if you want to share a read-only object you absolutely need two things:
a. The creating thread must use a write barrier between constructing the object and sharing its address
b. The reading thread must use a read barrier between fetching the address and using it.
(Note that in x86 both these barriers are implicit in the micro-architecture)

About 2) I am assuming that any useful implementation of real Python threading will get rid of reference counting and the GIL. (The implementation referenced here is not practical - it serves just to illustrate the restrictions of a hypothetical real implementation.)

However I agree that even in such a hypothetical VM it can work as long as the new semantics enforce synchronization when objects are shared. As a very basic example: if the only way to share between threads is via global variables, you just need to enforce synchronization when accessing global variables. Which seems to be the case in the referenced proposal. So, yes, I agree it will work.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 4:22 UTC (Sat) by flewellyn (subscriber, #5047) [Link]

Well, good, I guess. For those who want to use threading, I suppose this automatic locking of the classes will help prevent the worst headaches of massive nondeterminism.

I still think threading is fundamentally BAD, though. Concurrency via message passing or explicit shared memory is a much better approach.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 7:56 UTC (Sat) by drag (subscriber, #31333) [Link]

In Linux handling multiple processes is an efficient thing.

So if you're doing Linux-oriented python programming then other negative aspects (performance wise) of the language are going to be much higher priority... like having a proper JIT implementation and better garbage collection. Plus if you're using some sort of socket-based IPC and using individual processes for parallelization then it's a short hop to clustering support.

In Windows, however, process creation is extremely expensive, while Linux has about the fastest and most efficient way to create new processes that any contemporary OS has.

Multi-threading is extremely important in Windows in order to get good performance, and with Python, using threads on simple benchmarks of easily parallelizable tasks can actually make things _slower_ due to the extra overhead of all the synchronization that the GIL has to do.

http://www.dabeaz.com/python/GIL.pdf
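
This is easy to try for yourself. A minimal sketch in the spirit of the CPU-bound countdown benchmark from those slides follows; the function names and the loop count are choices made for this example, and no particular timings are claimed, since they depend on the interpreter version and the machine.

```python
import threading
import time


def countdown(n):
    """Pure-Python busy loop: CPU-bound, no I/O."""
    while n > 0:
        n -= 1


def timed_sequential(n):
    start = time.perf_counter()
    countdown(n)
    countdown(n)
    return time.perf_counter() - start


def timed_threaded(n):
    threads = [threading.Thread(target=countdown, args=(n,))
               for _ in range(2)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


N = 2000000
seq = timed_sequential(N)
thr = timed_threaded(N)
print("sequential: %.3fs  two threads: %.3fs" % (seq, thr))
```

On an interpreter with a GIL, the two-thread run cannot be expected to take half the time, and may take longer than the sequential one.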

Python "newthreading" proof of concept released

Posted Jun 26, 2010 12:08 UTC (Sat) by NAR (subscriber, #1313) [Link]

When you already have a VM running the code, you don't need operating system processes or even threads to do multithreading (with or without message passing)... Erlang is an example for this, we're still using the emulator in non-SMP mode to run thousands of processes.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 15:31 UTC (Sat) by vonbrand (subscriber, #4458) [Link]

When thread handling was benchmarked to see how it should work in Linux it came out overwhelmingly in favor of pure kernel threads, even Solaris (which had a hybrid kernel/process model) soon followed suit. I fail to see how replicating the whole scheduling, locking, and "be careful that if one thread does block the others can still go on" mess is of any advantage (unless the focus is on severely crippled systems).

Python "newthreading" proof of concept released

Posted Jun 26, 2010 16:42 UTC (Sat) by NAR (subscriber, #1313) [Link]

As far as I know, when the Erlang VM was first implemented and deployed, Linux didn't even support SMP, so it's definitely not a question of reimplementing in that case. And I guess that due to not sharing memory, the implementation is a lot simpler, more predictable, etc. In case of Python, they might have to reimplement some stuff, but don't forget that there are other systems than Linux, so they might have to implement something anyway. In some environments it might be more beneficial that the operating system kernel gets out of the way as much as it's possible and leave the decision on the level that has more information.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 20:27 UTC (Sat) by drag (subscriber, #31333) [Link]

Yep. Those other systems would be 'Windows'. Threading is very important in Windows if you want good performance where you're frequently starting and stopping new threads. With Linux this is not such a big problem.

Also don't forget that with Linux 2.4 the threads and processes were handled in very much the same way from the perspective of the kernel. It was not until Linux 2.6, where they introduced the NPTL threading model, that you had significant advantages to threading for short-lived and very numerous threads.

Wikipedia says that in a test which simply started up 100,000 threads on IA-32, the difference between 'Linux Posix Threads' and NPTL was 15 minutes vs 2 seconds (respectively).

So if you were a Python hacker with Linux priorities in the 2.2/2.4 era, there was even less reason to get the GIL right or remove it than there is now.

HOWEVER... even if you were to fix or remove the GIL, it probably would not improve Python's attractiveness for the sort of heavy workloads where you want to handle hundreds/thousands of threads anyway... You'd probably want to use a faster language.

Python "newthreading" proof of concept released

Posted Jun 28, 2010 8:47 UTC (Mon) by NAR (subscriber, #1313) [Link]

This is comparing apples with oranges, but nonetheless, might be interesting:

1> N0=now(), lists:foreach(fun(_) -> erlang:spawn(fun() -> ok end) end, lists:seq(1,100000)), N1=now(), timer:now_diff(N1, N0).
6662831

This code shows that 100000 Erlang processes were started in less than 7 seconds. So it is possible to start 100000 processes within seconds even in userspace, it's not necessary to use kernel threads.

Python "newthreading" proof of concept released

Posted Jun 28, 2010 8:27 UTC (Mon) by cmccabe (guest, #60281) [Link]

Back in the LinuxThreads days, using green threads was pretty much your only choice if you wanted to have more than a thousand threads. Neither the scheduler nor the threading library really scaled much more than that.

Now that we have the 2.6 kernel, NPTL, and a better scheduler, the best thing to do is to use real kernel threads.

There are a lot of reasons for this, but here is a very simple and obvious one. The kernel assigns each thread to one and only one CPU. Sometimes the kernel will move a thread from one CPU to another, but that is a costly operation and should be avoided if possible. So no matter how clever you are in user space, you can never really get that much parallelism out of an application that has only one "real" process.

Also, maybe this is slightly off-topic, but I find it a little bit annoying when people say "well, we can't implement this feature for our programming language on all platforms, so therefore we won't implement it at all." The most successful programming languages of the last decade have focused on just one operating system-- like Ruby, C#, Python and Perl. Sure, you can run all these on the "wrong" operating system, but it will be painful and unproductive. Rather than delivering a mediocre experience to everyone, language designers should focus on solving a real problem that people have.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 15:43 UTC (Sun) by robert_s (subscriber, #42402) [Link]

"When you already have a VM running the code, you don't need operating system processes or even threads to do multithreading"

Yes, it's a great idea to duplicate all the process scheduling and locking again in userspace. It's not as though schedulers are something that's hard to get right or anything.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 22:24 UTC (Sat) by foom (subscriber, #14868) [Link]

Ideally you could start up multiple *completely separate* Python interpreters inside one OS process. If python supported that, you could run multiple python "processes" as multiple OS threads in the same OS process on Windows. Unfortunately, that's never really worked right. Someone should work on that... (Other languages get it right, e.g. TCL and Lua).

Python "newthreading" proof of concept released

Posted Jun 27, 2010 16:57 UTC (Sun) by jamesh (guest, #1159) [Link]

Many Python extensions are not safe to use with multiple interpreters, since they store object references in global variables or stack allocate classes.

The C API changes in Python 3 make it easier to write safe extensions, but I haven't seen many extensions using it.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 21:48 UTC (Sun) by foom (subscriber, #14868) [Link]

The Python core itself isn't very well isolated between interpreters -- it uses a single GIL for all interpreters in a process, for instance.

Worse than that, though, is the PyGILState_Ensure/PyGILState_Release functions which were added to Python 2.3, with the full knowledge that the new APIs were incompatible with the use of multiple interpreters in a process. Ah well.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 23:15 UTC (Sat) by flewellyn (subscriber, #5047) [Link]

Oh, that's fair enough. But, I maintain that the high cost of processes relative to threads is a way in which Windows is broken.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 20:25 UTC (Sun) by cmccabe (guest, #60281) [Link]

Is it just the startup and shutdown times of spawning multiple processes that is a problem on Windows? Or is there some more fundamental reason why multi-process solutions don't work well in Windows?

If it's just a problem with startup and shutdown times, it seems easy enough to create a "process pool" of VM instances, and change fork() so that it would grab one of those processes out of the pool, rather than spawning a new one.

There have been a lot of "threading sucks and is hard to get right" comments in this... um... thread. Based on my own experiences, I think I agree. It would be nice if we could start using processes to do this stuff right.

Python "newthreading" proof of concept released

Posted Jun 27, 2010 23:23 UTC (Sun) by drag (subscriber, #31333) [Link]

> Is it just the startup and shutdown times of spawning multiple processes that is a problem on Windows? Or is there some more fundamental reason why multi-process solutions don't work well in Windows?

I think that it's the start-up times and minimum memory consumption per process that make having lots of short-lived processes expensive in Windows. I don't know where exactly to get the info on why this is, but I'll see what I can find later.

But it's not a free lunch in Linux either. It's just much less of a serious problem.

Think about the difference between embedding an interpreter into Apache via mod_perl or mod_python and using CGI scripts....

Personally I don't like either approach. Instead I like the WSGI approach where I can spawn multiple long-running python processes (based on demand) that can live longer than a single page request. It seems to me that you get the best of both worlds that way.

This is also one of the reasons Google Chrome kicks ass. It should have been like that from the beginning with Linux browsers.

Python "newthreading" proof of concept released

Posted Jun 28, 2010 0:04 UTC (Mon) by dlang (✭ supporter ✭, #313) [Link]

the issue with mod_perl/python vs CGI scripts is far less the forking overhead than it is the perl/python startup overhead.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 9:35 UTC (Sat) by MisterIO (guest, #36192) [Link]

Why don't they implement a sane synchronization model where you synchronize on the data structure you're accessing, not on code? (And no, a synchronized method of a class is _not_ what I'm talking about.) After having a simple and efficient way to add locks to objects and lock/unlock them, they could build into the language a way to automatically check locking correctness (that you could obviously disable once the debug phase is over).
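
One possible reading of that idea, sketched with the standard threading module: the lock belongs to the data, and a debug-time check refuses access from a thread that does not hold it. `Guarded` and `CHECK_LOCKING` are names made up for this illustration; nothing like this is claimed to exist in Python or in newthreading.

```python
import threading

CHECK_LOCKING = True  # could be set to False once the debug phase is over


class Guarded:
    """Hypothetical container tying a lock to the data it protects."""

    def __init__(self, data):
        self._data = data
        self._lock = threading.Lock()
        self._owner = None  # ident of the thread holding the lock

    def __enter__(self):
        self._lock.acquire()
        self._owner = threading.get_ident()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._owner = None
        self._lock.release()
        return False

    @property
    def data(self):
        if CHECK_LOCKING and self._owner != threading.get_ident():
            raise RuntimeError("data accessed without holding its lock")
        return self._data


table = Guarded({})
with table:
    table.data["x"] = 1
```

Reading `table.data` outside a `with table:` block raises RuntimeError while the check is enabled, which is the kind of automatic correctness check the comment asks for, done at run time rather than by the language.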

Python "newthreading" proof of concept released

Posted Jun 26, 2010 15:32 UTC (Sat) by vonbrand (subscriber, #4458) [Link]

That is known as Monitors in sane concurrent languages...

Python "newthreading" proof of concept released

Posted Jun 26, 2010 13:01 UTC (Sat) by lmb (subscriber, #39048) [Link]

Threading effectively sucks in any language. (C being the worst offender.) I personally prefer processes with local data and message passing.

Python "newthreading" proof of concept released

Posted Jun 26, 2010 22:30 UTC (Sat) by ms (subscriber, #41272) [Link]

Absolutely. Of course it can't be "proved" in any theoretical sense that message-passing concurrency and non-shared structures are at all "safer" or easier to program to, but in my experience (and I have spent most of a PhD plus several years programming highly concurrent systems in both Java and Erlang, so I do know both sides of the coin), pthreads and all derived systems are a disaster and need to die. Yes, absolutely, there are times where you need the performance advantage of having multiple threads walk over and modify the same data structure, and yes, it can be done safely. But the *vast* majority of the time, you don't need that.

Python "newthreading" proof of concept released

Posted Jun 28, 2010 7:06 UTC (Mon) by flewellyn (subscriber, #5047) [Link]

Actually, it can and has been proven that threading IS much less safe. See "The Problem With Threads" by EA Lee. http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006...

A critique of "The Problem With Threads"

Posted Jun 28, 2010 16:35 UTC (Mon) by PaulMcKenney (subscriber, #9624) [Link]

I recall having seen this paper before. :–)

Please allow me to call your attention to the first sentence of Section 2 on page 2: “Some applications can very effectively use threads.”

Next please look at the first sentence of the quotation following this paragraph: “The world of client applications is not nearly as well structured and regular.” Section 3 then examines the problems that arise when attempting to multithread applications that are unstructured and/or irregular. Section 4 enumerates some ways to mess up the implementation of a simple concurrent class. And you would not be the first reader to overgeneralize the next couple of sections to “threads are always bad.” All that aside, he does give a nice overview of a number of techniques, several of which are much better for some jobs than is threading.

The analogous position in the 1970s was that ordinary programmers could not be trusted to correctly code while loops. This 30-year-old debate may seem strange today, but it did produce some useful constructs, one of them being iterators like the Linux kernel's list_for_each_entry() macro.

I do fault the author for failing to bring out the point that design is much more important in multithreaded code than it is in single-threaded code. Yes, he does note this fact in passing, but he fails to fully state the conclusion, which is that single-threaded programming experts need to learn how to improve their overall design skills in order to effectively handle concurrency. They cannot hack their way to parallel-programming expertise, at least not without accumulating a fair number of bullet holes in their feet.

Surprisingly, my own position on the threads-vs.-not debate is actually quite similar to the final sentence of Lee's paper:

“Threads must be relegated to the engine room of computing, to be suffered only by expert technology providers.”

The only change I would make would be to replace “suffered” with “enjoyed.” :–)

So please don't overgeneralize Lee's paper to “never use threads.” That is not what Lee wrote!

Python "newthreading" proof of concept released

Posted Jun 26, 2010 23:39 UTC (Sat) by elanthis (guest, #6227) [Link]

In general, yes, absolutely. There are apps where you really do just need (or highly prefer) the specific behavior of threaded programming over multiprocess programming. These situations are relatively rare outside of a few specific fields of CS, though, and mostly arise in very performance-sensitive bits of code (multithreaded audio engines, for instance) or cases where almost all data is shared and the threads write only in specific ways that result in light locking mechanics (many numerical analysis setups, or simulations).

Useless and stupid

Posted Jun 26, 2010 15:37 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

This locking scheme is absolutely stupid. It simply won't work in presence of two threads - it'll deadlock faster than you can say 'active object'.

They tried to replicate "apartment threading" model from COM. But badly.

Python "newthreading" proof of concept released

Posted Jun 28, 2010 4:10 UTC (Mon) by gdt (subscriber, #6284) [Link]

They seem to have done a good job of what they set out to do.

My criticism is the scope. Multithreading just hurts unless the language itself hides the multithreading. The language should be defined as if iterators start one thread per item. Functional languages allow automatic synchronisation of data, and Python is so near to a functional language that it wouldn't take many changes to reap their benefit. This would allow programs to fully exploit whatever hardware is underneath without knowing the details of that hardware (the number or power of the cores, scheduling to match bus or hypercube topologies, etc).

newthreading vs stackless?

Posted Jun 28, 2010 9:24 UTC (Mon) by job (guest, #670) [Link]

What happened to Stackless Python? It was some sort of green threads implementation in Python that seemed promising, but I've never seen it used in practice.

newthreading vs stackless?

Posted Jun 28, 2010 14:30 UTC (Mon) by donblas (guest, #63860) [Link]

See: http://www.stackless.com/wiki/Applications

In particular, EVE Online uses them, an MMORPG which can have 50,000+ users online in a single instance of the game.

newthreading vs stackless?

Posted Jun 29, 2010 4:12 UTC (Tue) by njs (guest, #40338) [Link]

Stackless Python itself is a patch on top of CPython, so it's never seen widespread use.

But the greenlet package, which uses some serious juju to implement full stackless-style coroutines within standard CPython, is available and there's a small ecosystem around it -- mostly people working on Twisted-style threadless network libraries. (But, I guess, with a nicer API than Twisted itself.)

Stackless/greenlets/coroutines are all irrelevant to the goal of newthreading, though, which is to allow a single Python process to simultaneously make use of multiple CPUs.
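
To make the distinction concrete, here is a small sketch of cooperative tasks built from plain generators. It deliberately uses neither Stackless nor the greenlet API (generators are a weaker, stdlib-only stand-in), and the `task` and `run` names are invented for the example. Everything runs on one OS thread, so the tasks interleave but never execute on two CPUs at once.

```python
from collections import deque


def task(name, steps, log):
    for i in range(steps):
        log.append((name, i))
        yield  # hand control back to the scheduler


def run(tasks):
    """Round-robin scheduler: resume each task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        current = ready.popleft()
        try:
            next(current)
        except StopIteration:
            continue  # task finished; drop it
        ready.append(current)


log = []
run([task("a", 2, log), task("b", 2, log)])
print(log)  # [('a', 0), ('b', 0), ('a', 1), ('b', 1)]
```

The switch points are explicit (`yield`), which is why this style needs no locks, and also why it does nothing for multi-core use.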

Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds