Single-threading not considered harmful
Posted May 7, 2011 2:10 UTC (Sat) by quotemstr
Parent article: Scale Fail (part 1)
When the author admonishes us for using "single-threading", we naturally suppose that we should use "multi-threading" instead, but because a "multi-threaded program" is commonly understood to be composed of many shared-memory lightweight processes, following this advice will in fact tempt us to create programs that become expensive to scale. In a sense, multi-threading, not single-threading, is the true enemy of scalability.
In any massively parallel system, it's the communication between processing nodes that ultimately limits the size and performance of the system. When we express concurrency using multiple threads, we naturally use the memory shared by these threads as the communication medium. But because shared memory scales poorly, the cost of using ever-larger coherent-memory systems quickly overwhelms any possible benefit.
Having run into this wall, we transition to a communication medium that scales much better, although (or because) it offers fewer features and less coherency compared to shared memory; examples include databases, clustered filesystems, and specialized message queues. After this expensive and painful process, costs again increase linearly with capacity: processing nodes can be spread across multiple machines instead of having to share a single increasingly powerful machine. Because the communication medium is no longer shared memory, the possibility of multiple threads sharing a single process becomes irrelevant, and we see that the work we invested in using this kind of threading was wasted.
So to avoid these ends, let's avoid these beginnings: avoid multi-threading. Use single-threaded programs, which are easier to design, write, and debug than their shared-memory counterparts, and use multiple processes to extract concurrency from the hardware. Choose a communication medium that works just as well on a single machine as it does between machines, and make sure the individual processes comprising the system are blind to the difference. This way, deployment becomes flexible and scaling becomes simpler. Because communication between processes by necessity has to be explicitly designed and specified, modularity almost happens by itself.
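To make that concrete, here is a minimal sketch in Python of what I mean, not a production design. It assumes a TCP socket as the communication medium and a made-up line-based protocol; the names `WORKER_SOURCE` and `square_all` are mine, purely for illustration. Each worker is an ordinary single-threaded program that is handed a host and a port and nothing else.

```python
import socket
import subprocess
import sys

# Hypothetical worker program: a single-threaded loop that is given only a
# host and a port, so it cannot tell a local coordinator from a remote one.
WORKER_SOURCE = r"""
import socket, sys
host, port = sys.argv[1], int(sys.argv[2])
with socket.create_connection((host, port)) as conn:
    with conn.makefile("rw", encoding="utf-8") as stream:
        for line in stream:                  # one request per line, until EOF
            n = int(line)
            stream.write(f"{n} {n * n}\n")   # reply: the input and its square
            stream.flush()
"""


def square_all(numbers, worker_count=2, host="127.0.0.1"):
    """Fan numbers out to worker processes over TCP and collect the replies."""
    with socket.create_server((host, 0)) as server:  # port 0: OS picks a free port
        port = server.getsockname()[1]
        workers = [
            subprocess.Popen([sys.executable, "-c", WORKER_SOURCE, host, str(port)])
            for _ in range(worker_count)
        ]
        links = []
        for _ in workers:
            conn, _addr = server.accept()
            links.append((conn, conn.makefile("rw", encoding="utf-8")))

        # Dispatch every request first so the workers can run concurrently.
        # (Fine for small batches; a large batch would need flow control.)
        pending = [0] * worker_count
        for i, n in enumerate(numbers):
            slot = i % worker_count
            links[slot][1].write(f"{n}\n")
            pending[slot] += 1
        for _conn, stream in links:
            stream.flush()

        # Collect exactly as many replies as were requested from each worker.
        results = {}
        for slot, (_conn, stream) in enumerate(links):
            for _ in range(pending[slot]):
                key, value = stream.readline().split()
                results[int(key)] = int(value)

        # Closing the connection is the shutdown signal: workers see EOF and exit.
        for conn, stream in links:
            stream.close()
            conn.close()
        for worker in workers:
            worker.wait()
    return results


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

Nothing in the worker refers to threads, locks, or shared memory. In this sketch, moving a worker to another machine would mean starting the same program there with a different host argument (and binding the coordinator to a reachable address); the worker code itself would not change.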
I believe the author had these points in mind when he wrote his article, but by denouncing "single-threading", he risks sending some readers down an unproductive path. Concurrency is the ultimate goal, and it's usually achieved best by a set of cooperating single-threaded programs. The word "thread" refers to a concept that resides at a level of abstraction not appropriate for this discussion, and its use can only muddle our thinking.