My complaint with threading is about correctness more than performance. With all resources shared between all threads, the scope of an error is the whole application. Use more explicit sharing, and the really dangerous errors get confined to the smaller portions of the program that actually deal with the sharing.
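To make that concrete, here is a rough sketch in Python, not a definitive demonstration. The names `config`, `buggy_worker`, `run_in_process` and `run_in_thread` are made up for illustration, and it assumes a POSIX system where the "fork" start method is available. The same buggy worker runs once as a child process, where the only thing shared is a queue handed to it explicitly, and once as a thread, where everything is shared whether you asked for it or not.

```python
import multiprocessing as mp
import queue
import threading

# State the parent cares about. A thread can reach it implicitly;
# a forked child only ever touches its own private copy.
config = {"mode": "safe"}


def buggy_worker(results):
    """Does its job, but also clobbers state it was never handed (the bug)."""
    config["mode"] = "corrupted"
    results.put(sum(range(10)))


def run_in_process():
    # Assumption: POSIX with the "fork" start method available.
    ctx = mp.get_context("fork")
    results = ctx.Queue()  # the one explicitly shared channel
    child = ctx.Process(target=buggy_worker, args=(results,))
    child.start()
    value = results.get(timeout=10)  # read before join to avoid blocking on a full pipe
    child.join()
    return value, config["mode"]


def run_in_thread():
    results = queue.Queue()
    worker = threading.Thread(target=buggy_worker, args=(results,))
    worker.start()
    worker.join()
    return results.get(), config["mode"]


if __name__ == "__main__":
    # Process first, so the fork happens before any damage is done in the parent.
    print("process:", run_in_process())
    print("thread: ", run_in_thread())
```

Both runs hand back the right answer through the queue. The difference is what else happened: in the process case the parent's `config` is untouched, so the bug stayed in the child, while in the thread case the parent's `config` has been overwritten behind its back.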
Performance scaling? Sure, shared memory doesn't scale up to the largest jobs. For that you need methods that cross machines. Those methods don't scale down to the smallest latencies. For that you need shared memory.
Still, I agree. Threading is too much the method of fashion, chosen more because everyone is doing it than on technical merit. Of course, everyone doing it means you get to use libraries other people wrote and spend less time swimming upstream.