Good point on the choice between big and LITTLE being far more nuanced than I could hope to fully capture in a one-sentence rule of thumb. I do agree that the characteristics of a given device will often need to be taken into account.
Please accept my apologies for losing your comment about thread pools. I agree that having applications base their thread-pool sizes on the number of CPUs physically configured on the device will usually be a good place to start.
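For what it is worth, here is a minimal user-space sketch of that starting point, assuming a Linux-like system whose C library supports the `_SC_NPROCESSORS_CONF` and `_SC_NPROCESSORS_ONLN` queries to `sysconf()` (a common extension rather than something POSIX requires). The helper names and the clamping policy are hypothetical, made up purely for illustration.

```c
#include <unistd.h>

/* Hypothetical helper: CPUs configured on the device, whether or not
 * they happen to be online right now.  Falls back to 1 on error. */
long configured_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_CONF);

    return n > 0 ? n : 1;
}

/* Hypothetical helper: CPUs currently online.  This value can change
 * at any time as CPUs are hotplugged, so it is only a snapshot. */
long online_cpus(void)
{
    long n = sysconf(_SC_NPROCESSORS_ONLN);

    return n > 0 ? n : 1;
}

/* Hypothetical policy: start with one worker per configured CPU,
 * clamped to the range [1, max_threads]. */
long initial_pool_size(long max_threads)
{
    long n = configured_cpus();

    if (max_threads < 1)
        max_threads = 1;
    return n < max_threads ? n : max_threads;
}
```

An application might call `initial_pool_size(64)` once at startup. Sizing from the configured count rather than the online count keeps the pool from being resized each time a CPU comes or goes, though as noted above this is only a place to start.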
I really did mean that wakeups can be delayed until a CPU has gone offline! ;-)
Here is what can happen (or at least did happen to me as of about a year ago): (1) A kthread bound to CPU 0 is awakened. (2) Before the kthread can run, CPU 0 goes offline. (3) When the kthread finally does start running, its binding to the now-offline CPU is broken, so it ends up running on some other CPU. It is possible to handle this by careful use of preemption disabling combined with checks of which CPU the kthread is actually running on.