It seems that by default you get one thread per block device, which seems entirely reasonable no matter how many block devices you have. Going below that number means slower write-out, because you can no longer keep all devices busy at once. The exception would be a single non-blocking, multiplexing thread that handles write-out for all devices, but there's probably a good reason why that isn't done. For fast devices it may be better to increase the number of threads; again, not doing so costs you write-out throughput.
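To make the "can't keep all devices busy" point concrete, here is a toy model in Python. It is only a sketch under stated assumptions: each writer thread blocks on one device at a time, every device has a fixed flush time, and the function name `writeout_makespan` is made up for illustration. It does not model how any real kernel schedules write-out, and it does not capture the case where a single fast device benefits from more than one thread.

```python
import heapq


def writeout_makespan(device_seconds, n_threads):
    """Time until every device is flushed, given blocking writer threads.

    device_seconds: hypothetical flush time per device, in seconds.
    n_threads: number of writer threads; each handles one device at a time.
    """
    if n_threads < 1:
        raise ValueError("need at least one writer thread")

    # Each heap entry is the time at which one writer thread becomes free.
    # Threads beyond the number of devices would stay idle in this model.
    free_at = [0.0] * min(n_threads, len(device_seconds))
    heapq.heapify(free_at)

    finish = 0.0
    for seconds in device_seconds:
        start = heapq.heappop(free_at)   # earliest available thread
        end = start + seconds            # it blocks until this device is done
        heapq.heappush(free_at, end)
        finish = max(finish, end)
    return finish


if __name__ == "__main__":
    devices = [1.0, 1.0, 1.0, 1.0]  # four devices, one second of dirty data each
    for threads in (1, 2, 4):
        print(threads, "thread(s):", writeout_makespan(devices, threads), "s")
```

With four equally loaded devices, halving the thread count doubles the total write-out time in this model, which is the effect described above.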
A thread pool would only be better if you would otherwise allocate too many threads per device. But if you can allocate too many threads per device, nothing prevents you from making the pool too large either, so it just shuffles the problem around.
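As a rough illustration of why a pool only moves the knob, the following Python sketch measures how many jobs run at once in a `concurrent.futures.ThreadPoolExecutor`. The helper `peak_concurrency` and the fake write-out job are hypothetical stand-ins, with a short sleep in place of blocking device I/O. The pool caps concurrency at whatever `max_workers` you pass, so picking that value too high is the same mistake as allocating too many threads per device.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def peak_concurrency(pool_size, n_jobs):
    """Run n_jobs fake write-outs in a pool and return the peak number active."""
    lock = threading.Lock()
    active = 0
    peak = 0

    def fake_writeout(_job):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.001)  # stand-in for a blocking write to a device
        with lock:
            active -= 1

    # The pool never runs more than pool_size jobs at once; that limit is
    # the only thing it adds over spawning threads directly.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        list(pool.map(fake_writeout, range(n_jobs)))
    return peak


if __name__ == "__main__":
    for size in (2, 8, 64):
        print("pool size", size, "-> peak active jobs:", peak_concurrency(size, 200))
```

The observed peak never exceeds the configured pool size, so the question of "how many threads is too many" is still yours to answer.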