Resource utilization is maximized when each processing element (think core) executes useful work, until no more useful work remains. Starting a thread/process has overhead. Context switching between threads on a processor has overhead. Spawning and switching between threads is not useful work.
High-performance servers want to spawn a thread per allocated core, and have each thread fully exercising that CPU. That's why AIO can/must beat synchronous I/O (blocking or non; that's immaterial here) -- your thread can go on managing events (of course, if the CPU is necessary for the AIO to be performed, your thread won't run anyway, but the CPU can sometimes be avoided).