This sounds sensible -- essentially it's the Windows Driver Foundation argument: you should be able to develop in a friendly, simple but inefficient environment and gradually pull in the more awesome powers of the full kernel environment as you need them. So a driver for a serial IR device that only ever needs to cope with 9.6 kbps can be written as though the system were very simple indeed, whilst one for a 100 Gbps network card perhaps needs to pull out all the stops to minimise the number of machine instructions needed to send a packet, or to maximise parallelism when working with dozens of cores.
The WDF guys needed to solve the blocking-call problem, of course. Their approach was to say that an I/O request packet had to be either completed within a driver callback or else placed in a queue for later retrieval in response to an event or timer; placing it in an internal queue would release the device-wide or instance-wide lock. Busy-waiting or sleeping inside a callback was always an error. It's troubling for this purpose that Linux's model of in-progress kernel calls is very much tied to the C stack, but it seems like it wouldn't be too difficult to remodel things with aio as the standard, and synchronous calls as a simple wrapper atop it that runs outside the driver framework's locking scheme.
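To make the shape of that concrete, here is a rough user-space sketch in Python. It is only a toy model of the idea, not WDF's or Linux's actual API: `Request`, `ToyDriver`, `submit`, `on_data` and `read_sync` are all hypothetical names invented for illustration. The callback either completes the request or parks it in a queue and returns; the synchronous call is just a wrapper that waits after the framework lock has been dropped.

```python
import threading
from collections import deque


class Request:
    """Hypothetical stand-in for an I/O request packet."""

    def __init__(self):
        self.result = None
        self._done = threading.Event()

    def complete(self, result):
        # Mark the request finished and wake anyone waiting on it.
        self.result = result
        self._done.set()

    def wait(self, timeout=None):
        # Returns True once complete() has been called.
        return self._done.wait(timeout)


class ToyDriver:
    """Toy framework plus driver; the lock is held only inside callbacks."""

    def __init__(self):
        self.lock = threading.Lock()  # assumed device-wide lock
        self.pending = deque()        # requests parked for later
        self.buffer = deque()         # data that arrived with no reader

    def _on_read(self, req):
        # Driver callback: complete now or park the request. Never sleeps.
        if self.buffer:
            req.complete(self.buffer.popleft())
        else:
            self.pending.append(req)

    def submit(self, req):
        # Asynchronous entry point: run the callback under the lock, then
        # return. A parked request no longer holds the lock.
        with self.lock:
            self._on_read(req)

    def on_data(self, data):
        # Event callback (think interrupt or timer): retrieve a parked
        # request and complete it, or buffer the data for a later read.
        with self.lock:
            if self.pending:
                self.pending.popleft().complete(data)
            else:
                self.buffer.append(data)


def read_sync(driver, timeout=None):
    """Synchronous call as a thin wrapper over the asynchronous one."""
    req = Request()
    driver.submit(req)
    # The blocking happens here, outside the driver's locking scheme.
    if not req.wait(timeout):
        raise TimeoutError("read did not complete in time")
    return req.result
```

The point of the split is that nothing inside `ToyDriver` ever blocks: a caller that wants blocking semantics gets them from `read_sync`, while a caller that wants asynchronous I/O calls `submit` directly and checks the `Request` later.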