Oh, it's much, much older. Control Data mainframes of the 60s made even
*system calls* asynchronously. The CPUs had no I/O capabilities, so a
program would write a system request into the first word of its process's
memory. Peripheral processors (PPs) would periodically scan the start of
each process's memory, see the request, and set a bit to acknowledge receipt.
The I/O would begin, and later the PP would write to the word again to signal
completion, with a toggle bit serving as a semaphore between PP and CPU.
Meanwhile, your faultlessly brilliant code would hopefully continue doing
nuclear bomb calculations, cryptographic decrypts, or payroll, waiting
for the washing-machine disks to do their deeds and the PPs to propagate
their transfers back into central memory.
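
For anyone who wants to see the shape of that handshake, here is a rough toy simulation in Python. It is only a sketch of the idea as described above: the flag values (`REQUEST`, `ACK`, `DONE`), the class names, and the tick-based I/O delay are all made up for illustration and are not the real CDC word layout or PP code.

```python
from dataclasses import dataclass, field

# Hypothetical flag bits for the request word; NOT the real CDC encoding.
REQUEST = 0b001  # CPU program has posted a request
ACK = 0b010      # PP has seen the request and started the I/O
DONE = 0b100     # PP has finished the I/O


@dataclass
class Process:
    """A CPU program whose memory word 0 doubles as the request mailbox."""
    memory: list = field(default_factory=lambda: [0] * 8)
    work_done: int = 0

    def post_request(self):
        # The "system call": just a store into the first word of memory.
        self.memory[0] = REQUEST

    def compute(self):
        # Stand-in for useful work done while the I/O is in flight.
        self.work_done += 1


class PeripheralProcessor:
    """Polls word 0 of every process; the CPU never calls into it."""

    def __init__(self, io_ticks):
        self.io_ticks = io_ticks  # assumed I/O latency, in scan passes
        self.remaining = {}       # process index -> scan passes left

    def scan(self, processes):
        for i, proc in enumerate(processes):
            word = proc.memory[0]
            if word == REQUEST:
                # New request: acknowledge it and start the "I/O".
                proc.memory[0] = word | ACK
                self.remaining[i] = self.io_ticks
            elif word & ACK and not word & DONE:
                self.remaining[i] -= 1
                if self.remaining[i] == 0:
                    proc.memory[1] = 42           # pretend transferred data
                    proc.memory[0] = word | DONE  # signal completion


def run(io_ticks=3):
    proc = Process()
    pp = PeripheralProcessor(io_ticks)
    proc.post_request()
    # The CPU keeps computing and only checks the completion bit.
    while not proc.memory[0] & DONE:
        proc.compute()
        pp.scan([proc])
    return proc


if __name__ == "__main__":
    p = run()
    print(f"work done while waiting: {p.work_done}, data: {p.memory[1]}")
```

The point of the sketch is that the only thing the CPU side ever does is store to and load from its own memory; everything else happens on the PP's schedule.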