The trick is a handoff of responsibility from the kernel to userspace. Userspace knows which device woke up the system because it has just interacted with the corresponding device driver, and it can then decide whether or not to continue holding off suspend. The kernel therefore needs to hold off suspend only until userspace has started that interaction, which might be a read() system call. So userspace would start holding off suspend just before doing the read(), and the kernel would stop holding off suspend as part of the read(). Because the two holds overlap, there is no window in which the system could suspend with the event still unhandled. After the read() returned, userspace would use the data it read to determine whether it should keep holding off suspend, and, if not, stop doing so.
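The ordering of that handoff can be sketched as a small userspace-only model in C. This is an illustration under stated assumptions, not the Android wakelock API or any real kernel interface: `struct suspend_state`, `driver_wakeup_event()`, `model_read()`, `user_handle_event()`, and the event strings are all hypothetical names invented here, and `model_read()` merely stands in for a real read() on a device file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: one flag for each party that can hold off suspend. */
struct suspend_state {
    bool kernel_blocks;   /* set by the driver when a wakeup event arrives */
    bool user_blocks;     /* set by the userspace process */
    const char *pending;  /* event data waiting to be read, or NULL */
};

/* Suspend may proceed only when nobody is holding it off. */
bool can_suspend(const struct suspend_state *s)
{
    return !s->kernel_blocks && !s->user_blocks;
}

/* Driver side: an event woke the system, so the kernel holds off suspend. */
void driver_wakeup_event(struct suspend_state *s, const char *data)
{
    s->pending = data;
    s->kernel_blocks = true;
}

/* Stand-in for read(): hands over the data and drops the kernel's hold. */
const char *model_read(struct suspend_state *s)
{
    const char *data = s->pending;

    s->pending = NULL;
    s->kernel_blocks = false;
    return data;
}

/*
 * Userspace side. Returns true if userspace decided to keep holding off
 * suspend after inspecting the data.
 */
bool user_handle_event(struct suspend_state *s)
{
    const char *data;
    bool keep;

    s->user_blocks = true;   /* 1. hold off suspend just before the read */
    data = model_read(s);    /* 2. the kernel lets go as part of the read */

    /* 3. Decide from the data; "incoming-call" is a made-up example event. */
    keep = data != NULL && strcmp(data, "incoming-call") == 0;
    if (!keep)
        s->user_blocks = false;
    return keep;
}
```

In this model, calling `driver_wakeup_event()` followed by `user_handle_event()` never passes through a state in which `can_suspend()` is true while the event is still unread, because userspace sets its flag before the kernel clears its own. A real implementation would also have to deal with concurrency and with multiple simultaneous holders, which this sketch ignores.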
According to the Android developers, wakelocks handle this handoff in a natural way. Others argue that all of the suspend-blocking work should happen in user space, so that the kernel does not need to worry about it. And there are probably a large number of other opinions out there about how all of this should work, both informed and otherwise. ;-)