There are precedents on the non-Android Linux desktop
A PCI sound chipset, a USB headset, and a Bluetooth headset are very different from the perspective of the Linux kernel. The kernel is content to mediate both the USB and PCI devices through a common userspace API (ALSA PCM), but Bluetooth audio is just opaque data to the kernel until it reaches userspace.
So the only way to have these three sound-making devices behave interchangeably as far as, say, talking to your friend over VoIP, is either for every program to have separate handling for the different cases, or for a userspace abstraction to sit in between. That's one of the things the once widely despised PulseAudio does for you.
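To make the shape of that argument concrete, here is a minimal sketch of what "a userspace abstraction in between" means from the application's side. Everything in it is hypothetical: `AudioSink`, `AlsaPcmSink`, `BluetoothSink` and `play` are names invented for illustration, and this is not how PulseAudio is actually structured or what its API looks like. The backends only report where the audio would go rather than touching real hardware.

```python
from abc import ABC, abstractmethod


class AudioSink(ABC):
    """Hypothetical common interface a VoIP program would code against."""

    @abstractmethod
    def write(self, pcm: bytes) -> str:
        """Accept raw PCM samples and describe where they were routed."""


class AlsaPcmSink(AudioSink):
    # Stands in for PCI and USB devices, which the kernel already
    # exposes through the same ALSA PCM interface.
    def __init__(self, card: str) -> None:
        self.card = card

    def write(self, pcm: bytes) -> str:
        return f"{len(pcm)} bytes -> ALSA PCM device on {self.card}"


class BluetoothSink(AudioSink):
    # Stands in for a headset whose audio has to be handled in userspace.
    def __init__(self, address: str) -> None:
        self.address = address

    def write(self, pcm: bytes) -> str:
        return f"{len(pcm)} bytes -> userspace Bluetooth stream to {self.address}"


def play(sink: AudioSink, pcm: bytes) -> str:
    # The application never branches on the transport; it only sees AudioSink.
    return sink.write(pcm)
```

The point of the design is that `play` stays the same whichever sink is plugged in, so the per-device knowledge lives in one place instead of in every program.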
Or consider webcams. Kernel policy forbids format conversions and the like inside the kernel, and in recent years that policy has been increasingly enforced. Once upon a time, cameras with custom encodings or weird format decisions would hide some conversion code in their kernel driver. No more. Today, if you want cameras to work properly, you use a userspace library that masks the various differences between cameras and offers to deliver plain RGB data (or various other useful formats) regardless of what the native camera hardware produces.
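As an illustration of the kind of conversion that now lives in userspace, here is a rough sketch that turns packed YUYV frames (two pixels stored as Y0, U, Y1, V) into plain RGB. It is not taken from any real library. The function name `yuyv_to_rgb` is made up for this example, and I am assuming full-range BT.601 (JFIF-style) coefficients; a real camera may well use a different range or colour matrix.

```python
def _clamp(value: float) -> int:
    """Round to the nearest integer and clamp into the 0..255 byte range."""
    return max(0, min(255, int(round(value))))


def yuyv_to_rgb(frame: bytes) -> bytes:
    """Convert packed YUYV data to packed RGB (3 bytes per pixel).

    Assumes full-range BT.601 coefficients (an assumption, see above).
    """
    if len(frame) % 4 != 0:
        raise ValueError("YUYV data must come in 4-byte groups (2 pixels)")

    out = bytearray()
    for i in range(0, len(frame), 4):
        y0, u, y1, v = frame[i:i + 4]
        # Both pixels in a group share the same chroma (U, V) samples.
        for y in (y0, y1):
            r = y + 1.402 * (v - 128)
            g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
            b = y + 1.772 * (u - 128)
            out += bytes((_clamp(r), _clamp(g), _clamp(b)))
    return bytes(out)
```

An application that only ever receives the output of something like this does not need to know which native format the camera happened to speak.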
The kernel is irreplaceable when it comes to access control, resource management, and so on. But when it comes to adding a "take a picture for your avatar" feature to a program, I want a userspace abstraction that can capture the picture just as well from a $5000 pro HD video camera as from the $5 USB webcam I, the developer, happen to own, without me needing to buy dozens of cameras to check they work.