The sandboxing idea would work for many classes of failure, notably those where the driver goes quietly moribund without breaking anything else on the way down. But it's stuck if the driver corrupts core kernel data structures before it dies, and it's *really* stuck if the driver leaves the hardware in an unclean state. (This is probably especially important for graphics cards, but any card with a complex stateful protocol is likely to have this problem.)
This is fixable, but you'd need shadow initialization/shutdown code for *every* driver whose hardware can't simply be reinitialized from scratch. (Even then, some cards have internal state complex enough that it can be very hard to work out how to reset the card when you can't trust the driver's own record of that state. Again, graphics cards are the big villains here, but some SCSI cards I've used have been notable for bizarre state machines which are hard to reset at some points.)
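To make the per-driver cost concrete, here is a toy sketch of what I mean by shadow shutdown code. It is only a model of the idea, not kernel code: `FakeCard`, `SHADOW_RESET`, `shadow_reset_simple_nic` and `recover` are all names I made up for illustration. The point is that the reset routine never looks at anything the dead driver recorded, and that a driver with no registered routine simply can't be recovered.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class FakeCard:
    """Toy stand-in for a device; a crashed driver may leave any values here."""
    registers: Dict[str, int] = field(default_factory=dict)


def shadow_reset_simple_nic(card: FakeCard) -> bool:
    """Hypothetical shadow reset: force a known state without reading
    anything the (untrusted) driver recorded about the card."""
    card.registers.clear()
    card.registers["ctrl"] = 0  # assumed "idle" value for this toy card
    return True


# One entry per driver that has shadow code; this is the per-driver cost.
SHADOW_RESET: Dict[str, Callable[[FakeCard], bool]] = {
    "simple_nic": shadow_reset_simple_nic,
}


def recover(driver_name: str, card: FakeCard) -> bool:
    """Try to bring the card back to a known state after its driver died.

    Returns False when nobody wrote shadow reset code for this driver,
    which is the case the sandbox cannot handle on its own.
    """
    reset = SHADOW_RESET.get(driver_name)
    if reset is None:
        return False
    return reset(card)
```

A real card with an awkward state machine would need far more than clearing one register, which is exactly why this gets painful for graphics and some SCSI hardware.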