This would be an interesting problem. In theory, you could clean-room re-implement the existing firmware as stage 1, then replace the buggy IP stack you've now implemented with a good one. Provided the hardware was up to it, you'd be fine.
This brings up the issue raised by another poster of states. The amount of state you'd need to support must exceed that of the CPU and motherboard. Can you do this?
Yes, on two conditions: the card MUST have its own PCI Express controller, AND it must be capable of DMA operations. If it can access main memory in exactly the same way as the CPU, =plus its own=, then provided the regular kernel's VMM could take care of the ethernet adapter's memory needs as well as the OS', then the ethernet card can handle just as much state information as the OS could have at exactly the same speed.
Instead of trying to reverse-engineer a card, initially, I would suggest a proof-of-concept by building an ethernet card with either the Linux or NetBSD TCP/IP stack AND full DMA access to memory.
(Really, the full DMA access should really be there or you'd never be able to pass the packets from the card to the buffer of the client software through kernel bypass. However, that's an aside.)
If such a card was developed, even as a crude garage prototype, you could learn a lot about what such devices actually need in order to work well and thereby anticipate what the kernel will need at some future point when hardware does shift in nature.