Aaaaand I've still goofed it. I'm not taking into account... well, everything.
I see the actual test in my head, but I can't code it right now due to fatigue. But even with all necessary calculations, being integer math, it'll take no more than a few tens of cycles. Even on a register-starved x86, putting a couple temporary variables on the stack will only pollute the cache, before over-writing the temps anyway. It shouldn't take more than a microsecond to check for overlap.