Nah. There's no real reason the hardware has to be cache coherent. It is just a lot easier if it is.
For example, the OS could force a cache flush on both CPUs when migrating a task.
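To make that concrete, here is a toy sketch of the migration case. It is only a model, not how any real kernel does it: the `CPU` class, its `flush` method and the `migrate` helper are all made-up names, and "memory" is just a dict. Each CPU has a private write-back cache and nothing keeps the two in sync except the explicit flush.

```python
class CPU:
    """Toy CPU with a private write-back cache and no hardware coherence."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory: address -> value
        self.cache = {}        # private cache: address -> value
        self.dirty = set()     # addresses written but not yet written back

    def read(self, addr):
        # A miss fills from memory; a hit returns whatever is cached, stale or not.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def write(self, addr, value):
        # Write-back policy: the store stays in the cache until flushed.
        self.cache[addr] = value
        self.dirty.add(addr)

    def flush(self):
        # Write back every dirty line, then drop the whole cache.
        for addr in self.dirty:
            self.memory[addr] = self.cache[addr]
        self.dirty.clear()
        self.cache.clear()


def migrate(src, dst):
    """Hypothetical OS hook: flush both CPUs before the task moves."""
    src.flush()   # push the task's writes out to memory
    dst.flush()   # drop any stale copies on the destination


memory = {0x10: 1}
cpu0, cpu1 = CPU(memory), CPU(memory)

cpu1.read(0x10)             # cpu1 caches the old value
cpu0.write(0x10, 42)        # the task, running on cpu0, updates it
stale = cpu1.read(0x10)     # without a flush, cpu1 still sees the old value

migrate(cpu0, cpu1)
fresh = cpu1.read(0x10)     # after the flush, cpu1 refills from memory
```

The flush on the destination matters as much as the one on the source: writing the data back does no good if the new CPU keeps reading its own old copy.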
Threaded applications would have a tougher time, but even that has been handled in the past. For example, the compiler and/or threading library could either track memory writes made while a lock is held, so that just those cache lines are flushed before unlock, or use the big hammer of a full cache flush before unlock.
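A rough sketch of the "track the writes" option, again as a toy model rather than anything a real threading library ships. `CPU`, `SoftLock` and every method on them are invented for illustration, and actual mutual exclusion is not modeled (the example runs the two CPUs one after the other). One assumption goes beyond the flush-before-unlock idea: the next CPU to take the lock also invalidates its own copies of the protected addresses, since otherwise it could keep reading stale lines.

```python
class CPU:
    """Toy CPU with a private cache and no hardware coherence."""

    def __init__(self, memory):
        self.memory = memory   # shared main memory: address -> value
        self.cache = {}        # private cache: address -> value

    def read(self, addr):
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        return self.cache[addr]

    def write(self, addr, value):
        self.cache[addr] = value          # stays local until written back

    def write_back(self, addr):
        self.memory[addr] = self.cache[addr]

    def invalidate(self, addr):
        self.cache.pop(addr, None)


class SoftLock:
    """Hypothetical lock that does the coherence work in software."""

    def __init__(self):
        self.protected = set()   # every address ever written under this lock
        self.written = set()     # addresses written by the current holder

    def acquire(self, cpu):
        # Assumption: drop possibly stale copies of the protected data.
        for addr in self.protected:
            cpu.invalidate(addr)
        self.written = set()

    def store(self, cpu, addr, value):
        # Stores made under the lock go through here so they can be tracked.
        cpu.write(addr, value)
        self.written.add(addr)

    def release(self, cpu):
        # Flush only the lines this holder actually wrote.
        for addr in self.written:
            cpu.write_back(addr)
        self.protected |= self.written


memory = {0: 0}
cpu0, cpu1 = CPU(memory), CPU(memory)
lock = SoftLock()

cpu1.read(0)                    # cpu1 caches the old value

lock.acquire(cpu0)
lock.store(cpu0, 0, 7)
before_unlock = memory[0]       # the write has not reached memory yet
lock.release(cpu0)

lock.acquire(cpu1)
seen_by_cpu1 = cpu1.read(0)     # refilled from memory after the invalidate
lock.release(cpu1)
```

The big-hammer version would skip the bookkeeping and simply write back and drop the entire cache in `release` and `acquire`, trading extra cache misses for simplicity.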
Cache coherency is really just a crutch. Lots of embedded programmers and Cell programmers (no cache protocol on the SPEs) know how to work without it. :-)