Well, another way to make things easier for programmers is to implement things like RCU in libraries, which Paul and others have done. The complexity may still be there, but it's hidden from most RCU users.
I'm not sure whether an in-hardware lazy update cache would be easier or harder for developers to understand. In general, application programmers are horrified by the concept of eventual consistency. The problem from their perspective is not the lack of direct hardware support, but the concept itself. Things like SQL were developed so that they wouldn't have to think about these kinds of ordering issues.
As long as it's just us systems programmers dealing with this stuff, I doubt that Intel or AMD is going to be very motivated to make it "simpler."