So, I think the strong "boil the ocean and break everything" specification you describe would actually solve the "web authors don't follow the spec" problem you mention.
If bad behavior is such a problem, then perhaps the next HTML specs and implementations shouldn't be so permissive of it. Likewise, it should be easy for both web developers and browser developers to test and verify good behavior. Otherwise, the problem will just keep getting worse as more features are piled on.