That's exactly the problem -- high-frequency words can get away with irregularity, and the higher the frequency, the more irregularity they can get away with (is/am/be/are/was/wtf?). But "data" isn't high frequency at all, so speakers are normalizing it to follow the usual English rules.