I never understood why the compiler doesn't just offer an optimization mode that optimizes for speed on the selected target CPU. Instead we have -O, -O2, and -O3, where each level enables a fixed list of optimization passes (-O3, for example, is documented as turning on function inlining, tree vectorization, and so on). This means that if a particular optimization tends to help on some CPUs but not others, there is no way to just choose whatever is best, even when you have told the compiler exactly which processor to generate code for.
After all, if you really wanted a fixed set of optimization passes, you could specify them by hand with a long list of -fthis and -fthat flags. Most people who aren't interested in that level of detail would rather let the compiler decide what's best - even if its decisions are not quite as good as a skilled developer hand-tuning the flags.
To be clear, I am not saying that the compiler can magically determine which optimization flags will help on a particular piece of code. But it can know in general which ones tend to work on which CPUs. An older CPU may benefit from loop unrolling, while a newer one, whose performance is more memory-bound, usually will not. It makes more sense for these heuristics to be codified in the compiler, as a table mapping CPU models to optimizations, with that table explicitly open to change in newer compiler versions.