Indeed, the article should have stated that two consecutive calls to the same pure function
with the same arguments will give the same result. That's the difference between pure and
constant functions: a constant function always returns the same result for the same input,
no matter when it is called.
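To make the distinction concrete, here is a minimal sketch in C. It assumes the article is talking about GCC's `pure` and `const` function attributes; the names `scale`, `scaled`, `square`, `twice_scaled` and `scaled_before_and_after` are made up for illustration.

```c
/* Hypothetical global state, introduced only for this sketch. */
static int scale = 2;

/* pure: no side effects, but the result may depend on global memory. */
__attribute__((pure)) int scaled(int x) { return x * scale; }

/* const: the result depends on the arguments and nothing else. */
__attribute__((const)) int square(int x) { return x * x; }

/* Two consecutive calls with nothing written in between:
 * the compiler is allowed to evaluate scaled(x) only once. */
int twice_scaled(int x) { return scaled(x) + scaled(x); }

/* A write to global memory sits between the calls, so the second
 * call to the pure function has to be evaluated again.
 * With a const function such as square, it would not. */
int scaled_before_and_after(int x, int new_scale) {
    int before = scaled(x);
    scale = new_scale;
    int after = scaled(x);
    return after - before;
}
```

Whether the compiler actually merges the two calls in `twice_scaled` depends on the optimization level; the attributes only give it permission to.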
With regard to your suggestion to just use Haskell, I'd point out that ghc does very little
CSE: its CSE pass (-fcse) is deliberately conservative. I love Haskell, and it's a great
language, but this sort of optimization is not one of
its strengths. Laziness interferes with its strengths as a pure functional language in this
case. The trouble is that it's hard to determine when the memory spent holding on to the
result of a computation is a price worth paying to avoid recomputing it. For primitive data
types the answer is
obvious, and we should do CSE, but ghc doesn't perform that analysis, and even if it did, I
wouldn't be happy with a CSE pass that only worked on functions returning Int/Double/Bool etc.
Of course, dead code elimination comes for free in Haskell, since values that are never used
are never evaluated, so that does gain us something.
But as the article points out, it's really only useful for things like conditional compilation,
which is much less of a gain than CSE.