I agree it would be more convenient (and possibly faster) if the zero-fill check were done inside the compression algorithm. I'm not signing up for that, though... the LZO algorithm/code is incredibly dense!
But even though the CPU is much faster than the memory bus, much of the cost of scanning a page (or even copying it) comes from the cache misses incurred pulling all of the page's data bytes from RAM into cache. So scanning a page for zero fill also serves as an effective prefetch: once the page has been scanned, and assuming the data cache is not tiny, the compression algorithm will be reading already-cached data.
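To make the idea concrete, here is a minimal sketch of that kind of zero-fill scan. The function name and word-at-a-time approach are my own illustration, not code from any particular patch; the point is that touching every word of the page both answers the zero-fill question and warms the cache for a subsequent compression pass.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the page is entirely zero, 0 otherwise.
 * Reading the page word-by-word pulls the whole page into the data cache,
 * so a compressor run immediately afterwards sees mostly cache hits. */
static int page_is_zero_filled(const void *page, size_t len)
{
    const unsigned long *p = page;
    size_t i, n = len / sizeof(*p);  /* page size is a multiple of word size */

    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

A caller would typically run this on each page before handing it to the compressor, e.g. `if (page_is_zero_filled(page, 4096)) { /* store a zero-page marker */ }`.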
Lots of good discussion; it would be good to measure the different options.