"It should be possible to have an rsync variant that omits using the checksums, and simply overwrites the destination file always, like cp -- but with the rest of the rsync interface left intact. That should be much faster on some hardware."
Note that rsync already does the "overwrites the destination file always, like cp" that you describe, via the --whole-file option, which is, according to the man page, "the default when both the source and destination are specified as local paths".
As far as I understand, when rsync acts on local files, the only work it does beyond a "normal" cp is computing the whole-file checksum.
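To make that extra cost concrete, here is a rough Python sketch of "cp plus a whole-file checksum". It is only an illustration of the idea, not rsync's actual code: the function name copy_with_checksum, the 64 KiB buffer size, and the use of MD5 are my own assumptions (which hash rsync really uses depends on its version and options).

```python
import hashlib
import os
import tempfile

CHUNK_SIZE = 64 * 1024  # arbitrary buffer size, assumed for this sketch


def copy_with_checksum(src, dst):
    """Copy src to dst like cp, hashing every byte on the way (hypothetical helper)."""
    digest = hashlib.md5()  # stand-in hash; not necessarily what rsync uses
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(CHUNK_SIZE)
            if not chunk:
                break
            fout.write(chunk)     # the part a plain cp would also do
            digest.update(chunk)  # the extra per-byte work for the checksum
    return digest.hexdigest()


if __name__ == "__main__":
    # Small demo on a throwaway file in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.bin")
        dst = os.path.join(tmp, "dst.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(200_000))
        print(copy_with_checksum(src, dst))
```

The point is that the read and write loop is the same as for a plain copy; the checksum adds one hash update per chunk, which costs CPU time but no extra disk I/O.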