The constraint that it must "look reasonable to a casual human inspector" doesn't make sense.
The point of the checksum is that you can download the source code and trust it WITHOUT having
to inspect it. Indeed, the point of the checksum is that you trust the maintainers, and the
checksum serves to link that trust with the tarball, with the presumption that the source code
as last checked by the maintainers was what they checksummed.
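
For concreteness, here is a rough sketch of what "checking the checksum" amounts to. The helper names (`file_digest_hex`, `matches_published`) are my own, not from any project's tooling, and the algorithm is a parameter so the same check works with something other than MD5. Note that nothing here looks at the contents; a match only links the file to whatever the maintainers hashed.

```python
import hashlib
import hmac


def file_digest_hex(path, algorithm="md5", chunk_size=1 << 20):
    """Hash a file in chunks so a large tarball need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path, published_hex, algorithm="md5"):
    """Compare the file's digest with the value published by the maintainers.

    `published_hex` is assumed to be a hex string copied from the download page.
    """
    actual = file_digest_hex(path, algorithm)
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

Something like `matches_published("release.tar.gz", "<hex from the download page>")` is the whole check (the file name is a placeholder).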
Also, we're dealing with tar+gzip+PHP code. Certainly it would be completely trivial to append
arbitrary binary data to the tar file, which may or may not (depending on the tar
implementation) give a warning when unpacking. I'm not even going to bother analyzing this,
because it's pointless; an attacker need only find one avenue which meets his criteria.
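
To illustrate the appending point with one implementation only: the sketch below builds a tiny uncompressed tar in memory, tacks junk bytes onto the end, and reads both back with Python's `tarfile` module. The file name, payload, and junk bytes are made up for the example. It says nothing about other tar implementations or about the gzip layer, which may well behave differently.

```python
import hashlib
import io
import tarfile


def build_tar(name, payload):
    """Build an uncompressed tar archive in memory containing a single file."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def read_members(tar_bytes):
    """Return {member name: contents} as Python's tarfile module sees them."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        return {m.name: tar.extractfile(m).read() for m in tar.getmembers()}


# Hypothetical stand-in for the real tarball contents.
original = build_tar("index.php", b"<?php echo 'hello'; ?>\n")

# Arbitrary bytes appended after the end-of-archive marker.
padded = original + b"\xde\xad\xbe\xef" * 16

# The unpacked contents are identical, but the archive bytes (and so the hash) differ.
same_contents = read_members(original) == read_members(padded)
different_hash = hashlib.md5(original).digest() != hashlib.md5(padded).digest()
```

Here the reader stops at the end-of-archive marker and never looks at the trailing bytes, so the unpacked files are unchanged while the archive itself is a different byte string.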
As for file size, this may or may not pose a problem. From a practical standpoint, I
doubt it does. This is a relatively huge tarball. The requisite additional data is at most 512
bits, IIRC--the internal block size of MD5--maybe less. People who manually download the
tarball--and bother to check the hash--will also likely ignore the file size, and rightly so.
A secure cryptographic hash isn't sensitive to the input size; if it is, it's not
secure, by definition.
I'm not familiar enough with the weaknesses of MD5 to judge the complexity of an attack which
doesn't just use a suffix. It may be only marginally more difficult to find a
collision if you modify the tail instead of appending; I dunno.
Point is, the MD5 checksum is useless, because you cannot trust it anymore.