Posted Sep 27, 2012 12:26 UTC (Thu) by njs (guest, #40338)
The idea was sufficiently well known for Val Henson (now Val Aurora) to publish a paper arguing against it in May 2003; she cites six earlier systems that use it: http://valerieaurora.org/review/hash/node2.html
The oldest appears to be rsync, with Tridge's thesis coming out in 1999, and for de-duplication specifically I'd check the paper on a backup system called "Pastiche" that was formally published in 2002...