Posted Aug 2, 2010 3:23 UTC (Mon) by vogelke (guest, #4271)
Parent article: On comment spam
> One could try content-based filtering approaches, but they have their own hazards.
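Two filters I'd definitely recommend are POPFile and Nilsimsa. POPFile is a naive Bayes classifier that learns word probabilities from mail you've already sorted. Nilsimsa is a locality-sensitive hash: near-identical messages get near-identical digests, which gives you a measure of how likely it is that two documents were not independently created. Both work very well at catching this type of crap.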
I've had the same mail address for around a decade, so I'm on every spam list on the planet, but POPFile keeps the spam that reaches my inbox down to around one or two per day out of 150-200 messages. Site: http://getpopfile.org/
The main Nilsimsa site seems to be down from a disk failure, but there's a Perl module on CPAN.
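For anyone who hasn't seen how a fuzzy digest of this kind behaves, here is a rough Python sketch of the general idea. To be clear, this is not the real Nilsimsa algorithm (which uses its own hashing table and windowing scheme), and the function names `trigram_digest` and `similarity` are made up for illustration. It just hashes character trigrams into 256 buckets, turns the bucket counts into a 256-bit digest, and counts how many bits two digests share.

```python
import hashlib

BUCKETS = 256  # one digest bit per bucket


def trigram_digest(text):
    """Return a 256-entry list of 0/1 bits summarising the text's trigrams.

    Simplified illustration only; this is NOT the actual Nilsimsa digest.
    """
    # Normalise case and whitespace so trivial reformatting doesn't matter.
    normalized = " ".join(text.lower().split())
    counts = [0] * BUCKETS
    for i in range(len(normalized) - 2):
        trigram = normalized[i:i + 3].encode("utf-8")
        # First byte of SHA-256 picks a bucket in the range 0..255.
        bucket = hashlib.sha256(trigram).digest()[0]
        counts[bucket] += 1
    mean = sum(counts) / BUCKETS
    # A bit is set when its bucket is busier than average.
    return [1 if c > mean else 0 for c in counts]


def similarity(a, b):
    """Count matching digest bits: 256 means the digests are identical."""
    return sum(x == y for x, y in zip(trigram_digest(a), trigram_digest(b)))


spam_a = ("Buy cheap replica watches online today, best prices guaranteed, "
          "visit our site now")
spam_b = ("Buy cheap replica watches online today, best prices guaranteed, "
          "visit our shop now")
ham = ("The kernel patch fixes a race condition in the scheduler when "
       "tasks migrate between CPUs")

print(similarity(spam_a, spam_b))  # near-duplicate spam: high score
print(similarity(spam_a, ham))     # unrelated text: noticeably lower
```

Changing one word only disturbs the handful of trigrams that overlap it, so the two spam variants still agree on nearly every bit, while an unrelated comment does not. A filter would flag a new comment when its score against a known spam digest crosses some threshold, and that threshold has to be tuned on real data.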