The Grumpy Editor's guide to bayesian spam filters
Posted Feb 23, 2006 9:42 UTC (Thu) by walterh (guest, #19113)
Parent article: The Grumpy Editor's guide to bayesian spam filters
I think that running SpamAssassin with the network tests enabled is unfair and invalidates the statistics. After all, your mail was collected over some time and only then fed to SpamAssassin. In the meantime, other users had already reported the spam in your set to the network databases that SpamAssassin queries. This is very different from the situation where you pipe your incoming mail through SpamAssassin in real time, because at that point most spam has not been reported yet.
Posted Feb 23, 2006 10:40 UTC (Thu)
by robdinn (guest, #30753)
But maybe that is an argument for SpamAssassin + greylisting :)
Posted Feb 23, 2006 13:36 UTC (Thu)
by bk (guest, #25617)
That's a good point.
(of course that wouldn't help if everyone followed that approach)
It's also unfair in terms of performance. SA with only local tests is *much* faster (although, arguably, still not 'fast' in the bogofilter sense of the word).