The Grumpy Editor's guide to bayesian spam filters

Posted Feb 23, 2006 9:42 UTC (Thu) by walterh (guest, #19113)
Parent article: The Grumpy Editor's guide to bayesian spam filters

I think that running SpamAssassin with the network tests enabled is unfair and invalidates the statistics. After all, your mail was collected over some time and only then fed to SpamAssassin. In the meantime, the network databases that SpamAssassin queries had already had the spam from your set reported by other users. This is very different from the situation where you pipe your incoming mail through SpamAssassin in real time, because then most spam hasn't been reported yet.



The Grumpy Editor's guide to bayesian spam filters

Posted Feb 23, 2006 10:40 UTC (Thu) by robdinn (guest, #30753)

That's a good point.

But maybe that is an argument for SpamAssassin + greylisting :)
(Of course, that wouldn't help if everyone followed that approach.)

The Grumpy Editor's guide to bayesian spam filters

Posted Feb 23, 2006 13:36 UTC (Thu) by bk (guest, #25617)

It's also unfair in terms of performance. SA with only local tests is *much* faster (although, arguably, still not 'fast' in the bogofilter sense of the word).


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds