The best of both worlds - a hybrid approach?
Posted Feb 27, 2006 9:42 UTC (Mon) by Ross
In reply to: The best of both worlds - a hybrid approach?
Parent article: The Grumpy Editor's guide to bayesian spam filters
One thing I've always wondered is why they do it "backwards". Presumably they could do away with a lot of manual tuning if they fed the individual test results into the token stream as words, so that something like "SATEST=TEST_RESULT" gets incorporated into the Bayesian decision making. There are two complicating factors. One is protecting it from spammers who put those tokens in their own messages (maybe prepend an installation-specific "password" to the token). The other is chunking the numeric results so that similar inputs produce matching tokens (raw floating point numbers obviously won't match often enough to be useful). Of course it wouldn't do anything to speed up processing.
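To make the idea concrete, here is a rough Python sketch of how test results might be turned into tokens. Everything in it is an assumption for illustration: the function names (bucket, rule_tokens, tokenize), the rule name EXAMPLE_RULE, the 0.5 bin width and the token layout are all made up, and a keyed hash (HMAC) stands in for the literal "password" prefix so the secret itself never ends up in the token database. It is not how any existing filter does it.

```python
import hashlib
import hmac
import math


def bucket(score, width=0.5):
    """Chunk a numeric score into a coarse bin, labelled by the bin's lower edge."""
    idx = math.floor(score / width)
    return f"{idx * width:+g}"


def rule_tokens(results, secret, width=0.5):
    """Turn {rule_name: score} into tokens tagged with a per-installation keyed hash.

    `secret` is a bytes value known only to this installation (hypothetical).
    """
    tokens = []
    for name, score in sorted(results.items()):
        label = f"{name}={bucket(score, width)}"
        # Without the secret, a spammer cannot produce the matching tag.
        tag = hmac.new(secret, label.encode(), hashlib.sha256).hexdigest()[:12]
        tokens.append(f"{tag}:{label}")
    return tokens


def tokenize(body, results, secret):
    """Ordinary whitespace-split body words followed by the tagged rule tokens."""
    return body.split() + rule_tokens(results, secret)
```

With this layout, scores of 2.3 and 2.4 for the same rule land in the same bin and yield the same token, while a spammer who writes "EXAMPLE_RULE=+2" in the message body only produces an ordinary, untagged word. The resulting list would then go to the Bayesian classifier like any other token stream.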