The best of both worlds - a hybrid approach?

Posted Feb 27, 2006 9:42 UTC (Mon) by Ross (subscriber, #4065)
In reply to: The best of both worlds - a hybrid approach? by corbet
Parent article: The Grumpy Editor's guide to bayesian spam filters

One thing I've always wondered is why they do it "backwards". Presumably they could do away with a lot of manual tuning if they fed the individual test results into the token stream as words, so that something like "SATEST=TEST_RESULT" gets incorporated into the Bayesian decision making. Two complicating factors would be how to protect it from spammers putting those tokens into their own messages (maybe prepend an installation-specific "password" to the token), and chunking the numeric results so that they match for similar inputs (raw floating-point numbers obviously won't generally match each other). Of course it wouldn't do anything to speed up processing.
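To make the idea concrete, here is a rough sketch in Python of how test results might be turned into tokens. It is only an illustration under assumptions: the function names `bucket` and `rule_tokens`, the rule names, the 0.5 bucket width, and the `SATEST:` prefix are all made up here, and none of this is how any existing filter actually works. Instead of literally prepending a password, the sketch uses a keyed hash (HMAC) with an installation-specific secret, which serves the same purpose: a spammer who does not know the secret cannot write the token into a message.

```python
import hashlib
import hmac
import math


def bucket(score, step=0.5):
    """Round a numeric result down to a multiple of `step`,
    so that similar scores end up producing the same token."""
    return math.floor(score / step) * step


def rule_tokens(results, secret, step=0.5):
    """Turn {rule_name: result} into unforgeable pseudo-words.

    `results` maps hypothetical rule names to a bool (hit or miss)
    or a number (score). `secret` is an installation-specific key
    (bytes), assumed to be kept private.
    """
    tokens = []
    for name, value in sorted(results.items()):
        # Check bool first: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            label = "hit" if value else "miss"
        else:
            label = f"{bucket(value, step):+.1f}"
        message = f"{name}={label}".encode()
        tag = hmac.new(secret, message, hashlib.sha256).hexdigest()[:16]
        tokens.append(f"SATEST:{tag}")
    return tokens


if __name__ == "__main__":
    # Hypothetical rule names and results, for illustration only.
    results = {"EXAMPLE_HEADER_RULE": True, "EXAMPLE_BODY_SCORE": 3.14}
    for token in rule_tokens(results, b"per-install secret"):
        print(token)
```

The resulting tokens would be appended to the word list that the Bayesian classifier trains on and scores, so each rule's weight is learned from the corpus rather than tuned by hand. The tests still have to run to produce the tokens, so as noted this does nothing for speed.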


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds