A grumpy editor's bayesian followup

Posted Mar 2, 2006 15:24 UTC (Thu) by jmason (guest, #13586)
In reply to: A grumpy editor's bayesian followup by zmi
Parent article: A grumpy editor's bayesian followup

It's important to note that, without good training, BAYES_99 may indeed
fire regularly on nonspam mail -- that's the danger with user-trained
rules. In the *default* scenario, therefore, a score of 3.5 is reasonably
optimal. However, if good training is supplied, it's a good plan to
increase the BAYES_99 score to 5.0, or even more. (I think we might
mention that somewhere in the documentation -- I hope. ;)

Also, it's worth noting that "BAYES_99" doesn't really mean a 99%
probability of spam (or, put the other way, a 1% chance of ham).
SpamAssassin uses the Fisher Inverse Chi-Square Procedure
described at http://garyrob.blogs.com/whychi90.pdf , and as a result these
are no longer true probability values -- so don't expect to see
probabilistic distributions.
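
To make that a bit more concrete, here is a rough Python sketch of the chi-square combining step as I understand it from Gary Robinson's write-up. It is not SpamAssassin's actual code; the names `chi2q` and `combine` are made up for illustration, and the per-token values fed in are assumed to already lie strictly between 0 and 1.

```python
import math


def chi2q(x2, v):
    """Upper-tail probability of a chi-square value x2 with v degrees
    of freedom. This closed form is only valid for even v."""
    assert v % 2 == 0, "degrees of freedom must be even"
    m = x2 / 2.0
    term = total = math.exp(-m)
    for i in range(1, v // 2):
        term *= m / i
        total += term
    # Rounding can push the sum a hair above 1.0.
    return min(total, 1.0)


def combine(probs):
    """Combine per-token spam values (each strictly between 0 and 1)
    into one indicator between 0 (hammy) and 1 (spammy)."""
    n = len(probs)
    ln_p = sum(math.log(p) for p in probs)
    ln_q = sum(math.log(1.0 - p) for p in probs)
    spam = 1.0 - chi2q(-2.0 * ln_q, 2 * n)  # evidence for spam
    ham = 1.0 - chi2q(-2.0 * ln_p, 2 * n)   # evidence for ham
    return (spam - ham + 1.0) / 2.0


if __name__ == "__main__":
    print(combine([0.99] * 10))    # many spammy tokens
    print(combine([0.99, 0.01]))   # strong but conflicting evidence
```

Note what happens with strongly conflicting tokens: the two sides cancel and the result lands in the middle, meaning "unsure". That is one reason the output is better read as an indicator than as a probability.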

Great articles btw. The grumpy editor has outdone himself ;)

--j.



Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds