Training and tweaking
Posted Mar 2, 2006 16:22 UTC (Thu) by corbet (editor, #1)
In reply to: A grumpy editor's bayesian followup by glouis
Parent article: A grumpy editor's bayesian followup
FWIW, the filters *were* "carefully trained." Just over 2000 messages were pulled out of the stream and used only for that purpose. They were inspected closely to avoid mistraining the filters. How much more careful does one need to be?
I did avoid tweaking the various knobs exported by some of the filters, with the well-documented SpamAssassin exception. I believe that was the right choice: most users (even those who are not "newbies") are unlikely to mess with them, and the defaults should be reasonable.