
(statistically) biased tests?


Posted Sep 12, 2002 18:26 UTC (Thu) by corbet (editor, #1)
In reply to: (statistically) biased tests? by bockman
Parent article: Spam avoidance techniques

"A better test was maybe to train the filter with half of the data set and then test it with the other half."

That was the first (15%) test, essentially. And the linux-kernel test too.



(statistically) biased tests?

Posted Sep 13, 2002 21:01 UTC (Fri) by ElMiguel (guest, #741)

But the numbers most people will remember from this article will be the ones from the test on 100% of the lwn@lwn.net messages, since they show the most striking advantage in favour of Bogofilter. And, as Bockman says, that is the least realistic test case of all, since you had previously optimized the filter for precisely that set of messages. Perhaps you should add a note to the article itself, to warn people who don't read the comments about that circumstance?

(Other than that, and overlooking spamc/spamd, great articles, as always :-)).


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds