The Grumpy Editor's guide to bayesian spam filters

Posted Feb 23, 2006 0:20 UTC (Thu) by dlang (subscriber, #313)
Parent article: The Grumpy Editor's guide to bayesian spam filters

Unfortunately you missed POPFile (which just released 0.22.4 today).

It is more than just a spam/ham filter: it can filter into many different categories (there are people who use it to filter into >50 categories).
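To make the "many categories" idea concrete, here is a minimal sketch of a naive Bayes classifier that scores a message against any number of named buckets rather than just spam and ham. This is not POPFile's code (POPFile is written in Perl and its tokenizer and scoring are more involved); the class name, the tokenizer, and the smoothing choice are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class BucketClassifier:
    """Illustrative multinomial naive Bayes over arbitrary named buckets."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> messages trained

    @staticmethod
    def tokenize(text):
        # Deliberately crude: lowercase runs of letters, digits, _ and '
        return re.findall(r"[a-z0-9_']+", text.lower())

    def train(self, bucket, text):
        self.word_counts[bucket].update(self.tokenize(text))
        self.message_counts[bucket] += 1

    def classify(self, text):
        """Return the best-scoring bucket, or None if nothing is trained."""
        words = self.tokenize(text)
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_messages = sum(self.message_counts.values())
        best_bucket, best_score = None, -math.inf
        for bucket, counts in self.word_counts.items():
            # Log prior: share of training messages filed in this bucket
            score = math.log(self.message_counts[bucket] / total_messages)
            # Add-one smoothing, with one extra slot for unseen words
            denominator = sum(counts.values()) + len(vocabulary) + 1
            for word in words:
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket


# Three buckets instead of two; the names are arbitrary.
clf = BucketClassifier()
clf.train("spam", "cheap pills buy now limited offer")
clf.train("personal", "lunch on friday? see you at noon")
clf.train("kernel-patches", "[PATCH] fix null pointer dereference in struct inode")
print(clf.classify("[PATCH] struct page refcount fix"))
```

Adding a 51st bucket costs nothing more than another `train()` call with a new name, which is the property that makes this approach usable as a general mail sorter.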

It started out as a POP3 filter, but now also handles SMTP and NNTP, and provides XML-RPC and IMAP interfaces.

Its IMAP interface is fairly unique in that it doesn't act as a proxy between your mail client and the mail server. Instead it acts as a client itself and watches your inbox, automatically moving messages into subfolders for you (and you train it by moving messages to the proper subfolder from wherever they get put by POPFile).
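The "acts as a client itself" design can be sketched with Python's standard `imaplib`. This is only an illustration of the idea, not how POPFile implements it: the function names, the `classify` callable, and the use of the UNSEEN flag to find new mail are assumptions, and folder names are assumed to contain no spaces.

```python
import imaplib


def connect(host, user, password):
    """Open an IMAP-over-SSL session; host and credentials are placeholders."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def file_new_messages(conn, classify, inbox="INBOX"):
    """Move each unseen message in `inbox` to the folder classify() names.

    `conn` is a logged-in imaplib connection. `classify` is a hypothetical
    callable that takes the raw message bytes and returns a folder name,
    or None to leave the message where it is.
    """
    conn.select(inbox)
    status, data = conn.search(None, "UNSEEN")
    if status != "OK" or not data or not data[0]:
        return []
    moved = []
    for num in data[0].split():
        # BODY.PEEK[] fetches the message without marking it as read
        status, parts = conn.fetch(num, "(BODY.PEEK[])")
        if status != "OK":
            continue
        raw = parts[0][1]
        folder = classify(raw)
        if folder and folder != inbox:
            # Classic IMAP "move": copy, then flag the original as deleted
            if conn.copy(num, folder)[0] == "OK":
                conn.store(num, "+FLAGS", "\\Deleted")
                moved.append((num, folder))
    # Expunge once at the end so sequence numbers stay valid in the loop
    conn.expunge()
    return moved
```

A daemon would call `file_new_messages()` periodically on a connection from `connect()`. Because no proxy is involved, the mail client needs no reconfiguration at all. A real tool would also have to remember which messages it has already looked at, since anything left unread in the inbox is seen again on the next poll.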

POPFile provides a web interface to reclassify messages and do other configuration tasks (IMAP users won't bother with this much once they set it up).
More info at
http://popfile.sourceforge.net/ or http://sourceforge.net/projects/popfile/

David Lang


The Grumpy Editor's guide to bayesian spam filters

Posted Feb 23, 2006 1:54 UTC (Thu) by mbcook (subscriber, #5517)

I have to agree. I've been using POPFile for years, both on Windows and OS X. It is a fantastic
little program. It doesn't detect "ham/spam", it puts things into buckets. Now I use Ham and
Spam as my buckets, but you can add more and it will learn where things go. So you could make
a bucket for kernel patches and it would learn when things go in there. That would probably
increase the accuracy since it doesn't have to lump kernel patches (which would be largely C
code) in with Ham (which would contain all sorts of stuff).

I recently tried the IMAP support, which was cool. I'm not an IMAP person so I went back to using
POP3 but it did have one really cool feature: classification by folders. Since it monitors what is in
each folder, it knows when you move things. Thus, when you move a message out of your Spam
folder into your Ham folder, it learns that it made a mistake (and vice versa). This is such a cool
idea. Because of the way it works you could set a little home server to run POPFile in IMAP mode
and then whenever and wherever you check your e-mail you can reclassify things and it will
learn from that.
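The folder-based training can be pictured as a diff between two snapshots of the mailbox. The sketch below is an assumption about how such a scheme could work, not a description of POPFile's internals: the function names are made up, messages are keyed by a stable identifier such as the Message-ID header, and folder names are assumed to match bucket names one to one.

```python
def detect_moves(previous, current):
    """Return (message_id, old_folder, new_folder) for each moved message.

    Both arguments map a stable message identifier to the folder the
    message was seen in at a given poll.
    """
    moves = []
    for message_id, new_folder in current.items():
        old_folder = previous.get(message_id)
        if old_folder is not None and old_folder != new_folder:
            moves.append((message_id, old_folder, new_folder))
    return moves


def corrections(moves, filed_by_filter):
    """Keep only moves out of the folder the filter itself chose.

    `filed_by_filter` maps message identifiers to the folder the filter
    put them in; a move away from that folder means the user disagreed.
    """
    return [
        (message_id, new_folder)
        for message_id, old_folder, new_folder in moves
        if filed_by_filter.get(message_id) == old_folder
    ]


# Snapshot at the last poll, snapshot now, and what the filter decided.
last_poll = {"<a@example>": "Spam", "<b@example>": "INBOX", "<c@example>": "Spam"}
this_poll = {"<a@example>": "INBOX", "<b@example>": "INBOX", "<c@example>": "Spam"}
filed = {"<a@example>": "Spam", "<c@example>": "Spam"}

moves = detect_moves(last_poll, this_poll)
for message_id, bucket in corrections(moves, filed):
    # A real filter would retrain its word counts for this message here
    print(f"retrain {message_id} as {bucket}")
```

Since the only input is the folder contents on the server, it does not matter which mail client or machine did the moving, which is what makes the home-server setup work.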

It also supports SSL if you have the needed Perl modules installed. SSL support works great and
provides security that is nice to have when you often work on open networks where someone
could get your e-mail password.

Check it out.

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.