
Bayesian Spam avoidance possible hole ?

Posted Sep 18, 2002 6:29 UTC (Wed) by guybar (guest, #798)
Parent article: Spam avoidance techniques


It seems a bit stupid to ask, but what's stopping spammers from attaching a random scientific, financial, or otherwise serious article after the actual spam?

Wouldn't this defeat the Bayesian techniques described?



How does online shopping work

Posted Nov 4, 2002 1:31 UTC (Mon) by mcisaac (guest, #7442) [Link]

Great article!

I'm worried that routine activities such as online shopping might be difficult with this approach. In the "definition of spam" section, the paper touches on what is and is not spam, referencing a merchant receipt as an example of something commercial that isn't spam.

My question is, does the receipt pass the Bayesian filter or get flagged as spam?

Bayesian Spam avoidance possible hole ?

Posted Apr 15, 2003 4:27 UTC (Tue) by mattknox (guest, #10640) [Link]

Actually, this would not work all that well unless the spammers chose a document in your field. If you get a lot of mail about scripting languages or kernel development, then an article about one of these topics might help spam get through (at least a few times). However, if the attached article is one you would not normally receive in the mail, it would do nothing, because its terms would look like neither spam nor ham. So the only way for spammers to win with this strategy is to find words that appear in mail going to a lot of people and that have not yet appeared in spam. This will be, at best, an uphill battle for them.
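
To make the argument concrete, here is a rough sketch of a naive-Bayes-style score in Python. It is not taken from the article or from any particular filter: the 0.4 score for never-seen tokens, the 15-token cutoff, and every probability in `token_probs` are assumptions invented for illustration.

```python
from math import prod

UNKNOWN = 0.4   # assumed score for a token the filter has never seen
TOP_N = 15      # assumed number of "most interesting" tokens to combine

def spam_probability(tokens, token_probs):
    """Combine per-token spam probabilities, naive-Bayes style."""
    probs = [token_probs.get(t, UNKNOWN) for t in set(tokens)]
    # Keep the tokens whose probability is furthest from the neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:TOP_N]
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    return spam / (spam + ham)

# Hypothetical per-token probabilities learned from one recipient's mail.
token_probs = {
    "viagra": 0.99, "unsubscribe": 0.97, "free": 0.95, "click": 0.93,
    "kernel": 0.02, "patch": 0.03, "scheduler": 0.02, "perl": 0.04,
}

spam_words = ["viagra", "unsubscribe", "free", "click"]
unfamiliar = ["mitochondria", "ribosome", "enzyme", "protein"]  # not in this user's mail
familiar = ["kernel", "patch", "scheduler", "perl"]             # this user's own field

base = spam_probability(spam_words, token_probs)
with_unfamiliar = spam_probability(spam_words + unfamiliar, token_probs)
with_familiar = spam_probability(spam_words + familiar, token_probs)

print(f"spam alone:              {base:.4f}")
print(f"plus unfamiliar article: {with_unfamiliar:.4f}")
print(f"plus familiar article:   {with_familiar:.4f}")
```

With these made-up numbers, padding from an unfamiliar field leaves the score essentially unchanged, while padding drawn from the recipient's own field pulls it below 0.5. That is the distinction drawn above: the decoy only helps if its words already look like ham to that particular recipient.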

Bayesian Spam avoidance possible hole ?

Posted Oct 2, 2004 12:58 UTC (Sat) by jerry (guest, #25162) [Link]

Theoretically, it will not defeat the Bayesian filter, but in reality it does affect the filter, especially in terms of speed and memory usage. Personally, I think designing an algorithm that "forgets" rarely used tokens may be tedious or costly, if not impractical...
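
As a rough illustration of what "forgetting" could look like (this is not SpamWeed's method, and whether it is practical at scale is exactly the open question), the Python sketch below drops tokens that are both rare and stale. `TokenStore`, `max_age`, and `min_count` are hypothetical names and thresholds chosen for the example.

```python
from dataclasses import dataclass

@dataclass
class TokenStat:
    spam_count: int = 0
    ham_count: int = 0
    last_seen: int = 0   # index of the last trained message containing the token

class TokenStore:
    """Hypothetical token table that can forget rare, stale tokens."""

    def __init__(self, max_age=1000, min_count=2):
        self.stats = {}
        self.clock = 0            # number of messages trained so far
        self.max_age = max_age    # assumed staleness threshold, in messages
        self.min_count = min_count

    def train(self, tokens, is_spam):
        self.clock += 1
        for t in set(tokens):
            s = self.stats.setdefault(t, TokenStat())
            if is_spam:
                s.spam_count += 1
            else:
                s.ham_count += 1
            s.last_seen = self.clock

    def prune(self):
        """Delete tokens that are both stale and rarely seen; return how many."""
        stale = [t for t, s in self.stats.items()
                 if self.clock - s.last_seen > self.max_age
                 and s.spam_count + s.ham_count < self.min_count]
        for t in stale:
            del self.stats[t]
        return len(stale)

# A decoy word seen once, followed by ordinary spam without it.
store = TokenStore(max_age=2, min_count=2)
store.train(["viagra", "mitochondria"], is_spam=True)
for _ in range(3):
    store.train(["viagra"], is_spam=True)
removed = store.prune()
```

The cost is a full pass over the table on every `prune` call plus an extra counter per token, so the tedium and expense mentioned above do not go away; the sketch only shows that the bookkeeping itself is small.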

The approach we used in a real-world implementation, SpamWeed, is to combine the Bayesian filter with other technologies, especially those that can extract useful information from among the spammer's decoys, which significantly increases the Bayesian filter's efficiency and stability.

Jerry: Engineer SpamWeed.com


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds