Actually they don't bother with state
Posted Feb 24, 2006 2:32 UTC (Fri) by AnswerGuy (guest, #1256)
In reply to: The Grumpy Editor's guide to bayesian spam filters by shane
Parent article: The Grumpy Editor's guide to bayesian spam filters
... they just retry everything three or four times at five-minute intervals.
It's only an incremental extra effort to deliver multiple times regardless of whether the earlier copies reached you or not.
I get over 1000 slices of spam per day through my personal inbox ... even after greylisting and some (very limited) blacklisting. SpamAssassin catches over 99% of that, but I still end up with 20 or so per day that make their way through SA. Often I get about 5 copies of any that do get through.
Periodically I go through the spam folder (and one "longlines" folder which contains spam with very long lines --- basically no standards-compliant linefeeds at all, which has broken my procmail recipes for SA and YAVR in the past). (YAVR is an anti-virus recipe, since SA normally doesn't catch viruses as "spam" per se.)
I've only noticed a tiny handful of false positives dumped into my spam folder by SA (fewer than half a dozen in almost two years). This is a very unscientific measure, since I am pretty heavy-handed with the delete key when I do spot check the spam folder, and I'm only human, with a limited amount of time and energy to spend on rescuing mail sent by strangers whose content looks too "spammy."
I don't keep metrics on rejections ... nor on greylisting deferrals that never get delivered. I only have one confirmed case of a bit of e-mail that was not spam but which got greylisted for 49000 seconds (my wife was attempting a PayPal password change and their mail server didn't respect the conventional retry delays --- the Postfix greylisting daemon we're using punishes apparent attackers with an exponential backoff). (She resolved that by simply whitelisting them and forcing another PW change, while also teaching the silly tech support person there, over the phone, all about proper MTA behavior and greylisting.)
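To make the backoff behavior concrete, here is a minimal sketch of a greylist that doubles the wait every time a sender retries too early. It is not the code of the Postfix greylisting daemon mentioned above; the `Greylist` class, the 300-second base delay, and the one-day cap are all assumptions picked for illustration.

```python
class Greylist:
    """Toy greylist keyed on the (client IP, sender, recipient) triplet.

    Hypothetical sketch: names and default delays are assumptions,
    not taken from any real greylisting daemon.
    """

    def __init__(self, base_delay=300, max_delay=86400):
        self.base_delay = base_delay   # seconds a new triplet must wait
        self.max_delay = max_delay     # cap on the doubled delay
        self.pending = {}              # triplet -> (allowed_at, current_delay)
        self.passed = set()            # triplets that have already gotten through

    def check(self, client_ip, sender, recipient, now):
        """Return "accept" or "defer" for a delivery attempt at time `now`."""
        key = (client_ip, sender, recipient)
        if key in self.passed:
            return "accept"
        if key not in self.pending:
            # First sighting: defer and start the clock.
            self.pending[key] = (now + self.base_delay, self.base_delay)
            return "defer"
        allowed_at, delay = self.pending[key]
        if now >= allowed_at:
            # The sender waited long enough: let it through from now on.
            del self.pending[key]
            self.passed.add(key)
            return "accept"
        # Retried too early: double the delay and restart the clock.
        delay = min(delay * 2, self.max_delay)
        self.pending[key] = (now + delay, delay)
        return "defer"
```

Under these assumptions a sender that blindly retries at five-minute intervals is accepted on its second attempt, while one that retries every 60 seconds keeps pushing its own window further out, which is roughly how a legitimate but impatient server can end up deferred for hours.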
So, greylisting helps a little ... but too many spammers have adapted and now simply, blindly try everything multiple times. (Also, anyone who does successfully dump their spam on an open relay gets the delivery retries for free ... all standards-compliant MTAs do that, and most open relays are just old, unpatched, poorly configured copies of sendmail.)
(I do NOT use ORBS-type blocking ... I refuse to configure my MTAs to implement a set of dynamically changing policies that are set by strangers ... so I only add my own connection blocking sparingly ... so far.)
The main thing that seems to limit my spam load is my own paltry bandwidth. My mailservers share bandwidth with DNS and web traffic (from my clients and servers) over a little old 144Kbps IDSL line. I have about the equivalent of two bonded 56K leased lines. Apparently a significant number of spam cannons time out and drop connections on such a slow link (they've got millions of other targets to get to).
JimD
(The Linux Gazette "Answer Guy" --- no I didn't pick the name; yes, my e-mail address is still the same: jimd@starshine.org --- and published monthly in several languages and countries around the world for several years).
reject on black, filter on grey
Posted Feb 24, 2006 13:53 UTC (Fri) by copsewood (subscriber, #199)
While greylisting makes sense for the reasons other contributors suggest, you might spend less time going through your spam folder, and make fewer false discards by hand, if you reject more of the very-high-probability spam at the MTA level. Checking manually what went into the spam folder is quicker and more reliable when only the doubtful messages land there; otherwise there will be too many in the folder to do the job accurately.
Also, the occasional genuine message that gets rejected by the MTA that accepts mail from across admin boundaries will result in a bounce to the real sender, without sending bounces to innocent victims, which is what happens if you reject at an internal incoming MTA. I reject at the MTA when the SpamAssassin score is above 10, and send anything scoring above 7.5 to my spam folder via procmail. I also use the Spamhaus DNSBL, which currently rejects about 900 spams a week on my server and with which I have never seen a single false positive in about 2 years of use, and I use less reliable DNSBLs, e.g. SPEWS, for spam folder filtering.
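The "reject on black, filter on grey" policy described in this comment can be sketched as a small decision function. This is only an illustration: the real setup lives in MTA configuration and procmail recipes, and the `route` function and its parameter names are hypothetical. The two thresholds are the ones quoted in the comment.

```python
REJECT_ABOVE = 10.0        # SpamAssassin score above which the MTA rejects
SPAM_FOLDER_ABOVE = 7.5    # score above which mail is filed in the spam folder


def route(score, on_trusted_dnsbl=False, on_other_dnsbl=False):
    """Decide what to do with a message (hypothetical helper).

    score            -- SpamAssassin score for the message
    on_trusted_dnsbl -- sending host is on a highly reliable list (e.g. Spamhaus)
    on_other_dnsbl   -- sending host is on a less reliable list (e.g. SPEWS)
    """
    # "Black": near-certain spam is refused during the SMTP conversation,
    # so any bounce is generated by the sending side, not by us.
    if on_trusted_dnsbl or score > REJECT_ABOVE:
        return "reject"
    # "Grey": doubtful mail is accepted but filed for a manual spot check.
    if on_other_dnsbl or score > SPAM_FOLDER_ABOVE:
        return "spam-folder"
    return "inbox"
```

With this split, only the doubtful middle band reaches the spam folder, which keeps the manual review small.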