
Backscatter increase clogs inboxes

By Jake Edge
April 9, 2008

Backscatter, also known as blowback, is the result of a spammer forging the sender address on an email that is sent to a non-existent address. Many mail servers do not reject invalid addresses when they receive the email and instead generate a bounce message sometime later. The unfortunate victim, then, is the one whose address was forged as the sender. Sometimes, hundreds or thousands of bounce messages can be generated which flood the inbox of an innocent bystander.

Backscatter seems to be on the rise recently; the LWN inbox has seen a huge increase in the number of bounces over the last week or so. Some Google domains may be contributing to the problem, but that cannot explain all of it. One basic problem is that many mail servers are generating the bounce messages after accepting mail for invalid addresses, rather than rejecting it while the SMTP transaction is still in progress.

When a mail server gets a connection from a sending machine, it gets several pieces of information about the email in addition to its contents. Both a "from" and "to" address are included in this extra information, which is usually called the envelope, for obvious reasons. After receiving each piece of the envelope, a mail server has the opportunity to reject the message. Typically this isn't done for valid-looking sender addresses, except in limited blacklist situations, but it certainly can and should be done when the recipient address is invalid.
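
As a rough illustration of rejecting at SMTP time, the sketch below simulates the replies a server could give to the envelope commands. It is not taken from any particular mail server; the mailbox set, the function names, and the reply texts are assumptions made up for the example.

```python
# Hypothetical set of valid local mailboxes; a real server would consult
# its user database or alias tables instead.
VALID_RECIPIENTS = {"editor@example.org", "postmaster@example.org"}


def rcpt_response(recipient: str) -> str:
    """Return the SMTP reply for a single RCPT TO command."""
    if recipient.lower() in VALID_RECIPIENTS:
        return "250 OK"
    # Rejecting here leaves any error report to the connecting client,
    # so nothing is ever sent to the (possibly forged) envelope sender.
    return "550 No such user here"


def handle_envelope(mail_from: str, rcpt_to: list[str]) -> list[str]:
    """Simulate the replies to MAIL FROM and to each RCPT TO."""
    # The sender is accepted without checks in this sketch, which mirrors
    # the common case described in the article.
    replies = ["250 OK"]
    replies.extend(rcpt_response(recipient) for recipient in rcpt_to)
    return replies
```

With these assumptions, a message for an unknown mailbox is refused during the transaction, and the server never has to generate a bounce of its own.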

Due to a variety of mail server configuration issues, many mail servers do not avail themselves of the opportunity to reject mail for invalid recipients. Instead, they defer their decision until sometime later. Servers that relay mail will not know whether some of the addresses they relay to are valid, while other servers (qmail for example) separate the SMTP conversation program from the local delivery program for security reasons and thus do not have that information available. Other valid or semi-valid reasons exist, but once the mail has been accepted, the proper means of indicating a bad address is no longer available.

In the days before spam—remember those?—a mail server could generally trust that the sender address in the envelope was the real sender. So an incorrectly addressed email could be bundled up in a bounce message and sent to the sender. If the sender address is valid, it is very little different than a bounce that is generated by the sender's machine when the mail gets rejected at SMTP time. Unfortunately, the majority of sender addresses these days are forged.

But spammers don't want to use just any forged address; they want to use something that is valid or appears valid. Mail servers have gotten better at testing sender addresses for validity before accepting mail from them. So, where does an enterprising spammer get a valid email address? They pick one at random from their list of "500,000 guaranteed opt-in email addresses" that they bought from some other miscreant. They use those lists both as targets for their spam and as a source of sender addresses.

As might be guessed, the SpamAssassin mailing lists have been discussing the problem recently, especially trying to find ways to reduce the amount received. SpamAssassin does have the VBounce plugin to recognize bounce messages. By default, it doesn't increase the score of bounces by much as it is meant to be used with procmail to put bounces in a separate place from spam.

Another idea floated on the list is to use SPF or DKIM records for a domain. The belief is that spammers avoid using those domains because it is likely to cause their message to be immediately classified as spam. Anecdotal evidence seems to indicate that backscatter can be significantly reduced in this way.



Backscatter increase clogs inboxes

Posted Apr 10, 2008 1:59 UTC (Thu) by motk (subscriber, #51120) [Link]

qmail is especially bad at this, and there are so many dodgy qmail setups out there ...

Backscatter increase clogs inboxes

Posted Apr 10, 2008 4:40 UTC (Thu) by dlang (✭ supporter ✭, #313) [Link]

I thought I read that the most extensive use of SPF was by spammers' domains.

On the other hand, if you do have an SPF definition and the mail server does check it when it
receives the mail, it won't send it on to the second-level process that's eventually bouncing
the message.

Backscatter increase clogs inboxes

Posted Apr 10, 2008 4:49 UTC (Thu) by zlynx (subscriber, #2285) [Link]

In the early days of SPF some sites were configured to give bonus points or whitelist source
domains with SPF.

The right way to use SPF is negative scoring only.  If email doesn't match its domain SPF then
give it spam points ... whatever you happen to think it's worth.
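
A minimal sketch of "negative scoring only" follows. It evaluates just a tiny subset of SPF (the `ip4`, `ip6`, and `all` mechanisms) and is not a conforming implementation; the function names and the default penalty of 2.0 points are assumptions chosen for illustration.

```python
import ipaddress

# SPF qualifier characters and the result each one produces on a match.
QUALIFIERS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_result(record: str, client_ip: str) -> str:
    """Evaluate a small subset of SPF: ip4/ip6 mechanisms and 'all'."""
    ip = ipaddress.ip_address(client_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    for term in terms[1:]:
        qualifier = "+"
        if term[0] in QUALIFIERS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return QUALIFIERS[qualifier]
        elif name == "all":
            return QUALIFIERS[qualifier]
        # Other mechanisms (a, mx, include, ...) need DNS lookups and are
        # skipped in this sketch.
    return "neutral"


def spf_spam_points(result: str, penalty: float = 2.0) -> float:
    """Negative scoring only: penalize a failure, never reward a pass."""
    return penalty if result == "fail" else 0.0
```

For example, with the record `v=spf1 ip4:192.0.2.0/24 -all`, a connection from 198.51.100.7 evaluates to a fail and earns the penalty, while a pass simply adds nothing to the score.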

what is SPF good for ?

Posted Apr 10, 2008 16:50 UTC (Thu) by copsewood (subscriber, #199) [Link]

Personally, having implemented it and then given up, I think the only useful application for
SPF is whitelisting. If you score based on SPF pass or fail and this increases your false
positives/negatives there is no point using it in this way, unless your objective is to punish
people for incorrect or unmaintained SPF setups. However, if you have whitelisted the domain
as having a well-managed mail system then SPF can give you some confidence a message from a
particular IP address is from that domain.

what is SPF good for ?

Posted Apr 10, 2008 17:10 UTC (Thu) by zlynx (subscriber, #2285) [Link]

That's another good use of it.  I don't whitelist at the mail server level so I didn't think
of it.

As for punishing people with bad setups.  Yes!

Admins are already punished for running open relays, not having reverse DNS records, firewall
blocking their sending SMTP servers and many other things.  If they publish a SPF record, it
had better be correct.

what is SPF good for ?

Posted Apr 11, 2008 15:53 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

"As for punishing people with bad setups. Yes!"

If only there were some way to do that without punishing the sender and recipient of the mail more.

I have often seen instances of mail recipients rejecting my mail out of spite, based on an opinion of how the mail system should work. In every instance, the recipient would have enjoyed receiving my mail more than I would have enjoyed him receiving it. In most cases, it was a reply to an email he sent me.

If SPF is too complex try CSV/CSA

Posted Apr 10, 2008 18:03 UTC (Thu) by copsewood (subscriber, #199) [Link]

I think there are 2 reasons SPF hasn't delivered much help in practice.

  1. Not much use other than to help with whitelisting known good domains. It pushes the problem back from knowing what the good and bad IP addresses are to knowing what the good and bad domains are, but only helps here for known good domains with SPF records consistent with email envelopes.
  2. It tries to go too far and ends up too complex and difficult to maintain. (I've implemented SPF and believe me, it's a mess). If there is any regular change in where your domain email users want to send their mail from, maintaining a useful SPF DNS record becomes unlikely.

Knowing which domain is responsible for a sending MTA is likely to be easier than knowing which addresses an envelope From: (not the header From) can reasonably be sent from. The Microsoft take on SPF, SenderID is even worse because it tries to validate the header From and related headers.

If it is more easy to know good from bad domains than good from bad addresses, CSV-CSA provides a much simpler check of the domain responsible for the sending MTA and doesn't care about any envelope or body headers beyond the HELO/EHLO greeting. Presumably if the MTA is run from a well managed and reputable domain, the rest of the message is more likely to be authentic. For those particularly interested in message authenticity (useful if you want to know a message claiming to be from your bank is actually from your bank) then DomainKeys can be used to give stronger assurances. However, DomainKeys isn't reliable for mail going through mailing lists or other gateways that mangle the body or headers of the message.

If SPF is too complex try CSV/CSA

Posted Apr 12, 2008 19:46 UTC (Sat) by kevinbsmith (guest, #4778) [Link]

For those of you who don't naturally think in RFC-speak, here is a gentler introduction to
CSA:
  http://www-uxsup.csx.cam.ac.uk/~fanf2/hermes/doc/antiforg...

It's still not quite as "plain English" as I would prefer, but it's not bad. I would be
interested to hear other opinions about a) how much good for individuals who adopt it
tomorrow, b) the likelihood of it being widely adopted, and c) how much good it could do if
widely adopted.

I'm still sad about SPF. The worst part was when I set up both email hosting and outgoing smtp
services at pobox.com (who themselves were among the SPF originators), and was still unable to
find or get a simple recipe for configuring SPF.

If SPF is too complex try CSV/CSA

Posted Apr 17, 2008 11:07 UTC (Thu) by copsewood (subscriber, #199) [Link]

Good article thanks. I think that SPF is probably redundant, because if you want to know the
sending MTA is responsibly managed CSV/CSA together with a domain reputation system is
probably better. If you want to know the message is authentic, Domainkeys offers a better
solution. I don't think there is much overlap in function between Domainkeys and CSV/CSA but
SPF tries to overlap both and does neither job well.

Backscatter increase clogs inboxes

Posted Apr 10, 2008 6:45 UTC (Thu) by dcovey (guest, #51072) [Link]

SPF and DKIM have been in development for years. Both assume that an email
provider will supply a secure, authenticated relay and clients will select
the correct relay based on their 'from' domain. This hasn't happened on a
large scale yet, from what I've seen. 

SPF etc

Posted Apr 17, 2008 7:26 UTC (Thu) by odie (guest, #738) [Link]

I run into two problems with SPF (haven't looked at DKIM).

First, I often send mail from my laptop, connected to the internet with various means. Many
networks have firewalls blocking outgoing connections to port 25, meaning I have to relay my
mail via their SMTP gateway instead of my own. I suppose I could tunnel all my outgoing mail
to my SMTP server via ssh or some such, but I have several less technical users of my domain
who also are heavy laptop users, or use my domains for personal mails when at work etc.

Secondly, I run a bunch of mailing lists. This means a lot of mail from various senders is
relayed by my server. The official solution to this is to rewrite the sender in the mail
headers, which seems like a terrible kludge. A mechanism whereby I could insert a server
signature in the headers indicating that I have OK'd the sender would be a better solution.

SPF just doesn't fit in with how SMTP works.

SPF etc

Posted Apr 17, 2008 9:11 UTC (Thu) by cras (guest, #7000) [Link]

Submission port (587, RFC 2476) helps with your first problem. It's not usually blocked by
firewalls. I wish it would get better support from clients though.

Backscatter increase clogs inboxes

Posted Apr 10, 2008 13:37 UTC (Thu) by dwmw2 (subscriber, #2063) [Link]

It's not particularly difficult to avoid backscatter. I never send MAIL FROM:<dwmw2@infradead.org>, and thus I never need to accept bounces to that address.

Instead of using my raw email address as the SMTP reverse-path of outgoing mail, my mailservers automatically rewrite it to include a timestamp (and an md5 hash to make it non-trivial to fake). Then they can recognise and accept only valid bounces to mail which I did actually send, while rejecting the backscatter from fakes.

As an added bonus, when I started doing this, people whose mailservers bother with sender verification callouts were also able to reject the mail faked to appear from dwmw2@infradead.org too.

Backscatter increase clogs inboxes

Posted Apr 10, 2008 16:22 UTC (Thu) by eli (guest, #11265) [Link]

Interesting idea.  Do you have a link to the software you use for that, 
and configuration details?  What drawbacks have you run into with this 
technique?

Backscatter increase clogs inboxes

Posted Apr 10, 2008 16:34 UTC (Thu) by jake (editor, #205) [Link]

> Interesting idea.  Do you have a link to the software you use for that,
> and configuration details?  What drawbacks have you run into with this technique?

I probably should have referred to it in the article, but we had some information about Bounce
Address Verification in http://lwn.net/Articles/189531/ 

I think that is what David is referring to or is something similar.  The info in that article
may be getting out of date now, though.

jake

Backscatter increase clogs inboxes

Posted Apr 11, 2008 11:20 UTC (Fri) by dwmw2 (subscriber, #2063) [Link]

> Interesting idea. Do you have a link to the software you use for that, and configuration details? What drawbacks have you run into with this technique?

It was originally done in response to the brain-damage of SPF — an implementation of the 'Sender Rewriting Scheme' which the SPF nutters thought every existing forwarding host would end up implementing to conform to their retroactive redefinition of SMTP.

The idea was that in the Brave New World you weren't allowed to simply forward mail intact as we've been doing for decades. That's now called "forgery". So when forwarding a mail to a broken server which checks SPF, you have to mangle the source address to appear to be from one of your own domains instead. So you might rewrite it something like:

<originaldomain>+<originaluser>@example.org

But that obviously isn't safe, because you want people to be able to receive bounces through this path; you have to relay those bounces on to the original user. If you were to just accept any old domain+user@example.org and forward that to user@domain then you'd effectively be an open relay.

So the scheme has to provide some kind of authentication on these made-up addresses, which we do with a crypto hash of the address with some local secret. Also we want to time-limit these signed addresses to prevent them being harvested and used for ever more. And add a prefix so we disambiguate from normal user addresses easily. We end up with something more like:

SRS0+<hash>+<timestamp>+<domain>+<user>@example.org

I implemented this and set it up so that it triggers only when forwarding mail from a domain with an SPF record, to a known SPF-afflicted recipient. Although it was mostly pointless because before adding such recipient domains to the list, I'd contact their postmaster and explain to them how broken SPF was and how it was causing them to reject valid mail, and they'd usually just stop rejecting for SPF failures. My 'spf-afflicted-domains' list has remained fairly much empty.

But then I realised that the rewriting scheme itself could achieve most of what SPF set out to do, without the brokenness. I just set it to rewrite my own outgoing mail. So instead of sending MAIL FROM:<dwmw2@infradead.org>, it's rewritten to something like <SRS0+d6ac8e5b21d8f29d40b9+1675+infradead.org+dwmw2@pentafluge.srs.infradead.org> — which is only in the SMTP transactions and the Return-Path: header of the mail; it's not visible in the From: header so most people never even notice.

I accept bounces (MAIL FROM:<>) only to the signed addresses, and — this is the important part — don't accept bounces to the normal address.
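
The signing and bounce-checking steps described above can be sketched in Python as follows. This is not the Exim configuration referred to in this comment; the secret, the signing domain, the 14-day window, the use of a day number as the timestamp, and the truncated HMAC-SHA256 digest are all assumptions chosen to produce addresses in the same general SRS0 shape.

```python
import hashlib
import hmac

SECRET = b"local-secret-change-me"   # hypothetical local secret
SIGNING_DOMAIN = "srs.example.org"   # hypothetical rewriting domain
MAX_AGE_DAYS = 14                    # hypothetical validity window


def _digest(day: int, domain: str, user: str) -> str:
    """Keyed hash over timestamp, domain and user, truncated to 20 hex digits."""
    message = f"{day}+{domain}+{user}".lower().encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()[:20]


def sign_sender(user: str, domain: str, today: int) -> str:
    """Build the rewritten reverse-path for outgoing mail."""
    digest = _digest(today, domain, user)
    return f"SRS0+{digest}+{today}+{domain}+{user}@{SIGNING_DOMAIN}"


def accept_bounce(rcpt_to: str, today: int) -> bool:
    """Decide whether a bounce (MAIL FROM:<>) to rcpt_to should be accepted."""
    local_part = rcpt_to.partition("@")[0]
    parts = local_part.split("+", 4)
    if len(parts) != 5 or parts[0] != "SRS0":
        # The plain address never appears as a reverse-path, so a bounce
        # addressed to it cannot be for mail that was really sent.
        return False
    _, digest, day_text, domain, user = parts
    try:
        day = int(day_text)
    except ValueError:
        return False
    if not 0 <= today - day <= MAX_AGE_DAYS:
        return False  # expired, or dated in the future
    expected = _digest(day, domain, user)
    return hmac.compare_digest(digest.encode(), expected.encode())
```

In this sketch `today` would be something like `int(time.time() // 86400)`; it is passed in explicitly so that the expiry logic is easy to exercise.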

This has the knock-on effect that other people don't need to accept faked mail from the normal address either. Here's an attempt to feed fake mail from dwmw2@infradead.org to sourceforge, for example:

220 mail.sourceforge.net ESMTP Exim 4.44 Fri, 11 Apr 2008 02:40:10 -0700 sc8-sf-mx1.sourceforge.net
HELO pmac.infradead.org
250 mail.sourceforge.net Hello pmac.infradead.org [90.155.92.200]
MAIL FROM:<dwmw2@infradead.org>
250 OK
RCPT TO:<bluez-devel@lists.sourceforge.net>
550-Verification failed for <dwmw2@infradead.org>
550-Called:   213.146.154.40
550-Sent:     RCPT TO:<dwmw2@infradead.org>
550-Response: 550-This address never sends messages directly, and should not accept bounces.
550-550-Please see http://www.infradead.org/rpr.html or contact
550-550 postmaster@infradead.org for further information.
550 Sender verify failed
QUIT
221 mail.sourceforge.net closing connection

It's done in Exim, which is versatile enough to do it all for itself without needing external software. My original write-up is at http://www.infradead.org/rpr.html. It's slightly out of date now — one of the things I've changed since then is to remove the full timestamp from the domain part, since it interacted badly with greylisting (as it's regenerated each time). The setup I'm actually using currently is here.

There are a couple of caveats that I've noticed, but nothing particularly problematic (for me, at least). The first is that it can interact badly with broken vacation messages or other autoresponses. If an autoresponse message is sent as a bounce, but to the From: address instead of to the SMTP reverse-path, then it obviously gets rejected. Some consider that a feature, though — and in fact most of the broken autoresponders out there are doubly-broken: not only are they sent to the From: address, but they're also not sent as a bounce. Which means they get through just fine, which is unfortunate when they are responding to faked mail, but thankfully there aren't so many of them that it negates the benefit overall.

The other problem I've seen was a mailing list which allows only subscribers to post, and does so by checking the SMTP reverse-path. Almost all mailing lists check the From: address, which works fine. I've encountered precisely one list which does it on the reverse-path, and ironically that was the one which was set up to discuss the whole 'Signed Envelope Sender' scheme. It's easy enough to work around; you just keep a list of such recipients and have your system use a special, fixed 'time' value when sending mail to those recipients. I never got round to implementing that; I just used a different address for that list, and have never seen the problem again.

Backscatter increase clogs inboxes

Posted Apr 11, 2008 15:41 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

I use a similar technique in my personal spam filter. All mail I send has my name in the "from:" RFC822 header. When I receive a bounce message, I look at the from: header of the bounced message and if it doesn't have my name, I know it's a bounce of something I didn't send. (Actually, I think I just do a string search of the entire bounce message for my name).
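
A rough sketch of that kind of check is shown below. It is only an illustration, not the filter described in this comment: the function name is made up, and it looks at From: lines anywhere in the bounce text (including quoted or indented copies of the original headers) without handling folded headers.

```python
def bounce_mentions_me(raw_bounce: str, my_name: str) -> bool:
    """True if any From: line in the bounce text contains my_name."""
    needle = my_name.lower()
    for line in raw_bounce.splitlines():
        # Original headers are often quoted or indented inside a bounce.
        candidate = line.lstrip(" \t>").lower()
        if candidate.startswith("from:") and needle in candidate:
            return True
    return False
```

A bounce whose embedded original message lacks the expected name would then be treated as backscatter.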

But I note that in the past two days, only 1 of the 900 spams addressed to my userid was a backscatter bounce. I know it used to be more.

Backscatter increase clogs inboxes

Posted Apr 12, 2008 5:17 UTC (Sat) by Xman (guest, #10620) [Link]

I myself have had a most unfortunate encounter with backscatter, and I tried adding SPF with no measurable effect. I'm going to try adding DKIM soon to see if it helps, but I'm not anticipating much in the way of good news.

Backscatter increase clogs inboxes

Posted Apr 13, 2008 20:17 UTC (Sun) by gvegidy (subscriber, #5063) [Link]

A colleague wrote this extension to the SpamAssassin-VBounce-plugin. If you sign all your mail with DKIM it makes sure that the bounce contains valid DKIM-headers of the original mail. The only problem we experienced is that a few servers don't include any headers of the original mail in the bounce :-(

The patch has not been accepted yet because the CLA process has not been completed and carried over into the SA bug tracker.

backscatterer blacklist at ips.backscatterer.org

Posted Apr 18, 2008 23:09 UTC (Fri) by endecotp (guest, #36428) [Link]

I use the RBL-style DNS list of backscatter-generating mail servers at ips.backscatterer.org.
When I get a message that looks like a bounce (i.e. its envelope has an empty sender), if the
sending server is listed at ips.backscatterer.org then I reject the message.

The disadvantage of this is that genuine bounces from listed domains will be lost.  But not
other email from them.

The backscatterer.org list doesn't seem to be very comprehensive; I'm not sure where the data
comes from.  I still get significant bursts of backscatter and I haven't measured how many
messages are rejected by this rule during those bursts.

Also, some backscatter doesn't have the empty sender that a normal bounce does - for example,
vacation messages.  So these still get through.

Here's the exim.conf fragment that I use; it lives in the acl_check_data section:

  deny senders = :
       dnslists = ips.backscatterer.org
       message = This message looks like a bounce, and your server is listed at \
                 ips.backscatterer.org, so I assume that this is "backscatter". \
                 Please configure your mail server to not send "backscatter spam". \
                 For advice, try http://www.dontbouncespam.org/
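
For readers unfamiliar with how such DNS lists are queried, here is a small Python sketch of the same decision. It follows the usual DNSBL convention of reversing the IPv4 octets and appending the zone name; the function names are invented for the example, and treating any successful lookup as "listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ipv4: str, zone: str = "ips.backscatterer.org") -> str:
    """Build the DNS name to look up: reversed octets followed by the zone."""
    address = ipaddress.IPv4Address(ipv4)  # raises ValueError if malformed
    octets = str(address).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ipv4: str, zone: str = "ips.backscatterer.org") -> bool:
    """Treat any successful A-record lookup as 'listed' (a simplification)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ipv4, zone))
    except socket.gaierror:
        return False
    return True


def should_reject(envelope_sender: str, client_ip: str, lookup=is_listed) -> bool:
    """Reject only messages that look like bounces and come from listed hosts."""
    return envelope_sender == "" and lookup(client_ip)
```

The `lookup` parameter exists so the policy can be exercised without touching the network, for instance by passing `lambda ip: True`.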

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds