How the spammers find you

The Center for Democracy and Technology has released the results from a six-month survey on how spammers obtain email addresses. The researchers created a few hundred special-purpose email addresses, then carefully exposed each one in exactly one place. After that, it was mostly a matter of sitting back and waiting for the spam to roll in. The destination of each spam indicated where the address had been found.

The report is well worth a read. For those of you in a hurry, here are the highlights of the group's conclusions:

  • By far the most spam was sent to addresses harvested from web pages. Postings to Usenet newsgroups came in a distant second. On Usenet, posters to groups like alt.sex.erotica will receive vastly more spam than those posting to misc.industry.insurance.

  • Even the simplest sort of address obfuscation ("lwn at lwn.net") appears to be highly effective.

  • Dictionary attacks (simply trying login names from a list) result in a significant amount of delivered spam. Short account names are more likely to receive this sort of spam than longer ones.

  • Contrary to expectations, the WHOIS domain name database is not a big source of spam.

  • Most web sites honor their promises regarding unsolicited email - but you do have to be careful about making your wishes clear.

Regardless of source, spam is an increasing problem; the volume of spam sent to lwn@lwn.net (hmm...make that lwn at lwn.net) is now running about 500 messages per day. If it weren't for SpamAssassin, we would have a hard time dealing with our email at all.
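The "lwn at lwn.net" style of obfuscation is easy to automate when generating web pages. The following is a minimal sketch, not anything the report prescribes; the function name `obfuscate` is made up for illustration.

```python
def obfuscate(address: str) -> str:
    """Rewrite 'user@host' as 'user at host' so that harvesters
    scanning pages for the '@' pattern do not see an address."""
    # Split on the last '@' so the domain is always the final part.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local} at {domain}"


if __name__ == "__main__":
    print(obfuscate("lwn@lwn.net"))
```

A human reader can still reconstruct the address, which is the whole point: the survey found that even this much was enough to defeat the harvesters it observed.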



SpamAssassin setup, step by step.

Posted Apr 17, 2003 3:09 UTC (Thu) by wstearns (subscriber, #4102) [Link]

As a side note, for those interested in installing SpamAssassin on Sendmail, I have an article draft describing the process at http://www.stearns.org/doc/spamassassin-setup.current.html .

How the spammers find you

Posted Apr 17, 2003 12:52 UTC (Thu) by hansl (subscriber, #5086) [Link]

"By far the most spam was sent to addresses harvested from web pages"

This little trick that we apply here might interest you. Put this in the source of
one (or more) of your web pages:

<!-- Don't send e-mail to this address!!! It is used to catch spammers -->
<a href="mailto:spamtrap@lwn.net"></a>

Next, create a script that can

1. extract, from your mailserver log file, the IP address of the
mailserver that delivered a message to spamtrap@lwn.net, and
2. add an entry for that address to your mailserver blacklist.

Then create an account called spamtrap and, e.g., give it an appropriate
.procmailrc, or alias spamtrap to "|/somewhere/script.sh". Anything that
causes the script to run when a mail is sent to spamtrap@lwn.net
will do. That's it, you'll receive a lot less spam!

How the spammers find you

Posted Apr 17, 2003 17:08 UTC (Thu) by cpeterso (guest, #305) [Link]

What happens when the IP address of the spam origin mailserver belongs to AOL? Should you block all AOL users? OK, AOL might be a bad example, but blindly blocking any mailserver might be too aggressive.

How the spammers find you

Posted Apr 18, 2003 13:06 UTC (Fri) by hansl (subscriber, #5086) [Link]

I should have mentioned that besides the blacklist we also maintain a
whitelist, which by now lists most of the mailservers from which we have
received one or more legitimate emails.
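One way to combine the two lists is to let the whitelist take precedence over the blacklist. The sketch below assumes both lists are kept as IP networks; the function name `should_reject` and the example networks are hypothetical and not taken from the setup described here.

```python
import ipaddress


def should_reject(ip, blacklist, whitelist):
    """Reject only if the sender is blacklisted and not whitelisted."""
    addr = ipaddress.ip_address(ip)
    # A whitelist hit always wins, so a known-good mailserver is never
    # blocked just because it once relayed a message to the spam trap.
    if any(addr in net for net in whitelist):
        return False
    return any(addr in net for net in blacklist)


if __name__ == "__main__":
    # Documentation-only example ranges (RFC 5737), not real mailservers.
    whitelist = [ipaddress.ip_network("192.0.2.0/28")]
    blacklist = [ipaddress.ip_network("192.0.2.0/24")]
    for ip in ("192.0.2.5", "192.0.2.200", "198.51.100.1"):
        print(ip, should_reject(ip, blacklist, whitelist))
```

Checking the whitelist first addresses the AOL concern raised above: a large provider's mailserver that has delivered legitimate mail stays reachable even if spam passes through it.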

I agree that this is a forceful method, but since the introduction of
DNS blacklists a couple of years ago spammers have not been sitting
still, and are moving towards the use of dial-up networks on which they
try to plant trojans and viruses that will act as spam injectors.

Blocking dial-up networks is not too aggressive in my opinion, since you
can expect ISPs to provide an MTA to their customers (one that usually sits
outside their dial-up range). You can trust most ISPs' mailservers, like
AOL's, a lot more than Joe User's PC running LookOut. And I've found
that when an ISP's mailserver does get blacklisted, they're
usually very fast at correcting the situation. They have an incentive to do
so, because a blocked mailserver makes a lot of customers unhappy...

dictionary attack

Posted Apr 17, 2003 17:18 UTC (Thu) by cpeterso (guest, #305) [Link]


I always preferred the name "Rumpelstiltskin attack" over "dictionary attack". ;-)


How the spammers find you

Posted Apr 22, 2003 11:30 UTC (Tue) by ekj (subscriber, #1524) [Link]

I would imagine that time is a very important factor in how much and what kind of spam you get. The results of such a survey running for a month would likely be very different from the same survey running for 3 years.

This is of course because many spammers get their email addresses not from any of the sources mentioned in this article, but rather from other spammers (indeed, much spam is advertising "100 million fresh email addresses"). Thus, once an address starts receiving spam it will only ever get worse, never better.

How the spammers find you

Posted Apr 28, 2003 15:42 UTC (Mon) by Klavs (subscriber, #10563) [Link]

Instead of just blacklisting mailservers that send spam, they should be reported to SpamCop or some other spam-stopping initiative, so that their owners are warned that they are blacklisted and perhaps do something about it.

Another way that I believe would work to stop spam (if enough people did it) would be to have the mailserver scan the message while the DATA is being received and, if it seems to be spam, not acknowledge receipt, disconnect the sending client (so it will try again), and add an iptables rule redirecting connections from that IP to port 25 to your local tarpit (a program that keeps mailserver connections open forever, thus wasting their capacity). This would be great if enough users did it, because the sending mailservers would much more easily max out their memory capacity; the owners would notice their servers not sending their own mail and would (I should think) be much more interested in dealing with the problem.

What do you think?
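The redirect step of the tarpit idea could be sketched as follows. This only builds the command and does not run it; the port number 2525 and the names `TARPIT_PORT` and `tarpit_rule` are assumptions for illustration, and a separate tarpit program would have to be listening on that port.

```python
import ipaddress

TARPIT_PORT = 2525  # hypothetical local port where a tarpit program listens


def tarpit_rule(sender_ip, tarpit_port=TARPIT_PORT):
    """Build an iptables command that redirects SMTP connections from
    sender_ip to a local tarpit port. Returns the argument list; the
    caller decides whether and how to execute it (root is required)."""
    # Validate the address so log-derived garbage never reaches iptables.
    ip = ipaddress.IPv4Address(sender_ip)  # raises ValueError if invalid
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-s", str(ip), "-p", "tcp", "--dport", "25",
        "-j", "REDIRECT", "--to-ports", str(tarpit_port),
    ]


if __name__ == "__main__":
    print(" ".join(tarpit_rule("192.0.2.10")))
```

Returning an argument list rather than a shell string keeps the address from being interpreted by a shell if the list is later passed to something like `subprocess.run`.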

Copyright © 2003, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds