Security [LWN.net]

Weblog Comments - A New Frontier for Spam

October 29, 2003

This article was contributed by Jake Edge.

The war over spam has erupted recently in a new arena: weblog comments. The parallels to the battles that have been fought on the email spam front are considerable, but unlike email spam, weblog spam is aimed at Google (and other search engines that use the number of links to derive page rankings), with the goal of increasing the visibility of the sites being advertised. Comment spam seems to be on the rise; weblog owners have noticed a large increase in the number of incidents over the last month or two.

Weblogs are sites that allow the owner to post articles and essays on whatever happens to strike their fancy that day, and most weblog software enables readers to post comments on the stories. LWN's comment system provides the same feature for this site but, unlike LWN comments, many weblogs allow (and even encourage) anonymous comments. That openness, like the lack of sender authentication for email, provides an avenue for abuse. Requiring registration before allowing comments does not eliminate the problem entirely (LWN has had a small amount of comment spam), but it does increase the amount of work the spammer must do.

The basic mode of attack uses a program to automatically post comments on multiple articles throughout the weblog. These unwanted messages include the URL of a website that will give you the opportunity to buy one or more of the usual items: diplomas, prescription drugs, porn, etc. The program then moves on to other sites using the same software, aided, no doubt, by the available directories of weblogs that use a particular software package. Eventually, Google and other search engines visit the weblog sites; thereafter, the spammer's site gains a high ranking due to all of the links to it that are found.

One of the more popular (though not entirely free) packages for running a weblog is Movable Type; its user community has been the most active so far in combating comment spam. For example, one set of tips (described by Yoz Grahame) attempts to thwart the way the current spam programs work by changing the default behavior of the software. Something as simple as changing the "post a comment" link can be sufficient to confuse most automated comment posting scripts. These techniques will help only until enough people implement them that it becomes worth a spammer's effort to write more adaptable code to circumvent them.

Many of the other comment spam handling techniques will seem very familiar to anyone who has been dealing with the deluge of email spam: Bayesian filtering and blacklisting based on the URLs in the comment and/or user profile are two of the more popular techniques. Bayesian filtering uses the frequency of words in a message and a database of word counts from previous messages that have been categorized as spam or non-spam (often called "ham") to determine a probability that the new message is spam. If the probability is too high, the message is rejected. The blacklisting patch collects the URLs that are advertised in the offending messages and rejects any comments that refer to any of those URLs. Both of these techniques can be worked around by a spammer with enough incentive, but they do make the spammer's job much more difficult.
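
As a rough illustration of the Bayesian approach described above, here is a minimal naive Bayes sketch in Python. It is not the code used by any Movable Type plug-in; the class name, the tokenizer, the add-one smoothing, and the 0.9 rejection threshold are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesFilter:
    """Hypothetical word-count spam filter (illustrative names only)."""

    def __init__(self):
        # The "database of word counts" for each category.
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the words of a message already categorized as spam or ham."""
        self.word_counts[label].update(tokenize(text))
        self.message_counts[label] += 1

    def spam_probability(self, text):
        """Estimate the probability that a new message is spam."""
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_scores = {}
        for label in ("spam", "ham"):
            # Smoothed prior, so an untrained category never gets probability 0.
            prior = (self.message_counts[label] + 1) / (total_messages + 2)
            score = math.log(prior)
            total_words = sum(self.word_counts[label].values())
            # One extra slot in the denominator stands in for unseen words.
            denominator = total_words + len(vocabulary) + 1
            for word in tokenize(text):
                score += math.log((self.word_counts[label][word] + 1) / denominator)
            log_scores[label] = score
        # Normalize the two log scores into a probability.
        highest = max(log_scores.values())
        weights = {k: math.exp(v - highest) for k, v in log_scores.items()}
        return weights["spam"] / (weights["spam"] + weights["ham"])


def accept_comment(spam_filter, text, threshold=0.9):
    """Reject the comment when the spam probability is too high."""
    return spam_filter.spam_probability(text) < threshold
```

A site would call `train()` on comments that a human has already sorted, then pass each new comment through `accept_comment()`. The threshold is a tuning choice: a lower value catches more spam at the cost of rejecting more legitimate comments.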

Another technique that is becoming more popular is the use of email and web-based challenge-response systems, which generate a blurry graphic that is (presumably) readable only by humans. Such systems require that the text in the graphic be typed into a form to ensure that a human, and not a program, is initiating the action. This technique, too, has made its way into the arsenal of webloggers via a plug-in for Movable Type. The scheme does have downsides: it requires a graphical browser to post messages and may be unusable by the visually impaired.
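
The server-side bookkeeping for such a challenge can be sketched as follows. This is only an assumption about how one might structure it, not a description of the Movable Type plug-in: the function names and the HMAC-based token are hypothetical, and rendering the text into a distorted image is left out entirely.

```python
import hashlib
import hmac
import secrets
import string

# Hypothetical per-site secret; a real deployment would keep it in configuration.
SECRET_KEY = secrets.token_bytes(32)

# Skip characters that are easy to confuse once an image is blurred.
ALPHABET = "".join(
    c for c in string.ascii_uppercase + string.digits if c not in "O0I1"
)


def _sign(text):
    """Return a keyed digest of the challenge text."""
    return hmac.new(SECRET_KEY, text.encode("utf-8"), hashlib.sha256).hexdigest()


def new_challenge(length=6):
    """Create the text to draw in the graphic plus a token for a hidden form field."""
    text = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return text, _sign(text)


def verify_response(token, typed_text):
    """Check that the text typed into the form matches the signed challenge."""
    expected = _sign(typed_text.strip().upper())
    # Constant-time comparison on bytes avoids leaking how much matched.
    return hmac.compare_digest(expected.encode("ascii"), token.encode("utf-8"))
```

The comment form would carry the token in a hidden field and show the text only as an image. One known weakness of this stateless sketch is that a solved challenge can be replayed; tracking used tokens or adding an expiry time would be needed to close that hole.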

Other weblogging software developers may have run into this problem and come up with their own sets of fixes, but the Movable Type community appears to be at the forefront of this particular battle. Perhaps the spammers have yet to target other systems in an automated way. If (or more likely when) they do, newly targeted weblogging software can use one or more of the techniques above to combat the spam.

Weblog comment and email spam fighters are running into the same issues and, in many cases, producing similar solutions; cooperation between the two groups will lead to better spam fighting. One of the future plans for Jay Allen's blacklist (above) is to create a distributed list of URLs that are being advertised via spam; with proper controls, one can imagine that list being useful to the email spam fighting crowd. Likewise, a filter using SpamAssassin's rules for email message bodies might be useful for folks confronting spam in their weblog comments.
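
To make the idea of a shared URL list concrete, here is a small Python sketch of how a comment (or an email body) might be checked against such a list. It is not Jay Allen's implementation; the function names, the host-level matching, and the `.example` domains are assumptions chosen for illustration.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for http(s) URLs embedded in free text.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def normalize_host(host):
    """Lowercase a host name and drop a leading 'www.' so variants compare equal."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def extract_hosts(text):
    """Return the set of normalized host names linked from the text."""
    hosts = set()
    for url in URL_PATTERN.findall(text):
        try:
            host = urlsplit(url).hostname
        except ValueError:
            continue  # malformed URL; ignore it
        if host:
            hosts.add(normalize_host(host))
    return hosts


def merge_blacklists(*lists):
    """Combine host lists from several sites into one shared set."""
    merged = set()
    for entries in lists:
        merged.update(normalize_host(entry) for entry in entries)
    return merged


def is_blacklisted(text, blacklist):
    """True when the text links to any host on the blacklist."""
    return not extract_hosts(text).isdisjoint(blacklist)
```

Matching on host names rather than full URLs means a spammer cannot dodge the list by varying the path, though it does nothing against freshly registered domains, which is where a widely distributed list would help.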

apache: buffer overflows in mod_alias, mod_rewrite

Package(s):

apache

CVE #(s):

CAN-2003-0542 CAN-2003-0789

Created:

October 28, 2003

Updated:

February 13, 2004

Description:

André Malo discovered buffer overflows in the mod_alias and mod_rewrite modules of the Apache webserver. These occurred if a regular expression with more than 9 capturing parentheses was configured. To exploit this, an attacker would need to be able to locally create a carefully crafted configuration file (.htaccess or httpd.conf). CAN-2003-0542
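
Administrators who want to see whether any of their own patterns cross the threshold mentioned above could count capturing groups with a short script. The following Python sketch is only an approximation: it uses Python's `re` module, whose syntax is not identical to the regular expression library Apache uses, and the function names are hypothetical.

```python
import re

# Threshold taken from the advisory text: more than 9 capturing groups.
MAX_SAFE_GROUPS = 9


def count_capturing_groups(pattern):
    """Return the number of capturing groups in the pattern."""
    return re.compile(pattern).groups


def exceeds_group_limit(pattern, limit=MAX_SAFE_GROUPS):
    """True when the pattern has more capturing groups than the limit."""
    return count_capturing_groups(pattern) > limit
```

Patterns pulled from RewriteRule or AliasMatch directives could be passed through `exceeds_group_limit()`; anything flagged deserves a manual look until the updated packages are installed.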

A separate flaw in Apache 2.0.47 and earlier, in which mod_cgid mishandles CGI redirect paths, could result in CGI output going to the wrong client when a threaded MPM is used. CAN-2003-0789

Alerts:

Whitebox	WBSA-2004:015-01	httpd	2004-02-12
Fedora	FEDORA-2003-004	httpd	2004-01-08
Red Hat	RHSA-2003:405-00	Apache	2003-12-18
Red Hat	RHSA-2003:320-01	httpd	2003-12-16
Red Hat	RHSA-2003:360-01	Apache	2003-12-10
Gentoo	200310-03	net-www/apache	2003-10-28
Trustix	2003-0041	apache	2003-11-15
Conectiva	CLA-2003:775	apache	2003-11-05
Slackware	SSA:2003-308-01	apache	2003-11-03
EnGarde	ESA-20031105-030	apache	2003-11-05
Mandrake	MDKSA-2003:103	apache	2003-11-03
Gentoo	200310-04	net-www/apache	2003-10-31
Immunix	IMNX-2003-7+-025-01	apache	2003-10-28
OpenPKG	OpenPKG-SA-2003.046	apache	2003-10-28

libnids: remotely exploitable buffer overflow

Package(s):

libnids

CVE #(s):

CAN-2003-0850

Created:

October 29, 2003

Updated:

January 6, 2004

Description:

libnids (a NIDS plugin which emulates the Linux 2.0 IP stack) contains a buffer overflow vulnerability which can be exploited remotely. Version 1.18 fixes the problem.

Alerts:

Debian	DSA-410-1	libnids	2004-01-05
Gentoo	200311-07	net-libs/libnids	2003-11-22
Conectiva	CLA-2003:773	libnids	2003-10-29

thttpd: multiple vulnerabilities

Package(s):

thttpd

CVE #(s):

CAN-2002-1562 CAN-2003-0899

Created:

October 29, 2003

Updated:

November 6, 2003

Description:

The thttpd web server has a pair of vulnerabilities which can lead to information disclosure and arbitrary code execution; both are remotely exploitable.

Alerts:

Conectiva	CLA-2003:777	thttpd	2003-11-06
SuSE	SuSE-SA:2003:044	thttpd	2003-10-31
Debian	DSA-396-1	thttpd	2003-10-29

Interview: Brian Hatch (LinuxQuestions)

LinuxQuestions.org interviews Brian Hatch, author of Hacking Linux Exposed. "So true, not everyone can read and understand the code that they end up running, and not anyone can read all of the code that they end up running. There's a level of trust, and that's no different than when you run proprietary software. The big difference is the number of individuals who do view that code."
