Net abuser: Go fake yourself

For the last several months, one of my Web sites has received hundreds of bogus form submissions advertising commercial dreck from China, a country that seems to be adapting to capitalist values more widely than reported, if my junk e-mail is any indication.

The Chinese entrepreneurs are abusing a Perl script and Web form of my own design, so they're not taking advantage of popular software with a known CGI interface, unlike spammers who are flooding Movable Type comment pages. By my guess, they put together a database of Web URLs that accept site submissions, then let their software scrape the forms and blast e-mail to all of them at once.

To test this theory, I added a hidden field to my Web form and wrapped it in an HTML comment, using the following code:

<td>
<input type="text" name="name" size="40"><!--
<input type="hidden" name="subd" value="$ip_address">
--></td>

The subd field, which contains the IP address of the computer requesting the page, does not appear when the form page is used normally, and a browser never sends it back with the form. Web browsers ignore anything inside comment tags.
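For anyone who wants to try the same trap, here is a rough sketch of generating that table cell in Python. It is not the Perl script behind this site. The function name render_name_cell, the use of string.Template and the sample address are all assumptions made for illustration.

```python
import html
from string import Template

# Hypothetical template mirroring the form cell above.
# The trap field sits inside an HTML comment, so browsers skip it.
CELL_TEMPLATE = Template(
    '<td>\n'
    '<input type="text" name="name" size="40"><!--\n'
    '<input type="hidden" name="subd" value="$ip_address">\n'
    '--></td>'
)


def render_name_cell(ip_address: str) -> str:
    """Return the table cell with the requester's IP in the commented-out field."""
    # Escape the value so an odd address string cannot break out of the attribute.
    return CELL_TEMPLATE.substitute(ip_address=html.escape(ip_address, quote=True))


if __name__ == "__main__":
    # 203.0.113.7 is a documentation-range address, used here as a stand-in.
    print(render_name_cell("203.0.113.7"))
```

In a real CGI script the address would come from the request environment (REMOTE_ADDR) rather than a hard-coded string.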

For this reason, any form submission I get with a subd field is likely to be the product of form-scraping software. Even better, it provides the IP address used by the spammer to scrape my site, which is more likely to be a legitimate account than one used to broadcast e-mail.
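The check on the receiving end can be very small. The sketch below, again in Python rather than my Perl, assumes the submission arrives as an ordinary URL-encoded form body; the function name scraper_ip and the sample submissions are made up for the example.

```python
from urllib.parse import parse_qs

TRAP_FIELD = "subd"  # name of the commented-out field in the form


def scraper_ip(form_body: str):
    """Return the IP recorded in the trap field, or None for a normal submission."""
    fields = parse_qs(form_body, keep_blank_values=True)
    values = fields.get(TRAP_FIELD)
    # A browser never sends the commented-out field, so its presence
    # suggests the form was filled in by software reading the raw HTML.
    return values[0] if values else None


if __name__ == "__main__":
    print(scraper_ip("name=Alice&url=http%3A%2F%2Fexample.com"))
    print(scraper_ip("name=Cheap+Stuff&subd=203.0.113.7"))
```

A submission that trips the check could be logged and discarded instead of being passed along as e-mail.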

This technique has nabbed its first spammer. In the last week, I've received 89 e-mails promoting dubious sites from a Russian Internet business, all of which contain the same IP address in the subd field.
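Spotting a repeat offender like this one is a matter of counting. A minimal sketch, assuming each submission has already been parsed into a dictionary of field names and values (the function name tally_trap_ips and the sample data in the comments are hypothetical):

```python
from collections import Counter


def tally_trap_ips(submissions):
    """Count submissions per trapped IP; submissions without the field are skipped."""
    return Counter(s["subd"] for s in submissions if s.get("subd"))


if __name__ == "__main__":
    # Made-up sample data using a documentation-range address.
    sample = [
        {"name": "a", "subd": "198.51.100.9"},
        {"name": "b"},
        {"name": "c", "subd": "198.51.100.9"},
    ]
    for ip, count in tally_trap_ips(sample).most_common():
        print(ip, count)
```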

Though I expect my technique will cease to work as soon as it becomes too popular, the karmic pleasure of detecting form abuse is its own reward.

Comments

Thanks for the link ... hope neither my forms nor any comment you may have left on my blog caused your current spam woes.

As for dealing with insidious form scraping spam machines ... one little technique I don't talk too much about is "poisoning" the page. Usually at the very bottom left or right, in the same color as the background, in the smallest font possible, is a loopback address. Either:

abuse@[127.0.0.1]

and/or programmatically

postmaster@[$ipaddress]

Though I think after reading your article, I might stick to [127.0.0.1] and shove the IP address into the name, so it would look something like str_replace(".", "_", $ipaddress) . "@[127.0.0.1]" ...
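Spelled out, that idea might look like the Python sketch below. The function name poison_address is made up, and this is only one way to do it. It uses a plain string replacement on purpose, since a bare "." in a regular expression would match every character, not just the dots.

```python
def poison_address(ip_address: str) -> str:
    """Build a bait address whose name part carries the visitor's IP."""
    # Literal replacement of dots, so no regex escaping is needed.
    local_part = ip_address.replace(".", "_")
    return f"{local_part}@[127.0.0.1]"


if __name__ == "__main__":
    # Documentation-range address used as a stand-in for a real visitor.
    print(poison_address("203.0.113.7"))
```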

.... hmmmmmm so many booby traps, so little time.

I think my form woes are simply a result of running a site that takes submissions. I linked to your site because it's a good discussion of the issue and some of the ways to deal with it.

The site's offline at the moment, but if you like poisoning spam robots, there's a cool script on Arpa.org (username guest, password arpa).
