Spider Trap
Posted: December 17th, 2004, 7:34 pm
SOURCE: http://www.webmasterworld.com/forum88/3104.htm
Hello,
I recently set up my own spider trap after reading about it here. I finally got sick of site-suckers driving up my bandwidth to the point I had to upgrade my hosting package twice.
So anyway, I don't use Perl much, so I decided to make a PHP trap instead. It's working nicely, and I just wanted to post it here in case anyone wants to use it.
Notes:
1. Add the robots.txt snippet days before luring bots to the trap. This gives the good bots time to read the disallow and obey.
2. chmod .htaccess to 666 so the script can write to it, and getout.php to 755 (please correct me here if I'm wrong).
3. Edit getout.php with the real path to your .htaccess file, and change the email addresses to your own so you will receive the "spider alert".
Robots.txt
User-agent: *
Disallow: /getout.php
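To actually lure bots in, you need a link to /getout.php somewhere in your pages. Good bots never follow it because of the disallow above, and you can hide it from human visitors. A minimal sketch of one common trick, a link wrapped around a 1x1 transparent image (the image path here is just a placeholder):

<a href="/getout.php"><img src="/images/clear.gif" width="1" height="1" border="0" alt=""></a>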
.htaccess (keep this code at the top of the file)
SetEnvIf Request_URI "^(/403.*\.htm|/robots\.txt)$" allowsome
<Files *>
Order Deny,Allow
Deny from env=getout
Allow from env=allowsome
</Files>
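For reference, each time a bot hits the trap, getout.php prepends a ban line above that block. With 192.0.2.1 as a made-up example IP, the top of .htaccess would then start with:

SetEnvIf Remote_Addr ^192\.0\.2\.1$ getout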
getout.php
<?php
// Prepend a ban line for this visitor's IP to .htaccess, then send an alert.
// Edit $filename and the email addresses below before using this.
$filename = "/var/www/html/.htaccess";
$ip = $_SERVER["REMOTE_ADDR"];

// Escape the dots so the IP is matched literally by SetEnvIf's regex
$content = "SetEnvIf Remote_Addr ^" . str_replace(".", "\\.", $ip) . "$ getout\n";

// Read the current .htaccess, then write it back with the ban line on top
$handle = fopen($filename, 'r');
$content .= fread($handle, filesize($filename));
fclose($handle);

$handle = fopen($filename, 'w');
fwrite($handle, $content, strlen($content));
fclose($handle);

// HTTP_REFERER isn't always sent, so guard against a missing value
$referer = isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : "(no referer)";

mail("[email protected]",
     "Spider Alert!",
     "The following IP just got banned because it accessed the spider trap.\r\n\r\n"
     . $ip . "\r\n" . $_SERVER["HTTP_USER_AGENT"] . "\r\n" . $referer,
     "From: [email protected]");

print "Goodbye!";
?>
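One thing to watch out for: if two bots hit the trap at the same moment, the read-then-write above could clobber .htaccess. Here's a rough sketch of the same prepend using flock() to serialize the writers (untested, same assumed path as above, alert email left out for brevity):

<?php
// Same prepend as getout.php, but held under an exclusive lock so two
// simultaneous hits can't corrupt .htaccess mid-rewrite.
$filename = "/var/www/html/.htaccess";   // same assumed path as above
$ban = "SetEnvIf Remote_Addr ^"
     . str_replace(".", "\\.", $_SERVER["REMOTE_ADDR"])
     . "$ getout\n";

$handle = fopen($filename, 'r+');        // read/write, no truncation
if ($handle) {
    if (flock($handle, LOCK_EX)) {       // block until we own the file
        $current = fread($handle, filesize($filename));
        rewind($handle);                 // back to the start of the file
        fwrite($handle, $ban . $current); // new content is longer, so nothing is left over
        flock($handle, LOCK_UN);
    }
    fclose($handle);
}
?>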