Spider Hunter

16 Jul

Realtime Spider List

I was writing a blog post for another one of my sites (http://www.spamfreeemail.com) about real-time blackhole lists (RBLs) when it occurred to me that I have never seen this simple technique applied to search engine spider detection. Using the same approach as an RBL, a real-time spider list could be implemented. A simple DNS lookup of the IP address under a list domain would return a response that tells you not only whether the address belongs to a search engine spider, but also which search engine the spider is from. Say, for instance, you wanted to check 64.68.86.7, which is crawler1.googlebot.com. You would reverse the octets and look up 7.86.68.64.spider.realtimespiderlist.com. You would then get either no record, meaning the address is not a listed spider, or an IP address whose value corresponds to a particular search engine. I will be looking into this over the next few days to weeks to see how difficult it would be to implement.
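To make the lookup concrete, here is a rough sketch of what a client might look like. Nothing here exists yet: the zone spider.realtimespiderlist.com is only the example name from above, and the return codes in ENGINE_CODES (127.0.0.2 for Google and so on) are values I made up for illustration, borrowing the 127.0.0.x convention that RBLs commonly use. The function names are hypothetical too.

```python
import ipaddress
import socket

# Hypothetical zone, taken from the example in this post; no such service exists yet.
ZONE = "spider.realtimespiderlist.com"

# Assumed return codes, invented for illustration only.
ENGINE_CODES = {
    "127.0.0.2": "Google",
    "127.0.0.3": "Yahoo",
}


def query_name(ip, zone=ZONE):
    """Build the RBL-style name: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # raises ValueError on bad input
    return ".".join(reversed(octets)) + "." + zone


def identify_spider(ip, resolve=socket.gethostbyname):
    """Return the search engine name, or None if the address is not listed.

    `resolve` can be swapped out so the logic can be tried without a live zone.
    """
    try:
        answer = resolve(query_name(ip))
    except socket.gaierror:
        # No record for this name: treat the address as not a listed spider.
        return None
    # A record exists; map the assumed code to an engine name.
    return ENGINE_CODES.get(answer, "unknown spider")
```

With this, query_name("64.68.86.7") builds 7.86.68.64.spider.realtimespiderlist.com, and identify_spider would hand that name to the resolver. Passing the resolver in as a parameter is just a convenience for experimenting before any real zone is set up.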


© 2008 Spider Hunter