Spider Hunter

04 Nov

Netflix and SpidErl

For the past month I have been working on the Netflix Prize, trying to create some collaborative filters in pure Erlang. While I am nowhere near a winning score at the moment, I have learned a lot about how I want to create my own websites in a pure Erlang environment. One of the more interesting things is that I will need my own spider, and in the standard Erlang way I have named it SpidErl. Erlang people like putting Erl into the names they give things.

SpidErl will mostly just fetch things in the first few versions. I’m going to be collecting RSS feeds and some Amazon affiliate data for the most part. I think I’m going to write a module that collects AdSense data as well, though obviously you need a user name and password to get that data. In the long run I hope to create a full-fledged search engine around SpidErl’s information store, but that will be months or years from now.

In the meantime, look for some notes about building my own spider and a site redesign that takes advantage of the information the spider collects :-)


© 2008 Spider Hunter