Re: David Bartosik: Why Robots.txt by Matt BenyaSaturday, May 07, 2005I came across a blog article, David Bartosik: Why Robots.txt by Matt Benya, which happens to mention RoboGen, a program for editing robots.txt files that I wrote nearly six years ago! I do enjoy finding references to my previous work. Mr. Benya’s explanation on of the robots.txt file reminds me of a situation I came across a few weeks ago. I had logged into one of the web servers and noticed the system was not responding as snappily as usual. I turns out the load average was at 15%, caused by a large number of instances of a customer CGI script. Fortunately, these scripts were being run by a particular user so I was able to find and inspect the tail end of the log file and determined that ZyBorg, from wisenutbot.com, was rapidly accessing the dynamically generated site by a CGI-interface Perl script. In order to get the server load under control, I created a robots.txt for the site and blocked the ZyBorg user-agent from indexing the Perl scripts for the site. Fortunately the robot did comply with the exclusion standard and the rapid-fire crawling stopped. While this story has nothing to do with RoboGen, I used Vim in the SSH session, it does show one concrete example of the continued applicability of the robot exclusion standard.
Posted by Frank Rietta at
7:38 PM
|
"Whenever you find a man who says he doesn't believe in a
real Right and Wrong, you will find the same man going back on this a moment later."
Recent Posts
ArchivesApril 2005 / May 2005 / June 2005 / July 2005 / August 2005 / November 2005 / April 2006 / June 2006 / August 2006 / September 2006 / November 2006 / December 2006 / January 2007 / January 2008 /
About MeI am a software developer who has been marketing on the internet since 1999. I hold an MS in Information Security from the Georgia Institute of Technology, from where I previously earned a BS in Computer Science in 2005. I ran an Atlanta-based web hosting business from 1999 until I sold it in 2005. |
Home | Product List | Privacy | Contact