| This glossary explains the terms used in this documentation.
It does not redefine common terms for which numerous definitions
are readily available. The reader is assumed to be familiar with terms
such as FTP, ASCII, and directories.
Document Root
The document root is the server
path that is accessible to web visitors and robots. While it can
be the server root, it is commonly set to a subdirectory such as
public_html. There is nothing special about this name, and the directory
used as the document root varies from web server to web server. Please consult
your systems administrator, virtual domain host, or server documentation
if you are not sure what your document root is.
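As a rough illustration of how a server might map a requested URL path onto a file under the document root, consider the sketch below. The function name to_filesystem_path and the directory /home/user/public_html are hypothetical, and real web servers apply additional rules (aliases, rewrites, access checks) that are not modelled here.

```python
import posixpath

def to_filesystem_path(document_root, url_path):
    """Map a URL path to a file path below the document root (simplified)."""
    # Normalise the path so that "." and ".." segments cannot climb
    # above the document root, then make it relative to the root.
    normalised = posixpath.normpath("/" + url_path.lstrip("/"))
    relative = normalised.lstrip("/")
    return posixpath.join(document_root, relative)

# Hypothetical document root, used only for illustration.
print(to_filesystem_path("/home/user/public_html", "/robots.txt"))
print(to_filesystem_path("/home/user/public_html", "/docs/../index.html"))
```

Only files that resolve to a location below the document root are reachable this way, which is why the exclusion file has to be placed there.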
Disallow List
RoboGen specific – the list
on the left of the main window that displays the agent rules for the
currently selected robot.
Robot Exclusion Protocol
The standard used for robot exclusion
files. It defines the syntax and location for ROBOTS.TXT and how
web robots are to parse that file. Each robot has a user-agent,
which is its handle, and must follow all directives under the section for
its user-agent. If there are no directives specific to its user-agent,
then the robot is to follow all directives under the universal user-agent
(which is denoted by an asterisk). Also, the robot exclusion file
must be called ROBOTS.TXT and reside in the server’s document root.
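The user-agent fallback described above can be tried out with Python's standard urllib.robotparser module. This is only a sketch: the robot names ExampleBot and OtherBot and the rules in the sample file are made up for illustration, and an individual robot may interpret the file somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion file: one section for a robot named "ExampleBot"
# and one universal section (user-agent "*").
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

def allowed(agent, path):
    """Return True if the named robot may retrieve the given path."""
    return parser.can_fetch(agent, path)

# ExampleBot has its own section, so only that section applies to it.
print(allowed("ExampleBot", "/private/page.html"))  # False
print(allowed("ExampleBot", "/cgi-bin/script"))     # True
# OtherBot has no section of its own, so it falls back to "*".
print(allowed("OtherBot", "/cgi-bin/script"))       # False
print(allowed("OtherBot", "/private/page.html"))    # True
```

Note that a robot with its own section ignores the universal section entirely; the two sets of directives are not merged.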
ROBOTS.TXT
The file name used by the robot
exclusion protocol. Web robots download this file from the server’s
document root and parse it for instructions on what to index and what not to
index. Robots request the file as /robots.txt in lowercase, so the case of the
file name matters only on servers that treat paths as case-sensitive. In every
case, the file must exist in the document root.
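Because the file always lives in the document root, a robot can work out where to look for it from any page address on the same site. The sketch below shows one way to do this with Python's urllib.parse; the function name robots_url and the example.com addresses are illustrative only.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the address of the exclusion file for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep the scheme and host, replace the path, and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://www.example.com/docs/index.html?page=2"))
# http://www.example.com/robots.txt
```

A file placed in a subdirectory, such as /docs/robots.txt, would never be requested under this scheme.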
Web Robot
Also known as a Web Wanderer or
Web Spider, it is a program that automatically traverses the Web by
retrieving a document and then recursively retrieving all documents that
it references. Robots can perform any number of functions, but the most
common uses are: indexing, validating HTML, validating links, “What’s New”
monitoring, and site mirroring.
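The recursive retrieval described above can be sketched as follows. To keep the example self-contained, it walks a small made-up set of documents held in memory rather than fetching anything over the network; the names LINKS and crawl are hypothetical, and a real robot would also consult the exclusion file before each retrieval.

```python
# Hypothetical in-memory "web": each document maps to the documents it references.
LINKS = {
    "/index.html": ["/about.html", "/news.html"],
    "/about.html": ["/index.html"],
    "/news.html": ["/archive.html"],
    "/archive.html": [],
}

def crawl(start, fetch_links, visited=None):
    """Visit start, then recursively visit every document it references."""
    if visited is None:
        visited = []
    if start in visited:
        # Already retrieved; stopping here prevents endless loops on circular links.
        return visited
    visited.append(start)
    for target in fetch_links(start):
        crawl(target, fetch_links, visited)
    return visited

order = crawl("/index.html", lambda doc: LINKS.get(doc, []))
print(order)
# ['/index.html', '/about.html', '/news.html', '/archive.html']
```

Keeping track of visited documents matters because pages commonly link back to one another, as /about.html does here.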
|