Forbidden Access
Ignoring servers' robots.txt access restrictions.
With this option on, LinkAlarm ignores a server's robots.txt access restrictions, just as a person using a browser would.
Some sites use a robots.txt file to disallow robots (like LinkAlarm) from accessing their content. Links that cannot be checked because of this restriction are listed in the Links Not Checked section of your report.
To check these links anyway (overriding others' robots.txt files), turn Forbidden Access ON.
Note that if a robots.txt file has specific declarations for LinkAlarm, these will be respected.
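The override rule described above can be sketched in a few lines of Python. This is only an illustration, not LinkAlarm's actual code: the user-agent token "LinkAlarm", the function names, and the exact-match, prefix-only parsing are all assumptions made for the example.

```python
from typing import List, Optional


def disallowed_prefixes(robots_txt: str, agent: str) -> Optional[List[str]]:
    """Return the Disallow prefixes from groups that name `agent` exactly.

    Returns None when no group names the agent at all. Simplified on
    purpose: Allow lines, wildcards, and partial agent matches are ignored.
    """
    agent = agent.lower()
    prefixes = None
    group_agents = []   # agents named by the group currently being read
    in_rules = False    # True once the current group has listed a rule
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:                      # a new group starts here
                group_agents, in_rules = [], False
            group_agents.append(value.lower())
            if value.lower() == agent and prefixes is None:
                prefixes = []
        else:
            in_rules = True
            if field == "disallow" and value and agent in group_agents:
                prefixes.append(value)
    return prefixes


def may_check(path: str, robots_txt: str, forbidden_access: bool,
              agent: str = "LinkAlarm") -> bool:
    """Decide whether `path` may be checked under the Forbidden Access setting."""
    specific = disallowed_prefixes(robots_txt, agent)
    if specific is not None:
        blocked = specific        # rules naming the robot are always respected
    elif forbidden_access:
        blocked = []              # generic rules are overridden
    else:
        blocked = disallowed_prefixes(robots_txt, "*") or []
    return not any(path.startswith(prefix) for prefix in blocked)
```

With a robots.txt that only contains "User-agent: *" and "Disallow: /", may_check("/page.html", ..., forbidden_access=False) is False and becomes True once Forbidden Access is on. If the file also has a "User-agent: LinkAlarm" group, that group's Disallow lines apply in both cases.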
robots.txt
The robots.txt exclusion standard is used by webmasters to tell spiders (robots that walk every page of a site) to keep away from all or part of their site. The standard also defines "good citizen" network behavior for automated software such as LinkAlarm.
LinkAlarm follows these robot guidelines when accessing any site (including yours), avoiding the directories that the site's robots.txt specifies. Some sites disallow all robots, so when you run LinkAlarm on your site there may be many 903 - Robots Forbidden alarms.
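As a rough illustration of how a well-behaved robot consults robots.txt, the sketch below uses Python's standard urllib.robotparser module. The robots.txt content, the example.com URLs, and the "LinkAlarm" user-agent token are hypothetical and chosen only for this example; it does not show how LinkAlarm itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps every robot out of two directories.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())   # parse rules held in memory
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.com/index.html",
                "http://example.com/cgi-bin/search"):
        print(url, is_allowed(ROBOTS_TXT, "LinkAlarm", url))
```

Under these rules the index page may be fetched and the /cgi-bin/ URL may not. A link refused in this way is the kind that would be reported as a 903 - Robots Forbidden alarm.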
There are many reasons for a webmaster to use the robots.txt protocol; the most important is that it is the only one everyone uses. An enhanced version is in the works, but for now the web relies on the robots.txt avoidance protocol to restrict inadvertent CGI access and unnecessary site traffic, among other things.
You can ask your checking robot to ignore access restrictions on remote servers. When Site Control : Forbidden Access is turned on, LinkAlarm will verify the existence of objects on other servers even where robots.txt restrictions apply.
With this option on, LinkAlarm will "ping" the pages at remote servers to verify their existence. Note that LinkAlarm usually does not have to retrieve the entire page to do this, so the load on the remote server is minimal. Additionally, LinkAlarm waits 30 seconds between accesses.
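One common way to verify that a page exists without retrieving it is an HTTP HEAD request. The sketch below shows the idea in Python, with a pause between accesses. It assumes HEAD is the mechanism (the text above does not say which method LinkAlarm uses), applies the 30-second wait between every pair of accesses, and uses hypothetical function names.

```python
import time
import urllib.request


def link_exists(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answered successfully."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except OSError:
        # Covers URLError, HTTPError (4xx/5xx) and socket timeouts.
        return False


def check_links(urls, delay: float = 30.0, exists=link_exists, sleep=time.sleep):
    """Check each URL in turn, pausing `delay` seconds between accesses."""
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay)          # be polite: wait before the next access
        results[url] = exists(url)
    return results
```

The exists and sleep parameters are there so the loop can be tried out without touching the network or actually waiting. Some servers do not answer HEAD requests properly, so a real checker would likely need a fallback, which this sketch leaves out.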
robots META tags
LinkAlarm complies with the Robots META tag, not collecting links to follow on pages containing:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">
Use this feature to have LinkAlarm avoid checking parts of your site.
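For illustration, a robot might detect this tag as sketched below with Python's standard html.parser module. The class and function names are hypothetical, and only the NOFOLLOW directive is examined. This is a minimal sketch of the idea, not LinkAlarm's own parser.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <META NAME="ROBOTS"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for token in attributes.get("content", "").split(","):
                self.directives.add(token.strip().lower())


def may_follow_links(html: str) -> bool:
    """Return False when the page asks robots not to follow its links."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "nofollow" not in parser.directives
```

A page whose head contains the tag shown above makes may_follow_links return False, so none of its links would be collected. A page without the tag returns True.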