The site control page that handles forbidden access to web pages has been upgraded. Forbidden Access controls whether the Link Alarm robot obeys robots.txt or bypasses it.
Previously, Forbidden Access was a simple on/off setting that affected only external pages, that is, pages outside the Checking Limits settings. Pages included in the Checking Limits were still subject to robots.txt, which caused some frustration for our users.
Now there is far greater control over bypassing robots.txt. The setting has four options: no bypass (the default), bypass on internal links only, bypass on external links only, or bypass on both. The finer granularity helps when checking a site where portions are protected by robots.txt, while still obeying the rules on external sites.
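To make the four options concrete, here is a minimal sketch of the decision they imply. It is not Link Alarm's actual code: the names `BypassMode` and `should_bypass_robots` are hypothetical, and "internal" is assumed to mean a page that falls within the Checking Limits settings.

```python
from enum import Enum


class BypassMode(Enum):
    """Hypothetical names for the four Forbidden Access options."""
    NONE = "none"          # obey robots.txt everywhere (the default)
    INTERNAL = "internal"  # bypass only for pages within the Checking Limits
    EXTERNAL = "external"  # bypass only for pages outside the Checking Limits
    BOTH = "both"          # bypass everywhere


def should_bypass_robots(mode: BypassMode, is_internal: bool) -> bool:
    """Return True if robots.txt would be ignored for this link."""
    if mode is BypassMode.BOTH:
        return True
    if mode is BypassMode.INTERNAL:
        return is_internal
    if mode is BypassMode.EXTERNAL:
        return not is_internal
    return False  # BypassMode.NONE: always obey robots.txt


if __name__ == "__main__":
    # Bypass on your own protected pages, but stay polite elsewhere.
    print(should_bypass_robots(BypassMode.INTERNAL, is_internal=True))
    print(should_bypass_robots(BypassMode.INTERNAL, is_internal=False))
```

With the internal-only option, the robot in this sketch ignores robots.txt on your own pages and still respects it on everyone else's, which is the case the new setting is meant to serve.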
There are other ways to avoid robots.txt problems on your own site. You can explicitly allow Link Alarm access to your site by adding the following at the top of your robots.txt:
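As a minimal sketch, assuming the robot matches the user-agent token `LinkAlarm` (an assumption here; confirm the exact token before relying on it), an entry like the one embedded below would let Link Alarm in while leaving the rules for other robots unchanged. The Python around it only checks the rules with the standard library's `urllib.robotparser`; the helper `is_allowed`, the `/private/` path, and `example.com` are illustrative and not taken from Link Alarm.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt. The "LinkAlarm" token is an ASSUMPTION, not a
# confirmed value; the /private/ rule is purely illustrative.
ROBOTS_TXT = """\
User-agent: LinkAlarm
Disallow:

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/private/report.html"
    print(is_allowed(ROBOTS_TXT, "LinkAlarm", page))  # assumed token: allowed
    print(is_allowed(ROBOTS_TXT, "OtherBot", page))   # other robots: blocked
```

An empty `Disallow:` line means nothing is blocked for that user agent, so the first entry grants full access without loosening the rules for anyone else.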
Remember to be considerate when bypassing robots.txt on sites other than your own. Site owners who have a robots.txt file usually have a reason for it, and bypassing those controls can be seen as rude. So unless it's essential, respect robots.txt on sites you do not own. It's good netiquette.