Block ahrefs bot in robots.txt #408

Merged
merged 1 commit into from
May 30, 2014

Conversation

samjsharpe
Contributor

This bot crawls us very regularly (sometimes more than once a day) and
causes spikes in our load and exceptions. It seems to serve no useful purpose
to us, in that it's not a search engine that users would use.

  1. What is ahrefs.com
    Ahrefs.com is an independent tool for SEO analysis with a wide range
    of features. It is designed, first of all, for SEO specialists and
    site owners but may be of interest to other concerned Internet
    researchers.
    https://ahrefs.com/faqs.php

This change to the robots.txt is the recommended way to stop the bot crawling.
See https://ahrefs.com/robot
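
For illustration, here is a minimal sketch of what such a rule could look like and how to check it with Python's standard `urllib.robotparser`. The diff itself is not shown in this conversation, so the rule text below is an assumption: `AhrefsBot` is assumed to be the user-agent token the bot identifies with, and `example.com` plus the `allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule, not copied from this PR's diff: "AhrefsBot" is taken to be
# the product token the crawler matches against in robots.txt.
ROBOTS_TXT = """\
User-agent: AhrefsBot
Disallow: /
"""


def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # example.com is a placeholder host used only for this sketch.
    print(allowed("AhrefsBot", "https://example.com/browse"))
    print(allowed("Googlebot", "https://example.com/browse"))
```

Agents that match no `User-agent` group remain unrestricted, so a rule of this shape only affects the named bot.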

@bradwright
Contributor

In the spirit of remaining mostly open, should we set an explicit (and long, 10 seconds or so) crawl delay instead? Their site says they support it.
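
As a rough sketch of that alternative, the rule below swaps the blanket `Disallow` for a `Crawl-delay` and reads it back with `urllib.robotparser` (the `crawl_delay` method needs Python 3.6 or later). The value 10 comes from the suggestion above; the `AhrefsBot` token, the `example.com` host, and the helper name `delay_for` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule text: a 10 second delay for the bot instead of a full block.
ROBOTS_TXT_DELAY = """\
User-agent: AhrefsBot
Crawl-delay: 10
"""


def delay_for(agent, robots_txt=ROBOTS_TXT_DELAY):
    """Return the Crawl-delay (in seconds) that applies to `agent`, or None."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT_DELAY.splitlines())
    # The bot is still allowed to crawl; it is only asked to slow down.
    print(parser.can_fetch("AhrefsBot", "https://example.com/browse"))
    print(delay_for("AhrefsBot"))
    print(delay_for("Googlebot"))
```

`Crawl-delay` is a non-standard directive, so it only helps with crawlers that choose to honour it.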

bradwright added a commit that referenced this pull request May 30, 2014
Block ahrefs bot in robots.txt
bradwright merged commit d7a241f into master May 30, 2014
bradwright deleted the block-ahrefs branch May 30, 2014 15:36
alexmuller added a commit that referenced this pull request Jan 7, 2015
@bradleywright [noted][1] in #408 that it would be nice if we tried
to remain mostly open.

This commit sets a long Crawl-delay which will allow the bot to crawl
us without impacting the site for actual users.

If this causes problems for users we should revert it.

[1]: #408 (comment)