We relocated our site to a dedicated server at WP Engine and found that the AhrefsSiteAudit crawler could no longer crawl our website. Instead, Ahrefs reported a 502 error for our robots.txt file.
Checking robots.txt access using CURL
Using CURL on the command line, we could see that the URL was available and returning a 200 OK status, which is what we’d expect.
The CURL command to use is:
curl -I -L https://domain.com/robots.txt
Make sure to use your own site’s address in place of https://domain.com.
If you see a response like HTTP/2 200, you can access the file without problems directly from your machine.
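If you only care about the status code, curl can also print just that number. Here’s a minimal sketch; the live curl line is shown as a comment since it needs your real site, and the parsing below works on any captured status line:

```shell
# Print only the HTTP status code for your robots.txt (replace domain.com):
#   curl -s -o /dev/null -w '%{http_code}' -L https://domain.com/robots.txt
#
# The same information appears in the first line of `curl -I` output, e.g.:
STATUS_LINE="HTTP/2 200"
CODE=${STATUS_LINE##* }   # keep only the text after the last space
echo "$CODE"              # prints 200
```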
You may notice in the above screenshot that the error mentions Cloudflare. Our site uses WP Engine’s Advanced Network, a partnership with Cloudflare, so I suspected that was causing the problem.
Ahrefs has a nifty article on troubleshooting common issues with Site Audit access.
Allow Ahrefs using robots.txt
We made sure to allow Ahrefs in our robots.txt file by adding the following lines:
User-agent: AhrefsSiteAudit
Allow: /

User-agent: AhrefsBot
Allow: /
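To sanity-check that those directives are actually being served, you can fetch robots.txt and count the Allow rules. A small sketch against a local copy (in practice you would pipe `curl -s https://domain.com/robots.txt` into grep instead):

```shell
# Write the rules to a local file purely to illustrate; swap this for
# `curl -s https://domain.com/robots.txt` against your live site.
cat > /tmp/robots-check.txt <<'EOF'
User-agent: AhrefsSiteAudit
Allow: /
User-agent: AhrefsBot
Allow: /
EOF

# Each Ahrefs agent should have its own Allow rule, so expect a count of 2:
grep -c '^Allow: /$' /tmp/robots-check.txt
```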
That didn’t fix the problem, which makes sense in hindsight: robots.txt tells crawlers what they may request, but a 502 means the request is being rejected by the server or a firewall in front of it before robots.txt even comes into play.
Adding Web Rules at WP Engine to Allow IPs
Then, we collected the Ahrefs IP blocks and addresses we needed to allow.
As of this writing, we added the following comma-separated block of IPs to WP Engine’s Web Rules.
Once you’ve accessed the site in WP Engine’s dashboard, go to Web Rules and add a new access rule to allow traffic from the above IPs.
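WP Engine’s access rule field takes a comma-separated list, so if you have the Ahrefs ranges one per line, you can join them in the shell. The CIDR blocks below are RFC 5737 documentation placeholders, not real Ahrefs ranges; copy the current list from Ahrefs’ own documentation:

```shell
# Placeholder CIDR blocks; substitute the real Ahrefs IP ranges.
IP_BLOCKS="198.51.100.0/24
203.0.113.0/24"

# Join the lines with commas for pasting into WP Engine's Web Rules form.
COMMA_SEPARATED=$(printf '%s\n' "$IP_BLOCKS" | paste -sd, -)
echo "$COMMA_SEPARATED"   # 198.51.100.0/24,203.0.113.0/24
```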
Finally, save the rule, clear your caches, and test. If all is well, you should see something like this:
Take a moment for a happy dance, then return to your search optimization.
If you’re stuck or need help with search engine optimization, please let us know.