Configure a crawl delay

A crawl delay limits how often the crawler requests pages from a website. The crawl takes longer overall, but the website is not overloaded with too many requests at once. This is rarely needed, but it can be useful for “sensitive” servers with limited bandwidth.

The crawl delay is defined in the website’s robots.txt file, using the following format:

User-agent: cludo
Disallow:
Crawl-delay: 5

The example above sets the crawl delay to 5 seconds, so the crawler sends at most one new request every 5 seconds. The empty Disallow line means that no pages are blocked for this user agent.
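
To check how a robots.txt parser reads these directives, the following minimal sketch uses Python's standard-library urllib.robotparser module. It is only an illustration: the helper name read_crawl_delay is hypothetical, and the sketch does not describe how the crawler itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# The same directives as in the example above.
ROBOTS_TXT = """\
User-agent: cludo
Disallow:
Crawl-delay: 5
"""


def read_crawl_delay(robots_txt, user_agent):
    """Return the Crawl-delay in seconds that applies to user_agent, or None.

    read_crawl_delay is a hypothetical helper written for this example.
    """
    parser = RobotFileParser()
    # parse() accepts the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    print(read_crawl_delay(ROBOTS_TXT, "cludo"))
```

For a user agent that has no matching group in the file, the helper returns None, which means no delay was requested for that agent.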
