Category: FAQ

Why is this file not indexed?

Once a crawler has crawled the defined domain(s), you may find that a specific file has not been added to the search index. This is typically due to one of the following reasons:

What are the crawlers’ IP addresses?

In some cases, the crawler may be blocked from indexing your website. To fix this, you may need to whitelist our IP addresses so that the crawler can access the site. Our crawler’s IP addresses are: These are the IPs of the proxy that all of Cludo’s internal services use . . .

What is the maximum file size Cludo can index?

Cludo’s crawlers can index files up to 15 MB. Anything larger can be pushed directly via Cludo’s API. The file size is measured after extraction, which strips images and other irrelevant information first. For reference, the raw text of the entire Bible is around 5 MB . . .
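As a rough local check against the limit described above, the sketch below measures the UTF-8 size of already extracted text. It is an estimate under stated assumptions, not how Cludo measures files: the function name and constant are hypothetical, and 1 MB is assumed to mean 1,000,000 bytes.

```python
# Assumption: 1 MB = 1,000,000 bytes. How the 15 MB limit is counted is not stated.
MAX_CRAWL_BYTES = 15 * 1000 * 1000


def fits_crawler_limit(extracted_text: str, limit: int = MAX_CRAWL_BYTES) -> bool:
    """Return True if the UTF-8 encoded text is no larger than the limit."""
    return len(extracted_text.encode("utf-8")) <= limit


if __name__ == "__main__":
    bible_sized = "a" * (5 * 1000 * 1000)   # roughly the 5 MB reference point
    oversized = "a" * (16 * 1000 * 1000)    # larger than the crawler limit
    print(fits_crawler_limit(bible_sized))  # True
    print(fits_crawler_limit(oversized))    # False
```

A file that fails this check would be a candidate for pushing via the API instead of crawling.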

How to delete a crawler

For security reasons, crawlers can only be deleted by Cludo staff. If you need to delete a crawler, please contact support and let us know the ID(s) of the crawler(s) you would like to delete.

How to delete an engine

For security reasons, engines can only be deleted by Cludo staff. If you need to delete an engine, please contact support and let us know the ID(s) of the engine(s) you would like to delete.

What file types does Cludo index?

The indexability of a file is not defined by its extension (e.g. “.pdf”), but by its content type, as returned in the HTTP response headers. The list of supported file types gives extensions as examples only.
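To illustrate why the extension is not decisive, the minimal sketch below decides indexability from a Content-Type header value alone. The set of types is a hypothetical placeholder rather than the actual supported list, and the function names are made up for this example.

```python
# Hypothetical placeholder set, for illustration only. It is NOT the real list
# of supported file types.
EXAMPLE_INDEXABLE_TYPES = {"text/html", "application/pdf"}


def media_type(content_type_header: str) -> str:
    """Strip parameters such as charset and normalise case."""
    return content_type_header.split(";", 1)[0].strip().lower()


def is_indexable(content_type_header: str) -> bool:
    """Decide from the header value only; the URL and extension are never consulted."""
    return media_type(content_type_header) in EXAMPLE_INDEXABLE_TYPES


if __name__ == "__main__":
    # A URL like /download?id=42 has no extension, but the header says PDF.
    print(is_indexable("application/pdf; charset=binary"))  # True
    # A URL ending in .pdf that is served with a generic type is not matched.
    print(is_indexable("application/octet-stream"))          # False
```

If a file is unexpectedly missing from the index, comparing the header the server actually sends with the supported content types is a reasonable first check.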

How many requests does the crawler make?

Our crawler will always attempt to make as many requests as possible, often requesting multiple pages per second, but the actual frequency of requests depends on the server response from the website. Some websites might also have a crawl delay set in their robots.txt, which can impact how many requests . . .
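To see what crawl delay a site advertises, Python's standard urllib.robotparser can read the Crawl-delay directive. The sketch below parses a sample robots.txt held in a string; the sample content and the helper name are assumptions made for this example, and it only shows the upper bound such a delay would imply for a crawler that honors it.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Sample robots.txt content, made up for this example.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""


def max_requests_per_second(robots_txt: str, user_agent: str = "*") -> Optional[float]:
    """Return the request rate implied by Crawl-delay, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # seconds between requests, or None
    if not delay:
        return None
    return 1 / delay


if __name__ == "__main__":
    print(max_requests_per_second(SAMPLE_ROBOTS_TXT))  # 0.5
```

A delay of 2 seconds caps a compliant crawler at one request every two seconds, far below the multiple pages per second mentioned above.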

What stop words does Cludo use?

Stop words are filler words that provide no context. They are ignored in the search query, which improves the relevance of the results. Cludo has a list of stop words for every language it supports. Below you can find the stop words for a few languages: . . .
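The effect of ignoring stop words can be sketched in a few lines. The word list below is a small hypothetical English sample, not Cludo's actual list, and the simple whitespace tokenisation is an assumption made for illustration.

```python
# Hypothetical sample list, for illustration only. It is not Cludo's list.
EXAMPLE_STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "how"})


def strip_stop_words(query: str, stop_words: frozenset = EXAMPLE_STOP_WORDS) -> list:
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [term for term in query.lower().split() if term not in stop_words]


if __name__ == "__main__":
    print(strip_stop_words("How to delete a crawler"))  # ['delete', 'crawler']
```

Only the terms that carry meaning remain, so matching is driven by "delete" and "crawler" rather than by words that appear on nearly every page.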