Tag: Files

How does Cludo index files?

As long as a file is machine-readable (not an image), Cludo is able to crawl its content along with the information sent with the HTTP headers. File titles It is possible to select how the file title should be extracted by selecting one of the following: Automatic The default option is Automatic, . . . Read more

Why is this file not indexed?

Once a crawler has crawled the defined domain(s), you may experience a specific file not being added to the search index. This will typically be due to one of the following reasons:

What is the maximum file size Cludo can index?

Cludo’s crawlers can index files up to 15 MB. Anything larger can be pushed directly via Cludo’s API. The extraction of files removes the size of images and other irrelevant information prior to looking at the file size. For reference, the raw text of the entire Bible is around 5MB.

What file types does Cludo index?

The indexability of a file is not defined by its extension (e.g. “.pdf”), but rather by the content type, as returned in the HTTP headers. In the list below, we have added extensions as examples. Supported file types