Category: FAQ
In keeping with our commitment to transparency and privacy, Cludo aims to provide comprehensive information about the cookies and local storage used on the MyCludo platform. No third-party cookies are stored in MyCludo. However, local storage serves as a successor to cookies, ensuring that they are treated equally under EU . . .
If you would like the crawler to ignore certain parts of the content on a page, you can achieve this using cludooff/cludoon.
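As a rough illustration of the effect, the sketch below removes everything between a cludooff marker and the next cludoon marker before the text is indexed. It assumes the markers are written as HTML comments that begin with the words cludooff and cludoon; the exact marker syntax is defined in Cludo's own documentation and may differ. The function name `strip_ignored` is hypothetical and is not part of any Cludo API.

```python
import re

# Assumption: markers are HTML comments starting with "cludooff" / "cludoon".
# Check Cludo's documentation for the exact syntax it expects.
IGNORED_SPAN = re.compile(
    r"<!--\s*cludooff\b.*?-->"   # opening marker comment
    r".*?"                       # content the crawler should skip
    r"<!--\s*cludoon\b.*?-->",   # closing marker comment
    re.DOTALL | re.IGNORECASE,
)


def strip_ignored(html: str) -> str:
    """Return the HTML with every marked-off span removed (hypothetical helper)."""
    return IGNORED_SPAN.sub("", html)


page = (
    "<p>Keep this</p>"
    "<!--cludooff: index--><nav>Menu</nav><!--cludoon: index-->"
    "<p>Keep this too</p>"
)
cleaned = strip_ignored(page)
```

Here `cleaned` holds only the two paragraphs; the navigation block between the markers is dropped.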
With some custom development, the cookies set by Cludo can be connected to your general cookie consent form. For general information on cookies set by Cludo, read this article. Technical Overview: At the moment Cludo stores session information in local storage that persists across end-user visits, we will . . .
Cludo’s strategy for crawling sites is based on finding as many pages as possible within the user-defined domains, then indexing and storing their content. The step-by-step process is shown in detail in the diagram at the end of the article and is explained further below. Crawling: Step-by-step process 1: Sites . . .
What is a compound word? A compound word is a word that consists of two or more nouns that together form a word with its own meaning. This is very typical in some languages, such as the Scandinavian languages. How is a compound word treated in a search? When searching, a . . .
As long as a file is machine-readable (not an image), Cludo is able to crawl its content along with the information sent in the HTTP headers. File titles: You can choose how the file title is extracted by selecting one of the following. Automatic: The default option is Automatic, . . .
When searching, you may experience the same content appearing more than once in the results. Since a crawler is unable to index the same URL twice, this will always be due to the same content existing on multiple URLs. That is, of course, unless you have two crawlers that index . . .
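One way to check whether the same content exists on multiple URLs is to fingerprint the text of each page and group URLs that share a fingerprint. The sketch below is a minimal illustration and not how Cludo works internally; the function name, the sample URLs, and the whitespace and case normalization are all assumptions made for the example.

```python
import hashlib
from collections import defaultdict


def find_duplicate_content(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose normalized text is identical.

    `pages` maps each URL to its extracted page text (hypothetical input).
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        # Collapse whitespace and ignore case so trivial differences do not hide duplicates.
        normalized = " ".join(text.split()).lower()
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


pages = {
    "https://example.com/about": "About  our company",
    "https://example.com/about/": "About our company",
    "https://example.com/contact": "Contact us",
}
duplicates = find_duplicate_content(pages)
```

In this sample, the two "about" URLs end up in one group because they serve the same text, while the contact page is left out.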
Once a crawler has crawled the defined domain(s), you may experience a specific file not being added to the search index. This will typically be due to one of the following reasons:
The “Category” field is a standard field in the crawler and can be set up to identify a specific type of content when crawling. This is useful when implementing the template, since it makes it easy to add a filter on that category. The example above is just one . . .
You may wish to implement Cludo on an intranet solution that is otherwise closed to the public. For this, you will want to consider how the crawler should access the site as well as how secure the implementation needs to be. Ways to allow crawling behind login: In order to . . .