Setting up Cludo for intranets

You may wish to implement Cludo on an intranet solution that is otherwise closed to the public. For this, you will want to consider how the crawler should access the site as well as how secure the implementation needs to be.

Ways to allow crawling behind login

In order to index a page, the crawler must have access to it. In certain cases, the content to crawl may be located behind a login and therefore is not reachable by the crawler by default.

IP whitelist

You can whitelist the crawler without any further help from Cludo by allowing the crawler IP address for the cluster your account resides on, EU or US. If you’re in doubt, reach out to support or whitelist both.
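If the whitelist is enforced in application code rather than in a firewall, the check could look roughly like the sketch below. The addresses are placeholders taken from reserved documentation ranges, not Cludo's actual crawler IPs, and the names CRAWLER_ALLOWLIST and is_allowed_crawler are hypothetical. Substitute the addresses Cludo lists for your cluster.

```python
import ipaddress

# Placeholder addresses from reserved documentation ranges.
# These are NOT Cludo's real crawler IPs; replace them with the
# addresses for your EU or US cluster.
CRAWLER_ALLOWLIST = [
    ipaddress.ip_network("203.0.113.10/32"),   # stand-in for one cluster
    ipaddress.ip_network("198.51.100.20/32"),  # stand-in for the other cluster
]


def is_allowed_crawler(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside an allowlisted network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a valid IP address, so it cannot be on the allowlist.
        return False
    return any(addr in network for network in CRAWLER_ALLOWLIST)
```

A request handler would call is_allowed_crawler with the client address and skip the login requirement only when it returns True. Listing both clusters, as above, corresponds to the "whitelist both" option.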

Configure the crawler with a login method

The crawler is able to interact with most login forms as long as 2FA/MFA is not required. By providing the crawler with login credentials, Cludo is able to define a sequence that lets the crawler log in and start indexing the site.

This configuration can only be done by Cludo staff and requires that a user is created for the crawler. Reach out to support if you are interested in setting this up for your site.

Implementing on intranets

There are generally two ways to set up a crawler and search engine on a site behind a login (an intranet).

The standard implementation

For password-protected websites with basic security levels, Cludo can be implemented similarly to public engines. It should be noted that the page title, description, and URL of the intranet pages will be exposed in the search.

Using an internal relay server (proxy)

Cludo can provide a secure intranet search using an internal relay server on the requested network.
This ensures that requests from outside cannot access confidential information.

Intranet security works by requiring a secret key, which must be provided with every search request. This key is referred to as the “customer key”. For public-facing search solutions, the customer key does not need to be kept secret. For intranet search, the key stays secret while internal users can still search: all search requests are relayed through an internal relay server on the requested network, and the relay server adds the secret key to each request before forwarding it to Cludo’s public API.
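The relay step can be sketched as follows. This is a minimal illustration, not Cludo's relay implementation: the upstream URL, the header name, and the function names are all hypothetical, and the real endpoint and the way the customer key is transmitted should be taken from Cludo. The sketch only shows the idea that the key lives on the relay server and is attached there, so it never reaches the browser.

```python
import urllib.request

# Assumptions for illustration only: this URL and header name are
# hypothetical. Use the endpoint and key format Cludo provides.
UPSTREAM_BASE = "https://cludo-api.example.invalid"
KEY_HEADER = "X-Customer-Key"


def build_upstream_request(path: str, body: bytes, customer_key: str) -> urllib.request.Request:
    """Build the request the relay forwards, with the secret key attached."""
    request = urllib.request.Request(UPSTREAM_BASE + path, data=body, method="POST")
    request.add_header("Content-Type", "application/json")
    # The key is added here, on the internal server, and is never
    # sent to or stored by the intranet user's browser.
    request.add_header(KEY_HEADER, customer_key)
    return request


def relay_search(path: str, body: bytes, customer_key: str, timeout: float = 10.0) -> bytes:
    """Forward one search request upstream and return the raw response body."""
    request = build_upstream_request(path, body, customer_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

An internal HTTP endpoint would pass each incoming search body to relay_search and return the result to the user. Because the relay is reachable only from inside the network, requests from outside never obtain a valid key.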

A diagram of how a visitor on an intranet can successfully search with Cludo.