How is relevance determined?
Cludo Search is not a black box – the search relevance is based on a unique blend of machine learning and human customization on top of an algorithm.
Before customization or machine learning is applied, the Cludo search engine bases relevance on the Okapi BM25 algorithm. A rule of thumb for good search relevance, and for SEO in general, is that a good content structure is key. While this also applies to Cludo, the relevance in a Cludo search engine is highly dependent on how the crawler is configured, i.e. which fields and content are indexed for the pages.
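To give a feel for what BM25 rewards, here is a minimal sketch of the textbook Okapi BM25 formula in Python. It is not Cludo's implementation: the function name `bm25_scores`, the parameter defaults `k1=1.2` and `b=0.75`, and the sample documents are assumptions chosen for illustration only.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with textbook Okapi BM25.

    k1 and b are common default values, not values confirmed for Cludo.
    """
    n_docs = len(documents)
    avg_len = sum(len(doc) for doc in documents) / n_docs

    # Number of documents that contain each term at least once.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))

    scores = []
    for doc in documents:
        term_counts = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_counts[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates and is normalized by document length.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical, already-tokenized page contents.
pages = [["dog", "house"], ["cat", "tree", "cat"], ["dog", "dog", "park"]]
scores = bm25_scores(["dog"], pages)
```

In this sketch, the page that mentions "dog" twice scores highest, the page that mentions it once scores lower, and the page without the term scores 0. This is also why indexing navigation or footer text matters: every indexed word can contribute to a score.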
How crawler fields and boosting affect the search relevance
The crawler only indexes the content that it is configured to read. When searching for results, only the content indexed by the crawler is searchable.
The title and description of a page are always required for a crawler to pick it up. Additional fields such as meta description, subtitles, or intro text can also be set up. These fields are then indexed separately, which makes them searchable and also allows their relevance to be adjusted later using boosting.
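The effect of indexing fields separately can be pictured with a small sketch. This is only an illustration of the general idea of field boosting, not how Cludo computes it: the function `boosted_score`, the field names, the per-field scores, and the boost values are all hypothetical.

```python
def boosted_score(field_scores, boosts):
    """Combine per-field scores into one score; fields without a boost use 1.0."""
    return sum(score * boosts.get(field, 1.0) for field, score in field_scores.items())


# Hypothetical per-field match scores for two pages and one query.
page_a = {"title": 0.5, "description": 0.0}  # term appears in the title only
page_b = {"title": 0.0, "description": 0.8}  # term appears in the description only

no_boost = {}
title_boost = {"title": 2.0}

# Without boosting, page_b ranks first; boosting the title flips the order.
before = (boosted_score(page_a, no_boost), boosted_score(page_b, no_boost))
after = (boosted_score(page_a, title_boost), boosted_score(page_b, title_boost))
```

Because each field keeps its own score, a boost on one field changes the ordering without touching the others. That is only possible when the field was indexed separately in the first place.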
Why you should pay more attention to your Description field
It is important that the Description field is set to only include the actual page content and not static page elements such as navigation items or the footer. If these elements are indexed, they will be searchable, risking that irrelevant results appear for a given search because the search term exists outside of the main content.
How to measure Relevance
Relevance for search can be measured using the Mean Reciprocal Rank (MRR). Mean Reciprocal Rank is a statistical measure based on the position of the result that a visitor clicks in the list of ranked results. For each search, the reciprocal rank is 1 divided by the rank of the clicked result, and the MRR is the average of these values across all searches. For example, if someone searches for a term and clicks on the first result, that search gets a perfect score of 1. The reciprocal rank is calculated as follows:
Search Query | Ranked Results | Clicked Result | Rank of Clicked Result | Reciprocal Rank |
---|---|---|---|---|
dog | 1. doggy 2. doghouse 3. dogs | dogs | 3 | 1/3 = 0.33 |
monkey | 1. monkey 2. monkey bars 3. monkey pets | monkey bars | 2 | 1/2 = 0.5 |
cat | 1. cat 2. catholic 3. category | cat | 1 | 1/1 = 1 |
A score of 0.5 is considered a good standard MRR score. This means that, on average, visitors are clicking on the 2nd result or higher. It is possible to get the average MRR score of a search engine by reaching out to support.
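The calculation in the table can be reproduced with a short Python sketch. The function names `reciprocal_rank` and `mean_reciprocal_rank` are illustrative and not part of any Cludo API; the data is taken from the three example searches in the table.

```python
def reciprocal_rank(ranked_results, clicked_result):
    """Return 1 / (1-based position of the clicked result), or 0.0 if it is not listed."""
    for position, result in enumerate(ranked_results, start=1):
        if result == clicked_result:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(searches):
    """Average the reciprocal ranks over (ranked_results, clicked_result) pairs."""
    if not searches:
        return 0.0
    return sum(reciprocal_rank(results, clicked) for results, clicked in searches) / len(searches)


# The three example searches from the table.
searches = [
    (["doggy", "doghouse", "dogs"], "dogs"),                        # rank 3
    (["monkey", "monkey bars", "monkey pets"], "monkey bars"),      # rank 2
    (["cat", "catholic", "category"], "cat"),                       # rank 1
]

mrr = mean_reciprocal_rank(searches)
print(round(mrr, 2))  # 0.61
```

For these three searches the MRR is (0.33 + 0.5 + 1) / 3, or about 0.61, which is above the 0.5 benchmark.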
How to impact the search algorithm
With Cludo, there are multiple ways to impact or customize the search algorithm, from determining which page results should show up for specific queries, to prioritizing or de-prioritizing certain areas of a website, or even using machine learning to dynamically adjust the order of results based on user activity.
The following Cludo tools can be used to customize the search algorithm:
- Page rankings
- Intelligent Re-ranking
- Boostings
These tools all act on top of the Okapi BM25 algorithm in the search application.