Why is this page not indexed?

After a crawler has crawled the defined domain(s), a specific page may still be missing from the search index. This is typically due to one of the following reasons:

  • The page has a canonical tag in the <head> of the HTML that points to another URL, which indicates that the page is a copy of that page. (learn more about canonicalization)
  • The page has a noindex tag in the <head> of the HTML, which indicates the page should not be indexed by any search engine.
  • The page redirects to another URL and is therefore ignored.
  • The URL of the page is outside the defined website for the crawler.

Example

The crawler is set up to crawl https://cluborughlu.com/schools/ and finds the URL https://cluborughlu.com/hardvard-university/. This page is not indexed because its URL does not include /schools/ and therefore falls outside the website defined for the crawler. To fix this, change the crawled website to https://cluborughlu.com/ so that the page is in scope.
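
The scope rule in this example can be sketched in a few lines of Python. This is only an illustration: it assumes the crawler compares scheme, host, and path prefix, which this article does not confirm, and the function name in_scope is hypothetical rather than part of any product API.

```python
from urllib.parse import urlsplit


def in_scope(page_url: str, base_url: str) -> bool:
    """Return True if page_url sits under base_url (assumed rule: same
    scheme and host, and the page path starts with the base path)."""
    page, base = urlsplit(page_url), urlsplit(base_url)
    if page.scheme != base.scheme or page.netloc.lower() != base.netloc.lower():
        return False
    # Add a trailing slash so "/schools" and "/schools/" behave the same
    # and "/schools-old/" is not mistaken for a child of "/schools/".
    base_path = base.path if base.path.endswith("/") else base.path + "/"
    page_path = page.path if page.path.endswith("/") else page.path + "/"
    return page_path.startswith(base_path)


page = "https://cluborughlu.com/hardvard-university/"
print(in_scope(page, "https://cluborughlu.com/schools/"))  # False
print(in_scope(page, "https://cluborughlu.com/"))          # True
```

Running a check like this against the URL of a missing page is a quick way to tell a scope problem apart from the other reasons in the list.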

  • The page is an orphan page, meaning that there are no links to it, preventing the crawler from ever finding it.
  • The page is missing a required field, such as title or description, and is therefore being ignored.

Example

The crawler is set up to use First H1 for the Title field, which is a required field by default, but the page doesn’t have an H1 and is therefore ignored. To fix this, add a fallback to the Title field that would match the page in question.
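
Several of the reasons above (canonical tag, noindex tag, missing required field) can be checked from the page's HTML alone. The sketch below uses only Python's standard library. It is not the crawler's actual logic: the class PageScanner, the function diagnose, and the choice of the <title> element as the fallback for the Title field are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class PageScanner(HTMLParser):
    """Collects the canonical URL, a robots noindex flag, and the first
    non-empty H1 and <title> text."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.noindex = False
        self.h1 = None
        self.title = None
        self._capture = None  # "h1" or "title" while inside that element
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = {k: (v or "") for k, v in attrs}
        if tag == "link" and "canonical" in a.get("rel", "").lower().split():
            self.canonical = a.get("href") or None
        elif tag == "meta" and a.get("name", "").lower() == "robots":
            if "noindex" in a.get("content", "").lower():
                self.noindex = True
        elif tag in ("h1", "title") and getattr(self, tag) is None and self._capture is None:
            self._capture, self._buffer = tag, []

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            # Collapse whitespace; an empty element counts as missing.
            text = " ".join("".join(self._buffer).split())
            setattr(self, tag, text or None)
            self._capture = None


def diagnose(html: str, page_url: str) -> list[str]:
    """Return the page-level reasons (hypothetical labels) found in html."""
    scanner = PageScanner()
    scanner.feed(html)
    scanner.close()
    reasons = []
    if scanner.canonical and scanner.canonical.rstrip("/") != page_url.rstrip("/"):
        reasons.append("canonical points elsewhere")
    if scanner.noindex:
        reasons.append("noindex")
    # Title rule: first H1, with <title> as the assumed fallback.
    if not (scanner.h1 or scanner.title):
        reasons.append("missing title")
    return reasons


html = "<html><head><title>Example page</title></head><body><p>No H1 here.</p></body></html>"
print(diagnose(html, "https://cluborughlu.com/hardvard-university/"))  # []
```

In this sample the page has no H1, but the assumed <title> fallback still yields a title, so no reason is reported. Without the fallback, the same page would be flagged as missing a required field.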

Feel free to contact support if you have further questions on why a page was not indexed as expected.
