NOINDEX selected pages/products/URLs and keep them out of sitemap.xml


Chris Carter

Hi, I am looking for a way to mark certain URLs as NOINDEX using the robots meta tag:

<meta name="robots" content="noindex" />

I also want every URL marked as NOINDEX to be left out of the sitemap file (sitemap.xml), because Google has no reason to crawl a page that is marked NOINDEX. If Google crawls such a page because it is listed in the sitemap, crawl budget is wasted and Google Search Console reports errors/warnings.

I have not found any module that handles this. Do you know of one?

My aim is to prevent Google from indexing certain URLs: mostly product, category and brand pages that do not have good content (thin content). These pages are useful to a customer who is already on the site and knows what they are looking for. For Google, however, they are not good: they can lower the valuation of the entire site and harm the URLs that do have good content.

The important thing here is that the module that outputs the NOINDEX tag also removes the URL from the list of URLs in the sitemap.


Hi,

In PrestaShop, you can modify the template files for these specific pages (e.g., product.tpl, category.tpl) and add the <meta name="robots" content="noindex" /> tag to the <head> section of these templates. This will instruct search engines not to index these pages.

To exclude these NOINDEX URLs from your sitemap, you'll need to customize your sitemap generation process. You can create a custom PHP script or override the default sitemap generation in PrestaShop. Either way, make sure the script skips every URL marked as NOINDEX when it writes the sitemap.xml file.
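As a rough illustration of the filtering step, here is a minimal sketch in Python rather than PHP. It post-processes an already generated sitemap instead of overriding PrestaShop's generator. The function name `filter_sitemap` and the example URLs are hypothetical, and it assumes the file is a standard `<urlset>` sitemap and that you maintain your own list of NOINDEX URLs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the default namespace on output instead of an "ns0:" prefix.
ET.register_namespace("", SITEMAP_NS)


def filter_sitemap(xml_text, noindex_urls):
    """Return the sitemap XML without the <url> entries listed in noindex_urls.

    Hypothetical helper: noindex_urls is your own list of absolute URLs
    that carry the noindex meta tag.
    """
    root = ET.fromstring(xml_text)
    blocked = {u.strip() for u in noindex_urls}
    # Iterate over a copy because entries are removed while looping.
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() in blocked:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://example.com/good-page</loc></url>"
        "<url><loc>https://example.com/thin-page</loc></url>"
        "</urlset>"
    )
    print(filter_sitemap(sample, ["https://example.com/thin-page"]))
```

Matching is done on the exact URL string, so the list has to use the same scheme, host and trailing slashes as the sitemap. If the sitemap is regenerated on a schedule, the filter would need to run after each regeneration.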

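You can also use your robots.txt file to disallow crawling of certain URLs, but do not combine this with NOINDEX on the same URL: a crawler that is blocked by robots.txt never fetches the page, so it never sees the meta tag, and the URL can still end up indexed if other pages link to it. Use a Disallow rule only for paths where you are not relying on the NOINDEX tag. For example, you can add the following lines to your robots.txt: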

User-agent: *
Disallow: /path-to-noindex-page/
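Before deploying a rule like this, it can help to check which URLs it actually blocks. The sketch below uses Python's standard `urllib.robotparser` for that; the helper name `blocked_urls` and the example URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, agent="*"):
    """Return the URLs from `urls` that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [u for u in urls if not parser.can_fetch(agent, u)]


if __name__ == "__main__":
    rules = "User-agent: *\nDisallow: /path-to-noindex-page/\n"
    candidates = [
        "https://example.com/path-to-noindex-page/item-1",
        "https://example.com/category/shoes",
    ]
    for url in blocked_urls(rules, candidates):
        print("blocked:", url)
```

Any URL this reports as blocked will not be fetched by compliant crawlers, so a meta tag placed on that page has no effect for them.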

After implementing these changes, thoroughly test to ensure that NOINDEX tags are correctly applied and that excluded URLs do not appear in your sitemap.
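One way to automate that check is sketched below, again in Python. It assumes you have already downloaded the sitemap and the HTML of the pages you care about. The names `has_noindex` and `sitemap_conflicts` are hypothetical, and the script only inspects the `robots` meta tag, not HTTP headers.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


class RobotsMetaParser(HTMLParser):
    """Records whether a <meta name="robots"> tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        directives = {d.strip().lower() for d in attr.get("content", "").split(",")}
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            self.noindex = True


def has_noindex(html):
    """Return True if the HTML carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


def sitemap_conflicts(sitemap_xml, pages):
    """Return sitemap URLs whose HTML (pages: dict of url -> html) is noindex."""
    root = ET.fromstring(sitemap_xml)
    locs = [(el.text or "").strip() for el in root.iter(f"{{{SITEMAP_NS}}}loc")]
    return [url for url in locs if url in pages and has_noindex(pages[url])]
```

An empty result from `sitemap_conflicts` means none of the checked pages is both listed in the sitemap and marked NOINDEX.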

Monitor your website's performance in Google Search Console to verify that these changes are having the desired effect.

Remember to back up your website and be cautious when modifying template files and sitemap generation scripts, as improper changes can affect your site's functionality.
