The sort-by, number-of-results-per-page, and language flags create new links with a potentially endless number of variable combinations.
Example:
/best-sales.php?isolang=hi&n=15&p=3&id_lang=23&orderby=price&orderway=desc
/5-prestashop-modules?n=10&isolang=id&id_lang=13&orderby=name&orderway=asc&id_lang=13&id_category=5
/5-prestashop-modules?n=10&isolang=id&id_lang=13&orderby=name&orderway=asc
While those URLs are not a problem for regular users, search engines that crawl your site find those links, crawl them, and then find even more new links on the resulting pages.
Search engines do not need the sort order of products on a page, the number of results per page, or a language-change variable. Those variables just give them new links to crawl with virtually the same data.
I estimate that search engines crawl about 50-100+ times more of these extra URLs than the actual URLs that are valid and need crawling. This results in a lot of meta tag duplication and very similar content, and it puts a huge load on your site when they crawl pages that are not really needed.
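To get a feel for how quickly these variables multiply, here is a rough count in Python. The option lists are hypothetical values picked for illustration, not taken from any particular shop, so the resulting multiplier is only indicative.

```python
from itertools import product

# Hypothetical option values; a real shop may expose more or fewer.
orderby = ["name", "price", "date_add"]
orderway = ["asc", "desc"]
per_page = [10, 20, 50]
languages = ["en", "fr", "es", "de"]

# Every combination is a distinct URL for the same category page.
variants = list(product(orderby, orderway, per_page, languages))
print(len(variants))  # 3 * 2 * 3 * 4 = 72 URLs per page number
```

Even with these modest assumed lists, one category page turns into 72 crawlable URLs, and that is before the same variables show up in a different order.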
In the past I attempted to address this by giving each page unique meta tags (see http://www.prestasho...ewthread/52665/). However, that was not a complete solution, for two reasons:
1) Search engines still see and crawl those pages, and their content is virtually the same.
2) When the same extra variables appear in a different order, the URL is different but the meta tags are still the same.
The ideal solution is to hide all those extra variables from search engines. The only variable that should remain available is the page number (when pagination is present).
This will make sure there are no multiple URLs for the same page with different variables.
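The idea can be sketched in a few lines of Python. This is not the code from the modified PrestaShop files (those are PHP); `build_listing_link` and its `is_bot` flag are hypothetical names used only to illustrate the rule "for search engines, keep nothing but p".

```python
from urllib.parse import urlencode

def build_listing_link(path, params, is_bot):
    """Return a listing URL; for crawlers keep only the page number."""
    if is_bot:
        # Hide sort, per-page and language variables from crawlers.
        params = {k: v for k, v in params.items() if k == "p"}
    query = urlencode(params)
    return f"{path}?{query}" if query else path

params = {"n": 10, "orderby": "name", "orderway": "asc", "p": 3}
print(build_listing_link("/5-prestashop-modules", params, is_bot=False))
# /5-prestashop-modules?n=10&orderby=name&orderway=asc&p=3
print(build_listing_link("/5-prestashop-modules", params, is_bot=True))
# /5-prestashop-modules?p=3
```

Regular users still get the full set of links, while a crawler only ever sees one URL per page number.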
This modification also includes the Multilingual URL FIX from http://www.prestasho...ewthread/55895/
This fix will prevent search engines from seeing these new pages. However, if they have already crawled your site and have those links, they will continue to crawl them unless told otherwise.
I have added to my Duplicate URL Redirect module 301 redirects from each page with variables (11-category-name?orderby=name&orderway=asc) to the page without any variables (11-category-name).
The only variable that will stay is p= which is used for pagination.
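As a rough illustration of that redirect rule (again in Python, not the module's actual PHP), the hypothetical helper below returns the URL a request should be 301-redirected to, or None when the URL is already clean. The search-engine check is left out to keep the sketch short.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

def redirect_target(url):
    """Return the clean URL to 301 to, or None if url is already clean."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    # Keep only the pagination variable, if it is present and non-empty.
    page = query.get("p", [None])[0]
    clean_query = f"p={page}" if page else ""
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, clean_query, ""))
    return None if clean == url else clean

print(redirect_target("/11-category-name?orderby=name&orderway=asc"))
# /11-category-name
print(redirect_target("/11-category-name?p=2&orderby=price&orderway=desc"))
# /11-category-name?p=2
print(redirect_target("/11-category-name?p=2"))
# None
```

Because every variant collapses to the same target, the order of the extra variables no longer matters.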
I have been closely monitoring the HTML suggestions in Google Webmaster Tools since I made these changes and upgraded the Duplicate URL Redirect module (over 3 weeks ago), and the number of duplicate meta tags listed there went down from around 800-900 to about 100.
Attached is a zip with the modified files (4 in total) for PS 1.2, 1.3 and 1.3.2. Make sure you keep a backup of your files before copying the modified ones.
HOW TO TEST?
This change does not affect regular users, so to test it you need to trick the server into thinking you're a search engine.
1) Open Firefox, in the address bar type "about:config" and press enter.
2) At the top there's a filter, type "agent"
3) Right click on "general.useragent.extra.firefox" and choose modify.
4) Replace the value inside with "bot" and click OK
Now when you access your site, the module will think you are a search engine and apply the changes.
You will see that choosing a filter does not do anything, the Results per page dropdown is gone, and if you enter one of those URLs with extra variables, they will all be stripped out (except p= if it was present).
To return the browser to normal, right click on "general.useragent.extra.firefox" and click on "reset".
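If you would rather not change browser settings, the same test can be scripted. The Python sketch below only builds a request with a user agent of "bot", matching the Firefox steps above. The `is_search_engine` check is an assumption about how detection might work (a plain substring match), not the module's real logic, and example.com is a placeholder for your own shop.

```python
from urllib.request import Request

def is_search_engine(user_agent):
    # Assumption: a simple case-insensitive substring check.
    return "bot" in user_agent.lower()

def bot_request(url):
    """Build (but do not send) a request that identifies itself as a bot."""
    return Request(url, headers={"User-Agent": "bot"})

# Placeholder URL; replace it with a listing page on your own site.
req = bot_request("http://example.com/11-category-name?orderby=name&orderway=asc")
print(req.get_header("User-agent"))       # bot
print(is_search_engine("Mozilla/5.0 bot"))  # True
```

Passing `req` to `urllib.request.urlopen` and comparing the response's `geturl()` with the original URL should show whether the extra variables were stripped.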
UPDATED 11/20/10 (added /modules/sendtoafriend/sendtoafriend.php)




