
robots.txt: what should it contain and where should it be placed?


Timpet


I need help with what robots.txt should look like. My PrestaShop cannot generate it by itself. I have an old one, so I can reuse most of that.

My site is in a subfolder called shop, so the web address is http://www.verdious-wardrobe.dk/shop/

Should robots.txt be placed in the shop folder?

And should I put /shop/ in front of all of these?

# Directories
Disallow: /classes/
Disallow: /config/
Disallow: /download/
...

# Files
Disallow: /addresses.php
Disallow: /address.php
Disallow: /authentication.php
...


I noticed that my authentication.php page gets stuck on "Google Analytics" for a while before the page is fully loaded. I think I can remedy this by placing a robots.txt file in the root directory, but I have no idea what this file should contain.

There is no way to generate it in the Back Office (Generators). Can someone post an example of what theirs looks like? I'm not sure which pages to exclude, as I'm sure authentication.php is not the only one that does not need to be crawled.


Hi,

I generated a robots.txt from the BO with PS 1.2.5.
Here is what's in it:

# robots.txt automaticaly generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums

# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.

# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html

User-agent: *

# Directories
Disallow: /your root folder/classes/
Disallow: /your root folder/config/
Disallow: /your root folder/download/
Disallow: /your root folder/mails/
Disallow: /your root folder/modules/
Disallow: /your root folder/translations/
Disallow: /your root folder/tools/

# Files
Disallow: /your root folder/addresses.php
Disallow: /your root folder/address.php
Disallow: /your root folder/authentication.php
Disallow: /your root folder/cart.php
Disallow: /your root folder/contact-form.php
Disallow: /your root folder/discount.php
Disallow: /your root folder/footer.php
Disallow: /your root folder/get-file.php
Disallow: /your root folder/header.php
Disallow: /your root folder/history.php
Disallow: /your root folder/identity.php
Disallow: /your root folder/images.inc.php
Disallow: /your root folder/init.php
Disallow: /your root folder/my-account.php
Disallow: /your root folder/order.php
Disallow: /your root folder/order-slip.php
Disallow: /your root folder/order-detail.php
Disallow: /your root folder/order-follow.php
Disallow: /your root folder/order-return.php
Disallow: /your root folder/order-confirmation.php
Disallow: /your root folder/pagination.php
Disallow: /your root folder/password.php
Disallow: /your root folder/pdf-invoice.php
Disallow: /your root folder/pdf-order-return.php
Disallow: /your root folder/pdf-order-slip.php
Disallow: /your root folder/product-sort.php
Disallow: /your root folder/search.php
Disallow: /your root folder/statistics.php
Disallow: /your root folder/zoom.php

Hope this helps ;-)
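
The "/your root folder/" placeholder in the listing above stands for the URL path at which the shop is installed. A minimal Python sketch of filling it in is shown here; the function name build_robots_txt and the shortened directory and file lists are illustrative assumptions, not PrestaShop's own generator.

```python
def build_robots_txt(base_path, directories, files):
    """Return robots.txt text with every rule prefixed by base_path.

    base_path is the URL path of the shop, e.g. "/shop/", or "/" when
    the shop is installed at the top level of the site.
    """
    # Normalise the prefix so it starts and ends with exactly one slash.
    trimmed = base_path.strip("/")
    prefix = f"/{trimmed}/" if trimmed else "/"

    lines = ["User-agent: *", "", "# Directories"]
    lines += [f"Disallow: {prefix}{name}/" for name in directories]
    lines += ["", "# Files"]
    lines += [f"Disallow: {prefix}{name}" for name in files]
    return "\n".join(lines) + "\n"


# Shortened lists taken from the generated file above (hypothetical subset).
DIRECTORIES = ["classes", "config", "download"]
FILES = ["addresses.php", "address.php", "authentication.php"]

if __name__ == "__main__":
    print(build_robots_txt("/shop/", DIRECTORIES, FILES))
```

Calling it with "/" instead of "/shop/" gives the form needed when the shop sits at the top level of the site.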


  • 7 months later...

Thanks for this, pasko, just what I was looking for.

But what about the css and js directories? Should I disallow both? You don't want Google to index those, because they are pure code, right?

And how about docs, which contains just licences?

I would add all three to robots.txt under Disallow =)

I think it's important to do all that, because you want Google to focus on your content, that is, on your products.

Cheers,
Housy


  • 3 weeks later...

Thanks for this.

1. Where in the Back Office is the module for generating robots.txt?
2. What if my site is hosted directly in public_html, with no other root folder?
Should I write, for example:

Disallow: /public_html/zoom.php

or just

Disallow: /zoom.php

rocky wrote: Yes, robots.txt should be placed in the shop folder, and you should add /shop/ to the front of these, so it is /shop/classes/, for example.


The robots.txt file should be placed in the top-level directory of the web server, not in the shop folder. On FTP this is usually the "public_html", "www" or "httpdocs" directory. (Sorry, this is critical, so I had to correct it.) What rocky wrote is correct only when PS is installed in the top-level directory.

Format: every disallowed path in this file must be written in full from the site root, starting with "/". The path is the URL path, not the location on disk, so "public_html" never appears in it. For example, to block www.example.com/shop/my-account.php, the file should look like this:

User-agent: *

# Files
Disallow: /shop/my-account.php

For www.example.com/my-account.php it should be:

User-agent: *

# Files
Disallow: /my-account.php
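
The effect of the prefix can be checked locally. The sketch below uses Python's standard urllib.robotparser module to test both variants against the www.example.com/shop/ URL from the example; the helper name is_allowed and the constant names are assumptions made for illustration.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser

SHOP_URL = "http://www.example.com/shop/my-account.php"

# Crawlers request robots.txt from the root of the host, not from a subfolder.
ROBOTS_URL = urljoin(SHOP_URL, "/robots.txt")


def is_allowed(rules, url, agent="*"):
    """Parse robots.txt text and report whether the agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


WITH_PREFIX = "User-agent: *\nDisallow: /shop/my-account.php\n"
WITHOUT_PREFIX = "User-agent: *\nDisallow: /my-account.php\n"

if __name__ == "__main__":
    print(ROBOTS_URL)
    # The rule with the /shop/ prefix blocks the page in the subfolder.
    print(is_allowed(WITH_PREFIX, SHOP_URL))
    # The rule without the prefix does not match that page, so it stays crawlable.
    print(is_allowed(WITHOUT_PREFIX, SHOP_URL))
```

Rules are compared against the URL path as a prefix, which is why a shop in a subfolder needs the subfolder in every rule.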


  • 4 weeks later...

I had a lot of problems with Googlebot using a colossal amount of bandwidth (13 gigabytes a month on average), so I had to block it completely. I will be reintroducing access one file at a time.

There appear to be some potential syntax errors in the PrestaShop-generated files. It wasn't until I checked my robots.txt file with these tools (below) that it actually worked properly.

# http://tool.motoricerca.info/robots-checker.phtml
# http://www.mcanerin.com/en/search-engine/robots-txt.asp
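
The post does not say which syntax errors the checkers reported. As a rough local stand-in for such a tool, the Python sketch below flags a few common problems; the function name lint_robots_txt, the list of known fields, and the specific checks are assumptions chosen for illustration, not the rules those online checkers apply.

```python
KNOWN_FIELDS = {"user-agent", "disallow", "allow", "sitemap", "crawl-delay"}


def lint_robots_txt(text):
    """Return a list of (line_number, message) pairs for suspicious lines."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        if ":" not in line:
            problems.append((number, "missing ':' separator"))
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        if field.lower() not in KNOWN_FIELDS:
            problems.append((number, f"unknown field '{field}'"))
        elif field.lower() in {"allow", "disallow"}:
            # Lenient: patterns that begin with a wildcard are not flagged.
            if value and not value.startswith(("/", "*")):
                problems.append((number, "path does not start with '/'"))
            # Catches a leftover placeholder such as "/your root folder/".
            if " " in value:
                problems.append((number, "path contains a space"))
    return problems


SAMPLE = """User-agent: *
Disallow: /your root folder/classes/
Disallow: /shop/config/
Disalow: /shop/cart.php
"""

if __name__ == "__main__":
    for number, message in lint_robots_txt(SAMPLE):
        print(f"line {number}: {message}")
```

On the sample, line 2 is reported for the space left by the unreplaced placeholder and line 4 for the misspelled field name.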


  • 1 month later...

I made my robots.txt this way, but when I go to Google Webmaster Tools to manage my website, Crawl errors shows 54 errors. Should I fix them for the best SEO?

  • 8 years later...

Hi,

My friends, I have a question about the robots.txt on my server, which has permission 666.

I am not sure whether these details are OK, and whether the file is OK for Google Search too.

# robots.txt automatically generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html
User-agent: *
# Allow Directives
Allow: */modules/*.css
Allow: */modules/*.js
Allow: */modules/*.png
Allow: */modules/*.jpg
Allow: /js/jquery/*
# Private pages
Disallow: /*?order=
Disallow: /*?tag=
Disallow: /*?id_currency=
Disallow: /*?search_query=
Disallow: /*?back=
Disallow: /*?n=
Disallow: /*&order=
Disallow: /*&tag=
Disallow: /*&id_currency=
Disallow: /*&search_query=
Disallow: /*&back=
Disallow: /*&n=
Disallow: /*controller=addresses
Disallow: /*controller=address
Disallow: /*controller=authentication
Disallow: /*controller=cart
Disallow: /*controller=discount
Disallow: /*controller=footer
Disallow: /*controller=get-file
Disallow: /*controller=header
Disallow: /*controller=history
Disallow: /*controller=identity
Disallow: /*controller=images.inc
Disallow: /*controller=init
Disallow: /*controller=my-account
Disallow: /*controller=order
Disallow: /*controller=order-slip
Disallow: /*controller=order-detail
Disallow: /*controller=order-follow
Disallow: /*controller=order-return
Disallow: /*controller=order-confirmation
Disallow: /*controller=pagination
Disallow: /*controller=password
Disallow: /*controller=pdf-invoice
Disallow: /*controller=pdf-order-return
Disallow: /*controller=pdf-order-slip
Disallow: /*controller=product-sort
Disallow: /*controller=search
Disallow: /*controller=statistics
Disallow: /*controller=attachment
Disallow: /*controller=guest-tracking
# Directories for 20bekhar.com
Disallow: /app/
Disallow: /cache/
Disallow: /classes/
Disallow: /config/
Disallow: /controllers/
Disallow: /download/
Disallow: /js/
Disallow: /localization/
Disallow: /log/
Disallow: /mails/
Disallow: /modules/
Disallow: /override/
Disallow: /pdf/
Disallow: /src/
Disallow: /tools/
Disallow: /translations/
Disallow: /upload/
Disallow: /var/
Disallow: /vendor/
Disallow: /webservice/
Disallow: /fa/app/
Disallow: /fa/cache/
Disallow: /fa/classes/
Disallow: /fa/config/
Disallow: /fa/controllers/
Disallow: /fa/download/
Disallow: /fa/js/
Disallow: /fa/localization/
Disallow: /fa/log/
Disallow: /fa/mails/
Disallow: /fa/modules/
Disallow: /fa/override/
Disallow: /fa/pdf/
Disallow: /fa/src/
Disallow: /fa/tools/
Disallow: /fa/translations/
Disallow: /fa/upload/
Disallow: /fa/var/
Disallow: /fa/vendor/
Disallow: /fa/webservice/
# Files
Disallow: /*fa/address
Disallow: /*fa/addresses
Disallow: /*fa/login
Disallow: /*fa/cart
Disallow: /*fa/discount
Disallow: /*fa/guest-tracking
Disallow: /*fa/order-history
Disallow: /*fa/identity
Disallow: /*fa/my-account
Disallow: /*fa/سفارش
Disallow: /*fa/order-confirmation
Disallow: /*fa/order-follow
Disallow: /*fa/order-slip
Disallow: /*fa/password-recovery
Disallow: /*fa/search
 

Please help me with this.
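
Rules such as Disallow: /*?order= in the file above rely on wildcard matching, where "*" stands for any run of characters and a trailing "$" anchors the end of the URL, as described in RFC 9309 and in Google's robots.txt documentation. A minimal Python sketch of that matching follows, for trying individual rules against sample paths; the function names and the sample paths are hypothetical, and real crawlers also apply precedence between Allow and Disallow, which is not modelled here.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the
    end of the URL; everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def rule_matches(pattern, path):
    """Report whether path (including any query string) matches pattern."""
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Hypothetical paths, used only to exercise rules from the file above.
    print(rule_matches("/*?order=", "/fa/3-women?order=price"))
    print(rule_matches("/*?order=", "/fa/3-women"))
    print(rule_matches("/*controller=cart", "/index.php?controller=cart"))
    print(rule_matches("/fa/cache/", "/fa/cache/smarty/x.php"))
```

Because matching is by prefix, a rule such as /*controller=order also covers controller=order-slip and the other order-* controllers.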
