Timpet

robots.txt: what should it contain and where should it be placed?

I need help with what robots.txt should look like. My PrestaShop cannot generate it by itself. I have an old one, so I can reuse most of that.

My site is installed in a subfolder called shop, so the web address is http://www.verdious-wardrobe.dk/shop/

Should robots.txt be placed in the shop folder?

And should I put /shop/ in front of all of these?

# Directories
Disallow: /classes/
Disallow: /config/
Disallow: /download/
...

# Files
Disallow: /addresses.php
Disallow: /address.php
Disallow: /authentication.php
...


Yes, robots.txt should be placed in the shop folder and you should add /shop/ to the front of these so it is /shop/classes/ for example.


I noticed my authentication.php page getting stuck on "Google Analytics" for a while before the page is fully loaded. I think I can remedy this by placing a robots.txt file in the root directory, but I have no idea what this file should contain.

There is no way to generate this in the Back Office (Generators). Can someone post an example of what theirs looks like? I'm not sure which pages to exclude, as I'm sure authentication.php is not the only one that does not need to be crawled.


Hi,

I generated a robots.txt from the Back Office (BO) with PS 1.2.5.
Here is what's in it:

# robots.txt automaticaly generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums

# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.

# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/wc/robots.html

User-agent: *

# Directories
Disallow: /your root folder/classes/
Disallow: /your root folder/config/
Disallow: /your root folder/download/
Disallow: /your root folder/mails/
Disallow: /your root folder/modules/
Disallow: /your root folder/translations/
Disallow: /your root folder/tools/

# Files
Disallow: /your root folder/addresses.php
Disallow: /your root folder/address.php
Disallow: /your root folder/authentication.php
Disallow: /your root folder/cart.php
Disallow: /your root folder/contact-form.php
Disallow: /your root folder/discount.php
Disallow: /your root folder/footer.php
Disallow: /your root folder/get-file.php
Disallow: /your root folder/header.php
Disallow: /your root folder/history.php
Disallow: /your root folder/identity.php
Disallow: /your root folder/images.inc.php
Disallow: /your root folder/init.php
Disallow: /your root folder/my-account.php
Disallow: /your root folder/order.php
Disallow: /your root folder/order-slip.php
Disallow: /your root folder/order-detail.php
Disallow: /your root folder/order-follow.php
Disallow: /your root folder/order-return.php
Disallow: /your root folder/order-confirmation.php
Disallow: /your root folder/pagination.php
Disallow: /your root folder/password.php
Disallow: /your root folder/pdf-invoice.php
Disallow: /your root folder/pdf-order-return.php
Disallow: /your root folder/pdf-order-slip.php
Disallow: /your root folder/product-sort.php
Disallow: /your root folder/search.php
Disallow: /your root folder/statistics.php
Disallow: /your root folder/zoom.php

Hope this helps ;-)


Thanks for this, pasko, just what I was looking for.

But what about the css and js directories? Should I disallow both? You don't want Google to index those, since they are pure code, right?

And how about docs, which contains just licences?

I would add all three to robots.txt under Disallow =)

I think it's important to do all that, because you want Google to focus on your content, that is, on your products.

Cheers,
Housy

Thanks for this.

1. Where in the Back Office is the module for generating robots.txt?
2. What if my site is hosted directly in public_html, with no other root folder?
Should I write, for example:

Disallow: /public_html/zoom.php

or just

Disallow: /zoom.php


Just:

Disallow: /zoom.php



You should never include public_html in the path.
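
To illustrate the point about public_html: paths in robots.txt are URL paths as a crawler sees them, not paths on the server's disk. Here is a minimal Python sketch that strips the document root from a file path to get the value for a Disallow line. The function name and the example directory /home/user/public_html are made up for illustration, and it assumes POSIX-style paths on the server.

```python
from pathlib import PurePosixPath


def url_path(file_path, document_root):
    """Return the URL path that corresponds to a file under the document root.

    Both arguments are filesystem paths on the server (hypothetical values).
    Raises ValueError if file_path is not inside document_root.
    """
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(document_root))
    return "/" + relative.as_posix()


def disallow_line(file_path, document_root):
    """Build the robots.txt Disallow line for a file on disk."""
    return "Disallow: " + url_path(file_path, document_root)


if __name__ == "__main__":
    root = "/home/user/public_html"  # assumed document root
    print(disallow_line("/home/user/public_html/zoom.php", root))
    print(disallow_line("/home/user/public_html/shop/zoom.php", root))
```

So a shop installed directly in public_html gets /zoom.php, and one installed in public_html/shop gets /shop/zoom.php.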

rocky wrote: "Yes, robots.txt should be placed in the shop folder and you should add /shop/ to the front of these so it is /shop/classes/ for example."


The robots.txt file should be placed in the top-level directory of the web server, not in the shop folder. On FTP this is usually the "public_html", "www" or "httpdocs" directory (sorry, this is critical, so I had to correct it). What rocky wrote about placement is correct only when PS is installed in the top-level directory.

Format: every disallowed path in this file should be written in full from the site root, starting with a slash. So if you would like to block e.g. www.example.com/shop/my-account.php, the content of this file should look like this:
User-agent: *

# Files
Disallow: /shop/my-account.php



For www.example.com/my-account.php it should be:

User-agent: *

# Files
Disallow: /my-account.php
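
The two cases above (shop in a subfolder versus shop at the top level) differ only in the prefix, so the rules can be generated from one list. This is a rough Python sketch, not PrestaShop's own generator; build_robots and its parameters are hypothetical names chosen for this example.

```python
def build_robots(base_path, entries):
    """Return robots.txt text with every entry prefixed by the install folder.

    base_path: folder the shop is installed in, e.g. "shop", or "" for the
               top-level directory.
    entries:   file or directory names relative to the shop folder.
    """
    folder = base_path.strip("/")
    prefix = "/" + folder if folder else ""
    lines = ["User-agent: *"]
    for entry in entries:
        lines.append("Disallow: " + prefix + "/" + entry.lstrip("/"))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Shop installed in /shop/: rules start with /shop/
    print(build_robots("shop", ["classes/", "my-account.php"]))
    # Shop installed at the top level: no extra prefix
    print(build_robots("", ["classes/", "my-account.php"]))
```

Either way the resulting file is saved as robots.txt in the top-level directory of the site.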


I had a lot of problems with Googlebot consuming colossal bandwidth (13 gigabytes a month on average), so I had to block it completely. I will be reintroducing access one file at a time.

There appear to be some potential syntax errors in the PrestaShop-generated files. It wasn't until I checked my robots.txt file with these tools (below) that it actually worked properly.

# http://tool.motoricerca.info/robots-checker.phtml
# http://www.mcanerin.com/en/search-engine/robots-txt.asp
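
Besides online checkers, a quick local check is possible with Python's standard urllib.robotparser module. The sketch below parses robots.txt text and lists which URLs it blocks; blocked_urls and the example.com URLs are made up for this example. Note that this parser does plain prefix matching and does not interpret wildcard rules such as /*?order=, so it is only suitable for simple files like the PS 1.2.5 one.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_text, urls, user_agent="*"):
    """Return the subset of urls that the given robots.txt text disallows."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


if __name__ == "__main__":
    rules = (
        "User-agent: *\n"
        "Disallow: /shop/cart.php\n"
        "Disallow: /shop/classes/\n"
    )
    candidates = [
        "http://www.example.com/shop/cart.php",
        "http://www.example.com/shop/product.php",
        "http://www.example.com/shop/classes/Tools.php",
    ]
    for url in blocked_urls(rules, candidates):
        print("blocked:", url)
```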

I made my robots.txt this way, but when I go to Google Webmaster Tools to manage my website, I see 54 errors under Crawl errors. Should I fix them for optimal SEO?


Hi,

Friends, I have a question about the robots.txt on my server, which has permission 666.

I am not sure whether these details are OK, and whether the file is OK for Google search too.

# robots.txt automatically generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums
# This file is to prevent the crawling and indexing of certain parts
# of your site by web crawlers and spiders run by sites like Yahoo!
# and Google. By telling these "robots" where not to go on your site,
# you save bandwidth and server resources.
# For more information about the robots.txt standard, see:
# http://www.robotstxt.org/robotstxt.html
User-agent: *
# Allow Directives
Allow: */modules/*.css
Allow: */modules/*.js
Allow: */modules/*.png
Allow: */modules/*.jpg
Allow: /js/jquery/*
# Private pages
Disallow: /*?order=
Disallow: /*?tag=
Disallow: /*?id_currency=
Disallow: /*?search_query=
Disallow: /*?back=
Disallow: /*?n=
Disallow: /*&order=
Disallow: /*&tag=
Disallow: /*&id_currency=
Disallow: /*&search_query=
Disallow: /*&back=
Disallow: /*&n=
Disallow: /*controller=addresses
Disallow: /*controller=address
Disallow: /*controller=authentication
Disallow: /*controller=cart
Disallow: /*controller=discount
Disallow: /*controller=footer
Disallow: /*controller=get-file
Disallow: /*controller=header
Disallow: /*controller=history
Disallow: /*controller=identity
Disallow: /*controller=images.inc
Disallow: /*controller=init
Disallow: /*controller=my-account
Disallow: /*controller=order
Disallow: /*controller=order-slip
Disallow: /*controller=order-detail
Disallow: /*controller=order-follow
Disallow: /*controller=order-return
Disallow: /*controller=order-confirmation
Disallow: /*controller=pagination
Disallow: /*controller=password
Disallow: /*controller=pdf-invoice
Disallow: /*controller=pdf-order-return
Disallow: /*controller=pdf-order-slip
Disallow: /*controller=product-sort
Disallow: /*controller=search
Disallow: /*controller=statistics
Disallow: /*controller=attachment
Disallow: /*controller=guest-tracking
# Directories for 20bekhar.com
Disallow: /app/
Disallow: /cache/
Disallow: /classes/
Disallow: /config/
Disallow: /controllers/
Disallow: /download/
Disallow: /js/
Disallow: /localization/
Disallow: /log/
Disallow: /mails/
Disallow: /modules/
Disallow: /override/
Disallow: /pdf/
Disallow: /src/
Disallow: /tools/
Disallow: /translations/
Disallow: /upload/
Disallow: /var/
Disallow: /vendor/
Disallow: /webservice/
Disallow: /fa/app/
Disallow: /fa/cache/
Disallow: /fa/classes/
Disallow: /fa/config/
Disallow: /fa/controllers/
Disallow: /fa/download/
Disallow: /fa/js/
Disallow: /fa/localization/
Disallow: /fa/log/
Disallow: /fa/mails/
Disallow: /fa/modules/
Disallow: /fa/override/
Disallow: /fa/pdf/
Disallow: /fa/src/
Disallow: /fa/tools/
Disallow: /fa/translations/
Disallow: /fa/upload/
Disallow: /fa/var/
Disallow: /fa/vendor/
Disallow: /fa/webservice/
# Files
Disallow: /*fa/address
Disallow: /*fa/addresses
Disallow: /*fa/login
Disallow: /*fa/cart
Disallow: /*fa/discount
Disallow: /*fa/guest-tracking
Disallow: /*fa/order-history
Disallow: /*fa/identity
Disallow: /*fa/my-account
Disallow: /*fa/سفارش
Disallow: /*fa/order-confirmation
Disallow: /*fa/order-follow
Disallow: /*fa/order-slip
Disallow: /*fa/password-recovery
Disallow: /*fa/search
 

Please help me with this.
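
One way to see what a file like this does is to test individual paths against its rules. The Python sketch below is a simplified matcher, assuming the wildcard conventions used by the major search engines: "*" matches any run of characters, a trailing "$" anchors the end, the longest matching rule wins, and Allow wins a tie. The function names are hypothetical, and real crawlers may differ in details such as percent-encoding.

```python
import re


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape everything literally, then let each "*" match any characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Decide whether path may be crawled.

    rules is a list of (directive, pattern) pairs, where directive is
    "allow" or "disallow". The longest matching pattern wins; on a tie
    between equally long patterns, allow wins.
    """
    best = None
    for directive, pattern in rules:
        if rule_to_regex(pattern).match(path):
            key = (len(pattern), directive == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]


if __name__ == "__main__":
    rules = [
        ("allow", "*/modules/*.css"),
        ("disallow", "/modules/"),
        ("disallow", "/*controller=order"),
    ]
    for path in [
        "/modules/blockcart/blockcart.css",
        "/modules/blockcart/ajax.php",
        "/index.php?controller=order-slip",
        "/fa/product/1",
    ]:
        print(path, "->", "allowed" if is_allowed(path, rules) else "blocked")
```

Under these assumptions the Allow lines for module CSS take precedence over Disallow: /modules/ because they are longer, and /*controller=order already covers order-slip, order-detail and the other order-* controllers, since rules match by prefix.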
