[Solved] Robots generating Carts


I've searched the forums for an answer to a problem I am having, where robots are generating multiple abandoned carts a day, but I can't find any post that helps.

We are seeing 20 - 30 incomplete carts per day, even though the PS notes say that bots should not be able to generate carts.

I need to stop this from happening, as it is causing issues with our tracking and on the backend.

Can anyone give any advice?

This is interesting. I've recently been looking into why I'm getting 50+ abandoned carts every day. I thought at first it was customers, but none of my tracking software suggests this is the case. I'm also coming to the conclusion that they are somehow robot generated.

I also have a robots.txt file, and my next step is to install Analytics to check for sure. Which page did you install the Analytics code into?

Any insight into this behaviour would be much appreciated, as it's playing havoc with the backend for me too.

Thanks

Most carts have a facility to stop bots generating carts when spidering. Are the latest versions of Prestashop still doing this? If not, could a member of the Presta team, or indeed anyone who knows how to stop Googlebot adding hundreds of carts a day, please provide the appropriate code mods?

This is actually quite a problem. My customers table has 2029 entries and the carts table now has 9415 entries; all the extras have come from Google and Yahoo indexing!

Hundreds of phantom carts a day! How can I, and the numerous people getting the same problem, stop this happening?

Thanks

Baz

So it seems like search engines don't always play very nicely with robots.txt :(

I also checked this on my site, and saw many carts created within seconds of each other, which implies an automatic process.

The code below should be placed in /cart.php below the 2 "require_once()" lines.

It checks the user agent to see if it's one of the known crawling bots, and if it is, it does a 301 redirect to the homepage.

I tested it on my site, and had no extra carts created in the last 24+ hours.

// Some clients send no User-Agent header at all, so guard against an undefined index
$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// stripos() is case-insensitive, so 'slurp' also matches "Yahoo! Slurp"
if (stripos($userAgent,'bot') !== false ||
    stripos($userAgent,'baidu') !== false ||
    stripos($userAgent,'spider') !== false ||
    stripos($userAgent,'Ask Jeeves') !== false ||
    stripos($userAgent,'slurp') !== false ||
    stripos($userAgent,'crawl') !== false)
{
   header("HTTP/1.1 301 Moved Permanently");
   header("Location: ".__PS_BASE_URI__);
   exit;
}



To make sure you added the code correctly, try to add an item to your cart when you are done.
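
For anyone who wants to try the matching rule without touching cart.php, here is a minimal sketch of the same check as a stand-alone function. The name isLikelyBot() is made up for illustration and is not part of PrestaShop; the marker list is simply the one from the post above, matched case-insensitively.

```php
<?php
// Hypothetical helper, not part of PrestaShop: returns true when the
// User-Agent contains one of the crawler markers listed in this thread.
function isLikelyBot($userAgent, array $markers = array('bot', 'baidu', 'spider', 'Ask Jeeves', 'slurp', 'crawl'))
{
    if (!is_string($userAgent) || $userAgent === '') {
        return false; // no User-Agent to inspect, so nothing matches
    }
    foreach ($markers as $marker) {
        // stripos() ignores case, so 'slurp' also matches "Yahoo! Slurp"
        if (stripos($userAgent, $marker) !== false) {
            return true;
        }
    }
    return false;
}
```

A substring list like this only catches clients that identify themselves with one of these words. Anything that sends a browser-like user agent passes straight through, so treat it as one layer rather than a complete fix.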

Hi Tomer

I have added the code as suggested, but get the following pop-up error box:
'Technical error: unable to add product
details:
error: thrown
text status: error'

I am using 1.2.1

Any ideas as to why the error occurs?

Thanks again

Baz

EDIT: This comment was relevant and valid only for the time it was posted. Do not apply these steps to any version newer than the post date.

 

SVN commit comment:

FO : ajax "add to cart" now uses a POST request instead of GET

FO : "add to cart" is now protected against bots

Files modified:

/cart.php

/classes/Tools.php

/modules/blockcart/ajax-cart.js

To manually apply the change

in cart.php replace the lines 22-25

//update the cart...
if ($add OR Tools::getIsset('update') OR $delete)
{
//get the values

with

// Update the cart ONLY if cookies are available, in order to avoid ghost carts created by bots
if (($add OR Tools::getIsset('update') OR $delete) AND isset($cookie->date_add))
{

in Tools.php replace the line 127

return isset($_POST[$key]) ? true : (isset($_GET[$key]) ? true : false);

with

return (isset($_POST[$key]) OR isset($_GET[$key]));

in ajax-cart.js replace the line 155

type: 'GET',

with

type: 'POST',

replace the line 160

data: 'add&ajax=true&qty=' + ( (quantity && quantity != null) ? quantity : '1') + '&id_product=' + idProduct + '&token=' + static_token + ( (parseInt(idCombination) && idCombination != null) ? '&ipa=' + parseInt(idCombination): ''),

with

data: 'add=1&ajax=true&qty=' + ( (quantity && quantity != null) ? quantity : '1') + '&id_product=' + idProduct + '&token=' + static_token + ( (parseInt(idCombination) && idCombination != null) ? '&ipa=' + parseInt(idCombination): ''),

replace the line 203

type: 'GET',

with

type: 'POST',

replace the line 208

data: 'delete' + '&id_product=' + idProduct + '&ipa=' + ((idCombination != null && parseInt(idCombination)) ? idCombination : '') + ((customizationId && customizationId != null) ? '&id_customization=' + customizationId : '') + '&token=' + static_token + '&ajax=true',

with

data: 'delete=1&id_product=' + idProduct + '&ipa=' + ((idCombination != null && parseInt(idCombination)) ? idCombination : '') + ((customizationId && customizationId != null) ? '&id_customization=' + customizationId : '') + '&token=' + static_token + '&ajax=true',

These line numbers are from an unmodified 1.3.1 version so if you made any changes already they may differ.
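
The idea behind the cart.php change can be shown in isolation. This is only a rough sketch under assumptions: shouldUpdateCart() and the cookie name shop_cookie are invented for illustration, and PrestaShop itself checks its own cookie object ($cookie->date_add) rather than calling a function like this.

```php
<?php
// Hypothetical stand-alone version of the guard described in the commit.
// $request stands in for the merged GET/POST parameters and $cookies for
// the cookies sent by the client.
function shouldUpdateCart(array $request, array $cookies, $cookieName)
{
    $wantsChange = isset($request['add'])
        || isset($request['update'])
        || isset($request['delete']);

    // A crawler that just follows an "add to cart" link normally sends no
    // cookie, so the request is ignored and no cart row gets created.
    return $wantsChange && isset($cookies[$cookieName]);
}
```

For example, shouldUpdateCart(array('add' => '1'), array(), 'shop_cookie') returns false, while the same request with the cookie present returns true.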

Having the code is more helpful to those who have made changes. They say replace line 208, but it might be line 220 in mine if I've edited the file before, so it's better to CTRL+F, find the part, and then make the change.
If a file were simply provided, I'd have to csdiff it to figure out what changes were made compared to a raw file, then take those changes and find them in my own file.

And yes, it's more work to paste all that code than it is to add attachments, but thank you on behalf of all that are having this problem.
I currently am not.

The edits posted above have been done to this ajax-cart.js.
Nothing else has been done to it since ver 1.3.1.

For future info, as my guess is you're unable to open the file: right-click, choose "Open with" and select Notepad. It's the most basic of all editors but would have worked. I used Dreamweaver CS3 because of its line count (often easier than CTRL+F).

Edit: I forgot to mention, I did the edits but have not tested the file. So if you get some error, that might be a place to check, though I am confident it was done right.

ajax-cart.js

I'm using final 1.2. Will these corrections work on it? I just looked and now have over 20,000 carts. I posted last year about this and never received a response, so I thought I was the only one with the problem.

I have over 2000 products in my store, and reading here it looked like 1.3 has a different db structure, so I'd have to import them. I haven't looked further, as when I first switched to Prestashop I had to upload my products only 50 at a time or it timed out. Not looking forward to that again. And will my theme work? Plus I have heavily modified that theme to suit me. Ugh, don't want to even think about it right now. But the phantom carts are really annoying.

i would suggest investing the time to do a full update, there were several security fixes (even between 1.3 and 1.3.1) and some other big changes.

running an old shop puts you at risk, no?


Yes, and constantly having to upgrade is time consuming. Time that I find better used working on SEO... I am getting more and more frustrated with Prestashop. Maybe it's a sign that I need to move on to a supported platform. More expensive for sure, but probably less troublesome.

I use Wordpress for a couple of other sites I run and the upgrade is automatic, upgrading Prestashop is not a small affair!

The code edits posted above are producing the following errors when adding an item to cart:

Not logged in: 'Item not found'

Logged in: 'Item not found, invalid token'

The changes don't seem to be available (yet?) in the current 1.3.x branch of svn, but are available in the 1.4 svn.

The errors seem to be happening because the code is slightly different (in 1.4 svn) than posted above for the edits at lines 160 and 208.

These edits (taken from the 1.4 svn) seem to remove the above add to cart errors in v1.3.1:

Around line 160:

           data: 'add=1&ajax=true&qty=' + ( (quantity && quantity != null) ? quantity : '1') + '&id_product=' + idProduct + '&token=' + static_token + ( (parseInt(idCombination) && idCombination != null) ? '&ipa=' + parseInt(idCombination): ''),

And around line 208:

           data: 'delete=1&id_product=' + idProduct + '&ipa=' + ((idCombination != null && parseInt(idCombination)) ? idCombination : '') + ((customizationId && customizationId != null) ? '&id_customization=' + customizationId : '') + '&token=' + static_token + '&ajax=true',

Note that the forum inserts semi-colons (;) into posted code: the changes for these lines as first posted by phrasespot above showed, for example, '&qty;=' where '&qty=' was meant. Neither of these two lines of code should have any semi-colons; if your copy has them, remove them.

EDIT: Yes, the forum is adding semi-colons to code that's posted. This needs fixing by the developers!

Does this take a while to kick in? Or, if it is implemented while a bot is already there, does it not work until the next time they visit?

I upgraded to 1.3 and then made the above changes, and yet a bot is still on my site right now making multiple carts. I am up to 25,000 abandoned carts clogging up my admin :(

...for now, until they learn how to, again.
it's part of a plan for world domination. ...or was that a movie?... hmm.

anyway, was this fixed in the newest official update, or just on SVN? (i have already read change-logs, didn't see it.)
any foreseeable repercussions to deleting carts ?

Hi,

I have downloaded the ajax-cart.js file that Fallenleader was kind enough to post up here but if I understand correctly there may be the issues with the semi colons in the .js file?

Do I need to edit them out of this file or is it ok?
And if I do, can I get a clarification on the specifics? I don't want to break our shop!

We are running 1.3.6, and we have in the region of 30,000 ghost carts.. want to fix this!

Thanks!

in admin/AdminCarts.php change
$this->delete = false;

to

$this->delete = true;

and you can delete any carts you don't want.

I'm using SVN 1.4 and can confirm that the bots have been stopped from creating carts!



works in 1.3.x though the correct path is admin/tabs/

thanks for the tip!

Can somebody explain why tomerg3's modification (the user-agent check in cart.php) isn't enough? I see you guys talking about SVN and I'm wondering why.

I applied his change and I'm wondering which solution is the best.


Regards!

Thanks very much, the user-agent check in cart.php works for my site.

I have read the whole thread on this topic and tried to apply both methods.

Result: robot carts are still being added.

Today I was banned because of the CPU usage, and the file that is responsible is order.php.

I'm sure that these robot-generated carts are the problem.


Is there somebody who could end this nightmare? Please, I'm going crazy about this problem. :-S

I used to ignore those carts, but now that I have been banned for it, I have had enough! >:(


Thank you in advance!

The code I wrote should work (if you placed it in the right spot).

It's possible that a robot with a different user agent is doing it. You should check the raw access log of your site and see who is accessing cart.php (that's the actual file that handles adding to the cart).

It sounds like the issue may be elsewhere (and not robots generating carts).
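
Checking the access log by eye gets tedious, so here is a minimal sketch of one way to tally which user agents request cart.php. It assumes the log is in Apache's "combined" format (the format of the log excerpts later in this thread); countCartUserAgents() and its parameters are hypothetical names for illustration, not an existing tool.

```php
<?php
// Hypothetical helper: counts User-Agent strings for requests whose path
// starts with $pathPrefix, given lines in Apache "combined" log format.
function countCartUserAgents(array $logLines, $pathPrefix = '/cart.php')
{
    $counts = array();
    foreach ($logLines as $line) {
        // Captures the request path and the last quoted field (the User-Agent)
        if (!preg_match('#"[A-Z]+ (\S+)[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"\s*$#', $line, $m)) {
            continue; // skip lines that are not in the expected format
        }
        if (strpos($m[1], $pathPrefix) !== 0) {
            continue; // not a cart request
        }
        $counts[$m[2]] = isset($counts[$m[2]]) ? $counts[$m[2]] + 1 : 1;
    }
    arsort($counts); // most frequent agent first
    return $counts;
}
```

Typical use would be something like print_r(countCartUserAgents(file('/path/to/access.log'))), where the path is a placeholder for wherever your host keeps the raw log. If the shop uses friendly URLs, pass the rewritten cart path as the second argument instead of /cart.php.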

I attached my modified files. Could you please take a look to see if I did everything properly?

For the last two days I cannot stop the carts from being added. It's going on and on.

I'm pretty sure that this is causing the error, because every time you go into the shopping cart the file order.php is shown.
This file is consuming 98% of the CPU, so I have been banned. Other users have been banned for products.php (because they have a lot of products).

Even if this isn't the cause, I should fix this because it's consuming a lot of bandwidth.

If you are willing to take a look at my files, I will really appreciate it.


Please let me know.


Thank you in advance.

Robots cart.zip

#tomerg3

I tried your modification for more than two months, but it did not stop the robot carts.
I attached my modified cart.php so you can verify if it's done properly.

My previous post covered my implementation of the second technique from this thread.

In my logs I have only IP addresses. How can I know which robot it is?

Thanks one more time!

cart.php

Code looks good, you can also add some debug code there to check who is creating these carts.

Something like

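$myFile = dirname(__FILE__)."/cart_log.txt";
$fh = fopen($myFile, 'a');
// REMOTE_ADDR is the visitor's IP (SERVER_ADDR would log the server's own address)
$agent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-';
fwrite($fh, "Accessing cart, ip: ".$_SERVER['REMOTE_ADDR'].", Agent: ".$agent.", add = $add, delete = $delete\n");
fclose($fh);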



This can get pretty big, but it will log any request to this file.

I would like to try the debug logging code. Where exactly should I insert it (which line)? Must I upload the cart_log.txt file to my server?

Could you also check the three files for the cookie-based cart fix?

I really appreciate your help!

This is the culprit "Najdi.si"

You can add that to the list of blocked engines

$userAgent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
// Same check as before with 'Najdi.si' added; stripos() ignores case
if (stripos($userAgent,'bot') !== false ||
    stripos($userAgent,'baidu') !== false ||
    stripos($userAgent,'spider') !== false ||
    stripos($userAgent,'Ask Jeeves') !== false ||
    stripos($userAgent,'slurp') !== false ||
    stripos($userAgent,'crawl') !== false ||
    stripos($userAgent,'Najdi.si') !== false)
{
   header("HTTP/1.1 301 Moved Permanently");
   header("Location: ".__PS_BASE_URI__);
   exit;
}

#tomerg3

Thank you very much! I have made the change.

I will post if this solved my problems.

Regards!

This code solved my problem! No more robots generating carts! :exclaim: :-)

#tomerg3 thank you very, very much for helping me to resolve this issue!

I really appreciate it. I hope I can return the favor. :)

Best wishes!

Because we use Google Analytics and can see the robots hitting us.

Yes, we have a robots.txt, but this isn't stopping them.


I am about to implement this change as I am also getting bots shopping on my site.

Just wondered how you can see the bots hitting you using analytics as I would like to view this also.
Do you need to create a custom report or is there something there already?

Please could you let me know where (in which file) to add the user-agent blocking code with the "Najdi.si" entry? I am using version 1.4.1.

I am using version 1.4.0.17.

Will the ideas mentioned in this discussion work for this version? Most seem to talk about Prestashop 1.3.

At the moment I'm getting about 50 ghost carts a day.

Can I also delete these carts from the back office?

Thank you

Richard

You are changing controller and classes......
You should use the override folder to override only the parts you need, that way if you upgrade PS you are not deleting your modifications.

You can find more information about overriding in the forum, and manuals.

I am seeing this every day at random timings :

 

2277 -- -- € 4.075,37 -- 2011-12-18 09:42:55

2273 -- -- € 4.115,32 -- 2011-12-17 10:34:01

2260 -- -- € 3.843,07 -- 2011-12-15 03:55:39

 

There are many more carts generated like this. It has a large collection of products.

In my Apache logs i find this :

 

82.192.66.205 - - [20/Dec/2011:07:37:23 +0300] "GET /armbanden/855-armband-malachiet.html HTTP/1.0" 200 9894 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:37:33 +0300] "GET /yoga-overige-accessoires/1653-pranamat-spijkerbed-paars.html HTTP/1.0" 200 10044 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:37:42 +0300] "GET /armbanden/863-armband-chunky-groen.html HTTP/1.0" 200 9629 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:37:49 +0300] "GET /himalaya-medicine/1408-himalaya-geriforte-detox.html HTTP/1.0" 200 9760 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:37:56 +0300] "GET /yoga-overige-accessoires/1652-pranamat-spijkerbed-blauw.html HTTP/1.0" 200 10026 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:38:03 +0300] "GET /armbanden/865-armband-chunky-set.html HTTP/1.0" 200 9650 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.205 - - [20/Dec/2011:07:38:12 +0300] "GET /wierook/745-wierook-deluxe.html HTTP/1.0" 200 10254 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"

 

 

And the log goes on and on...

It seems the Twenga bot is also querying the order page (bestelpagina):

 

82.192.66.201 - - [20/Dec/2011:10:18:52 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.201 - - [20/Dec/2011:10:18:54 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.201 - - [20/Dec/2011:10:18:56 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.201 - - [20/Dec/2011:10:18:57 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.201 - - [20/Dec/2011:10:18:59 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.201 - - [20/Dec/2011:10:19:01 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"

 

 

So it looks like the Twenga bot does NOT behave and the example code from Tomerg3 would solve it i guess.

But his example is made for 1.3.x. In 1.4.x you can use override.

 

So where to put his code ?

I am seeing this every day at random timings :

2277 -- -- € 4.075,37 -- 2011-12-18 09:42:55

2273 -- -- € 4.115,32 -- 2011-12-17 10:34:01

2260 -- -- € 3.843,07 -- 2011-12-15 03:55:39

 

There are many more carts generated like this.

 

In 4 days, 3 carts (unless you trimmed the list) could reasonably be humans making those carts. How many is "many more"? Give a figure in carts/day.

 

Make a note of the cart creation times and correlate them with the apache log entries to pinpoint the exact request creating the carts. It is not one of the requests you have in log snippets you posted in your previous post.

 

82.192.66.205 - - [20/Dec/2011:07:37:23 +0300] "GET /armbanden/855-armband-malachiet.html HTTP/1.0" 200 9894 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
...

 

This request (and similar) cannot create a cart. It is just the product page that is being requested. That is normal. All search engines will be requesting product pages.

 

It seems the Twenga bot is also querying the order page (bestelpagina)

82.192.66.201 - - [20/Dec/2011:10:18:52 +0300] "GET /bestelpagina?ipa=161 HTTP/1.0" 301 409 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
...

 

That is not the order page but the page that displays when you e.g. click on the cart link, i.e. the contents of the cart. This request again cannot create a cart. If you have a cookie and an item in the cart, the cart displays; if you don't have a cookie, you just get the cart page with the message "Your shopping cart is empty".

 

Try this yourself. First add an item to your basket and make a request to http://yourdomain/be...pagina?ipa=161. (The number at the end is not relevant). The item you added to cart will display, but only because you have a cookie. Now clear the cookies and make the same request again.

 

Unless Twenga (or another crawler) has started to make requests which include cookies (highly unlikely), the rule is: no cookie, no cart.
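
The correlation step suggested above (matching cart creation times against Apache log entries) can be scripted. The following is a minimal sketch, not a finished tool: requestsNearCart() is a hypothetical name, the 5-second window is an arbitrary default, and it assumes you can express the cart's creation time as a Unix timestamp in the correct time zone.

```php
<?php
// Hypothetical helper: returns the log lines whose timestamp lies within
// $windowSeconds of a cart's creation time (given as a Unix timestamp).
function requestsNearCart(array $logLines, $cartTime, $windowSeconds = 5)
{
    $hits = array();
    foreach ($logLines as $line) {
        // Apache writes the request time as [21/Dec/2011:06:58:20 +0300]
        if (!preg_match('#\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]#', $line, $m)) {
            continue; // no recognisable timestamp on this line
        }
        $when = DateTime::createFromFormat('d/M/Y:H:i:s O', $m[1]);
        if ($when !== false && abs($when->getTimestamp() - $cartTime) <= $windowSeconds) {
            $hits[] = $line;
        }
    }
    return $hits;
}
```

The cart timestamp can be built the same way, for example DateTime::createFromFormat('Y-m-d H:i:s O', '2011-12-21 06:58:20 +0300')->getTimestamp(). The back office and the web server may not use the same time zone, so check the offset before trusting a match.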

This is from Yesterday :

 

 

2297 -- -- € 2.683,77 -- 2011-12-21 13:30:06	
2296 -- -- € 1.754,10 -- 2011-12-21 10:18:27	
2294 -- -- € 2.932,87 -- 2011-12-21 06:58:20

 

And i trimmed the list. When i list 300 items in the cart page i have 24 big carts since 14 nov.

So that is almost 1/day or sometimes 2x per day.

 

Each amount is different and consists of different products. If this was a human being, he is adding 175 products on average. I don't know who would do that, especially not at 7 am (local time, that is).

 

I tried to see if can correlate the time of the cart with the Apache logs.

For example cart nr 2294 : at 6:58:20.

 

The first Twenga-Bot items i can find starts at 6:42


Dec/2011:06:42:56 +0300] "GET /robots.txt HTTP/1.0" 200 957 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:42:57 +0300] "GET /yogatas/1615-yogatas-yogamad-blauw.html HTTP/1.0" 200 9831 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:02 +0300] "GET /pilates-fitness/4701-pilates-fitness-mat-core-extra-dik.html HTTP/1.0" 404 5774 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:06 +0300] "GET /olien/1316-cypres-olie.html HTTP/1.0" 200 9998 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:11 +0300] "GET /rokken/514-3laagrok-saffraan.html HTTP/1.0" 200 9760 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:16 +0300] "GET /olien/1302-teatree-olie.html HTTP/1.0" 200 9977 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:22 +0300] "GET /tassen/548-heuptas-ohm-saffraan-small.html HTTP/1.0" 200 9493 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:26 +0300] "GET /wierook/718-nightqueen.html HTTP/1.0" 200 9503 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.206 - - [21/Dec/2011:06:43:32 +0300] "GET /manduka/5934-manduka-rvs-bottle.html HTTP/1.0" 200 10699 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"

 

And the list goes on of course. All the articles in the log seems to be in the cart as far as i can see it.

In case you like to see it for yourself i have included the apache log.

 

What I discovered is that the Twenga bot is the only bot that tries to access pages that are blocked by robots.txt, like the account, order, discount and many more pages. It gets a 301 doing so, but it is strange that the Twenga bot is doing it at all:

 

 

Dec/2011:08:50:31 +0300] "GET /account HTTP/1.0" 301 314 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.203 - - [21/Dec/2011:08:50:34 +0300] "GET /account HTTP/1.0" 301 314 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.203 - - [21/Dec/2011:08:50:35 +0300] "GET /account HTTP/1.0" 301 314 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"
82.192.66.203 - - [21/Dec/2011:08:50:38 +0300] "GET /account HTTP/1.0" 301 314 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"

 

If this is a human, he has way too much time. If this is Twenga, then I would rather block ALL bots by other means than just relying on robots.txt.

varsana-apachelog-21-dec-2011.zip

Ignoring my post about Twenga for a moment, the question of this thread and solution to it, was to modify the cart.php and add the wonderful code from Tomerg to block bots.

Since 1.4.x we can use overrides. the cart.php is now a php redirect to /controllers/CartController.php

 

And in the CartController (line 102 version 1.4.6.2) i find this :

 

// Update the cart ONLY if $this->cookies are available, in order to avoid ghost carts created by bots
if (($add OR Tools::getIsset('update') OR $delete) AND isset($_COOKIE[self::$cookie->getName()]))

 

In previous posts in this thread the Prestateam has added this code to prevent the creation of carts by robots/spiders.

Ghost carts issue has been fixed since SVN rev 6851 (see issue http://forge.prestas...owse/PSCFI-2143) and is included in version 1.4.3.0

 

I dont know if the solution by the Prestateam was enough. But suppose i want to include the code from Tomerg, where should it be put ?

 

First i need to override the CartController, and then add the code like this ??

(the code i added starts from line 102 PS 1.4.6.2)

 

if (Configuration::get('PS_TOKEN_ENABLE') == 1
    && strcasecmp(Tools::getToken(false), strval(Tools::getValue('token')))
    && self::$cookie->isLogged() === true)
    $this->errors[] = Tools::displayError('Invalid token');

// Adding the code from Tomerg to block nasty bots generating random carts
if (strpos($_SERVER['HTTP_USER_AGENT'],'bot') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'baidu') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'spider') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'Ask Jeeves') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'slurp') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'TwengaBot') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'crawl') !== false)
{
    Header( "HTTP/1.1 301 Moved Permanently" );
    Header( "Location: ".__PS_BASE_URI__);
    exit;
}

// Update the cart ONLY if $this->cookies are available, in order to avoid ghost carts created by bots
if (($add OR Tools::getIsset('update') OR $delete) AND isset($_COOKIE[self::$cookie->getName()]))

And since i still did not figure out how to properly override i just copied the WHOLE CartController file and added the code (as shown above)...

There are better ways to block a particular visitor than making modifications to prestashop. Leave prestashop core files alone and use a .htaccess file. Here is an example to block that particular UA you posted assuming you are running apache. Similar mechanisms exist for other servers.

 

Add this to the .htaccess file in the directory you want to protect (the shop root, for example):

 

# Set an environment variable (EV) bad_twenga_bot
# if the request UA starts with TwengaBot
SetEnvIf User-Agent ^TwengaBot bad_twenga_bot
# No <Directory> wrapper here: Apache does not allow <Directory>
# sections inside .htaccess, the location of the file sets the scope.
# Allow everyone except if
# bad_twenga_bot EV was set for the request
Order Allow,Deny
Allow from All
Deny from env=bad_twenga_bot

Link to comment
Share on other sites

Yeah, that is a good reminder. Leave core files alone!! I did override the core file, and then my cart started adding a product twice when I added it only once... Overriding is not that easy without a good grasp of PHP.

I would like to start another thread with a how-to on overriding...

 

And although I put in the blocking code that Tomerg suggested, the big carts are still being created. That's weird.

It would be nice to have some more info, like the IP address, on the cart page.

Link to comment
Share on other sites

I think I have correlated a cart that was created with the logs... Presumably... :-)

 

Cart created at 2011-12-23 20:11:22

 

In the logs I found this:

 

31.211.200.102 - - [23/Dec/2011:21:59:21 +0300] "GET /yogatas/1606-yogatas-carryall-paars.html HTTP/1.1" 200 10085 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"

212.64.20.172 - - [23/Dec/2011:21:59:22 +0300] "GET /cart.php?_=1324666759377&ajax=true&add&summary&id_product=5903&ipa=573&op=down&qty=1&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 2037 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
212.64.20.172 - - [23/Dec/2011:21:59:22 +0300] "GET /cart.php?_=1324666759708&ajax=true&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 500 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
31.211.200.102 - - [23/Dec/2011:21:59:22 +0300] "GET /winkelwagen?add&id_product=1606&token=865dce6ff424ddfbdf7cbe2cf766b359 HTTP/1.1" 302 352 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
31.211.200.102 - - [23/Dec/2011:21:59:23 +0300] "GET /themes/vani/img/icon/icon_goodstock-list.gif HTTP/1.1" 200 460 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
212.64.20.172 - - [23/Dec/2011:21:59:23 +0300] "GET /cart.php?_=1324666760449&ajax=true&add&summary&id_product=5903&ipa=573&op=down&qty=1&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 180 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
212.64.20.172 - - [23/Dec/2011:21:59:23 +0300] "GET /cart.php?_=1324666760733&ajax=true&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 180 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
31.211.200.102 - - [23/Dec/2011:21:59:23 +0300] "GET /yogatas/1607-yogatas-carryall-zwart.html HTTP/1.1" 200 10472 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
31.211.200.102 - - [23/Dec/2011:21:59:23 +0300] "GET /6342-home/yogatas-carryall-zwart.jpg HTTP/1.1" 200 3331 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
212.64.20.172 - - [23/Dec/2011:21:59:23 +0300] "GET /cart.php?_=1324666761041&ajax=true&add&summary&id_product=5903&ipa=573&op=down&qty=1&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 180 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
212.64.20.172 - - [23/Dec/2011:21:59:24 +0300] "GET /cart.php?_=1324666761292&ajax=true&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 180 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
31.211.200.102 - - [23/Dec/2011:21:59:24 +0300] "GET /winkelwagen?add&id_product=1607&token=865dce6ff424ddfbdf7cbe2cf766b359 HTTP/1.1" 302 352 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
31.211.200.102 - - [23/Dec/2011:21:59:24 +0300] "GET /bestelpagina?ipa=1607 HTTP/1.1" 302 4676 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
31.211.200.102 - - [23/Dec/2011:21:59:24 +0300] "GET /yogatas/1609-yogatas-mat-bordeaux.html HTTP/1.1" 200 9763 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
212.64.20.172 - - [23/Dec/2011:21:59:25 +0300] "GET /cart.php?_=1324666762513&ajax=true&add&summary&id_product=5940&ipa=964&op=down&qty=1&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 2037 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
212.64.20.172 - - [23/Dec/2011:21:59:25 +0300] "GET /cart.php?_=1324666762831&ajax=true&token=59813786dc441d3f0de1f5b2ea9828a3 HTTP/1.1" 200 500 "https://www.varsana.nl/snelle-bestelpagina" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.63 Safari/535.7"
31.211.200.102 - - [23/Dec/2011:21:59:25 +0300] "GET /winkelwagen?add&id_product=1609&token=865dce6ff424ddfbdf7cbe2cf766b359 HTTP/1.1" 302 352 "http://www.varsana.nl/" "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"

 

I know you have said before that these requests cannot create carts. But I am just trying to figure out why the carts are created (and by whom).

It is getting confusing.

Link to comment
Share on other sites

Hmm.... some of those requests are different and _could_ be creating carts.

 

The IP starting 212.64... belongs to a Dutch broadband provider so I am 99.9% certain that it is just a human visitor, adding items to cart.

 

The other IP, starting 31.211... is also from a broadband provider, in Sweden. So my first guess was it would be a normal visitor too, but the UA of the visit indicates a web-site copier, a program to copy entire sites so you can 'browse them offline'. It does have the ability to send cookies so it is able to create carts. Why anyone would want to have a local copy of a shopping site, I don't know.

 

This just shows that it is ineffective to block arbitrary IP ranges (both are domestic broadband providers, and IPs may be dynamically assigned), crawlers (neither is a crawler in the common sense as in Google crawler), UA strings (httrack has the ability to send any UA string); it is a futile battle.

 

If a UA is able to interact with the site in a manner indistinguishable from that of a human using a normal browser, there is nothing PS, or any other cart software, can do about that w/o also inconveniencing all visitors. In some very old version of PS there was a bug that allowed any crawler to create a cart (hundreds/thousands per day) but this is now fixed and PS is not any more prone to ghost carts than any other cart software.

 

If you have a persistent, known bot/crawler with no regard for robots.txt that also has the ability/reason to send cookies, you can always ask that company to back off (no recognized crawler wants to bring its bots into disrepute with site owners/webmasters), or stop it with a .htaccess as I mentioned before. For odd IPs like the ones in the log above, it is not so easy.

 

Carts that are created and never processed do not lower the stock, do not cause extra processing for PS (unless you are frequently checking in BO :), and are not too difficult to remove, either manually or with an SQL script.

 

However, I say leave them alone. If, as a customer, I carefully added some items to the cart, left the computer to pour a coffee and found my cart empty upon return, I would be very miffed, to say the least. Look at Amazon: you add an item to the cart, go back 3 months later and it is still there. I know a few people who use that as a birthday present repository: they add some items to the cart and buy them nearer the time.

 

The negative effects of ghost carts I can think of are skewing your impression of _really_ abandoned carts by legitimate users, and the load on the server at the time those carts are being created. If either of those is a sufficient concern for you, I would recommend you seek the assistance of a professional to take measures at the perimeter, before those requests ever reach PS, rather than modifying PS itself.

 

Note though that nothing can stop this happening completely. Easiest and cheapest is to accept it as a part of doing business online (undesirables may walk into your brick 'n mortar shop too) and learn to live with it.

 

 

Link to comment
Share on other sites

The other IP, starting 31.211... is also from a broadband provider, in Sweden. So my first guess was it would be a normal visitor too, but the UA of the visit indicates a web-site copier, a program to copy entire sites so you can 'browse them offline'. It does have the ability to send cookies so it is able to create carts. Why anyone would want to have a local copy of a shopping site, I don't know.

 

Your comments absolutely make sense, and thanks for taking the time to investigate. I also found out the IP address belongs to some Swedish company. And you found out it is a website copier (HTTrack) that can manipulate its UA and could possibly be on a dynamic IP.

 

And the plot thickens, because today I already have 3 big carts and HTTrack is not in the logs...

 

Is it possible to log the IP address behind an abandoned cart? With that I could quickly find it in the logs and search further.

 

Tomerg suggested this code :


$myFile = dirname(__FILE__)."/cart_log.txt";
$fh = fopen($myFile, 'a');
// REMOTE_ADDR is the visitor's IP; SERVER_ADDR would log the server's own address
fwrite($fh, "Accessing cart, ip: ".$_SERVER['REMOTE_ADDR'].", Agent: ".$_SERVER['HTTP_USER_AGENT'].", add = $add, delete = $delete\n");
fclose($fh);

 

But I think logging the IP as part of the cart creation itself must be easier.
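A rough sketch of the same logging idea, with a timestamp added so each line can be matched against the access log. It uses REMOTE_ADDR (the visitor's address) rather than SERVER_ADDR (the server's own address). The function name cart_log_line and the log file name are assumptions for illustration, and nothing here is PrestaShop API.

```php
<?php
// Hypothetical helper, not part of PrestaShop: builds one log line for a
// cart request from the $_SERVER array and the add/delete flags.
function cart_log_line(array $server, $add, $delete, $now)
{
    // Fall back to '-' when a value is missing, as access logs do.
    $ip = isset($server['REMOTE_ADDR']) ? $server['REMOTE_ADDR'] : '-';
    $ua = isset($server['HTTP_USER_AGENT']) ? $server['HTTP_USER_AGENT'] : '-';

    return sprintf(
        "%s cart access, ip: %s, agent: %s, add = %d, delete = %d\n",
        date('Y-m-d H:i:s', $now), // formatted in PHP's default time zone
        $ip,
        $ua,
        $add ? 1 : 0,
        $delete ? 1 : 0
    );
}

// Usage inside the cart script, where $add and $delete already exist:
// file_put_contents(
//     dirname(__FILE__) . '/cart_log.txt',
//     cart_log_line($_SERVER, $add, $delete, time()),
//     FILE_APPEND | LOCK_EX
// );
```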

Link to comment
Share on other sites

I just checked the carts and another one was generated. Now I have the server timing figured out. So:

 

Cart € 2103,72 - generated on

2011-12-24 11:33:27

 

 

In the log, at exactly the matching time (the log timestamps are in +0300, two hours ahead of the cart timestamps), I find this entry:

 

 

82.192.66.202 - - [24/Dec/2011:13:33:27 +0300] "GET /winkelwagen?qty=1&id_product=1616&token=865dce6ff424ddfbdf7cbe2cf766b359&add HTTP/1.0" 302 629 "-" "TwengaBot-2.0 (http://www.twenga.com/bot.html)"

 

The bot keeps getting 302/301, but it does not seem to stop...

The robots.txt tells it NOT to look there. So what's happening?
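The time matching used above (cart at 11:33:27, log entry at 13:33:27 +0300) can be done mechanically. The sketch below converts an access-log timestamp into another time zone. It assumes the cart timestamps are stored in Europe/Amsterdam time, which is a guess based on the two-hour offset and the .nl domain. The function name access_log_time_to_local is made up for illustration.

```php
<?php
// Hypothetical helper: converts an Apache access-log timestamp such as
// "24/Dec/2011:13:33:27 +0300" into the given time zone.
function access_log_time_to_local($logTime, $timezone)
{
    // 'O' parses the "+0300" offset that follows the time of day.
    $dt = DateTime::createFromFormat('d/M/Y:H:i:s O', $logTime);
    if ($dt === false) {
        return null; // not a timestamp in the expected format
    }
    $dt->setTimezone(new DateTimeZone($timezone));
    return $dt->format('Y-m-d H:i:s');
}

// Assumption: the shop database stores cart dates in Europe/Amsterdam time.
echo access_log_time_to_local('24/Dec/2011:13:33:27 +0300', 'Europe/Amsterdam');
// prints 2011-12-24 11:33:27
```

With the same assumption, the earlier log lines stamped 23/Dec/2011:21:59:21 +0300 come out as 19:59:21 local time, which does not line up with the cart created at 20:11:22.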

Link to comment
Share on other sites

  • 4 months later...

I've found all my ghost carts are created by this bot:

 

Mozilla/5.0 (compatible; archive.org_bot +http://www.archive.org/details/archive.org_bot) @ IP 207.241.237.224.

 

I have now banned the ip in cPanel.

 

Jay

Link to comment
Share on other sites

  • 3 weeks later...
  • 1 month later...
  • 3 months later...

EDIT: This comment was relevant and valid only for the time it was posted. Do not apply these steps to any version newer than the post date.

 

SVN commit comment:

FO : ajax "add to cart" now uses a POST request instead of GET

FO : "add to cart" is now protected against bots

Files modified:

/cart.php

/classes/Tools.php

/modules/blockcart/ajax-cart.js

To manually apply the change

in cart.php replace the lines 22-25

............

 

Thank you phrasespot!

The changes applied successfully so far.

 

I'll see in the coming days whether I stop getting bot carts in my admin console.

 

If I don't post back, chances are that it worked!

I'm still using version 1.3.1, since summer 2010.

Link to comment
Share on other sites

  • 2 weeks later...
  • 8 months later...

EDIT: This comment was relevant and valid only for the time it was posted. Do not apply these steps to any version newer than the post date.

 

SVN commit comment:

FO : ajax "add to cart" now uses a POST request instead of GET

FO : "add to cart" is now protected against bots

Files modified:

/cart.php

/classes/Tools.php

/modules/blockcart/ajax-cart.js

To manually apply the change

in cart.php replace the lines 22-25

//update the cart...
if ($add OR Tools::getIsset('update') OR $delete)
{
//get the values

with

// Update the cart ONLY if cookies are available, in order to avoid ghost carts created by bots
if (($add OR Tools::getIsset('update') OR $delete) AND isset($cookie->date_add))
{

in Tools.php replace the line 127

return isset($_POST[$key]) ? true : (isset($_GET[$key]) ? true : false);

with

return (isset($_POST[$key]) OR isset($_GET[$key]));

in ajax-cart.js replace the line 155

type: 'GET',

with

type: 'POST',

replace the line 160

data: 'add&ajax=true&qty=' + ( (quantity && quantity != null) ? quantity : '1') + '&id_product=' + idProduct + '&token=' + static_token + ( (parseInt(idCombination) && idCombination != null) ? '&ipa=' + parseInt(idCombination): ''),

with

data: 'add=1&ajax=true&qty=' + ( (quantity && quantity != null) ? quantity : '1') + '&id_product=' + idProduct + '&token=' + static_token + ( (parseInt(idCombination) && idCombination != null) ? '&ipa=' + parseInt(idCombination): ''),

replace the line 203

type: 'GET',

with

type: 'POST',

replace the line 208

data: 'delete' + '&id_product=' + idProduct + '&ipa=' + ((idCombination != null && parseInt(idCombination)) ? idCombination : '') + ((customizationId && customizationId != null) ? '&id_customization=' + customizationId : '') + '&token=' + static_token + '&ajax=true',

with

data: 'delete=1&id_product=' + idProduct + '&ipa=' + ((idCombination != null && parseInt(idCombination)) ? idCombination : '') + ((customizationId && customizationId != null) ? '&id_customization=' + customizationId : '') + '&token=' + static_token + '&ajax=true',

These line numbers are from an unmodified 1.3.1 version so if you made any changes already they may differ.

 

It doesn't work in my PS 1.3.2.3 as I can't add products to my cart after modifying the files.

I will try with the phrasespot htaccess suggestion in order not to modify the core files...

Thanks for sharing!

Link to comment
Share on other sites

There are better ways to block a particular visitor than modifying PrestaShop. Leave the PrestaShop core files alone and use a .htaccess file. Here is an example that blocks the particular UA you posted, assuming you are running Apache. Similar mechanisms exist for other servers.

 

Add this to the .htaccess file in the directory you want to protect (the shop root, for example):

 

# Set an environment variable (EV) bad_twenga_bot
# if the request UA starts with TwengaBot
SetEnvIf User-Agent ^TwengaBot bad_twenga_bot
# No <Directory> wrapper here: Apache does not allow <Directory>
# sections inside .htaccess, the location of the file sets the scope.
# Allow everyone except if
# bad_twenga_bot EV was set for the request
Order Allow,Deny
Allow from All
Deny from env=bad_twenga_bot

 

Hi, I don't know where the correct place is to insert this in my .htaccess file. Can you help me? This is my .htaccess code:

# .htaccess automaticaly generated by PrestaShop e-commerce open-source solution
# http://www.prestashop.com - http://www.prestashop.com/forums

# URL rewriting module activation
RewriteEngine on
RewriteCond %{HTTP_HOST} ^myshop.com
RewriteRule ^(.*)$ http://www.myshop.com/tienda/$1 [R=301,L]

# URL rewriting rules
RewriteRule ^([a-z0-9]+)\-([a-z0-9]+)(\-[_a-zA-Z0-9-]*)/([_a-zA-Z0-9-]*)\.jpg$ /tienda/img/p/$1-$2$3.jpg [QSA,L,E]
RewriteRule ^([0-9]+)\-([0-9]+)/([_a-zA-Z0-9-]*)\.jpg$ /tienda/img/p/$1-$2.jpg [QSA,L,E]
RewriteRule ^([0-9]+)(\-[_a-zA-Z0-9-]*)/([_a-zA-Z0-9-]*)\.jpg$ /tienda/img/c/$1$2.jpg [QSA,L,E]
RewriteRule ^lang-([a-z]{2})/([a-zA-Z0-9-]*)/([0-9]+)\-([a-zA-Z0-9-]*)\.html(.*)$ /tienda/product.php?id_product=$3&isolang=$1$5 [QSA,L,E]
RewriteRule ^lang-([a-z]{2})/([0-9]+)\-([a-zA-Z0-9-]*)\.html(.*)$ /tienda/product.php?id_product=$2&isolang=$1$4 [QSA,L,E]
RewriteRule ^lang-([a-z]{2})/([0-9]+)\-([a-zA-Z0-9-]*)(.*)$ /tienda/category.php?id_category=$2&isolang=$1 [QSA,L,E]
RewriteRule ^([a-zA-Z0-9-]*)/([0-9]+)\-([a-zA-Z0-9-]*)\.html(.*)$ /tienda/product.php?id_product=$2$4 [QSA,L,E]
RewriteRule ^([0-9]+)\-([a-zA-Z0-9-]*)\.html(.*)$ /tienda/product.php?id_product=$1$3 [QSA,L,E]
RewriteRule ^([0-9]+)\-([a-zA-Z0-9-]*)(.*)$ /tienda/category.php?id_category=$1 [QSA,L,E]
RewriteRule ^content/([0-9]+)\-([a-zA-Z0-9-]*)(.*)$ /tienda/cms.php?id_cms=$1 [QSA,L,E]
RewriteRule ^([0-9]+)__([a-zA-Z0-9-]*)(.*)$ /tienda/supplier.php?id_supplier=$1$3 [QSA,L,E]
RewriteRule ^([0-9]+)_([a-zA-Z0-9-]*)(.*)$ /tienda/manufacturer.php?id_manufacturer=$1$3 [QSA,L,E]
RewriteRule ^lang-([a-z]{2})/(.*)$ /tienda/$2?isolang=$1 [QSA,L,E]

# Catch 404 errors
ErrorDocument 404 /tienda/404.php

 

PS 1.3.2.3

Link to comment
Share on other sites

  • 3 weeks later...
  • 1 year later...
  • 1 month later...

So it seems like search engines don't always play very nicely with robots.txt :(

 

I also checked this on my site, and saw many carts created within seconds of each other, which implies an automatic process.

 

The code below should be placed in /cart.php below the 2 "require_once()" lines.

 

It checks the user agent to see if it's one of the known crawling bots, and if it is, it does a 301 redirect to the homepage.

 

I tested it on my site, and had no extra carts created in the last 24+ hours.

 

if (strpos($_SERVER['HTTP_USER_AGENT'],'bot') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'baidu') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'spider') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'Ask Jeeves') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'slurp') !== false ||
    strpos($_SERVER['HTTP_USER_AGENT'],'crawl') !== false)
{
    Header( "HTTP/1.1 301 Moved Permanently" );
    Header( "Location: ".__PS_BASE_URI__);
    exit;
}

To make sure you added the code correctly, try to add an item to your cart when you are done.

 

 

Where is the cart.php file? I am on version 1.6.0.11.

Link to comment
Share on other sites
