nicoledesignstore Posted September 20, 2024 (edited)

Hello,

Our website is being saturated by a new kind of attack from spam IP addresses. The bots appear to use functionality built into PrestaShop to generate faceted search indexes, filtering products in specific categories by brand.

I have already tried to:
- disable the affected category (they move on to another one)
- disable and even delete the faceted search module (the controller keeps working)

I run PrestaShop 1.7.8.5. Is there a way to permanently disable the controller that responds to the "?q=" request in a link?

Thanks a lot for any help provided!
ps8modules Posted September 21, 2024

Hi. Admin menu => Shop Parameters => Traffic & SEO => Search Engines
nicoledesignstore (Author) Posted September 21, 2024

Hi, thanks for your feedback, but I do not understand your reply. The panel in the screenshot lets PrestaShop track which search engines lead visitors to the website.

My issue is with bots that trigger the layered search in the categories many times in a row, creating a queue that depletes our server resources. I do not understand why the faceted search controller still responds even though the module is disabled.
nicoledesignstore (Author) Posted September 21, 2024

Hello,

For the time being I decided to work on the .htaccess side of the PrestaShop directory, blocking any request whose query string contains "q=" and redirecting it to the root folder:

    <IfModule mod_rewrite.c>
    RewriteCond %{QUERY_STRING} q=([A-Z-a-z-0-9]+) [NC]
    RewriteRule (.*) /ROOTFOLDER [R=301,L,QSD]
    </IfModule>

Does anyone know which other functionality of the website this might put in jeopardy? Is there any other PrestaShop feature that depends on "q=" other than faceted search?

Please let me know.
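One way to gauge the side effects asked about above is to run a few sample query strings through the pattern before deploying it. The sketch below is only an approximation: it uses `grep -E` rather than Apache's regex engine, writes the character class in a simplified form (`[A-Za-z0-9-]`), and compares it with an anchored variant, `(^|&)q=`, that requires `q` to be a whole parameter name. The function names and the sample query strings are made up for illustration.

```shell
#!/bin/sh
# Sketch: check which query strings a pattern would catch.
# grep -E only approximates Apache's regex engine; good enough for these simple patterns.

# Unanchored pattern, as in the rule above (character class simplified):
# matches "q=" anywhere in the query string.
matches_unanchored() {
    printf '%s\n' "$1" | grep -Eiq 'q=[A-Za-z0-9-]+'
}

# Anchored variant: "q" must be a complete parameter name.
matches_anchored() {
    printf '%s\n' "$1" | grep -Eiq '(^|&)q=[^&]*'
}

# Hypothetical query strings, for illustration only.
for qs in 'q=Brand-Acme' 'page=2&q=Brand-Acme' 'faq=shipping' 'order=product.price.asc'; do
    if matches_unanchored "$qs"; then u=yes; else u=no; fi
    if matches_anchored "$qs"; then a=yes; else a=no; fi
    printf '%-28s unanchored=%s anchored=%s\n' "$qs" "$u" "$a"
done
```

The unanchored pattern also catches any parameter whose name merely ends in "q" (the made-up `faq=shipping` here), while the anchored one does not. Whether a given shop actually uses such parameters is something to check in its own access logs.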
kerami82 Posted November 6, 2024 (edited)

Hello @nicoledesignstore. I have the same problem with bots. Did you find another solution to the problem?
nicoledesignstore (Author) Posted November 7, 2024

Hi,

Unfortunately the only workaround that works is to block use of the faceted search controller for every visitor, using the rule above in the .htaccess file.

There is no single clean solution for this, unless you start blocking every IP the bots come from. But that would take about half an hour a day, and you would need to keep IP tracking activated at all times.
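The daily IP triage described above can be shortened by letting the access log do the counting. The following is a minimal sketch, assuming the common/combined log format (client IP in field 1, request path in field 7); the function name `top_q_ips` and the log path in the usage comment are assumptions, so adjust them to your server.

```shell
#!/bin/sh
# Sketch: list the client IPs that most often request URLs carrying a "q" parameter.
# Assumes common/combined access log format: client IP = field 1, request path = field 7.

top_q_ips() {
    log_file="$1"
    limit="${2:-20}"   # number of IPs to show, defaults to 20
    awk '$7 ~ /[?&]q=/ { count[$1]++ }
         END { for (ip in count) print count[ip], ip }' "$log_file" |
        sort -rn | head -n "$limit"
}

# Hypothetical usage; the path is an assumption and differs between servers:
# top_q_ips /var/log/nginx/access.log 20
```

Each output line is a request count followed by an IP address, busiest first. The list is only a starting point for firewall rules, since the thread reports that the attacking addresses keep changing.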
tapukatata Posted December 13, 2024

I have the same issue. I have about 50,000 generated links with "?q=", and even Googlebot detects all of them.
mypresta.rocks Posted February 28

Hello @nicoledesignstore,

Today I was working with a store that was being brought down every 5-15 minutes by a similar bot attack targeting a category URL with filters. The mass requests were overpowering PHP and MySQL, and the store was not responding for real customers.

If you have terminal access, you can run these commands to confirm you have the same problem:

    echo "=== $(date) ==="
    echo "LOAD: $(cat /proc/loadavg)"
    echo "=== PHP PROCESSES ==="
    ps aux | grep php-fpm | grep -v grep
    echo "=== MYSQL QUERIES ==="
    mysql -e "SHOW FULL PROCESSLIST" | grep -v Sleep
    echo "=== RECENT REQUESTS ==="
    tail -n 20 /var/log/nginx/access.log

To resolve this, for now we applied a Dispatcher override fix:

    <?php
    class Dispatcher extends DispatcherCore
    {
        public function dispatch()
        {
            // Only redirect if 'q' is present AND the request is NOT an AJAX call.
            if (Tools::getValue('q') && (
                empty($_SERVER['HTTP_X_REQUESTED_WITH']) ||
                strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) !== 'xmlhttprequest'
            )) {
                // Get the full request URI (path + query string)
                $uri = $_SERVER['REQUEST_URI'];
                $parts = parse_url($uri);
                $path = $parts['path'];
                $queryParams = [];
                if (isset($parts['query'])) {
                    parse_str($parts['query'], $queryParams);
                }

                // Remove the problematic "q" parameter
                unset($queryParams['q']);

                // Rebuild the query string (if there are other parameters)
                $newQuery = http_build_query($queryParams);

                // Build the new URL using the original path
                $newUrl = Tools::getShopDomainSsl(true) . $path;
                if (!empty($newQuery)) {
                    $newUrl .= '?' . $newQuery;
                }

                // Redirect with a 301 status code
                header("Location: $newUrl", true, 301);
                exit;
            }

            // For AJAX requests (or if 'q' isn't set), continue as normal.
            return parent::dispatch();
        }
    }

With this override, direct HTTP/browser access to a URL with a filter is redirected to the same page without the filter, while the standard PrestaShop layered filter module keeps working: the override detects whether the request is made via AJAX and, if so, leaves the filtering functionality untouched.

Always consider how this impacts your SEO. But for us, it eliminated the problem of the store being overwhelmed and inaccessible to real people.
addisnetwork Posted June 18

Hi, try adding the following code to the beginning of your .htaccess file. Basically, it prevents the listed bots from performing faceted searches or filtering.

    RewriteEngine On
    RewriteCond %{HTTP_USER_AGENT} (CCBot|ChatGPT|GPTBot|anthropic-ai|ClaudeBot|Google-CloudVertexBot|Omgilibot|Omgili|FacebookBot|Meta-ExternalAgent|Meta-ExternalFetcher|Diffbot|DuckAssistBot|AI2Bot|Bytespider|PerplexityBot|ImagesiftBot|Kangaroo-Bot|cohere-ai|cohere-training-data-crawler|PanguBot|Timpibot|Webzio-Extended|YouBot|Amazonbot|bingbot) [NC]
    RewriteCond %{QUERY_STRING} (^|&)q=[^&]* [NC]
    RewriteRule .* - [G,L]
mypresta.rocks Posted June 18

@addisnetwork thanks for your input. One thing I'd like to add is that blocking bots is not exactly ideal for an e-commerce store, given the constant growth of traffic generated by AI chat services.

If anyone is interested in a module that keeps the default faceted filter block running while preventing the harmful effect of massive bot traffic, please contact me by direct message. I may release it in my shop in the near future, but at the moment I am focused on other modules.

Please see: [image by Semrush]
Paul C Posted July 12

I've only recently come across this, but I think it may have been a low-level issue for a while. Despite the standard robots.txt file disallowing crawling for /*?q= amongst others (the same happens with /*?order= and /*&order=), bingbot is particularly prone to ignoring these entries and crawling the many possible combinations. These aren't links that are in any way useful to have indexed.

I have implemented custom security rules on Cloudflare for a customer whose site was recently getting hammered both by genuine bots crawling these URLs and by a weird collection of other apparently random IP addresses (all with valid user agents set). The challenge solve rate on the "non-bot" requests was 0%, indicating that they were indeed automated and/or malicious.

The issue with just blocking non-AJAX calls is that someone might genuinely want to bookmark a particular filtered page on your site, and you want that to work when they visit it, or when it has been shared and others click the link. Using a "Managed Challenge" strategy for these requests maintains functionality whilst protecting your site.

@mypresta.rocks I work with content creators who have seen a massive decline in traffic due to AI. Google's AI results are literally stealing clicks from these folks, and this is reflected in the above Semrush prediction of lower total visitor numbers over time. I suspect the decline will be steeper and any recovery slower. This is part of the reason that Cloudflare now blocks AI training bots by default. In terms of internet quality, AI is a race to the bottom.
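Since the post above relies on robots.txt carrying those Disallow entries, a first step is to confirm that a shop's own file still has them. This is a rough sketch: the three rules are the ones named in the post, while the function name `missing_robots_rules` and the file path in the usage comment are assumptions. It only checks that the rules exist, not that crawlers honour them.

```shell
#!/bin/sh
# Sketch: print the Disallow rules named in the post that are absent from a robots.txt file.

missing_robots_rules() {
    robots_file="$1"
    for rule in '/*?q=' '/*?order=' '/*&order='; do
        # -F treats the rule as a fixed string, so "*" and "?" are not regex operators.
        if ! grep -Fq "Disallow: $rule" "$robots_file"; then
            printf '%s\n' "$rule"
        fi
    done
}

# Hypothetical usage; the path is an assumption:
# missing_robots_rules /var/www/html/robots.txt
```

Empty output means all three rules are present. Any rule printed is missing from the file.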
El Patron Posted July 12

@Paul C I'm starting to see some of my PrestaShop solutions being suggested by AI...that's like getting your name and number listed in the phone book...I'm sure there is a future plan to monetize placement of suggested solutions...
Paul C Posted July 12

@El Patron Like everything there will be winners and losers. If you're an expert on a subject and write public articles that answer specific questions, then be prepared for your work to be summarised and delivered to people with no visit to your site. If you sell a unique enough product that satisfies a niche, then you might well get referrals. I suspect small stores that sell commodity items will find it tough, particularly during the wild west stage, and that could be a significant proportion of *your* potential customer base.

Google has every incentive to shift traffic via its AI products and monetise the sh*t out of it. They'll charge extra "because it uses AI".
VICOM Posted August 8

Hi,

We also get over 10,000 requests from fake IPs every day. How can we get rid of this? We have PrestaShop 1.7.8.8. Any suggestions?

Best regards
Paul C Posted August 10 (edited)

@VICOM The most "performant" way is to firewall them. You can use a free Cloudflare account and set up security rules to block bots accessing URLs with the offending query strings. For non-bots you can use "Managed Challenge", so that folks who have bookmarked the URLs can still access the page (albeit with a Turnstile captcha). Just be careful to check whether the requests are referred by your own host and are simply the AJAX calls used by the faceted search module, so that those are not challenged (this can be set up as part of the Cloudflare security rules).

Using Cloudflare with PrestaShop isn't without challenges, though, and your server may need to be configured to pass through the genuine visitor IP addresses (otherwise you'll have issues with IP-matched cookies). For cPanel accounts there was also a recent issue with an EasyApache release that broke proxy servers (421 Misdirected Request error).

I assume you're also seeing these kinds of URLs in Search Console? Probably not indexed. Bing seems to be one of the worst at crawling these URLs. That should be prevented by an appropriately configured robots.txt, although search engines these days (possibly related to machine learning) seem to be much more free and easy about their crawls.

EDIT: Whether Cloudflare works for you can also depend on the particular modules that you use, so careful testing is required. This is particularly true of cache and payment modules. I have had success using it with PrestaShop 8.x though.

Paul