I am working on a site for a customer with 3.4 million products, and I am running into issues with memory and queries long before I get to the full catalog size. They are currently using a very old product called "Interchange"; its simple database structure combined with a super-simplistic page layout works, but it is a 30-year-old design... it's time to get them up and running on a more modern platform.
I have written custom import libraries and can get the products properly imported. I have been increasing the catalog size in powers of 10, and when I got to 1 million items, things went pear-shaped.
The product detail page (showing a single item) alone required me to bump the max PHP memory to 512MB per process. This seemed off to me, so I enabled debugging and profiling.
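For anyone wanting to reproduce the numbers below: debug mode and the SQL profiler are toggled via constants in config/defines.inc.php (at least in the 1.5/1.6 branch; the exact file location may vary by PrestaShop version):

```php
<?php
// config/defines.inc.php (path may differ in other PrestaShop versions)
define('_PS_MODE_DEV_', true);        // debug mode: display errors instead of swallowing them
define('_PS_DEBUG_PROFILING_', true); // append the profiler to each page: query list, memory, timings
```

With the profiler on, every front-office page shows its query count, per-query timings, and peak memory, which is where the figures below come from.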
With caching disabled, this page throws 450 queries at the database. With caching enabled (which causes other issues with catalogs of this size), it still executes roughly 140 queries. Some of the queries pull back a number of rows equal to the product count in the database.
The hardware needed is still being evaluated. If need be, I can put multiple front-end servers talking to one or more dedicated MySQL servers. However, at 450 queries per item displayed and half a gig per process, search engine spiders are going to annihilate the site. I can't block web crawlers, as nearly $1 million/year in sales depends on organic search traffic.
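The best I can do in the short term is throttle the crawlers that respect it, with something like this in robots.txt (note that Crawl-delay is honored by Bing and Yandex but ignored by Googlebot, whose crawl rate has to be managed through Search Console instead):

```
User-agent: *
Crawl-delay: 10
```

That buys some headroom but obviously doesn't fix the underlying per-page cost.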
Does anyone have any suggestions for optimizing PrestaShop for very large catalogs, other than "throw more hardware at it"? I am trying to avoid heavy modification of the core PrestaShop code, as that would become a very expensive support issue for them. They have a decent revenue stream, but with several employees and a warehouse full of parts, net profits don't really support massive IT spending.