[Module] to potentially fix the gsitemap module for PS 1.2


Paul C
I managed to have a play with the code for this today and threw a fair bit of it out :lol:

This version should now generate the URLs correctly and adds an automatic change timestamp on the CMS documents (so we behave nicely and don't lie to Mr Google). It also comes with a nice XSL xml-stylesheet added as a processing instruction, so that at least when it does mess up it'll be easily readable in your browser :P
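To make the description above concrete, here is a minimal sketch of a sitemap built through a DOM interface, with an xml-stylesheet processing instruction and a lastmod timestamp per URL. It is written in Python purely for illustration (the module itself is PHP), and the function name `build_sitemap`, the stylesheet name `sitemap.xsl` and the example URL are assumptions, not taken from the module.

```python
from datetime import datetime, timezone
from xml.dom import minidom

# Namespace defined by the sitemap protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries, stylesheet_href="sitemap.xsl"):
    """Return sitemap XML for (loc, lastmod) pairs.

    `stylesheet_href` is a hypothetical stylesheet name used for illustration.
    """
    impl = minidom.getDOMImplementation()
    doc = impl.createDocument(SITEMAP_NS, "urlset", None)
    root = doc.documentElement
    # minidom does not write namespace declarations by itself.
    root.setAttribute("xmlns", SITEMAP_NS)

    # The processing instruction has to come before the root element.
    pi = doc.createProcessingInstruction(
        "xml-stylesheet", 'type="text/xsl" href="%s"' % stylesheet_href
    )
    doc.insertBefore(pi, root)

    for loc, lastmod in entries:
        url = doc.createElement("url")
        for tag, value in (("loc", loc), ("lastmod", lastmod.isoformat())):
            element = doc.createElement(tag)
            element.appendChild(doc.createTextNode(value))
            url.appendChild(element)
        root.appendChild(url)

    return doc.toxml(encoding="UTF-8").decode("utf-8")


example = build_sitemap(
    [("http://www.example.com/lang-en/content/1-delivery",
      datetime(2009, 8, 13, 13, 43, 15, tzinfo=timezone.utc))]
)
```

A browser that opens the resulting file applies the stylesheet named in the processing instruction, which is what makes the raw XML readable.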

Please test, as I've currently only got a demo PrestaShop store to play with, so can't test across a range of languages and configurations (especially needs testing with a store installed in a subdirectory). Seems to work for folks whether using "Friendly" urls or not.

***YOU WILL HAVE TO UNINSTALL THE OLD VERSION BEFORE COPYING THIS TO YOUR STORE AND INSTALLING IT***

***Latest version is attached for convenience (v1.4.4).*** It supports versions 1.2.0.x and 1.2.1.x, including indexing images as introduced in the 1.2.1.0 update.

Hope this helps a few of you!

Paul

P.S. Example output can be viewed at: http://prestashop.ecartservice.net/sitemap.xml for friendly urls and
http://prestashop.ecartservice.net/sitemap_ugly.xml with them turned off

gsitemap_1_4_4.zip

Great stuff - I've submitted the sitemap for the demo store too, so I'll keep an eye on it in Webmaster tools and see if there is anything else I need to do to make this as effective as possible.

I've also enabled the canonical URL module on that store to test it out (in conjunction with the sitemap, as the two are similar "hints" to the search engines). I'm not 100% convinced it's doing what we need it to do either, but I'll report back on my findings soon. I'm going to do exhaustive SEO tests on the demo store to see if there are any other issues we need to address.

Paul

I uninstalled and re-installed the Google sitemap module. After generating, I get no feedback in my admin (no confirmation that the sitemap was generated).
Looking in the root of my server, I see a sitemap.xml with 0 bytes.

Edit: this happens on my live and shadow servers/site.

I should perhaps clarify: there are two options for creating XML documents, SimpleXML (which was used originally) and DOMDocument.

I chose to use DOMDocument as it's standard (the same interface as used in JavaScript etc.) and in my opinion more reliable; SimpleXML has been known to do "weird stuff". DOM does have to be enabled in PHP though, so some hosting providers may have chosen not to support it, although I can't think of any good reason why they would consciously decide to do that.

If it turns out to be a huge issue I might do two versions, and then choose which method to use depending on server support :cheese:
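The "choose the method depending on server support" idea can be sketched as a simple availability check that walks a preference list. This is a Python illustration only, with the helper name `pick_xml_backend` made up for the example; in the PHP module the equivalent test would presumably be something along the lines of `extension_loaded('dom')`, which is an assumption about how it might be done rather than what the module does.

```python
import importlib.util


def pick_xml_backend(preferred=("xml.dom.minidom", "xml.etree.ElementTree")):
    """Return the first importable module name from `preferred`.

    Hypothetical helper: it only checks availability, it does not import
    the chosen backend or build anything with it.
    """
    for name in preferred:
        try:
            if importlib.util.find_spec(name) is not None:
                return name
        except ModuleNotFoundError:
            # Raised when the parent package of a dotted name is missing.
            continue
    raise RuntimeError("no usable XML backend found")
```

The caller would then branch once on the returned name, so the rest of the generator does not need to care which library is installed.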

Paul

Right, the error is related to it being unable to load the DOMDocument class, which is a little odd, since it appears to be installed (at least it's configured as enabled on the server; it normally is, and you have to choose to disable it). However, the server isn't finding it in the PHP include path, so it's falling back to the PrestaShop standard behaviour of looking in the /classes folder.

I'm afraid I'm not an expert on server configuration really, so I'm not sure what could be done to fix the issue, but the server isn't configured correctly for DOM.

If this is a common problem, then I'll need to think of a work-around!

Paul

Hi Paul, glad to see that you could find this "problem" on my server.
I will talk to my hosting-company and see what they come up with.

As soon as I get some answers I will let you know.

Meanwhile, keep the FTP and other information so you can access my test-server if needed.

Thanks again!

I think I'll add a work-around anyway, might as well ;)

Paul


It turns out that DOMDocument is indeed incorrectly configured/installed on both of my domains!
There were errors in my server log about missing classes.

My host is taking care of it and will get back to me once they have sorted it out.

Cool. I'll hold off making any changes until we determine if this was a "localised" problem ;-)

Paul


It turned out that the DOMDocument module was not installed at all.

Now it is, and your 1.4.3 module seems to be working.

I've sent it to Google and now we just have to wait and see.

Thanks so far! :smirk:

Wasn't it? Strange as it's part of the PHP5 distribution, and was enabled in the php startup!

We'll see if anyone else who uses it has a problem I guess.

The sitemap I submitted for my demo store has 1 url indexed (the homepage I suspect) which bothered me until I realised I'd also turned on the "friendly urls" option for the first time before generating the sitemap I submitted.... Once Google catches up with those I should be able to continue my "experiment" on this and the canonical module.

Paul

Hi Paul,

It seems I am having the same issue. The old version 1.4.0 worked fine, but now I get the zero length file and no confirmation page.

I may be out of luck with getting DOM installed - I am on a shared server. Don't think it can be enabled on my backend?


Dave

I'm on a shared server too. Just ask your host.

Hello Paul
Congratulations on your module. Great job.

I found a little problem.
I believe that the module does not support multilingual sites as you can see:

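<url>
<loc>/lang-en/content/1-delivery</loc>
<lastmod>2009-08-13T13:43:15+00:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>
<url>
<loc>http://192.168.x.x/content/1-livraison</loc>
<lastmod>2009-08-13T13:43:15+00:00</lastmod>
<changefreq>monthly</changefreq>
<priority>0.8</priority>
</url>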


Hope you can fix it

UPDATE : The problem occurs only on a Mac OSX / MAMP configuration. Works great on my prod server RedHat/Apache/PHP 5.2.8/MySQL Ver 14.12 Distrib 5.0.67/

hello

great module, well done !

Just one question, maybe a stupid one: why can't we find all the links to the products or the CMS content in the XML file generated by the module? I have 600 products, but I can see only a few of them.

Thanks

@AlexH: hmm, I think the problem has to do with the global server variables. The code removes the first bit of the URL based on the domain name configured in the store. It's a hack, because 1.1 and 1.2 return different results from the link generation functions. It obviously doesn't work on a local server, but that shouldn't be a big issue unless you want your local webserver indexed :)
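As a rough illustration of why that kind of hack breaks on a local install, here is a sketch in Python (not the module's PHP). The helper `strip_shop_domain` and the host names are hypothetical; the point is only that a prefix strip keyed on the configured domain silently does nothing when the generated link uses a different host.

```python
def strip_shop_domain(url, shop_domain):
    """Remove 'http://<shop_domain>' from the front of url, if present.

    Hypothetical stand-in for "removing the first bit based on the
    domain name configured in the store".
    """
    prefix = "http://" + shop_domain
    if url.startswith(prefix):
        return url[len(prefix):]
    # Host does not match the configured domain: the URL passes through as-is.
    return url


# Configured domain matches the generated link: the prefix is removed.
live = strip_shop_domain(
    "http://www.example.com/lang-en/content/1-delivery", "www.example.com"
)

# Local server reached under another host name: nothing is removed.
local = strip_shop_domain(
    "http://localhost/lang-en/content/1-delivery", "www.example.com"
)
```

With mixed results like these, later code that assumes every link is now a bare path would end up with inconsistent entries in the sitemap.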

@fab4_33: Missing CMS pages can be explained: like the original, it looks in the block_cms table to determine which pages are "active". I realise that many people create CMS pages for other uses, so that's something that probably needs changing.

On the products.... the only thing I can think of is that you have products and/or categories marked inactive somehow? If that's not the case, then it's something we'll have to look at!

Paul

For the CMS I have 46 CMS pages, but only 9 are in the sitemap.

Concerning the products, as I said I have around 600 products, a few of them inactive (around 10), but the sitemap shows around 100.

my sitemap

Looking at the changelog, then no, it doesn't look like it. They may have changed something in the core, but there don't seem to be any clues in the changelog.

Looking at the code, there have been changes made that may fix it (they appear to manipulate the $_SERVER['SCRIPT_NAME'] prior to calling Link::getLangageLink() now), but as always you'd have to test it!

The fundamental problem appears to be the method used to re-write the urls. It works on a "real" page by using the current page and server variables. However, when you try to generate them from the back office, these variables don't make "sense" to the Link::getLangageLink() function call (it assumes that these are "valid"). This is why these "constants" have to be manipulated. I would have liked to see the Link class having the tools required to cleanly generate a "friendly" SEO url based on the link type (e.g. Link::getProductLink() should return a full "friendly" url to the product, optionally including the protocol and domain).
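The kind of helper wished for above might look like the following sketch: a function that builds the friendly URL from explicit arguments only, so it gives the same answer from the front office, the back office or a script. It is Python rather than PHP, the name `cms_url` and its parameters are hypothetical, and the path layout is simply copied from the `/lang-en/content/1-delivery` form seen earlier in the thread.

```python
def cms_url(id_cms, link_rewrite, lang_iso, base_url=""):
    """Build a friendly CMS URL without reading any request globals.

    base_url is optional: pass protocol and domain (and any shop
    subdirectory) to get an absolute URL, or leave it empty for a path.
    """
    path = "/lang-%s/content/%d-%s" % (lang_iso, id_cms, link_rewrite)
    return base_url.rstrip("/") + path


relative = cms_url(1, "delivery", "en")
absolute = cms_url(1, "livraison", "fr", "http://www.example.com/shop/")
```

Because nothing here depends on the current page, there would be no server variables to patch up before calling it.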

They've added the images in there though -- why? I have no idea really, as I wouldn't have thought that this was a particularly good thing for your store, unless you sell your own products. One of the things I do on sites is to block the Google image crawler as it makes it too easy for folks to search for and then link to your images and steal your bandwidth ;)

I think it might be a good idea for me to rename my version of the module, and for my own purposes at least I'll keep maintaining it I think. It would be good to have the ability to add support for other non-official modules at the very least, and add some configuration to customise the behaviour (e.g. allowing you to change the priority settings).

Paul

I have made some fixes to the latest version of the gsitemap module (PS 1.2.1.0) for the language problem.
There is also a problem with image links. To fix that, add the following line to the .htaccess rules in AdminGenerator.php:
$tab['RewriteRule']['content']['^([a-z0-9]+)\-([a-z0-9]+)/([_a-zA-Z0-9-]*)\.jpg$'] = 'img/p/$1-$2-home.jpg [L,E]';
You can replace "home" with the image format you want to index.
(Tested only with friendly URLs enabled)
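To see what that rule does to a path, the same pattern can be tried out in Python's `re` module (a sketch for illustration only; Apache applies the real rule, and the helper name `rewrite_image_path` and the sample path are made up). The first two captured groups are reused in the target, while the descriptive name in the third group is discarded.

```python
import re

# Same pattern as in the rewrite rule above.
IMAGE_RULE = re.compile(r"^([a-z0-9]+)\-([a-z0-9]+)/([_a-zA-Z0-9-]*)\.jpg$")


def rewrite_image_path(path, image_type="home"):
    """Return the rewritten image path, or None if the rule does not match."""
    match = IMAGE_RULE.match(path)
    if match is None:
        return None
    # Groups 1 and 2 become the file name; group 3 is dropped.
    return "img/p/%s-%s-%s.jpg" % (match.group(1), match.group(2), image_type)


rewritten = rewrite_image_path("12-34/blue-ipod.jpg")
```

Passing a different `image_type` mirrors replacing "home" in the rule with another image format.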

gsitemap.php

Grif, I think you've confirmed that the 1.2.1.0 sitemap still doesn't work, but now we've added images that don't work either :lol:

I don't think adding the images to the sitemap helps us a great deal to be honest -- see my previous post.

Paul

Right, it appears that the 1.2.1.0 update broke my version again, so attached is 1.4.4 (I'll update the first post in the thread I think with the latest version, and remove all the old ones to prevent confusion).

This one should work with or without "friendly" urls in both 1.2.0.x and 1.2.1.x

I've also added the urls for the images, I guess you can always prevent them being crawled with robots.txt should you choose to -- although it will cause errors/warnings in webmaster tools if you do.
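For anyone who does want to keep the image crawler out, a rule set along these lines could be checked before deploying it. The sketch uses Python's standard `urllib.robotparser`; the `/img/` path is an assumption based on the `img/p/` location mentioned earlier in the thread, and `can_crawl` is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block only Google's image crawler
# from the assumed image directory.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /img/
"""


def can_crawl(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


image_blocked = not can_crawl(
    "Googlebot-Image", "http://www.example.com/img/p/12-34-home.jpg"
)
page_allowed = can_crawl(
    "Googlebot", "http://www.example.com/lang-en/content/1-delivery"
)
```

Blocking URLs that the sitemap itself lists is what leads to the errors/warnings in webmaster tools mentioned above, so the two would need to be kept in step.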

Paul

gsitemap_1_4_4.zip

  • 3 weeks later...
  • 2 weeks later...

Thanks Paul!

I just discovered that the sitemap was not good after upgrading to 1.2.4.
Just installed it and it works like a charm.

Just one thing I was wondering about: now you get domain.com/lang-fr, for example.
Could lang-fr be changed to yourkeywordofchoice-fr, to get better search results on a language / keyword you want?

Marco

  • 2 months later...
  • 1 month later...

Hi all... I'm getting this error with this module.

Unknown column 'ct.ecs_lastmodified' in 'field list'


SELECT DISTINCT b.id_cms, cl.link_rewrite, cl.id_lang, Unix_Timestamp(ct.ecs_lastmodified) as lastmod
FROM ps_block_cms b
LEFT JOIN ps_cms_lang cl ON (b.id_cms = cl.id_cms)
LEFT JOIN ps_cms ct ON (b.id_cms = ct.id_cms)
LEFT JOIN ps_lang l ON (cl.id_lang = l.id_lang)
WHERE l.`active` = 1
ORDER BY cl.id_cms, cl.id_lang ASC

Any ideas?

Thanks as always.
Patrick

This topic is now closed to further replies.