Add a Google Sitemap

The Sitemap Protocol allows you to inform search engine crawlers about URLs on your Web sites that are available for crawling. The XML Sitemap Format provides a list of URLs and includes additional information about those URLs.

Create a Page Template (Content-Type: text/xml) in your custom folder. Name it something like sitemap.xml or google-sitemaps.xml (Google seems to require the .xml extension).

Paste the code:

 <?xml version="1.0" encoding="UTF-8"?>
 <urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
         xmlns:tal="http://xml.zope.org/namespaces/tal"
         xmlns:metal="http://xml.zope.org/namespaces/metal"
         tal:define="results python:container.portal_catalog(
                              portal_type=['News Item', 'Document', 'Topic'],
                              review_state=['published'],
                              sort_on='modified', sort_order='reverse');
                     dummy python:request.RESPONSE.setHeader('Content-Type', 'text/xml;; charset=UTF-8');"
         tal:condition="results"
         tal:on-error="nothing">
   <url tal:repeat="result results" tal:on-error="nothing">
     <loc tal:content="result/getURL">http://www.yoursite.com/</loc>
     <lastmod tal:content="python: DateTime(result.modified).HTML4()">2005-01-01</lastmod>
   </url>
 </urlset>

into it.

This Page Template lists all your published content of the types News Item, Document and Topic. Add any other content types or workflow states you want listed in the sitemap.
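To make the output format concrete, here is a minimal standalone Python sketch that builds the same kind of document from a plain list of (URL, last-modified) pairs. It is only an illustration of the XML structure, not part of the Plone template; the function name build_sitemap and the sample URL are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace used by the page template above (Google Sitemap 0.84).
SITEMAP_NS = "http://www.google.com/schemas/sitemap/0.84"


def build_sitemap(entries):
    """Return sitemap XML as UTF-8 bytes for (url, lastmod) pairs.

    Hypothetical helper for illustration; lastmod is expected to be
    an already formatted date string.
    """
    # Serialize the sitemap namespace as the default namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element("{%s}urlset" % SITEMAP_NS)
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "{%s}url" % SITEMAP_NS)
        ET.SubElement(node, "{%s}loc" % SITEMAP_NS).text = url
        ET.SubElement(node, "{%s}lastmod" % SITEMAP_NS).text = lastmod
    # Prepend the declaration by hand so the encoding is always stated.
    declaration = b'<?xml version="1.0" encoding="UTF-8"?>\n'
    return declaration + ET.tostring(urlset, encoding="utf-8")


if __name__ == "__main__":
    sample = [("http://www.example.com/front-page", "2005-06-07T10:15:00Z")]
    print(build_sitemap(sample).decode("utf-8"))
```

Each entry becomes one url element with a loc and a lastmod child, which is the same shape the tal:repeat loop produces.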

If you are using EasyRating you may want to add a priority tag:

 <priority tal:condition="result/amount_of_ratings | nothing" tal:content="python:result.average_rating/5.0">0.8</priority>

assuming that your ratings range from 0 to 5 (the default values). If you define other values, you have to change the tal:content accordingly.
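The scaling behind the priority tag can be sketched in plain Python. The Sitemap protocol only allows priority values from 0.0 to 1.0, so the sketch divides by the maximum rating and clamps the result. The function name rating_to_priority and its max_rating parameter are assumptions for this example and are not part of EasyRating.

```python
def rating_to_priority(average_rating, max_rating=5.0):
    """Scale a rating in [0, max_rating] to a priority string in [0.0, 1.0].

    Hypothetical helper; max_rating defaults to 5, the default upper
    bound mentioned above.
    """
    if max_rating <= 0:
        raise ValueError("max_rating must be positive")
    # Force float division so integer ratings are not truncated.
    priority = float(average_rating) / float(max_rating)
    # Clamp, because the protocol only accepts 0.0 to 1.0.
    priority = max(0.0, min(1.0, priority))
    return "%.1f" % priority


if __name__ == "__main__":
    print(rating_to_priority(4))      # rating 4 of 5
    print(rating_to_priority(5, 10))  # rating 5 of 10
```

If your ratings use another range, only max_rating changes; the same idea applies to the divisor in the tal:content expression.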

You can specify the location of the Sitemap using a robots.txt file. To do this, simply add the following line:

Sitemap: <sitemap_location>

The <sitemap_location> should be the complete URL to the Sitemap, such as: http://www.example.com/google-sitemaps.xml. This directive is independent of the user-agent line, so it doesn't matter where you place it in your file.
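To check that your robots.txt announces the sitemap as intended, a small Python sketch like the following can list all Sitemap lines in a robots.txt text. It is a simplified reader written for this example (the function name find_sitemaps and the sample file are assumptions), not a full robots.txt parser.

```python
def find_sitemaps(robots_txt):
    """Return the URLs of all 'Sitemap:' lines in a robots.txt text."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # The field name is matched case-insensitively, wherever it appears.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Sample file content, made up for this example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Sitemap: http://www.example.com/google-sitemaps.xml
"""

if __name__ == "__main__":
    print(find_sitemaps(ROBOTS_TXT))
```

The lookup ignores the User-agent groups entirely, which mirrors the point above that the position of the Sitemap line in the file does not matter.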

You can add your sitemap to Google Sitemaps at http://www.google.com/webmasters/sitemaps/

Learn more about sitemaps at sitemaps.org

DateTime

Posted by Christian Ledermann at Jun 07, 2005 10:15 AM
The date format must be .HTML4(), not .ISO().
Corrected in the code.
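For reference, the format wanted in lastmod is the W3C date-time form, for example 2005-06-07T10:15:00Z. As far as I know that is what DateTime's HTML4() returns (UTC, with a literal T and Z). The following standalone Python sketch produces the same shape with the standard library; the function name w3c_lastmod is an assumption for this example.

```python
from datetime import datetime, timedelta, timezone


def w3c_lastmod(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSZ in UTC."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # 12:15 at UTC+2 is 10:15 UTC.
    local = datetime(2005, 6, 7, 12, 15, tzinfo=timezone(timedelta(hours=2)))
    print(w3c_lastmod(local))
```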

google refuses my sitemap

Posted by Christof Haemmerle at Jun 15, 2005 11:54 PM
i tried this with the code above and with the google sitemap product.
i always get the error message:

We couldn't find the Sitemap at the location you provided. Please make sure the Sitemap URL is correct and resubmit your Sitemap.

the url is: http://www.buero-newyork.com/google-sitemaps
when i open this with firefox everything seems to be ok.

thanx reco

re: google refuses my sitemap

Posted by Christian Ledermann at Jun 16, 2005 11:02 AM
That's a question to ask the Google guys. The sitemap at the above URL is working.

weird but thanx reco

Posted by Christof Haemmerle at Jun 16, 2005 02:35 PM
weird but thanx reco

encoding seems to be corrupted

Posted by Christof Haemmerle at Jun 21, 2005 09:27 PM
When I download my sitemap, save it as text/xml and open it in my external editor, it shows a different encoding. As soon as I change it to UTF-8, upload the sitemap to the Plone server and let Google fetch the static XML file, everything works.

Is the output of pgsm really UTF-8?

thanx reco

now it worked

Posted by Christof Haemmerle at Jun 21, 2005 09:34 PM
1. in the first line the UTF-8 encoding is missing:
<?xml version="1.0" encoding="UTF-8"?>

2. in my case the object name has to end with xml. so i simply renamed the google-sitemaps pagetemplate to google-sitemaps.xml

utf-8

Posted by Christian Ledermann at Jun 22, 2005 10:32 AM
The sitemap only returns the ids (absolute_url), which are (AFAIK) in ASCII (< 128), and the date (also all chars < 128). So the encoding should work anyway. Yes, I know some encoding should be added to be on the safe side, but actually I do not think this is necessary.
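If you want to verify what your server actually delivers, a rough standalone Python check on the downloaded bytes could look like this. It only tests three things discussed in this thread (valid UTF-8, presence of the XML declaration, well-formed XML); the function name check_sitemap_bytes is an assumption for this example.

```python
import xml.etree.ElementTree as ET


def check_sitemap_bytes(data):
    """Return a list of problems found in raw sitemap bytes (empty if none)."""
    problems = []
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        problems.append("not valid UTF-8: %s" % exc)
    # The thread above reports trouble when the declaration was missing.
    if not data.lstrip().startswith(b"<?xml"):
        problems.append("missing XML declaration")
    try:
        ET.fromstring(data)
    except ET.ParseError as exc:
        problems.append("not well-formed XML: %s" % exc)
    return problems


if __name__ == "__main__":
    good = (b'<?xml version="1.0" encoding="UTF-8"?>'
            b'<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">'
            b'<url><loc>http://www.example.com/</loc></url></urlset>')
    print(check_sitemap_bytes(good))
```

Pass it the bytes exactly as downloaded, before any editor has had a chance to re-encode them.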

Space before Document

Posted by constantin at Jul 15, 2005 11:58 AM
In line 6 of the code above there is a small but annoying typo: an extra space in ' Document'. This prevents items of type Document from being found. Replace the string with 'Document' and everything is OK.

Another warning: By default the index_html page of a Plone site is not published. Lots of people forget it, because in the default setting it is visible and thus will be shown when the site is visited. If you want the index_html to be included in the Google sitemap, publish it.

fixed the typo

Posted by Christian Ledermann at Jul 15, 2005 05:14 PM
Thanks for your input.

Sitemap does not seem to show any data

Posted by Blair Lowe at Sep 27, 2006 08:29 PM
Hi,

I am running Plone 2.0.3. I followed the instructions to a tee, and still have no data as shown in the sample sitemap. If I go to my sitemap, it just shows:

<urlset>
<url>
      
  </url>
<url>
      
  </url>
</urlset>

Has anyone got this working with Plone2-2.0.3 and apache 2.0.52 on Linux (centos in this case)? Here is my code:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
       xmlns:tal="http://xml.zope.org/namespaces/tal"
       xmlns:metal="http://xml.zope.org/namespaces/metal"
        tal:define="results python:container.portal_catalog(
                            portal_type = ['News Item','Document', 'Topic'],
                            review_state=['published'],
                            sort_on='modified' ,sort_order='reverse');
                   dummy python:request.RESPONSE.setHeader('Content-Type', 'text/xml');
                   dummy2 python:request.RESPONSE.setHeader('charset', 'UTF-8');"
        tal:condition="results"
        tal:on-error="nothing" >
  <url tal:repeat="result results">
      <tal:block tal:define="resultObject result/getObject;"
                 tal:on-error="nothing">
           <loc tal:content="resultObject/absolute_url">http://www.sleepees.net/</loc>
           <lastmod tal:content="python: DateTime(resultObject.modified()).HTML4()">2006-09-12</lastmod>
      </tal:block>
  </url>
</urlset>

RE:Sitemap does not seem to show any data

Posted by Christian Ledermann at Sep 28, 2006 08:10 AM
Is your content published? Delete the line "review_state=['published']," and try again.

Yes, this is running on Plone 2.0.3 (that was the version at the time this howto was written) with Apache and Linux.

RE:Sitemap does not seem to show any data

Posted by Blair Lowe at Oct 03, 2006 04:24 PM
Yes it was published.

I removed that line and get the same dead result.

What I really need is an upgrade to 2.05, but I cannot find a .src.rpm file for centOS (rhel4) to do that.

RE:Sitemap does not seem to show any data

Posted by Christian Ledermann at Oct 04, 2006 06:21 AM
When you comment out the tal:on-error statements, does it throw an exception?
I think there might be something wrong with your catalog.

RE:Sitemap does not seem to show any data

Posted by Blair Lowe at Oct 05, 2006 04:04 PM
No, just <urlset/>

Error Message

Posted by Billye Joyce Roberts at Oct 22, 2006 04:27 AM
I cut and pasted the code with the only change being adding my site's URL and this is the error message I got:

Compilation failed
xml.parsers.expat.ExpatError: unbound prefix: line 3, column 4

I apologize if this is an easy fix, but would appreciate any help as I have no idea what this means.

Thanks.
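For what it is worth, "unbound prefix" is the XML parser's way of saying that a prefix such as tal: is used without a matching xmlns declaration. One plausible cause (an assumption, since the pasted template is not shown) is that the xmlns:tal or xmlns:metal line got lost while pasting. The standalone Python sketch below reproduces the message with the standard library parser; the sample strings and the helper name parse_error are made up for this example.

```python
import xml.etree.ElementTree as ET

# Uses the tal: prefix without declaring it.
WITHOUT_DECLARATION = '<urlset><url tal:repeat="result results"/></urlset>'

# Same markup with the namespace declared on the root element.
WITH_DECLARATION = (
    '<urlset xmlns:tal="http://xml.zope.org/namespaces/tal">'
    '<url tal:repeat="result results"/></urlset>'
)


def parse_error(text):
    """Return the parser's error message, or None if text is well-formed."""
    try:
        ET.fromstring(text)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(parse_error(WITHOUT_DECLARATION))
    print(parse_error(WITH_DECLARATION))
```

The line and column in the message point at the element where the undeclared prefix is first used, so that is where to compare your template with the original.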

Rework URL?

Posted by arboundy at Nov 15, 2006 12:26 PM
For internal reasons I reverse proxy from another server (for "external" requests) and keep the site internally on an IP address, thus my results are:

<loc>http://10.1.1.3/site/front-page</loc>

Is there a way within the code to rework the result to:

<loc>http://www.example.com/site/front-page</loc>

I'm thinking of this line:

<loc tal:content="result/getURL">http://www.arboundy.com/</loc>

Cheers
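The rewrite being asked for amounts to swapping scheme and host while keeping the rest of the URL. Here is a minimal standalone Python sketch of that idea; the function name rewrite_host and its defaults are assumptions for this example, and it is not tested inside a page template, where imports are restricted. In Zope setups this is often solved instead through the virtual hosting configuration of the proxy, so that getURL already returns the public name.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_host(url, public_host="www.example.com", scheme="http"):
    """Replace scheme and host of url, keeping path, query and fragment.

    Hypothetical helper; public_host is the externally visible name.
    """
    parts = urlsplit(url)
    return urlunsplit((scheme, public_host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(rewrite_host("http://10.1.1.3/site/front-page"))
```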

Show also public draft pages

Posted by Marilen Corciovei at Dec 05, 2006 06:33 AM
To also show public draft pages in your sitemap, just change the review_state parameter to:
...
review_state=['published','visible'],
...

XML Page Cannot Be Displayed

Posted by Andrew Gould at May 14, 2007 02:00 AM
I've added the sitemap.xml file to portal_skins/custom but I get an error message when trying to view it on the site: "The XML page cannot be displayed. Cannot view XML input using XSL style sheet. Please correct the error and then click the Refresh button, or try again later. The system cannot locate the object specified." The file is a direct c/p from the above, with no modifications. The "test" tab in the ZMI displays the sitemap correctly.

sitemap

Posted by Mike Johnston at Sep 14, 2008 02:00 PM
I tried making this work but had no luck on my site: http://cmscritic.com

Works with Plone 2.5 still

Posted by Julian Robbins at May 26, 2009 05:09 PM
Still works fine for Plone 2.5 if you still need it ...