Integrate external content in Plone

How I integrated content from an external server into my Plone site while keeping the Plone look.

Background

I have some content sitting on another server that I wanted to integrate into the rest of my Plone site. This describes how to do it (probably in completely the wrong way, but it works for me!)...

WARNING: this method fiddles with traversal and is slightly magic. It works for me on Zope 2.5.1 with Plone 1.0beta2 and on Zope 2.7.0 with Plone 2.0.4. Your mileage may vary... In particular, relative links inside your external document may not work correctly. As always, try it on a test Plone instance before putting it on a production site!

What to do

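Put all of these objects in your custom skin folder (the module behind the external method itself lives in the Extensions directory of your Zope instance):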

  • getBody.py - an external method, mounted as getBody:
      import re
      import urllib

      def getBody(self, REQUEST, url):
         # fetch the external page and pass its Content-Type through
         page = urllib.urlopen(url)
         data = page.read()
         contents = page.info().getheader('Content-Type')
         page.close()
         REQUEST.RESPONSE.setHeader('Content-Type', contents)

         # check that it is an html document
         if re.search('<html', data, re.I):
             # find the body contents; [^>]* also matches newlines,
             # so the <body ...> tag may span several lines
             start = re.search('<body[^>]*>', data, re.I)
             end = re.search('</body', data, re.I)
             if start and end:
                 return {'data': data[start.end():end.start()],
                         'Content-Type': contents}
         # not html, or no body tag found: return the data untouched
         return {'data': data, 'Content-Type': contents}
    

This gets the external page and, if that page is an HTML document, strips it down to the contents of its body.
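The stripping step can be tried outside Zope. The following is a minimal sketch, not the how-to's own code: the helper name `extract_body` and the sample page are made up for illustration, and it relies on `re.IGNORECASE` instead of spelling out both letter cases.

```python
import re

# Hypothetical helper for illustration; the how-to's getBody does this inline.
BODY_OPEN = re.compile(r'<body[^>]*>', re.IGNORECASE)
BODY_CLOSE = re.compile(r'</body', re.IGNORECASE)


def extract_body(data):
    """Return the text between <body ...> and </body>, or data unchanged."""
    start = BODY_OPEN.search(data)
    if start is None:
        return data
    # look for the closing tag only after the opening one
    end = BODY_CLOSE.search(data, start.end())
    if end is None:
        return data
    return data[start.end():end.start()]


# Made-up sample inputs
page = '<HTML><head><title>t</title></head><BODY class="x">\n<p>Hello</p>\n</body></HTML>'
body = extract_body(page)
not_html = extract_body('GIF89a...')
```

Anything without a recognisable body (an image, a stylesheet) comes back untouched, which is what lets non-HTML content pass straight through.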

  • external - a python script:
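      request = context.REQUEST
      extpath = request['ExternalPath']
      exturl = context.link + extpath
      obj = context.getBody(context, request, exturl)
      # the header may carry parameters, e.g. 'text/html; charset=utf-8'
      if (obj['Content-Type'] or '').startswith('text/html'):
          # html: wrap the body in the Plone look
          return context.external_view(contents=(obj['data']), extpath=extpath)
      else:
          # images etc.: pass through untouched
          return obj['data']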
    

This script decides whether to put the external page through the page template or to return it untouched (so images come through in one piece).
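The Content-Type test decides everything here, and real servers often send the header with parameters, for example `text/html; charset=utf-8`. A minimal sketch of a check that tolerates them follows; the helper name `is_html` is hypothetical and not part of the how-to.

```python
def is_html(content_type):
    """True when a Content-Type header value denotes an HTML document."""
    if not content_type:
        # header missing: treat the response as non-HTML
        return False
    # drop parameters such as "; charset=utf-8" and compare case-insensitively
    media_type = content_type.split(';')[0].strip().lower()
    return media_type == 'text/html'


checks = [is_html('text/html'),
          is_html('text/html; charset=utf-8'),
          is_html('image/png'),
          is_html(None)]
```

Only the first two values would be sent through external_view; the rest would be returned as raw data.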

  • external_view - a page template:
      <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
                "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
      <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US"
            lang="en-US"
            metal:use-macro="here/main_template/macros/master">
    
      <head metal:use-macro="here/header/macros/html_header">
        <metal:block metal:fill-slot="base">
              <base href="" tal:attributes="href string:${here/absolute_url}/${options/extpath}" />
        </metal:block>
      </head>
    
      <body>
          <div metal:fill-slot="main">
              <div tal:replace="structure options/contents">
                    replaced site contents
              </div>
          </div>
      </body>
      </html>
    

This is the page template that defines the look of the external content. Note the fiddling with the base fill-slot; it ensures that relative links in the external content work properly.
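To see why the base matters, here is a small sketch using the standard library's `urljoin`, which resolves relative references much as a browser does against `<base href>`. The folder URL and paths are made-up examples, not taken from a real site.

```python
try:
    from urllib.parse import urljoin   # Python 3
except ImportError:
    from urlparse import urljoin       # Python 2, as used by Zope 2.x

# Made-up values: the Plone folder URL and the ExternalPath of the request
folder_url = 'http://plone.example.org/about/minutes'
extpath = 'meetings/2004/index.html'

# what the template puts into <base href="...">
base = folder_url + '/' + extpath

# a relative link inside the external page...
resolved = urljoin(base, 'agenda.html')
# ...and one that climbs a directory
parent = urljoin(base, '../2003/index.html')
```

Both results stay below the Plone folder, so following them goes back through the access rule instead of leaving the site.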

  • Next create a new Plone folder to act as the root of the external content

Give the folder a string property named link containing the URL of the external site, e.g. http://somwhere/somefolder/. Keep the trailing slash: the external script simply concatenates link and the remaining path.

  • create a python script in the folder called 'access_rule':
      #use /_SUPPRESS_ACCESSRULE/manage to access this
      request = container.REQUEST

      #The stack of URL path items after here (in reverse order)
      stack = request['TraversalRequestNameStack']

      #make a copy of the remaining path to be used later
      extpath = stack[:]
      extpath.reverse()
      request.set('ExternalPath', '/'.join(extpath))

      #consume the rest of the stack so that external path items are ignored in traversal
      while stack:
          stack.pop()

      #add the path of objects in the title to the stack
      add_path = filter(None, script.title.split('/'))
      add_path.reverse()
      stack.extend(add_path)
    

Set the script title to the path that should be appended during traversal, here /external, so that every request below the folder ends up at the external script.
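What the access rule does to the traversal stack can be followed on a plain list. This is only a sketch under assumptions: the function name `rewrite_stack` and the sample request are invented, and no Zope objects are involved.

```python
def rewrite_stack(stack, title):
    """Mimic the access rule on a plain list; returns the saved external path."""
    # the stack is stored in reverse: the next name to traverse is last
    extpath = stack[:]
    extpath.reverse()
    external_path = '/'.join(extpath)

    # empty the list in place so nothing of the external path is traversed
    del stack[:]

    # push the names from the title, again in reverse order
    add_path = [name for name in title.split('/') if name]
    add_path.reverse()
    stack.extend(add_path)
    return external_path


# Made-up request for <folder>/meetings/2004/index.html
stack = ['index.html', '2004', 'meetings']
saved = rewrite_stack(stack, '/external')
```

After the call the stack holds only `external`, so traversal continues to the external script, while the original path survives in the saved value (ExternalPath in the real script).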

The access rule can be dangerous: a mistake in it can stop the Zope Management Interface from working for this folder.

If you're having problems, use http://url/to/folder/_SUPPRESS_ACCESSRULE/manage to reach the management screens and remove it.

Finally, set the access rule for the folder to access_rule (add a Set Access Rule object in the Zope Management Interface).

Hopefully everything should now be working... have a look at http://umiststudents.com/about/minutes/meetings/ for a demo. (Plone 1)

References

Laurence Rowe ( l at lrowe dot co dot uk )

dealing with the query part of the URL

Posted by Kevin Lacobie at Apr 13, 2005 02:37 PM
The **external** script documented here doesn't deal with the query part of the URL very well, that is, the part after the question mark: http://host/foo?param=value&param2=etc...

A simple modification to fix this is to use the following in the external script:

exturl = context.link + extpath + '?'
for k,v in request.form.items():
  exturl = exturl + k + '=' + v + '&'

OK, I'm bad at quoting ...

Posted by Kevin Lacobie at Apr 13, 2005 02:38 PM
... gotta remember structured markup better. In the code snippet above, quotes should be around the question mark, equal sign and ampersand.
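The loop in the comment above appends raw values, so a value containing '&', '=' or a space would corrupt the query. A hedged alternative is to let the standard library's `urlencode` do the escaping. In this sketch the function name `build_url` and the sample values are made up; it also assumes the form values are plain strings, and since a restricted Python Script may not be allowed to import urllib, the code might have to live in the external method instead.

```python
try:
    from urllib.parse import urlencode   # Python 3
except ImportError:
    from urllib import urlencode         # Python 2, as used by Zope 2.x


def build_url(link, extpath, form):
    """Join the base link, the external path and an encoded query string."""
    url = link + extpath
    if form:
        # sort for a stable order; urlencode escapes '&', '=', spaces, ...
        url = url + '?' + urlencode(sorted(form.items()))
    return url


# Made-up values standing in for context.link, ExternalPath and request.form
with_query = build_url('http://somewhere/somefolder/', 'foo',
                       {'param': 'value', 'param2': 'a b&c'})
without_query = build_url('http://somewhere/somefolder/', 'foo', {})
```

Unlike the loop, this leaves no trailing '&' and adds no '?' when the form is empty.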

Incompatible with URLs containing :

Posted by Susheel Mannepalli at Dec 07, 2005 06:20 AM
This how-to works well with basic URLs that do not contain special characters. It does not work with unquoted special characters such as ':'.
For example: http://somemachine.somedomain:someport/index.php?ulist=a:b:c

The URL above fails because an unquoted ':' is treated as a delimiter/splitter in the Python method.

Any help on this particular issue??

correction

Posted by Susheel Mannepalli at Dec 07, 2005 06:22 AM
instead of de-limiter/splitter, I should have said stack delimiter

How to continue ?

Posted by Benoit Blais at Apr 25, 2006 07:44 PM
First of all, excellent how-to!
I can now get external content through my Plone site.
My problem is the following: I'd like to place that content inside my actual Plone template, not in an empty page...
Could someone help me out and point me where to look? I wonder why I'm so confused, maybe too many hours in front of that silly computer... Any help will be appreciated.

Productized

Posted by Shane Graber at Sep 16, 2006 02:27 PM
FYI someone has made this into a Plone product: http://www.agoric.com/products/externalcontent/

Shane

Doesn't work in Plone 2.5.1

Posted by Jacob Nordfalk at Dec 08, 2006 11:29 AM
I tried installing http://www.agoric.com/products/externalcontent/ on Plone 2.5.1 on Linux but it doesn't work. No product is ever installed. It seems the product was made on Windows, because some file names are wrongly uppercase (I tried renaming __INIT__.PY to __init__.py etc. but still no luck).

some files missing

Posted by Patrick Pollet at Dec 16, 2006 05:28 PM
This will not work :

Looking at Extensions/Install.py

<pre>
from Products.ExternalContent import PROJECTNAME,product_globals
</pre>

These 2 files are missing from the product, so no surprise: it won't install...

Please fix.

getting close but ...

Posted by Patrick Pollet at Jan 04, 2007 05:22 PM
After renaming all files to lowercase, the product still does not install, due to an import error (import pre instead of import re) in ExternalContent.py.

Now it does install, but adding an external content gives the following traceback:

<pre>
Traceback (innermost last):
  Module ZPublisher.Publish, line 115, in publish
  Module ZPublisher.mapply, line 88, in mapply
  Module ZPublisher.Publish, line 41, in call_object
  Module Products.CMFCore.FSPythonScript, line 108, in __call__
  Module Shared.DC.Scripts.Bindings, line 311, in __call__
  Module Shared.DC.Scripts.Bindings, line 348, in _bindAndExec
  Module Products.CMFCore.FSPythonScript, line 164, in _exec
  Module None, line 12, in external2
   -
   - Line 12
  Module Products.ExternalContent.ExternalContent, line 78, in extractBody
  Module urllib, line 82, in urlopen
  Module urllib, line 190, in open
  Module urllib, line 313, in open_http
  Module httplib, line 798, in endheaders
  Module httplib, line 679, in _send_output
  Module httplib, line 646, in send
  Module httplib, line 614, in connect
IOError: [Errno socket error] (-2, Name or service not known)

</pre>

patched version available

Posted by Patrick Pollet at Jan 05, 2007 09:46 AM
A modified version of the product is available here :

http://cipcnet.insa-lyon.fr/[…]/ExternalContent_0.3.2_PP.tgz

It installs OK via "quick_install" under Plone 2.5.1, Zope 2.9.5 final on Linux.

See CHANGES file.

thanks for the patch

Posted by John DeStefano at Jan 16, 2007 06:23 PM
Patrick, thank you for patching the public version and for making your version available.

How extensively have you played with this product, and how well does it serve your purposes? I have been experimenting with the windowZ product, which essentially does the same thing but has some limitations, specifically with cookie transfers and proxying. From the looks of "Planned Enhancements" on the ExternalContent page, I think they may both suffer currently from the same limitations.

a very basic patch indeed

Posted by Patrick Pollet at Jan 17, 2007 11:57 AM
Hello,
   Publishing it was the minimum I could do. This is what Open Source software is all about ;-)

   I did not play much with it. My purpose was only to quickly include a few dozen PHP scripts in my Plone site without rewriting them as TAL templates & Python. I do not use any cookies or proxies. For me this product is simply a quick & dirty fix, and it will disappear from my site once all the PHP scripts have been rewritten.

   I peeked at the code of windowZ, which seems to me "more promising" than ExternalContent. At least the IFrame it uses will include the whole external site, and all links will be properly followed within this IFrame. With ExternalContent, any link in the included page will overwrite your Plone skin. Try to include www.google.com... The query page is indeed within your Plone site, but the search results are 100% Google, and your Plone site is gone...
    
   My two cents...

Cheers.