Configure and Run ContentMirror

by John Samuel Anderson last modified Jul 02, 2010 01:50 PM
Connect ContentMirror to your database and create table structures for your Plone content types.

Install ContentMirror 

Full installation details are beyond the scope of this document; the ContentMirror project's Installation page (http://code.google.com/p/contentmirror/wiki/Installation) covers them.

For my buildout-based installation, I added ContentMirror to the "productdistros" section, like this:

[productdistros]
urls +=
    http://contentmirror.googlecode.com/files/ContentMirror-0-4-1.tgz

Then, I re-ran buildout.

Configure ContentMirror

  1. Find the settings-example.zcml.  For buildout-based installations, it is located at parts/productdistros/ContentMirror/settings-example.zcml.  For non-buildout installations, look in Products/ContentMirror/settings-example.zcml.
  2. Copy settings-example.zcml to settings.zcml.
  3. Edit settings.zcml to point to the database:
    <!-- setup a database connection -->
    <db:engine url="postgres://localhost/mydatabase"
               name="mirror-db"
               echo="True"/>
    Be sure to specify the correct username and password in this file, too; they normally go in the URL itself, in the form postgres://username:password@localhost/mydatabase. See the "Configuration" section here: http://code.google.com/p/contentmirror/wiki/Installation
  4. Generate the SQL statements for the tables:
    ddl.py postgres > somestuff.sql
  5. Run that SQL against your database to create the ContentMirror tables:
    psql mydatabase < somestuff.sql
  6. Run bulk.py to export existing Plone content into the database.  (NOTE: bulk.py is buried somewhere inside the eggs, so you may have to search for it.)
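
Since bulk.py is hard to spot by hand, a small search script can help. This is only a sketch: the function name find_script and the "contentmirror" path hint are my own illustrative choices, not part of ContentMirror, and it assumes the script sits somewhere under a directory whose path mentions ContentMirror.

```python
import os


def find_script(root, filename="bulk.py", package_hint="contentmirror"):
    """Return sorted paths below root that end in filename.

    Only directories whose path (relative to root) contains package_hint,
    compared case-insensitively, are considered. Both defaults are
    assumptions made for this sketch.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        relative = os.path.relpath(dirpath, root).lower()
        if filename in filenames and package_hint in relative:
            matches.append(os.path.join(dirpath, filename))
    return sorted(matches)


if __name__ == "__main__":
    # "eggs" is the usual buildout eggs directory; adjust if yours differs.
    for path in find_script("eggs"):
        print(path)
```

Run it from the buildout root. If your buildout shares an eggs directory stored elsewhere, pass that directory to find_script() instead.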

Hack ContentMirror code

Everything worked fine the first time I tried this. But the second time, I ran into a couple of roadblocks. They might not be true bugs, in the sense that the code might work fine on normal data. But my database is full of bad, corrupted data, so I had to make a couple of hacks to get this to work.

Hack #1 - Run as Admin User

ContentMirror is designed to mirror only content that has been Published and is viewable by Anonymous users. I needed to mirror ALL content, regardless of its workflow state. In bulk.py, I added three lines in main():

def main( app ):
    if not len(sys.argv) == 2:
        print "mirror-batch portal_path"
        sys.exit(1)

    # Hack: run as admin user - 06/30/2010
    # (assumes a user with id "admin" exists in the root acl_users)
    from AccessControl.SecurityManagement import newSecurityManager
    admin = app.acl_users.getUserById("admin")
    newSecurityManager(None, admin)

Incidentally, ContentMirror does mirror the workflow state to the external database, so I am able to see which content is Public, Private, etc. after export.
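
The admin-user hack can be made a little more defensive. The sketch below is not ContentMirror code: become_user and its new_security_manager parameter are hypothetical names I introduce so the logic can be read (and run) without Zope. It fails loudly when the user id does not exist, and wraps the user in its user folder with __of__, a common idiom in Zope scripts that my original hack did not use.

```python
def become_user(app, user_id, new_security_manager):
    """Switch the security context to user_id and return the user.

    new_security_manager is passed in so this sketch runs without Zope;
    in a real script it would be
    AccessControl.SecurityManagement.newSecurityManager.
    """
    acl_users = app.acl_users
    user = acl_users.getUserById(user_id)
    if user is None:
        # getUserById returns None for unknown ids; stop here rather than
        # installing a security manager for a missing user.
        raise LookupError("no user %r in the root acl_users" % user_id)
    # Wrap the user in its user folder when the object supports it.
    if hasattr(user, "__of__"):
        user = user.__of__(acl_users)
    new_security_manager(None, user)
    return user
```

Inside main() this would be called as become_user(app, "admin", newSecurityManager), after importing newSecurityManager as in the hack above.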

Hack #2 - Fix ReferenceTransform for bad data

I also had to hack transform.py, so that ContentMirror would keep going, despite the bad data in my database. In class ReferenceTransform, I added four lines to the copy() method:

    def copy( self, instance, peer ):
        value = self.context.getAccessor( instance )()

        if not value:
            return

        if not isinstance( value, (list, tuple)):
            value = [ value ]

        #Hack - remove None values.
        goods = [x for x in value if x is not None]
        if len(goods) < 1:
            return
        value = goods

I also compared this with the latest source code (version 0.6.0rc2) and found that it still has the same problem. So, in true open-source spirit, I submitted an issue to the ContentMirror team.
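
The clean-up step from Hack #2 can also be tried in isolation. This is a minimal sketch with a hypothetical helper name, normalize_references; it mirrors the checks in my patched copy() method but returns an empty list where copy() simply returns early.

```python
def normalize_references(value):
    """Return reference targets as a list with None entries removed."""
    if not value:
        # Nothing stored in the field at all.
        return []
    if not isinstance(value, (list, tuple)):
        # A single reference: treat it as a one-item list.
        value = [value]
    # Drop dangling references, which show up as None.
    return [item for item in value if item is not None]
```

For example, normalize_references(("a", None, "b")) keeps only "a" and "b", and a field holding nothing but None values comes back as an empty list, which is the case where the patched copy() returns without doing anything.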