Building communal reference lists

by Frank Bennett last modified Dec 30, 2008 03:03 PM
Provide a fast facility for obtaining a union of lists referring to a particular item, using a keyword index in Plone's portal catalog

This how-to describes an indexing trick that I used in building a site I maintain at the university where I work. It struck me as clever and original when I dreamed it up, but as I am not a trained computer scientist, it is almost certainly neither. I offer this description here so that those who recognize it can have a good snicker over my layman's computing vocabulary, and so that those who don't can safely pretend to have known about it all along.

To follow this how-to, you will want to be familiar with the content of A. McKay, The Definitive Guide to Plone (2005), or at least have a copy close to hand for reference.

The Model

The original context for this was a bibliography system built for graduate students in the humanities faculty in which I am an instructor. I wanted to offer a community-driven "related items" feature, so that a student clicking on a link next to a citation (say, for example, V. Brannigan and R. Dayhoff, Liability for Personal Injuries Caused by Defective Medical Computer Programs, 7 Am. J. L. and Med. 123 (1981)) would be given a list of possibly related resources to explore. Many of you out there will recognize this as something similar to "Amazon reading suggestions", but at the time, I had in mind another, more spartan communal site that does this and nothing else:

http://www.whatshouldireadnext.com/

Explore the site or not as you like (I have no idea who runs it, and I certainly have no commercial stake in it; I just find it interesting). But it does something like this. A member starts by registering a list of favorite readings. He or she can then call for a list of reading suggestions. The suggestions are drawn from a union of all reading lists that share some number of items with the requesting member's own list, to provide a loose form of guidance to the reading habits of site members.

Because I am very close to bone idle, my first thought was to find a way to implement something like this in Plone, as it is one of the few computer systems that I know anything about (...and as to why, ignorant as I am, I didn't hire someone to do this, my colleagues in the Information Science faculty will be happy to confirm that I don't have a clue how to write a project specification).

The Design

To stop beating around the bush, it turned out that this can be done quite simply using keyword indexes in the portal catalog. In finger-painting, the overall design of the implementation looks like this:

[Diagram not reproduced here. As described in the text that follows, it shows "list" objects on the right holding Reading List references to "item" objects, with each item's readingLists index entry called out on the far left.]

The gray bits are Archetypes content objects. The "items" are content objects that display some kind of notionally useful information. The "list" objects are just simple objects with a title and maybe a description, that serve as the source object for references; through a skin template or a view, these will display a formatted list of summary information from the targets. If you provide some means for users to create references (say, ATReferenceBrowserWidget), you're in business on the right side of the diagram. Note that the references are flagged with a specific relationship, which is set to Reading List in this example.

Looking at the sample data written into the diagram, our hypothetical "related items" link for Item 1 should return Items 1 & 2, the link for Item 4 should return Items 2 & 4, and that for Item 2 should return Items 1, 2 & 4, because that is the union of List A and List B, the lists in which Item 2 occurs.

The red callouts on the far left side of the diagram show the content of a keyword index associated with each Item, which we've called readingLists. An entry in a keyword index is a simple Python list; in this case, each entry contains a list of the UIDs of the List objects that hold a reference to the Item being indexed.

For the moment, don't worry about how we're going to keep all of this information current; let's just assume that it exists in this form. If you are quicker on the uptake than I am, you will have noticed that if you take the value of readingLists for any entry, and drop it back into the same index as a search term, it will give you back exactly the union list that we are interested in. When a list is used as a search term against a keyword index, each item in the list is applied independently as a search atom (in other words, the list items are connected by OR). In the case of Item 1, this will match Item 1 and Item 2, in the case of Item 4, this will match Item 2 and Item 4, and in the case of Item 2, catalog brains for Item 1, Item 2, and Item 4 will be returned.
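
For those who would rather see the trick run than take it on trust, the lookup just described can be modelled in a few lines of plain Python. This is only a sketch of the idea, not Plone code: the dictionary stands in for the keyword index, the UIDs are invented, and I have assumed that List A holds Items 1 and 2 while List B holds Items 2 and 4, which is what the expected results above imply. The names search_keyword_index and related_items are my own.

```python
# Plain-Python model of the readingLists keyword index (not Plone code).
# The UIDs and the List A / List B membership are assumptions chosen to
# match the expected results described in the text.

# Each indexed Item maps to the UIDs of the Lists that reference it.
READING_LISTS_INDEX = {
    'Item 1': ['uid-list-a'],
    'Item 2': ['uid-list-a', 'uid-list-b'],
    'Item 4': ['uid-list-b'],
}


def search_keyword_index(index, keywords):
    """Return items whose entry shares at least one keyword (OR search)."""
    wanted = set(keywords)
    return sorted(item for item, entry in index.items()
                  if wanted.intersection(entry))


def related_items(index, item):
    """Feed an item's own index entry back into the index as the query."""
    return search_keyword_index(index, index[item])


if __name__ == '__main__':
    for name in sorted(READING_LISTS_INDEX):
        print('%s -> %s' % (name, related_items(READING_LISTS_INDEX, name)))
```

Note that an item always turns up in its own result set, so a real template would probably want to filter it out before display.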

So that's all there is to it, as far as the design is concerned.

The Code

If you are like me, nothing looks easy until you see code examples, so here are a few. The details will depend on your local context, so this will be limited to the bare bones of the necessary code.

Creating indexes

As this is a fairly complex setup, you will want to deploy it through a product that you create for that purpose. Through the QuickInstaller, you can set up a keyword index for, say, readingLists using something like this in the install() method of the Install.py file of the product (AppInstall.py if you are working from ArchGenXML output): [1]

    from Products.CMFPlone.utils import getToolByName
    pcat = getToolByName(self, 'portal_catalog')

    # Add the index to the catalog, unless it is already there
    if 'readingLists' not in pcat.indexes():
        pcat.manage_addIndex('readingLists', 'KeywordIndex')

    # Include this attribute in the catalog metadata as well
    if 'readingLists' not in pcat.schema():
        pcat.addColumn('readingLists')

In this case, you will also want to include something like the following in the uninstall() method of the same file:

    from Products.CMFPlone.utils import getToolByName
    pcat = getToolByName(self, 'portal_catalog')

    # Remove the index, if it is present
    if 'readingLists' in pcat.indexes():
        pcat.manage_delIndex('readingLists')

    # Remove the metadata column, if it is present
    if 'readingLists' in pcat.schema():
        pcat.delColumn('readingLists')

Installing the product will now cause the index to be created and initialized; uninstalling the product will eliminate it from the catalog. So far, so good.

Calculating index values

You will also want to set up a method that provides values for your new index. The method needs to be run against every object targeted by the Reading List references. This can be done using the index method registry, available in Plone versions 2.1 and higher. The method to register might look something like this:

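def _spitoutReadingListAttribute(obj, portal, vars):
    # Look up every 'Reading List' reference that points at this object
    uid = obj.UID()
    res = portal.reference_catalog(targetUID=uid,
                                   relationship='Reading List')
    # Each brain's sourceUID is the UID of a List that holds the reference
    return [x.sourceUID for x in res if x is not None]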

The index method registry is an advanced feature not covered by Andy McKay's Definitive Guide to Plone, but its use is described in a how-to, which explains how to hook this up.
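
If you would like to poke at the logic of the index method without a running Plone site, something like the following will do. It is a sketch only: FakeBrain, FakeReferenceCatalog, FakePortal and FakeItem are hypothetical stand-ins that I have made up to imitate just enough of the reference catalog for the method to run, and reading_list_uids is simply the method above under a shorter name.

```python
# Stand-ins for Plone objects, so the index method can be tried in isolation.
# None of these classes exist in Plone; they are hypothetical test doubles.

class FakeBrain(object):
    """Imitates a reference_catalog brain with the attributes we need."""
    def __init__(self, sourceUID, targetUID, relationship):
        self.sourceUID = sourceUID
        self.targetUID = targetUID
        self.relationship = relationship


class FakeReferenceCatalog(object):
    """Callable that filters brains by targetUID and relationship."""
    def __init__(self, brains):
        self._brains = brains

    def __call__(self, targetUID=None, relationship=None):
        return [b for b in self._brains
                if b.targetUID == targetUID
                and b.relationship == relationship]


class FakePortal(object):
    def __init__(self, reference_catalog):
        self.reference_catalog = reference_catalog


class FakeItem(object):
    def __init__(self, uid):
        self._uid = uid

    def UID(self):
        return self._uid


def reading_list_uids(obj, portal, vars=None):
    """Same logic as the index method shown above."""
    res = portal.reference_catalog(targetUID=obj.UID(),
                                   relationship='Reading List')
    return [x.sourceUID for x in res if x is not None]


PORTAL = FakePortal(FakeReferenceCatalog([
    FakeBrain('uid-list-a', 'uid-item-2', 'Reading List'),
    FakeBrain('uid-list-b', 'uid-item-2', 'Reading List'),
    FakeBrain('uid-list-b', 'uid-item-4', 'Reading List'),
    FakeBrain('uid-list-c', 'uid-item-2', 'Some Other Relationship'),
]))

if __name__ == '__main__':
    print(reading_list_uids(FakeItem('uid-item-2'), PORTAL))
```

The point to notice is that references carrying any other relationship are ignored, so the index only ever reflects reading lists.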

Managing references

The simplest way to provide a user interface for creating references is to include a ReferenceField in the Archetypes schema for the List content type. An example is given in the Archetypes chapter of McKay's Definitive Guide to Plone (see link above). You can also add and delete references programmatically, through a skin script or an object method, using the addReference() and deleteReference() methods. These are available against any Archetypes content object. Each does what its name suggests, and is documented at:

Products/Archetypes/interfaces/_referenceable.py

Searching

The code to perform a "related items" search will look something like this (assuming that the target object is available as context):

from Products.CMFPlone.utils import getToolByName
pcat = getToolByName(context,'portal_catalog')
res = pcat(readingLists=context.readingLists)

Simple, isn't it?

Updating the index

We have so far assumed away the most difficult part of the problem, of course, which is keeping the index metadata shown on the left side of the diagram above in sync with the references shown to the right. Each time a reference is created or deleted, we need to reindex the readingLists attribute on the target object.

There are two ways to do this: the old way using manage_afterAdd(); and the new way using a Zope 3 subscriber through Five. The old way will not be explained here, because (like so much else in life) it is now deprecated, and because I never understood how to use it correctly anyway.

The new way is described in a how-to on plone.org; it is so new that it is not covered in Andy McKay's Definitive Guide to Plone (see link above). Please refer to the how-to for details on how to set up a subscriber; we will only give some essential code examples here, which should make sense in the context of that document.

The ZCML statement for the subscriber should look like this (substitute the location and name of your own handler for Products.StupidProduct.h.readingListHandler):

  <subscriber for="Products.Archetypes.interfaces.IReference
                   zope.app.container.interfaces.IObjectAddedEvent"
          handler="Products.StupidProduct.h.readingListHandler"
      />

The handler itself might look more or less like this:

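def readingListHandler(ob, event):
    # ob is the Reference itself; reindex the object that it points to
    target = ob.getTargetObject()
    target.reindexObject(idxs=['readingLists'])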

Note that the subscriber defined in the ZCML statement looks for an IReference interface on the object against which the event is triggered. The version of Archetypes that ships with Plone 2.5 does not yet have Zope 3 interface support for the Reference type enabled, but the essential infrastructure to do so is already in place; you just need to hook it up using something like this in your product's __init__.py file:

from Products.Archetypes.ReferenceEngine import Reference
from Products.Archetypes.interfaces import IReference
from zope.interface import classImplements

def initialize(context):
    classImplements(Reference, IReference)

Again, please see the how-to on creating subscribers in Five for details on how to apply the above code examples to your product. Once everything is set up, the catalog should be automatically updated every time a reference is added or deleted. "Related items" search returns should always reflect the current state of the community's reading lists.
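
The invariant that the subscriber is meant to maintain can also be modelled in plain Python, which may help if the moving parts above seem like a lot to hold in your head. This is a toy sketch, not Plone code: ReferenceStore and its methods are names I have invented, the UIDs are made up, and the "reindex" here is just a recomputation of one item's entry from the current set of references.

```python
# Toy model of keeping a derived index in step with its references.
# ReferenceStore is hypothetical; it imitates the idea, not the Plone API.

class ReferenceStore(object):
    """References on one side, a derived readingLists index on the other."""

    def __init__(self):
        self.references = set()    # (list_uid, item_uid) pairs
        self.reading_lists = {}    # item_uid -> sorted list of list UIDs

    def reindex(self, item_uid):
        """Recompute one item's index entry from the current references."""
        self.reading_lists[item_uid] = sorted(
            src for src, tgt in self.references if tgt == item_uid)

    def add_reference(self, list_uid, item_uid):
        self.references.add((list_uid, item_uid))
        self.reindex(item_uid)     # reindex the target, not the list

    def delete_reference(self, list_uid, item_uid):
        self.references.discard((list_uid, item_uid))
        self.reindex(item_uid)     # the entry shrinks as the reference goes


if __name__ == '__main__':
    store = ReferenceStore()
    store.add_reference('uid-list-a', 'uid-item-2')
    store.add_reference('uid-list-b', 'uid-item-2')
    print(store.reading_lists['uid-item-2'])
    store.delete_reference('uid-list-b', 'uid-item-2')
    print(store.reading_lists['uid-item-2'])
```

In both directions it is the target of the reference whose entry is recomputed; the list object itself has nothing in this index to update.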

By way of conclusion

This how-to has explained a method of setting up a communal reference helper in Plone. A great deal can be done to improve technically on the simple system outlined here, but it should be said that no communal tool is any better than the discipline and generosity of the community that drives it. As so often in web design and administration, the boundaries of what can be done are ultimately fixed, not by your software, but by the people who come to use it.

Footnotes:

[1] See the "Add indexes and metadatas to portal_catalog on install" how-to in the Plone Help Center for a more general code example.