#125: Ensuring link/reference integrity (removing 404 links)

Contents
  1. Motivation
  2. Assumptions
  3. Proposal
  4. Implementation
  5. Progress log
by Andreas Zeidler last modified Jan 21, 2010 07:26 AM

One of Plone's weaknesses at the moment is the lack of resource tracking. If you delete a picture that another document references, you won't know until you look at the other document.

Proposed by: Alexander Limi
Seconded by: Andi Zeidler
Proposal type: Architecture
Assigned to release:
Repository branch: plip125-link-integrity-bundle
State: completed

Motivation

Plone needs a way to automatically associate objects that "touch" each other, so that you know:

  • Which items will be affected if you delete the current item
  • Which items will need updating if you move the current item.

Also, when you move an object and an old bookmark still points to its previous location, a request for that location should automatically redirect the user to the new one (possibly with a message saying that they were redirected).

Assumptions

  • This is still very much a work-in-progress — comments appreciated.
  • We will not consider in-process, long-running approaches like CMFLinkChecker, because an already busy Plone site should not be burdened with that extra work.
  • This proposal intentionally does not deal with outside links; you should use a normal link checker for those.
  • The traditional way of dealing with this sort of problem has been to use hashes for object/item names, but we're not willing to sacrifice logical item naming and nice URLs for this.
  • For the move/rename case, we would like to extend the RedirectionTool, a proven, existing tool to handle these kinds of references and redirections. This is a suggestion, though - and if someone can come up with a compelling reason to not use it and rewrite from scratch, I won't let that block this PLIP. :)

Proposal

Here are some use cases to show how I envision this being handled:

Use case: Deleting an item

  • User adds a normal page
  • In that page, he references two images
  • When the page is saved, it looks for local references, and creates an isReferencedBy reference on both the page and the images
  • Time passes
  • A different user comes along, tries to delete one of the images referenced above
  • He then gets a warning saying: "The image XYZ is used in the page ABC, are you sure you want to delete it?"
  • (If we want to be a bit smart about this, the RedirectionTool — see below — could register that a page was explicitly deleted and say something like "the page you were looking for was deleted, maybe some of the following pages contain what you were looking for?" followed by a search. This is the current behaviour of RedirectionTool, minus the explicit knowledge that something was deleted.)
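
The delete use case above can be sketched in plain Python. This is only an illustration of the bookkeeping, not the Plone or Archetypes implementation: the names LocalLinkParser, ReferenceRegistry, save and delete_warning are hypothetical, and "local reference" is assumed here to mean a site-absolute path in an <a href> or <img src> attribute.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LocalLinkParser(HTMLParser):
    """Collect site-local targets from <a href> and <img src> attributes."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            # Assumption: only site-absolute paths ("/images/x.png") count
            # as local; "//host/..." and "http://..." links are ignored.
            if name == wanted and value and value.startswith("/") \
                    and not value.startswith("//"):
                self.targets.add(value)


class ReferenceRegistry:
    """Hypothetical stand-in for the isReferencedBy bookkeeping."""

    def __init__(self):
        # target path -> set of paths of the items that reference it
        self.referenced_by = defaultdict(set)

    def save(self, source, html):
        """Re-scan an item's body when it is saved."""
        # Forget what this source referenced before the edit.
        for sources in self.referenced_by.values():
            sources.discard(source)
        parser = LocalLinkParser()
        parser.feed(html)
        for target in parser.targets:
            self.referenced_by[target].add(source)

    def delete_warning(self, target):
        """Return a warning if the item is still referenced, else None."""
        sources = sorted(self.referenced_by.get(target, ()))
        if not sources:
            return None
        return ("The item %s is used in: %s. Are you sure you want to "
                "delete it?" % (target, ", ".join(sources)))
```

The registry is rebuilt per item on every save, so a reference that is removed from the page body also stops triggering the warning. Outside links are skipped on purpose, in line with the assumptions above.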

Use case: Renaming/moving an item

  • User creates a page
  • Some time later, the user decides to reorganize his web site, and moves the pages around, including the page created in the first step
  • Upon being moved or renamed, the item registers its old location and its new location in a list that maps old location → new location (RedirectionTool is a working implementation of this)
  • Another user who bookmarked the old page visits the old location
  • RedirectionTool sees that this is a 404, looks up the old location in its list, and finds the new location
  • User is redirected to new location, potentially with a message saying "You have been redirected, please update your bookmarks"
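
The old location → new location list from the steps above could look roughly like this. It is a minimal sketch under stated assumptions and does not reproduce RedirectionTool's actual API: RedirectionMap, record_move and resolve are hypothetical names, and locations are assumed to be plain path strings.

```python
class RedirectionMap:
    """Hypothetical old-location -> new-location list for moved items."""

    def __init__(self):
        self._redirects = {}

    def record_move(self, old_path, new_path):
        """Register a move or rename of an item."""
        if old_path == new_path:
            return
        # Earlier redirects that ended at the old location should now
        # lead straight to the new one, so a lookup never has to chain.
        for source, target in list(self._redirects.items()):
            if target == old_path:
                self._redirects[source] = new_path
        self._redirects[old_path] = new_path
        # An item that now lives at new_path must not redirect away.
        self._redirects.pop(new_path, None)

    def resolve(self, path):
        """Called on a 404: return the new location, or None."""
        return self._redirects.get(path)


redirects = RedirectionMap()
redirects.record_move("/news/old-page", "/archive/old-page")
target = redirects.resolve("/news/old-page")
if target is not None:
    print("You have been redirected to %s, please update your bookmarks"
          % target)
```

Collapsing chains at write time keeps the 404 lookup to a single dictionary access, which matters because that lookup sits on the request path while moves are comparatively rare.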

Implementation

I found the following note in a mail from Ben Saller (IIRC) — I'm including it here since it might be helpful in parts of the implementation:

Actually this is possible with core archetypes by doing:

 from Products.Archetypes.references import HoldingReference

...and in your reference field schema definition, do:

 referenceClass=HoldingReference

This will raise a BeforeDelete Exception whenever someone tries to delete an object which is the target of an existing reference. There is also a CascadeReference which deletes all references when deleting the main object.

Progress log

Andi has implemented the integrity part of this in SVN. The redirection/move part was done by optilude. Both are pretty much finished and already merged into 3.0alpha (as the status already indicates).