#125: Ensuring link/reference integrity (removing 404 links)

Contents
  1. Motivation
  2. Assumptions
  3. Proposal
  4. Implementation
  5. Progress log
by Andreas Zeidler last modified December 27, 2006 - 15:53
One of Plone's weaknesses at the moment is the lack of resource tracking. If you delete a picture that another document references, you won't find out until you look at that other document.
Proposed by
Alexander Limi
Seconded by
Andi Zeidler
Proposal type
Architecture
Assigned to release
Repository branch
plip125-link-integrity-bundle
State
completed

Motivation

One of the things that need to be solved in Plone is the ability to automatically associate objects that "touch" each other, so that you know:

  • Which items will be affected if you delete the current item
  • Which items will need updating if you move the current item.

Also, when you move an object, and an old bookmark is pointing to it, that page should automatically redirect the user to the new location (possibly with a message saying that they were redirected).

Assumptions

  • This is still very much a work-in-progress — comments appreciated.
  • We will not consider in-process / long-running approaches like CMFLinkChecker, because an already busy Plone site should not be burdened with this kind of checking.
  • This proposal intentionally does not deal with outside links; you should use a normal link-checker for that.
  • The traditional way of dealing with this sort of problem has been to use hashes for object/item names, but we're not willing to sacrifice logical item naming and nice URLs for this.
  • For the move/rename case, we would like to extend the RedirectionTool, a proven, existing tool to handle these kinds of references and redirections. This is a suggestion, though - and if someone can come up with a compelling reason to not use it and rewrite from scratch, I won't let that block this PLIP. :)

Proposal

Here are some use cases to show how I envision this being handled:

Use case: Deleting an item

  • User adds a normal page
  • In that page, he references two images
  • When the page is saved, it looks for local references, and creates an isReferencedBy reference on both the page and the images
  • Time passes
  • A different user comes along, tries to delete one of the images referenced above
  • He then gets a warning saying: "The image XYZ is used in the page ABC, are you sure you want to delete it?"
  • (If we want to be a bit smart about this, the RedirectionTool — see below — could register that a page was explicitly deleted and say something like "the page you were looking for was deleted, maybe some of the following pages contain what you were looking for?" followed by a search. This is the current behaviour of RedirectionTool, minus the explicit knowledge that something was deleted.)
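
The delete use case above could be sketched roughly as follows. This is only an illustration in plain Python, not the code from the PLIP branch: the `LinkExtractor` and `ReferenceIndex` names, the in-memory dictionaries and the rule that anything without "://" counts as a local reference are all assumptions made for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect local targets of <a href> and <img src> (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.targets = set()

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "img": "src"}.get(tag)
        for name, value in attrs:
            # Assumption: anything containing "://" is an outside link and is ignored.
            if name == wanted and value and "://" not in value:
                self.targets.add(value)


class ReferenceIndex:
    """Hypothetical in-memory stand-in for the isReferencedBy bookkeeping."""

    def __init__(self):
        self.referenced_by = defaultdict(set)  # target path -> source paths
        self.references = defaultdict(set)     # source path -> target paths

    def on_save(self, source, html):
        parser = LinkExtractor()
        parser.feed(html)
        parser.close()
        # Drop back-references recorded by the previous save of this page.
        for old_target in self.references.pop(source, set()):
            self.referenced_by[old_target].discard(source)
        # Record the references found in the new text, in both directions.
        for target in parser.targets:
            self.references[source].add(target)
            self.referenced_by[target].add(source)

    def delete_warning(self, target):
        """Return a warning text if other items still use target, else None."""
        sources = sorted(self.referenced_by.get(target, ()))
        if not sources:
            return None
        return "%s is used in %s, are you sure you want to delete it?" % (
            target, ", ".join(sources))
```

Calling `on_save("/site/abc", text)` whenever a page is saved and `delete_warning("/site/xyz.png")` before a delete is enough to produce the confirmation message from the use case. Saving the page again without the image removes the warning.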

Use case: Renaming/moving an item

  • User creates a page
  • Some time later, the user decides to reorganize his web site, and moves the pages around, including the page created in the first step
  • Upon being moved or renamed, the item registers its old location and its new location in a list that maps old location → new location (RedirectionTool is a working implementation of this)
  • Another user who bookmarked the old page visits the old location
  • RedirectionTool sees that this is a 404, and looks up the old location in its list - finds the new location
  • User is redirected to new location, potentially with a message saying "You have been redirected, please update your bookmarks"
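
A minimal sketch of the old location → new location list, assuming a plain dictionary as storage. It does not reproduce the RedirectionTool API; `RedirectRegistry` and `handle_not_found` are hypothetical names used only to show the idea.

```python
class RedirectRegistry:
    """Hypothetical map of old location -> new location."""

    def __init__(self):
        self._redirects = {}

    def record_move(self, old_path, new_path):
        # Re-point earlier redirects that ended at old_path, so every
        # stored redirect leads straight to the current location.
        for source, target in list(self._redirects.items()):
            if target == old_path:
                self._redirects[source] = new_path
        self._redirects[old_path] = new_path
        # If the item moved back to a former location, that location is
        # live again and must not redirect anywhere.
        self._redirects.pop(new_path, None)

    def resolve(self, path):
        return self._redirects.get(path)


def handle_not_found(registry, path):
    """Called when normal lookup of path failed; returns (status, location)."""
    target = registry.resolve(path)
    if target is None:
        return 404, None
    return 301, target
```

The choice of a 301 status is an assumption. A real site would probably also show the "please update your bookmarks" message at the new location.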

Implementation

I found the following note in a mail from Ben Saller (IIRC) — I'm including it here since it might be helpful in parts of the implementation:

Actually this is possible with core archetypes by doing:

 from Products.Archetypes.references import HoldingReference

...and in your reference field schema definition do:

 referenceClass=HoldingReference

This will raise a BeforeDelete Exception whenever someone tries to delete an object which is the target of an existing reference. There is also a CascadeReference which deletes all references when deleting the main object.

Progress log

Andi has implemented the integrity part of this in SVN. The redirection/move part was done by optilude. Both are pretty much finished and already merged into 3.0alpha (as the status already indicates).

prototype for using z3 events

Posted by Whit Morriss at April 24, 2006 - 17:39
https://svn.openplans.org/svn/topp.rose/

The basic architecture uses IObjectMoved and IObjectDeleted events to maintain a simple BTree storage of the paths an object once occupied (keyed by path, and oid or uid).

By means of a traversal adapter, empty paths that match old object location in the storage are redirected to new URLs.

This has much lower overhead and potentially scales better than using references. The traversal adapter, storage utility, redirection view, and the event subscribers may also be overridden to easily extend behavior, if needed.
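
To make the idea a bit more concrete, here is a rough sketch of such a storage in plain Python. It is not the topp.rose code: the `PathHistory` class and its method names are made up, ordinary dicts stand in for the BTree storage, and the methods would be called from the event subscribers.

```python
class PathHistory:
    """Hypothetical storage of the paths an object once occupied."""

    def __init__(self):
        self.uid_for_path = {}  # former path -> uid (a BTree in a ZODB setting)
        self.path_for_uid = {}  # uid -> current path

    def object_added(self, uid, path):
        self.path_for_uid[uid] = path
        # The path is occupied again, so it is no longer a "former" path.
        self.uid_for_path.pop(path, None)

    def object_moved(self, uid, old_path, new_path):
        self.uid_for_path[old_path] = uid
        self.uid_for_path.pop(new_path, None)
        self.path_for_uid[uid] = new_path

    def object_deleted(self, uid, path):
        self.path_for_uid.pop(uid, None)
        # Forget every former path of the deleted object.
        stale = [p for p, u in self.uid_for_path.items() if u == uid]
        for former_path in stale:
            del self.uid_for_path[former_path]

    def redirect_target(self, requested_path):
        """Used where traversal found nothing at requested_path."""
        uid = self.uid_for_path.get(requested_path)
        if uid is None:
            return None
        return self.path_for_uid.get(uid)
```

Because old paths point at the uid and not at another path, an object that moves several times needs no rewriting of earlier entries: every former path resolves to wherever the object lives now.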

Yup, might be interesting for the redirection part

Posted by Alexander Limi at August 22, 2006 - 05:48
At the moment, we only have the link integrity code done, afaik - the redirect on move/rename isn't done yet.

For the link integrity, I believe Andi has used some Z3 stuff, but I'll let him explain that part. :)

definitely interesting...

Posted by Andreas Zeidler at August 24, 2006 - 09:46

first of all, thanks for the update here, limi. i was offline last week, so i only saw it late last night after digging through all that spam... :)

anyway, you're right, the redirecting part isn't done yet, or rather, it's not integrated yet. i've looked at topp.rose back at the island (and imho i understood it, too :)) and i think it really covers like 90% of that use case. the only thing missing is to properly hook it up with plone (or maybe rather the link integrity stuff). my plan is to try to do just that right after i've found a way to test the delete use case, which turned out to be a bit tricky...

and, as for the explanation part, that's another thing to do (http://dev.plone.org/collective/browser/LinkIntegrity/trunk/TODO.txt) before integrating topp.rose. the code does use z3 stuff indeed, and i guess at first glance it even uses some "funny" ways of getting what we want, so i'd better write a document explaining how and, more importantly, why it does it like that. i think it'll look less scary then... :)

boy, would this be nice!

Posted by John DeStefano at December 4, 2006 - 20:24
... and I know I'm dreaming now, but it would be even better if there were some way to "cut" a folder in content view, "paste" it elsewhere in the navigation tree, and still have any inter- and intra-referring links intact. Maybe this could be possible if each site object were forced to have a completely unique ID value, throughout the entire Plone site?

actually...

Posted by Andreas Zeidler at December 27, 2006 - 15:55
...this should be working now thanks to optilude's code in plone.app.redirector (which will be part of 3.0)
