Migrating data from one field to another
This How-to applies to:
Any version.
This How-to is intended for:
Developers
Let's say we have a content type called MyContentType that has a field category with a keyword index. Suppose we then change our minds and want to use Plone's internal subject (aka Keyword) field for this purpose instead. If your site already contains objects of this type, moving the values by hand would be a pain. Luckily, it's quite simple to write a quick Python script to migrate the data from one field to another.
Even more luckily, there is now a component out there to help you called contentmigration!
The script method below may be all you need, but contentmigration, which leverages ATContentTypes' migration framework, used internally in Plone, is very flexible and allows you to make changes like the one below with less code. The tool is described in the context of other migrations in the RichDocument tutorial - see the last section of this page in particular.
If you would rather do it manually with a script, the original steps are below:
- Back up your database
- Create a script named copyKeywords.py in your custom skin folder
- Paste in the script below
- In the catalog search, put the name of your content type where it says MyContentType. This parameter ensures that the script only searches for your specific content type.
- In this case, we are using getCategory() as the accessor for the old value, and setSubject() as the mutator for the new value. You may have different fields, and you may need to manipulate the value before writing it to the new field.
- To run the script, click on the test tab in the ZMI
Note: If an exception is raised, the transaction will be rolled back, and no data will be changed. If the script executes successfully but you change your mind, you can use the "Undo" tab at the root of your Plone site in the ZMI to undo the running of the script.
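The script:
# Zope "Script (Python)": container and context are bound automatically.
request = container.REQUEST
RESPONSE = request.RESPONSE
# Restrict the catalog query to the content type being migrated.
brains = context.portal_catalog.searchResults(REQUEST=request,
                                              Type='MyContentType')
for brain in brains:
    # Wake up the real object; the catalog brain only holds metadata.
    obj = brain.getObject()
    # Read from the old field and write to the new one.
    keywords = obj.getCategory()
    obj.setSubject(keywords)
    print "Updated", obj.absolute_url()
return printed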
There are many variations on this script. You may for example perform calculations or other modifications before saving the new value, or save the value to a different content object altogether. Remember that you are modifying your database, so be careful, don't forget the backup, and use the Undo tab if things go wrong!
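As one illustration of modifying the value before saving it, here is a minimal sketch of a clean-up helper. The name clean_keywords is hypothetical, and the sketch assumes the old field holds either a comma-separated string or a sequence of strings. It is written as a plain function so it can be tried outside Zope; under Python 2 the string check would use basestring instead of str.

```python
def clean_keywords(raw):
    """Return a de-duplicated list of stripped, non-empty keywords.

    Hypothetical helper: assumes `raw` is None, a comma-separated
    string, or a sequence of strings.
    """
    if raw is None:
        return []
    # A single string is treated as a comma-separated list of keywords.
    if isinstance(raw, str):
        raw = raw.split(",")
    cleaned = []
    for value in raw:
        value = value.strip()
        # Skip empty entries and values already seen, keeping first-seen order.
        if value and value not in cleaned:
            cleaned.append(value)
    return cleaned
```

In the migration loop, the call would then look something like obj.setSubject(clean_keywords(obj.getCategory())).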
Extended version
request = container.REQUEST
RESPONSE = request.RESPONSE
brains = context.portal_catalog.searchResults(REQUEST=request,
                                              Type='YOURCONTENTTYPE')
obj_count = 0
for brain in brains:
    obj = brain.getObject()
    obj_url = obj.absolute_url()
    # Accessors of the old and new fields; copy the new values into a list
    # so they can be extended.
    oldvaluelist = obj.YOUROLDFIELD()
    newvaluelist = list(obj.YOURNEWFIELD())
    if oldvaluelist:
        print "Checking", obj_url
        updated = 0
        # Merge the old values into the new field without creating duplicates.
        for value in oldvaluelist:
            if value not in newvaluelist:
                newvaluelist.append(value)
                print "  Adding:", value
                updated = updated + 1
        if updated:
            print "  Setting:", newvaluelist
            # Mutator of the new field, e.g. setDataTypes.
            obj.YOURNEWFIELDMUTATOR(newvaluelist)
            print "  Updated with %i value(s)" % updated
        else:
            print "  No values were migrated for this object."
        obj_count = obj_count + 1
    else:
        print "Skipping", obj_url
    # You may want to remove this if you do not want your old field zeroed.
    # Mutator of the old field, e.g. setDataType.
    obj.YOUROLDFIELDMUTATOR([])
print "\nTotal objects processed:", obj_count
print "Total batch objects:", len(brains)
return printed
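The merging step of the extended version can be tried on plain lists before it is run against real content. The following is only a sketch of that step; the function name merge_values is hypothetical and nothing here touches the database.

```python
def merge_values(old_values, new_values):
    """Append values from the old field that the new field lacks.

    Returns the merged list and the number of values that were added.
    Hypothetical helper mirroring the inner loop of the extended script.
    """
    merged = list(new_values)
    added = 0
    for value in old_values:
        if value not in merged:
            merged.append(value)
            added += 1
    return merged, added
```

Existing values of the new field keep their order, and migrated values are appended after them, which matches what the extended script does.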
Reindexing
After migrating data to the subject (aka Keyword) field, you should reindex the "Subject" index in the catalog.
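Reindexing can be done from the Indexes tab of portal_catalog in the ZMI, or from code. The sketch below assumes the catalog exposes ZCatalog's manage_reindexIndex(ids=...) method. FakeCatalog is a hypothetical stand-in so the example runs outside Zope, and reindex_subject is an illustrative name, not part of Plone.

```python
class FakeCatalog:
    """Hypothetical stand-in for portal_catalog so the sketch runs outside Zope."""

    def __init__(self):
        self.reindexed = []

    def manage_reindexIndex(self, ids=None):
        # The real ZCatalog method rebuilds the named indexes;
        # this stub only records which ones were requested.
        self.reindexed.extend(ids or [])


def reindex_subject(catalog, index_name="Subject"):
    """Ask the catalog to rebuild one index and return the index name."""
    catalog.manage_reindexIndex(ids=[index_name])
    return index_name


catalog = FakeCatalog()
reindex_subject(catalog)
```

Inside a migration script, the equivalent call would be along the lines of context.portal_catalog.manage_reindexIndex(ids=['Subject']). That method is normally restricted to managers, so the script has to be run by a user with sufficient rights.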