Enable indexing of PDF and Word documents on Windows in five steps: three minutes of your time, without problems!

This How-to applies to: Plone 3.0.x
This How-to is intended for: Server Administrators

Five very simple instructions for indexing PDF and Word documents in Plone on Windows

Purpose

Clearly written, useful instructions for indexing PDF and Word documents on Windows.

 

Step by step: only five!

 

First: install OpenOffice.org on your system. It is very simple to use and is a very good replacement for Microsoft Office, at least for most users.

Second: get the Windows xpdf package (http://www.foolabs.com/xpdf/download.html). You can download the Windows version from this link: ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl1-win32.zip

Third: unpack the files. Unzip the xpdf archive into C:\WINDOWS\system32

Fourth: launch Plone and check inside Plone/portal_transforms that the word_to_html transform is present.

Fifth: click Add Transform. In ID, enter: pdf_to_text
In Module, enter: Products.PortalTransforms.transforms.pdf_to_text
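
The fifth step can also be done from a script instead of the management interface. The following is only a sketch, not a tested recipe: it assumes you have a debug session (for example zopectl debug) in which the portal_transforms tool is reachable, and the helper name add_pdf_transform is made up for illustration. It calls manage_addTransform, the same method that appears in the traceback posted in the comments.

```python
def add_pdf_transform(transform_tool):
    """Register pdf_to_text on the given transform tool if it is missing.

    transform_tool is assumed to be the site's portal_transforms object.
    Returns True if the transform was added, False if it already existed.
    """
    transform_id = "pdf_to_text"
    module = "Products.PortalTransforms.transforms.pdf_to_text"
    # The tool is folder-like, so objectIds() lists the registered transforms.
    if transform_id in transform_tool.objectIds():
        return False
    transform_tool.manage_addTransform(transform_id, module)
    return True
```

In a debug session this might be called as add_pdf_transform(app.Plone.portal_transforms), where Plone is assumed to be the id of your site, followed by a transaction commit.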

Now you can upload your Word and PDF documents and they will be indexed automatically.
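
If PDF documents are not being indexed, one thing worth checking is whether the xpdf program pdftotext can be found on the system PATH, which is the reason for unpacking into system32 in the third step. A minimal sketch follows, assuming the pdf_to_text transform relies on that executable. It uses shutil.which, so it must be run with a separate, recent Python 3 interpreter rather than the Python bundled with Plone 3. The helper name find_tool is hypothetical.

```python
import shutil


def find_tool(name, path=None):
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_tool("pdftotext")
    if location:
        print("pdftotext found at", location)
    else:
        print("pdftotext not found on PATH; check where the xpdf files were unpacked")
```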

To see what has been indexed from the uploaded documents, look at SearchableText inside Plone/portal_catalog/Catalog/ for the documents tracked in the index.
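
The same check can be made by querying the catalog for a word you know appears inside one of the uploaded files. This is a sketch under assumptions: it expects the site's portal_catalog object to be passed in (for example from a debug session), and the helper name paths_matching is invented for illustration.

```python
def paths_matching(catalog, term):
    """Return the paths of catalogued objects whose SearchableText matches term.

    catalog is assumed to be the site's portal_catalog tool; calling it with
    an index name as a keyword argument runs a search on that index.
    """
    return [brain.getPath() for brain in catalog(SearchableText=term)]
```

If a PDF or Word file contains the search word but its path is missing from the result, the text of that file was probably not extracted when it was indexed.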

 

Further information

For a POSIX guide, see http://plone.org/documentation/how-to/enable-full-text-indexing-of-word-documents-and-pdfs-in-plone-3-0-gnu-linux/?searchterm=index%20pdf

 

For an alternative "hard" Windows guide, see: http://plone.org/documentation/how-to/enable-full-text-indexing-of-word-documents-and-pdfs-in-plone-3-0-windows/?searchterm=indexing%20windows

 

by Stefano Saltannecchi last modified November 21, 2007 - 23:01 All content is copyright Plone Foundation and the individual contributors.

I need help

Posted by rajkumar at March 5, 2008 - 13:57
I logged into Plone on my local computer.
I got the following exception at the step (Very simple five instructions to index pdf and word documents in Plone with Windows).

Can any of you help me?

Something wrong?

Posted by marie christine olchanski at March 28, 2008 - 12:07
I followed the steps, except for OpenOffice, as I have Word.
After the fifth step I get this error:

Traceback (innermost last):
Module ZPublisher.Publish, line 119, in publish
Module ZPublisher.mapply, line 88, in mapply
Module ZPublisher.Publish, line 42, in call_object
Module Products.PortalTransforms.TransformEngine, line 389, in manage_addTransform
Module Products.PortalTransforms.TransformEngine, line 263, in _mapTransform
Module Products.MimetypesRegistry.MimeTypesRegistry, line 218, in lookup
- __traceback_info__: ("'BROKEN'", 'BROKEN')
Module Products.MimetypesRegistry.MimeTypesRegistry, line 449, in split
MimeTypeException: Malformed MIME type (BROKEN)

Where did I go wrong?
Plone 3, Windows XP...

Note

Posted by Stefano Saltannecchi at March 28, 2008 - 15:46
Note that you can install OpenOffice with the complete UNO component to get the complete transform of Word files. See the openoffice.org site for more instructions.
All Plone versions from 3.0.0 onward need either OpenOffice (with UNO) or another program to transform Word files (up to the Word 2003 format). If you want to transform Word 2007 files, you must use other, specific Plone products.

Good with pdf files, still problems with doc files

Posted by marie christine olchanski at April 2, 2008 - 09:47
On my PC, where I have Word, the search works for both pdf and .doc files.

On another machine, I have OpenOffice.
It works for pdf files, but not for doc files.
But I installed OpenOffice at the fifth step...
Can that be the reason? Do I have to reinstall in your order?

catalog

Posted by marie christine olchanski at April 2, 2008 - 09:53
I forgot to say that in both cases (see above), the doc files are in the catalog...
