Enable indexing of PDF and Word documents on Windows in five steps: three minutes of your time, without problems!
This How-to applies to:
Plone 3.0.x
This How-to is intended for:
Server Administrators
Purpose
Clearly written, useful instructions for indexing PDF and Word documents on Windows.
Step by step: only five!
First: install OpenOffice.org on your system. It is simple to use and a good replacement for Microsoft Office, at least for most users.
Second: get the Windows xpdf package
(http://www.foolabs.com/xpdf/download.html). You can
download the Windows version from this link:
ftp://ftp.foolabs.com/pub/xpdf/xpdf-3.02pl1-win32.zip
Third: unpack the files. Unzip the xpdf archive into C:\WINDOWS\system32
Fourth: launch Plone and check in Plone/portal_transforms that the word_to_html transform is present.
Fifth: click Add Transform. In ID, enter: pdf_to_text
In Module, enter: Products.PortalTransforms.transforms.pdf_to_text
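The fifth step can also be done from a script instead of the web form. The following is only a minimal sketch: it assumes `portal` is the Plone site object, that its `portal_transforms` tool offers `manage_addTransform(id, module)`, and that the tool supports the usual folder method `objectIds()`. The helper name `add_pdf_transform` is made up for illustration.

```python
PDF_TRANSFORM_ID = "pdf_to_text"
PDF_TRANSFORM_MODULE = "Products.PortalTransforms.transforms.pdf_to_text"


def add_pdf_transform(portal):
    """Register pdf_to_text unless it is already present.

    Returns True if the transform was added, False if it already existed.
    Hypothetical helper: `portal` is assumed to be the Plone site object.
    """
    transforms = portal.portal_transforms
    if PDF_TRANSFORM_ID in transforms.objectIds():
        return False
    # Same two values as typed into the Add Transform form in step five.
    transforms.manage_addTransform(PDF_TRANSFORM_ID, PDF_TRANSFORM_MODULE)
    return True
```

It could be called, for example, from an External Method with the site object as its argument.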
Now you can upload your Word and PDF documents, and they will be indexed automatically.
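If PDF files are not indexed, a reasonable first check is whether the pdftotext program from the xpdf package can be found on the system search path, since the pdf_to_text transform relies on that external program. The sketch below is meant to be run with a standalone Python 3 interpreter on the server, not inside Plone. The function name `locate_tool` is invented for this example.

```python
import shutil


def locate_tool(name="pdftotext", search_path=None):
    """Return the full path of `name` if it is on the search path, else None.

    With search_path=None the PATH environment variable is used.
    """
    return shutil.which(name, path=search_path)
```

Calling `print(locate_tool())` should show a path ending in pdftotext.exe if the third step succeeded. A result of `None` suggests the files were unpacked somewhere that is not on PATH.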
To find out what has been indexed from the uploaded documents, look
at SearchableText for the documents listed under
Plone/portal_catalog/Catalog/.
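Another way to confirm that indexing worked is to search the catalog for a word that occurs only inside an uploaded PDF or Word file. This is a hedged sketch, assuming `portal` is the Plone site object and that `portal_catalog` can be called with a `SearchableText` keyword and returns result objects with a `getPath()` method. The helper name `paths_matching` is hypothetical.

```python
def paths_matching(portal, term):
    """Return the paths of catalogued items whose SearchableText matches term.

    Hypothetical helper: `portal` is assumed to be the Plone site object.
    """
    brains = portal.portal_catalog(SearchableText=term)
    return [brain.getPath() for brain in brains]
```

If the path of the uploaded file is in the returned list, its text was extracted and indexed.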
Further information
For a POSIX guide, see http://plone.org/documentation/how-to/enable-full-text-indexing-of-word-documents-and-pdfs-in-plone-3-0-gnu-linux/?searchterm=index%20pdf
For an alternative "hard" Windows guide, see: http://plone.org/documentation/how-to/enable-full-text-indexing-of-word-documents-and-pdfs-in-plone-3-0-windows/?searchterm=indexing%20windows
Something wrong?
I followed the steps, except for OpenOffice, as I have Word.
After the fifth step I get this error:
Module ZPublisher.Publish, line 119, in publish
Module ZPublisher.mapply, line 88, in mapply
Module ZPublisher.Publish, line 42, in call_object
Module Products.PortalTransforms.TransformEngine, line 389, in manage_addTransform
Module Products.PortalTransforms.TransformEngine, line 263, in _mapTransform
Module Products.MimetypesRegistry.MimeTypesRegistry, line 218, in lookup
- __traceback_info__: ("'BROKEN'", 'BROKEN')
Module Products.MimetypesRegistry.MimeTypesRegistry, line 449, in split
MimeTypeException: Malformed MIME type (BROKEN)
Where did I go wrong?
Plone 3, Windows XP...
Note
All Plone versions from 3.0.0 onward need either OpenOffice (with UNO) or another program to transform Word files (up to the Word 2003 format). If you want to transform Word 2007 files, you must use other, specific Plone products.
Good with PDF files, still problems with DOC files
On another machine, I have OpenOffice.
It works for PDF files, but not for DOC files.
However, I installed OpenOffice at the fifth step...
Could that be the reason? Do I have to reinstall in your order?
I need help
I got the following exception while following the steps (Very Simple Five Instructions to index PDF and Word documents in Plone with Windows).
Can any of you help me?