#112: XML Import / Export
- Proposed by
- Kapil Thangavelu
- Seconded by
- Jens Klein
- Proposal type
- Architecture
- Repository branch
- plip112-xml-import-export
- State
- in-progress
Definitions
- XMLIO
- XML Import and Export
- RelaxNG
- An XML Schema Validation language.
- libxml
- A C-based implementation of many of the core XML specifications. It is the reference implementation used by much of the FOSS landscape and also has SWIG-based Python bindings.
- Goldegg
- A project aimed at improving Plone by funding development of the entire software stack that Plone is built on. XMLIO is a Goldegg-funded project.
- Marshall
- A product from Enfold Systems that allows the import/export serialization used for a given content object to be defined via configuration.
- GenericSetup
- A framework for managing configurations within a Zope application.
- CMFSetup
- A product that represents additional policies on top of the GenericSetup framework for serializing tool/service configuration to and from XML.
- ZUCCARO
- ZUCCARO (Zope-based Universally Configurable Classes for Academic Research Online) is a database framework for the Humanities developed by the Bibliotheca Hertziana, Max Planck Institute for Art History. For further information: http://zuccaro.biblhertz.it/
Motivation
There are a number of use cases that XML Import/Export can satisfy.
Site Upgrades
Site upgrades currently use a function registry to enable upgrades. This process is labor intensive, and the changes it makes to a site cannot be introspected. Site Import/Export offers an alternative to this mechanism: export a site, manipulate or transform the XML, and import it back in as a new site.
Content Exchange
Allows for the exchange of site content with other content consumers.
Archival Purposes
Many business and governmental organizations have strict legal requirements regarding the availability of content and information. XML Import/Export facilitates this with the ability to periodically export a site or a subset thereof.
Application Import/Export
Applications often need to exchange data between systems in an application-specific fashion. The XML Import/Export Framework (XMLIO) uses the Zope 3 component architecture to define adapters for content import and export, and ships default adapters for content. This lets integrators and developers use the framework when developing their own application-specific import/export adapters.
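The adapter idea can be illustrated without Zope itself. The following is a minimal sketch in plain Python, not the Zope 3 component architecture API: the registry, `register_exporter`, `export`, and the `Document` and `Event` classes are hypothetical names chosen for illustration. It shows a default exporter alongside an application-specific one registered for a more specific content type.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry mapping a content class to an exporter callable.
_exporters = {}


def register_exporter(content_class, exporter):
    """Register an exporter for a content class (later calls override)."""
    _exporters[content_class] = exporter


def export(obj):
    """Find the most specific exporter along the class hierarchy and run it."""
    for klass in type(obj).__mro__:
        if klass in _exporters:
            return _exporters[klass](obj)
    raise LookupError("no exporter registered for %r" % type(obj).__name__)


class Document:
    def __init__(self, title, text):
        self.title = title
        self.text = text


class Event(Document):
    def __init__(self, title, text, location):
        super().__init__(title, text)
        self.location = location


def default_exporter(obj):
    """Default exporter: serialize the fields every Document has."""
    root = ET.Element("content", {"type": type(obj).__name__})
    ET.SubElement(root, "title").text = obj.title
    ET.SubElement(root, "text").text = obj.text
    return root


def event_exporter(obj):
    """Application-specific exporter: extend the default output."""
    root = default_exporter(obj)
    ET.SubElement(root, "location").text = obj.location
    return root


register_exporter(Document, default_exporter)
register_exporter(Event, event_exporter)

doc_xml = ET.tostring(export(Document("Hello", "Body")), encoding="unicode")
event_xml = ET.tostring(export(Event("Sprint", "Agenda", "Vienna")), encoding="unicode")
```

In this sketch an `Event` picks up its own exporter, while any other `Document` subclass without a registration falls back to the default one.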
Proposal
See Implementation for details. As a high-level overview: define ZCML registrations and adapters from Plone content to the CMFCore filesystem import/export interfaces, and use the Marshall product for XML serialization.
Implementation
The XMLIO implementation tries to reuse as much as possible of existing products. Tres Seaver has landed GenericSetup infrastructure in CMF 2.0, including interfaces and adapters for content import/export. XMLIO builds on this infrastructure to provide adapters and registrations for Plone content, and uses the existing tool handlers in CMFSetup for configuration import and export. The content import/export adapters use the Marshall product to handle XML serialization and deserialization for a given Archetypes content object. The default XML marshaller in the Marshall product uses its own format (RelaxNG schemas provided) as a canonical representation.
The Marshall implementation used for XMLIO is the one on the pluggable namespace branch, which refactors the Marshall core to allow user-defined registration of XML namespace importers and exporters for content, and additionally includes handlers for the security and workflow state associated with an Archetypes object. It also allows runtime selection of which registered namespaces (dublincore, workflow, etc.) to use while exporting.
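As a rough illustration of pluggable namespaces with runtime selection, here is a minimal sketch using only the standard library's ElementTree. It is not Marshall's actual API: `register_handler`, `export`, the handler functions, and the workflow namespace URI are assumptions made up for this example. Only the Dublin Core elements URI is a real, published namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: handler name -> (namespace URI, handler callable).
_registry = {}


def register_handler(name, prefix, uri, handler):
    """Register a namespace handler under a name that exports can select."""
    ET.register_namespace(prefix, uri)  # controls the prefix used in output
    _registry[name] = (uri, handler)


def dublincore_handler(obj, parent, uri):
    """Write a few Dublin Core style fields into the parent element."""
    for field in ("title", "creator"):
        ET.SubElement(parent, "{%s}%s" % (uri, field)).text = obj[field]


def workflow_handler(obj, parent, uri):
    """Write the workflow state into the parent element."""
    ET.SubElement(parent, "{%s}state" % uri).text = obj["review_state"]


register_handler("dublincore", "dc", "http://purl.org/dc/elements/1.1/",
                 dublincore_handler)
# The workflow URI below is a placeholder, not a real schema location.
register_handler("workflow", "wf", "http://example.org/ns/workflow",
                 workflow_handler)


def export(obj, namespaces):
    """Serialize obj using only the handlers selected at call time."""
    root = ET.Element("content")
    for name in namespaces:
        uri, handler = _registry[name]
        handler(obj, root, uri)
    return ET.tostring(root, encoding="unicode")


page = {"title": "Front page", "creator": "admin", "review_state": "published"}
dc_only = export(page, ["dublincore"])
full = export(page, ["dublincore", "workflow"])
```

The caller decides per export which registered namespaces take part, so the same object can be written with or without its workflow state.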
Deliverables
ContentSetup Product
- defining adapters and registrations for content to do XMLIO.
- providing an import/export tool and a basic Plone Control Panel UI
Risks
libxml and its python bindings are needed as dependencies and will be needed in platform installers.
UPDATE: Marshall was refactored to use ElementTree.
Progress log
- 2006-07-19 (jensens)
- For the ZUCCARO project we developed XMLForest. XMLForest uses Marshall's ATXMLMarshaller for (de-)serialization of single content objects and builds an IMS Content Package ZIP file from all these snippets, which it can import and export. XMLForest supports Archetypes References and Relations, including content-references and content-relations. It is UID-aware and supports updates.
- 2006-08-25 (jensens)
- After being bugged by Alex Limi and talking to Phil Auersperg, I created a bundle for this PLIP including XMLForest.
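The packaging approach mentioned in the 2006-07-19 entry (one XML snippet per object, bundled into a ZIP with a manifest) can be sketched roughly as follows. This is not XMLForest's code: `build_package`, `read_package`, and the snippet layout are hypothetical, and the manifest is heavily simplified rather than a valid IMS manifest. The only detail taken from the IMS Content Packaging convention is the manifest file name, `imsmanifest.xml`.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(objects):
    """Return ZIP bytes holding one XML file per object plus a manifest.

    `objects` maps a UID to a dict with "title" and "references"
    (a list of UIDs of other objects).
    """
    manifest = ET.Element("manifest")
    resources = ET.SubElement(manifest, "resources")
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        for uid, data in sorted(objects.items()):
            snippet = ET.Element("content", {"uid": uid})
            ET.SubElement(snippet, "title").text = data["title"]
            for target in data["references"]:
                # References are stored by UID so they survive a re-import.
                ET.SubElement(snippet, "reference", {"target": target})
            href = "%s.xml" % uid
            package.writestr(href, ET.tostring(snippet, encoding="unicode"))
            ET.SubElement(resources, "resource",
                          {"identifier": uid, "href": href})
        package.writestr("imsmanifest.xml",
                         ET.tostring(manifest, encoding="unicode"))
    return buffer.getvalue()


def read_package(raw):
    """Read a package produced by build_package back into a UID-keyed dict."""
    objects = {}
    with zipfile.ZipFile(io.BytesIO(raw)) as package:
        manifest = ET.fromstring(package.read("imsmanifest.xml"))
        for resource in manifest.iter("resource"):
            snippet = ET.fromstring(package.read(resource.get("href")))
            objects[snippet.get("uid")] = {
                "title": snippet.findtext("title"),
                "references": [ref.get("target")
                               for ref in snippet.findall("reference")],
            }
    return objects


site = {
    "uid-1": {"title": "About", "references": ["uid-2"]},
    "uid-2": {"title": "Contact", "references": []},
}
raw = build_package(site)
restored = read_package(raw)
```

Keying everything by UID is what would let an importer match incoming objects against existing ones and update them instead of creating duplicates.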
Participants
Jens Klein
Phil Auersperg
Gogo Bernhard
Martin Raspe
answers
> What version of Plone would you target this for?
The goal is for this to work with Plone 2.2. Content import/export (as distinct from tool configuration export) is compatible with Plone 2.1; this is the last version with which compatibility will be attempted.
> Will there be some kind of migration necessary, or some toying with the products, to make them XMLIO compatible?
Any well-formed Archetypes-based content should work without modifications. However, products which store state outside of Archetypes fields, and which are not part of the core CMF (workflow/security), will require additional work to export/import that state. Probably the most prominent: ATSchemaEditorNG-based products would need explicit work done to serialize their mutable schemas. Tools also require explicit handling and registration with the GenericSetup framework, as is currently being done with Plone 2.1 tools.
Uploaded imports?
While imports are being looked at, I would like an improved upload facility. Currently, I can export to either the filesystem or my local machine. However, I can only import from the filesystem. I have access to my local machine, but not to the filesystem where my Plone instance resides. It would be very nice to be able to import a file from my local machine.
Ta lots
not really related
the content xmlio stuff does allow for local machine uploads and is not pickle based.
The exported file format
For File/Image content, maybe we need to export them into 2 files:
- the raw file/image itself.
- metadata, using xml.
Should content with attachments be exported as separate files too?
Also, do we need to separate the xml file into small ones for different namespaces?
I think this plip has some relationship with plip44:
http://plone.org/products/plone/roadmap/44
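The two-file layout suggested in this comment could look something like the sketch below. It is only an illustration under assumed names: `export_file_content`, the `.metadata.xml` suffix, and the `<field>` element layout are invented here and do not reflect what XMLIO or Marshall actually write.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def export_file_content(directory, name, payload, metadata):
    """Write the raw bytes untouched and the metadata as a sibling XML file."""
    raw_path = os.path.join(directory, name)
    with open(raw_path, "wb") as handle:
        handle.write(payload)

    root = ET.Element("metadata", {"file": name})
    for key, value in sorted(metadata.items()):
        ET.SubElement(root, "field", {"name": key}).text = value
    meta_path = raw_path + ".metadata.xml"
    ET.ElementTree(root).write(meta_path, encoding="utf-8",
                               xml_declaration=True)
    return raw_path, meta_path


payload = b"\x89PNG not really an image"
metadata = {"title": "Site logo", "creator": "admin"}

with tempfile.TemporaryDirectory() as tmp:
    raw_path, meta_path = export_file_content(tmp, "logo.png", payload, metadata)
    # Read both files back to check that nothing was altered on the way out.
    with open(raw_path, "rb") as handle:
        stored = handle.read()
    fields = {f.get("name"): f.text for f in ET.parse(meta_path).getroot()}
```

Keeping the binary outside the XML avoids base64 encoding and lets other tools open the exported file directly.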
XMLForest: Generic IMS Content Package Im/Export for all Archetypes based content in Plone
Folks, look at http://plone.org/products/xmlforest - it's exactly what you all are talking about.
Further questions
This seems very interesting, I still have a few questions though:
Or is it too early to ask these questions?
Otherwise, knowing it will export not only the objects but also their properties, relations (or references), workflow and path, as well as the site itself and thus the configuration of its tools and properties, I can only rejoice knowing this is being made.