#112: XML Import / Export

Contents
  1. Definitions
  2. Motivation
  3. Proposal
  4. Implementation
  5. Deliverables
  6. Risks
  7. Progress log
  8. Participants
by Kapil Thangavelu last modified Aug 25, 2006 09:43 AM

Extend Plone with core functionality to export sites or content trees to a neutral format, and to import from that format. The neutral format is an XML dialect that captures the complete infoset for site configuration and content state.

Proposed by
Kapil Thangavelu
Seconded by
Jens Klein
Proposal type
Architecture
Repository branch
plip112-xml-import-export
State
in-progress

Definitions

XMLIO
XML Import and Export
RelaxNG
A schema language for validating XML documents.
libxml
A C-based implementation of many of the core XML specifications. It is the reference implementation used by much of the FOSS landscape and also has SWIG-based Python bindings.
Goldegg
A project aimed at improving Plone through funding development of the entire software stack that Plone is built on. XMLIO is a goldegg funded project.
Marshall
A product from Enfold Systems that allows the import/export serialization used for a given content object to be defined via configuration.
GenericSetup
A framework for managing configurations within a Zope application.
CMFSetup
A product that represents additional policies on top of the GenericSetup framework for serializing tool/service configuration to and from XML.
ZUCCARO
ZUCCARO (Zope-based Universally Configurable Classes for Academic Research Online) is a database framework for the Humanities developed by the Bibliotheca Hertziana, Max Planck Institute for Art History. For further information, see http://zuccaro.biblhertz.it/

Motivation

There are a number of use cases that XML Import/Export can satisfy.

Site Upgrades

Site upgrades currently use a function registry to enable upgrades. This process is labor intensive, and the changes it makes to a site cannot be introspected. Site Import/Export offers an alternative to this mechanism: export a site, manipulate or transform the XML, and import it back in as a new site.
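
As a rough illustration of the export, transform, import idea, the sketch below renames a field in an exported XML document using only the Python standard library. The element and attribute names (site, object, field, name) are assumptions made up for this example; they are not the XMLIO or Marshall format.

```python
from xml.etree import ElementTree as ET


def rename_field(xml_text, old, new):
    """Rename every <field name="old"> to name="new" in an exported document.

    The <field name="..."> layout is hypothetical, chosen for illustration.
    """
    root = ET.fromstring(xml_text)
    for field in root.iter("field"):
        if field.get("name") == old:
            field.set("name", new)
    return ET.tostring(root, encoding="unicode")


# Hypothetical export of a one-object site.
exported = (
    '<site><object id="front-page">'
    '<field name="body">Hi</field>'
    '</object></site>'
)
upgraded = rename_field(exported, "body", "text")
```

The transformed document would then be fed to the importer to build the new site; an XSLT stylesheet could play the same role for larger changes.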

Content Exchange

Allows for the exchange of site content with other content consumers.

Archival Purposes

Many business and governmental organizations have strict legal requirements regarding the availability of content and information. XML Import/Export facilitates this with the ability to periodically export a site or a subset thereof.

Application Import/Export

Applications often need to exchange data between systems in an application-specific fashion. The XML Import/Export framework (XMLIO) uses the Zope 3 component architecture to define adapters for content import and export, and ships default adapters for content. This lets integrators and developers use the framework when developing their own application-specific import/export adapters.
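
The adapter idea can be sketched in plain Python. This is a stand-in for the pattern only: it does not use the Zope 3 component architecture or ZCML, and every name in it (register_exporter, get_exporter, Document, Invoice, the XML element names) is hypothetical.

```python
from xml.etree import ElementTree as ET

_exporters = {}  # maps a content class to its exporter adapter class


def register_exporter(content_class):
    """Class decorator that registers an exporter adapter for a content class."""
    def decorator(adapter_class):
        _exporters[content_class] = adapter_class
        return adapter_class
    return decorator


def get_exporter(obj):
    """Return the most specific registered exporter by walking the MRO."""
    for klass in type(obj).__mro__:
        if klass in _exporters:
            return _exporters[klass](obj)
    raise LookupError("no exporter registered for %s" % type(obj).__name__)


class Document:  # hypothetical generic content type
    def __init__(self, id, title, text):
        self.id, self.title, self.text = id, title, text


class Invoice(Document):  # hypothetical application-specific type
    def __init__(self, id, title, text, amount):
        super().__init__(id, title, text)
        self.amount = amount


@register_exporter(Document)
class DefaultExporter:
    """Default adapter: exports the generic fields."""
    def __init__(self, context):
        self.context = context

    def export(self):
        root = ET.Element("content", id=self.context.id)
        ET.SubElement(root, "title").text = self.context.title
        ET.SubElement(root, "text").text = self.context.text
        return root


@register_exporter(Invoice)
class InvoiceExporter(DefaultExporter):
    """Application-specific adapter: adds to the default export."""
    def export(self):
        root = super().export()
        ET.SubElement(root, "amount").text = str(self.context.amount)
        return root


def export_xml(obj):
    """Look up the adapter for obj and serialize its export to a string."""
    return ET.tostring(get_exporter(obj).export(), encoding="unicode")
```

Here export_xml(Document(...)) falls back to the default adapter, while export_xml(Invoice(...)) picks up the application-specific one without any change to the content classes themselves.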

Proposal

See Implementation for details. As a high-level overview: define ZCML registrations and adapters from Plone content to the CMFCore filesystem import/export interfaces, and use the Marshall product for XML serialization.

Implementation

The XMLIO implementation reuses existing products as much as possible. Tres Seaver has landed the GenericSetup infrastructure in CMF 2.0, including interfaces and adapters for content import/export. XMLIO builds on this infrastructure to provide adapters and registrations for Plone content, and uses the existing tool handlers in CMFSetup for XMLIO of configuration. The content import/export adapters use the Marshall product to handle XML serialization and deserialization of a given Archetypes content object. The default XML marshaller in the Marshall product uses its own format (RelaxNG schemas provided) as the canonical representation.
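
To make the serialization and deserialization step concrete, here is a minimal round-trip sketch using the standard library's ElementTree. The marshall and demarshall functions and the object/field element names are assumptions for illustration; they are not Marshall's API or its canonical format.

```python
from xml.etree import ElementTree as ET


def marshall(obj_id, fields):
    """Serialize a dict of string field values into an XML string."""
    root = ET.Element("object", id=obj_id)
    for name in sorted(fields):  # sorted for a stable, diffable output
        field = ET.SubElement(root, "field", name=name)
        field.text = fields[name]
    return ET.tostring(root, encoding="unicode")


def demarshall(xml_text):
    """Parse XML produced by marshall() back into (id, fields)."""
    root = ET.fromstring(xml_text)
    fields = {f.get("name"): (f.text or "") for f in root.findall("field")}
    return root.get("id"), fields


original = {"title": "Front page", "description": "", "text": "Welcome <all>"}
xml_text = marshall("front-page", original)
restored_id, restored = demarshall(xml_text)
```

A real marshaller would also need to handle non-string field types, binary data and validation against the published schemas, none of which this sketch attempts.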

The Marshall implementation used for XMLIO is the one on the pluggable namespace branch, which refactors the Marshall core to allow user-defined registration of XML namespace importers and exporters for content, and additionally includes handlers for the security and workflow state associated with an Archetypes object. It also allows runtime selection of which registered namespaces (dublincore, workflow, etc.) to use when exporting.
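
The pluggable namespace idea, with runtime selection of namespaces, might look roughly like the following. This is a sketch under assumptions rather than the branch's actual code: the registry, the handler signature and the workflow namespace URI are invented for the example, and only the Dublin Core elements URI is a real identifier.

```python
from xml.etree import ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"   # real Dublin Core elements URI
WF_NS = "http://example.org/ns/workflow"      # made-up URI for illustration

_handlers = {}  # prefix -> (namespace URI, export handler)


def register_namespace(prefix, uri, handler):
    """Register an export handler under a namespace prefix."""
    ET.register_namespace(prefix, uri)  # controls the prefix used on output
    _handlers[prefix] = (uri, handler)


def export_dublincore(obj, root, uri):
    ET.SubElement(root, "{%s}title" % uri).text = obj["title"]
    ET.SubElement(root, "{%s}creator" % uri).text = obj["creator"]


def export_workflow(obj, root, uri):
    ET.SubElement(root, "{%s}state" % uri).text = obj["review_state"]


register_namespace("dc", DC_NS, export_dublincore)
register_namespace("wf", WF_NS, export_workflow)


def export(obj, namespaces):
    """Export obj using only the namespaces selected at call time."""
    root = ET.Element("metadata")
    for prefix in namespaces:
        uri, handler = _handlers[prefix]
        handler(obj, root, uri)
    return ET.tostring(root, encoding="unicode")


item = {"title": "T", "creator": "kapil", "review_state": "published"}
only_dc = export(item, ["dc"])
both = export(item, ["dc", "wf"])
```

Selecting ["dc"] leaves the workflow state out of the export entirely, which is the kind of per-export choice the branch is described as offering.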

Deliverables

ContentSetup Product

  • defining adapters and registrations for content to do XMLIO.
  • providing an import/export tool and a basic Plone Control Panel UI

Risks

libxml and its python bindings are needed as dependencies and will be needed in platform installers.
UPDATE: Marshall was refactored to use ElementTree.

Progress log

2006-07-19 (jensens)
For the ZUCCARO project we developed XMLForest. XMLForest uses Marshall's ATXMLMarshaller for (de-)serialization of single content objects and builds an IMS Content Package ZIP file from all these snippets, which it can both import and export. XMLForest supports Archetypes References and Relations, including content-references and content-relations. It is UID-aware and supports updates.
2006-08-25 (jensens)
After being bugged by Alex Limi and talking to Phil Auersperg, I created a bundle for this PLIP that includes XMLForest.
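
The packaging approach described in the 2006-07-19 entry (per-object XML snippets collected into a ZIP with a manifest) can be sketched with the standard library. This is not XMLForest's code: the function names and the content/ directory layout are assumptions, and the manifest here is deliberately simplified and not valid against the IMS Content Packaging schema. The one detail taken from the IMS specification is that the manifest file is named imsmanifest.xml and sits at the package root.

```python
import io
import zipfile
from xml.etree import ElementTree as ET


def build_package(snippets):
    """Pack {uid: xml_string} into ZIP bytes with a simplified manifest."""
    manifest = ET.Element("manifest", identifier="export")
    resources = ET.SubElement(manifest, "resources")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for uid in sorted(snippets):
            href = "content/%s.xml" % uid  # hypothetical layout
            ET.SubElement(resources, "resource", identifier=uid, href=href)
            zf.writestr(href, snippets[uid])
        zf.writestr("imsmanifest.xml",
                    ET.tostring(manifest, encoding="unicode"))
    return buf.getvalue()


def read_package(data):
    """Return {uid: xml_string} by following the manifest's resource list."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        manifest = ET.fromstring(zf.read("imsmanifest.xml"))
        return {
            res.get("identifier"): zf.read(res.get("href")).decode("utf-8")
            for res in manifest.iter("resource")
        }


snippets = {
    "uid-1": '<object id="a"><field name="title">A</field></object>',
    "uid-2": '<object id="b"><field name="title">B</field></object>',
}
package = build_package(snippets)
restored = read_package(package)
```

Keying the resources by UID is what would let an importer decide whether an incoming object is new or an update to an existing one.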

Participants

Jens Klein
Phil Auersperg
Gogo Bernhard
Martin Raspe

Further questions

Posted by Louis Wannijn at Nov 09, 2005 08:35 AM
This seems very interesting, I still have a few questions though:

- What version of Plone would you target this for?

- How compatible will it be with previous versions?

- Will there be some kind of migration necessary, or some toying with the products, to make them XMLIO compatible?

Or is it too early to ask these questions?

Otherwise, knowing that it will export not only the objects but also their properties, relations (or references), workflow and path, and that exporting the site also covers the configuration of its tools and properties, I can only rejoice that this is being made.

answers

Posted by Kapil Thangavelu at Nov 09, 2005 10:29 PM
> What version of Plone would you target this for?

The goal is for this to work with Plone 2.2; content import/export (as distinct from tool configuration export) is compatible with Plone 2.1. This is the last version for which compatibility will be attempted.

> Will there be some kind of migration necessary, or some toying with the products, to make them XMLIO compatible?

any well formed archetype based content should work without modifications. however, products which store state outside of archetype fields, and which are not part of the core cmf (workflow/security), will require additional work to export/import that state. probably the most prominent example: atschemaeditorng based products would need explicit work done to serialize their mutable schemas. tools also require explicit handling and registration with the generic setup framework, as is currently being done with plone 2.1 tools.

Uploaded imports?

Posted by Keith Kube at Dec 12, 2005 11:54 PM
While imports are being looked at, I would like an improved upload facility.
Currently, I can export to either the filesystem or my local machine. However, I can only import from the filesystem. I have access to my local machine, but not to the filesystem where my Plone instance resides. It would be very nice to be able to import a file from my local machine.

Ta lots

not really related

Posted by Kapil Thangavelu at Dec 13, 2005 09:05 AM
you're talking about zmi imports, which are pickle based. the fact that you can't import from your local machine is a feature here, as pickle based imports are serious security concerns. however, this doesn't have anything to do with the import/export features under discussion and development, as the goal here is a human and machine readable import/export format that can be modified by common tools.

the content xmlio stuff does allow for local machine uploads and is not pickle based.

The exported file format

Posted by Junyong Pan at Dec 15, 2005 03:34 AM
For the File/Image content, maybe we need to export them into 2 files:

- the raw file/image itself.

  Should content with attachments be exported as separate files too?

- metadata, using xml.

  Also, do we need to separate the xml file into small ones for different namespaces?

I think this plip has some relationship with plip44:

http://plone.org/products/plone/roadmap/44

XMLForest: Generic IMS Content Package Im/Export for all Archetypes based content in Plone

Posted by Jens W. Klein at Mar 16, 2006 06:41 PM
Folks, look at http://plone.org/products/xmlforest - it's exactly what you are all talking about.

No Release of product

Posted by Darryl Latawiec at Sep 22, 2006 07:06 PM
Does this exist as of yet???