XML in Plone with Marshall
Marshall lets you save and load Plone content using XML. As a configurable system, it has lots of options. This hands-on how-to shows exactly what to do to make the basics work.
Background information on the tutorial.
As a CMS, Plone needs to fit in with other information systems. Increasingly, the XML stack is the preferred way to exchange semi-structured content between systems. Also, customers view XML as a future-proof storage format.
Can Plone give XML representations of content types defined in Archetypes? This how-to gives a hands-on treatment of Marshall, a Collective add-on which provides XML saving and loading of your content.
As background, I'm really, really dense. This how-to is written for someone like me, who wants to be told exactly what to do to achieve some initial result.
In this first draft of the how-to, the goal is to get an XML representation of a Page in the fewest possible steps. We'll also show how to create a new Page from a file on disk.
Later installments will show a more flexible configuration, where you can define the kind of thing to be added via an XML element in your file. If Sidnei has enough patience with my questions, I'll add more to this how-to.
Prerequisites and configuration for software.
Plone doesn't yet support XML saving and loading as part of its default setup. We need to add and configure some software.
Marshall has been around a while, so in a sense, it should work with semi-recent versions of software. However, to get the experience described herein, you should use this:
- Zope 2.8.4
- Plone 2.1.2
- Marshall 0.6
You also need libxml2. This is an industrial-strength XML parser with a good Python binding. If you are on Linux or OS X, you already have libxml2, though you might not have the Python binding in the Python you are using for your Zope.
How to find out? Run the Python for your Zope and try to import the libxml2 module.
If the import works, you're golden. If not, you have some compiling to do.
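The check can also be scripted. The sketch below assumes only that the binding installs under the module name libxml2; the helper name has_libxml2 is made up for this example. Run it with the same Python interpreter that runs your Zope, otherwise the answer tells you nothing about Zope's environment.

```python
def has_libxml2():
    """Return True if the libxml2 Python binding can be imported."""
    try:
        import libxml2  # noqa: F401 -- only the import itself matters here
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if has_libxml2():
        print("libxml2 binding found")
    else:
        print("libxml2 binding missing; time to compile")
```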
Once your software is in place, the next step is to configure Zope to provide a WebDAV port. For example, if your ZMI port is 8080, you might want to connect to Plone on port 8880 for WebDAV. In your Zope instance, open etc/zope.conf and uncomment the webdav-server section:
<webdav-source-server>
  # valid keys are "address" and "force-connection-close"
  address 8880
  force-connection-close off
</webdav-source-server>
Make sure you restart your Zope. Next, log into Plone as a Manager.
Click plone setup (top right corner) and install the Marshall product.
Much of the remaining work is in the ZMI. (Yeh, we should provide a configlet for this. If someone teaches me how to make a configlet, I'll do it and maintain it.) Thus, in Plone Setup, click on the link Zope Management Interface.
Archetypes by default uses its own "marshaller" for exporting content. This step points it at Marshall's ATXML marshaller.
The bits are now installed but not configured. We need a way to tell Archetypes to use this XML marshaller when exporting a Page's content. Specifically, we want to add a "predicate" to use the ATXML marshaller.
- In ZMI, at the portal root, click on
- Click on the Add Marshaller Predicate button.
- Fill in the fields and click
- Choose an Id, such as
- Choose a Title, such as My ATXML Predicate
- Set the
- Leave the
- Click the
Getting and Editing XML
Now that the ATXML marshaller is configured, let's work with it.
Good news: you have already reached a point of success! Visit Plone's front-page in a browser and add /manage_FTPget to the end of the URL, for example http://localhost:8080/atxml/front-page/manage_FTPget.
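The same request can be made from a script. This is a minimal sketch in present-day Python 3 running on a client machine, not inside Zope; the function names are invented for the example, and it assumes the page is readable without logging in and that the export is UTF-8. If your site requires authentication for manage_FTPget, the request will fail with an HTTP error.

```python
from urllib.request import urlopen


def ftpget_url(content_url):
    """Append manage_FTPget to a content URL, tolerating a trailing slash."""
    return content_url.rstrip("/") + "/manage_FTPget"


def fetch_xml(content_url):
    """Download the marshalled form of one content item as text.

    Assumes anonymous access and a UTF-8 response.
    """
    with urlopen(ftpget_url(content_url)) as response:
        return response.read().decode("utf-8")
```

Calling fetch_xml("http://localhost:8080/atxml/front-page") against the site used in this how-to should return the same XML the browser shows.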
Ahh, look at all that XML goodness, ain't she beautiful? But can we edit an existing entry? Let's use the cadaver command-line DAV tool and find out:
$ cadaver http://localhost:8880/atxml/
dav:/atxml/> edit front-page
Note the usage of the WebDAV port number!! Depending on your editor settings, the second command will give you an editor such as vi. Change the value of the <dc:title> element, then save and exit the editor. cadaver will send the changes to Plone and unlock the resource.
Re-open the Plone front page in a web browser. You should see your new title. Cool, eh?
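To make the edit itself concrete, here is what changing the title amounts to, sketched with Python's standard ElementTree. The SAMPLE string is a hypothetical, stripped-down stand-in for a real export, which carries many more elements; only the Dublin Core namespace URI is standard. Round-tripping a real export through ElementTree may rewrite namespace prefixes and turn CDATA sections into escaped text, so treat this as an illustration rather than a replacement for the editor.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # standard Dublin Core namespace
ET.register_namespace("dc", DC_NS)  # keep the familiar dc: prefix on output

# Hypothetical, minimal stand-in for an exported Page.
SAMPLE = (
    '<metadata xmlns:dc="%s">'
    "<dc:title>Welcome to Plone</dc:title>"
    "</metadata>" % DC_NS
)


def retitle(xml_text, new_title):
    """Return xml_text with the first dc:title element set to new_title."""
    root = ET.fromstring(xml_text)
    title = next(root.iter("{%s}title" % DC_NS), None)
    if title is None:
        raise ValueError("no dc:title element found")
    title.text = new_title
    return ET.tostring(root, encoding="unicode")
```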
Some notes on this:
- Eagle-eyed observers will note that the body field was, errm, how shall we say this... encoded. In Marshall 0.6, if a field's contents are HTML, Marshall puts the content in a CDATA section. Why? Because we can't promise that the HTML is well-formed XML. Future versions of Marshall might revisit this policy. Note that image and file field contents are currently not serialized.
- cadaver has a helpful readline mode, if your compilation supported it.
- This XML format is in flux! Sidnei and I are discussing how to get more "meaning" out of Plone, in a way that fits the median of expectations.
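The CDATA note above is easy to see in action. The sketch below uses Python's standard XML parser and a made-up field element holding sloppy HTML; it shows the general XML behaviour that motivates the policy, not Marshall's own code.

```python
import xml.etree.ElementTree as ET

# Typical browser-tolerated HTML: the <br> is never closed.
html = "<p>Hello<br>world</p>"

# Inlined as markup, the fragment makes the whole document ill-formed.
try:
    ET.fromstring("<field>%s</field>" % html)
    inline_ok = True
except ET.ParseError:
    inline_ok = False

# Wrapped in a CDATA section, the same characters are plain text
# and come back from the parser untouched.
wrapped = ET.fromstring("<field><![CDATA[%s]]></field>" % html)
```

Here inline_ok ends up False, while wrapped.text is the original HTML string.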
I like using the oXygen editor, as noted above. With oXygen I can open the Plone site, browse to front-page, and edit the content in a real XML editor. I can even download the Relax NG schema for Marshall and validate before saving.
In this next screenshot, I show browsing the contents of the Plone site using oXygen's WebDAV browser:
I double-click to open front-page and tell oXygen that this is an XML document (lack of file extension means it couldn't guess). In the next screenshot, I have the front-page, exported as ATXML, open for editing. I have changed the dc:title and I have also associated the Relax NG schema with this page, thus giving me the inspector on the right:
Finally, I show the schema validation in action. I mistakenly change the id attribute on a field to be xid, which is not allowed in the schema. Note the red underline, the completely accurate warning message in the status bar, and the appearance of the right-hand inspector pane:
Creating New Entries with CTR
Loading a new XML file should create the correct content type. The Content Type Registry helps us.
As shown, editing existing entries was straightforward. Creating new content in Plone based on external XML files is more problematic. Namely, what content type should we use for the new resource?
As this isn't a one-size-fits-all situation, Marshall approaches this with configurability in mind. For mortals like me, choice means confusion. So this first example shows the simplest possible way to make it work, albeit in a clumsy-to-use fashion.
For this example, we will set a policy that any XML file ending in .atxmlpage will be used to create a Page resource. The id of that resource will come from the filename (for now, extension included).
The CMF's Content Type Registry is responsible for policies related to file extensions. This tool can be reached via the ZMI in the portal root under the name "content_type_registry".
- Click on the content_type_registry tool in the ZMI.
- Scroll to the bottom.
- Add a predicate with a name such as atxmlpage, using Extension as the predicate type in the drop-down, and click Add.
- After saving, change the settings. Set the extensions value to atxmlpage and the content type in the drop-down to Page, then click
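Conceptually, the extension predicate configured in the steps above is a lookup from filename suffix to content type. The toy model below is only that, a model: the names are invented, and it says nothing about how the CMF actually stores or evaluates predicates. The type name Document is the one behind Plone's Page.

```python
import os

# Toy registry: extension (without the dot) -> portal type name.
# Mirrors the single predicate from the steps above; not the CMF's data structure.
EXTENSION_RULES = {"atxmlpage": "Document"}


def type_for_filename(filename, rules=EXTENSION_RULES):
    """Return the content type for filename, or None if no rule matches."""
    extension = os.path.splitext(filename)[1].lstrip(".")
    return rules.get(extension)
```

With this table, type_for_filename("mynewxmlpage.atxmlpage") picks Document, and a file with any other extension falls through to None, which in the real registry means some other predicate or default gets to decide.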
The content type registry is now set up. On your local disk, create a file named mynewxmlpage.atxmlpage and give it contents as shown below:
<?xml version="1.0" ?>
My first page from XML
Congratulations! You have successfully installed Plone.
<p>This content came from an XML file on disk.</p>
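If you would rather generate such a file from a script, the sketch below writes one. Be warned that it rests on assumptions this how-to does not confirm: the root element name metadata, the Archetypes namespace URI, and the field id text are guesses at what Marshall 0.6 exports (only the Dublin Core URI is standard). Compare against the manage_FTPget output of an existing Page before relying on any of them. The function names are made up for the example.

```python
from xml.sax.saxutils import escape

# ASSUMPTIONS to verify against a real export: the Archetypes namespace URI,
# the <metadata> root element, and the "text" field id.
AT_NS = "http://plone.org/ns/archetypes/"
DC_NS = "http://purl.org/dc/elements/1.1/"  # standard Dublin Core namespace

TEMPLATE = """<?xml version="1.0" ?>
<metadata xmlns="%(at)s" xmlns:dc="%(dc)s">
  <dc:title>%(title)s</dc:title>
  <dc:description>%(description)s</dc:description>
  <field id="text"><![CDATA[%(body)s]]></field>
</metadata>
"""


def build_page_xml(title, description, body_html):
    """Return an ATXML-style document for a new Page as a string."""
    if "]]>" in body_html:
        raise ValueError("body must not contain the CDATA terminator ']]>'")
    return TEMPLATE % {
        "at": AT_NS,
        "dc": DC_NS,
        "title": escape(title),              # escape &, < and > in plain text
        "description": escape(description),
        "body": body_html,                   # raw HTML, protected by CDATA
    }


def write_page(path, title, description, body_html):
    """Write the document to path as UTF-8."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(build_page_xml(title, description, body_html))
```

Note that the template deliberately contains no field with id="id" and no uid entry.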
Save this file, then return to cadaver. In the top folder of your Plone site, use cadaver's put command to upload the file.
cadaver uploads the file to Plone. The CTR sees the extension and knows to create a Page (Document) using the ATXML marshaller, which reads the XML file for all the initial settings.
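Under the hood, the upload is an HTTP PUT against the WebDAV port, so it can be scripted as well. The sketch below uses Python 3's standard http.client with HTTP basic authentication. The function names, the text/xml content type, and any credentials you pass in are choices made for this example, not something Marshall requires.

```python
import base64
import http.client
import os


def put_request_parts(site_path, filename, user, password):
    """Return (path, headers) for uploading filename into site_path."""
    token = base64.b64encode(("%s:%s" % (user, password)).encode("utf-8"))
    headers = {
        "Authorization": "Basic " + token.decode("ascii"),
        "Content-Type": "text/xml",  # an assumption; the CTR rule keys on the name
    }
    return site_path.rstrip("/") + "/" + filename, headers


def upload(host, port, site_path, local_file, user, password):
    """PUT local_file to the WebDAV port and return the HTTP status code."""
    path, headers = put_request_parts(
        site_path, os.path.basename(local_file), user, password)
    with open(local_file, "rb") as handle:
        body = handle.read()
    connection = http.client.HTTPConnection(host, port)
    try:
        connection.request("PUT", path, body=body, headers=headers)
        return connection.getresponse().status
    finally:
        connection.close()
```

For the site in this how-to, the call would look like upload("localhost", 8880, "/atxml", "mynewxmlpage.atxmlpage", user, password) with a Manager's credentials. A status in the 2xx range, commonly 201 for a newly created resource, is the usual sign that the server accepted the file.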
In Plone you can now go to the URL http://localhost:8080/atxml/mynewxmlpage.atxmlpage and see your new page.
Sure, there are a bunch of caveats to note:
- Be very careful to ensure you don't have a field with id="id" in your upload, nor a uid entry.
- Would be nice if the .atxmlpage extension disappeared.
- If you want to see all the settings, such as dc:subject and workflow state, that you can serialize to XML, go change some things on an existing Page and open it via cadaver. There's lots there! Sidnei's XML format was meant to capture lots of semantics.
- In a potential future addition to this how-to, I'll cover how to use cmf:type in the XML to create different types without the use of crazy file extensions.
What we did and what more we could do.
Plone is a CMS, and a CMS should have good facilities for getting stuff in and out. Plone is especially neat, in that Archetypes lets you define new kinds of semi-structured content types. Those, also, should provide a nice way to talk to the outside world.
Marshall is one approach to doing this. Hopefully this how-to provided enough information to get started.