
#154: Large file handling

Contents
  1. Motivation
  2. Proposal
by Martin Aspeli last modified June 11, 2006 - 00:20
It is possible to configure Zope to work with very large files, but the out-of-the-box story is not terribly great. It should be obvious how to configure Plone so that it can handle large volumes of MS Office, PDF or media files, for example.
Proposed by
Martin Aspeli
Seconded by
Martijn Pieters
Proposal type
Architecture
State
being-discussed

Motivation

In many ways, Plone is well-suited to document management and the management of files in general. Tools such as ExternalEditor and Enfold Desktop make this even more true. However, due to the way the ZODB works, large files can be problematic to work with: if you're not careful, your ZODB could balloon, because each change to a file writes a new revision of the whole object.

There are solutions to this problem, which usually involve storing some content outside the ZODB. However, the out-of-the-box story in Plone isn't good enough. It needs to be clear how to set up a site to support large files, and as far as possible this should work transparently whether enabled or disabled.

Proposal

There are several possible solutions. The first step is to explore them and find workable alternatives. In doing this evaluation, a few principles are important to keep in mind:

Transparency
It needs to be easy to turn large file optimisations on and off, and it should work seamlessly with existing products and tools.
Performance
Dealing with large files is typically a performance problem. Loading a 1 GB video file into memory every now and then is not acceptable!
Ease of set-up
It should be easy and obvious how large file optimisations are enabled, and what implications any configuration changes have.

There are four main avenues of exploration:

  1. ZODB BLOB support. Christian Theune and Chris McDonough have presented work that should make the ZODB itself better able to deal with BLOBs (binary large objects). This work is presently on hold pending resources. This is a very low-level, but potentially highly transparent solution. It is unknown when it may be possible to land this work in an actual ZODB release that Plone could depend on.
  2. External storage. Archetypes has the option of defining storages for fields. Two storages exist that place the contents of files on the filesystem instead of in the ZODB. Unfortunately, both of these have difficulties with use cases where content is moved or deleted, and are very Archetypes-specific.
  3. Specific content types. CMFExtFile, PloneExFile and ATManagedFile are three content types that handle external storage of files on the filesystem. Again, these are not as transparent as would perhaps be desirable, because they require the use of specific content types, but they appear to work well.
  4. Tramline. Tramline is a product from Infrae that uses a mod_python plug-in to Apache to intercept file upload and download requests, storing the files on the filesystem where Apache is running instead of passing the whole file to Zope. PloneTramline provides a prototype for how this may be integrated into Archetypes, and attramline provides a special Archetypes field and storage for Tramline integration. Tramline integration is a very interesting approach, and should work transparently when Apache/Tramline is not present. However, some more work is required around issues such as indexing of file content and making it sufficiently easy and transparent to configure this particular setup.
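
The idea common to avenues 2 to 4 (keep the payload on the filesystem, keep only a small reference in the database, and never hold the whole file in memory) can be sketched roughly as follows. This is a minimal illustration in plain Python, not code from Archetypes, Tramline or any of the products named above; the class FilesystemStore, its methods and the 64 KiB chunk size are all hypothetical names and values chosen for the example.

```python
import os
import tempfile
import uuid

CHUNK_SIZE = 64 * 1024  # assumed chunk size: copy in 64 KiB pieces


class FilesystemStore:
    """Hypothetical store: payloads live on disk, callers keep only a key."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)

    def _path(self, key):
        return os.path.join(self.root, key)

    def put(self, stream):
        """Copy a file-like object to disk chunk by chunk.

        Returns (key, size). The key is an opaque ID, so it is the only
        thing the content object would need to persist.
        """
        size = 0
        # Write to a temporary file first so a failed upload never
        # leaves a half-written file under a valid key.
        fd, tmp_path = tempfile.mkstemp(dir=self.root)
        with os.fdopen(fd, "wb") as out:
            while True:
                chunk = stream.read(CHUNK_SIZE)
                if not chunk:
                    break
                out.write(chunk)
                size += len(chunk)
        key = uuid.uuid4().hex
        os.replace(tmp_path, self._path(key))
        return key, size

    def iter_chunks(self, key):
        """Yield the stored payload in pieces, suitable for streaming."""
        with open(self._path(key), "rb") as src:
            while True:
                chunk = src.read(CHUNK_SIZE)
                if not chunk:
                    break
                yield chunk

    def delete(self, key):
        """Remove the payload; the caller must invoke this on content delete."""
        os.remove(self._path(key))
```

Because the key in this sketch is independent of the content object's path, moving the object would not touch the file on disk. Deletion is the harder case: something still has to call delete() when the content goes away, which is the kind of bookkeeping the existing storages are said to struggle with.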

The goal for this PLIP would be to find a solution or combination of options that work well enough to be recommended as an out-of-the-box, documented solution. In particular, exploring Tramline use cases and getting transparent Tramline support into Archetypes (and Zope3/Plone in general) is envisaged as an important route to explore.

Please also see PLIP 155.

Planned for 3.5

Posted by Alexander Limi at February 22, 2007 - 16:19
Just to have a note that reflects our current "unofficial consensus": Zope 2.11 has BLOB support now, and we're aiming to let that be the way to solve this in Plone 3.5.

FlexStorage

Posted by Justin Ryan at March 20, 2007 - 07:03
Howdy..

A while back it came to my attention that AttachmentField grew a tool called FlexStorage which allows one to flip a switch and toggle all of the attachment fields in a site between storages, including moving the content around. The code for FlexStorage allows it to support multiple storages, though I haven't had a chance to really try it.

I did spend some time trying to split this out some time ago but could use a hand. The presence of this sort of tool in Plone could really help to ease the debate on what sort of storage to use. There are and have been many great storages, and BLOB storage in Zope 2.11 and on should be great, but people may come up with better ideas.

I think if we're going to prepare Plone to handle this OOTB it's worth the trouble to keep from tying to a single implementation.

This work which I semi-abandoned a while back can be found at:

http://svn.plone.org/svn/collective/FlexStorage

Cheers..
