#93: Optimize Plone for speed

Contents
  1. Motivation
  2. Assumptions
  3. Proposal
  4. Progress log
  5. Participants
by Alexander Limi last modified June 11, 2006 - 00:22
Plone 2.0 had a number of bottlenecks that have mostly been resolved in the current 2.1 branch with the new navigation tree and catalog-based folder listings. However, there are a few other areas that can be improved.
Proposed by
Alexander Limi
Proposal type
Architecture
Repository branch
plip93-optimize-templates
State
completed

Motivation

It's important that Plone becomes faster - although Plone is made to be fronted with Apache/Squid caching, it makes sense to optimize page load speeds for big deployments - and it also makes a difference in the day-to-day development usage.

Assumptions

This proposal mostly concerns itself with optimizing page views for anonymous users. Editing speed matters too, but it is more complex to optimize and is outside the scope of this PLIP, although it should obviously be a focus area in the future.

Proposal

After doing some initial, simple profiling, I identified some attack vectors that would give a lot of speed-up with little effort.

The areas where I think we should focus our attentions are:

  1. The breadcrumbs code
  2. The 'listMetaTags' method
  3. The calendar portlet

Here is some output from 10 page loads as Anonymous User of the front page of a newly created Plone site on my 1.5GHz PowerBook G4 running in debug mode - Plone 2.1 branch (Revision: 6338). I use PTProfiler as my main analysis tool for this.

The table has been edited to show only the main offenders speed-wise. I have highlighted cases of very expensive single calls or with an excessive number of calls to methods.

Expression (partial listing) | Total time | Number of calls | Time per call
Total rendering time | 8.9213 | 10 | 0.89213
python: portal.portal_actions.listFilteredActionsFor(here) | 0.7863 | 10 | 0.07863
path: here/listMetaTags|nothing | 0.4551 | 10 | 0.04551
python: portal.breadcrumbs(here) | 0.2081 | 10 | 0.02081
python: here.getBeginAndEndTimes(day=daynumber, month=month, year=year) | 0.1694 | 60 | 0.00282
path: day/event | 0.0672 | 700 | 0.0001
python: current.year()==year and current.month()==month and current.day()==int(daynumber) | 0.0558 | 580 | 0.0001
python: here.portal_url() + '/search?review_state=published &start.query:record:date=%s&start.range:record=max &end.query:record:date=%s &end.range:record=min'%(pss.url_quote(begin), pss.url_quote(end)) | 0.0455 | 60 | 0.00076
path: day/day | 0.0384 | 350 | 0.00011
python: '%d%0.2d%0.2d' % (year, month, daynumber) | 0.0258 | 350 | 0.00007
python: test(current.year()==year and current.month()==month and current.day()==int(daynumber), 'todayevent', 'event') | 0.0186 | 60 | 0.00031
python: DateTime(begEndTimes[0].timeTime()+86400).ISO() | 0.0184 | 60 | 0.00031

As you can see, the main offender speed-wise is the listFilteredActionsFor method - although there's not much we can do about this, since it's a CMF construct. Initial testing with CMF 1.5 (which has lazy action evaluation) didn't show any improvement here - so we'll skip this as a target for our optimizations.
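To put the table in proportion, here is a small sketch that turns the totals into shares of the overall rendering time. The numbers are copied from the table above; the short dictionary keys and the helper name share_of_total are made up for this illustration.

```python
# Totals in seconds over 10 page loads, copied from the profile table above.
total_rendering_time = 8.9213
timings = {
    "listFilteredActionsFor": 0.7863,
    "listMetaTags": 0.4551,
    "breadcrumbs": 0.2081,
    "getBeginAndEndTimes": 0.1694,
}


def share_of_total(seconds, total_seconds):
    """Return the fraction of the total rendering time spent in one expression."""
    return seconds / total_seconds


shares = {name: share_of_total(t, total_rendering_time) for name, t in timings.items()}

# Print the expressions from most to least expensive.
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print("%-24s %5.1f%%" % (name, share * 100))
```

Even the most expensive single expression accounts for less than a tenth of the total, which fits the idea of knocking down several small targets rather than hunting for one big one.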

I have grouped some of the methods from the calendar portlet at the bottom of the table, and given them an alternate background color. As you see, there are a number of calls here that are done an excessive amount of times.

Our main targets and some comments about each one:

Breadcrumbs
This code is essentially trying to do the same as the nav tree - just in a flat, depth-only way. It has lots of exception handling, "clever" and unnecessary code, and things that are total overkill for a breadcrumb implementation. If people want to support all the special cases, that's fine - but the default implementation should not. The fact that breadcrumb.py is 166 (!) lines long with permission checks and multiple conditional branches should be a good indicator of this.
My suggestion: See if we can re-use code from the nav tree implementation and make it more efficient, and make it a bit stupider if necessary.
listMetaTags method
I have no idea why this thing is so expensive, but it is. Tiran recently moved it to a tool to see if running it in unrestricted code would make it faster, but it only made a marginal difference, easily attributable to testing variations.
Calendar portlet
This thing is a chapter in itself. It has an incredible amount of calls (I have only included the most exceptional ones, there are lots of others), and does multiple tal:defines inside tal:repeats, among other things.
My suggestion: This code should be rewritten. It's currently building the table for the calendar in a very inefficient way, and we should also remove the pop-up divs and let it use the HTML title attribute instead, like the rest of Plone. This is also better for accessibility, and will remove half (well, almost ;) of the excessive white space in the Plone HTML output.

Progress log

April 3rd, 2005, limi:

Some interesting numbers from Plone 2.0.5, mainly showing that listMetaTags is less expensive here, and also that listFilteredActionsFor takes less time here (what is adding a lot of actions in 2.1, and how can we minimize the impact?):

Expression | Total time | Number of calls | Time per call
Total rendering time | 8.3012 | 10 | 0.83012
python: portal.portal_actions.listFilteredActionsFor(here) | 0.5233 | 10 | 0.05233
python: here.plone_utils.createNavigationTreeBuilder(portalObject,navBatchStart) | 0.4885 | 10 | 0.04885
path: here/getAllowedTypes | 0.4598 | 10 | 0.04598
python: here.CookedBody(stx_level=2) | 0.3094 | 10 | 0.03094
path: here/listMetaTags|nothing | 0.1162 | 10 | 0.01162
path: day/event | 0.1153 | 1050 | 0.00011

On a positive note, we see that getAllowedTypes takes up a lot of time in 2.0.5, and that it has been totally eliminated from the 2.1 anonymous view. Also eliminated is the nav tree cost, which is negligible in 2.1. All in all, we've eliminated about 1 second on the 10 page loads with the new stuff in 2.1, but something else is bogging us down.

Unfortunately, something is sucking up the CPU time we won with the optimizations. The AT-based types?

April 3rd, 2005, limi:

Investigated the listMetaTags part after Tiran moved it to unrestricted code - it is a bit faster, but not a lot. There are a lot of crazy checks and conditionals going on to produce DC.* tags that none of the web crawlers use, and that most interpret as line noise.

My proposal is to introduce a switch in site_properties called exposeDCMetaTags that is off by default, since no search engines or crawlers use it, and let the 3 people in the world (of which 2 are librarians and the last one flunked librarian school ;) turn it on with a performance penalty if they need it.

We only want meta name="description" in Plone by default, as this is the only one used by search engines - even keywords are of questionable usefulness.
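As a rough illustration of the proposed switch, the sketch below returns only the description tag unless Dublin Core output is explicitly turned on. It is not the actual listMetaTags code: the function name, the plain dictionary standing in for a content object, and the list of DC fields are all assumptions made for this example.

```python
def list_meta_tags(content, expose_dc_meta_tags=False):
    """Return (name, value) pairs for the page head.

    `content` is a plain dict standing in for a content object (assumption);
    `expose_dc_meta_tags` mirrors the proposed exposeDCMetaTags property.
    """
    tags = []
    description = content.get("description")
    if description:
        tags.append(("description", description))

    if not expose_dc_meta_tags:
        # Cheap default path: skip all Dublin Core work.
        return tags

    # Opt-in path: emit DC.* tags for whichever fields are present.
    # The field list is illustrative only.
    for field in ("title", "creator", "date", "language"):
        value = content.get(field)
        if value:
            tags.append(("DC.%s" % field, value))
    return tags


page = {"title": "Welcome", "description": "Front page", "creator": "admin"}
print(list_meta_tags(page))
print(list_meta_tags(page, expose_dc_meta_tags=True))
```

The point of the early return is that the default path never touches the expensive checks at all, so sites that do not need DC.* tags pay nothing for them.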

April 4th, 2005, limi:

listMetaTags - before: 0.45s - after: 0.09s

Yay for Alec! His implementation of my suggestion slices the time to 1/5 of the previous usage and introduces the DC metadata switch. Next target is breadcrumbs. Tesdal has added breadcrumb support to ExtendedPathIndex, so it should be possible to make it significantly cheaper than it is now.

April 5th, 2005, limi:

breadcrumbs - before: 0.20s - after: 0.06s

Nice. Using Helge Tesdal's new ExtendedPathIndex (that also powers the new nav tree and the site map), Alec Mitchell implemented a version of the breadcrumbs that cuts rendering time to a third of the original.

April 6th, 2005, limi:

listFilteredActionsFor - before: 0.78s - after: 0.56s

A simple but effective speedup was to remove an unnecessary loop in listFilteredActionsFor. It cuts the time by almost a third, and is more effective the more actions you have, so it should really make a difference if you are logged in too. This is the final change we're doing on the branch, merge time!

May 25th, 2005, limi:

We had an interesting use case at a client site where they had 113(!) content types. This led us to a quite interesting discovery: listTypeInfo in CMF is extremely expensive when you have a lot of types.

Alec stepped up (as usual ;) and helped out. There is now a new method override in Plone's type tool that gets rid of the madness and uses a much more light-weight method to do the exact same job.

The result? listTypeInfo - before: 1.20s - after: 0.40s

Participants

Alexander Limi
Alec Mitchell

maybe a small solution to improve speed...

Posted by Andreas Jung at April 3, 2005 - 17:28

I've made some benchmarks on listFilteredActionsFor().

60-70% of the time is spent here:

    # Include actions from specific tools.
    for provider_name in self.listActionProviders():
        provider = getattr(self, provider_name)
        self._listActions(append, provider, info, ec)

This code asks every single action provider for a list of matching actions. As expected most of the time is used to evaluate the conditions for every single action, security checks etc.

We can't optimize the checks for the action conditions. It might be possible to speed up the check for the action permissions. I assume that most actions share a small number of permissions, e.g. View, Manage Portal or so. Instead of checking the same permission over and over again against the context object, one could build up a cache that stores the result of each permission check for the context object within one request. This could speed up things a bit...
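The caching idea could look roughly like the sketch below. It is only an illustration of the technique: the class name PermissionCache and the check_permission callable are hypothetical stand-ins, not CMF or Zope APIs, and the cache object is assumed to be created once per request and thrown away afterwards.

```python
class PermissionCache:
    """Memoize permission checks for the lifetime of one request (sketch)."""

    def __init__(self, check_permission):
        # check_permission(permission, context) -> bool is supplied by the
        # caller; it stands in for the real, expensive security check.
        self._check = check_permission
        self._results = {}

    def allowed(self, permission, context):
        # id(context) is good enough as a key while the context object stays
        # alive, which holds within a single request.
        key = (permission, id(context))
        if key not in self._results:
            self._results[key] = bool(self._check(permission, context))
        return self._results[key]


# Usage: four lookups, but only two real checks.
calls = []


def slow_check(permission, context):
    calls.append(permission)
    return permission == "View"


cache = PermissionCache(slow_check)
context = object()
results = [cache.allowed(p, context) for p in ("View", "View", "Manage portal", "View")]
print(results, len(calls))
```

Whether this pays off depends on how much of the time really goes into repeated permission checks rather than into evaluating action conditions.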

We tried this, but it didn't affect the speed

Posted by Alexander Limi at April 4, 2005 - 05:39

It didn't make a big difference - small enough to be just an artifact of slightly different conditions.

a few quick fixes

Posted by Geoff Davis at April 3, 2005 - 21:04

These two snippets probably appear in the calendar:

python: current.year()==year and current.month()==month and current.day()==int(daynumber)

python: test(current.year()==year and current.month()==month and current.day()==int(daynumber), 'todayevent', 'event')

The calendar is iterating over all days in the current month. The condition will fail about 97% of the time, but the way it is written, it won't fail until the third test. If you reverse the order of the tests, i.e. check current.day() == int(daynumber) first, you will reduce the computation required by about 2/3. It would be even smarter to define variables current_year = current.year(), etc., and int_day_number = int(daynumber), and pull them out of whatever loop is being iterated over.
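Both suggestions can be sketched in plain Python as follows. This is not the portlet code: it uses the standard library's datetime.date (whose year, month and day are attributes) in place of Zope's DateTime, and the function name is_today_flags is made up for the example.

```python
import datetime


def is_today_flags(day_numbers, year, month, current):
    """Return one boolean per calendar cell: True only for today's cell."""
    # Hoist the loop-invariant work out of the loop.
    current_day = current.day
    same_month = current.year == year and current.month == month

    flags = []
    for daynumber in day_numbers:
        # Test the day first: it is the comparison that fails for nearly
        # every cell, so the rest of the expression is rarely evaluated.
        flags.append(int(daynumber) == current_day and same_month)
    return flags


today = datetime.date(2005, 4, 3)
print(is_today_flags(["1", "2", "3", "4"], 2005, 4, today))
```

With the year and month comparison computed once, each iteration is left with a single integer conversion and comparison.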

listFilteredActionsFor and worklist action

Posted by Helge Tesdal at April 3, 2005 - 21:22

I believe the workflow tool does a catalog search for each worklist when retrieving the global workflow actions. Maybe there should be a setting somewhere indicating if the global workflow actions should be included.

Shouldn't matter here

Posted by Alexander Limi at April 4, 2005 - 05:38

This benchmark is for anonymous users, and indeed it made no perceptible difference when the code was removed.

That being said, if the worklists have no purpose in the actions, they should go away.

We should do some profiling for logged-in users with review queue, editing a document etc - but I was trying to make these targets small and easy to knock down in time for 2.1.

pts

Posted by Lalo Martins at April 4, 2005 - 01:45

is it possible to assess how much the PTS is responsible for any slowness?

Yes, but not from inside PTProfiler

Posted by Alexander Limi at April 4, 2005 - 05:33

You'll have to use the Python profiler or similar for that. I'm not worthy. ;)

I guess a simple benchmark would be to remove PTS from a Plone install and benchmark it before/after.

Schwartzian Transform on sorts

Posted by Helge Tesdal at April 4, 2005 - 12:31

This might only be noticeable in bigger sites, but still.

We can use the Schwartzian Transform when sorting. That is, precompute the keys and put them in a list of tuples, instead of having the sort look up the attributes of the objects for every comparison.

    unsorted = [object1, object2]
    sortlist = [(o.sortkey, o) for o in unsorted]
    sortlist.sort()
    sorted = [x[-1] for x in sortlist]

sortlist becomes a list of tuples, like [(sortkey1, object1), (sortkey2, object2)]

This was mentioned by the Reflab guys some time ago, and they did some profiling indicating noticeable difference in bigger lists.
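A slightly more defensive variant of the snippet above, as a sketch: when two keys are equal, a plain list of (key, object) tuples falls back to comparing the objects themselves, so this version adds the list index as a tie-breaker. The Item class and the function name sort_by_key are invented for the example.

```python
import operator


class Item:
    """Minimal stand-in for a content object with a sortable attribute."""

    def __init__(self, sortkey, name):
        self.sortkey = sortkey
        self.name = name


def sort_by_key(objects):
    # Decorate: compute each key once. The index breaks ties, so the
    # objects themselves are never compared and the sort stays stable.
    decorated = [(obj.sortkey, index, obj) for index, obj in enumerate(objects)]
    decorated.sort()
    # Undecorate: keep only the original objects.
    return [entry[-1] for entry in decorated]


items = [Item(2, "b"), Item(1, "a"), Item(2, "c")]
print([item.name for item in sort_by_key(items)])

# In Python versions that support the key argument, sorted() does the same
# decoration internally.
print([item.name for item in sorted(items, key=operator.attrgetter("sortkey"))])
```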

Schwartzian transforms

Posted by Alec Mitchell at April 4, 2005 - 13:29

I love Schwartzian transforms and try to use them everywhere they make sense. (Un)fortunately, I don't see very many unoptimized sorts in CMFPlone. The few that are around are in Python scripts and tend to operate on small lists (allowedTypes, availableLanguages, RoleMap, worklists, configlets). Not much to be gained here, I fear.

Faster calendar

Posted by Gilles Lenfant at April 6, 2005 - 23:15

Hi,

I spent a couple of hours splitting the calendar portlet into 2 parts:

  • A Python script that builds the data
  • A new calendar portlet with only simple path expressions to render that data (no complex nested expressions)

On my development box :

  • standard calendar portlet with 4 events in current month is built in 71 ms
  • the faster calendar portlet with same events is built in 49 ms (saving about 30%)

This is only a direct translation to Python of the logic found in the standard portlet. Things could certainly be faster with a better optimized code.
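The general shape of such a split might look like the sketch below, where Python builds a plain nested structure and the template would only need simple lookups such as day/day and day/event. This is not Gilles's code: the function name build_month_rows and the event_days argument are assumptions, and the standard calendar module stands in for the calendar tool.

```python
import calendar


def build_month_rows(year, month, event_days):
    """Return a list of weeks; each week is a list of seven cell dicts.

    `event_days` is a set of day numbers that have events (hypothetical
    input; the real portlet would get this from the calendar tool).
    """
    rows = []
    for week in calendar.monthcalendar(year, month):
        row = []
        for day in week:
            # monthcalendar() uses 0 for cells that fall outside the month.
            row.append({
                "day": day or None,
                "event": day in event_days,
                "datestring": "%d%0.2d%0.2d" % (year, month, day) if day else "",
            })
        rows.append(row)
    return rows


for week in build_month_rows(2005, 4, {6}):
    print([cell["day"] for cell in week])
```

All per-day values are computed once in Python, so the template does not have to repeat tal:define statements inside its tal:repeat loops.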

This is poorly tested but behaves exactly as the standard portlet.

Want the code? Where should I post it (I'm not a Plone committer :o)?

Cheers

Metadata

Posted by Eric W. Brown at August 20, 2005 - 14:16

Quoting from above:

My proposal is to introduce a switch in site_properties called exposeDCMetaTags that is off by default, since no search engines or crawlers use it, and let the 3 people in the world (of which 2 are librarians and the last one flunked librarian school ;) turn it on with a performance penalty if they need it.

We only want meta name="description" in Plone by default, as this is the only one used by search engines - even keywords are of questionable usefulness.

There's actually no reason to dis the librarians in the name of performance since one can have the best of both worlds. There is a relatively standard way of handling Dublin Core metadata that makes it available to tools that want it without making the main pages load significantly more slowly. It's done by making an RDF of metadata available, but only linking to it from the source XHTML. Thus humans browsing don't spend any time waiting for the invisible metadata to be generated / downloaded, but savvy search engines (and yes, some are already deliberately scooping up external metadata -- you can check your own logs once you've implemented it) and Semantic Web tools (as well as presumably librarians) can specifically request it. Plus, this technique is open to easy future expansion as it's not at all restricted to Dublin Core metadata. PRISM and FOAF metadata are currently found in such RDF stores in the wild now, too.

The Dublin Core site itself makes use of this technique (although they still embed the data in the XHTML, too, even though that's overkill -- of course, being the Dublin Core site it makes sense for them to really show off Dublin Core metadata).

We've had a working dynamic metadata generator in place at Saugus.net for quite a long time too, built using only raw Zope CMF and not requiring anything from Plone. I'm sure it could very easily be ported to Plone, though. The general idea is to make a page template called metadata.rdf that can be accessed as a method of any object in the tree, dynamically load it up with Dublin Core goodness, and add a single line like:

to the standard head section macro. It's fast and easy and doesn't sacrifice any functionality for tools that do make use of metadata. We're still using CMF 1.4, but it should be a trivial port to 1.5. I should check on getting it added into the base CMF.
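To give an idea of what such a separate metadata document might contain, here is a minimal Python sketch that serializes a few Dublin Core fields as RDF/XML. It is not the Saugus.net page template: the function name metadata_rdf, the example URL and the field values are invented, and only the two namespace URIs are the standard RDF and Dublin Core ones.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"


def metadata_rdf(url, fields):
    """Return an RDF/XML string describing `url` with Dublin Core fields.

    `fields` maps Dublin Core element names (e.g. "title") to text values.
    """
    ET.register_namespace("rdf", RDF_NS)
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("{%s}RDF" % RDF_NS)
    description = ET.SubElement(
        root, "{%s}Description" % RDF_NS, {"{%s}about" % RDF_NS: url}
    )
    for name, value in fields.items():
        ET.SubElement(description, "{%s}%s" % (DC_NS, name)).text = value
    return ET.tostring(root, encoding="unicode")


print(metadata_rdf("http://example.org/front-page",
                   {"title": "Welcome", "creator": "admin"}))
```

Because the document is generated only when something requests it, ordinary page views never pay for building the metadata.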

