Optimizing Plone Performance
This tutorial will show you a simple and effective way to use caching to make your Plone site a production-worthy setup capable of delivering in excess of 100 pages per second given proper hardware. (In progress)
Introduction, goals and credits
What this tutorial does and does not cover, and who's responsible for it.
In this tutorial, we will show you how to cache your Plone site so that it is responsive and fast for anonymous users visiting your site.
The strategy presented is a simple and efficient one, but has some caveats, and is not the magic silver bullet to solve any performance problem you may have. Very often, performance problems in Plone sites are caused by improper setup of Plone or the server it is running on, or badly written third-party products. It's important to know that there are more aspects that have to be taken into consideration when optimizing the performance of a Plone site.
This tutorial aims to do the following:
- Set up a web server in front of Plone that can cache elements that are marked as cacheable, so that they don't hit the Plone instance at all but are served by the web server cache in front of it. This is especially efficient for assets like files, CSS and JavaScript, but you can also apply it to entire pages if you need to serve up pages extremely fast.
- Set up a different virtual host with no caching for editor usage.
Credits
The Apache 1.3 setup was originally written up by Seb Potter, and was adapted to the updated Plone and Zope versions by Alexander Limi.
Common elements to caching
A brief explanation about what we are going to set up.
Plone is a very complex system. Whereas a flat HTML site might only take a hundredth of a second to load from a server, Plone's main page is fully dynamically rendered for each and every request. This might mean it takes up to 3 or 4 seconds to load a particularly expensive page if you have a slow server and no caching.
Add to this the fact that the page, including JavaScript and CSS, exceeds 100 KB in size and can take more than 30 seconds to download, and you have a pretty slow experience. Obviously, if you've got more than a couple of users on your site at once, it's going to be unusable.
That's the bad news.
However, it's not all bad. The Zope application which underlies
Plone was never meant to be a web server. Fortunately, we can use
Apache as a front-end web server to handle all of the tedious
connections from the web through a process called reverse
proxying.
Setting up a reverse proxy is pretty easy. First, configure your Zope so that it's not running on port 80. By default, Zope runs on port 8080, but some installers and distributions use other ports, so substitute whatever port your instance is running on for 8080. The important thing is that it should not be running on port 80, which is where we'll put the web server that sits in front of Plone.
Test this by accessing your site at http://yoursite.com:8080.
For the proxying to work, you need to add a Virtual Host Monster to the
root folder of your Zope installation. Call it anything you want, but
it must be added to the root of your Zope, and not the root
of your Plone site.
Plone 2.0 only: If you are running Plone 2.0.x, you will also need to edit the portal_skins/plone_templates/global_cache_settings
template to prevent Plone from sending out a Pragma: no-cache
HTTP header. By default, Plone was set up to disable all HTTP caching to ease
development.
Simply locate the template in your Plone site through the ZMI, and customise it into your custom skin folder. Now, edit it so that the contents are::
You are now ready to set up the web server.
Caching setup using Apache 1.3
How to make the server ready to cache - using Apache 1.3.
Make sure you have an installed Apache web server. This particular setup is based on Apache 1.3. For other web servers, see the next pages in this tutorial.
Make sure Apache is running on port 80 by checking that you can see the default Apache page at http://site.com.
We need to edit your Apache configuration file (normally located at /etc/httpd/conf/httpd.conf - use locate httpd.conf if you don't know where the configuration file is), and there
are some things you'll need to check:
- Ensure that mod_proxy is enabled. Search the list of loaded modules for
  libproxy.so, and ensure that the following lines are not commented out::

    LoadModule proxy_module modules/libproxy.so
    AddModule mod_proxy.c
- Set up a VirtualHost for your domain name that uses a
  reverse-proxy in order to pass through requests to
  Zope. The following example can be cut and pasted into
  your configuration file at the bottom, and you'll only
  need to edit a couple of lines::

    <VirtualHost *>
        # A sample VirtualHost section for using Apache as a webserver
        # instead of Zope.
        #
        # ServerName is the url of your website.
        ServerName site.com

        # Add ServerAlias lines for other domain names that should
        # point to this website. Because the VirtualHostBase below
        # tells Zope that the site lives at site.com, the links Zope
        # generates point to site.com, so anyone arriving at
        # www.site.com ends up on site.com as they browse.
        ServerAlias www.site.com

        # ServerAdmin is your email address, which shows up on error
        # pages when Apache cannot connect to Zope.
        ServerAdmin webmaster@site.com

        # The ProxyPass and ProxyPassReverse lines are the magic
        # ingredients. They rewrite requests to http://site.com and
        # pass the entire request through to Zope on
        # http://site.com:8080. The VirtualHostBase ensures that
        # when the page goes back to the browser, it goes out through
        # Apache, and appears to have come from http://site.com.
        # The line is made up from:
        #   ProxyPass or ProxyPassReverse
        #   / is the url at http://site.com that you wish to use to
        #     point to the Zope site. You could keep http://site.com as a
        #     flat HTML site in Apache, and replace / with /zope to make
        #     http://site.com/zope point to your zope site.
        #   http://site.com:8080 is the address that your zope is
        #     running on.
        #   /VirtualHostBase/http/site.com:80 makes sure that zope
        #     *thinks* it is running at http://site.com instead of at
        #     http://site.com:8080. You don't have to do anything else
        #     in Zope to make this work.
        #   /plonesite is the location of your Plone Site within Zope.
        #     If you added a Plone Site into the root of your Zope with an id
        #     of 'mysite', then you just change this bit to /mysite
        #   /VirtualHostRoot/ makes your Plone site think it is the root of the site.
        ProxyPass / http://site.com:8080/VirtualHostBase/http/site.com:80/plonesite/VirtualHostRoot/
        ProxyPassReverse / http://site.com:8080/VirtualHostBase/http/site.com:80/plonesite/VirtualHostRoot/
    </VirtualHost>
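The way the proxy target is assembled from its parts can be sketched in a few lines of Python. This is only an illustration of the URL layout described in the comments above; the function names `vhm_proxy_target` and `proxy_directives` are made up for this example and are not part of Zope, Plone or Apache.

```python
def vhm_proxy_target(backend_host, backend_port, public_host, plone_path,
                     public_scheme="http", public_port=80):
    """Build the backend URL used by ProxyPass and ProxyPassReverse.

    Hypothetical helper: it only concatenates the pieces described in the
    VirtualHost comments (backend address, VirtualHostBase, the path of the
    Plone site inside Zope, VirtualHostRoot).
    """
    plone_path = plone_path.strip("/")
    return (
        f"http://{backend_host}:{backend_port}"
        f"/VirtualHostBase/{public_scheme}/{public_host}:{public_port}"
        f"/{plone_path}/VirtualHostRoot/"
    )


def proxy_directives(mount_point, target):
    """Return the two Apache directive lines for one mount point."""
    return [
        f"ProxyPass {mount_point} {target}",
        f"ProxyPassReverse {mount_point} {target}",
    ]


# Values taken from the sample VirtualHost section.
target = vhm_proxy_target("site.com", 8080, "site.com", "plonesite")
for line in proxy_directives("/", target):
    print(line)
```

Generating the lines this way is mostly useful when you have several sites to configure and want to avoid typos in the long URLs.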
Now restart Apache, and you should find that http://site.com is
now your Plone website. (If you have any problems, make sure that
libproxy.so is present in your /etc/httpd/modules/ directory.)
Now you've got a fairly respectable setup: Apache is serving web requests, and Zope is a backend server. As a bonus, you also have a complete virtual hosting setup, so you can run multiple different sites with multiple different domains on a single server with a single IP address. All you need to do is duplicate and edit that VirtualHost section for each site.
You might want to make a note of how fast this setup is running at this point. Note that we haven't gained any real speed advantage yet; we've just laid the foundations for it. In a terminal, use the Apache Benchmark application to test the speed of your site. The application is normally in '/usr/sbin/ab':
/usr/sbin/ab -n 100 http://yoursite.com/
This will time 100 consecutive requests to your server. You should note that requests all take about 0.5 to 1.0 seconds, with not a great deal of variance. Whilst this might not seem too bad for a dynamic page, remember that this is just the HTML. For a whole page with CSS, Javascript, and images, the processing time can be a lot longer.
Typical output might look like this:
    Benchmarking site.com (be patient).....done
    Server Software:        Zope
    Server Hostname:        site.com
    Server Port:            80
    Document Path:          /
    Document Length:        32560 bytes
    Concurrency Level:      1
    Time taken for tests:   68.901 seconds
    Complete requests:      100
    Failed requests:        0
    Broken pipe errors:     0
    Total transferred:      3293500 bytes
    HTML transferred:       3256000 bytes
    Requests per second:    1.45 [#/sec] (mean)
    Time per request:       689.01 [ms] (mean)
    Time per request:       689.01 [ms] (mean, across all concurrent requests)
    Transfer rate:          47.80 [Kbytes/sec] received
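If you want to compare runs before and after enabling caching, it helps to know how the summary figures relate to each other. The following Python sketch recomputes them from the raw numbers in the sample output. The function name is made up for this example, and dividing by 1000 for the transfer rate is an assumption chosen because it reproduces the sample figure.

```python
def benchmark_summary(total_seconds, requests, total_bytes):
    """Recompute the headline figures of an ab run from its raw totals."""
    requests_per_second = requests / total_seconds
    ms_per_request = total_seconds / requests * 1000
    # Assumption: 1 Kbyte = 1000 bytes, which matches the sample output.
    kbytes_per_second = total_bytes / total_seconds / 1000
    return requests_per_second, ms_per_request, kbytes_per_second


# Raw totals copied from the sample output.
rps, ms, kbps = benchmark_summary(68.901, 100, 3293500)
print(f"{rps:.2f} requests/sec, {ms:.2f} ms/request, {kbps:.2f} Kbytes/sec")
```

With a concurrency level of 1, the time per request is simply the total time divided by the number of requests, which is why both "Time per request" lines show the same value.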
So, what can we do to make it faster?
The first thing we can do is to allow Apache to cache the results of
pages. This can happen because in configuring Apache to be the
front-end server, we've actually created a caching reverse-proxy,
meaning that all of the pages that Zope produces are now going back out
to the browser through Apache, and can be cached to dramatically
increase performance. It's just that we haven't told Apache to cache
anything yet.
To let Apache know that we wish to cache content with a certain
expiry time, mod_expires must be installed. Check your Apache modules
directory (normally /etc/httpd/modules) for
mod_expires.so, and then make sure that you have the following lines in
your Apache configuration file:
    LoadModule expires_module modules/mod_expires.so
    AddModule mod_expires.c
At the end of your VirtualHost section, just before the </VirtualHost>
add the following lines:
    # CacheRoot is the location on the filesystem to store files that
    # Apache caches. This directory must be created, and the user that
    # Apache runs as must have full write permissions to it.
    # It's a bad idea to create this in the /tmp directory, as the
    # directory itself will then be deleted when you reboot.
    CacheRoot "/var/cache/site.com"

    # CacheSize determines how big this cache can get in KB. It's a
    # good idea that this number is about 30% less than the available
    # space in the CacheRoot directory. Here we choose to cache 10MB
    # of data, which is enough for a personal website, but not for
    # anything larger.
    CacheSize 10000

    # CacheGcInterval specifies how often (in hours) to examine the
    # cache and delete obsolete files.
    CacheGcInterval 2

    # CacheLastModifiedFactor allows the estimation of an expiry date
    # for a page if it doesn't have an expiry-date specified in the
    # HTTP headers returned from Zope. This is based on (time since
    # last modification * CacheLastModifiedFactor), so that content
    # that is ten hours old would be given an expiry date of 1 hour in
    # the future.
    CacheLastModifiedFactor 0.1

    # CacheDefaultExpire sets a default expiry time of 1 hour into the
    # future for cached pages.
    CacheDefaultExpire 1

    # CacheDirLength sets the number of characters used in directory
    # names for subdirectories of CacheRoot
    CacheDirLength 2

    # The following definitions set expiry times for various content
    # types. In this list, each content type defined is cached for a
    # maximum period of 1 hour (3600 seconds) before it must be checked
    # again. Non-listed content types are not cached.
    ExpiresActive On
    ExpiresByType image/gif A3600
    ExpiresByType image/png A3600
    ExpiresByType image/jpeg A3600
    ExpiresByType text/css A3600
    ExpiresByType text/javascript A3600
    ExpiresByType application/x-javascript A3600
    ExpiresByType text/html A3600
    ExpiresByType text/xml A3600
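The two expiry rules in this configuration are easy to check by hand. The Python sketch below reproduces only the arithmetic described in the comments: the CacheLastModifiedFactor estimate, and the "access time plus 3600 seconds" rule behind A3600. It does not model anything else Apache may do when deciding expiry, and the function names are made up for this example.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def estimated_expiry(last_modified, now, factor=0.1):
    """Expiry estimate: now + (time since last modification * factor)."""
    return now + (now - last_modified) * factor


def access_expiry(access_time, seconds=3600):
    """Expiry for a rule such as 'A3600': access time plus N seconds."""
    return access_time + timedelta(seconds=seconds)


# Example timestamp; any timezone-aware UTC datetime would do.
now = datetime(2004, 1, 19, 3, 16, 51, tzinfo=timezone.utc)
ten_hours_old = now - timedelta(hours=10)

# Content that is ten hours old expires about one hour from now.
print(estimated_expiry(ten_hours_old, now) - now)

# The value an Expires header would carry for an A3600 rule.
print(format_datetime(access_expiry(now), usegmt=True))
```

A larger factor keeps rarely changed content in the cache for longer, at the cost of serving stale pages for longer after an edit.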
Once you've finished adding this to the VirtualHost section, save
the config file. Now, go into your /var/cache folder, and
create the directory defined as the CacheRoot in the
configuration that you have just edited, and make it writable by the
apache user. In the case of our example, this would be:
    mkdir /var/cache/site.com
    chown -R apache:apache /var/cache/site.com
Now restart Apache to ensure that these changes take effect:
apachectl graceful
Up until this point, although Apache is capable of performing caching, none of your pages are actually being cached. We can test this by using wget, and outputting the HTTP Response headers:
    wget -sS --delete-after http://site.com/
    --03:16:51--  http://site.com/
               => `index.html'
    Resolving site.com... done.
    Connecting to site.com[127.0.0.1]:80... connected.
    HTTP request sent, awaiting response...
     1 HTTP/1.1 200 OK
     2 Date: Mon, 19 Jan 2004 03:16:51 GMT
     3 Server: Zope/(unreleased version, python 2.3.2, linux2) ZServer/1.1 Plone/2.0-RC3
     4 Vary: Accept-Encoding
     5 Content-Length: 32560
     6 Content-Language:
     7 Expires: Mon, 19 Jan 2004 04:16:52 GMT
     8 Etag:
     9 Cache-Control: max-age=3600
    10 Content-Type: text/html;charset=utf-8
    11 X-Cache: MISS from site.com
    12 Connection: close
Notice the X-Cache header showing a cache MISS. While we expect
this the first time we hit a page (as it has not yet already been
cached), we would expect a HIT from subsequent requests to that page if
caching is working properly.
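If wget is not available, the same check can be scripted. The following Python sketch reads the X-Cache header and reports HIT or MISS. The function names are made up for this example; `fetch_cache_status` is defined but not called here because it needs a reachable site, so the demonstration uses headers copied from the output above.

```python
import urllib.request


def cache_status(headers):
    """Return 'HIT', 'MISS', or None, based on an X-Cache header."""
    for name, value in headers.items():
        if name.lower() == "x-cache":
            parts = value.split()
            first = parts[0].upper() if parts else ""
            return first if first in ("HIT", "MISS") else None
    return None


def fetch_cache_status(url):
    """Request a URL and report the cache status of the response.

    Not called in this sketch: it needs network access to your own site.
    """
    with urllib.request.urlopen(url) as response:
        return cache_status(dict(response.headers.items()))


# Headers copied from the sample wget output.
sample = {"Cache-Control": "max-age=3600", "X-Cache": "MISS from site.com"}
print(cache_status(sample))
```

Calling `fetch_cache_status` twice in a row for the same URL should give a MISS followed by a HIT once caching works.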
Your system is now at the point where it is fully capable of caching content, and all that remains is to tell your Plone site's Accelerated HTTP Cache Manager what it should cache.
Caching setup using Apache 2.0
How to set up the caching using Apache 2.0.
The major differences from Limi's instructions on Apache 1.3 were:
- To get Apache to proxy the requests, the following line needs to be uncommented in httpd.conf::

    LoadModule proxy_http_module modules/mod_proxy_http.so

- The following modules are also needed to get the caching working. They may not appear in your httpd.conf; if they don't, you'll have to add them::

    LoadModule disk_cache_module modules/mod_disk_cache.so
    LoadModule cache_module modules/mod_cache.so
- The following setting in the httpd.conf file was not enough::

    CacheRoot "/var/cache/artpropensity.com"

  It must be preceded by 'CacheEnable disk /' so that it looks like::

    CacheEnable disk /
    CacheRoot "/var/cache/artpropensity.com"
- I also had to set explicit IPs and hosts in my /etc/hosts file, as
  in::

    192.168.10.10 www.artpropensity.com artpropensity.com

  This is so that the following settings in the httpd.conf file will
  get resolved properly::

    ProxyPass / http://www.artpropensity.com:9080/VirtualHostBase/http/www.artpropensity.com:80/artpropensity/VirtualHostRoot/
    ProxyPassReverse / http://www.artpropensity.com:9080/VirtualHostBase/http/www.artpropensity.com:80/artpropensity/VirtualHostRoot/
- Using "wget -sS ..." on the command line also did not work for me. I
  ended up running a browser on an external server using X-windows. I set the
  browser to no-cache (check for new page each time) to make sure that it did
  not cache items and instead made a full request to Apache each time it
  accessed the site. I also used your advice and performed::

    $ tail -f Z2.log

  to make sure that Plone was not serving up .js, .css, .gif, etc. and
  that Apache was serving them up instead once they were cached. I did notice
  304s on some items, but I assume that's because the browser was set to
  no-cache and it forced Apache to check with Plone to see if the file had
  changed.
**I noticed a speed increase from 4 requests/sec to an average of 300 using Apache 2 caching**
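Watching Z2.log by eye gets tedious on a busy site. The Python sketch below lists the static-asset requests that still reach Zope. It assumes Z2.log uses the common log format with the request line in double quotes; the function name, the suffix list and the sample lines are made up for this example.

```python
STATIC_SUFFIXES = (".js", ".css", ".gif", ".png", ".jpg")


def static_hits(log_lines, suffixes=STATIC_SUFFIXES):
    """Return (path, status) pairs for static-asset requests in the log.

    Assumes common log format: ... "GET /path HTTP/1.1" 200 1234
    """
    hits = []
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request line, skip
        request = parts[1].split()        # e.g. ['GET', '/path', 'HTTP/1.1']
        status_fields = parts[2].split()  # e.g. ['200', '1234']
        if len(request) < 2 or not status_fields:
            continue
        path = request[1].split("?", 1)[0]
        if path.lower().endswith(suffixes):
            hits.append((path, status_fields[0]))
    return hits


# Made-up sample lines for illustration only.
sample_log = [
    '192.168.10.20 - - [19/Jan/2004:03:16:51 +0000] "GET /artpropensity/plone.css HTTP/1.1" 200 5120',
    '192.168.10.20 - - [19/Jan/2004:03:16:52 +0000] "GET /artpropensity/ HTTP/1.1" 200 32560',
    '192.168.10.20 - - [19/Jan/2004:03:16:53 +0000] "GET /artpropensity/logo.gif HTTP/1.1" 304 0',
]
for path, status in static_hits(sample_log):
    print(status, path)
```

Once caching works, repeated page loads should add few or no new static-asset lines to the log, apart from the occasional 304 revalidation mentioned above.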
Deciding what to cache
What content can be cached, and how do you set it up for an optimal experience?
