Caching setup using Apache 1.3

by Alex Limi last modified Mar 18, 2010 05:36 PM
How to make the server ready to cache - using Apache 1.3.

Make sure you have an installed Apache web server. This particular setup is based on Apache 1.3. For other web servers, see the next pages in this tutorial.

Make sure Apache is running on port 80 by checking that you can see the default Apache page at http://site.com.

We need to edit your Apache configuration file (normally located at /etc/httpd/conf/httpd.conf; use locate httpd.conf if you don't know where it is). There are some things you'll need to check:

  • Ensure that mod_proxy is enabled. Search the list of loaded modules for libproxy.so, and ensure that the following lines are not commented out:
      LoadModule proxy_module modules/libproxy.so
      AddModule mod_proxy.c
    
  • Set up a VirtualHost for your domain name that uses a reverse-proxy to pass requests through to Zope. The following example can be cut and pasted into the bottom of your configuration file, and you'll only need to edit a couple of lines:
     <VirtualHost *>
    
     # A sample VirtualHost section for using Apache as a webserver 
     # instead of Zope.
     # ServerName is the url of your website.
    
     ServerName site.com
    
     # Add serverAlias lines for other domain names that should 
     # point to this website. They will be rewritten by Apache to 
     # the ServerName, so that anyone going to www.site.com 
     # will be invisibly redirected to site.com in their browser.
    
     ServerAlias www.site.com
    
     # ServerAdmin is your email address, which shows up on error 
     # pages when Apache cannot connect to Zope.
    
     ServerAdmin webmaster@site.com
    
     # The ProxyPass and ProxyPassReverse lines are the magic 
     # ingredients. They rewrite requests to http://site.com and 
     # pass the entire request through to Zope on 
     # http://site.com:8080. The VirtualHostBase ensures that 
     # when the page goes back to the browser, it goes out through 
     # Apache, and appears to have come from http://site.com.
    
     # The line is made up from:
    
     # ProxyPass or ProxyPassReverse
    
     # / is the url at http://site.com that you wish to use to 
     # point to the Zope site. You could keep http://site.com as a 
     # flat HTML site in Apache, and replace / with /zope to make 
     # http://site.com/zope point to your zope site.
     # http://site.com:8080 is the address that your zope is 
     # running on.
    
     # /VirtualHostBase/http/site.com:80 makes sure that zope 
     # *thinks* it is running at http://site.com instead of at 
     # http://site.com:8080. You don't have to do anything else 
     # in Zope to make this work.
    
     # /plonesite is the location of your Plone Site within Zope. 
     # If you added a Plone Site into the root of your Zope with an id 
     # of 'mysite', then you just change this bit to /mysite
    
     # /VirtualHostRoot/ makes your Plone site think it is the root of the site.
    
     ProxyPass / http://site.com:8080/VirtualHostBase/http/site.com:80/plonesite/VirtualHostRoot/
     ProxyPassReverse / http://site.com:8080/VirtualHostBase/http/site.com:80/plonesite/VirtualHostRoot/
     </VirtualHost>
    

Now restart Apache, and you should find that http://site.com is now your Plone website. (If you have any problems, make sure that libproxy.so is present in your /etc/httpd/modules/ directory.)
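
The pieces of the ProxyPass target described in the comments of the VirtualHost example can be assembled mechanically. The following is a minimal sketch, assuming a hypothetical helper named proxy_target (it is not part of Apache or Zope); it only joins the components in the order used in the example.

```shell
#!/bin/sh
# Hypothetical helper (not part of Apache or Zope): assemble the backend URL
# used in the ProxyPass and ProxyPassReverse lines.
#   $1 = public host name (the ServerName), e.g. site.com
#   $2 = port that Zope listens on, e.g. 8080
#   $3 = id of the Plone site in the Zope root, e.g. plonesite
proxy_target() {
    printf 'http://%s:%s/VirtualHostBase/http/%s:80/%s/VirtualHostRoot/\n' \
        "$1" "$2" "$1" "$3"
}

# Print the two directives for the example site used in this tutorial.
target=$(proxy_target site.com 8080 plonesite)
echo "ProxyPass / $target"
echo "ProxyPassReverse / $target"
```

If your Plone site has an id of 'mysite', passing mysite as the third argument should give the line to paste into your own VirtualHost section.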

Now you've got a fairly respectable setup: Apache is serving web requests, and Zope is a backend server. As a bonus, you also have a complete virtual hosting setup, so you can run multiple different sites with multiple different domains on a single server with a single IP address. All you need to do is duplicate and edit that VirtualHost section for each site.

You might want to make a note of how fast this setup is running at this point. Note that we haven't gained any real speed advantage yet; we've just laid the foundations for it. In a terminal, use the Apache Benchmark application to test the speed of your site. The application is normally in '/usr/sbin/ab':

 /usr/sbin/ab -n 100 http://site.com/

This will time 100 consecutive requests to your server. You should find that each request takes about 0.5 to 1.0 seconds, with not a great deal of variance. Whilst this might not seem too bad for a dynamic page, remember that this is just the HTML. For a whole page with CSS, JavaScript, and images, the processing time can be a lot longer.

Typical output might look like this:

 Benchmarking site.com (be patient).....done
 Server Software: Zope
 Server Hostname: site.com
 Server Port: 80

 Document Path: /
 Document Length: 32560 bytes

 Concurrency Level: 1
 Time taken for tests: 68.901 seconds
 Complete requests: 100
 Failed requests: 0
 Broken pipe errors: 0
 Total transferred: 3293500 bytes
 HTML transferred: 3256000 bytes
 Requests per second: 1.45 [#/sec] (mean)
 Time per request: 689.01 [ms] (mean)
 Time per request: 689.01 [ms] (mean, across all concurrent requests)
 Transfer rate: 47.80 [Kbytes/sec] received
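
To compare runs before and after enabling caching, it can help to pull a single figure out of this output. The following is a rough sketch, assuming two hypothetical helpers: mean_ms reads ab output and prints the mean time per request, and derived_ms recomputes the same figure from the total time and the request count.

```shell
#!/bin/sh
# Hypothetical helper: read ab output on standard input and print the mean
# time per request in milliseconds (taken from the first matching line).
mean_ms() {
    awk -F': *' '/^ *Time per request/ { split($2, f, " "); print f[1]; exit }'
}

# Hypothetical helper: total seconds ($1) divided by request count ($2),
# expressed in milliseconds with two decimal places.
derived_ms() {
    awk -v t="$1" -v n="$2" 'BEGIN { printf "%.2f\n", t * 1000 / n }'
}

# Sample lines copied from the typical output shown above.
sample='Time taken for tests: 68.901 seconds
Complete requests: 100
Requests per second: 1.45 [#/sec] (mean)
Time per request: 689.01 [ms] (mean)
Time per request: 689.01 [ms] (mean, across all concurrent requests)'

printf '%s\n' "$sample" | mean_ms
derived_ms 68.901 100
```

Both commands print 689.01 for the sample figures, which is simply 68.901 seconds spread over 100 requests. Against a live site, the output of ab could be piped straight into mean_ms.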

So, what can we do to make it faster?

The first thing we can do is to allow Apache to cache the results of pages. This can happen because in configuring Apache to be the front-end server, we've actually created a caching reverse-proxy, meaning that all of the pages that Zope produces are now going back out to the browser through Apache, and can be cached to dramatically increase performance. It's just that we haven't told Apache to cache anything yet.

To let Apache know that we wish to cache content with a certain expiry time, mod_expires must be installed. Check your Apache modules directory (normally /etc/httpd/modules) for mod_expires.so, and then make sure that you have the following lines in your Apache configuration file:

 LoadModule expires_module modules/mod_expires.so
 AddModule mod_expires.c
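
A quick way to confirm that a LoadModule line is present and not commented out is to search the configuration file for it. The following is a minimal sketch, assuming a hypothetical helper named module_enabled; it is demonstrated on a throwaway file rather than on your real httpd.conf.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given configuration file contains an
# uncommented LoadModule line for the named module.
#   $1 = module name as used in LoadModule, e.g. expires_module
#   $2 = path to the configuration file
module_enabled() {
    grep -Eq "^[[:space:]]*LoadModule[[:space:]]+$1[[:space:]]" "$2"
}

# Build a small sample file: one module enabled, one commented out.
conf=$(mktemp)
cat > "$conf" <<'EOF'
LoadModule expires_module modules/mod_expires.so
#LoadModule proxy_module modules/libproxy.so
EOF

for m in expires_module proxy_module; do
    if module_enabled "$m" "$conf"; then
        echo "$m: enabled"
    else
        echo "$m: missing or commented out"
    fi
done
rm -f "$conf"
```

To check a real installation, pass the path of your own configuration file as the second argument.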

At the end of your VirtualHost section, just before the closing </VirtualHost> line, add the following lines:

 # CacheRoot is the location on the filesystem to store files that 
 # Apache caches. This directory must be created, and the user that 
 # Apache runs as must have full write permissions to it.
 # It's a bad idea to create this in the /tmp directory, as the 
 # directory itself will then be deleted when you reboot.
 CacheRoot "/var/cache/site.com"

 # CacheSize determines how big this cache can get in KB. It's a 
 # good idea that this number is about 30% less than the available 
 # space in the CacheRoot directory. Here we choose to cache 10MB 
 # of data, which is enough for a personal website, but not for 
 # anything larger.
 CacheSize 10000

 # CacheGcInterval specifies how often (in hours) to examine the 
 # cache and delete obsolete files.
 CacheGcInterval 2

 # CacheLastModifiedFactor allows the estimation of an expiry date 
 # for a page if it doesn't have an expiry-date specified in the 
 # HTTP headers returned from Zope. This is based on (time since 
 # last modification * CacheLastModifiedFactor), so that content 
 # that is ten hours old would be given an expiry date of 1 hour in 
 # the future.
 CacheLastModifiedFactor 0.1

 # CacheDefaultExpire sets a default expiry time of 1 hour into the 
 # future for cached pages.
 CacheDefaultExpire 1

 # CacheDirLength sets the number of characters used in directory 
 # names for subdirectories of CacheRoot
 CacheDirLength 2

 # The following definitions set expiry times for various content 
 # types. In this list, each content type defined is cached for a 
 # maximum period of 1 hour (3600 seconds) before it must be checked 
 # again. Non-listed content types are not cached.

 ExpiresActive On
 ExpiresByType image/gif A3600
 ExpiresByType image/png A3600
 ExpiresByType image/jpeg A3600
 ExpiresByType text/css A3600
 ExpiresByType text/javascript A3600
 ExpiresByType application/x-javascript A3600
 ExpiresByType text/html A3600
 ExpiresByType text/xml A3600
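
The CacheLastModifiedFactor rule described in the comments above is plain multiplication, and can be tried out with a few numbers. The following is a small sketch, assuming a hypothetical helper named estimated_expiry; it only reproduces the arithmetic from the comment and does not query Apache.

```shell
#!/bin/sh
# Hypothetical helper: estimate the expiry period (in seconds) for content
# that has no expiry date of its own, as (time since last modification *
# CacheLastModifiedFactor), rounded to the nearest second.
#   $1 = seconds since the content was last modified
#   $2 = CacheLastModifiedFactor, e.g. 0.1
estimated_expiry() {
    awk -v age="$1" -v factor="$2" 'BEGIN { printf "%d\n", age * factor + 0.5 }'
}

# Content last modified ten hours (36000 seconds) ago, with a factor of 0.1.
estimated_expiry 36000 0.1
```

This prints 3600, matching the one-hour figure given in the comment for content that is ten hours old.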

Once you've finished adding this to the VirtualHost section, save the config file. Now, go into your /var/cache folder, and create the directory defined as the CacheRoot in the configuration that you have just edited, and make it writable by the apache user. In the case of our example, this would be:

 mkdir /var/cache/site.com
 chown -R apache:apache /var/cache/site.com

Now restart Apache to ensure that these changes take effect:

 apachectl graceful

Up until this point, although Apache is capable of performing caching, none of your pages are actually being cached. We can test this by using wget, and outputting the HTTP Response headers:

 wget -sS --delete-after http://site.com/

 --03:16:51-- http://site.com/
 => `index.html'
 Resolving site.com... done.
 Connecting to site.com[127.0.0.1]:80... connected.
 HTTP request sent, awaiting response...
 1 HTTP/1.1 200 OK
 2 Date: Mon, 19 Jan 2004 03:16:51 GMT
 3 Server: Zope/(unreleased version, python 2.3.2, linux2) ZServer/1.1 Plone/2.0-RC3
 4 Vary: Accept-Encoding
 5 Content-Length: 32560
 6 Content-Language:
 7 Expires: Mon, 19 Jan 2004 04:16:52 GMT
 8 Etag:
 9 Cache-Control: max-age=3600
 10 Content-Type: text/html;charset=utf-8
 11 X-Cache: MISS from site.com
 12 Connection: close

Notice the X-Cache header showing a cache MISS. While we expect this the first time we hit a page (as it has not yet been cached), we would expect a HIT from subsequent requests to that page if caching is working properly.
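
When repeating this test, it is easier to look at the X-Cache header alone than to read the whole response. The following is a minimal sketch, assuming a hypothetical helper named cache_status that reads response headers on standard input; here it is fed two header lines taken from the output above.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on standard input and
# print the word that follows "X-Cache:" (HIT or MISS), if the header exists.
cache_status() {
    awk 'tolower($0) ~ /x-cache:/ {
        for (i = 1; i <= NF; i++)
            if (tolower($i) == "x-cache:") { print $(i + 1); exit }
    }'
}

# Two header lines copied from the wget output shown above.
printf '%s\n' ' 10 Content-Type: text/html;charset=utf-8' \
              ' 11 X-Cache: MISS from site.com' | cache_status
```

For the sample lines this prints MISS. Against a live site, the response headers printed by wget could be piped into cache_status; since wget writes them to standard error, they would need to be redirected with 2>&1 first.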

Your system is now at the point where it is fully capable of caching content, and all that remains is to tell your Plone site's Accelerated HTTP Cache Manager what it should cache.