Warning

This document hasn't been checked for compatibility with current versions of Plone. Use at your own risk.

Caching and purging content

by Robert Nagle last modified Dec 30, 2008 03:06 PM
How Enfold Proxy caches and purges Plone content in Windows

In the previous section, we saw how the commercial product Enfold Proxy can be configured to proxy one or more Plone sites. This section examines how a Windows system administrator can use Enfold Proxy to implement caching (using the GUI control panel) and purge content when needed.

To summarize the main steps:

  1. Create an EP proxy definition to link your Plone site with IIS. (That's what you did in the previous section.)
  2. Select one of the available caching products for Plone. Install and enable it.
  3. Edit the cache settings for your proxy definition.
  4. Verify that your cache settings are in effect.

Caching increases Plone's performance significantly, so most deployments feature a caching solution of some kind. Enfold Proxy (EP) offers a caching solution which is easy to configure and track. A cache can't handle all requests -- personalized pages and dynamic content require Zope -- but when it can handle a request, it should.

Before trying caching, you should review the tutorial about how to view and test cache settings with your browser. That can get tricky.

Enfold Proxy lets you set up caching for each proxy definition. Each proxy definition stores its cached files in a separate directory (typically C:\Program Files\Enfold Proxy\cache). This table of caching options contains a description of each option which you can configure through EP's configuration utility.

 

What is a cacheable item?

First, let's cover the kind of web items that can and should be cached.

  • static content (images and JavaScript)
  • content for anonymous users. Items (such as Plone-generated HTML) can be cached with little danger.
  • selected content for logged in users.

Warning: Incorrectly configured caching can cause authenticated requests to be cached (which is bad). Generally, the two Plone products discussed below have sensible policies about not caching authenticated requests. Nonetheless, if you are creating a customized CacheFu policy or using another caching product for Plone, it's a good idea to confirm that authenticated requests are not being cached.
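One way to run that check is to request a page while logged in and look at its Cache-Control header. The following is a minimal sketch that relies only on the standard meaning of the private and no-store directives; the function names are hypothetical and are not part of CacheFu, Chasseur, or EP.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        # Directives without an argument (e.g. "private") map to None.
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def forbids_shared_caching(cache_control):
    """Return True if the header tells a proxy not to store the response.

    Only 'private' and 'no-store' are checked; this is a deliberately
    conservative reading, not a full implementation of HTTP caching rules.
    """
    directives = parse_cache_control(cache_control or "")
    return "private" in directives or "no-store" in directives


# Example: header values you might see on two different responses.
print(forbids_shared_caching("private, max-age=0"))
print(forbids_shared_caching("max-age=3600"))
```

For a page fetched while logged in, you would want the first result; a response that comes back without either directive deserves a closer look.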

First Step: Install and Activate a Caching Product

Before you can cache, you must install and activate a product in Plone specifically for caching. With this product you can configure and customize cache settings.

Install a Caching Product

(These are generic instructions for installing any Plone product.)

  1. Choose which caching product you wish to use (see below).
  2. If necessary, download the product and place it inside the Products directory of Plone/Enfold Server (typically C:\Program Files\Enfold Server\Products in Windows).
  3. Go to the Plone control panel (e.g., http://www.originalfunsite.com/plone_control_panel) and select Add/Remove Products. You should see the name of your product(s) on the Products available for install list.
  4. In the Plone control panel, look for a link under Add-on Product Configuration. One of these links should lead you to a configuration menu for the caching product you have chosen to install.
  5. From the product configuration options, make sure caching has been enabled/turned on.

Available Caching Products

 

  • CacheFu is an open source product with a wide variety of features. It actually consists of four separate products (CacheSetup, PageCacheManager, CMFSquidTool, and PolicyHTTPCacheManager) which are dropped into the Products directory. CacheFu offers much more granular control over caching. It lets you set policies and configure headers and cache rules for different content types in Plone.
  • Important: you need to enable a caching policy for CacheFu to work in EP. On the first tab, you need to choose Zope Behind Squid as the Active Cache Policy.

 

  • Chasseur is a Plone product included with Enfold Server (a commercial product). It was designed for ease of use and simplicity. (Read more about Chasseur). The two main settings are Normal (cache images, JavaScript, and CSS for one hour) and Aggressive (cache images, JavaScript, and CSS for one hour and also cache content for anonymous users for one hour).

    • CacheFu uses different cache headers from Chasseur, but EP is able to process both.
    • Normal is generally risk-free and unlikely to cause problems, especially if image files are unlikely to change.
    • Aggressive is generally not recommended because it interferes with EP's ability to purge content. When that happens, an individual browser might use private cache instead of checking with EP/ES to make sure it has fresh content.

 

Enable Cache in your Proxy Definition

By default, when you create a proxy definition in EP, caching is enabled. However, you might need to change your settings to suit your hardware.

Enfold Proxy will keep a lot of "stale" cached files inside the cache directory corresponding to a specific proxy definition. These stale cache files will accumulate in the cache directory until it approaches the maximum cache size. That is actually a good thing. In Chasseur, for example (Enfold's own caching product in Enfold Server), a lot of content is cached for one hour. After one hour (or even after five years), whenever the browser requests the same resource, EP will check with Plone and ask, "is this stale item still valid?" Plone will give one of two responses. Either:

  • Plone will reply, "it's still good." (In this case, EP will return the stale item to the browser and update its own records to indicate that this stale cache is now valid. This process is called revalidating the cache.) The HTTP headers to the user will say: X-Cache: HIT from www.originalfunsite.com
  • Plone will reply, "Nope. That's an outdated version." (In this case, Enfold Server or Plone will send the updated version to Enfold Proxy, which will pass it on to the browser and also replace the outdated version with the newer version.) The HTTP headers to the user will say: X-Cache: MISS from www.originalfunsite.com
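The two outcomes above can be sketched as a toy model. This is not EP's implementation: the class and function names are hypothetical, and the ETag comparison merely stands in for whatever validator EP and Plone actually exchange.

```python
from dataclasses import dataclass


@dataclass
class CachedItem:
    body: str
    etag: str           # validator remembered from the origin's last reply
    stale: bool = True


def origin_reply(current_body, current_etag, if_none_match):
    """Stand-in for Plone: 304 if the proxy's copy is current, else 200 with the new body."""
    if if_none_match is not None and if_none_match == current_etag:
        return 304, None, current_etag
    return 200, current_body, current_etag


def serve(cache, url, current_body, current_etag):
    """Return (body, x_cache) following the two cases described above."""
    item = cache.get(url)
    validator = item.etag if item else None
    status, body, etag = origin_reply(current_body, current_etag, validator)
    if status == 304:
        item.stale = False                      # revalidated: stored copy is good again
        return item.body, "HIT"
    cache[url] = CachedItem(body, etag, stale=False)   # replace the outdated copy
    return body, "MISS"


cache = {"/events": CachedItem("old page", "v1")}
print(serve(cache, "/events", "old page", "v1"))   # content unchanged at the origin
print(serve(cache, "/events", "new page", "v2"))   # content edited at the origin
```

The first call models "it's still good" and the second models "that's an outdated version", after which the cache holds the newer copy.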

Next Step: Verify that Your Cache Settings are in Effect

To verify that caching is taking place, look for this line in your HTTP response header: X-Cache: HIT from www.originalfunsite.com

  • If you see X-Cache: HIT, then yes, caching is occurring.
  • If you see X-Cache: MISS, then this particular item was NOT cached.

Generally, HITs indicate successful caching.

Keep in mind that X-Cache: MISS is not always a bad thing. For example, if you are logged in as an administrator, many resources are deliberately not cached. It's best to test as an anonymous web surfer (i.e., someone who is not logged in). If an item has been modified since it was cached, a MISS is the expected reply.

Careful: When testing, it's a good idea to use two different browsers. In Browser A (such as Internet Explorer), log in as Admin. In Browser B (e.g., Firefox), browse anonymously or as a logged-in user. It's also important to clear the cache the right way and even to close the browser entirely (if you don't, the HTTP headers you see won't be accurate). Consult the troubleshooting checklist for caching.

To see the HTTP headers directly, see Tools for Viewing Headers. You should probably pick two or three sample items to check for headers. Anything will do, but at least one should be a graphic and one should be a Zope page (such as http://www.originalfunsite.com/events). In Chasseur, the maximum age is one hour (3600 seconds), so the headers of a cached item should include:

Cache-Control: max-age=3600
X-Cache: HIT from www.originalfunsite.com
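If you would rather script this check than read headers in a browser tool, a short Python sketch using only the standard library might look like the following. The function names are hypothetical; the only assumption about EP is the X-Cache header format shown above.

```python
import urllib.request


def fetch_headers(url):
    """Fetch a URL anonymously (no cookies) and return its response headers as a dict."""
    with urllib.request.urlopen(url) as response:
        return dict(response.headers.items())


def summarize_cache_headers(headers):
    """Extract the X-Cache verdict and the Cache-Control max-age from a header mapping."""
    lowered = {name.lower(): value for name, value in headers.items()}
    x_cache = lowered.get("x-cache", "")
    # The first word of the header is the verdict, e.g. "HIT from host".
    verdict = x_cache.split()[0].upper() if x_cache.strip() else None
    max_age = None
    for directive in lowered.get("cache-control", "").split(","):
        name, _, arg = directive.strip().partition("=")
        if name.lower() == "max-age" and arg.isdigit():
            max_age = int(arg)
    return {"x_cache": verdict, "max_age": max_age}
```

Calling summarize_cache_headers(fetch_headers("http://www.originalfunsite.com/events")) twice in a row is a quick test: with caching working, the second call would be expected to report a HIT.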

Optimize Cache Settings and Measure Caching Performance

Many of your cache settings can be tweaked in whatever Plone product you are using to cache web items. The CacheFu Plone product, for example, offers a lot of fine-grained controls and the ability to create caching rules.

But EP also offers some cache settings to tweak. (By the way, the Caching and (Adv) links at the top lead to separate screens. The screenshot below shows only what you see when you select (Adv).)

[Screenshot: advanced cache settings in Enfold Proxy]

Here are some default settings for each proxy definition as set by Enfold Proxy:

  • Maximum Size of the Cache: 10 gigabytes for each proxy definition
  • Maximum number of items in the cache for each proxy definition: 50,000
  • Maximum size of an item which can be cached: 100K
  • Default-Age: 0

The main things you can adjust here are the storage amount and the item size. The more storage you allow, the more cacheable items EP can keep at once. Increasing the maximum item size is recommended if your Plone server returns a lot of web items (which is not the same thing as "web pages") over 100K. If the maximum is increased to a very large size, this will also consume RAM on the machine with EP. Also, caching larger files fills the cache more quickly. Theoretically, if the cache directory for your proxy definition is always at its maximum size and the hit percentage is declining, that could indicate you need to increase the maximum cache size.

Default-Age=0 causes EP to store cacheable items in the cache and always check that an item is current before serving it to the browser. (This is the default and is generally recommended.) Even so, EP will not cache an item which is forbidden to be cached (according to Plone or whatever Plone product is setting the cache headers).

If you check Enfold Proxy's cache.log messages for your proxy definition (and set the log level to Information), you will see every minute or so some statistics about caching:

2008-02-18 10:31:45,250|cache.host originalfunsite|STATS|3500|2856|Cache statistics:
        gets: 88, hits: 65 (33 validated), misses: 23 (0 uncachable)
        hitrate: 73% (58% excluding validations)
        size: 1943668 bytes, 324 items

The most important number here is the 58% in parentheses. The higher, the better. This number refers to requests that EP answered without forwarding them to Enfold Server or Plone (thus saving the most time). The 73% number also includes those occasions when the item was stale and EP needed to make a conditional request to Enfold Server/Plone. So the gap between the two figures (73 versus 58) reflects the times when EP had to revalidate stale cache by checking with Enfold Server or Plone. These revalidation requests generally do not result in significant time savings, so for all practical purposes the only number you need to worry about is the number in parentheses.
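To track these figures over time, you could pull the counters out of cache.log with a small script. The sketch below is an illustration only: the formulas are inferred from the single sample entry above (65/88 and 32/55 reproduce 73% and 58% when truncated), and whether EP computes its percentages exactly this way is an assumption.

```python
import re

# Matches the counter line of a STATS entry as it appears in the sample above.
STATS_RE = re.compile(
    r"gets:\s*(?P<gets>\d+),\s*hits:\s*(?P<hits>\d+)\s*\((?P<validated>\d+) validated\),"
    r"\s*misses:\s*(?P<misses>\d+)"
)


def hit_rates(stats_text):
    """Recompute (overall %, % excluding validations) from the raw counters."""
    match = STATS_RE.search(stats_text)
    if match is None:
        raise ValueError("no cache statistics found")
    gets, hits, validated = (int(match[k]) for k in ("gets", "hits", "validated"))
    overall = 100 * hits // gets if gets else 0
    # Assumption: validated requests are dropped from both hits and gets.
    remaining_gets = gets - validated
    direct = 100 * (hits - validated) // remaining_gets if remaining_gets else 0
    return overall, direct


sample = """gets: 88, hits: 65 (33 validated), misses: 23 (0 uncachable)
        hitrate: 73% (58% excluding validations)"""
print(hit_rates(sample))
```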

The cache directory

The proxy allows you to configure the directory where cached items are stored for a particular proxy definition.

  1. Open EP configuration tool.
  2. Select your proxy definition. Select the Caching tab.
  3. Leave the Cache Directory field blank if you want your cache directory to exist under the application directory. The directory will be created if it does not exist. If you wish to put your cache files elsewhere, enter the complete path to the directory here (e.g., C:\My Documents\proxy definition 1\cache).

IIS must be restarted before a change to this option will take effect.  

When you change the cache_directory, the old cache is not migrated to the new location (i.e., the old cache remains and a new empty cache is created at the new location).

Migrating your cache to a different place:

  1. Stop IIS.
  2. Move the cache directory into a new location.
  3. Copy the absolute path into the Cache Directory field of your proxy definition.
  4. Restart IIS.
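The migration steps can also be scripted. The following is a rough sketch under stated assumptions: it uses the standard Windows iisreset command to stop and start IIS (run it from an elevated prompt), the function names are hypothetical, and step 3 remains a manual edit in the EP configuration utility.

```python
import shutil
import subprocess
from pathlib import Path


def move_cache(old_dir, new_dir):
    """Move the cache directory and return the absolute path to enter in EP."""
    old_path, new_path = Path(old_dir), Path(new_dir)
    if not old_path.is_dir():
        raise FileNotFoundError(f"cache directory not found: {old_path}")
    if new_path.exists():
        raise FileExistsError(f"destination already exists: {new_path}")
    new_path.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(old_path), str(new_path))
    return str(new_path.resolve())


def migrate(old_dir, new_dir):
    """Steps 1, 2 and 4; step 3 (editing the proxy definition) stays manual."""
    subprocess.run(["iisreset", "/stop"], check=True)
    try:
        path = move_cache(old_dir, new_dir)
        print("Enter this in the Cache Directory field:", path)
        input("Press Enter once the proxy definition is saved...")
    finally:
        # Always bring IIS back, even if the move failed.
        subprocess.run(["iisreset", "/start"], check=True)
```

The move refuses to overwrite an existing destination, so a mistyped path fails loudly instead of merging two caches.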

 

Purging from a Remote Machine

PURGE is a specialized HTTP command understood by proxy servers like Squid, Varnish and Enfold Proxy; it orders the proxy server to evict an item from its cache so that future requests fetch a fresh copy. On the machine with Enfold Proxy, the files will essentially be deleted from C:\Program Files\Enfold Proxy\cache\host www.originalfunsite.com\data.
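As an illustration, a PURGE request can be sent from Python's standard library, since http.client accepts arbitrary method names. This is a minimal sketch for plain HTTP only; the function names are hypothetical, and the status code EP returns for a successful or refused purge is not documented here, so the sketch simply reports it.

```python
import http.client
from urllib.parse import urlsplit


def purge_target(url):
    """Split a URL into the (host, port, path) needed to address a PURGE request."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.hostname, parts.port or 80, path


def send_purge(url, timeout=10):
    """Send PURGE for one URL and return the proxy's HTTP status code."""
    host, port, path = purge_target(url)
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("PURGE", path)
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, send_purge("http://www.originalfunsite.com/events") asks the proxy in front of that host to evict its copy of the events page.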

Important: Purging will not work if you are using the Chasseur caching product with the Aggressive caching profile.

Enfold Proxy (EP) has controls to prevent purge commands from being issued by just anyone. Before you can do any purging, you need to explicitly declare which IP addresses are allowed to issue a PURGE command.

To do this, start the EP configuration utility from the Start menu, select Settings --> Caching purge sources and type the IP address of any computer which will issue a PURGE command. At minimum, you need to type the IP address of your Plone or Enfold Server; this is required if you are going to run a PURGE command from within a Plone product. If you intend to use a local command-line tool (like cURL) along with a remote Plone server, you must include the IP addresses of both the localhost (127.0.0.1) and the Plone host.

Note: In many cases, it is necessary to type in two IP addresses here: first, 127.0.0.1 (which is localhost), and second, the actual IP address of your machine (such as 192.168.1.1). If you are testing while logged on as a domain user on the machine itself, IIS will sometimes treat the requests as coming from outside the machine. If you are unsure which IP address is making the PURGE request, examine your IIS HTTP access logs and check which IP address made it.
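Scanning the access log by hand gets tedious, so a small script can list the addresses for you. The sketch below assumes the IIS log is in W3C extended format with the cs-method and c-ip fields enabled; the function name is hypothetical, and the log lines in the example are made up for illustration.

```python
from collections import Counter


def purge_sources(log_lines):
    """Count PURGE requests per client IP in a W3C extended format log."""
    fields, counts = [], Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The directive names the columns of the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if row.get("cs-method", "").upper() == "PURGE":
            counts[row.get("c-ip", "unknown")] += 1
    return counts


example_log = [
    "#Fields: date time cs-method cs-uri-stem c-ip sc-status",
    "2008-02-18 10:31:45 PURGE /events 192.168.1.1 200",
    "2008-02-18 10:31:46 GET /events 10.0.0.5 200",
    "2008-02-18 10:31:47 PURGE /news 127.0.0.1 200",
]
print(purge_sources(example_log))
```

To run it against a real log, pass an open file object instead of the list; every address it reports is a candidate for the purge sources setting.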

Special Note about Windows authentication (NTLM) and Purging. Enfold Server (ES) lets you use NTLM authentication to automatically log users in. If you are using NTLM with Enfold Server, you might not be able to run purge commands successfully. This is a known bug with Enfold Proxy 4.

Purge Cache within Enfold Proxy

Version 4.x of Enfold Proxy includes a way to purge the cache from its own interface. To do this, select the Purge option on the left panel of EP's configuration utility. By checking or unchecking each proxy definition, you can delete the cache for one or multiple proxy definitions. Here is the result screen after you click the purge button.

 

 
