There are many things you can do to improve the performance and scalability of a Drupal-powered website. Before adding or upgrading servers, applying performance-oriented patches, or turning to any of the other topics of varying complexity discussed in this book, you should first enable all of Drupal's relevant built-in performance options.
Find Drupal's performance configuration options by navigating to the Performance page in the Site Configuration section of your website's administration pages. When the page cache is enabled, Drupal will save a fully rendered copy of each page accessed by anonymous visitors in the cache_page database table. When the same page is subsequently visited by the same or another anonymous user, the pre-rendered, cached copy is quickly and efficiently served directly out of the cache_page table. This cached copy will not be served to logged in users because pages are usually customized for logged in users. As most public web pages see significantly more anonymous traffic than logged in traffic, enabling the page cache generally results in a very significant performance improvement.
Drupal's page cache only caches pages accessed by anonymous visitors utilizing the HTTP GET method.

The page cache has three modes: disabled, normal, and aggressive. Drupal has many built-in caches, but most cannot be disabled. It is a common misconception that setting the cache mode to disabled turns off all caches; in fact, you are only disabling the page cache.

In the code, the three cache levels are defined as constants in the includes/bootstrap.inc include file: CACHE_DISABLED, CACHE_NORMAL, and CACHE_AGGRESSIVE. Whether the Drupal page cache is in normal mode or aggressive mode, the same content is cached for anonymous visitors. The primary difference between the two modes is that in aggressive mode Drupal does not invoke the _boot() or _exit() hooks defined by some modules.
The first time a page is visited by an anonymous visitor, Drupal includes all necessary modules and invokes a series of functions, hooks, and database queries in these modules to generate the page. If the Drupal page cache is enabled, in either normal or aggressive mode, the resulting output will generally be stored in the cache_page database table. As to how this actually happens, the last line of index.php calls the function drupal_page_footer(), which is defined in includes/common.inc. This function calls page_set_cache() in the same file, which checks that the page is being served to an anonymous visitor, that the request uses the HTTP GET method, and that no Drupal messages have been set in the current session. If these three conditions are true, Drupal invokes PHP's built-in ob_get_contents() function, which returns a complete copy of the current page held in PHP's output buffer. This content may optionally be compressed depending on your configuration, and then ob_end_flush() is invoked, telling PHP to flush its buffers and actually send the generated page to the remote web browser. Finally, a call is made to cache_set(), which saves a complete copy of the generated page into the cache_page database table for future reuse.
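The three conditions and the buffer handling described above can be sketched in a few lines. This is only an illustration, not Drupal's page_set_cache(): the helper name example_page_is_cacheable(), the $cache array standing in for the cache_page table, and the example URL are all assumptions made for this sketch.

```php
<?php
// Hypothetical helper (not part of Drupal): mirrors the three checks
// described in the text before a page is written to the page cache.
function example_page_is_cacheable($uid, $request_method, array $messages) {
  $is_anonymous = ($uid == 0);               // Drupal gives anonymous visitors uid 0.
  $is_get = ($request_method === 'GET');     // Only GET requests are cached.
  $no_messages = empty($messages);           // No pending messages in the session.
  return $is_anonymous && $is_get && $no_messages;
}

// Build the page inside an output buffer, as PHP does for the real page.
ob_start();
print '<html><body>Hello</body></html>';
$data = ob_get_contents();  // Copy of the page still held in the buffer.
ob_end_flush();             // Send the page to the client and close the buffer.

// Stand-in for the cache_page table, keyed by URL (an assumption).
$cache = array();
if (example_page_is_cacheable(0, 'GET', array())) {
  $cache['http://example.com/node/1'] = $data;
}
```

Note that the page is copied out of the buffer before it is flushed, so the visitor is never kept waiting on the cache write.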
The next time this exact same URL is visited by the same or a different anonymous visitor, the already generated copy of the page is efficiently retrieved from the cache_page database table, bypassing the need to include all the modules and regenerate the page. Once again starting in index.php, toward the beginning of the file there is a call to the drupal_bootstrap() function which is defined in includes/bootstrap.inc. This bootstrap function loops step by step through a series of "phases".
The first phase, DRUPAL_BOOTSTRAP_CONFIGURATION, locates and reads the correct settings.php configuration file, initializing Drupal's configuration array. The second phase, DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE, reads the cache_inc variable to determine which cache handler should be used. By default Drupal uses its own core includes/cache.inc cache handler, which stores cache data in the database, but it's also possible to use a contributed handler which stores cache data elsewhere, such as in memcache. This second phase also provides a fastpath mechanism for simply displaying the cached copy of the page and exiting without invoking any further phases. The third phase, DRUPAL_BOOTSTRAP_DATABASE, opens a connection to the database. The fourth phase, DRUPAL_BOOTSTRAP_ACCESS, checks if the IP address of the remote host has been banned by the site administrator, and if so displays a terse explanatory message and exits. The fifth phase, DRUPAL_BOOTSTRAP_SESSION, loads the session data into memory. And finally, in the sixth phase, DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE, Drupal calls the page_get_cache() function to load the cached page from the cache_page table. If a valid copy of the page exists in the cache, no further bootstrap phases are invoked.
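The phase-by-phase walk can be modelled as a simple loop that stops early on a cache hit. The sketch below is a simplified model and not drupal_bootstrap() itself: the phase names are the six listed above, 'EXAMPLE_LATER_PHASES' is a placeholder for everything that follows, and both function names are hypothetical.

```php
<?php
// The six phases named in the text, plus a placeholder for later phases.
function example_bootstrap_phases() {
  return array(
    'DRUPAL_BOOTSTRAP_CONFIGURATION',
    'DRUPAL_BOOTSTRAP_EARLY_PAGE_CACHE',
    'DRUPAL_BOOTSTRAP_DATABASE',
    'DRUPAL_BOOTSTRAP_ACCESS',
    'DRUPAL_BOOTSTRAP_SESSION',
    'DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE',
    'EXAMPLE_LATER_PHASES',  // Placeholder, not a real Drupal constant.
  );
}

// Simplified model: run phases in order and stop after the late page cache
// phase if a valid cached copy of the page was found.
function example_bootstrap($page_is_cached) {
  $completed = array();
  foreach (example_bootstrap_phases() as $phase) {
    $completed[] = $phase;
    if ($phase === 'DRUPAL_BOOTSTRAP_LATE_PAGE_CACHE' && $page_is_cached) {
      break;  // Cached page served; no further phases are invoked.
    }
  }
  return $completed;
}
```

Calling example_bootstrap(TRUE) returns only the first six phases, while example_bootstrap(FALSE) continues into the placeholder for the later, more expensive phases.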
When in normal page caching mode, the sixth bootstrap phase will first invoke the _boot() hook in all enabled modules where it is defined. Then, it will send the actual cached page to the remote web browser of the anonymous visitor. Finally, it will invoke the _exit() hook in all enabled modules where it is defined, and then Drupal will exit.
When in aggressive page caching mode, neither the _boot() nor the _exit() hooks are invoked, which means Drupal does not have to include these module files when displaying the cached page and can instead simply send the cached page to the remote web browser of the anonymous visitor.
In Drupal 6, the statistics module and the throttle module are the only two modules that define the _exit() hook. No core modules define the _boot() hook. In the statistics module the _exit() hook is used to count how many times each node is viewed and to update the access log. In the throttle module the _exit() hook is used to detect surges in site traffic and enable or disable the automatic throttling mechanism. If you put Drupal into aggressive page caching mode, the _exit() hooks are not invoked, so none of this functionality will work.
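The difference between the two modes when serving a cached page can be illustrated with a small model. Everything here is hypothetical: the EXAMPLE_ constants stand in for CACHE_NORMAL and CACHE_AGGRESSIVE, and the two example_*_exit() functions only imitate what the statistics and throttle hooks are described as doing.

```php
<?php
// Hypothetical stand-ins for CACHE_NORMAL and CACHE_AGGRESSIVE.
define('EXAMPLE_CACHE_NORMAL', 'normal');
define('EXAMPLE_CACHE_AGGRESSIVE', 'aggressive');

// Imitations of the two core _exit() hooks described in the text.
function example_statistics_exit() {
  return 'statistics: count page view';
}
function example_throttle_exit() {
  return 'throttle: check traffic';
}

// Returns the ordered list of steps taken to serve a cached page.
function example_serve_cached_page($mode, $cached_page, array $boot_hooks, array $exit_hooks) {
  $steps = array();
  if ($mode === EXAMPLE_CACHE_NORMAL) {
    foreach ($boot_hooks as $hook) {
      $steps[] = call_user_func($hook);  // _boot() hooks run first.
    }
  }
  $steps[] = 'send: ' . $cached_page;    // The cached page is always sent.
  if ($mode === EXAMPLE_CACHE_NORMAL) {
    foreach ($exit_hooks as $hook) {
      $steps[] = call_user_func($hook);  // _exit() hooks run last.
    }
  }
  return $steps;
}

$exit_hooks = array('example_statistics_exit', 'example_throttle_exit');
$normal = example_serve_cached_page(EXAMPLE_CACHE_NORMAL, '<p>cached</p>', array(), $exit_hooks);
$aggressive = example_serve_cached_page(EXAMPLE_CACHE_AGGRESSIVE, '<p>cached</p>', array(), $exit_hooks);
```

In this model $normal contains three steps and $aggressive only one, which is the trade-off in miniature: less work per request, but no view counting and no throttle detection.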
Configuration of the minimum cache lifetime is found in the page cache section of the performance administration page; however, it actually affects all of Drupal's caches. With regard to the page cache, the idea is to ensure that some benefit is gained from caching generated pages. By default Drupal only caches page content as long as it is known to be valid. As soon as a new comment or node is posted or updated, the entire page cache has to be flushed, as there is no way to determine which pages are affected by the changed content.

The minimum cache lifetime enforces a configurable amount of time that any given page will live in the cache even if new content is posted during that time. The longer pages live in the cache, the higher the "hit rate" and thus the more effective the cache can be.
Enforcement of the minimum cache lifetime happens globally on a per cache table basis, so once any user posts or updates content the countdown to flushing the page cache begins. A variable is also tracked in each user's session when they post new content, simulating a cache flush only for these users. This allows anonymous users to see their own comments immediately when posted rather than waiting for the page cache to first expire and be flushed. When any page is regenerated for a specific user who has posted new content, this new version of the page will update the version in the cache.
When trying to scale a website, the minimum cache lifetime should be enabled and set to the largest time you are willing to make anonymous visitors wait before seeing newly posted content. When determining how long this is, remember that anonymous users will still see content that they have posted themselves immediately.
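One way to picture the countdown is a cache that records when the first flush was requested and only empties itself once the minimum lifetime has passed. The class below is a hypothetical sketch written for this illustration, not Drupal's cache code; the class name, method names, and the choice to pass the current time in as $now are all assumptions.

```php
<?php
// Hypothetical in-memory cache that enforces a minimum lifetime.
class ExampleLifetimeCache {
  private $entries = array();
  private $lifetime;               // Minimum lifetime in seconds.
  private $flushRequestedAt = 0;   // 0 means no countdown is running.

  public function __construct($lifetime) {
    $this->lifetime = $lifetime;
  }

  public function set($key, $value) {
    $this->entries[$key] = $value;
  }

  public function get($key) {
    return isset($this->entries[$key]) ? $this->entries[$key] : NULL;
  }

  // Called whenever content is posted or updated. $now is a Unix timestamp,
  // passed in so the behaviour is easy to follow and to test.
  public function requestFlush($now) {
    if ($this->flushRequestedAt === 0) {
      // First change since the last flush: start the countdown only.
      $this->flushRequestedAt = $now;
    }
    elseif ($now > $this->flushRequestedAt + $this->lifetime) {
      // The minimum lifetime has elapsed: actually flush and reset.
      $this->entries = array();
      $this->flushRequestedAt = 0;
    }
  }
}

$cache = new ExampleLifetimeCache(300);  // Five-minute minimum lifetime.
$cache->set('node/1', '<p>cached page</p>');
$cache->requestFlush(1000);              // Starts the countdown.
$still_cached = $cache->get('node/1');   // Page survives the first change.
$cache->requestFlush(1400);              // More than 300 seconds later.
$after_flush = $cache->get('node/1');    // Page is gone now.
```

The longer the lifetime, the more content changes are absorbed without emptying the cache, which is why a larger value raises the hit rate.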
When page compression is enabled, cached pages are compressed with gzip before they are stored in the cache_page database table. Then, when these pages are served to anonymous visitors, Drupal checks whether the remote client supports gzip encoded pages, and if so it quickly serves the pre-compressed page to the remote client. Fortunately most web browsers do support gzip encoded pages. For the few that do not, Drupal will uncompress the cached page before sending it to the remote client.

Actual compression of cached pages happens in includes/common.inc in the function page_set_cache() with the following code:
$data = gzencode($data, 9, FORCE_GZIP);
When serving cached pages, Drupal detects whether or not the remote client supports gzip encoding in includes/bootstrap.inc in the function drupal_page_cache_header() with the following code:
if (@strpos($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip') === FALSE &&
    function_exists('gzencode')) {
In the rare case where the remote client does not support gzip encoding, the page is uncompressed in the same function with the following code:
$cache->data = gzinflate(substr(substr($cache->data, 10), 0, -8));
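The nested substr() calls above strip the gzip framing, a 10-byte header and an 8-byte trailer, so that gzinflate() receives only the raw deflate stream. The round trip can be checked with a short standalone script; the sample page content and the helper name example_client_accepts_gzip() are assumptions made for this sketch.

```php
<?php
// Hypothetical helper: TRUE when the Accept-Encoding header mentions gzip.
function example_client_accepts_gzip($accept_encoding) {
  return strpos($accept_encoding, 'gzip') !== FALSE;
}

// A sample page; repeated text compresses well.
$page = '<html><body>' . str_repeat('Hello, Drupal. ', 50) . '</body></html>';

// Compress exactly as shown in the text.
$compressed = gzencode($page, 9, FORCE_GZIP);

// Drop the 10-byte gzip header and the 8-byte trailer (checksum and length),
// then inflate the raw deflate data that remains.
$restored = gzinflate(substr(substr($compressed, 10), 0, -8));

// Choose what to send based on the client's Accept-Encoding header.
$body = example_client_accepts_gzip('gzip, deflate') ? $compressed : $restored;
```

Storing the compressed copy means the common case, a browser that accepts gzip, needs no compression work at all when the page is served.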
While the page cache offers impressive performance gains for anonymous users, it does not improve performance for logged in users. The block cache is a new feature in Drupal 6 which improves performance for logged in users. There are several different ways that the block cache can cache blocks, all controlled by module developers when creating the blocks. By default, when the block cache is enabled one copy of the block is cached per role.

The various block caching modes are defined in modules/block/block.module. Available caching modes are BLOCK_NO_CACHE, BLOCK_CACHE_PER_ROLE, BLOCK_CACHE_PER_USER, BLOCK_CACHE_PER_PAGE, and BLOCK_CACHE_GLOBAL.
When a block sets BLOCK_NO_CACHE, the block will never be cached. This cache mode is generally used either when the block is so simple that it's more efficient to regenerate it each time it is displayed, or when the block changes so frequently that there's no benefit from caching it.
As noted earlier, BLOCK_CACHE_PER_ROLE is the default mode, and means that multiple versions of the block will be cached, one for each role. The BLOCK_CACHE_PER_USER mode means that a unique version of the block will be cached for each user. The BLOCK_CACHE_PER_PAGE mode tells Drupal to cache a unique version of the block for each page it is displayed on. And finally, the BLOCK_CACHE_GLOBAL mode tells Drupal to cache a single version of the block, which is displayed on all pages to all users and all roles.
The cache modes are defined in the code as bitwise flags, allowing developers to set multiple cache modes. For example, the core profile.module defines an 'Author information' block which sets two flags, both BLOCK_CACHE_PER_PAGE and BLOCK_CACHE_PER_ROLE. This means that a unique version of the block is generated on each page and for each role viewing that page. The core book.module sets the same two block caching flags for the 'Book navigation' block.
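To see why bitwise flags combine cleanly, consider a sketch that builds a cache key from whichever flags are set. The EXAMPLE_ constants and their values are assumptions chosen for illustration (each is a distinct power of two), and example_block_cache_id() is a hypothetical function, not the one Drupal uses.

```php
<?php
// Hypothetical flags; each occupies its own bit so they can be OR-ed together.
define('EXAMPLE_CACHE_PER_ROLE', 0x0001);
define('EXAMPLE_CACHE_PER_USER', 0x0002);
define('EXAMPLE_CACHE_PER_PAGE', 0x0004);
define('EXAMPLE_CACHE_GLOBAL', 0x0008);

// Builds a cache key that varies only along the dimensions the flags request.
function example_block_cache_id($flags, $uid, array $role_ids, $path) {
  $parts = array();
  if ($flags & EXAMPLE_CACHE_PER_ROLE) {
    $parts[] = 'r.' . implode(',', $role_ids);  // One copy per set of roles.
  }
  elseif ($flags & EXAMPLE_CACHE_PER_USER) {
    $parts[] = 'u.' . $uid;                     // One copy per user.
  }
  if ($flags & EXAMPLE_CACHE_PER_PAGE) {
    $parts[] = $path;                           // One copy per page.
  }
  return implode(':', $parts);                  // Empty for a global block.
}

// Same combination the text describes for the 'Author information' block.
$flags = EXAMPLE_CACHE_PER_PAGE | EXAMPLE_CACHE_PER_ROLE;
$cid = example_block_cache_id($flags, 5, array(2, 3), 'node/1');
$global_cid = example_block_cache_id(EXAMPLE_CACHE_GLOBAL, 5, array(2, 3), 'node/1');
```

Because every flag is tested independently with the & operator, adding the per page flag to a per role block simply adds one more component to the key.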
When programming Drupal modules, you can control the block cache in hook_block() when defining your block. In the following example, we configure our block to be cached on a per page and per role basis:
function example_block($op = 'list', $delta = 0, $edit = array()) {
  if ($op == 'list') {
    // Describe the block and declare how it may be cached.
    $blocks[0]['info'] = t('Example block');
    $blocks[0]['cache'] = BLOCK_CACHE_PER_PAGE | BLOCK_CACHE_PER_ROLE;
    return $blocks;
  }
  else if ($op == 'view') {
    // Output the actual block here.
  }
}
The next section on the Performance administration page is titled bandwidth optimizations.

