Create a simple caching feature in your PHP script to cache your HTML output

Large number of AA batteries standing upright.
Are batteries just caching systems?

Every major site caches content in one way or another, whether it be entire pages, a subset of content, or only database queries. Having a caching system in place not only lets you serve content to the user faster, but also reduces the strain on your server. The thing to remember is that just because you can cache content doesn’t mean you should.

Let’s say you’re parsing a complex XML file and displaying its contents to the user, but the XML file only changes once a day. If 100 users view that page, your server has to run the parsing script 100 times, which puts more load on the server than sending back a static file. You may want to cache the results of the parsing script for 24 hours and then run the parsing script again to update the cache.

On the other hand, let’s say you’re running a daily deal website that displays a counter of how many users bought the deal. If you have a limited number of deals available, it’s in your and the user’s best interest to know when a deal has sold out. That number has to be pretty much up to date and you can’t reliably determine when to update a cache, so in that case, caching may not be desired.

Lastly, there’s something called triggers. My professor used to say that a computer in itself is pretty dumb and that it needs to be told exactly what to do. A cache doesn’t get updated unless something triggers it. If you want something to happen every 24 hours, for example, that’s what you’d use a cron job for. In our case, though, we only want to update the cache when someone is actually trying to retrieve the data and the cache has expired, so for the project below, the visitor will be the trigger.

Project

The goal is to build a simple caching feature into a script that will cache the results for a predefined duration.

Requirements

  • Allow caching to be turned on and off.
  • Allow the cache’s duration to be set before it expires.
  • Allow the location where the cache is stored to be set.

Solution

When a user requests a page that contains the script, the script first checks whether caching is enabled. If it is, it checks whether a cached version already exists that can be displayed. If one does, it compares the cache file’s last modification time with the current time to determine whether the cache has expired. If the cache has expired, the full script runs and its output is cached again; if it hasn’t, the cached version is displayed to the user. If caching is turned off or the cache file doesn’t exist, the full script always runs.
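
The decision flow above can be sketched as a single helper. This is only an illustration under my own assumptions: the function name shouldServeCache() and its parameters are hypothetical and are not part of the script built below.

```php
<?php
// Hypothetical helper illustrating the decision flow; not part of the original script.
function shouldServeCache(bool $enabled, string $path, int $maxAgeSeconds, ?int $now = null): bool
{
    if (!$enabled) {
        return false; // caching is turned off: always run the full script
    }

    clearstatcache(); // make sure PHP does not reuse stale file information
    if (!is_file($path)) {
        return false; // no cached version exists yet
    }

    $modified = filemtime($path);
    if ($modified === false) {
        return false; // could not read the file's modification time
    }

    $now = $now ?? time();

    // The cache is usable only while its age is below the allowed maximum.
    return ($now - $modified) < $maxAgeSeconds;
}
```

The optional $now parameter is there only so the check can be tried with a fixed time; in normal use you would leave it out and let the helper call time() itself.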

Implementation

First we’re going to set up a few variables to use in the script:

$ENABLE_CACHE = true;
$CACHE_TIME_HOURS = 12;
$CACHE_FILE_PATH = './.cache/news.txt';
  • On line 1 we have a bool (true/false) that determines whether caching should be on or off.
  • On line 2 we’ll use an integer that tells the script after how many hours the cache is expired.
  • On line 3 is a string that indicates the path and filename of the cache file. Make sure that directory is writable by the web server (Apache, in my case). Also notice that I created a new folder called “.cache” that begins with a period. I do that for two reasons: (1) folders that start with a symbol, such as a period, are sorted ahead of the other folders, so I can find it easily and (2) a folder with a period is a system folder to me, which means it should never be directly accessed by the public. Keep in mind that the period is only a naming convention; the web server will still serve the folder unless it is configured not to. By the way, the “./” at the beginning of the string means “this directory” and refers to the directory the script is executing from; it’s a relative path.
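
Because the cache directory has to exist and be writable, it can help to check that before relying on it. Here is a minimal sketch, assuming you want the script to create the directory when it is missing. The helper name ensureCacheDirectory() and the 0755 permission mode are my own assumptions, so adjust them to your server setup.

```php
<?php
// Hypothetical helper: make sure the directory that will hold the cache file
// exists and is writable by the user PHP runs as.
function ensureCacheDirectory(string $cacheFilePath): bool
{
    $directory = dirname($cacheFilePath);

    // Create the directory (and any missing parents) if it does not exist yet.
    // The second is_dir() check covers another request creating it at the same moment.
    if (!is_dir($directory) && !@mkdir($directory, 0755, true) && !is_dir($directory)) {
        return false;
    }

    return is_writable($directory);
}
```

Calling ensureCacheDirectory($CACHE_FILE_PATH) once before the cache check would tell you early on whether the cache can be written at all.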

Now that we have our variables, let’s write a condition in which to display the cache:

if ($ENABLE_CACHE && file_exists($CACHE_FILE_PATH) && (time() - filemtime($CACHE_FILE_PATH) < ($CACHE_TIME_HOURS * 60 * 60))) {
  // display cache
} else {
  // run script and save cache
}
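  • On line 1 we do the following checks: if the variable $ENABLE_CACHE is set to true, and the file at ./.cache/news.txt exists, and the current time (let’s say 1/4/12 4:00pm) minus the file’s last modification time (let’s say 1/4/12 3:00pm), which equals 1 hour, is less than 12 hours (1 is less than 12), then display the cache. Note that filemtime() returns the time the file was last modified, not the time it was created; since the file is rewritten every time the cache is refreshed, that is exactly the time we want.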
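
To make the arithmetic concrete, here is the same comparison with fixed timestamps standing in for time() and filemtime(). The variable names $now and $cacheModified are placeholders I introduced for this illustration, and I’m reading 1/4/12 as January 4, 2012.

```php
<?php
$CACHE_TIME_HOURS = 12;

// Fixed example timestamps (UTC) standing in for time() and filemtime().
$now           = gmmktime(16, 0, 0, 1, 4, 2012); // 1/4/12 4:00pm
$cacheModified = gmmktime(15, 0, 0, 1, 4, 2012); // 1/4/12 3:00pm

$ageInSeconds    = $now - $cacheModified;        // 3600 seconds = 1 hour
$maxAgeInSeconds = $CACHE_TIME_HOURS * 60 * 60;  // 43200 seconds = 12 hours

// 3600 is less than 43200, so the cache is still fresh and would be displayed.
$cacheIsFresh = $ageInSeconds < $maxAgeInSeconds;
```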

Let’s complete the script as follows:

$ENABLE_CACHE = true;
$CACHE_TIME_HOURS = 12;
$CACHE_FILE_PATH = './.cache/news.txt';

if ($ENABLE_CACHE && file_exists($CACHE_FILE_PATH) && (time() - filemtime($CACHE_FILE_PATH) < ($CACHE_TIME_HOURS * 60 * 60))) {
  echo @file_get_contents($CACHE_FILE_PATH);
} else {
  // your script runs here and the result is stored in a variable called $output
  @file_put_contents($CACHE_FILE_PATH, $output);
  echo $output;
}
  • On lines 6 and 9 I use an @ sign (the error control operator) to suppress any errors.
  • Line 8 is where you would do your XML parsing and then store the output in a variable called $output.
  • On line 9 you save the entire output to the file.
  • On line 10 you display all of the output on the screen.
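
The script assumes your code ends up with everything in $output. If your existing script echoes its HTML directly instead, one way to collect it is PHP’s output buffering. This is a sketch, not the only approach; renderNews() is a hypothetical stand-in for whatever your real script does.

```php
<?php
// Hypothetical stand-in for your real script: it echoes HTML directly.
function renderNews(): void
{
    echo '<ul>';
    echo '<li>Example headline</li>';
    echo '</ul>';
}

// Start buffering so the echoed HTML is held back instead of being sent.
ob_start();
renderNews();

// Fetch the buffered HTML as a string and turn buffering off again.
$output = ob_get_clean();

// $output can now be saved to the cache file and displayed as usual.
echo $output;
```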

Conclusion

As you can see, in its simplest form, caching is quite easy to implement. Now, if you had multiple scripts that required the caching of data, you might want to create functions or even a class with member functions, so that you can streamline changes, but I’ll save that for another how-to.
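
To give a rough idea of what that could look like, here is a minimal sketch of such a class. Everything in it is an assumption on my part: the class name FileCache, its method names, and the choice to skip writing the file when caching is disabled are all hypothetical. It also uses typed properties, so it needs PHP 7.4 or newer.

```php
<?php
// Hypothetical class sketching how the same logic could be reused across scripts.
class FileCache
{
    private string $path;
    private int $maxAgeSeconds;
    private bool $enabled;

    public function __construct(string $path, int $hours, bool $enabled = true)
    {
        $this->path = $path;
        $this->maxAgeSeconds = $hours * 60 * 60;
        $this->enabled = $enabled;
    }

    // True when caching is on, the file exists, and it is younger than the limit.
    public function isFresh(): bool
    {
        if (!$this->enabled) {
            return false;
        }
        clearstatcache(); // avoid stale file information within one request
        if (!is_file($this->path)) {
            return false;
        }
        return (time() - filemtime($this->path)) < $this->maxAgeSeconds;
    }

    // Return the cached content if it is fresh; otherwise call $generate,
    // store its result, and return that instead.
    public function remember(callable $generate): string
    {
        if ($this->isFresh()) {
            $cached = @file_get_contents($this->path);
            if ($cached !== false) {
                return $cached;
            }
        }

        $output = (string) $generate();

        // Unlike the script above, this sketch skips the write when caching is off.
        if ($this->enabled) {
            // LOCK_EX keeps two requests from writing the file at the same time.
            @file_put_contents($this->path, $output, LOCK_EX);
        }

        return $output;
    }
}
```

With something like this, each script would create a new FileCache('./.cache/news.txt', 12) and echo the result of remember(), passing in a function that does the expensive work and returns the HTML.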

If you have any questions or comments, as always, feel free to leave them below.

Featured image by Roberto Sorin.


Comments

Previously posted in WordPress and transferred to Ghost.

Frank Jaeger
January 6, 2012 at 2:53 pm

A quick question Mr. Sechrest, on lines 9 and 10 of your code… More specifically on line 10, wouldn’t you want to echo $CACHE_FILE_PATH instead of $output? Just to make sure it’s working properly?

My concern is if something happened and $CACHE_FILE_PATH was no longer writable or became broken somehow, the echo to $output would still function and display the content to your users but at the loss of more server resources and speed since the script would continue “trying” to generate a file at $CACHE_FILE_PATH right?

The sum of my concern is that:

1.) The clients/users will be unaware of this. (maybe noticing slow content loading at its worst)

2.) The webmaster will be unaware of this. (no clients/users can report a problem they can’t see)

I am intrigued! Please let us know what you think.

Ryan Sechrest
January 6, 2012 at 3:18 pm

That’s a good question. The server would not be impacted any more if the file generation failed, because no matter what, if the full script executes, it will attempt to create the cache file and it doesn’t care whether that actually worked or not. However, in terms of server resources, if the cache file is expired or can’t be read, your server will be executing the entire script all the time, which will increase the load.

You have to ask yourself whether you’d prefer a small increase in load temporarily or whether your content should not display at all.

A possible solution would be this:
if(file_put_contents($CACHE_FILE_PATH, $output) === false) {
  // send email to webmaster and/or display message to user
}
This piece of code will always check to see if the file was created successfully. If the file couldn’t be created, you would either be notified and/or a notice would display on your website, solving both your concerns.
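
As a slightly fuller sketch of that idea, the check could live in a small helper that logs the failure. The helper name saveCache() is hypothetical, and writing to the error log with error_log() is my assumption; sending an email to the webmaster would be another option.

```php
<?php
// Hypothetical helper: write the cache and report a failure instead of hiding it.
function saveCache(string $path, string $output): bool
{
    // The @ hides PHP's own warning; the return value tells us whether the write worked.
    // Comparing with === false matters because writing an empty string returns 0.
    if (@file_put_contents($path, $output, LOCK_EX) === false) {
        // Assumption: logging is enough here. Swap in an email or an on-page notice if preferred.
        error_log('Cache write failed for ' . $path);
        return false;
    }

    return true;
}
```

The visitor still gets the page either way, because $output is echoed regardless, but the webmaster now has a log entry to act on.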