Tag Archives: Cron Job

WordPress taxonomy terms don’t insert when cron job executes wp_insert_post()

I have a PHP script that connects to a third-party application, retrieves data, and then inserts it into WordPress using wp_insert_post(). The data also has a custom taxonomy associated with it, which is being passed into the wp_insert_post() function via the $args array. Lastly, the script executes every hour via a cron job.

Here is the WordPress insert:

$args = array(
  'comment_status'  => 'closed',
  'ping_status'     => 'closed',
  'post_author'     => 1,
  'post_content'    => $content,
  'post_status'     => 'publish',
  'post_title'      => $title,
  'post_type'       => 'movie',
  'tax_input'       => array(
    'genre' => array('term1', 'term2', 'term3')
  )
);
$wp_movie_id = wp_insert_post($args);

And here is the cron job:

@hourly /usr/bin/curl --silent http://example.com/cron.php

The problem was that every time the cron job ran, my custom taxonomy terms were missing, while everything else inserted just fine.

Whenever I executed the script manually from my browser though, the taxonomy terms were present, so I was sure that it was related to the cron job.

It turns out the problem was inside the wp_insert_post() function itself. Before assigning the taxonomy terms, WordPress checks whether the current user has permission via current_user_can():

if(!empty($tax_input)) {
  foreach($tax_input as $taxonomy => $tags) {
    $taxonomy_obj = get_taxonomy($taxonomy);
    if(is_array($tags)) {
      $tags = array_filter($tags);
    }
    // terms are only assigned if the current user has the capability
    if(current_user_can($taxonomy_obj->cap->assign_terms)) {
      wp_set_post_terms($post_ID, $tags, $taxonomy);
    }
  }
}

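The cron job didn’t have the authority to assign taxonomy terms: curl requests the script without being logged in, so there is no current user and current_user_can() returns false. This also explains why it always worked in my browser, where I was logged in as an admin in the background.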
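
To confirm this on your own install, a small diagnostic along these lines can be dropped into the cron script. It is only a sketch: the helper name describe_term_permission() is made up for this example, and it assumes WordPress is already loaded so that get_taxonomy(), get_current_user_id() and current_user_can() are available.

```php
<?php
// Hypothetical helper (not part of WordPress): reports whether the user
// behind the current request may assign terms in the given taxonomy.
function describe_term_permission($taxonomy) {
    $taxonomy_obj = get_taxonomy($taxonomy);
    if (!$taxonomy_obj) {
        return "taxonomy '$taxonomy' is not registered";
    }
    $user_id = get_current_user_id(); // 0 when nobody is logged in
    $allowed = current_user_can($taxonomy_obj->cap->assign_terms);
    return sprintf(
        'user %d %s assign terms in %s',
        $user_id,
        $allowed ? 'can' : 'cannot',
        $taxonomy
    );
}
```

Logging the result with error_log(describe_term_permission('genre')) from the cron-triggered request should show user 0 and “cannot”, while the same call from a logged-in admin session should show your user ID and “can”.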

Luckily there is a quick fix. Instead of passing the taxonomy terms in the $args array, you can assign them separately with another WordPress function, wp_set_object_terms(), which does not perform that capability check:

$terms = array('term1', 'term2', 'term3');
wp_set_object_terms($wp_movie_id, $terms, 'genre');
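
Put together, the insert and the term assignment might look like the sketch below. The wrapper function insert_movie_with_genres() and its parameters are my own naming for illustration. It assumes WordPress is loaded, passes true as the second argument of wp_insert_post() so that failures come back as a WP_Error, and checks both results with is_wp_error().

```php
<?php
// Hypothetical wrapper: inserts the post without 'tax_input', then assigns
// the terms separately so the capability check never comes into play.
function insert_movie_with_genres($title, $content, array $terms) {
    $args = array(
        'comment_status' => 'closed',
        'ping_status'    => 'closed',
        'post_author'    => 1,
        'post_content'   => $content,
        'post_status'    => 'publish',
        'post_title'     => $title,
        'post_type'      => 'movie',
    );

    // Second argument true: return a WP_Error instead of 0 on failure.
    $post_id = wp_insert_post($args, true);
    if (is_wp_error($post_id)) {
        return $post_id;
    }

    // Returns an array of term taxonomy IDs, or a WP_Error on failure.
    $result = wp_set_object_terms($post_id, $terms, 'genre');
    if (is_wp_error($result)) {
        return $result;
    }

    return $post_id;
}
```

The caller gets either the new post ID or a WP_Error back, so a failed term assignment no longer goes unnoticed the way it did with 'tax_input'.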

Hopefully this will save someone a couple of hours of research!

Create a simple caching feature in your PHP script to cache your HTML output

Preface

Every major site caches content in one way or another, whether it be entire pages, a subset of content, or only database queries. Having a caching system in place not only lets you serve content to the user faster, but also reduces the strain on your server. The thing to remember is that just because you can cache content doesn’t mean you should.

Let’s say you’re parsing a complex XML file and displaying its contents to the user, but the XML file only changes once a day. If 100 users view that page, your server has to run the parsing script 100 times, which puts far more load on it than sending back a static file. You may want to cache the results of the parsing script for 24 hours and then run it again to update the cache.

On the other hand, let’s say you’re running a daily deal website that displays a counter of how many users bought the deal. If you have a limited number of deals available, it’s in your and the user’s best interest to know when a deal has sold out. That number has to be pretty much up to date and you can’t reliably determine when to update a cache, so in that case, caching may not be desired.

Lastly, there’s something called a trigger. My professor used to say that a computer in itself is pretty dumb and needs to be told exactly what to do. A cache doesn’t get updated unless something triggers the update. If you want something to happen every 24 hours, for example, that’s what you’d use a cron job for. In our case, though, we only want to update the cache when someone is actually trying to retrieve the data and the cache has expired, so for the project below, the visitor will be the trigger.

Project

The goal is to build a simple caching feature into a script that will cache the results for a predefined duration.

Requirements

  • Allow caching to be turned on and off.
  • Allow the cache’s lifetime to be configured.
  • Allow the cache’s storage location to be configured.

Solution

When a user requests a page that contains the script, it first checks whether caching is enabled. If it is, it checks whether a cached version already exists. If one does, it compares the cache file’s last-modified time with the current time to determine whether the cache has expired. If the cache has expired, the script runs in full and caches its output; if not, the cached copy is displayed. If caching is turned off or the cache file doesn’t exist, the script always runs in full.

Implementation

First we’re going to set up a few variables to use in the script:

$ENABLE_CACHE = true;
$CACHE_TIME_HOURS = 12;
$CACHE_FILE_PATH = './.cache/news.txt';
  • On line 1 we have a bool (true/false) that determines whether caching is on or off.
  • On line 2 we use an integer that tells the script after how many hours the cache expires.
  • On line 3 is a string with the path and filename of the cache file. Make sure that directory is writable by Apache. Also notice that I created a new folder called “.cache” that begins with a period. I do that for two reasons: (1) folders that start with a symbol, such as a period, are sorted above the other folders, so I can find it easily, and (2) a folder with a period is a system folder to me, which means it should never be directly accessed by the public. Keep in mind that the period is only a convention and doesn’t block web access by itself. By the way, the “./” at the beginning of the string means “this directory” and refers to the directory the script is executing from, so it’s a relative path.
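
Since the script depends on that folder existing and being writable, a small check up front can save some head-scratching. The following is a minimal sketch; ensure_cache_dir() is a name I made up, and the 0755 permission is an assumption you may need to adjust for your server.

```php
<?php
// Hypothetical helper: makes sure the folder that will hold the cache file
// exists and is writable. Returns true when the cache can be written there.
function ensure_cache_dir($cache_file_path) {
    $dir = dirname($cache_file_path);
    // Create the folder (and any missing parents) if it isn't there yet.
    // The second is_dir() covers another request creating it at the same time.
    if (!is_dir($dir) && !@mkdir($dir, 0755, true) && !is_dir($dir)) {
        return false;
    }
    return is_writable($dir);
}
```

One way to use it is to switch caching off when the folder isn’t usable, for example with $ENABLE_CACHE = $ENABLE_CACHE && ensure_cache_dir($CACHE_FILE_PATH); right after the three variables.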

Now that we have our variables, let’s write a condition in which to display the cache:

if($ENABLE_CACHE && file_exists($CACHE_FILE_PATH) && (time() - filemtime($CACHE_FILE_PATH) < ($CACHE_TIME_HOURS * 60 * 60))) {
	// display cache
} else {
	// run script and save cache
}
  • On line 1 we do the following checks: if the variable $ENABLE_CACHE is set to true, and the file at ./.cache/news.txt exists, and the current time (let’s say 1/4/12 4:00pm) minus the file’s last modification time (let’s say 1/4/12 3:00pm), which equals 1 hour, is less than 12 hours (1 is less than 12), then display the cache. Both times are Unix timestamps in seconds, which is why the 12 hours are multiplied by 60 * 60.
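
The arithmetic in that condition can be checked in isolation. Below is a small sketch with the timestamps passed in explicitly; cache_is_fresh() is a hypothetical helper, not something from the script itself, and I’m reading 1/4/12 as January 4, 2012.

```php
<?php
// Hypothetical helper: true while the cache is younger than $max_age_hours.
// $now and $modified are Unix timestamps (seconds).
function cache_is_fresh($now, $modified, $max_age_hours) {
    return ($now - $modified) < ($max_age_hours * 60 * 60);
}

// The example from the text: cache written at 3:00pm, request at 4:00pm.
$modified = gmmktime(15, 0, 0, 1, 4, 2012);
$now      = gmmktime(16, 0, 0, 1, 4, 2012);

var_dump($now - $modified);                                // int(3600), i.e. 1 hour
var_dump(cache_is_fresh($now, $modified, 12));             // bool(true)
var_dump(cache_is_fresh($now + 12 * 3600, $modified, 12)); // bool(false), 13 hours old
```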

Let’s complete the script as follows:

$ENABLE_CACHE = true;
$CACHE_TIME_HOURS = 12;
$CACHE_FILE_PATH = './.cache/news.txt';

if($ENABLE_CACHE && file_exists($CACHE_FILE_PATH) && (time() - filemtime($CACHE_FILE_PATH) < ($CACHE_TIME_HOURS * 60 * 60))) {
	echo @file_get_contents($CACHE_FILE_PATH);
} else {
	// your script runs here and the result is stored in a variable called $output
	@file_put_contents($CACHE_FILE_PATH, $output);
	echo $output;
}
  • On line 6 (and again on line 9) I use an @ sign, the error control operator, to suppress any errors, such as a warning when the cache file can’t be read or written.
  • On line 8 is where you would do your XML parsing and then store the output in a variable called $output.
  • On line 9 you save the entire output to the file.
  • On line 10 you display all of the output on the screen.
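
If the part of your script that builds the page uses echo or inline HTML rather than building a string, PHP’s output buffering is one way to get the result into $output. A minimal sketch, where capture_output() and the sample markup are placeholders of mine:

```php
<?php
// Hypothetical helper: runs $producer and returns everything it printed.
function capture_output(callable $producer) {
    ob_start();             // start collecting output instead of sending it
    $producer();
    return ob_get_clean();  // return the buffer contents and stop buffering
}

$output = capture_output(function () {
    // stand-in for the real work, e.g. the XML parsing
    echo '<ul><li>Headline 1</li><li>Headline 2</li></ul>';
});
```

From there, $output can be saved with file_put_contents() and echoed exactly as in the else branch of the script.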

Conclusion

As you can see, in its simplest form, caching is quite easy to implement. Now, if you had multiple scripts that required caching, you might want to create functions or even a class with member functions, so that you can make changes in one place, but I’ll save that for another how-to.
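
As a rough idea of what such a function could look like, here is one possible sketch. The name cached_output(), its parameters, and the use of LOCK_EX are my choices for illustration; it is simply the same logic as the script in this post wrapped up for reuse, with the one difference that it only writes the cache file when caching is enabled.

```php
<?php
// Hypothetical reusable version of the caching script.
// $producer is any callable that returns the string to cache.
function cached_output($enabled, $path, $max_age_hours, callable $producer) {
    $fresh = $enabled
        && file_exists($path)
        && (time() - filemtime($path) < $max_age_hours * 60 * 60);

    if ($fresh) {
        $cached = @file_get_contents($path);
        if ($cached !== false) {
            return $cached;
        }
        // fall through and rebuild if the cache file could not be read
    }

    $output = $producer();
    if ($enabled) {
        // LOCK_EX keeps two requests from writing the file at the same time
        @file_put_contents($path, $output, LOCK_EX);
    }
    return $output;
}
```

A page would then call something like echo cached_output($ENABLE_CACHE, $CACHE_FILE_PATH, $CACHE_TIME_HOURS, $producer); where $producer is a function that does the XML parsing and returns the resulting HTML.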

If you have any questions or comments, as always, feel free to leave them below.