This mechanism uses the front end of the site to expose a task that can be invoked from a web browser, but is more commonly invoked by a utility that fetches web pages, such as wget or curl. Tasks invoked in this way typically do NOT provide progress details.
The front-end access URLs for cron tasks are usually designed to be called not from a normal web browser but from an unattended cron script, which uses a server-side executable to reach the function. Utilities such as wget, curl or lynx are typically used; they can be thought of as applications that simulate the behaviour of a browser. They request the site URL supplied in the cron job so that the actions tied to that page are carried out.
Normal web browsers tend to be "impatient". If a web page returns a long series of redirection headers, the browser assumes that the web server has malfunctioned, stops loading the page, and shows some kind of "destination unreachable" message. This behaviour is normal: browsers are meant for web pages that show content to a human. Most browsers will quit after they encounter the twentieth redirect response, which is bound to happen with these tasks. Browsers such as Firefox, Internet Explorer, Chrome, Safari or Opera are therefore not intended to work with the front-end cron-based features. By design, they are NOT meant to work.
Command line utilities will, by default, also give up loading a page after it has been redirected a number of times. For example, wget gives up after 20 redirects and curl after 50. Depending on the task being executed, it may be advisable to configure the command line utility to allow a much larger number of redirects. The right number depends on the task itself.
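As an illustration of raising the redirect limit, the sketch below assembles a wget or curl command line with a larger cap. The helper name `build_fetch_command`, the example URL and the limit of 10000 are assumptions made for this example only; `--max-redirect` (wget) and `-L` with `--max-redirs` (curl) are the options those utilities provide for this purpose.

```python
import shlex
from typing import List


def build_fetch_command(url: str, tool: str = "wget", max_redirects: int = 10000) -> List[str]:
    """Return an argument list that fetches `url` while allowing many redirects.

    The helper name and the default limit are illustrative assumptions;
    choose a limit that suits the task being run.
    """
    if tool == "wget":
        # --max-redirect raises wget's default limit of 20 redirects;
        # -O /dev/null discards the downloaded page.
        return ["wget", f"--max-redirect={max_redirects}", "-O", "/dev/null", url]
    if tool == "curl":
        # -L tells curl to follow redirects; --max-redirs raises its default limit of 50;
        # -o /dev/null discards the downloaded page.
        return ["curl", "-L", "--max-redirs", str(max_redirects), "-o", "/dev/null", url]
    raise ValueError(f"unsupported tool: {tool}")


if __name__ == "__main__":
    # Hypothetical front-end URL; substitute the one documented for your cron task.
    print(shlex.join(build_fetch_command("https://www.example.com/index.php?task=cron")))
```

The printed command is quoted for a POSIX shell, so it can be pasted into a terminal to test the task by hand before scheduling it.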
Tip | |
---|---|
Several sites on the web provide a free service that runs scripts on a schedule in place of the CRON daemon. Webcron offers a free service with a simple interface, which we have tested ourselves. Webcron.org also provides a paid service that fully supports a number of front-end features and is reasonably cheap: about 1 Euro for 1000 runs. Make sure you set your Webcron CRON job time limit to at least 10% more than the time the script takes to execute on your site. If you do not know that time, run the script from your site's front end several times on a typical workload, average the run times, and add about 10%. Note that if your site uses a redirection component such as sh404SEF (and possibly .htaccess rules), the URL specified in the Webcron service should be the 'redirected' URL and NOT the initial URL; otherwise a 301 redirect response is seen. |
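
The time-limit rule in the tip above is simple arithmetic, sketched here for clarity. The function name `recommended_time_limit` and the sample timings are made up for illustration; only the "average plus about 10%" rule comes from the text.

```python
import math
from statistics import mean
from typing import Sequence


def recommended_time_limit(run_times_seconds: Sequence[float], margin: float = 0.10) -> int:
    """Average the measured run times, add the safety margin, and round up to whole seconds."""
    if not run_times_seconds:
        raise ValueError("at least one measured run time is required")
    return math.ceil(mean(run_times_seconds) * (1 + margin))


if __name__ == "__main__":
    # Three hypothetical timings, in seconds, taken from manual front-end runs.
    # Their average is 45 s; adding 10% gives 49.5 s, which rounds up to 50.
    print(recommended_time_limit([42.0, 47.5, 45.5]))
```

Rounding up rather than to the nearest second keeps the limit on the safe side of the measured average.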
One feature that is often included is a 'Secret key' or 'Pass Phrase'. This is a character string, supplied with the request, that is used to verify that the CRON job has the right to run the task. It can be thought of as an additional security feature.
Tip | |
---|---|
Use only lower- and upper-case alphanumeric characters (0-9, a-z, A-Z) in your secret key. Other characters may need to be manually URL-encoded in the CRON job's command line. This is error prone and can cause the backup to never start even though you'll be quite sure that you have done everything correctly. |
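
One way to get a key that follows this advice is to generate it from the alphanumeric alphabet only. The sketch below is an assumption-laden illustration: the function name `generate_secret_key` and the default length of 32 are arbitrary choices, not values required by any particular cron script.

```python
import secrets
import string
from urllib.parse import quote

ALPHABET = string.ascii_letters + string.digits  # a-z, A-Z, 0-9 only


def generate_secret_key(length: int = 32) -> str:
    """Return a random key drawn only from alphanumeric characters."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    key = generate_secret_key()
    # An alphanumeric key is unchanged by URL-encoding, so it can be used as-is.
    assert quote(key, safe="") == key
    print(key)
    # A key with other characters has to be encoded before it goes into the URL.
    print(quote("p@ss word&1", safe=""))  # p%40ss%20word%261
```

The second `print` shows what manual URL-encoding involves for a non-alphanumeric key, which is the error-prone step the tip recommends avoiding.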
Most hosts offer a control panel of some kind, such as cPanel. It will have a section named something like "CRON Jobs", "Scheduled Tasks" or similar. The help screen the host provides should describe how to set up a scheduled job; the missing part is the command to issue. Simply putting the URL in there is unlikely to work.
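As a rough sketch of what "the command to issue" might look like, the helper below formats a single crontab entry from a schedule and a wget argument list. The function name `crontab_line`, the schedule, the URL and its `task` and `key` parameters are all hypothetical; use the URL and options documented for your own cron script, and check whether your host's panel expects the schedule fields separately.

```python
import shlex
from typing import List


def crontab_line(schedule: str, argv: List[str]) -> str:
    """Format one crontab entry from a five-field schedule and a command argument list."""
    # shlex.join quotes the URL so that '?' and '&' reach the utility intact
    # instead of being interpreted by the shell.
    return f"{schedule} {shlex.join(argv)}"


if __name__ == "__main__":
    # Hypothetical example: run daily at 02:30, allowing up to 10000 redirects.
    print(crontab_line(
        "30 2 * * *",
        ["wget", "--max-redirect=10000", "-O", "/dev/null",
         "https://www.example.com/index.php?task=cron&key=abc"],
    ))
```

Quoting matters here: an unquoted `&` in the URL would make the shell run wget in the background and drop the rest of the query string.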
Warning | |
---|---|
If your host only supports entering a URL in their "CRON" feature, this may not work with most cron scripts. There is no workaround. It is a hard limitation imposed by the host. |
Important | |
---|---|
Be careful with any caching that may be active on the web site. If the specific page being accessed is in the cache, it will be delivered to the caller, BUT the underlying actions will NOT be invoked. It is better to disable caching for that specific page. We spent a lot of time tracking down this problem, and have not found it mentioned anywhere else on the web. |