CompTIA Linux+ / LPIC-1 Cert Guide (Exams LX0-103 & LX0-104/101-400 & 102-400) (2016)
Chapter 16. Schedule and Automate Tasks
This chapter covers the following topics:
The Cron System
Running Ad-hoc Jobs
This chapter covers the following exam topics:
Automate system administration tasks by scheduling jobs: 107.2
Running one server, or a fleet of servers, involves some periodic cleanup tasks. Temporary files need deleting, logs need rotating, caches need expiring, reports need running, and much more. You can either spend every working day with an ever-growing checklist of things to do, or learn how Linux can schedule jobs to be run in the wee hours when you’re sleeping.
“Do I Know This Already?” Quiz
The “Do I Know This Already?” quiz enables you to assess whether you should read this entire chapter or simply jump to the “Exam Preparation Tasks” section for review. If you are in doubt, read the entire chapter. Table 16-1 outlines the major headings in this chapter and the corresponding “Do I Know This Already?” quiz questions. You can find the answers in Appendix A, “Answers to the ‘Do I Know This Already?’ Quizzes and Review Questions.”
Table 16-1 “Do I Know This Already?” Foundation Topics Section-to-Question Mapping
1. Consider the following cron entry and choose the correct statement.
0 10 12 * * /usr/local/bin/backup.sh
a. backup.sh will run at 10 minutes after midnight on the 12th day of the month.
b. backup.sh will run at 10am on the 12th day of the month.
c. backup.sh will run at 10 minutes after 12 noon every day of the year.
d. backup.sh will run at 12 minutes after 10 every day of the year.
2. Consider the following cron entry and choose the correct statement.
* */2 * * 1 /usr/local/bin/ping.sh
a. ping.sh will run every 2 hours on Mondays.
b. ping.sh will run every half hour on the first of the month.
c. ping.sh will run every half hour on Mondays.
d. ping.sh will run every 2 hours on the first of the month.
3. You are trying to explain how to set up a cron to a user, but the user tells you that she is getting an error:
$ crontab -e
You (sean) are not allowed to use this program (crontab)
See crontab(1) for more information
You look, and there is no file called /etc/cron.deny. What could be the problem?
a. sean is not in the sudoers file.
b. sean needs to be in /etc/cron.deny.
c. sean needs to be in /etc/cron.allow.
d. The crontab program is not setuid root.
4. Which of the following are not true regarding cron and anacron? (Choose two.)
a. Both run periodic jobs.
b. Only root can use anacron.
c. Anacron can run jobs more frequently than cron.
d. Both guarantee jobs will be run.
5. Which of the following would be valid lines in the anacrontab file?
a. 7 25 weekly.backup root /usr/local/bin/backup.sh
b. 0 0 7 25 weekly.backup /usr/local/bin/backup.sh
c. 7 25 weekly.backup /usr/local/bin/backup.sh
d. 7 25 root weekly.backup /usr/local/bin/backup.sh
6. How does at differ from cron?
a. at can only schedule to the nearest day.
b. at is for ad-hoc jobs that are manually created.
c. at is for periodic jobs that need to be run at a certain time every day.
d. at runs jobs when system load is low.
Linux has two ways of scheduling jobs based on whether they’re regularly occurring. The cron system runs jobs on a schedule that can be monthly, weekly, daily, hourly, or even per minute. The at system is used for one-off jobs, such as to run a job over the weekend.
A third system, called anacron, is closely related to cron. It’s for running periodic jobs on a system that may not always be on, such as a laptop.
The Cron System
Cron is the main job scheduler in Linux. Named after the Greek word for time, chronos, cron runs a job for you on a fixed schedule. Jobs can be anything that can be run on the command line including calling another shell script.
People typically schedule intensive jobs at times when the system is expected to be underused such as overnight. By separating the interactive user sessions from the heavy batch work, both classes get better performance. Your Linux system almost certainly has a handful of jobs set to run overnight or at least on a daily basis.
Cron has several moving parts. A daemon called crond runs in the background and executes tasks on behalf of all users. Jobs are configured through crontab, which is a utility that manipulates the individual user cron tables. Finally, a series of configuration files under /etc can also contain jobs and control which users are allowed to add jobs using crontab.
The cron system has evolved over time. The utilities themselves have improved to allow more configuration options. Distributions have also added more default settings to make common tasks easier. This chapter walks you through the various options available and makes it clear which is a distribution configuration and which is a cron option.
The crontab, or cron table, is both a thing that stores a list of jobs for a user and the utility to manage the table.
Using the crontab Command
The simplest way to use crontab is to run it with the -e flag, which means you want to edit the current user’s crontab. You are placed into an editor showing the current crontab. After you make changes and save, your input is checked to make sure it’s a valid crontab. If the validation check fails, you are given an option to go back so that you don’t lose your changes.
The editor used to make changes to the crontab is set through the EDITOR environment variable, with vi being the default. If you’d rather use nano, for instance, make sure that export EDITOR=/bin/nano is in your .bash_profile.
If you are root, the -u option allows you to supply a username so that you can view or edit that user’s crontab. To edit sean’s crontab, run crontab -e -u sean.
The -l flag displays the crontab rather than editing it:
# crontab -l -u sean
25 * * * * /usr/bin/php /home/sean/fof/update-quiet.php > /dev/null
0 2 * * * /home/sean/bin/database_backup 2>&1
The crontab itself has undergone some improvements in formatting, and most are built around the idea that there are five columns to specify when the job is run and then the rest of the line contains the text to be executed.
The columns, in order, are
1. Minute (0-59)
2. Hour in 24 hour time (0-23)
3. Day of month (1-31)
4. Month (1-12)
5. Day of week (0-7 with 0 and 7 being Sunday)
Each column must be filled in. If you want to match all values for a column, use an asterisk (*). Some examples illustrate how the schedules work:
0 12 * * *—The minute is 0, the hour is 12, and it is to run on all months and days. This runs at 12:00 noon every day.
0 0 1 1 *—The minute and hour are both 0 so, this means midnight. The day of the month and month are both 1, which is January 1. This job runs at midnight on New Year’s day.
* * * * *—Runs every minute.
30 5 * * 1—Runs at 5:30 a.m. every Monday.
All columns must match for the job to run, with one interesting exception: if you specify both the day of the month and the day of the week (columns 3 and 5), the job runs when either one matches. So 30 5 1 * 1 runs at 5:30 a.m. on the first of the month and also on every Monday.
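The either-or rule for columns 3 and 5 can be sketched in shell. The day_matches function below is a hypothetical illustration, not cron’s actual code, and it assumes each field is either * or a single number:

```shell
#!/bin/bash
# day_matches DOM_FIELD DOW_FIELD TODAY_DOM TODAY_DOW
# Hypothetical helper: succeeds if a job's day columns match today.
# Assumes each field is "*" or one number (no lists, ranges, or names).
day_matches() {
    local dom_field=$1 dow_field=$2 today_dom=$3 today_dow=$4
    if [ "$dom_field" = "*" ] && [ "$dow_field" = "*" ]; then
        return 0                                   # no day restriction at all
    elif [ "$dom_field" = "*" ]; then
        [ "$dow_field" = "$today_dow" ]            # only the weekday is restricted
    elif [ "$dow_field" = "*" ]; then
        [ "$dom_field" = "$today_dom" ]            # only the date is restricted
    else
        # Both columns are restricted, so either match is enough.
        [ "$dom_field" = "$today_dom" ] || [ "$dow_field" = "$today_dow" ]
    fi
}

# The day columns of "30 5 1 * 1" are 1 (day of month) and 1 (Monday).
day_matches 1 1 1 3  && echo "the 1st, a Wednesday: runs"
day_matches 1 1 15 1 && echo "the 15th, a Monday: runs"
day_matches 1 1 15 3 || echo "the 15th, a Wednesday: skipped"
```

The last two arguments stand for today’s day of the month and day of the week (1 is Monday, 3 is Wednesday).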
Spelling Out Month and Day Names
The syntax shown previously uses a number for the month and day of week—1 is January, 2 is February, and so forth.
You can spell out the month and day of week names by using the first three letters. Not only is this easier to remember, but it helps make more sense of the columns.
0 0 * jan sun runs at midnight on every Sunday in January. The whole file doesn’t have to be consistent, so you can use numbers or names at your convenience.
Making Multiple Matches
The syntax you’ve seen so far does not allow for the same job to run at multiple times. If you wanted to run the same job at midnight and noon, you would need two separate lines. Fortunately cron’s syntax has evolved to allow this.
The first way to specify multiple times is to separate the items with a comma. For example, use the following syntax to run a job at midnight and noon:
0 0,12 * * *
The first column is still 0 to indicate that the job runs when the minute is zero. The hour column is now 0,12, which means the job will run when the hour is either 0 or 12, which is midnight and noon, respectively.
This also works when using the names of the months. A job can be run twice a day during June, July, and August with this:
0 0,12 * jun,jul,aug *
The second way to run a job at multiple time periods is to give a range of values. Rather than jun,jul,aug, the previous example could have been written with jun-aug. Or to run a job on the hour from 9:00 a.m. to 5:00 p.m.:
0 9-17 * * *
This runs the first job at 9:00 a.m. and the last job at 5:00 p.m.
Both these methods can be combined, such as 8-10,16-18.
The next optimization is to provide an easy way to run a job by stepping over certain periods, such as to run a job every 2 hours. You could do this with 0,2,4,6,8,10,12,14,16,18,20,22, but that’s messy!
Instead, run a job every 2 hours, on the hour with
0 */2 * * *
Or perhaps every 30 minutes:
*/30 * * * *
Or on odd numbered hours:
0 1-23/2 * * *
Think of this operator as saying “step through the range by this number”: cron matches the first value in the range and then every Nth value after it.
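To see which values a list, range, or step actually selects, it can help to expand the field by hand. The expand_field function below is a hypothetical helper written for this illustration; it is not part of cron and handles numbers only, not month or day names:

```shell
#!/bin/bash
# expand_field FIELD MIN MAX
# Hypothetical helper: prints the numbers that a numeric cron field matches.
# Handles *, comma lists, ranges, and /step; names such as jun are not handled.
expand_field() {
    local field=$1 min=$2 max=$3
    local parts part range step lo hi v out=""
    IFS=, read -ra parts <<< "$field"      # split the comma-separated list
    for part in "${parts[@]}"; do
        range=${part%%/*}                  # text before an optional /step
        step=1
        [[ $part == */* ]] && step=${part##*/}
        if [[ $range == "*" ]]; then
            lo=$min hi=$max                # * means the whole allowed range
        elif [[ $range == *-* ]]; then
            lo=${range%%-*} hi=${range##*-}
        else
            lo=$range hi=$range            # a single value
        fi
        for ((v = lo; v <= hi; v += step)); do
            out+="$v "
        done
    done
    echo "${out% }"
}

expand_field '*/2' 0 23          # hour column of: 0 */2 * * *
expand_field '1-23/2' 0 23       # hour column of: 0 1-23/2 * * *
expand_field '8-10,16-18' 0 23   # combined ranges
expand_field '*/30' 0 59         # minute column of: */30 * * * *
```

The first call prints the even hours from 0 through 22, and the second prints the odd hours from 1 through 23.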
Putting the crontab Together
So far you’ve looked at the first five columns, which collectively describe the time that the job will run. Any remaining text on the line runs as a command at the appointed time. A line such as
0 0 * * * /usr/local/bin/backup.sh
runs the /usr/local/bin/backup.sh program every day at midnight.
The command you run can be any valid shell code:
0 0 * * * if [[ ! -f /var/lock/maintenance ]]; then /usr/local/bin/
only runs the backup script at midnight if /var/lock/maintenance doesn’t exist.
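When the shell code on a schedule line grows, one option is to move the test into a small wrapper. The sketch below is an assumption about how that might look; run_unless_locked is a hypothetical name, and echo stands in for the real job so the example is safe to run:

```shell
#!/bin/bash
# run_unless_locked LOCKFILE COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND only when LOCKFILE does not exist.
run_unless_locked() {
    local lockfile=$1
    shift
    if [ -f "$lockfile" ]; then
        echo "skipping: $lockfile exists"
        return 0
    fi
    "$@"
}

# Stand-in command so the sketch is safe to run anywhere.
run_unless_locked /var/lock/maintenance echo "the backup would start now"
```

A crontab entry could then call a script built around this function instead of embedding the if statement on the schedule line.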
Issues About Path
Scripts that work fine when run at the command line but don’t work when run from cron are a common problem. The environment is different because cron doesn’t run your .bash_profile and .bashrc scripts. Therefore, you can expect a minimal environment, including a basic PATH.
See it for yourself by adding a cron entry such as:
* * * * * env > /tmp/env
That job runs every minute and dumps the environment to /tmp/env.
The environment for a cron job is fairly sparse: The path only has /usr/bin and /bin. It is also missing any additions that you are used to at your own shell, such as in .bash_profile.
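You can approximate this sparse environment from an interactive shell before committing a job to cron. The cron_like function below is a hypothetical helper; the variables it passes through are an assumption, so compare them with the /tmp/env file from your own system:

```shell
#!/bin/bash
# cron_like COMMAND_STRING
# Hypothetical helper: runs a command string under /bin/sh with an almost
# empty environment, roughly what a cron job sees. The variable list is an
# assumption; check /tmp/env on your own system for the real values.
cron_like() {
    env -i PATH=/usr/bin:/bin HOME="$HOME" SHELL=/bin/sh /bin/sh -c "$1"
}

cron_like 'echo "PATH inside the job: $PATH"'
cron_like 'command -v backup.sh || echo "backup.sh: not found on this PATH"'
```

A script that lives in /usr/local/bin is found at your own prompt but not inside cron_like, which is the same failure you would see from cron.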
While the scripts run out of cron are free to set their own variables internally, cron lets you set environment variables in the usual format:
0 0 * * * /usr/local/bin/backup.sh
The backup script then is run with the extended path set on the first line.
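As an illustration of variable assignments in a crontab, the sketch below keeps a sample crontab in a shell variable and separates its variable lines from its job lines. The PATH and MAILTO values and the two function names are assumptions made for this example:

```shell
#!/bin/bash
# A sample crontab held in a variable. The PATH and MAILTO values are
# assumptions for illustration only.
sample_crontab='PATH=/usr/local/bin:/usr/bin:/bin
MAILTO=sean@example.com
0 0 * * * backup.sh'

# Hypothetical filters: variable assignments look like NAME=value on a line
# of their own; everything else that is not blank or a comment is a job.
crontab_variables() { grep -E '^[A-Za-z_][A-Za-z0-9_]*='; }
crontab_jobs()      { grep -Ev '^(#|$|[A-Za-z_][A-Za-z0-9_]*=)'; }

echo "Variables:"
crontab_variables <<< "$sample_crontab"
echo "Jobs:"
crontab_jobs <<< "$sample_crontab"
```

Because /usr/local/bin is on the assumed PATH, the job line can name backup.sh without its directory.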
You aren’t limited to just setting the path. Any variable will work. Some variable names are special:
MAILTO—Anything that a job prints to the screen is mailed to this address.
SHELL—Run the job with a different shell. /bin/sh is used by default.
CRON_TZ—Use an alternate time zone for the crontab; otherwise, use system time.
Dealing with Output
A script often prints something to the screen, either for debugging, status updates, or to log an error. Anything that a job prints to the screen is sent in an email to the current user, which can be overridden with the MAILTO variable inside the crontab.
There are three ways of dealing with this:
Just accept the emails. This often makes for a boring morning as you wade through all the night’s emails.
Write the scripts so that they only generate output when there’s a legitimate error.
Within the crontab, redirect output to /dev/null.
Each has its advantages and disadvantages. If a script failing to run is a problem, you should have some way of knowing this, either from an email with the output or an external monitoring system. Too many emails usually means you end up ignoring them all. The option you choose depends on each job.
For a job that is chatty and the output doesn’t matter, it’s easy to redirect the output to the bit bucket:
25 * * * * /usr/bin/php /home/sean/fof/update-quiet.php > /dev/null 2>&1
At 25 minutes after the hour, a PHP script is executed. The output is redirected to /dev/null and the error stream is redirected to the standard out stream with 2>&1.
Recall that all programs have a standard output stream and an error stream, with normal redirects only working on the former. The 2>&1 ensures that errors are redirected, too. Without this the regular output would be redirected but not the errors, resulting in an email. This may be desirable in some cases.
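The difference between the two redirections is easy to try at the command line. In this sketch, chatty is a made-up stand-in for a cron job that writes to both streams:

```shell
#!/bin/bash
# chatty is a stand-in for a cron job that writes to both streams.
chatty() {
    echo "normal output"
    echo "error output" >&2
}

echo "--- stdout discarded, stderr kept (cron would still send mail):"
chatty > /dev/null
echo "--- both discarded (nothing left to mail):"
chatty > /dev/null 2>&1
```

Only the first call lets the error line through, which is the case where cron would still have something to mail.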
Most distributions include a version of cron that includes the nicknames extension. This extension provides aliases to commonly used schedules.
@reboot—Run once after reboot.
@yearly—Run once a year at midnight on January 1.
@annually—Same as @yearly.
@monthly—Run at midnight on the first of the month.
@weekly—Run once a week on Sunday at midnight.
@daily—Run once a day at midnight.
@hourly—Run once an hour, on the hour.
Therefore, the following two crontabs are the same:
0 0 * * * /usr/local/bin/backup.sh
@daily /usr/local/bin/backup.sh
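Each nickname except @reboot is shorthand for a set of five schedule columns. The lookup below spells those columns out; nickname_to_schedule is a hypothetical function written for this illustration:

```shell
#!/bin/bash
# nickname_to_schedule NICKNAME
# Prints the five schedule columns a nickname stands for.
# @reboot has no column equivalent, so it is reported as an error.
nickname_to_schedule() {
    case $1 in
        @yearly|@annually) echo "0 0 1 1 *" ;;
        @monthly)          echo "0 0 1 * *" ;;
        @weekly)           echo "0 0 * * 0" ;;
        @daily)            echo "0 0 * * *" ;;
        @hourly)           echo "0 * * * *" ;;
        *) echo "no five-column equivalent for $1" >&2; return 1 ;;
    esac
}

for name in @yearly @monthly @weekly @daily @hourly; do
    printf '%-9s %s\n' "$name" "$(nickname_to_schedule "$name")"
done
```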
As cron has grown over the years, so has the number of files that can be used to run jobs.
The crontabs edited with the crontab command are stored in /var/spool/cron.
# ls -l /var/spool/cron/
-rw------- 1 root root 0 Nov 17 19:54 root
-rw------- 1 sean sean 559 Mar 29 12:16 sean
From this output you can see that there are two crontabs: one for root and one for sean. The root crontab is empty. The files themselves are just text files.
Even though as the root user you can edit the files in /var/spool/cron yourself, you should always use the crontab command so that you have syntax checking. Regular users are prohibited from editing these files directly because they are not able to access the /var/spool/cron directory.
Some software packages need to bundle periodic tasks. For example, the sysstat package includes a valuable tool called the system activity reporter, or sar. A cron job fires every 10 minutes to collect some stats, and another job fires around midnight to archive the day’s statistics.
If the utility were to manipulate root’s crontab, it could accidentally get removed if the root user didn’t understand why the entry was there. Removing the entry after the package is removed also becomes a problem.
Therefore there is a second set of crontabs meant to be manipulated only by the root user. There is a shared file called /etc/crontab and a directory containing individual tables in /etc/cron.d. The file is usually used by the distribution itself to list any default jobs, or for the administrator to list any manually entered jobs. The directory is most helpful for integrating with package management, where a package that needs its own cron job places a file in /etc/cron.d. When the package is removed or upgraded, the cron job can be removed or changed without accidentally affecting other cron entries.
These system crontabs have one important difference: They contain a sixth column that goes in between the schedule and the command to be run. This column indicates which user should be used to execute the job.
An example of a system crontab such as this looks like the following:
# Run system wide raid-check once a week on Sunday at 1am by default
0 1 * * Sun root /usr/sbin/raid-check
The first five columns schedule a job for Sunday at 1:00 a.m. Column 6 says the root user will run it. The command is /usr/sbin/raid-check.
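The extra column is easy to pick out if you split a line on whitespace. This sketch uses the shell’s read builtin; split_system_entry and the variable names it sets are hypothetical:

```shell
#!/bin/bash
# split_system_entry LINE
# Hypothetical helper: splits one /etc/crontab-style line into its parts.
# Sets the globals schedule, run_as, and job_command.
split_system_entry() {
    local minute hour dom month dow
    # The last variable receives the rest of the line, spaces included.
    read -r minute hour dom month dow run_as job_command <<< "$1"
    schedule="$minute $hour $dom $month $dow"
}

split_system_entry '0 1 * * Sun root /usr/sbin/raid-check'
echo "schedule: $schedule"
echo "user:     $run_as"
echo "command:  $job_command"
```

A line from a user crontab has no user column, so feeding one to this function would mistake the first word of the command for a username.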
One problem with scheduling jobs by a specific time is that they all run at the same time. Often you don’t care about the specific time a job runs; you just want it to run hourly, daily, weekly, or monthly. Cron is installed with a configuration that has a directory for each of these time periods and will run the jobs in each directory consecutively when the scheduled time comes up.
These directories are
/etc/cron.hourly—Jobs here are run once an hour.
/etc/cron.daily—Jobs here are run once a day.
/etc/cron.weekly—Jobs here are run once a week.
/etc/cron.monthly—Jobs here are run once a month.
We look into the reasons why later in this chapter, but these convenience jobs don’t necessarily run on a predictable schedule. The system guarantees that monthly jobs are run once every month, but you can’t say for sure that it’ll happen exactly on the first of the month at midnight.
The files that go in these directories are just scripts. You do not include any schedule columns or user columns. Most of these are placed there by installation scripts. For example, the logrotate package needs to run daily to maintain system logs (see Chapter 18, “Logging and Time Services,” for more on logrotate), so it places a script in /etc/cron.daily that runs logrotate. The periodic updating of the database that the locate command uses is also run from cron.daily.
Sometimes you don’t want everyone to be able to use cron jobs. Two files, /etc/cron.allow and /etc/cron.deny, implement a whitelist and blacklist policy, respectively.
You should only have one of these files present; otherwise, the behavior gets hard to understand. Cron’s decision process for allowing a user access to edit her crontab is as follows:
1. If /etc/cron.allow exists, only users in this file and the root user can use crontab.
2. If /etc/cron.allow does not exist but /etc/cron.deny does, anyone in the latter file is denied access.
3. If neither file exists, only root can manage crons.
Most systems ship with an empty /etc/cron.deny file so that anyone can access their crontab. It should be noted that any existing crontabs will continue to run and that the root user can still manage a denied user’s crontab. The cron.allow and cron.deny files only control who can edit their own crontab.
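The three-step decision described above can be written out as a short function. This is a sketch of the rules as this chapter states them, not cron’s own code; may_use_crontab is a hypothetical name, and the file paths are passed in so you can experiment with throwaway files:

```shell
#!/bin/bash
# may_use_crontab USER ALLOW_FILE DENY_FILE
# Hypothetical helper: succeeds if USER may use crontab under the three
# rules described in the text. Files hold one username per line.
may_use_crontab() {
    local user=$1 allow=$2 deny=$3
    [ "$user" = "root" ] && return 0       # root is never locked out
    if [ -f "$allow" ]; then
        grep -qxF -- "$user" "$allow"      # rule 1: must be listed in cron.allow
    elif [ -f "$deny" ]; then
        ! grep -qxF -- "$user" "$deny"     # rule 2: must not be in cron.deny
    else
        return 1                           # rule 3: neither file, root only
    fi
}

# Try the rules against throwaway copies rather than the files in /etc.
demo=$(mktemp -d)
printf 'sean\n' > "$demo/cron.allow"
may_use_crontab sean  "$demo/cron.allow" "$demo/cron.deny" && echo "sean: allowed"
may_use_crontab alice "$demo/cron.allow" "$demo/cron.deny" || echo "alice: denied"
rm -rf "$demo"
```

The user alice is made up for the example; she is denied because cron.allow exists and does not list her.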
Anacron is a simplified cron that complements the existing cron system. It is made to handle jobs that run daily or less frequently and jobs for which the precise time doesn’t matter.
Anacron’s chief advantage over cron is that it runs jobs that were scheduled to go when the computer was off. Rather than worry about specific times, anacron focuses on the last time a particular job was run. If the computer was off when the job was supposed to run, anacron simply runs it the next time the computer is on.
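The idea of deciding by the last run date rather than by the clock can be sketched as a date comparison. This is an assumption about the logic, not anacron’s source code; job_is_due is a hypothetical function, and it relies on the -d option of GNU date:

```shell
#!/bin/bash
# job_is_due PERIOD_DAYS LAST_RUN TODAY
# Hypothetical helper: succeeds when at least PERIOD_DAYS whole days lie
# between LAST_RUN and TODAY (both written as YYYY-MM-DD).
# Relies on GNU date's -d option to parse the dates.
job_is_due() {
    local period_days=$1 last_s today_s
    last_s=$(date -u -d "$2" +%s) || return 2
    today_s=$(date -u -d "$3" +%s) || return 2
    (( (today_s - last_s) / 86400 >= period_days ))
}

# A weekly job last run on March 20, checked after the laptop wakes on March 29.
job_is_due 7 2015-03-20 2015-03-29 && echo "weekly job is overdue: run it now"
job_is_due 7 2015-03-25 2015-03-29 || echo "weekly job ran 4 days ago: wait"
```

The clock time never enters the comparison, which is why a job missed overnight still runs once the machine is back on.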
Anacron also differs from cron in a few important ways:
There is no command to manage the jobs. Everything is specified in /etc/anacrontab.
Everything is run as the root user.
The maximum granularity with which you can specify a schedule is 1 day, as opposed to 1 minute with cron.
Jobs are run consecutively.
While anacron can run as a daemon, it’s typically run from cron itself to see whether there are any jobs outstanding; once processing is done, anacron exits.
The format of the anacrontab file is still based around columns. The columns are
1. Period, in days, between runs of this job. Some of the @nicknames are available.
2. Delay, in minutes, that anacron will wait before executing the job.
3. A tag, called a job identifier, for the job that is unique across all jobs. Anacron uses this to track the last time the job was run.
4. The command to run.
Environment variables are specified the same way they are in crontabs. Just put the key=value statement on its own line.
The default anacrontab on a Red Hat system is
#period in days delay in minutes job-identifier command
1 5 cron.daily nice run-parts /etc/cron.daily
7 25 cron.weekly nice run-parts /etc/cron.weekly
@monthly 45 cron.monthly nice run-parts /etc/cron.monthly
These three jobs run daily, weekly, and monthly, respectively. The jobs all run nice run-parts followed by one of the convenience cron directories you learned about in the previous section.
The nice command runs the command at a lower priority. run-parts is a tool that comes with cron that runs each file in the given directory one after another.
Thus, the three default anacron jobs run the daily, weekly, and monthly cron tasks, respectively. cron.hourly is still handled by cron because anacron can only schedule jobs on a daily basis, not hourly.
As anacron is used for the daily, weekly, and monthly jobs, these jobs eventually run even if the system is off overnight. They just run at a low priority when the system comes on again.
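To make the role of run-parts concrete, here is a simplified stand-in written as a shell function. It is a sketch under stated assumptions, not the real tool: run_dir is a hypothetical name, and the real run-parts applies extra rules, such as file name filters, that are left out:

```shell
#!/bin/bash
# run_dir DIRECTORY
# Simplified stand-in for run-parts: runs every executable regular file in
# DIRECTORY, one after another, in the shell's sorted glob order.
# The real run-parts applies extra rules (such as file name filters) that
# are left out here.
run_dir() {
    local dir=$1 script
    for script in "$dir"/*; do
        if [ -f "$script" ] && [ -x "$script" ]; then
            "$script"
        fi
    done
}

# Demonstrate with a throwaway directory instead of /etc/cron.daily.
demo=$(mktemp -d)
printf '#!/bin/sh\necho "first job"\n'  > "$demo/01-first"
printf '#!/bin/sh\necho "second job"\n' > "$demo/02-second"
printf 'notes, not a script\n'          > "$demo/README"
chmod +x "$demo/01-first" "$demo/02-second"
run_dir "$demo"
rm -rf "$demo"
```

The README file is skipped because it is not executable, and the two scripts run one after the other rather than at the same time.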
Between cron and anacron you have many options available for running periodic jobs. There is overlap between the two tools and you will run into situations where either will work. Your decision on which to use comes down to a few key points:
Does your job need to run more than once a day or at a specific time? If so, you need to use crontab.
Is the timing of the job flexible and you’re more interested in making sure it’s always run, even if it’s late? If so, you should use anacron.
Is this job on a system that is not always on? As long as the timing constraints work for you, anacron is a good choice.
Running Ad-hoc Jobs
The final class of jobs that can be scheduled are ad-hoc jobs. Cron and anacron run jobs on a periodic basis—you want to run your log rotation every night. There are times when you want to run a job once, just not now. This is the job of the at command and its friend, batch.
The at Command
The at command is designed to run a task once at a specific time. The at command’s tasks or jobs are queued up in the /var/spool/at directory, with a single file representing each job.
A typical at job is intended to take care of the one-off or very infrequent jobs that take place at odd times. For example, many sysadmins remind themselves of meetings or to perform some task with at:
$ at 2pm today
at> xmessage "take a break"
at> <EOT>
job 1 at 2004-04-02 14:00
You type the first line of the previous code block (at 2pm today) at the command line, causing the at> prompt to appear. Then you type the command you want to execute, press Enter, and press Ctrl+D to end the task. Ending the task shows the <EOT> notice, followed by a line that echoes the job’s scheduling information.
Alternatively, you can pass the job you want to run over the standard input:
$ echo '/usr/local/bin/backup.sh' | at 20:00 today
job 10 at Sun Mar 29 20:00:00 2015
The at command uses a variety of time specifiers, some complex and some simple:
midnight—Runs the task at 00:00 on the current day.
noon—Runs the task at 12:00 on the current day.
teatime—Runs the task at 16:00 (at’s British roots are evident).
time-of-day—Such as 2:00 p.m. or 5:00 a.m.
date—You can specify a time on a specific day, such as 2pm jul 23 or 4am 121504.
now + time—You can specify any number of minutes, hours, days, and weeks from the current time, such as now + 30 minutes.
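In scripts it is usually easier to hand at the command on standard input than to type at the prompt. The queue_job function below is a hypothetical wrapper around that pattern; the example calls are left as comments because running them would queue real jobs:

```shell
#!/bin/bash
# queue_job COMMAND TIMESPEC...
# Hypothetical helper: hands COMMAND to at on standard input and passes the
# remaining words through as the time specifier.
queue_job() {
    local job=$1
    shift
    printf '%s\n' "$job" | at "$@"
}

# Examples (not executed here, because they would queue real jobs):
#   queue_job /usr/local/bin/backup.sh midnight
#   queue_job /usr/local/bin/backup.sh now + 30 minutes
#   queue_job /usr/local/bin/backup.sh 2pm jul 23
```

Everything after the command is passed to at untouched, so any of the time specifiers listed above can be used.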
The at command just starts the jobs. A couple of commands can help you manage the at jobs on the system, including these:
atq—This shows the summary of jobs in the at queue with the job number, date and time, and executing user (this also can be seen with at -l). It does not show the contents of the job itself; for that the root user needs to look at the file in /var/spool/at.
atrm—This deletes at jobs by job number, which is determined by using the previous command and the syntax atrm # (where # is the job number). at -d is a synonym for this.
at has a set of security files, /etc/at.allow and /etc/at.deny, which allow users to or prevent users from queuing up at jobs. These behave the same way as the cron restrictions. If the at.allow file exists and contains usernames, only those users and the root user are allowed to use at. If the at.deny file exists and contains usernames, those users are denied and all others are allowed. If neither file exists, only the root user is allowed to submit at jobs.
The batch Command
Using the batch command is relatively simple; it’s somewhat of an extension of the at command and shares the same man page. The batch command is used to run tasks or jobs at no specific time, but at a particular threshold of system utilization. As you can imagine, some systems are busy and you need to determine which jobs might need to be run with another scheduling utility if they are time-sensitive.
The metric used by batch and at to know whether the system has capacity to run a job is the load average. The load average of the system is seen in the three numbers that show up when you run the w (who) command:
16:04:34 up 135 days, 17:40, 2 users, load average: 0.54, 0.60, 0.51
These numbers, 0.54, 0.60, and 0.51, represent the average number of processes waiting to be run when sampled over the last minute, 5 minutes, and 15 minutes, respectively. A load average of 0.60 over 5 minutes means that over the last 5 minutes, when sampled, there was a process waiting to be run 60% of the time.
You can expect that a system with two CPUs is busy when the load average is 2. By looking at the load average over the three different time intervals you can tell whether the load is temporary, in which case the 1 minute measurement will be high and the other two will be low, or if it’s sustained by having all three high. Or, if the 15 minute measurement is high and the others are low, you know you had some high load but have recovered.
By default, batch runs jobs once at a future time when the system 1 minute load average is less than or equal to 0.8. This can be configured by specifying the desired utilization average with the atrun command, such as
atrun -l 1.6
This sets the threshold that batch will watch to 1.6, and if the load average drops below that value, the batch job is run. A value of 1.8 would be good for a system with two processors. For a system with N processors you want this value to be slightly less than N, such as 80% of N.
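The 80% rule of thumb and the comparison against the current load can be worked through in a few lines of shell. This sketch only does the arithmetic and does not change any at or batch setting; both function names are hypothetical, and reading /proc/loadavg is specific to Linux:

```shell
#!/bin/bash
# batch_threshold CPUS
# Prints 80% of the CPU count, the rule of thumb from the text.
batch_threshold() {
    awk -v n="$1" 'BEGIN { printf "%.1f\n", n * 0.8 }'
}

# load_is_below LOAD LIMIT
# Succeeds when LOAD is numerically below LIMIT.
load_is_below() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load < limit) }'
}

limit=$(batch_threshold 2)
echo "threshold for 2 CPUs: $limit"

# /proc/loadavg is Linux-specific; its first field is the 1 minute average.
if [ -r /proc/loadavg ]; then
    read -r one_minute _ < /proc/loadavg
    if load_is_below "$one_minute" "$limit"; then
        echo "load $one_minute is below $limit: a batch job could start"
    else
        echo "load $one_minute is not below $limit: a batch job would wait"
    fi
fi
```

For two processors the rule gives 1.6, the same value passed to atrun in the example above.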
Submitting batch jobs is similar to at, and it even uses the at prompt and interface. To submit a compile job that runs when the system threshold is reached, you would use the following:
$ echo bigcompile | batch
job 11 at Sun Mar 29 14:36:00 2015
You can create a job with at or batch and then cat the resulting file in /var/spool/at.
at’s spooled jobs use a prefix of the letter a, whereas batch jobs use the letter b as a prefix. When you view the file, notice all the environment settings stored in the job, including a line that exports the username used to submit the job. Only the root user can look at these files.
Remember that at and batch both export a lot of information when the job is run, which goes away after that shell is closed. at and batch jobs are run in a replica of the environment that existed at the time the job was submitted, which means all variables, aliases, and functions in the shell are available to the job that was started in that shell.
Periodic tasks can be run through cron or anacron so that the systems administrator doesn’t need to run the jobs manually. Cron entries are managed with the crontab command, and the entries themselves have a specific format that includes the minute, hour, day of month, month, and day of week, along with the command itself. A user’s ability to edit her own crontab is managed with /etc/cron.allow and /etc/cron.deny, with the root user always being able to edit crontabs.
If a computer is off when a scheduled cron job was to run, the job won’t be executed. Anacron solves this problem by running the job when the computer is turned on. The tradeoff is that the minimum time between jobs is one day and that you don’t have exact control over when the job runs.
The at facility lets you run ad-hoc jobs at a particular future time.
Exam Preparation Tasks
As mentioned in the section “How to Use This Book” in the Introduction, you have a couple of choices for exam preparation: the exercises here, Chapter 21, “Final Preparation,” and the practice exams on the DVD.
Review All Key Topics
Review the most important topics in this chapter, noted with the Key Topics icon in the outer margin of the page. Table 16-2 lists a reference of these key topics and the page numbers on which each is found.
Table 16-2 Key Topics for Chapter 16
Define Key Terms
Define the following key terms from this chapter and check your answers in the glossary:
The answers to these review questions are in Appendix A.
1. You want to run a task on the hour, every other hour starting at 1:00 a.m., with no other restrictions. Which crontab accomplishes this?
a. */120 * * *
b. 1/2 * * * *
c. 0 */2 * * *
d. 0 1-23/2 * * *
2. You have configured a job to run with the batch command, but apparently system utilization never drops as low as the default value. Which of the following commands can be used to set a custom value for the batch command?
3. You are trying to set up a job to run every night, but the script keeps aborting with errors that the command was not found. Which of the following in your crontab might help?
4. User crontabs are stored in:
5. If both cron.allow and cron.deny exist, the behavior is to
a. Only deny users in cron.deny to use crontab
b. Only allow users in cron.allow to use crontab
c. Only allow the root user to use crontab
d. First check cron.deny and then check cron.allow
6. If a script is in /etc/anacrontab and the computer is off overnight, the script will be run after the computer has booted.
7. What is the primary purpose of the job identifier in the anacrontab file?
a. It is used to track the last time the job was run.
b. It is the way anacron sorts the jobs to be run.
c. It is what shows up in the process listing when the job is running.
d. It identifies the owner of the job.
8. Which of the following directories allows you to place crontabs that specify the user under which cron will run the job?
9. You have created a job with the at command but later realize you don’t want the command to run. How do you delete the job?
a. at -q to find the job, then at -d to delete it
b. rm /var/spool/at/*
c. at -l to find the job, then atrm to delete it.
d. atq to find the job, then at -r to delete it
10. If you wanted to run a job when the system load was low, the best command to use would be