Beginning the Linux Command Line, Second edition (2015)
CHAPTER 9. Process and System Management
When working with Linux, from an administrative perspective, working with processes is important. Every application or task you start on a Linux computer is started as a process. You will find that in some instances, a task may hang, or something else may happen that requires you to do some process management. In this chapter, you will learn how to monitor and manage processes. You will also learn how to schedule processes for automatic startup.
Understanding Linux Processes
When your computer boots, it will start a kernel. The kernel in turn is responsible for starting the first process, which on modern distributions is the systemd process. This process is responsible for all other processes. When starting a process, systemd starts the process as a child of its own. If for instance you’re working from a bash session in a GNOME graphical environment, systemd starts gnome-terminal, and from there bash is started. So gnome-terminal is the parent process for bash, and systemd is its grandparent.
To get an overview of the relations between parent and child processes, you can use the pstree command, of which partial output is shown in Listing 9-1.
Listing 9-1. pstree Shows the Parent-Child Relation Between Processes
systemd-+-ModemManager---2*[{ModemManager}]
|-NetworkManager---3*[{NetworkManager}]
|-2*[abrt-watch-log]
|-abrtd
|-accounts-daemon---2*[{accounts-daemon}]
|-alsactl
|-at-spi-bus-laun-+-dbus-daemon---{dbus-daemon}
| `-3*[{at-spi-bus-laun}]
|-at-spi2-registr---{at-spi2-registr}
|-atd
|-auditd-+-audispd-+-sedispatch
| | `-{audispd}
| `-{auditd}
|-avahi-daemon---avahi-daemon
|-bluetoothd
|-chronyd
|-colord---2*[{colord}]
|-crond
|-cupsd
|-2*[dbus-daemon---{dbus-daemon}]
|-dbus-launch
|-dconf-service---2*[{dconf-service}]
|-evolution-addre---4*[{evolution-addre}]
To run a process, the Linux kernel works with a queue of runnable processes. In this queue, every process waits for its turn to be serviced by the scheduler. By default, Linux works with time slices for process handling. This means that every process gets a fair amount of system time before it has to make way for other processes. If a process needs more attention, you can use the nice command to increase (or decrease if necessary) the system time that is granted to the process. More on using nice on processes later in this chapter.
In some situations, you will have to stop a process yourself. This may happen if the process doesn’t reply anymore, or if the process behaves in a way that harms other processes. When a process stops, the Linux kernel notifies the responsible parent process, which then collects the exit status of its child so that the kernel can remove the child from the process table. Under normal circumstances, the parent process that was responsible for starting a given process will be present until all its children are stopped.
In the abnormal situation where the child has stopped but the parent never collects its exit status, the child stays behind in the process table as a zombie. A zombie cannot be killed, because it is no longer running. It disappears when the parent finally collects the exit status, or when the parent itself stops; in that case the first process adopts the zombie and cleans it up. Restarting your computer is needed only as a last resort.
You will find that if zombie processes occur, often the same processes are involved. That is because the occurrence of zombie processes is often due to bad programming. So you may have to update the software that creates the zombie processes to finally get rid of them. In the following sections, you will learn how to monitor and manage processes.
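If you want to see what a zombie looks like, you can create one on purpose. In technical terms, a zombie is a process that has finished while its parent has not yet collected its exit status. The following is a minimal sketch that assumes a Linux system with the procps version of ps (needed for the --ppid option); the sleep durations are arbitrary values chosen for illustration.

```shell
#!/bin/bash
# The subshell starts a short-lived child, then replaces itself with
# "sleep 30", which never collects the exit status of that child.
(sleep 1 & exec sleep 30) &
parent=$!

# Give the child time to finish; it now lingers as a zombie.
sleep 2
zombie_stat=$(ps -o stat= --ppid "$parent")
ps -o pid,ppid,stat,comm --ppid "$parent"

# Terminating the parent normally lets the first process adopt the
# zombie and clean it up.
kill "$parent"
wait "$parent" 2>/dev/null || true
echo "child state while parent was alive: $zombie_stat"
```

While the parent is alive, the STAT column of the child should start with a Z, and ps marks the command as defunct.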
Apart from zombie status, processes can be in other states as well. You can see these states when using the ps aux command, which shows current process status; these are displayed in the STAT column (see Listing 9-2). Processes can be in the following states:
· Running: The process is active.
· Sleeping: The process is loaded in memory but hasn’t been active recently.
· Zombie: The process is in a defunct state.
· Stopped: The process has been stopped by a user who pressed Ctrl-Z; it can be started again using the fg command.
Listing 9-2. Processes Can Be In Different States
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.4 59732 4984 ? Ss 07:29 0:02 /usr/lib/systemd/systemd --switched-root --system --deserialize 24
root 2 0.0 0.0 0 0 ? S 07:29 0:00 [kthreadd]
root 3 0.0 0.0 0 0 ? S 07:29 0:00 [ksoftirqd/0]
root 5 0.0 0.0 0 0 ? S< 07:29 0:00 [kworker/0:0H]
root 7 0.0 0.0 0 0 ? S 07:29 0:00 [migration/0]
root 8 0.0 0.0 0 0 ? S 07:29 0:00 [rcu_bh]
root 9 0.0 0.0 0 0 ? S 07:29 0:00 [rcuob/0]
root 10 0.0 0.0 0 0 ? S 07:29 0:00 [rcuob/1]
root 11 0.0 0.0 0 0 ? S 07:29 0:00 [rcuob/2]
[root@workstation ~]#
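To get a quick impression of how many processes are in each state, you can count the first letter of the STAT column. This is a small sketch that assumes a ps version that accepts the -o stat= format (as procps does); the variable name is my own.

```shell
#!/bin/bash
# Print the STAT value of every process without a header, keep only the
# first character (the main state letter), and count each letter.
state_counts=$(ps -eo stat= | cut -c1 | sort | uniq -c)
echo "$state_counts"
```

On an idle system most processes will be counted under S (sleeping), with only one or two under R (running).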
You should know that there are different kinds of processes. Among these are the service processes, the so-called daemons. An example is the httpd process, which provides web services on your system. Daemon processes are automatically started when your server boots and systemd is entering a specific target. A systemd target defines the state a system should be in, and the processes and services that should be started to get into that state. On the flip side are the interactive processes, which typically are started by typing some command at the command line of your computer.
Finally, there are two ways in which a process can do its work to handle multiple tasks. First, it can just launch a new instance of the same process to handle the incoming request. If this is the case, you will see the same process listed multiple times in ps aux. The alternative is that the process works with one master process only, but launches a thread, which is a kind of a subprocess, for each new request that comes in. Currently, processes tend to be multithreaded, as this uses system resources more efficiently. For a Linux administrator, managing a multi-threaded process is a bit more challenging. Threads are managed from the master process itself, and not by an administrator who’s trying to manipulate them from the command line.
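To find out whether a process works with threads, you can ask ps for the number of threads per process. The following sketch assumes the procps version of ps, which offers the nlwp column (number of lightweight processes, which is how the kernel refers to threads) and the -L option.

```shell
#!/bin/bash
# NLWP is the number of threads in each process. Sort numerically,
# highest first, and show the five processes with the most threads.
ps -eo nlwp=,pid=,comm= | sort -rn | head -n 5

# Show the individual threads of one process (here: the current shell).
# Each thread has its own LWP identifier but shares the PID.
thread_lines=$(ps -L -o pid=,lwp=,comm= -p $$)
echo "$thread_lines"
```

A single-threaded process such as bash shows one line, with identical PID and LWP values.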
Monitoring Processes
All work on processes that you’ll need to do will start by monitoring what the process is doing. Two commands are particularly important: top and ps. The ps command allows you to display a list of all processes that are running on your computer. Because ps lists all processes (when used as root), that makes it an excellent choice if you need to find a given process to perform management tasks on it. The top command gives an overview of the most active processes.
This overview is refreshed every 3 seconds by default. As it also offers you a possibility to perform management tasks on these active processes, top is a very useful command for process management, especially for users who are taking their first steps on the Linux command line.
Monitoring Processes with top
The single most useful utility for process management is top. You can start it by typing the top command at the command line. Make sure that you have root permissions when doing this; otherwise, you can manage only your own processes. In Listing 9-3, you can see what the top screen looks like.
Listing 9-3. top Makes Process Management Easy
top - 10:22:12 up 2:52, 3 users, load average: 0.00, 0.01, 0.05
Tasks: 290 total, 2 running, 288 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.3 us, 0.0 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1010336 total, 122568 free, 566424 used, 321344 buff/cache
KiB Swap: 1048572 total, 1041076 free, 7496 used. 253484 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
772 root 20 0 268820 3160 2308 S 0.3 0.3 0:09.78 vmtoolsd
804 root 20 0 550176 16596 4084 S 0.3 1.6 0:01.40 tuned
1 root 20 0 59732 4988 2768 S 0.0 0.5 0:02.67 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 0:00.12 ksoftirqd/0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
7 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
9 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/1
11 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/2
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/3
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/4
14 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/5
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/6
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/7
17 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/8
top basically shows you all you need to know about the current status of your system, and it refreshes its output every 3 seconds by default. Its results are divided into two parts. On the top part of the output window, you can see how busy your system is; in the lower part, you’ll see a list of the busiest processes your computer currently has.
The upper five lines of the top output (see Listing 9-3) show you what your system is currently doing. This information can be divided into a few categories:
· Data about uptime and users: On the first line, top shows you the current time (10:22 in this example), which is followed by the time the system has been up and the number of users connected to the system. Although useful, this is not critical information for process management.
· Current usage statistics: Still on the first line, there are three numbers related to current system usage. These three numbers indicate how busy your computer is relative to the number of CPUs or CPU cores in your computer (from the perspective of top, there is no difference between a CPU and a CPU core): they give you the average for the last minute, the last 5 minutes, and the last 15 minutes.
The numbers give an overview of the average number of processes that have been waiting to be serviced in the indicated period. In general, the number that you see here shouldn’t exceed the number of CPU cores in your computer (but exceptions to this generic guideline do exist).
· Overview of tasks: The second line of top shows you information about the total number of tasks and their current status. On an average computer, you won’t see many more than about 200 tasks here, but specific workloads may have a much higher or lower number of active processes. The following status information for these tasks is displayed:
· Running: These are tasks that have been actively serviced during the last polling loop.
· Sleeping: These are tasks that have not been active in the last polling loop.
· Stopped: These are tasks that are stopped using the Ctrl-Z keystroke.
· Zombie: These are tasks that have terminated but whose parent has not yet collected their exit status, so they cannot be stopped or managed anymore.
· Overview of CPU usage: If the load average of your computer is relatively high, the CPU usage line can give an indication of exactly what your computer is doing. In this line, a subdivision is made of the different kinds of demands that processes are issuing on your CPU. On a multi-CPU system, you’ll see the summary for all CPUs together. If you want to see the load statistics for each of the CPUs from the top interface, press 1. The following options are listed:
· us: The percentage of CPU time spent running code in user space. This is the work that applications do themselves, without calling on the kernel.
· sy: The percentage of CPU time spent running code in system (kernel) space. This is the work that the kernel does, often on behalf of processes that issue system calls. As compared to user space, the number that you see here should be relatively low on most systems, but exceptions do exist.
· ni: The percentage of CPU time spent running user-space processes of which the priority has been adjusted using nice.
· id: The activity of the idle loop. This gives the percentage of inactivity on your system. It is no problem at all if this parameter is high.
· wa: The amount of time that your system has spent in waiting mode. This is the time that your system has been waiting for I/O. If you see a high value here, it indicates that you have a lot of I/O-related tasks on your computer and that the storage in your computer cannot deal with it efficiently. An average that is higher than 30% may indicate that your I/O channel doesn’t perform as it should.
· hi: The amount of time that your computer has spent handling hardware interrupts. It should be low at all times. If you see a high value here, it often indicates that some badly functioning drivers are used.
· si: The amount of time that your system has spent handling software interrupts.
· st: This parameter applies to environments where virtualization is used. It indicates the amount of time that was stolen from the processor in this machine by other virtual machines.
· Current memory use: In the last part of the upper lines of top output, you can see information about the amount of memory your computer is using. These two lines give information about the usage of real memory and swap memory, which is emulated memory on the hard disk of your computer, at the same time. The following parameters are listed:
· KiB Mem total: the total amount of memory in kibibytes that is available as physically installed RAM
· free: the amount of memory that currently isn’t used for anything.
· used: the amount of memory that has been allocated by programs and services running on your computer.
· buff/cache: the amount of memory that is being used to cache read and write requests. This is memory that the Linux kernel can make available for other tasks in case this is needed.
· KiB Swap total: The total amount of swap memory that is available
· free: The amount of swap that is not being used
· used: The amount of swap that currently has been allocated
· avail Mem: The total amount of memory that is available. This amount consists of the amount that is listed as free, plus the amount of memory that can be freed immediately by liberating unnecessary buffer and cache memory.
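The guideline about load average and CPU cores can be checked from a script as well. The following is a minimal sketch that assumes a Linux system, where the load averages can be read from /proc/loadavg and the nproc command reports the number of available cores; the variable names are my own.

```shell
#!/bin/bash
# /proc/loadavg starts with the 1-, 5-, and 15-minute load averages.
read -r load1 load5 load15 rest < /proc/loadavg
cores=$(nproc)

# Compare the load to the core count with awk, because bash itself
# cannot do floating-point arithmetic.
verdict=$(awk -v l="$load1" -v c="$cores" \
    'BEGIN { print ((l > c) ? "above" : "at or below") }')
echo "1-minute load $load1 is $verdict the core count ($cores)"
```

A load that stays above the core count for the 5-minute and 15-minute values as well is a better reason for further investigation than a short peak in the 1-minute value.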
The lower part of the top output shows you process information, divided into a couple of columns that are displayed by default. You should know that more columns are available than the ones displayed by default. If you want to activate the display of other columns, you should press the F key while in the top screen. This shows you a list of all columns that are available, indicating with an * which are currently active, as you can see in Listing 9-4. To toggle the status of a column, move to it with the arrow keys and press d or the space bar. For instance, activating the P column will show you which CPU was last used by a process.
Listing 9-4. You Can Toggle Other Columns to Be Displayed As Well in top
Fields Management for window 1:Def, whose current sort field is %CPU
Navigate with Up/Dn, Right selects for move then <Enter> or Left commits,
'd' or <Space> toggles display, 's' sets sort. Use 'q' or <Esc> to end!
* PID = Process Id PGRP = Process Group vMj = Major Faults
* USER = Effective Use TTY = Controlling T vMn = Minor Faults
* PR = Priority TPGID = Tty Process G USED = Res+Swap Size
* NI = Nice Value SID = Session Id nsIPC = IPC namespace
* VIRT = Virtual Image nTH = Number of Thr nsMNT = MNT namespace
* RES = Resident Size P = Last Used Cpu nsNET = NET namespace
* SHR = Shared Memory TIME = CPU Time nsPID = PID namespace
* S = Process Statu SWAP = Swapped Size nsUSER = USER namespac
* %CPU = CPU Usage CODE = Code Size (Ki nsUTS = UTS namespace
* %MEM = Memory Usage DATA = Data+Stack (K
* TIME+ = CPU Time, hun nMaj = Major Page Fa
* COMMAND = Command Name/ nMin = Minor Page Fa
PPID = Parent Proces nDRT = Dirty Pages C
UID = Effective Use WCHAN = Sleeping in F
RUID = Real User Id Flags = Task Flags <s
RUSER = Real User Nam CGROUPS = Control Group
SUID = Saved User Id SUPGIDS = Supp Groups I
SUSER = Saved User Na SUPGRPS = Supp Groups N
GID = Group Id TGID = Thread Group
GROUP = Group Name ENVIRON = Environment v
The following list describes the columns that are listed by default:
· PID: This is the process identification (PID) number of the process. Every process has a unique PID, and you will need this PID number to manage the process.
· USER: This indicates the name of the user who started the process.
· PR: This indicates the current process priority. Processes with a higher priority (which is expressed with a lower number!) will be serviced before processes with a lower priority. If a process with a higher priority needs CPU time, it will always be handled before the process that has a lower priority. Some processes have the RT (real time) priority, which means that they can access system resources at all times.
· NI: Between processes that have the same priority, the nice value indicates which has precedence. Processes with a low nice value are not so very nice and will always go before processes with a high nice value. However, this works only for processes that have the same priority.
· VIRT: This column refers to the total amount of memory that is allocated by a process. The amount that is mentioned here is just a reservation of an address range and doesn’t refer to any physically used memory. All processes on Linux can make virtual memory reservations from a total address space of 32 TB!
· RES: This column indicates the amount of resident memory, which is memory that the process has allocated and is currently also actively using. You may see differences between VIRT and RES because processes like to ask for more memory than they really need at the moment, which is referred to as memory over allocation.
· SHR: This refers to shared memory. Typically, these are libraries the process uses that are used by other processes as well.
· S: This column gives the process status. The values that you find here are the same as the values in the second line of the top output, as discussed previously.
· %CPU: This column shows the percentage of CPU cycles that the process has been using. This is also the column that top sorts by default; the most active process is listed at the top of the list.
· %MEM: This column refers to the percentage of memory that the process is using.
· TIME+: This indicates the accumulated CPU time that the process has used during the total period since it has started.
· COMMAND: This indicates the command that was used to start this process.
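By default, top output is sorted on CPU usage. You can sort the output on any other information as well; there are over 20 different ways to do so. In the Fields Management screen shown in Listing 9-4, you move to the field you want and press s. Older versions of top instead assign a letter to each sort field after you press F or O. Some of my favorites from that list are shown here: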
· b: By process parent ID. This allows you to see in a quick overview all processes that are started by the same parent process.
· w: By process status. This allows you to group all processes that have the same status in an easy way.
· d: By UID. This allows you to see all processes that were started by the same user.
· h: By priority. This allows you to see processes with the highest priority listed on top.
· n: By memory usage. This shows the processes that have the largest amount of memory in use listed first.
When done monitoring process activity with top, you can exit the utility. To do this, issue the q command. Apart from the interactive mode that you’ve just read about, you can also use top in batch mode. This is useful if you want to redirect the output of top to a file or pipe it to some other command. When using top in batch mode, you can’t use any of the commands discussed previously. You tell top to start in batch mode by passing some options to it when starting it:
· -b: Starts top in batch mode
· -d: Tells top what delay it should use between samples
· -n: Tells top how many times it should produce its output in batch mode
For instance, the following would tell top to run in batch mode with a 5-second interval, doing its work two times:
top -b -d 5 -n 2
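Because batch mode writes plain text, it combines well with other commands. As a small sketch (the number of lines to keep is an arbitrary choice of mine), the following takes one snapshot and keeps only the summary lines, the column header, and the first few processes:

```shell
#!/bin/bash
# Take a single snapshot in batch mode and keep the summary lines
# plus the column header and the first few processes.
snapshot=$(top -b -n 1 | head -n 12)
echo "$snapshot"
```

In the same way you could redirect the output to a file, for example to compare the state of the system at different moments.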
EXERCISE 9-1: MONITORING PROCESSES WITH TOP
In this exercise you’ll start some processes and monitor behavior using top.
1. Type dd if=/dev/zero of=/dev/null &
2. Repeat step 1 three more times.
3. Observe system load in top. What numbers do you see in the load average indicators?
4. How much CPU load is currently happening on your server?
5. Observe the idle loop; it should be close to 0.
6. Observe how the processes are equally using system resources.
7. From the top interface, type r. Next, enter the PID of one of the four dd processes, and enter the nice value 5. This should lower the amount of CPU time this process is getting.
8. From the top interface, press k. Next, enter the PID of one of the four dd processes. This should remove the process from your system.
9. Repeat the procedure from step 8 to remove all other dd processes also.
Finding Processes with ps
If you want to manage processes from scripts in particular, the ps command is invaluable. This command shows you a list of all processes that are currently active on your computer. ps has many options, but most people use it in two ways only: ps aux and ps -ef. The value of ps is that it shows all processes in its output in a way that you can grep for the information you need. Imagine that you see in top that there is a zombie process; ps aux | grep defunc will show you which is the zombie process. Or imagine that you need the PIDs of all instances of your Apache web server; ps aux | grep httpd will give you the result.
One way of displaying all processes and their properties is by using ps aux. Listing 9-5 shows a part of the output of this command. To make it more readable I’ve piped the results of this command through less.
Listing 9-5. ps aux Shows All Processes and a Lot of Details About What the Processes Are Doing
nuuk:/ # ps aux | less
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 728 284 ? S Dec16 0:00 init [5]
root 2 0.0 0.0 0 0 ? SN Dec16 0:00 [ksoftirqd/0]
root 3 0.0 0.0 0 0 ? S< Dec16 0:00 [events/0]
root 4 0.0 0.0 0 0 ? S< Dec16 0:00 [khelper]
root 5 0.0 0.0 0 0 ? S< Dec16 0:00 [kthread]
root 8 0.0 0.0 0 0 ? S< Dec16 0:00 [kblockd/0]
root 9 0.0 0.0 0 0 ? S< Dec16 0:00 [kacpid]
root 10 0.0 0.0 0 0 ? S< Dec16 0:00 [kacpi_notify]
root 110 0.0 0.0 0 0 ? S Dec16 0:00 [pdflush]
root 111 0.0 0.0 0 0 ? S Dec16 0:00 [pdflush]
root 112 0.0 0.0 0 0 ? S Dec16 0:00 [kswapd0]
root 113 0.0 0.0 0 0 ? S< Dec16 0:00 [aio/0]
root 320 0.0 0.0 0 0 ? S< Dec16 0:00 [cqueue/0]
root 321 0.0 0.0 0 0 ? S< Dec16 0:00 [kseriod]
root 365 0.0 0.0 0 0 ? S< Dec16 0:00 [kpsmoused]
root 723 0.0 0.0 0 0 ? S< Dec16 0:00 [scsi_eh_0]
root 833 0.0 0.0 0 0 ? S< Dec16 0:00 [ksnapd]
root 837 0.0 0.0 0 0 ? S< Dec16 0:00 [ata/0]
root 838 0.0 0.0 0 0 ? S< Dec16 0:00 [ata_aux]
root 895 0.0 0.0 0 0 ? S Dec16 0:00 [kjournald]
root 964 0.0 0.1 2408 684 ? S<s Dec16 0:00 /sbin/udevd --daemon
lines 1-22
In the command ps aux, three options are used to ask the system to show process information. First, the option a makes sure that the processes of all users are shown, not just your own. Next, the option u selects the user-oriented output format, which adds columns such as the process owner and the CPU and memory usage, whereas the option x also includes processes that are not attached to a TTY, such as daemons. You can see the results in Listing 9-5, in which the following columns are listed. Because many of these columns are similar to the columns in top, I will give a short description of them only.
· USER: The name of the user who started the process.
· PID: The PID of the process. The command ps aux sorts the processes by their PID.
· %CPU: The percentage of CPU time the process has used since startup.
· %MEM: The percentage of memory the process is currently using.
· VSZ: The virtual memory size, which is the total amount of memory claimed by this process.
· RSS: The resident memory size, which is the amount of memory the process currently has in use.
· TTY: The terminal (TTY) from which the process was started. A question mark indicates a daemon process that is not associated with any TTY.
· STAT: The current status of the process.
· START: The time at which the process was started.
· TIME: The total amount of CPU time this process has been using since it started.
· COMMAND: The command that was used to start this process. If the name of this command is between square brackets (you can see quite a few examples of this in Listing 9-5), the process is not started with a command at the command line, but is a kernel thread.
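If you need only some of these columns, you can select them yourself instead of using the fixed aux layout. The following sketch assumes the procps version of ps, which supports the -o option with the column keywords used here and the --sort option; the choice of columns is just an example.

```shell
#!/bin/bash
# Select columns explicitly with -o and sort by CPU usage, highest first.
# pid, user, stat, pcpu, pmem, and comm are ps format keywords.
top_cpu=$(ps -eo pid,user,stat,pcpu,pmem,comm --sort=-pcpu | head -n 6)
echo "$top_cpu"
```

This prints the header line followed by the five processes with the highest %CPU value.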
Note The ps command can be used in two ways, both of which go back to the time when there were two major styles in UNIX versions: the BSD style and the System V style. The command ps aux was used in the BSD style to give a list of all processes and their properties, and ps -ef was used in System V style to do basically the same. There are some minor differences, but basically both commands have the same result. So feel free to make your choice here!
The second way in which the ps command is often used is by issuing the ps -ef command. You can see a partial output of this command in Listing 9-6.
Listing 9-6. ps -ef Provides Just Another Way of Displaying Process Information
nuuk:~ # ps -ef | less
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 Dec16 ? 00:00:00 init [5]
root 2 1 0 Dec16 ? 00:00:00 [ksoftirqd/0]
root 3 1 0 Dec16 ? 00:00:00 [events/0]
root 4 1 0 Dec16 ? 00:00:00 [khelper]
root 5 1 0 Dec16 ? 00:00:00 [kthread]
root 8 5 0 Dec16 ? 00:00:00 [kblockd/0]
root 9 5 0 Dec16 ? 00:00:00 [kacpid]
root 10 5 0 Dec16 ? 00:00:00 [kacpi_notify]
root 110 5 0 Dec16 ? 00:00:00 [pdflush]
root 111 5 0 Dec16 ? 00:00:00 [pdflush]
root 112 1 0 Dec16 ? 00:00:00 [kswapd0]
root 113 5 0 Dec16 ? 00:00:00 [aio/0]
root 320 5 0 Dec16 ? 00:00:00 [cqueue/0]
root 321 5 0 Dec16 ? 00:00:00 [kseriod]
root 365 5 0 Dec16 ? 00:00:00 [kpsmoused]
root 723 5 0 Dec16 ? 00:00:00 [scsi_eh_0]
root 833 5 0 Dec16 ? 00:00:00 [ksnapd]
root 837 5 0 Dec16 ? 00:00:00 [ata/0]
root 838 5 0 Dec16 ? 00:00:00 [ata_aux]
root 895 1 0 Dec16 ? 00:00:00 [kjournald]
root 964 1 0 Dec16 ? 00:00:00 /sbin/udevd --daemon
root 1530 5 0 Dec16 ? 00:00:00 [khubd]
lines 1-23
Just two columns in ps -ef are new compared to the output for ps aux. First is the PPID column. This column tells you which process was responsible for starting this process, the so-called parent process. Then there is the column with the name C, which refers to the CPU utilization of this process and hence gives the same information as the %CPU column in ps aux.
Personally, I like ps aux a lot if I need to terminate all processes that were started with the same command. On my SUSE box, it happens that the management program YaST crashes. This program basically uses two kinds of processes: processes that have yast in their command name and processes that have y2 in their command line. To get a list of PIDs for these processes, I use the following commands:
ps aux | grep yast | grep -v grep | awk '{ print $2 }'
ps aux | grep y2 | grep -v grep | awk '{ print $2 }'
Next, it is fairly easy to kill all instances of this process based on the list of PIDs that these two commands will show. You’ll read more about this in the section “Killing Processes with kill, pkill, and killall” later in this chapter.
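The grep -v grep part is needed because the grep command itself also contains the text you are searching for. A common alternative is to put one letter of the search pattern between square brackets. The sketch below shows this with a sleep process as a stand-in for yast; the duration 31415 is an arbitrary value that only serves to make the command line recognizable.

```shell
#!/bin/bash
# Start a recognizable stand-in process.
sleep 31415 &

# '[s]leep 31415' matches the text "sleep 31415", but the grep process
# itself shows up as "grep [s]leep 31415", which the pattern does not match.
pids=$(ps aux | grep '[s]leep 31415' | awk '{ print $2 }')
echo "$pids"

# Clean up the stand-in process again.
kill "$!"
```

The list in $pids could be passed to kill in the same way as the PIDs that the two commands in the text produce.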
Another useful way of showing process activity with ps is by using ps fax. The option f shows the process list in a forest view, which allows you to easily see relations between parent and child processes. This offers an alternative to the pstree command for showing parent-child relations.
Finding PIDs with pgrep
In the preceding section, you read how you can find processes with ps and grep. There is another option as well: the pgrep command. This command is fairly easy to use: enter pgrep followed by the name of the process whose PID you are looking for, and as a result you will see all PIDs that instances of this process currently are using. For instance, if you want to know all PIDs that the GNOME processes are using, use pgrep gnome. This will display a result similar to what you see in Listing 9-7.
Listing 9-7. The pgrep Command Offers an Alternative If You Need to Find PIDs Easily
nuuk:~ # pgrep gnome
3781
3836
3840
3854
3860
3882
3889
3893
3921
3922
A useful feature of pgrep is that you can search for processes based on specific attributes as well. For instance, you can use -u to locate processes that are owned by a specific user, as in the following command:
pgrep -u linda
It is also useful that you can list several values for an attribute; a process is then displayed if it matches any of them. For example, if you want to see processes that are owned by either linda or lori, use the following:
pgrep -u linda,lori
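A list of bare PIDs is not always easy to interpret. The following sketch, which assumes the procps version of pgrep, uses the -l option to add the process name to each PID and the -c option to count the matches. It uses the name of the current user instead of linda, so that it works on any system.

```shell
#!/bin/bash
# -l adds the process name to each PID; -u restricts the search to
# processes owned by the given user.
me=$(id -un)
pgrep -l -u "$me"

# -c prints only the number of matching processes.
count=$(pgrep -c -u "$me")
echo "$me owns $count processes"
```

Note that pgrep never lists itself, so you don't need a grep -v construction here.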
Showing Parent-Child Relations with pstree
For process management purposes, it is useful to know about parent-child relations between processes as well. You can use the pstree command without arguments to show a complete hierarchical list of all processes on your computer, or with a PID as an argument to show a process tree that starts on the selected PID only. If the output of pstree looks weird, you should use the -G option to give the result of pstree in a specific format for your terminal.
I need this to ensure proper display in a PuTTY window, for example. In Listing 9-8, you can see a partial output of this command.
Listing 9-8. Use pstree to Find Out More About the Hierarchical Relation Between Processes
nuuk:~ # pstree -G
init---acpid
|-application-bro
|-auditd---{auditd}
|-bonobo-activati
|-cron
|-cupsd
|-2*[dbus-daemon]
|-dbus-launch
|-dhcpcd
|-esd
|-events/0
|-events/1
|-gconfd-2
|-gdm---gdm---X
| |-gnome-session
|-gnome-keyring-d
|-gnome-panel
|-gnome-power-man
|-gnome-screensav
|-gnome-settings---{gnome-settings-}
|-gnome-terminal---bash
| |-gnome-pty-helpe
| |-{gnome-terminal}
|-gnome-vfs-daemo---{gnome-vfs-daemo}
|-gnome-volume-ma
|-gpg-agent
|-hald---hald-addon-acpi
| |-hald-addon-stor
|-intlclock-apple
|-irqbalance
|-khelper
|-kjournald
|-klogd
|-ksoftirqd/0
|-ksoftirqd/1
|-kswapd0
|-kthread---aio/0
| |-aio/1
| |-ata/0
| |-ata/1
| |-ata_aux
| |-cqueue/0
| |-cqueue/1
| |-kacpi_notify
| |-kacpid
| |-kauditd
| |-kblockd/0
| |-kblockd/1
| |-kgameportd
| |-khubd
| |-kpsmoused
| |-kseriod
| |-2*[pdflush]
| |-scsi_eh_0
|-main-menu---{main-menu}
|-master---pickup
| |-qmgr
|-metacity
|-migration/0
|-migration/1
|-6*[mingetty]
|-mixer_applet2
|-nautilus---3*[{nautilus}]
|-nscd---6*[{nscd}]
|-portmap
|-powersaved
|-resmgrd
|-shpchpd_event
|-slpd
|-sshd---sshd---bash---pstree
|-startpar
|-syslog-ng
|-udevd
|-zen-updater---6*[{zen-updater}]
|-zmd---13*[{zmd}]
In the output of pstree, you can see which process is responsible for calling which other process. For instance, in Listing 9-8, init is the first process that is started. (The output of this command was generated on an older Linux distribution, where init was used as the service manager instead of the more recent systemd.)
The init process calls basically all the other processes, such as acpid, application-bro, and so on. If a process has started other processes, you will see that with pstree as well. For instance, you can see that the pstree command used for this example listing is in the output listing as well, as a child of the bash process, which in turn is started from an SSH environment.
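To limit the tree to the part you are interested in, pass a PID as an argument. The sketch below shows the tree that starts at the current shell, with the -p option to include the PIDs. It assumes that pstree is installed (it is part of the psmisc package on many distributions) and falls back to the forest view of ps if it isn't.

```shell
#!/bin/bash
if command -v pstree >/dev/null 2>&1; then
    # -p adds PIDs to the tree; the PID argument limits the output
    # to the subtree that starts at the current shell.
    tree=$(pstree -p $$)
else
    # Fallback when pstree is not installed: ps can draw a similar
    # forest view of all processes.
    tree=$(ps -e -o pid,args --forest)
fi
echo "$tree"
```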
Note Some people like to run a graphical user interface on their server; some people don’t. From the process perspective, it certainly makes sense not to run a GUI on your server. If you are not sure this really is useful, you should compare the result of pstree on a server that does have a GUI up and running with the result of the same command on a server that does not have a GUI up and running. You’ll see amazing differences as the result.
Managing Processes
At this point you know how to monitor the generic state of your computer. You have read how to see what processes are doing and know about monitoring process activity. In this section, you’ll learn about some common process management tasks. These include killing processes that no longer respond and adjusting process priority with nice. In a dedicated subsection, you can read how to manage processes from the top utility.
Killing Processes with kill, pkill, and killall
Among the most common process management tasks is the killing of processes. Killing a process, however, goes beyond the mere termination of a process. If you use the kill command or any of its alternatives, you can send a signal to the process. Basically, by sending it a signal, you give the process a command. Most signals can be handled or ignored by the process, but signal 9 (SIGKILL) cannot. Many signals are available (the command kill -l lists them all), but of these only four are common. Table 9-1 gives an overview of these common signals.
Table 9-1. Common Process Management Signals
Signal | Value | Comment
SIGHUP | 1 | By convention, makes many daemons reread their configuration without really stopping. Use it to apply changes to configuration files.
SIGKILL | 9 | Terminates the process using brute force. The process cannot catch or ignore this signal, so you risk losing data from open files. Use it only if the process doesn’t stop after sending it signal 15.
SIGTERM | 15 | Requests the process to terminate. The process may ignore this.
SIGUSR1 | 10 | Sends a user-defined signal to the process. Only works if the command defines a handler for it. The number is 10 on x86 and ARM systems but differs on some other architectures, so prefer the name.
When sending a signal to the process, you normally can choose between the signal name or the signal number. In the next three sections, you will see how to do this with the kill, pkill, and killall commands.
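As a minimal sketch of the name-or-number choice, the following lines use kill, which is covered in the next section, on a throwaway background process. The process is deliberately set up to ignore SIGTERM, so you can see that signal 15 is only a request, whereas signal 9 always works. The sleep command and the variable names are illustrations only.

```shell
# Start a process that ignores SIGTERM; exec makes sleep replace the shell,
# so there is exactly one process to deal with.
sh -c 'trap "" TERM; exec sleep 300' &
stubborn_pid=$!
sleep 1                          # give the process time to set up its trap

kill -TERM "$stubborn_pid"       # signal given by name: the process ignores it
sleep 1

# Signal 0 sends nothing; it only checks whether the process still exists.
if kill -0 "$stubborn_pid" 2>/dev/null; then
    survived=yes
else
    survived=no
fi
echo "still running after SIGTERM: $survived"

kill -9 "$stubborn_pid"          # signal given by number: SIGKILL cannot be ignored

# wait reports 128 plus the signal number for a process ended by a signal.
status=0
wait "$stubborn_pid" || status=$?
echo "exit status after SIGKILL: $status"
```

The last line reports 137, which is 128 + 9. In daily use you would try the default signal 15 first and fall back to signal 9 only when the process does not react.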
Killing processes with kill
The kill command provides the most common way to send signals to processes, and you will find it quite easy to use. This command works with only two arguments: the signal number or name and the PID upon which you want to act. If you don’t specify a signal number, kill by default sends signal 15, asking the process to terminate.
kill does not work with process names, just PID numbers. This means you first have to find the PIDs of the processes you want to send a signal to, which you can do with a command such as pgrep. You can specify multiple PIDs as arguments to kill. The following example shows you how to kill three PIDs with a single command:
kill 3019 3021 3022
Only some commands listen to user-defined signals. An example of these is the dd command, which you can use to clone a device. You can send this command signal USR1, which asks dd to show its current progress. To find out whether a command listens to one of the USR signals, check the man page of that command; the man page for kill describes only how to send signals.
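The following sketch shows what such a progress request could look like. It assumes the GNU version of dd that most Linux distributions ship; the temporary file and the variable names are chosen for this illustration only.

```shell
# dd writes its messages to stderr; collect them in a temporary file.
progress_file=$(mktemp)

dd if=/dev/zero of=/dev/null 2>"$progress_file" &
dd_pid=$!
sleep 1                     # let dd start and install its signal handler

kill -USR1 "$dd_pid"        # ask for a progress report; dd keeps running
sleep 1                     # give dd a moment to write the report

kill "$dd_pid"              # default signal 15 ends the copy
# wait returns a non-zero status here because dd was ended by a signal.
wait "$dd_pid" || true

report=$(cat "$progress_file")
rm -f "$progress_file"
echo "$report"
```

The report consists of the usual dd statistics: the number of records read and written, followed by the amount of data copied so far. Do not send USR1 to a command that has no handler for it, because the default action of this signal is to terminate the process.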
Killing processes with killall
Compared to kill, killall is more versatile, because it selects the processes to signal by name or by other attributes rather than by PID. For instance, if you specify a name that contains a slash, killall signals all processes that are executing that particular file. Some of the most useful options for killall are listed here:
· -I: This option tells killall to ignore case. Useful if you don’t want to think about upper- and lowercase.
· -i: This option puts killall in interactive mode. You’ll have to confirm before any process is killed.
· -r: This option allows you to work with regular expressions. This is useful because you won’t have to enter the exact process name.
· -u: This option kills only processes that a specific user owns. Useful if you need to terminate everything a user is doing right now.
For example, if you want to kill all processes that linda currently has running, use the following command:
killall -u linda
Or if you need to terminate all http processes, use regular expressions as in the following command:
killall -r http
Killing processes with pkill
The third command that you can use to send signals to processes is pkill. Like killall, pkill can also look up processes based on their name or other attributes, which you can address using specific options. For instance, to kill all processes that are owned by user linda, use the following:
pkill -u linda
Another useful feature of pkill is that you can kill processes by their parent ID. For example, if you need to kill all processes that have process 1499 as their parent ID, use the following:
pkill -P 1499
Adjusting Process Priority with nice
As discussed earlier in this chapter, every process is started with a default priority. You can see the priority in the default output of the top command. By default, all processes that have the same priority are treated as equal by the operating system. If within these priorities you want to give more CPU time to a process, you can use the nice and renice commands to change their nice status. Process niceness ranges from -20 to 19. -20 means that a process is not very nice and will get the most favorable scheduling. 19 means that a process is very nice to others and gets the least favorable scheduling.
There are two ways to change the niceness of a program: use nice to start a program with the niceness that you specify, and use renice to change the niceness of a program that has already been started. The following shows how to start top with a niceness of 5:
nice -n 5 top
In case you need to change the nice value for a program that is already running, you should use renice. A useful feature is the option to change the nice status of all processes that a given user has started. For instance, the following command would change the niceness of all processes linda has started to the value -5. Because normally only root can assign a negative niceness, run it as root:
renice -5 -u linda
You can also just use a PID to change the nice value of a process:
renice -5 1499
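To see both commands at work without needing root privileges, you can try a sketch such as the following. It only raises niceness, which every user is allowed to do. Two details are assumptions of this example rather than something stated above: nice without arguments prints the niceness of the current shell, and the value passed to nice -n is added to that niceness. The sleep process and the variable names are placeholders.

```shell
# nice without arguments prints the niceness of the current shell.
base=$(nice)

# Start a command with its niceness raised by 5; the inner nice reports
# the niceness it was started with.
started=$(nice -n 5 nice)
echo "shell niceness: $base, started with nice -n 5: $started"

# Change the niceness of a process that is already running.
sleep 300 &
pid=$!
renice -n 19 -p "$pid"

# On Linux, field 19 of /proc/PID/stat holds the niceness of the process.
current=$(cut -d ' ' -f 19 "/proc/$pid/stat")
echo "niceness of process $pid after renice: $current"

kill "$pid"                 # clean up the test process
```

A regular user can make a process nicer this way, but normally cannot make it less nice again afterward; that takes root privileges.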
Process Management from top
You have already learned how to monitor processes using top. You’ve also learned how to manage processes using different command-line tools. From within the top interface, you can also perform some process management tasks. Two tasks are available: you can send processes a signal using kill functionality, and you can renice a process using nice functionality. To do this, use the following options from within the top interface:
· k: Sends a signal to a process. It will first ask for the PID, and then what signal to send to that PID. You should use the numerical PID to manipulate the process.
· r: Changes the niceness of a process. When using this command, you next have to enter the PID of the process whose niceness you need to change.
EXERCISE 9-2: MANAGING PROCESSES
In this exercise you’ll learn how to manage processes using kill, killall, and renice.
1. Create some workload: type the command dd if=/dev/zero of=/dev/null & four times.
2. Type ps aux to verify that the four processes have been started. Processes that have been started last will show at the end of the list. Note the PID of one of the dd processes.
3. Type PID=nnn, where nnn is the PID that you have found in step 2.
4. As root, type renice -5 $PID. This adjusts the niceness of the process to -5, which gives it more favorable scheduling than the other dd processes. Normally only root can set a negative niceness.
5. Type pidof dd. This shows the PIDs of all dd processes that are currently running.
6. Use kill nnn to kill one of the dd processes. Make sure to replace nnn with one of the PIDs that you have found in the previous step.
7. Type killall dd. This will kill all of the remaining dd processes.
Scheduling Processes
On your computer, some processes will start automatically. In particular, these are the service processes your computer needs to do its work. Other processes are started manually. This means that you have to type a command at the command line to start them. There is also a solution between these two options. If you need a certain task to start automatically at predefined intervals, you can use cron to do so.
There are two parts in cron. The first is the cron daemon crond. This process starts automatically on all computers and checks its configuration every minute to see whether it has to run a task. The second is the configuration that tells crond what to run. By default, cron reads its master configuration file, which is /etc/crontab. Listing 9-9 shows what this file looks like on an Ubuntu server system.
Caution! The file /etc/crontab directs all tasks that should be scheduled through cron. Notice that you should NOT modify this file directly. After the discussion of the contents of this file, you’ll read how you should make changes to the configuration of scheduled tasks through cron. Modifications that have been made to /etc/crontab will work, but you might lose them as this file can be overwritten during package updates.
Listing 9-9. Example /etc/crontab File
root@ubuntu:~# cat /etc/crontab
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don’t have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.
SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
# m h dom mon dow user command
17 * * * * root cd / && run-parts --report /etc/cron.hourly
25 6 * * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.daily )
47 6 * * 7 root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.weekly )
52 6 1 * * root test -x /usr/sbin/anacron || ( cd / && run-parts --report /etc/cron.monthly )
Every entry in /etc/crontab consists of three elements. First, you can see an indication of the time when a command should run. Next is the name of the user with whose permissions the job has to execute, and the last part is the command that has to run.
You can use five time positions to indicate when a cron job has to run:
· Minute
· Hour
· Day of month
· Month
· Day of week
For instance, a task definition in /etc/crontab can look as follows:
10 5 3 12 * nobody /usr/bin/false
This task would start 10 minutes after 5 a.m. on December 3 only. A very common error that people make is shown in the following example:
* 5 * * * nobody /usr/bin/false
The purpose of this line is probably to run a task at 5 a.m. every morning; however, it would run every minute between 5:00 a.m. and 5:59 a.m., because the minute specification is an asterisk, which means “every.” Instead, to run the task at 5 a.m. only, the following should be specified:
0 5 * * * nobody /usr/bin/false
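If you want to check how cron will read a line, you can split it into its fields yourself. The following shell function is a small sketch written for this purpose; explain_cron_line is a made-up name, not an existing tool, and the function only knows about the mistake described above.

```shell
# Split one /etc/crontab line into its fields and warn about an asterisk
# in the minute position.
explain_cron_line() {
    # read assigns the first six words to the named variables and the
    # remainder of the line to "command".
    read -r minute hour day_of_month month day_of_week user command <<EOF
$1
EOF
    echo "minute=$minute hour=$hour day-of-month=$day_of_month month=$month day-of-week=$day_of_week user=$user command=$command"
    if [ "$minute" = "*" ] && [ "$hour" != "*" ]; then
        echo "warning: runs every minute of hour $hour, not once"
    fi
}

explain_cron_line '10 5 3 12 * nobody /usr/bin/false'
explain_cron_line '* 5 * * * nobody /usr/bin/false'
explain_cron_line '0 5 * * * nobody /usr/bin/false'
```

Only the second call prints the warning, because it is the only line that combines an asterisk in the minute field with a fixed hour.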
Creating user crontabs
Apart from the system crontab, individual users can have crontabs as well. This is the usual way to make adjustments to the scheduled tasks. Imagine that you want to make a backup every morning. To do so, you probably have a backup program, and this backup program may run automatically with the permissions of a specific user. You can, of course, make the definition in /etc/crontab, with the disadvantage that only root can schedule jobs this way. Therefore, the alternative in which users themselves specify the cron job may be more appealing. To do this, you have to use the crontab command. For instance, if user linda wants to install a cron job to send a mail message to her cell phone every morning at 6 a.m., she would use the following command:
crontab -e
This opens an editor window in which she can define the tasks that she wants cron to run automatically. Because the crontab file will be installed as her crontab file, there is no need to include a user specification there. This means just including the following line would be enough:
0 6 * * 1-5 mail -s "wakeup" mycellphone@example.com < /dev/null
Notice the use of 1-5 in the specification of the day of the week. This tells the cron process to run this job only on days 1 through 5, which is from Monday to Friday.
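A range is only one of the ways to fill in a time field. As an illustrative sketch, a user crontab could also use lists and step values, as in the following lines. The script paths are hypothetical and stand for whatever linda wants to run; the syntax is the one accepted by the cron versions commonly shipped with Linux distributions.

```
# m h dom mon dow command (a user crontab has no user field)
# Every 15 minutes:
*/15 * * * * /home/linda/bin/check-mail.sh
# On the hour from 8 a.m. through 5 p.m., Monday to Friday:
0 8-17 * * 1-5 /home/linda/bin/sync-notes.sh
# At 6:30 a.m. on Monday, Wednesday, and Friday:
30 6 * * 1,3,5 /home/linda/bin/wakeup.sh
```

Comments are placed on lines of their own here, because cron does not treat text after a command as a comment.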
If you are logged in as the root user, you can also create cron jobs for other users. To do this, use crontab -u followed by the name of the user you want to create the cron job for, together with the -e option. The command crontab -u linda -e, if issued as root, opens the crontab editor for user linda, in which you can enter all the required entries. Also useful: crontab -l lists the cron jobs that are currently scheduled for your own account, and as root you can use crontab -u linda -l to list those of user linda.
Understanding cron.{hourly|daily|weekly|monthly}
Cron also uses four different directories to execute cron jobs at a regular interval. These are the directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly and /etc/cron.monthly. In these directories you can put scripts that will be executed at the indicated intervals. You can create these scripts as administrator, but many scripts will be placed here automatically when new packages are installed. The logrotate processes for instance are executed this way.
These scripts contain shell code, and they don’t contain any of the time indicators that are specific to cron. These are not needed, because the cron helper process anacron takes care of executing these scripts. Anacron was developed to ensure that specific tasks are executed at a guaranteed interval, so a task also runs if the system was temporarily down at the scheduled time. As Listing 9-9 shows, cron itself starts the daily, weekly, and monthly scripts through run-parts if anacron is not installed.
Using /etc/cron.d
Yet another way of running tasks through cron is by creating files in /etc/cron.d. All files in the directory /etc/cron.d are included when the cron process is started. This approach offers an alternative to modifying the /etc/crontab file, with the advantage that your changes won’t get lost during software updates. The lines in the files in /etc/cron.d have exactly the same format as the lines in /etc/crontab, including the user field.
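As a sketch, a file in this directory could look as follows. Both the file name and the backup script are hypothetical; only the layout of the line is taken from /etc/crontab.

```
# /etc/cron.d/nightly-backup (hypothetical file)
# Same format as /etc/crontab: five time fields, a user name, then the command.
30 2 * * * root /usr/local/sbin/backup.sh
```

This entry would run the script as root at 2:30 a.m. every day.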
Summary
In this chapter, you have learned how to tune and manage processes and memory on your computer. You have learned about the way that Linux works with processes and also about memory usage on Linux. You acquired knowledge about some of the most important commands related to process management, including top and ps. In this chapter, the following commands and utilities have been discussed:
· init: First process loaded on a Linux computer.
· mingetty: Process responsible for initializing terminal windows.
· pstree: Command that shows a hierarchical overview of parent and child processes.
· nice: Command that sets priority of a process as it starts up.
· renice: Command that resets nice value for processes that are currently active.
· ps: Command that shows a list of processes and much useful information about each of them.
· top: Command that allows you to monitor processes and perform basic process management actions.
· pgrep: grep utility that is optimized for process management.
· free: Command that shows the amount of memory that is still available.
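· kill: Command for sending signals to processes by PID; by default it asks them to terminate.
· pkill: Command for sending signals to processes selected by name or other attributes, such as owner or parent PID.
· killall: Command for sending signals to all processes that match a name. Optimized to terminate multiple processes using one command.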
· crond: Process that allows you to run processes at a fixed time on a regular basis.
· crontab: Command that interfaces with crond to schedule processes.
In the next chapter, you’ll learn how to configure system logging.