Ubuntu 15.04 Server with systemd: Administration and Reference (2015)

Part V. Shells

Chapter 21. Shell Variables and Scripts

A shell script combines Linux commands in such a way as to perform a specific task. The different kinds of shells provide many programming tools that you can use to create shell scripts. You can define variables and assign values to them. You can also define variables in a script file, and have a user interactively enter values for them when the script is executed. The shell provides loop and conditional control structures that repeat Linux commands or make decisions on which commands you want to execute. You can also construct expressions that perform arithmetic or comparison operations. All these shell programming tools operate in ways similar to those found in other programming languages, so if you’re already familiar with programming, you might find shell programming simple to learn.

The BASH, TCSH, and Z shells are types of shells. You can have many instances of a particular kind of shell. A shell, by definition, is an interpretive environment within which you execute commands. You can have many environments running at the same time, of either the same or different types of shells; for example, you can have several shells of the BASH shell type running at the same time.

This chapter will cover the basics of creating a shell script using the BASH and TCSH shells, the shells used on most Linux systems. You will learn how to create your own scripts, define shell variables, and develop user interfaces, as well as learn the more difficult task of combining control structures to create complex programs. Tables throughout the chapter list shell commands and operators, while numerous examples show how they are implemented.

Usually, the instructions making up a shell program are entered into a script file that can then be executed. You can even distribute your program among several script files, one of which will contain instructions on how to execute others. You can think of variables, expressions, and control structures as tools you can use to bring together several Linux commands into one operation. In this sense, a shell program is a new and complex Linux command that you have created.

The BASH shell has a flexible and powerful set of programming commands that allows you to build complex scripts. It supports variables that can be either local to the given shell or exported to other shells. You can pass arguments from one script to another. The BASH shell has a complete set of control structures, including loops and if statements, as well as case structures, all of which you’ll learn about as you read this book. All shell commands interact easily with redirection and piping operations that allow them to accept input from the standard input or send it to the standard output. Unlike the Bourne shell, the first shell used for UNIX, BASH incorporates many of the features of the TCSH and Z shells. Arithmetic operations in particular are easier to perform in BASH.

Shell Variables

Within each shell, you can enter and execute commands. You can further enhance the capabilities of a shell using shell variables. A shell variable lets you hold data that you can reference over and over again as you execute different commands within a shell. For example, you can define a shell variable to hold the name of a complex filename. Then, instead of retyping the filename in different commands, you can reference it with the shell variable.

You define variables within a shell, and such variables are known as shell variables. Some utilities, such as the Mail utility, have their own shells with their own shell variables. You can also create your own shell using shell scripts. You have a user shell that becomes active as soon as you log in. This is often referred to as the login shell. Special system-level parameter variables are defined within this login shell. Shell variables can also be used to define a shell’s environment.

Note: Shell variables exist as long as your shell is active—that is, until you exit the shell. For example, logging out will exit the login shell. When you log in again, any variables you may need in your login shell must be defined again.

Definition and Evaluation of Variables: =, $, set, unset

You define a variable in a shell when you first use the variable’s name. A variable’s name may be any set of alphabetic characters, including the underscore. The name may also include a number, but the number cannot be the first character in the name. A name may not have any other type of character, such as an exclamation point, an ampersand, or even a space. Such symbols are reserved by the shell for its own use. Also, a variable name may not include more than one word. The shell uses spaces on the command line to distinguish different components of a command such as options, arguments, and the command name.

You assign a value to a variable with the assignment operator (=). You type the variable name, the assignment operator, and then the value assigned. Do not place any spaces around the assignment operator. The assignment operation poet = Virgil, for example, will fail. (The C shell has a slightly different type of assignment operation.) You can assign any set of characters to a variable. In the next example, the variable poet is assigned the string Virgil:

$ poet=Virgil

Once you have assigned a value to a variable, you can use the variable name to reference the value. Often you use the values of variables as arguments for a command. You can reference the value of a variable using the variable name preceded by the $ operator. The dollar sign is a special operator that uses the variable name to reference a variable’s value, in effect evaluating the variable. Evaluation retrieves a variable’s value, usually a set of characters. This set of characters then replaces the variable name on the command line. Wherever a $ is placed before the variable name, the variable name is replaced with the value of the variable. In the next example, the shell variable poet is evaluated and its contents, Virgil, is used as the argument for an echo command. The echo command simply echoes or prints a set of characters to the screen.

$ echo $poet
Virgil

You must be careful to distinguish between the evaluation of a variable and its name alone. If you leave out the $ operator before the variable name, all you have is the variable name itself. In the next example, the $ operator is absent from the variable name. In this case, the echo command has as its argument the word poet, and so prints out poet:

$ echo poet
poet

The contents of a variable are often used as command arguments. A common command argument is a directory pathname. It can be tedious to retype a directory path that is being used over and over again. If you assign the directory pathname to a variable, you can simply use the evaluated variable in its place. The directory path you assign to the variable is retrieved when the variable is evaluated with the $ operator. The next example assigns a directory pathname to a variable, and then uses the evaluated variable in a copy command. The evaluation of ldir (which is $ldir) results in the pathname /home/chris/letters. The copy command evaluates to cp myletter /home/chris/letters.

$ ldir=/home/chris/letters
$ cp myletter $ldir

You can obtain a list of all the defined variables with the set command. If you decide you do not want a certain variable, you can remove it with the unset command. The unset command undefines a variable.
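As a minimal sketch of the two commands just described, the following lines define a variable, look for it in the output of set, and then remove it with unset. The variable name poet comes from the earlier example; filtering the set output through grep and the ${poet+x} check are assumptions made here only to keep the output short and the result easy to verify.

```shell
#!/bin/bash
# Define a variable, then look for it among all defined variables.
poet=Virgil
set | grep '^poet='        # shows the entry for poet

# Remove the variable; referencing it now yields an empty string.
unset poet
echo "poet is now: '${poet}'"

# ${poet+x} expands to x only while poet is set, so an empty
# result means the variable is really gone, not just empty.
if [ -z "${poet+x}" ]; then
    status="unset"
else
    status="set"
fi
echo "$status"
```

Without the grep filter, set lists every variable defined in the shell, which is usually a long list.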

Variable Values: Strings

The values that you assign to variables may consist of any set of characters. These characters may be a character string that you explicitly type in or the result obtained from executing a Linux command. In most cases, you will need to quote your values using either single quotes, double quotes, backslashes, or back quotes. Single quotes, double quotes, and backslashes allow you to quote strings in different ways. Back quotes have the special function of executing a Linux command and using its results as arguments on the command line.

Quoting Strings: Double Quotes, Single Quotes, and Backslashes

Variable values can be made up of any characters. However, problems occur when you want to include characters that are also used by the shell as operators. Your shell has certain metacharacters that it uses in evaluating the command line. A space is used to parse arguments on the command line. The asterisk, question mark, and brackets are metacharacters used to generate lists of filenames. The period represents the current directory. The dollar sign, $, is used to evaluate variables, and the greater-than and less-than characters, > and <, are redirection operators. The ampersand, &, is used to execute background commands, and the bar, |, pipes output. If you want to use any of these characters as part of the value of a variable, you first need to quote them. Quoting a metacharacter on a command line makes it just another character. It is not evaluated by the shell.

You can use double quotes, single quotes, and backslashes to quote such metacharacters. Double and single quotes allow you to quote several metacharacters at a time. Any metacharacters within double or single quotes are quoted. A backslash quotes the single character that follows it.

If you want to assign more than one word to a variable, you need to quote the spaces separating the words. You can do so by enclosing all the words within double quotes. You can think of this as creating a character string to be assigned to the variable. Of course, any other metacharacters enclosed within the double quotes are also quoted.

In the following first example, the double quotes enclose words separated by spaces. Because the spaces are enclosed within double quotes, they are treated as characters, not as delimiters used to parse command line arguments. In the second example, double quotes also enclose a period, treating it as just a character. In the third example, an asterisk is also enclosed within the double quotes. The asterisk is considered just another character in the string and is not evaluated.

$ notice="The meeting will be tomorrow"
$ echo $notice
The meeting will be tomorrow

$ message="The project is on time."
$ echo $message
The project is on time.

$ notice="You can get a list of files with ls *.c"
$ echo "$notice"
You can get a list of files with ls *.c
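One caveat is worth sketching here. The double quotes protect the asterisk when the value is assigned, but when the variable is later evaluated without quotes, the shell still applies filename expansion to the result. The following sketch assumes a scratch directory containing two hypothetical files, main.c and prog.c, and uses the $( ) form of command substitution (equivalent to back quotes) only to capture the output for comparison.

```shell
#!/bin/bash
# Work in a scratch directory holding two hypothetical .c files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch main.c prog.c

notice="List files with ls *.c"

# Unquoted: after evaluation the shell expands *.c against the directory.
unquoted=$(echo $notice)
# Quoted: the value is passed to echo exactly as stored.
quoted=$(echo "$notice")

echo "$unquoted"
echo "$quoted"

# Clean up the scratch directory.
cd / && rm -r "$workdir"
```

Placing double quotes around the evaluated variable, as in echo "$notice", is the safer habit whenever the value may contain spaces or filename metacharacters.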

Double quotes, however, do not quote the dollar sign, the operator that evaluates variables. A $ operator next to a variable name enclosed within double quotes will still be evaluated, replacing the variable name with its value. The value of the variable will then become part of the string, not the variable name. There may be times when you want a variable within quotes to be evaluated. In the next example, the double quotes are used so that the winner's name will be included in the notice.

$ winner=dylan
$ notice="The person who won is $winner"
$ echo $notice
The person who won is dylan

On the other hand, there may be times when you do not want a variable within quotes to be evaluated. In that case you have to use the single quotes. Single quotes suppress any variable evaluation and treat the dollar sign as just another character. In the next example, single quotes prevent the evaluation of the winner variable.

$ winner=dylan
$ result='The name is in the $winner variable'
$ echo $result
The name is in the $winner variable

If, in this case, the double quotes were used instead, an unintended variable evaluation would take place. In the next example, the characters "$winner" are interpreted as a variable evaluation.

$ winner=dylan
$ result="The name is in the $winner variable"
$ echo $result
The name is in the dylan variable

You can always quote any metacharacter, including the $ operator, by preceding it with a backslash. The backslash can also quote ENTER keys (newlines), which lets you continue a command on the next line. The backslash is useful when you want to both evaluate variables within a string and include $ characters. In the next example, the backslash is placed before the $ in order to treat it as a dollar sign character: \$. At the same time the variable $winner is evaluated because the double quotes that are used do not quote the $ operator.

$ winner=dylan
$ result="$winner won \$100.00"
$ echo $result
dylan won $100.00

Quoting Commands: Single Quotes

There are, however, times when you may want to use single quotes around a Linux command. Single quotes allow you to assign the written command to a variable. If you do so, you can then use that variable name as another name for the Linux command. Entering in the variable name, preceded by the $ operator on the command line, will execute the command. In the next example, a shell variable is assigned the characters that make up a Linux command to list files, 'ls -F'. Notice the single quotes around the command. When the shell variable is evaluated on the command line, the Linux command it contains will become a command line argument, and it will be executed by the shell.

$ lsf='ls -F'
$ $lsf
mydata reports/ letters/
$

In effect you are creating another name for a command, like an alias.

Values from Linux Commands: Back Quotes

Although you can create variable values by typing in characters or character strings, you can also obtain values from other Linux commands. To assign the result of a Linux command to a variable, you first need to execute the command. If you place a Linux command within back quotes (`) on the command line, that command is first executed and its result becomes an argument on the command line. In the case of assignments, the result of a command can be assigned to a variable by placing the command within back quotes first to execute it. The back quotes can be thought of as an expression consisting of a command to be executed whose result is then assigned to the variable. The characters making up the command itself are not assigned. In the next example, the command ls *.c is executed and its result is then assigned to the variable listc. The command ls *.c generates a list of all files with a .c extension. This list of files is then assigned to the listc variable.

$ listc=`ls *.c`
$ echo $listc
lib.c main.c prog.c

Keep in mind the difference between single quotes and back quotes. Single quotes treat a Linux command as a set of characters. Back quotes force execution of the Linux command. There may be times when you accidentally enter single quotes when you mean to use back quotes. In the following first example, the assignment for the lscc variable has single quotes, not back quotes, placed around the ls *.c command. In this case, ls *.c are just characters to be assigned to the variable lscc. In the second example, back quotes are placed around the ls *.c command, forcing evaluation of the command. A list of filenames ending in .c is generated and assigned as the value of lscc.

$ lscc='ls *.c'
$ echo "$lscc"
ls *.c

$ lscc=`ls *.c`
$ echo $lscc
main.c prog.c
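Back quotes are easy to confuse with single quotes, as just noted. BASH and other POSIX shells also accept the form $(command), which behaves the same way and is harder to mistype. The sketch below compares the two forms; the scratch directory and its file names are hypothetical, and the $( ) form is an addition not covered in the text above.

```shell
#!/bin/bash
# Scratch directory with hypothetical source files.
workdir=$(mktemp -d)
touch "$workdir/main.c" "$workdir/prog.c"

# Both forms run the command and substitute its output.
old_style=`ls "$workdir"`
new_style=$(ls "$workdir")

# $( ) nests without extra escaping: count the names that ls reports.
count=$(echo $(ls "$workdir") | wc -w)

echo "$old_style"
echo "$new_style"
echo "count: $count"

# Clean up the scratch directory.
rm -r "$workdir"
```

Nesting back quotes requires escaping the inner pair with backslashes, which is one reason the $( ) form is often preferred in longer scripts.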

Shell Scripts: User-Defined Commands

You can place shell commands within a file and then have the shell read and execute the commands in the file. In this sense, the file functions as a shell program, executing shell commands as if they were statements in a program. A file that contains shell commands is called a shell script.

You enter shell commands into a script file using a standard text editor such as the Vi editor. The sh or . command used with the script’s filename will read the script file and execute the commands. In the next example, the text file called lsc contains an ls command that displays only files with the extension .c:

lsc

ls *.c

A run of the lsc script is shown here:

$ sh lsc
main.c calc.c
$ . lsc
main.c calc.c

Executing Scripts

You can dispense with the sh and . commands by setting the executable permission of a script file. When the script file is first created by your text editor, it is given only read and write permission. The chmod command with the +x option will give the script file executable permission. Once it is executable, entering the name of the script file at the shell prompt and pressing ENTER will execute the script file and the shell commands in it. In effect, the script’s filename becomes a new shell command. In this way, you can use shell scripts to design and create your own Linux commands. You need to set the permission only once.

In the next example, the lsc file’s executable permission for the owner is set to on. Then the lsc shell script is directly executed like any Linux command.

$ chmod u+x lsc
$ lsc
main.c calc.c

You may have to specify that the script you are using is in your current working directory. You do this by prefixing the script name with a period and slash combination, as in ./lsc. The period is a special character representing the name of your current working directory. The slash is a directory pathname separator. The following example shows how to execute the lsc script:

$ ./lsc
main.c calc.c
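The whole sequence of creating a script, running it with sh, making it executable, and running it by name can be sketched as follows. The script name hello, the scratch directory, and the #!/bin/bash first line (which names the shell that should interpret the file when it is executed directly) are assumptions added for illustration.

```shell
#!/bin/bash
# Build a tiny script in a scratch directory (hypothetical name: hello).
workdir=$(mktemp -d)
cat > "$workdir/hello" <<'EOF'
#!/bin/bash
echo "hello from a script"
EOF

# A newly created file is not executable, so run it through sh first.
first=$(sh "$workdir/hello")

# Add execute permission for the owner, then run it by its pathname.
chmod u+x "$workdir/hello"
second=$("$workdir/hello")

echo "$first"
echo "$second"

# Clean up the scratch directory.
rm -r "$workdir"
```

Running the script by a pathname, as here or with ./ in the current directory, avoids depending on whether the directory is listed in your PATH.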

Script Arguments

Just as any Linux command can take arguments, so also can a shell script. Arguments on the command line are referenced sequentially starting with 1. An argument is referenced using the $ operator and the number of its position. The first argument is referenced with $1, the second with $2, and so on. In the next example, the lsext script prints out files with a specified extension. The first argument is the extension. The script is then executed with the argument c (of course, the executable permission must have been set).

lsext

ls *.$1

A run of the lsext script with an argument is shown here:

$ lsext c
main.c calc.c

In the next example, the commands to print out a file with line numbers have been placed in an executable file called lpnum, which takes a filename as its argument. The cat command with the -n option first outputs the contents of the file with line numbers. Then this output is piped into the lpr command, which prints it. The command to print out the line numbers is executed in the background.

lpnum

cat -n $1 | lpr &

A run of the lpnum script with an argument is shown here:

$ lpnum mydata

You may need to reference more than one argument at a time. The number of arguments used may vary. In lpnum, you may want to print out three files at one time and five files at some other time. The $ operator with the asterisk, $*, references all the arguments on the command line. Using $* enables you to create scripts that take a varying number of arguments. In the next example, lpnum is rewritten using $* so that it can take a different number of arguments each time you use it:

lpnum

cat -n $* | lpr &

A run of the lpnum script with multiple arguments is shown here:

$ lpnum mydata preface
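When an argument contains a space, unquoted $* splits it into separate words. The shell also provides "$@", which keeps each argument intact. The following sketch compares the two; the helper function count_args and the sample arguments are hypothetical, and set -- is used only to simulate the arguments a script would receive.

```shell
#!/bin/bash
# count_args is a hypothetical helper that reports how many
# arguments it received.
count_args() {
    echo $#
}

# Simulate a script invoked with two arguments, one containing a space.
set -- "my data" preface

with_star=$(count_args $*)      # unquoted $* re-splits "my data" into two words
with_at=$(count_args "$@")      # "$@" preserves each original argument

echo "\$* passed $with_star arguments"
echo "\"\$@\" passed $with_at arguments"
```

For filenames without spaces the two forms behave the same, so the lpnum script works as written for names such as mydata and preface.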

Environment Variables

When you log in to your account, your Linux system generates your user shell. Within this shell, you can issue commands and declare variables. You can also create and execute shell scripts. However, when you execute a shell script, the system generates a subshell. You then have two shells, the one you logged in to and the one generated for the script. Within the script shell you can execute another shell script, which will then have its own shell. When a script has finished execution, its shell terminates and you return to the shell from which it was executed. In this sense, you can have many shells, each nested within the other.

Variables that you define within a shell are local to it. If you define a variable in a shell script, then, when the script is run, the variable is defined with that script's shell and is local to it. No other shell can reference it. In a sense, the variable is hidden within its shell.

To illustrate this situation more clearly, the next example will use two scripts, one of which is called from within the other. When the first script executes, it generates its own shell. From within this shell, another script is executed which, in turn, generates its own shell. In the next example, the user first executes the dispfirst script, which displays a first name. When the dispfirst script executes, it generates its own shell and then, within that shell, it defines the firstname variable. After it displays the contents of firstname, the script executes another script: displast. When displast executes, it generates its own shell. It defines the lastname variable within its shell and then displays the contents of lastname. It then tries to reference firstname and display its contents. It cannot do so because firstname is local to dispfirst's shell and cannot be referenced outside it. Within the displast shell, firstname is an undefined variable, so it evaluates to an empty string and only the last name is displayed.

dispfirst

firstname="Charles"
echo "First name is $firstname"

displast

displast

lastname="Dickens"

echo "Last name is $lastname"
echo "$firstname $lastname"

The run of the dispfirst script is shown here:

$ dispfirst
First name is Charles
Last name is Dickens
 Dickens

If you want the same value of a variable used both in a script's shell and a subshell, you can simply define the variable twice, once in each script, and assign it the same value. In the next example, there is a myfile variable defined in dispfile and in printfile. The user executes the dispfile script, which first displays the List file with line numbers. When the dispfile script executes, it generates its own shell and then, within that shell, it defines the myfile variable. After it displays the contents of the file, the script then executes another script, printfile. When printfile executes, it generates its own shell. It defines its own myfile variable within its shell and then sends a file to the printer.

What if you want to define a variable in one shell and have its value referenced in any subshell? For example, what if you want to define the myfile variable in the dispfile script and have its value, "List", referenced from within the printfile script, rather than explicitly defining another variable in printfile? Since variables are local to the shell they are defined in, there is no way you can do this with ordinary variables. However, there is a type of variable called an environment variable that allows its value to be referenced by any subshells. Environment variables constitute an environment for the shell and any subshell it generates, no matter how deeply nested.

dispfile

myfile="List"

echo "Displaying $myfile"
pr -t -n $myfile

printfile

printfile

myfile="List"

echo "Printing $myfile"
lp $myfile &

The run of the dispfile script is shown here:

$ dispfile
Displaying List
1 screen
2 modem
3 paper
Printing List

You can define environment variables in the three major types of shells: Bourne, Korn, and C. However, the strategy used to implement environment variables in the Bourne and Korn shells is very different from that of the C shell. In the Bourne and Korn shells, environment variables are exported. That is to say, a copy of an environment variable is made in each subshell. In a sense, if the myfile variable is exported, a copy is automatically defined in each subshell for you. In the C shell, on the other hand, an environment variable is defined only once and can be directly referenced by any subshell.

Shell Environment Variables

In the Bourne, BASH, and Korn shells, an environment variable can be thought of as a regular variable with added capabilities. To make an environment variable, you apply the export command to a variable you have already defined. The export command instructs the system to define a copy of that variable for each new shell generated. Each new shell will have its own copy of the environment variable. This process is called exporting variables.

In the next example, the variable myfile is defined in the dispfile script. It is then turned into an environment variable using the export command. The myfile variable will consequently be exported to any subshells, such as that generated when printfile is executed.

dispfile

myfile="List"
export myfile

echo "Displaying $myfile"
pr -t -n $myfile

printfile

printfile

echo "Printing $myfile"
lp $myfile &

The run of the dispfile script is shown here:

$ dispfile
Displaying List
1 screen
2 modem
3 paper
Printing List

When printfile is executed it will be given its own copy of myfile and can reference that copy within its own shell. You no longer need to explicitly define another myfile variable in printfile.

It is a mistake to think of exported environment variables as global variables. A new shell can never reference a variable outside of itself. Instead, a copy of the variable with its value is generated for the new shell. You can think of exported variables as exporting their values to a shell, not themselves. For those familiar with programming structures, exported variables can be thought of as a form of call-by-value.
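The call-by-value behavior can be sketched in a few lines. Here bash -c starts a new shell to stand in for a script such as printfile; the variable localonly and the value "Changed" are hypothetical names chosen for illustration.

```shell
#!/bin/bash
myfile="List"
localonly="hidden"
export myfile

# The new shell sees the exported variable but not the unexported one.
child_view=$(bash -c 'echo "myfile=$myfile localonly=$localonly"')

# The new shell changes only its own copy; the parent's value is unaffected.
bash -c 'myfile="Changed"'

echo "$child_view"
echo "parent still has: $myfile"
```

The first line of output shows an empty value for localonly, and the second shows that myfile is still List in the original shell.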

Control Structures

You can control the execution of Linux commands in a shell script with control structures. Control structures allow you to repeat commands and to select certain commands over others. A control structure consists of two major components: a test and commands. If the test is successful, then the commands are executed. In this way, you can use control structures to make decisions as to whether commands should be executed.

Two different kinds of control structures are used: loops, which repeat commands, and conditions, which execute commands when certain conditions are met. The BASH shell has three loop control structures—while, for, and for-in—and two condition structures—if and case. The control structures have as their test the execution of a Linux command. All Linux commands return an exit status after they have finished executing. If a command is successful, its exit status will be 0. If the command fails for any reason, its exit status will be a positive value referencing the type of failure that occurred. The control structures check to see whether the exit status of a Linux command is 0 or some other value. In the case of the if and while structures, if the exit status is a 0 value, the command was successful and the structure continues.

Test Operations

With the test command, you can compare integers and strings, and even perform logical operations. The command consists of the keyword test, followed by the values being compared, separated by an option that specifies what kind of comparison is taking place. The option can be thought of as the operator, but it is written, like other options, with a minus sign and letter codes. For example, -eq is the option that represents the equality comparison. Two string operations, however, actually use an operator instead of an option. When you compare two strings for equality, you use the equal sign (=). For inequality you use !=. Table 21-1 lists some of the commonly used options and operators used by test. The syntax for the test command is shown here:

test value -option value
test string = string

Integer Comparisons     Function
-gt                     Greater-than
-lt                     Less-than
-ge                     Greater-than-or-equal-to
-le                     Less-than-or-equal-to
-eq                     Equal
-ne                     Not-equal

String Comparisons
-z                      Tests for empty string
=                       Equal strings
!=                      Not-equal strings

Logical Operations
-a                      Logical AND
-o                      Logical OR
!                       Logical NOT

File Tests
-f                      File exists and is a regular file
-s                      File is not empty
-r                      File is readable
-w                      File can be written to, modified
-x                      File is executable
-d                      Filename is a directory name

Table 21-1: BASH Shell Test Operators

The next example compares two integer values to determine whether they are equal. In this case, the equality option, -eq, should be used. The exit status of the test command is examined to determine the result of the test operation. The shell special variable $? holds the exit status of the most recently executed Linux command.

$ num=5
$ test $num -eq 10
$ echo $?
1

Instead of using the keyword test for the test command, you can use enclosing brackets. The command test $greeting = "hi" can be written as

$ [ $greeting = "hi" ]

Similarly, the command test $num -eq 10 can be written as

$ [ $num -eq 10 ]

The brackets themselves must be surrounded by white space: a space, tab, or enter. Without the spaces, the code is invalid.
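A few bracketed tests and their exit statuses can be sketched as follows. The variable names ending in _status are hypothetical, and the variables being tested are placed in double quotes as a precaution, since an empty unquoted variable would disappear from the command and leave the test malformed.

```shell
#!/bin/bash
greeting="hi"
num=5

[ "$greeting" = "hi" ]; string_status=$?     # strings match: 0
[ "$num" -eq 10 ];      number_status=$?     # 5 is not 10: 1
[ -d / ];               dir_status=$?        # / is a directory: 0

# -z succeeds when the string is empty.
empty=""
[ -z "$empty" ];        empty_status=$?      # empty string: 0

echo "$string_status $number_status $dir_status $empty_status"
```

Each status is saved immediately after its test because $? is overwritten by every command that runs.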

Conditional Control Structures

The BASH shell has a set of conditional control structures that allow you to choose what Linux commands to execute. Many of these are similar to conditional control structures found in programming languages, but there are some differences. The if condition tests the success of a Linux command, not an expression. Furthermore, the end of an if-then command must be indicated with the keyword fi, and the end of a case command is indicated with the keyword esac. The condition control structures are listed in Table 21-2.

The if structure places a condition on commands. That condition is the exit status of a specific Linux command. If a command is successful, returning an exit status of 0, then the commands within the if structure are executed. If the exit status is anything other than 0, the command has failed and the commands within the if structure are not executed. The if command begins with the keyword if and is followed by a Linux command whose exit condition will be evaluated. The keyword fi ends the command.

The elsels script in the next example executes the ls command to list files with two different possible options, either by size or with all file information. If the user enters an s, files are listed by size; otherwise, all file information is listed.

elsels

echo Enter s to list file sizes,
echo otherwise all file information is listed.
echo -n "Please enter option: "
read choice
if [ "$choice" = s ]
then
ls -s
else
ls -l
fi
echo Good-bye

Condition Control Structures:
if, else, elif, case            Function

if command                      if executes an action if its test command is true.
then
    command
fi

if command                      if-else executes an action if the exit status of
then                            its test command is true; if false, the else
    command                     action is executed.
else
    command
fi

if command                      elif allows you to nest if structures, enabling
then                            selection among several alternatives; at the
    command                     first true if structure, its commands are
elif command                    executed and control leaves the entire elif
then                            structure.
    command
else
    command
fi

case string in                  case matches the string value to any of several
pattern)                        patterns; if a pattern is matched, its
    command;;                   associated commands are executed.
esac

command && command              The logical AND condition returns a true 0 value
                                if both commands return a true 0 value; if one
                                returns a nonzero value, then the AND condition
                                is false and also returns a nonzero value.

command || command              The logical OR condition returns a true 0 value
                                if one or the other command returns a true 0
                                value; if both commands return a nonzero value,
                                then the OR condition is false and also returns
                                a nonzero value.

! command                       The logical NOT condition inverts the return
                                value of the command.

Loop Control Structures:
while, until, for, for-in, select

while command                   while executes an action as long as its test
do                              command is true.
    command
done

until command                   until executes an action as long as its test
do                              command is false.
    command
done

for variable in list-values     for-in is designed for use with lists of values;
do                              the variable operand is consecutively assigned
    command                     the values in the list.
done

for variable                    for is designed for referencing script
do                              arguments; the variable operand is consecutively
    command                     assigned each argument value.
done

select string in item-list      select creates a menu based on the items in the
do                              item-list; then it executes the command; the
    command                     command is usually a case.
done

Table 21-2: BASH Shell Control Structures

A run of the elsels script follows:

$ elsels
Enter s to list file sizes,
otherwise all file information is listed.
Please enter option: s
total 2
1 monday 2 today
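When there are more than two alternatives, the case structure is often easier to read than a chain of elif tests. The following sketch is a variation on elsels, not the script itself: the function describe_choice is hypothetical and takes the choice as an argument instead of reading it, so the selection logic can be run without typing input, and it prints the command it would choose instead of executing it.

```shell
#!/bin/bash
# describe_choice is a hypothetical function standing in for the
# read step; it reports which ls command matches the choice.
describe_choice() {
    case "$1" in
        s)  echo "ls -s" ;;            # list by size
        l)  echo "ls -l" ;;            # long listing
        *)  echo "invalid option" ;;   # any other input
    esac
}

describe_choice s
describe_choice l
describe_choice x
```

The final pattern, *, matches any string, so it acts as a default branch in the same way that else does in an if structure.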

Loop Control Structures

The while loop repeats commands. A while loop begins with the keyword while and is followed by a Linux command. The keyword do follows on the next line. The end of the loop is specified by the keyword done. The Linux command used in while structures is often a test command indicated by enclosing brackets.
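A minimal sketch of such a loop follows. The variable names count and total are hypothetical, and the $(( )) arithmetic expansion used to update them is an assumption added here, since arithmetic syntax has not been introduced at this point in the chapter.

```shell
#!/bin/bash
# Count from 1 to 3; the bracketed test is run again before each pass.
count=1
total=0
while [ "$count" -le 3 ]
do
    echo "pass $count"
    total=$((total + count))
    count=$((count + 1))
done
echo "total: $total"
```

The loop ends when the test returns a nonzero exit status, which happens here once count reaches 4.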

The for-in structure is designed to reference a list of values sequentially. It takes two operands: a variable and a list of values. The values in the list are assigned one by one to the variable in the for-in structure. Like the while command, the for-in structure is a loop. Each time through the loop, the next value in the list is assigned to the variable. When the end of the list is reached, the loop stops. Like the while loop, the body of a for-in loop begins with the keyword do and ends with the keyword done. The cbackup script makes a backup of each file and places it in a directory called sourcebak. Notice the use of the * special character to generate a list of all filenames with a .c extension.

cbackup

for backfile in *.c
do
cp $backfile sourcebak/$backfile
echo $backfile
done

A run of the program follows:

$ cbackup
io.c
lib.c
main.c
$

The for structure without a specified list of values takes as its list of values the command line arguments. The arguments specified on the command line when the shell file is invoked become a list of values referenced by the for command. The variable used in the for command is set automatically to each argument value in sequence. The first time through the loop, the variable is set to the value of the first argument. The second time, it is set to the value of the second argument.
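The behavior just described can be sketched with a function, since a for structure without a list walks a function's arguments in the same way that it walks a script's. The function name show_args and the sample arguments are hypothetical.

```shell
#!/bin/bash
# show_args is a hypothetical function; for without a list of values
# steps through the arguments it was called with, one per pass.
show_args() {
    for arg
    do
        echo "argument: $arg"
    done
}

output=$(show_args mydata preface "last one")
echo "$output"
```

Each argument is assigned whole, so an argument containing a space, such as "last one", is handled in a single pass through the loop.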