Pro Bash Programming: Scripting the GNU/Linux Shell, Second Edition (2015)

CHAPTER 10. Writing Bug-Free Scripts and Debugging the Rest

The programmer who has never written a buggy program is a figment of someone’s imagination. Bugs are the bane of a programmer’s existence. They range from simple typing errors to bad coding to faulty logic. Some are easily fixed; others can take hours of hunting.

At one end of the spectrum are the syntax errors that prevent a script from completing or running at all. These may involve a missing character: a space, a bracket or brace, a quotation mark. It may be a mistyped command or variable name. It may be a missing keyword, such as then after elif.

At the other end of the spectrum are the errors in logic. It may be counting from 1 when you should have started at 0, or it may be using -gt (greater than) when it should have been -ge (greater than or equal to). It may be a faulty formula (isn’t Fahrenheit to Celsius (F – 32) * 1.8?) or using the wrong field in a data record (I thought the shell was field 5 in /etc/passwd!).

In between the extremes, common errors include trying to operate on the wrong type of data (either the program itself supplied the wrong data or an external source did) and failing to check that a command succeeds before proceeding to the next step.

This chapter looks at various techniques to get a program doing what it is supposed to do, including the various shell options for checking and following a script’s progress, strategically placing debugging instructions, and, most important, preventing bugs in the first place.

Prevention Is Better Than Cure

It is far better to avoid introducing bugs than to remove them. There’s no way to guarantee bug-free scripts, but a number of precautions can reduce the frequency of bugs considerably. Making your code easy to read helps. So does documenting it, so that you know what it’s for, what it expects, what results it produces, and so on.

Structure Your Programs

The term structured programming is applied to various programming paradigms, but they all involve modular programming—breaking the problem down into manageable parts. In developing a large application with the shell, this means either functions, separate scripts, or a combination of both.

Even a short program can benefit from some structure; it should contain discrete sections:

· Comments

· Initialization of variables

· Function definitions

· Runtime configuration (parse options, read configuration file, and so on)

· Sanity check (are all values reasonable?)

· Process information (calculate, slice and dice lines, I/O, and so on)

Using this outline, all the components of a short but complete script are presented in the following sections. There are errors in the scripts provided; these will be found and corrected using various debugging techniques.

Comments

The comments should include metadata about the script, including a description, a synopsis of how to call the command or function, author, date of creation, date of last revision, version number, options, and any other information that is needed in order to run the command successfully, as in the following examples:

#: Title: wfe - List words ending with PATTERN
#: Synopsis: wfe [-c|-h|-v] REGEX
#: Date: 2009-04-13
#: Version: 1.0
#: Author: Chris F.A. Johnson
#: Options: -c - Include compound words
#:          -h - Print usage information
#:          -v - Print version number

The #: is used to introduce these comments so that grep '^#:' wfe will extract all the metadata.
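For example, assuming the script has been saved as wfe in the current directory, the metadata block can be listed like this:

$ grep '^#:' wfe
#: Title: wfe - List words ending with PATTERN
#: Synopsis: wfe [-c|-h|-v] REGEX
#: Date: 2009-04-13
#: Version: 1.0
#: Author: Chris F.A. Johnson
#: Options: -c - Include compound words
#:          -h - Print usage information
#:          -v - Print version number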

Initialization of Variables

First, define some variables containing metadata. There will be some duplication with the previous comments, but these variables may be needed later:

## Script metadata
scriptname=${0##*/}
description="List words ending with REGEX"
usage="$scriptname [-c|-h|-v] REGEX"
date_of_creation=2009-04-13
version=1.0
author="Chris F.A. Johnson"

Then define the default values, file locations, and other information needed by this script:

## File locations
dict=$HOME
wordfile=$dict/singlewords
conpoundfile=$dict/Compounds

## Default is not to show compound words
compounds=

## Regular expression supplied on the command line
pattern=$1

Function Definitions

Three functions are part of nearly all of the original author's scripts (apart from quick-and-dirty one-offs): die, usage, and version. They may be included in the script itself or in a function library sourced by the script. They haven't been included in the scripts for this book; that would be unnecessarily repetitive. Here are examples of each:

## Function definitions
die() #@ DESCRIPTION: print error message and exit with supplied return code
{ #@ USAGE: die STATUS [MESSAGE]
  error=$1
  shift
  [ -n "$*" ] && printf "%s\n" "$*" >&2
  exit "$error"
}

usage() #@ DESCRIPTION: print usage information
{ #@ USAGE: usage
  #@ REQUIRES: variable defined: $scriptname
  printf "%s - %s\n" "$scriptname" "$description"
  printf "USAGE: %s\n" "$usage"
}

version() #@ DESCRIPTION: print version information
{ #@ USAGE: version
  #@ REQUIRES: variables defined: $scriptname, $author and $version
  printf "%s version %s\n" "$scriptname" "$version"
  printf "by %s, %d\n" "$author" "${date_of_creation%%-*"
}

Any other functions will follow right after these generic functions.
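As a quick illustration (this line is not part of wfe itself), die might be called from a sanity check like this, printing the message to standard error and exiting with status 1:

[ -f "$wordfile" ] || die 1 "Word file not found: $wordfile"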

Runtime Configuration and Options

Chapter 12 will provide an in-depth look at runtime configuration and the different methods that can be used. Much of the time, all you need to do is parse the command-line options:

## parse command-line options, -c, -h, and -v
while getopts chv var
do
  case $var in
    c) compounds=$compoundfile ;;
    h) usage; exit ;;
    v) version; exit ;;
  esac
done
shift $(( $OPTIND - 1 ))
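To see how the loop and the shift behave in isolation, here is a small self-contained sketch (the set -- line simply simulates the command line wfe -c drow):

set -- -c drow                  ## simulate the command line: wfe -c drow
while getopts chv var
do
  case $var in
    c) echo "option -c seen" ;;
    h) echo "option -h seen" ;;
    v) echo "option -v seen" ;;
  esac
done
shift $(( $OPTIND - 1 ))
echo "remaining argument: $1"   ## prints: remaining argument: drow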

Process Information

As is often the case in a short script, the actual work of the script is relatively short; setting up parameters and checking the validity of data take up the greater part of the program:

## Search $wordfile and $compounds if it is defined
{
  cat "$wordfile"
  if [ -n "$compounds" ]
  then
    cut -f1 "$compounds"
  fi
} | grep -i ".$regex$" |
  sort -fu ## Case-insensitive sort; remove duplicates

Here, cat is necessary because the second file, whose location is stored in the compounds variable, cannot be given as an argument to grep; it is more than a simple list of words. The file has three tab-separated fields: the phrase with spaces and other nonalpha characters removed and the following letter capitalized, the original phrase, and the word lengths as they would appear in a cryptic crossword puzzle:

corkScrew cork-screw (4-5)
groundCrew ground crew (6,4)
haveAScrewLoose have a screw loose (4,1,5,5)

If it were a simple word list, like singlewords, the pipeline could have been replaced by a simple command:

grep -i ".$regex$" "$wordfile" ${compounds:+"$compounds"}

The grep command searches the files given on the command line for lines that match a regular expression. The -i option tells grep to consider uppercase and lowercase letters as equivalent.
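The ${compounds:+"$compounds"} expansion is what makes that single command safe: it expands to the second filename only when compounds is non-empty, and to nothing at all (not even an empty argument) otherwise. Here is a small sketch of the behavior, using placeholder names (file2, extra_words):

file2=
set -- grep -i "pattern" file1 ${file2:+"$file2"}
echo "$# arguments: $*"          ## prints: 4 arguments: grep -i pattern file1

file2=extra_words
set -- grep -i "pattern" file1 ${file2:+"$file2"}
echo "$# arguments: $*"          ## prints: 5 arguments: grep -i pattern file1 extra_words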

Document Your Code

Chris Johnson, the first author of this book, noted:

Until fairly recently, my own documentation habits left a lot to be desired. In my scripts directory, I have more than 900 programs written over the past 15 years or thereabout. There are more than 90 function libraries. About 20 scripts are called by cron, and a dozen more are called by those scripts. There are probably about 100 scripts that I use regularly, with “regularly” being anything from several times a day to once or twice a year.

The rest are scripts under development, abandoned scripts, scripts that didn’t work out, and scripts that I no longer have any idea what they are for. I don’t know what they are for because I didn’t include any documentation, not even a one-line description. I don’t know whether they work, whether I decided I didn’t really need that script, or anything about them.

For many of them, I can tell what they do from their name. In others, the code is straightforward, and the purpose is obvious. But there are still many scripts whose purpose I don’t know. Some of them I will probably end up duplicating when I need that task again. When I do, they’ll have at least minimal documentation.

The story is the same for many developers, especially when it comes to code snippets. There is software to help you organize your snippets, but nothing beats documenting them and adding notes, TODOs, and other markers that can be searched.

Format Your Code Consistently

There are various models for pretty-printing code, and some people are quite vociferous in their defense of a particular style. I have my own preference (which you'll have noticed from the scripts in this book), but consistency is more important than whether the indentation is two, four, or six spaces per level. That there is indentation matters more than the amount of it. I would say that two spaces (which is what I use) is the minimum and that eight is the outside limit, if not too much.

Similarly, it doesn’t matter whether you have then on the same line as if or not. Either of these is fine:

if [ "$var" = "yes" ]; then
echo "Proceeding"
fi

if [ "$var" = "yes" ]
then
echo "Proceeding"
fi

The same goes for other loops and function definitions. I prefer this format:

funcname()
{
  : body here
}

Others like this format:

funcname() {
  : body here
}

As long as the formatting is consistent and makes the structure clear, it doesn’t matter which format you use.

The K.I.S.S. Principle

Simplicity aids in understanding the intent of your program, but it's not just keeping code as short as possible that counts. When someone posted the question below, my first thought was, "That will be a complicated regex." My second was that I wouldn't use a regular expression:

· I need a regular expression to express financial quantities in American notation. They have a leading dollar sign and an optional string of asterisks, a string of decimal digits, and a fractional part consisting of a decimal point (.) and two decimal digits. The string to the left of the decimal point could be a single zero. Otherwise, it must not start with a zero. If there are more than three digits to the left of the decimal point, groups of three must be separated by commas. Example: $**2,345.67.

I’d break the task into discrete steps and code each one separately. For example, the first check would be:

amount='$**2,345.67'
case $amount in
  \$[*0-9]*) ;; ## OK (dollar sign followed by asterisks or digits), do nothing
  *) exit 1 ;;
esac

By the time the tests are finished, there will be a lot more code than there would be in a regular expression, but it will be easier to understand and to change if the requirements change.
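To give an idea of what those discrete steps might look like, here is one possible sketch in the same case-pattern style; it is not the poster's final solution, and valid_amount is a name invented here for illustration:

valid_amount() #@ USAGE: valid_amount STRING; succeed if STRING is a valid amount
{
  local amount=$1 int frac first rest

  case $amount in                        ## Step 1: leading dollar sign
    \$*) amount=${amount#\$} ;;
    *)   return 1 ;;
  esac

  while case $amount in \**) true ;; *) false ;; esac
  do
    amount=${amount#\*}                  ## Step 2: strip optional asterisks
  done

  case $amount in                        ## Step 3: split at the decimal point
    *.*) int=${amount%%.*} frac=${amount#*.} ;;
    *)   return 1 ;;
  esac

  case $frac in                          ## Step 4: exactly two decimal digits
    [0-9][0-9]) ;;
    *) return 1 ;;
  esac

  case $int in                           ## Step 5: the integer part
    0) return 0 ;;                                    ## a single zero is allowed
    [1-9] | [1-9][0-9] | [1-9][0-9][0-9]) return 0 ;; ## 1-3 digits, no commas
    [1-9]*,*) ;;                                      ## comma groups; check below
    *) return 1 ;;
  esac

  first=${int%%,*} rest=${int#*,}
  case $first in                         ## first group: 1-3 digits, no leading zero
    [1-9] | [1-9][0-9] | [1-9][0-9][0-9]) ;;
    *) return 1 ;;
  esac
  while :                                ## every later group: exactly three digits
  do
    case $rest in
      [0-9][0-9][0-9])   return 0 ;;
      [0-9][0-9][0-9],*) rest=${rest#*,} ;;
      *)                 return 1 ;;
    esac
  done
}

Each step can then be exercised with a handful of sample strings as it is written, in the spirit of the "Test as You Go" section later in this chapter.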

Grouping Commands

Rather than redirect each of several lines, group them with braces and use a single redirection. I saw this in a forum recently:

echo "user odad odd" > ftp.txt
echo "prompt" >> ftp.txt
echo "cd $i" >> ftp.txt
echo "ls -ltr" >> ftp.txt
echo "bye" >> ftp.txt

I would recommend this instead:

{
  echo "user odad odd"
  echo "prompt"
  echo "cd $i"
  echo "ls -ltr"
  echo "bye"
} > ftp.txt
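The same grouping works with a pipe as well as a redirection. Assuming an ftp client that reads its commands from standard input (the host name here is only a placeholder), the temporary file could be skipped entirely:

{
  echo "user odad odd"
  echo "prompt"
  echo "cd $i"
  echo "ls -ltr"
  echo "bye"
} | ftp -n ftp.example.com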

Test as You Go

Rather than save all the debugging until the end, it should be an integral part of the process of developing a program. Each section should be tested as it is written. As an example, let’s look at a function I wrote as part of a chess program. No, it’s not a chess-playing program (though it could be when it’s completed); that would be excruciatingly slow in the shell. It’s a set of functions for preparing instructional material.

It needs to be able to convert one form of chess notation to another and to list all possible moves for any piece on the board. It needs to be able to tell whether a move is legal and to create a new board position after a move has been made. At its most basic level, it has to be able to convert a square in standard algebraic notation (SAN) to its numeric rank and file. That’s what this function does.

The SAN format for naming a square is a lowercase letter representing the file and a number representing the rank. Files are rows of squares from white’s side of the board to black’s. Ranks are rows of squares from left to right. The square in white’s left-hand corner is a1; that in black’s is h8. To calculate possible moves, these need to be converted to the rank and file: a1 is converted to rank=1 and file=1; h8 becomes rank=8 and file=8.

It’s a simple function, but it demonstrates how to test a function. The function receives the name of a square as an argument and stores the rank and file in those variables. If the square is not valid, it sets both rank and file to 0 and returns an error:

split_square() #@ DESCRIPTION: convert SAN square to numeric rank and file
{ #@ USAGE: split_square SAN-SQUARE
  local square=$1
  rank=${square#?}
  case $square in
    a[1-8]) file=1;; ## Conversion of file to number
    b[1-8]) file=2;; ## and checking that the rank is
    c[1-8]) file=3;; ## a valid number are done in a
    d[1-8]) file=4;; ## single look-up
    e[1-8]) file=5;;
    f[1-8]) file=6;; ## If the rank is not valid,
    g[1-8]) file=7;; ## it falls through to the default
    h[1-8]) file=8;;
    *) file=0
       rank=0
       return 1 ## Not a valid square
       ;;
  esac
  return 0
}

To test this function, it is passed all possible legitimate squares as well as some that are not. It prints the name of the square and the file and rank numbers:

test_split_square()
{
  local f r
  for f in {a..i}
  do
    for r in {1..9}
    do
      split_square "$f$r"
      printf "$f$r %d-%d " "$file" "$rank"
    done
    echo
  done
}

When the test is run, the output is as follows:

a1 1-1 a2 1-2 a3 1-3 a4 1-4 a5 1-5 a6 1-6 a7 1-7 a8 1-8 a9 0-0
b1 2-1 b2 2-2 b3 2-3 b4 2-4 b5 2-5 b6 2-6 b7 2-7 b8 2-8 b9 0-0
c1 3-1 c2 3-2 c3 3-3 c4 3-4 c5 3-5 c6 3-6 c7 3-7 c8 3-8 c9 0-0
d1 4-1 d2 4-2 d3 4-3 d4 4-4 d5 4-5 d6 4-6 d7 4-7 d8 4-8 d9 0-0
e1 5-1 e2 5-2 e3 5-3 e4 5-4 e5 5-5 e6 5-6 e7 5-7 e8 5-8 e9 0-0
f1 6-1 f2 6-2 f3 6-3 f4 6-4 f5 6-5 f6 6-6 f7 6-7 f8 6-8 f9 0-0
g1 7-1 g2 7-2 g3 7-3 g4 7-4 g5 7-5 g6 7-6 g7 7-7 g8 7-8 g9 0-0
h1 8-1 h2 8-2 h3 8-3 h4 8-4 h5 8-5 h6 8-6 h7 8-7 h8 8-8 h9 0-0
i1 0-0 i2 0-0 i3 0-0 i4 0-0 i5 0-0 i6 0-0 i7 0-0 i8 0-0 i9 0-0

All squares with the rank and file 0-0 are invalid.
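The test above checks only the rank and file variables; split_square also reports failure through its return status. A small companion test (a sketch, not part of the original suite) can confirm that the status and the variables always agree; it prints a line only when they disagree:

test_split_square_status()
{
  local f r
  for f in {a..i}
  do
    for r in {1..9}
    do
      if split_square "$f$r"
      then
        [ "$file" -eq 0 ] && printf "%s: success reported but file=0\n" "$f$r"
      else
        [ "$file" -ne 0 ] && printf "%s: failure reported but file=%d\n" "$f$r" "$file"
      fi
    done
  done
}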

Debugging a Script

In the wfe script, which was presented section by section earlier, there are a few bugs. Let’s run that script and see what happens. The script is in $HOME/bin, which is in your PATH, and it can therefore be called by its name alone. Before that, however, a good first step is to check the script with the -n option. This tests for any syntax errors without actually executing the code:

$ bash -n wfe
/home/jayant/bin/wfe-sh: wfe: line 70: unexpected EOF while looking for matching '"'
/home/jayant/bin/wfe-sh: wfe: line 72: syntax error: unexpected end of file

The error message says that there’s a missing quotation mark ("). It has reached the end of the file without finding it. That means it could be missing anywhere in the file. After a quick (or not-so-quick) glance through the file, it’s not apparent where it should be.

When that happens, I start removing sections from the bottom of the file until the error disappears. I remove the last section; it’s still there. I remove the option parsing, and the error hasn’t disappeared. I remove the last function definition, version(), and the error has gone. The error must be in that function; where is it?

version() #@ DESCRIPTION: print script's version information
{ #@ USAGE: version
  #@ REQUIRES: variables defined: $scriptname, $author and $version
  printf "%s version %s\n" "$scriptname" "$version"
  printf "by %s, %d\n" "$author" "${date_of_creation%%-*"
}

There are no mismatched quotation marks, so some other closing character must be missing and causing the problem. After a quick look, I see that the last variable expansion is missing a closing brace. Fixed, it becomes "${date_of_creation%%-*}". Another check with -n, and it gets a clean bill of health. Now it's time to run it:

$ wfe
bash: /home/jayant/bin/wfe: Permission denied

Oops! We forgot to make the script executable. This doesn’t usually happen with a main script; it happens more often with scripts that are called by another script. Change the permissions and try again:

$ chmod +x /home/jayant/bin/wfe
$ wfe
cat: /home/jayant/singlewords: No such file or directory

Have you downloaded the two files, singlewords and Compounds? If so, where did you put them? In the script, they are declared to be in $dict, which is defined as $HOME. If you put them somewhere else, such as in a subdirectory named words, change that line in the script. Let’s make a directory, words, and put them in there:

mkdir $HOME/words &&
cd $HOME/words &&
wget http://cfaj.freeshell.org/wordfinder/singlewords &&
wget http://cfaj.freeshell.org/wordfinder/Compounds

In the script, change the assignment of dict to reflect the actual location of these files:

dict=$HOME/words

Let’s try again:

$ wfe
a
aa
Aachen
aalii
aardvark
.... 183,758 words skipped ....
zymotic
zymotically
zymurgy
Zyrian
zythum

We forgot to tell the program what we are searching for. The script ought to have checked that an argument was supplied, but we forgot to include a sanity check section. Add that before the search is done (after the line shift $(( $OPTIND - 1 )) ):

## Check that user entered a search term
if [ -z "$pattern" ]
then
{
echo "Search term missing"
usage
} >&2
exit 1
fi

Now, try again:

$ wfe
Search term missing
wfe - List words ending with REGEX
USAGE: wfe [-c|-h|-v] REGEX

That’s better. Now let’s really look for some words:

$ wfe drow
a
aa
Aachen
aalii
aardvark
.... 183,758 words skipped ....
zymotic
zymotically
zymurgy
Zyrian
zythum

There’s still something wrong.

One of the most useful debugging tools is set -x, which prints each command with its expanded arguments as it is executed. Each line is preceded by the value of the PS4 variable. The default value of PS4 is “+ ”; we’ll change it to include the number of the line being executed. Put these two lines before the final section of the script:

export PS4='+ $LINENO: ' ## single quotes prevent $LINENO being expanded immediately
set -x

and try again:

$ wfe drow
++ 77: cat /home/jayant/singlewords
++ 82: grep -i '.$'
++ 83: sort -fu
++ 78: '[' -n '' ']' ## Ctrl-C pressed to stop entire word list being printed

On line 82, you see that the pattern entered on the command line is missing. How did that happen? It should be grep -i '.drow$'. Line 82 in the script should be as follows:

} | grep -i ".$regex$" |

What happened to the value of regex? Comment out set -x, and add the set -u option at the top of the script. This option treats unset variables as an error when they are expanded. Run the script again to check whether regex is set:

$ wfe drow
/home/jayant/bin/wfe: line 84: regex: unbound variable

Why is regex unset? Take a look at the earlier script and see which variable was used to hold the command-line argument. Oh! It was pattern, not regex. You have to be consistent, and regex is a better description of its contents, so let's use that. Change all instances of pattern to regex. You should do it in the comments at the top, as well. Now try it:

$ wfe drow
windrow

Success! Now add compound words and phrases to the mix with the -c option:

$ wfe -c drow
/home/jayant/bin/wfe: line 58: compoundfile: unbound variable

Here we go again! Surely we assigned the Compounds file in the file locations section. Take a look; yes, it’s there on line 23 or thereabout. Wait a minute, there’s a typo: conpoundfile=$dict/Compounds. Change con to com. Keep your fingers crossed:

$ wfe -c drow
$

What? Nothing? Not even windrow? It’s time to set -x and see what’s going on. Uncomment that line, and play it again:

$ wfe -c drow
++ 79: cat /home/jayant/singlewords
++ 84: grep -i '.-c$'
++ 85: sort -fu
++ 80: '[' -n /home/jayant/Compounds ']'
++ 82: cut -f1 /home/jayant/Compounds

At least that’s easy to figure out. We assigned regex before processing the options, and it snarfed the first argument, the -c option. Move the assignment down to after the getopts section, specifically, to after the shift command. (And you’ll probably want to comment out set -x.):

shift $(( $OPTIND - 1 ))
## Regular expression supplied on the command line
regex=$1

Are there any more issues?

$ wfe -c drow
skidRow
windrow

That looks good. It might seem like a lot of work for a small script, but it seems longer in the telling than in the doing, especially once you get used to doing it—or, better still, getting it right in the first place.

Summary

Bugs are inevitable, but with care, most can be prevented. When they do materialize, there are shell options to help trace the problem.

Exercises

1. What is wrong with if [ $var=x ]? What should it be? Why does it give the result it does?

2. Write a function, valid_square(), that returns successfully if its sole argument is a valid SAN chessboard square or fails if it is not. Write a function to test whether it works.