Arrays - C Programming For Beginners (2015)

C Programming For Beginners (2015)

CHAPTER 8

Arrays

In this chapter, we will explain the following:

§ What is an array and how to declare one

§ How to store values in an array

§ How to read a known number of values into an array using a for loop

§ How to process elements of an array using a for loop

§ How to read an unknown number of values into an array using a while loop

§ How to extract a required element from an array with a subscript

§ How to find the sum of numbers stored in an array

§ How to find the average of numbers stored in an array

§ How to use an array to keep several counts

§ How to work with a string as an array of characters

§ How to reverse the elements in an array

§ How to write a function to tell if a phrase is a palindrome

§ How to pass an array as an argument to a function

§ How to find the largest and smallest values in an array

8.1 Simple vs Array Variable

The variables we have been using so far (such as ch, n, sum) are normally called simple variables. At any given time, a simple variable can be used to store one item of data, for instance, one number or one character. Of course, the value stored in the variable can be changed, if we wish. However, there are many situations in which we wish to store a group of related items and to be able to refer to them by a common name. The array variable allows us to do this.

For example, suppose we wish to store a list of 60 scores made by students in a test. We can do this by inventing 60 different int variables and storing one score in one variable. But it would be quite tedious, cumbersome, unwieldy and time-consuming to write code to manipulate these 60 variables. (Think of how you would assign values to these 60 variables.) And what if we needed to deal with 200 scores?

A better way is to use an array to store the 60 scores. We can think of this array as having 60 ‘locations’—we use one location to store one element, in this case, one score. To refer to a particular score, we use a subscript. For example, if score is the name of the array, then score[5] refers to the score in position 5—here 5 is used as a subscript. It is written inside the square brackets, [ and ].

In general, an array can be used to store a list of values of the same type; for instance, we speak of an array of integers, an array of characters, an array of strings or an array of floating-point numbers. As you will see, using an array allows us to work with a list in a simple, systematic way, regardless of its size. We can process all or some items using a simple loop. We can also do things like search for an item in the list or sort the list in ascending or descending order.

8.2 Array Declaration

Before an array is used, it must be declared. For example, consider the statement:

int score[60];

This declares that score is an ‘integer array’ or an ‘array of ints’ with subscripts ranging from 0 to 59. An array declaration consists of

§ The type (int, in this example)

§ The name of the array (score, in this example)

§ A left square bracket, [

§ The size of the array (60, in this example)

§ A right square bracket, ]

In C, array subscripts start at 0 and go up to n-1, if n is the size of the array.

We can think of the declaration as creating 60 int variables which can be referred to collectively by the array variable score. To refer to a specific one of these scores, we use a subscript written in square brackets after the array name. In this example,

score[0] refers to the 1st score

score[1] refers to the 2nd score

score[2] refers to the 3rd score

.

.

score[58] refers to the 59th score

score[59] refers to the 60th score

As you can see, array subscripting is a bit awkward in C; it would be much nicer (and logical) if score[i] were to refer to the ith score. We will see how to get around this shortly.

It is an error to try to refer to an element that is outside the range of subscripts allowed. If you do, you will get an “array subscript” error. For example, you cannot refer to score[60], score[-1] and score[99] since they do not exist.

A subscript can be written using a constant (like 25), a variable (like n) or an expression (like i+1). The value of the subscript determines which element is being referred to.

In our example, each element of the array is an int and can be used in any way that an ordinary int variable can. In particular, a value can be stored in it, its value can be printed and it can be compared with another int.

We could picture score as in Figure 8.1.

score

score[0]

score[1]

score[2]

score[3]

.

.

score[56]

score[57]

score[58]

score[59]

Figure 8.1. Declaration of int score[60]

Like a simple variable, when an array is declared, the values of its elements remain undefined until we execute statements which store values in them. This is discussed in Section 8.3, next.

To give another example, suppose we need to store the item numbers (integers) and prices (floating-point numbers) of 100 items. We can use one array (item, say) to hold the item numbers and another array (price, say) to hold the prices. These can be declared with this:

int item[100];

double price[100];

The elements of item range from item[0] to item[99] and the elements of price range from price[0] to price[99]. When we store values in these arrays (see next), we will ensure that

price[0] holds the price of item[0];

price[1] holds the price of item[1];

and, in general,

price[i] holds the price of item[i].

8.3 Store Values in an Array

Consider the array score. If we wish, we could set selected elements to specific values, as follows:

score[3] = 56;

score[7] = 81;

But what if we wish to set the 60 locations to 60 scores? Would we have to write 60 statements as in the following?

score[0] = 45;

score[1] = 63;

score[2] = 39;

.

.

score[59] = 78;

This is certainly one way of doing the job, but it is very tedious, time-consuming and inflexible. A neater way is to let the subscript be a variable rather than a constant. For example, score[h] can be used to refer to the score in location h; which score is meant depends on the value of h. If the value of h is 47, then score[h] refers to score[47], the score in location 47.

Note that score[h] can be used to refer to another score simply by changing the value of h, but, at any one time, score[h] refers to one specific score, determined by the current value of h.

Suppose the 60 scores are stored in a file scores.txt. The following code will read the 60 scores and store them in the array score:

FILE * in = fopen("scores.txt", "r");

for (int h = 0; h < 60; h++)

fscanf(in, "%d", &score[h]);

Suppose the file scores.txt begins with the following data:

45 63 39 ...

The for loop is executed with the value of h ranging from 0 to 59:

§ When h is 0, the first score, 45, is read and stored in score[0];

§ When h is 1, the second score, 63, is read and stored in score[1];

§ When h is 2, the third score, 39, is read and stored in score[2];

and so on, up to

§ When h is 59, the 60th score is read and stored in score[59].

Note that this method is much more concise than writing 60 assignment statements. We are using one statement

fscanf(in, "%d", &score[h]);

to store the scores in 60 different locations. This is achieved by varying the value of the subscript, h. This method is also more flexible. If we had to deal with 200 scores, say, we only need to change 60 to 200 in the declaration of score and in the for statement (and supply the 200 scores in the data file). The previous method would require us to write 200 assignment statements.

If we wish to print the scores as they are read, we could write the for loop like this:

for (int h = 0; h < 60; h++) {

fscanf(in, "%d", &score[h]);

printf("%d\n", score[h]);

}

On the other hand, if we wish to print the scores after they are read and stored in the array, we could write another for loop:

for (h = 0; h < 60; h++)

printf("%d\n", score[h]);

We have used the same loop variable h that was used to read the scores. But it is not required that we do so. Any other loop variable would have the same effect. For instance, we could have written:

for (int x = 0; x < 60; x++)

printf("%d\n", score[x]);

What is important is the value of the subscript, not the variable that is used as the subscript.

We often need to set all elements of a numeric array to 0. This may be necessary, for instance, if we are going to use them to hold totals, or as counters. For example, to set the 60 elements of score to 0, we could write:

for (int h = 0; h < 60; h++)

score[h] = 0;

The for loop is executed 60 times, with h taking on the values 0 to 59:

§ The first time through the loop, h is 0, so score[0] is set to 0.

§ The second time through the loop, h is 1, so score[1] is set to 0.

and so on, until

§ The 60th time through the loop, h is 59, so score[59] is set to 0.

If we want to set the elements to a different value (-1, say), we could write:

for (int h = 0; h < 60; h++)

score[h] = -1;

It should be noted that even though we have declared score to be of size 60, it is not required that we use all the elements. For example, suppose we want to set just the first 20 elements of score to 0, we could do this with the following:

for (int h = 0; h < 20; h++)

score[h] = 0;

This sets elements score[0], score[1], score[2], up to score[19] to 0. Elements score[20] to score[59] remain undefined.

8.3.1 About not using element 0

As we have seen, starting from element 0 can be a bit awkward and unnatural when we have to say things like “the third element is stored in location 2”; the subscript is “out of sync” with the position of the element. It would be much more sensible and logical to say “the first element is stored in location 1” or “the fifth element is stored in location 5”.

For situations like these, it is better to ignore element 0 and pretend that the subscripts start from 1. However, you will have to declare the size of your array to be one more than you actually need. For instance, if we want to cater for 60 scores, we will have to declare score as

int score[61];

This creates elements score[0] to score[60]. We can ignore score[0] and use only score[1] to score[60]. Having to declare an extra element is a small price to pay for being able to work with our problem in a more natural and logical manner.

There are times when it is better to work with an array from position 0. But, for those times when it is not, we will declare our array size to be one more than required and ignore the element in position 0. It is better programming practice to use the language to suit your purpose rather than constrain yourself to the idiosyncrasies of the language.

Suppose we want to cater for 60 scores. A good way to do this is as follows:

#define MaxScores 60

...

int score[MaxScores + 1];

We can now work with elements score[1] to score[MaxScores].

8.4 Average and Differences from Average

Consider the problem of finding the average of a set of numbers (integers) and the amount by which each number differs from the average. In order to find the average, we need to know all the numbers. In Section 5.2, we saw how to find the average by reading and storing one number at a time. Each new number read replaced the previous one. At the end, we could calculate the average but we’ve lost all the numbers.

Now, if we also want to know how much each number differs from the average, we would need to store the original numbers so that they are available after the average is calculated. We will store them in an array. The program will be based on the following assumptions:

§ No more than 100 numbers will be supplied; this information is needed to declare the size of the array;

§ The numbers will be terminated by 0; it is assumed that 0 is not one of the numbers.

The following shows how we want the program to work:

Enter up to 100 numbers (end with 0)

2 7 5 3 0

Numbers entered: 4

Sum of numbers: 17

The average is 4.25

Numbers and differences from average

2 -2.25

7 2.75

5 0.75

3 -1.25

Program P8.1 shows how to write the program to work like this.

Program P8.1

//find average and difference from average

#include <stdio.h>

#define MaxNum 100

int main() {

int a, num[MaxNum];

int n = 0;

double sum = 0;

printf("Enter up to %d numbers (end with 0)\n", MaxNum);

scanf("%d", &a);

while (a != 0) {

sum += a;

num[n++] = a; //store in location n, then add 1 to n

scanf("%d", &a);

}

if (n == 0) printf("No numbers entered\n");

else {

printf("\nNumbers entered: %d\n", n);

printf("Sum of numbers: %1.0f\n\n", sum);

double average = sum / n;

printf("The average is %3.2f\n", average);

printf("\nNumbers and differences from average\n");

for (int h = 0; h < n; h++)

printf("%4d %6.2f\n", num[h], num[h] - average);

}

}

Points to note about Program P8.1:

§ Using #define, we set the symbolic constant MaxNum to 100; we use it to declare the array and in the prompt for numbers. This makes the program easy to modify if we change our mind and wish to cater for a different amount of numbers.

§ We enter the while loop when the number read is not 0. Inside the loop, we add it to the sum, store it in the array and count it. Each time we reach the end of the loop, the value of n is the amount of numbers stored in the array so far.

§ On exit from the while loop, we test n. If it is still 0, then no numbers were supplied and there’s nothing else to do. The program does not make the mistake of trying to divide by n if it is 0. If n is positive, we confidently divide the sum by it to find the average.

§ The for loop ‘steps through’ the array, printing the numbers and their differences from the average. Here, n is the number of elements of the array that were actually used, not necessarily the entire array. The elements used are num[0] to num[n-1].

§ The program works out the sum of the numbers as they are read. If we need to find the sum of the first n elements after they have been stored in the array, we can do this with the following:

§ sum = 0;

§ for(int h = 0; h < n; h++) sum += num[h];

Program P8.1 does the basics. But what if the user entered more than 100 numbers? Recall that, as declared, the elements of num range from num[0] to num[99].

Now suppose that n is 100, meaning that 100 numbers have already been stored in the array. If another one is entered, and it is not 0, the program will enter the while loop and attempt to execute the statement

num[n++] = a;

Since n is 100, this is now the same as

num[100] = a;

But there is no element num[100]—you will get an “array subscript” error. When you start working with arrays, you must be very careful that your program logic does not take you outside the range of subscripts. If it does, your program will crash.

To cater for this possibility, we could write the while condition as

while (a != 0 && n < MaxNum) { ...

If n is equal to MaxNum (100), it means we have already stored 100 values in the array and there is no room for any more. In this case, the loop condition will be false, the loop will not be entered and the program will not try to store another value in the array.

This is another example of defensive programming, of trying to make our programs immune to outside forces. Now, there is no way for a user action to cause our program to crash by exceeding the bounds of the array.

8.5 Letter Frequency Count

Let us write a program which counts the frequency of each letter in the input. The program will treat an uppercase letter and its lowercase equivalent as the same letter; for example, E and e increment the same counter.

In Program P7.10, we wrote a function, position, which, given a character, returns 0 if the character is not a letter; if it is a letter, it returns its position in the alphabet. We will use position to solve this problem. However, we will rewrite it using the predefined character functions isupper and islower.

To solve this problem, we need to keep 26 counters, one for each letter of the alphabet. We need a counter for a’s and A’s, one for b’s and B’s, one for c’s and C’s, and so on. We could declare 26 variables called a, b, c, ..., up to z; aholds the count for a’s and A’s, b holds the count for b’s and B’s, and so on. And, in our program, we could write statements of the following form (assuming ch contains the next character):

if (ch == 'a' || ch == 'A') a++;

else if (ch == 'b' || ch == 'B') b++;

else if (ch == 'c' || ch == 'C') c++;

else if ...

This gets tiresome very quickly. And we will have similar problems when we have to print the results. Having to work with 26 variables for such a small problem is neither suitable nor convenient. As we will see, an array lets us solve this problem much more easily.

We will need an int array with 26 elements to hold the count for each letter of the alphabet. Since it is more natural to use element 1 (rather than element 0) to hold the count for a’s and A’s, element 2 (rather than element 1) to hold the count for b’s and B’s, and so on, we will declare the array letterCount as

int letterCount[27];

We will ignore letterCount[0] and use the following:

§ letterCount[1] to hold the count for a’s and A’s

§ letterCount[2] to hold the count for b’s and B’s

§ letterCount[3] to hold the count for c’s and C’s

§ etc.

§ letterCount[26] to hold the count for z’s and Z’s

The complete program is shown as Program P8.2. It reads data from the file passage.txt and sends output to the file output.txt.

Program P8.2

#include <stdio.h>

#include <ctype.h>

int main() {

char ch;

int n, letterCount[27], position(char);

FILE * in = fopen("passage.txt", "r");

FILE * out = fopen("output.txt", "w");

for (n = 1; n <= 26; n++) letterCount[n] = 0; //set counts to 0

while ((ch = getc(in)) != EOF) {

n = position(ch);

if (n > 0) ++letterCount[n];

}

//print the results

fprintf(out, "Letter Frequency\n\n");

for (n = 1; n <= 26; n++)

fprintf(out, "%4c %8d\n", 'a' + n - 1, letterCount[n]);

fclose(in);

fclose(out);

} //end main

int position(char ch) {

if (isupper(ch)) return ch - 'A' + 1;

if (islower(ch)) return ch - 'a' + 1;

return 0;

} //end position

Suppose passage.txt contains the following:

The quick brown fox jumps over the lazy dog.

If the quick brown fox jumped over the lazy dog then

Why did the quick brown fox jump over the lazy dog?

Program P8.2 sends the following output to the file output.txt:

Letter

Frequency

a

3

b

3

c

3

d

6

e

11

f

4

g

3

h

8

i

5

j

3

k

3

l

3

m

3

n

4

o

12

p

3

q

3

r

6

s

1

t

7

u

6

v

3

w

4

x

3

y

4

z

3

When a character ch is read, we call the function position, like this:

n = position(ch);

If n is greater than 0, we know that ch contains a letter and n is the position in the alphabet of that letter. For example, if ch contains Y, then n is 25, since Y is the 25th letter of the alphabet. If we add 1 to letterCount[n], we are adding 1 to the count for the letter that ch contains. Here, if we add 1 to letterCount[25], we are adding 1 to the count for Y. The following statement does the job:

if (n > 0) ++letterCount[n];

Take a look at the fprintf statement that prints one line of the output:

fprintf(out, "%4c %8d\n", 'a' + n - 1, letterCount[n]);

This prints a letter (in lowercase) followed by its count. Let us see how. The code for 'a' is 97. When n is 1,

'a' + n - 1

is evaluated as 97+1-1, which is 97; when 97 is printed with %c, it is interpreted as a character, so the letter a is printed. When n is 2,

'a' + n - 1

is evaluated as 97+2-1, which is 98; when 98 is printed with %c, it is interpreted as a character, so b is printed. When n is 3,

'a' + n - 1

is evaluated as 97+3-1, which is 99; when 99 is printed with %c, it is interpreted as a character, so c is printed.. And so on. As n takes on the values from 1 to 26,

'a' + n - 1

will take on the codes for the letters from 'a' to 'z'.

As a matter of interest, we could have used the following special form of the for statement described earlier to achieve the same result. Here it is:

for (ch = 'a', n = 1; n <= 26; ch++, n++)

fprintf(out, "%4c %8d\n", ch, letterCount[n]);

The loop is still executed with n going from 1 to 26. But, in sync with n, it is also executed with ch going from 'a' to 'z'. Note the use of ch++ to move on to the next character.

8.6 Making Better Use of fopen

Consider the statement:

FILE * in = fopen("passage.txt", "r");

This says to “open the file passage.txt for reading”. It assumes that the file has been created and the appropriate data stored in it. But what if the user forgot to create the file or has put it in the wrong place (the wrong folder, for instance)? We can use fopen to check for this. If fopen cannot find the file, it returns the predefined value NULL (defined in stdio.h). We can test for this as follows:

FILE * in = fopen("passage.txt", "r");

if (in == NULL) {

printf("File cannot be found\n");

exit(1);

}

If in is NULL, the program prints a message and stops. If in is not NULL, the program proceeds as before.

The predefined function exit is used to terminate execution of a program and return control to the operating system. It is conventional to use exit(0) to indicate normal temination; other arguments are used to indicate some sort of error.

To use exit, we must write the directive

#include <stdlib.h>

at the head of our program, since exit is defined in the “standard library”, stdlib.h. Among other things, this library contains functions for working with random numbers, functions for searching and functions for sorting.

As usual, we can assign a value to in and test it for NULL, using the following:

FILE * in;

if ((in = fopen("passage.txt", "r")) == NULL) {

printf("File cannot be found\n");

exit(1);

}

Note that we cannot use FILE * in in the if condition, since a declaration is not permitted there.

Similarly, when we write

FILE * out = fopen("output.txt", "w");

we are assuming that the file output.txt exists or can be created. If it does not exist and cannot be created (the disk may be write-protected or full, for instance), fopen will return NULL. We can test for this as follows:

FILE * out;

if ((out = fopen("output.txt", "w")) == NULL) {

printf("File cannot be found or created\n");

exit(1);

}

So far, we have written the name of our file in the fopen statement. To use a different file, we would have to change the name in the statement, and we would have to re-compile the program. Our program would be more flexible if we let the user tell us the name of the file when the program is run.

We can declare dataFile (say) to hold the name of the file with

char dataFile[40];

You can change 40 to any size you wish. If in has been declared as FILE *, we can prompt the user for the file name and test if everything is okay with this:

printf("Enter name of file: ");

scanf("%s", dataFile);

if ((in = fopen(dataFile, "r")) == NULL) {

printf("File cannot be found\n");

exit(1);

}

Since we are using %s to read the name of the file, the name may not contain a space. If your file name may contain a space, you can use gets.

8.7 Array as Argument to a Function

In Chapter 7, we saw how arguments are passed to functions. In C, arguments are passed “by value”. When an argument is passed “by value”, a temporary location is created with the value of the argument, and this temporary location is passed to the function. The function never has access to the original argument.

We also saw that when, for instance, we use gets(item) to read a string into the character array item, the function is able to put the string into the argument item. This implies that the function has access to the actual argument—no copy is involved.

In C, an array name denotes the address of its first element. When we use an array name as an argument to a function, the address of the first element is passed to the function which, therefore, has access to the array.

We now take a closer look at some issues involved in writing functions with array arguments.

We will write a function, sumList, which returns the sum of the integers in an array passed to the function. For example, if the array contains the following:

3

8

1

5

7

0

1

2

3

4

the function should return 24.

We could write the function header like this:

int sumList(int num[])

The array argument is written just like an array declaration but with no size specified. However, the square brackets must be present to distinguish it from a simple argument. For instance, if we had written int num, this would mean that num is an ordinary int variable.

You can specify a size, if you wish, using a constant, a symbolic constant or any integer expression which can be evaluated at the time the program is compiled. (C99 and later versions of C allow variable-length arrays in which an array subscript can be specified at run-time. We will see an example in Section 9.4.1.) However, your program will be more flexible if you do not.

Now, suppose score is declared in main as

int score[10];

and we make the call

sumList(score);

We can simply think that, in the function, score is known by the name num; any reference to num is a reference to the original argument score.

The more precise explanation is this: since the name score denotes the address of score[0], this address is passed to the function where it becomes the address of the first element of num, num[0]. In fact, any address can be passed to the function where it will be taken to be the address of num[0].

The function is free to assume any size it wishes for num. Obviously, this could land us in trouble if we attempt to process array elements which do not exist. For this reason, it is good programming practice to ‘tell’ the function how many elements to process. We do this using another argument, as in:

int sumList(int num[], int n)

Now the calling function can tell sumList how many elements to process by supplying a value for n. Using the declaration of score, above, the call

sumList(score, 10);

tells the function to process the first 10 elements of score (the whole array). But, and herein lies the advantage of this approach, we could also make a call such as

sumList(score, 5);

to get the function to process the first 5 elements of score.

Using this function header, we write sumList as follows:

int sumList(int num[], int n) {

int sum = 0;

for (int h = 0; h < n; h++) sum += num[h];

return sum;

}

The function ‘steps through’ the array, from num[0] to num[n-1], using a for loop. Each time through the loop, it adds one element to sum. On exit from the loop, the value of sum is returned as the value of the function.

The construct

for (h = 0; h < n; h++)

is typical for processing the first n elements of an array.

To use the function, consider the following code in main:

int sumList(int[], int), score[10];

for (int h = 0; h < 5; h++) scanf("%d", &score[h]);

printf("Sum of scores is %d\n", sumList(score, 5));

As usual, any function that wants to use sumList must declare it using a function prototype. Note the use of int[] to indicate that the first argument is an integer array. If we wish, we could use an identifier in declaring the prototype, as in:

int sumList(int list[], int);

The actual identifier used is not important. We could replace list by any valid identifier.

The for loop reads 5 values into the array. Note that since an array element is just like an ordinary variable, we must write &score[h] in scanf to read a value into score[h].

Suppose the values read into score are as follows:

3

8

1

5

7

0

1

2

3

4

In printf, the call

sumList(score, 5)

will get the function to return the sum of the first 5 elements of score, that is, 24. You should gather by now that, to find the sum of the first 3 elements, say, we can write

sumList(score, 3)

8.8 String - Array of Characters

In Section 2.6 we showed you how to store a string in a “character array”. Now that we know a bit about arrays, we can explain how strings are actually stored.

In C, a string is stored in an array of characters. Each character in the string is stored in one position in the array, starting at position 0. The null character, \0, is put after the last character. This is done so that programs can tell when the end of a string has been reached. For example, the string

"Enter rate:"

is stored as follows (◊ denotes a space):

E

n

t

e

r

r

a

t

e

:

\0

0

1

2

3

4

5

6

7

8

9

10

11

(Of course, inside the computer, each character is represented by its numeric code, in binary.)

The null string, a string with no characters, is written as "" (two consecutive double quotes) and stored like this:

\0

0

The string constant "a" is stored as follows:

a

\0

0

1

This should not be confused with the character constant 'a', which has a numeric value (its integer code value) associated with it and can be used in arithmetic expressions. There is no numeric value associated with the string "a".

We can compare two characters using the relational operators ==, !=, <, <=, > and >=, but we cannot compare two strings, even single-character strings like "a" and "h", this way. To compare two strings, we can use the standard string function strcmp.

Suppose we intend to store a name in the variable name declared as

char name[25];

If we read a string into name using

gets(name);

or

scanf("%s", name);

C will put \0 after the last character stored. (This is called properly terminating the string with \0.) We must ensure that there is enough room in the array to store \0. So if we declare an array of size 25, we can store a string of at most 24 characters in it since we must reserve one location for \0.

For example, suppose Alice Wonder is typed in response to gets(name). The array name will look like this (only the used positions are shown):

A

l

i

c

e

W

o

n

d

e

r

\0

0

1

2

3

4

5

6

7

8

9

10

11

12

Since name is an array, we can work with individual characters, if we so desire. For instance, name[0] refers to the first character, name[1] refers to the second, and so on. In general, we can use name[i] to refer to the character in position i. And, as we have seen, we can use name, by itself, to refer to the string stored in the array.

The length of a string is defined as the number of characters in it, not counting \0. The predefined string function strlen takes an array of characters as its argument and returns the length of the string stored in it. In this example, strlen(name) would return 12, the number of characters in "Alice Wonder". As a matter of interest, strlen starts counting characters from the beginning of the array until it finds \0; \0 is not counted.

In fact, all the standard string functions (like strlen, strcpy, strcat and strcmp) assume that the strings we give them are properly terminated with \0. If they are not, unpredictable results will occur. Imagine what will happen, for instance, if we give strlen an array of characters but there was no \0 to indicate the end of the string. It will go on forever looking for \0.

When we write statements like the following:

char name[25] = "Alice Wonder";

or

strcpy(name, "Alice Wonder");

C will store \0 after the last character so we do not have to worry about it.

However, if we store characters in an array ourselves, we must be careful and add \0 at the end. This is very important if we intend to use any of the standard string functions with the string or if we intend to print it with %s. For example, consider this code:

char word[10];

int n = 0;

char ch = getchar();

while (!isalpha(ch)) ch = getchar(); //read and ignore non-letters

while (isalpha(ch)) {

word[n++] = ch;

ch = getchar();

}

word[n] = '\0';

The code reads characters from the input and stores the first word found in the array word. Here, a word is defined as any consecutive string of alphabetic characters. The first while loop reads over any non-alphabetic characters. It exits when it finds the first alphabetic character. The second while loop is executed as long as the character read is alphabetic. It uses n to step through the positions in the array, starting at position 0. On exit from this loop, \0 is stored in position n, since, at this time, n ndicates the position after which the last letter was stored.

To illustrate, suppose the data was:

123$#%&First Caribbean7890

The first while loop will read characters until it reaches F, since F is the first alphabetic character in the data. The second loop will store

F in word[0]

i in word[1]

r in word[2]

s in word[3]

t in word[4]

Since n is incremented after each character is stored, the value of n at this stage is 5. When the space after t is read, the while loop exits and \0 is stored in word[5], properly terminating the string. The array word will look like this:

F

i

r

s

t

\0

0

1

2

3

4

5

6

We can now use word with any of the standard string functions and can print it using %s, as in:

printf("%s", word);

%s will stop printing characters when it reaches \0.

The above code is not perfect—we used it mainly for illustrative purposes. Since word is of size 10, we can store a maximum of 9 letters (plus \0) in it. If the next word is longer than 9 letters (for example, serendipity), the code will attempt to access word[10], which does not exist, giving an “array subscript” error.

As an exercise, consider how you would handle words which are longer than what you have catered for. (Hint: check that n is valid before storing anything in word[n].)

To illustrate how we can work with individual characters in a string, we write a function, numSpaces, to count and return the number of spaces in a string str:

int numSpaces(char str[]) {

int h = 0, spaces = 0;

while (str[h] != '\0') {

if (str[h] == ' ') spaces++;

h++;

}

return spaces;

} //end numSpaces

Consider the code:

char phrase[] = "How we live and how we die";

printf("Number of spaces is %d\n", numSpaces(phrase));

The first statement creates an array of just the right size to hold the characters of the string plus \0. Since the phrase contains 26 characters (letters and spaces), the array phrase will be of size 27, with phrase[0] containing H, phrase[25] containing e and phrase[26] containing \0.

In printf, the call numSpaces(phrase) will transfer control to the function, where phrase will be known as str. In the function, the while loop will step through the array until it reaches \0. For each character, it will check if it is a space. If it is, 1 is added to spaces. On exit from the loop, the value of spaces is returned as the value of the function. For the sample phrase, the value returned will be 6.

As a matter of interest, the body of the while loop could be written as:

if (str[h++] == ' ') spaces++;

Here, h is incremented after we test if str[h] contains a space.

Exercises

Write a function to return the number of digits in a string str.

Write a function to return how many vowels there are in a string str. Hint: it would be useful to write a function isVowel which, given a character ch, returns 1 if ch is a vowel and 0 if it is not.

8.8.1 Reverse the Characters in a String

As another example, we write code to reverse the characters in a stringstr. For example, if str contains lived, we must change it to devil. To illustrate how the code will work, we picture str as follows:

l

i

v

e

d

\0

0

1

2

3

4

5

6

We will first exchange str[0], l, and str[4], d, giving this:

d

i

v

e

l

\0

0

1

2

3

4

5

6

Next, we will exchange str[1], i, and str[3], e, giving this:

d

e

v

i

l

\0

0

1

2

3

4

5

6

str[2] is already in place (the middle letter does not move), so there is nothing more to do and the method ends with str reversed.

It appears that we will need two variables: one will take on subscript values starting from 0 and increasing, while the other will take on subscript values starting from length(str)-1 and decreasing. We will call them lo and hi. Initially, we will set lo to 0 and hi to length(str)-1.

The basic idea of the algorithm is as follows:

1. set lo to 0

2. set hi to length(str)-1

3. exchange the characters in positions lo and hi

4. add 1 to lo

5. subtract 1 from hi

6. repeat from step 3

When do we stop? Well, we can stop when there are no more characters to exchange. This will happen when lo becomes greater than or equal to hi. Or, put another way, we must keep exchanging characters as long as lo is less than hi. We can now write the algorithm as follows:

set lo to 0

set hi to length(str) - 1

while lo < hi do

exchange the characters in positions lo and hi

add 1 to lo

subtract 1 from hi

endwhile

In this form, it is easily converted to C as follows (assume c is char):

lo = 0;

hi = strlen(str) - 1;

while (lo < hi) {

c = str[lo];

str[lo] = str[hi];

str[hi] = c;

lo++; hi--;

}

However, we can use the expressive power of the for statement to write this more concisely and, perhaps, more readable, as follows:

for (lo = 0, hi = strlen(str) - 1; lo < hi; lo++, hi--) {

c = str[lo];

str[lo] = str[hi];

str[hi] = c;

}

Swapping two characters in a string is something we may want to do from time to time. It would be convenient to write a function (swap, say) to do this task. When we call swap, we will give it the string and the subscripts of the characters we want to exchange. For example, if word is a char array, the call

swap(word, i, j);

will exchange characters word[i] and word[j]. Since word is an array, the original array (not a copy) is passed to swap. When the function swaps two characters, it is swapping them in the actual argument, word.

The function can be written as follows:

void swap(char str[], int i, int j) {

char c = str[i];

str[i] = str[j];

str[j] = c;

} //end swap

In the function, the actual argument (word, say) is known by the name str.

Using swap, we can reverse the characters with another function, reverse, written as follows:

void reverse(char str[]) {

void swap(char [], int, int);

int lo, hi;

for (lo = 0, hi = strlen(str) - 1; lo < hi; lo++, hi--)

swap(str, lo, hi);

} //end reverse

Since reverse uses swap, we must declare the prototype for swap in reverse. Note, again, that the prototype is similar to the function header, except that we omit the variable names. However, if you wish, you may include the names—any names will do.

Using these functions, we write Program P8.3 which reads a string, reverses it and prints it.

Program P8.3

#include <stdio.h>

#include <string.h>

int main() {

char sample[100];

void reverse(char s[]);

printf("Type some data and I will reverse it\n");

gets(sample);

reverse(sample);

printf("%s\n", sample);

} //end main

void reverse(char str[]) {

void swap(char [], int, int);

int lo, hi;

for (lo = 0, hi = strlen(str) - 1; lo < hi; lo++, hi--)

swap(str, lo, hi);

} //end reverse

void swap(char str[], int i, int j) {

char c = str[i];

str[i] = str[j];

str[j] = c;

} //end swap

The following is a sample run of P8.3:

Type some data and I will reverse it

Once upon a time

emit a nopu ecnO

Reversing a string may not seem too important in its own right but there are times when we need to reverse the elements of an array. For example, we may have a list of student marks stored in an array and sorted in ascending order, like this:

32

45

59

67

81

0

1

2

3

4

If we want the marks in descending order, all we have to do is reverse the array, like this:

81

67

59

45

32

0

1

2

3

4

8.9 Palindrome

Consider the problem of determining if a given string is a palindrome (the same when spelt forwards or backwards). Examples of palindromes (ignoring case, punctuation and spaces) are:

civic

Race car

Madam, I'm Adam.

A man, a plan, a canal, Panama.

If all the letters were of the same case (upper or lower) and the string (word, say) contained no spaces or punctuation marks, we could solve the problem as follows:

assign word to another string, temp

reverse the letters in temp

if temp = word then word is a palindrome

else word is not a palindrome

In other words, if the reverse of a word is the same as the word, it is a palindrome. Sounds logical and correct. However, it is not efficient. Let us see why.

Suppose the word was thermostat. This method would reverse thermostat to get tatsomreht. Comparing the two tells us that thermostat is not a palindrome. But we can get the answer more quickly as follows:

compare the first and last letters, t and t

they are the same, so

compare the second and second to last letters, h and a

these are different so the word is not a palindrome

We will write a function called palindrome which, given a string word, returns 1 if word is a palindrome and 0 if it is not. For the moment, we will assume that word is all uppercase or all lowercase and does not contain spaces or punctuation. The function will be based on the following idea:

compare the first and last letters

if they are different, the string is not a palindrome

if they are the same, compare the second and second to last letters

if they are different, the string is not a palindrome

if they are the same, compare the third and third to last letters

and so on; we continue until we find a non-matching pair (and it’s not a palindrome) or there are no more pairs to compare (and it is a palindrome). We can express this logic in pseudocode as follows:

set lo to 0

set hi to length(word) - 1

while lo < hi do //while there are more pairs to compare

if word[lo] != word[hi] then return 0 // not a palindrome

//the letters match, move on to the next pair

lo = lo + 1

hi = hi - 1

endwhile

return 1 // all pairs match, it is a palindrome

The while loop compares pairs of letters; if it finds a non-matching pair, it immediately returns 0. If all pairs match, it will exit in the normal way when lo is no longer less than hi. In this case, it returns 1.

The function palindrome is shown in Program P8.4 which tests it by reading several words and printing whether or not each is a palindrome.

Program P8.4

#include <stdio.h>

#include <string.h>

int main() {

char aWord[100];

int palindrome(char str[]);

printf("Type a word. (To stop, press 'Enter' only): ");

gets(aWord);

while (strcmp(aWord, "") != 0) {

if (palindrome(aWord)) printf("is a palindrome\n");

else printf("is not a palindrome\n");

printf("Type a word. (To stop, press 'Enter' only): ");

gets(aWord);

}

} //end main

int palindrome(char word[]) {

int lo = 0;

int hi = strlen(word) - 1;

while (lo < hi)

if (word[lo++] != word[hi--]) return 0;

return 1;

} //end palindrome

In the function, we use the single statement

if (word[lo++] != word[hi--]) return 0;

to express all the logic of the body of the while loop in the above algorithm. Since we use ++ and -- as suffixes, lo and hi are changed after word[lo] is compared with word[hi].

We could, of course, have expressed it as:

if (word[lo] != word[hi]) return 0;

lo++;

hi--;

The program prompts the user to type a word and tells her if it is a palindrome. It then prompts for another word. To stop, the user must press the "Enter" or "Return" key only. When she does this, the empty string is stored in aWord. The while condition checks for this by comparing aWord with "" (two consecutive double quotes denote the empty string). The following is a sample run of Program P8.4:

Type a word. (To stop, press "Enter" only): racecar

is a palindrome

Type a word. (To stop, press "Enter" only): race car

is not a palindrome

Type a word. (To stop, press "Enter" only): Racecar

is not a palindrome

Type a word. (To stop, press "Enter" only): DEIFIED

is a palindrome

Type a word. (To stop, press "Enter" only):

Note that race car is not a palindrome because 'e' is not the same as ' ' and Racecar is not a palindrome because 'R' is not the same as 'r'. We will fix this shortly.

8.9.1 A Better Palindrome Function

The function we wrote works for one-word palindromes with all uppercase or all lowercase letters. We now tackle the more difficult problem of checking words or phrases which may contain uppercase letters, lowercase letters, spaces and punctuation marks. To illustrate our approach, consider the phrase:

Madam, I'm Adam

We will convert all the letters to one case (lower, say) and remove all spaces and non-letters, giving

madamimadam

We can now use the function we wrote in Program P8.4 to test if this is a palindrome.

Let us write a function lettersOnlyLower which, given a string phrase, converts all letters to lowercase and removes all spaces and non-letters. The function stores the converted string in the second argument. Here it is:

void lettersOnlyLower(char phrase[], char word[]) {

int i = 0, n = 0;

char c;

while ((c = phrase[i++]) != '\0')

if (isalpha(c)) word[n++] = tolower(c);

word[n] = '\0';

}

Comments on the function lettersOnlyLower

§ i is used to index the given phrase, stored in phrase.

§ n is used to index the converted phrase, stored in word.

§ The while loop looks at each character of phrase, in turn. If it is a letter, it is converted to lowercase using the predefined function tolower and stored in the next position in word; to use tolower, your program must be preceded by the directive

§ #include <ctype.h>

§ On exit from the while, word is properly terminated with \0.

Putting everything together, we get Program P8.5 which tests our new function, letterOnlyLower.

Program P8.5

#include <stdio.h>

#include <string.h>

#include <ctype.h>

int main() {

char aPhrase[100], aWord[100];

void lettersOnlyLower(char p[], char w[]);

int palindrome(char str[]);

printf("Type a phrase. (To stop, press 'Enter' only): ");

gets(aPhrase);

while (strcmp(aPhrase, "") != 0) {

lettersOnlyLower(aPhrase, aWord);

printf("Converted to: %s\n", aWord);

if (palindrome(aWord)) printf("is a palindrome\n");

else printf("is not a palindrome\n");

printf("Type a word. (To stop, press 'Enter' only): ");

gets(aPhrase);

} //end while

} //end main

void lettersOnlyLower(char phrase[], char word[]) {

int j = 0, n = 0;

char c;

while ((c = phrase[j++]) != '\0')

if (isalpha(c)) word[n++] = tolower(c);

word[n] = '\0';

} //end lettersOnlyLower

int palindrome(char word[]) {

int lo = 0;

int hi = strlen(word) - 1;

while (lo < hi)

if (word[lo++] != word[hi--]) return 0;

return 1;

} //end palindrome

The program prompts the user for a phrase and tells her whether or not it is a palindrome. We also print the converted phrase to show you how the function works.

A sample run is shown here:

Type a phrase. (To stop, press "Enter" only): Madam I'm Adam

Converted to: madamimadam

is a palindrome

Type a phrase. (To stop, press "Enter" only): Flo, gin is a sin. I golf.

Converted to: floginisasinigolf

is a palindrome

Type a phrase. (To stop, press "Enter" only): Never odd or even.

Converted to: neveroddoreven

is a palindrome

Type a phrase. (To stop, press "Enter" only): Thermostat

Converted to: thermostat

is not a palindrome

Type a phrase. (To stop, press "Enter" only): Pull up if I pull up.

Converted to: pullupifipullup

is a palindrome

Type a phrase. (To stop, press "Enter" only):

8.10 Array of Strings – Name of Day

In Program P7.4, we wrote a function printDay which printed the name of a day, given the number of the day. We will now write a function nameOfDay which will be given two arguments: the first is the number of a day and the second is a character array. The function will store, in the array, the name of the day corresponding to the number of the day. For example, the call

nameOfDay(6, dayName);

will store Friday in dayName, assuming dayName is a character array.

We show how to write nameOfDay using an array to store the names of the days. Suppose we have an array day as shown in Figure 8.2 (day[0] is not used and is not shown).

day

Sunday

day[1]

Monday

day[2]

Tuesday

day[3]

Wednesday

day[4]

Thursday

day[5]

Friday

day[6]

Saturday

day[7]

Figure 8.2. The array day

If d contains a value from 1 to 7, then day[d] contains the name of the day corresponding to d. For instance, if d is 3, day[d] contains Tuesday. But how can we store the names of the days in an array? What kind of array would we need?

We will need an array where each element can hold a string—an array of strings. But a string itself is stored in an array of characters. So we need an array of “array of characters”—we need a two-dimensional array. Consider the declaration

char day[8][10];

We can think of day as having 8 rows and 10 columns. If we store the name of a day in each row, then we can store 8 names. Each name is stored in an array of 10 characters. The rows are numbered from 0 to 7 and the columns are numbered from 0 to 9. As hinted in the above diagram, we will not use row 0. We will store the names in rows 1 to 7. If we store the names of the days in this array, it will look like this (we put the null string "" in day[0]):

0

1

2

3

4

5

6

7

8

9

0

\0

1

S

u

n

d

a

y

\0

2

M

o

n

d

a

y

\0

3

T

u

e

s

d

a

y

\0

4

W

e

d

n

e

s

d

a

y

\0

5

T

h

u

r

s

d

a

y

\0

6

F

r

i

d

a

y

\0

7

S

a

t

u

r

d

a

y

\0

Figure 8.3. The 2-dimensional array day

C lets us to refer to the ith row with day[i]. If we need to, we can use day[i][k] to refer to the character in row i and column k. For example, day[3][2] is e and day[7][4] is r.

We can declare the array day and initialize it with the names of the days using this:

char day[8][10] = {"", "Sunday", "Monday", "Tuesday",

"Wednesday", "Thursday", "Friday", "Saturday"};

This declaration will create the array shown in Figure 8.3. The strings to be placed in the array are enclosed by { and } and separated by commas with no comma after the last one. The first string, the null string, is placed in day[0], the second in day[1], the third in day[2], and so on.

The complete function, nameOfDay, is shown in Program P8.6 in which main is used just to test the function.

Program P8.6

#include <stdio.h>

#include <string.h>

int main() {

void nameOfDay(int, char[]);

int n;

char dayName[12];

printf("Enter a day from 1 to 7: ");

scanf("%d", &n);

nameOfDay(n, dayName);

printf("%s\n", dayName);

} //end main

void nameOfDay(int n, char name[]) {

char day[8][10] = {"", "Sunday", "Monday", "Tuesday", "Wednesday",

"Thursday", "Friday", "Saturday"};

if (n < 1 || n > 7) strcpy(name, "Invalid day");

else strcpy(name, day[n]);

} //end nameOfDay

In the function, the following statement checks the value of n.

if (n < 1 || n > 7) strcpy(name, "Invalid day");

else strcpy(name, day[n]);

If n is not a value from 1 to 7, the function stores Invalid day in name. If it is a valid day number, it stores the value of day[n] in name. For example, if n is 6, the function stores day[6], that is, Friday, in name.

In main, dayName is declared to be of size 12 since it needs to hold the string "Invalid day" if the day number is invalid.

8.11 A Flexible getString Function

So far, we have used the format specification %s to read a string containing no whitespace characters and the function gets to read a string up to the end-of-line. However, neither of these allows us to read a string delimited by double quotes, for instance. Suppose we had data as in the following format:

"Margaret Dwarika" "Clerical Assistant"

We would not be able to use %s or gets to read this data easily.

We will write a function, getString, which lets us read a string enclosed within ‘delimiter’ characters. For example, we could specify a string as $John Smith$ or "John Smith". This is a very flexible way of specifying a string. Each string can be specified with its own delimiters which could be different for the next string. It is particularly useful for specifying strings which may include special characters such as the double quotes without having to use an escape sequence like\".

For instance, in order to specify the following string in C:

"Don't move!" he commanded.

we must write:

"\"Don't move!\" he commanded."

With getString, this string could be supplied as

$"Don't move!" he commanded.$

or

%"Don't move!" he commanded.%

or using any other character as a delimiter, provided it is not one of the characters in the string. We could even use something like this:

7"Don't move!" he commanded."7

but would normally use special characters like ", $, % or # as delimiters.

We will write getString with two parameters: a file designated by in and a character array str. The function will read the next string from in and store it in str.

The function assumes that the first non-whitespace character met (delim, say) is the delimiter. Characters are read and stored until delim is met again, indicating the end of the string. The delimiter characters are not stored since they are not part of the string.

Suppose we have the following declarations in main:

FILE * input = fopen("quizdata.txt", "r");

char country[50];

and the file quizdata.txt contains strings delimited as described above. We would be able to read the next string from the file and store it in country with this:

getString(input, country);

It is up to us to ensure that country is big enough to hold the next string. If not, the program may crash or nonsense results will occur.

Here is getString:

void getString(FILE * in, char str[]) {

//stores, in str, the next string within delimiters

// the first non-whitespace character is the delimiter

// the string is read from the file 'in'

char ch, delim;

int n = 0;

str[0] = '\0';

// read over white space

while (isspace(ch = getc(in))) ; //empty while body

if (ch == EOF) return;

delim = ch;

while (((ch = getc(in)) != delim) && (ch != EOF))

str[n++] = ch;

str[n] = '\0';

} // end getString

Comments on getString

§ The predefined function isspace returns 1 (true) if its char argument is a space, tab or newline character and 0 (false), otherwise.

§ If getString encounters end-of-file before finding a non-whitespace character (the delimiter), the empty string is returned in str. Otherwise, it builds the string by reading one character at a time; the string is terminated by the next occurrence of the delimiter or end-of-file, whichever comes first.

§ We can read a string from the standard input (the keyboard) by calling getString with stdin as the first argument.

8.12 A Geography Quiz Program

Let us write a program which quizzes a user on countries and their capitals. The program will illustrate some useful programming concepts like reading from the keyboard and a file and being very flexible in terms of user input. The following is a sample run of the program, indicating how we want the finished program to work. The user is given two tries at a question. If she gets it wrong both times, the program tells her the correct answer.

What is the capital of Trinidad? Tobago

Wrong. Try again.

What is the capital of Trinidad? Port of Spain

Correct!

What is the capital of Jamaica? Kingston

Correct!

What is the capital of Grenada? Georgetown

Wrong. Try again.

What is the capital of Grenada? Castries

Wrong. Answer is St. George's

We will store the names of the countries and their capitals in a file (quizdata.txt, say). For each country, we will store its name, its capital and a special string consisting only of the letters in the capital, all converted to uppercase. This last string will be used to enable users to type their answers with a lot of flexibility. and it will enable us to write a more efficient program. It is not absolutely necessary to provide this last string since we can get the program to create it for us (see note after Program P8.7). The string "*" is used to indicate the end of the data. The following shows some sample data:

"Trinidad" "Port of Spain" "PORTOFSPAIN"

"Jamaica" "Kingston" "KINGSTON"

"Grenada" "St. George's" "STGEORGES"

"*"

We show 3 strings per line but this is not necessary. The only requirement is that they are supplied in the right order. If you wish, you can have 1 string per line or 6 strings per line or different numbers of strings per line. Also, you can use any character to delimit a string, provided it is not a character in the string. And you can use different delimiters for different strings. It is perfectly okay to supply the above data as follows:

"Trinidad" $Port of Spain$ *PORTOFSPAIN*

%Jamaica% "Kingston" &KINGSTON&

$Grenada$ %St. George's% ^STGEORGES^

#*#

We can do this because of the flexibility of getString. We will use getString to read strings from the file and gets to get the user’s answers typed at the keyboard.

Suppose a country’s data are read into the variables country, capital and CAPITAL, respectively. (Remember that, in C, capital is a different variable from CAPITAL). When the user types an answer (answer, say), it must be compared with capital. If we use a straightforward comparison like

if (strcmp(answer, capital) == 0) ...

to check if answer is the same as capital, then answers like "Portof Spain", "port of spain", " Port ofSpain" and "st georges" would all be considered wrong. If we want these answers to be correct (and we probably should) we must convert all user answers to a common format before comparing.

We take the view that as long as all the letters are there, in the correct order, regardless of case, the answer is considered correct. When the user types an answer, we ignore spaces and punctuation and convert the letters only to uppercase. This is then compared with CAPITAL. For example, the answers above would be converted to "PORTOFSPAIN" and "STGEORGES" and would elicit a "Correct!" response.

In the palindrome program (P8.5), we wrote a function lettersOnlyLower which kept the letters only from a string and converted them to lowercase. Here, we want the same function but we convert to uppercase instead. We name the function lettersOnlyUpper. The code is identical to lettersOnlyLower except that tolower is replaced by toupper. Our test for correctness now becomes this:

lettersOnlyUpper(answer, ANSWER);

if (strcmp(ANSWER, CAPITAL) == 0) printf("Correct!\n");

All the details are captured in Program P8.7.

Program P8.7

#include <stdio.h>

#include <string.h>

#include <ctype.h>

#include <stdlib.h>

#define MaxLength 50

int main() {

void getString(FILE *, char[]);

void askOneQuestion(char[], char[], char[]);

char EndOfData[] = "*", country[MaxLength+1] ;

char capital[MaxLength+1], CAPITAL[MaxLength+1];

FILE * in = fopen("quizdata.txt", "r");

if (in == NULL){

printf("Cannot find file\n");

exit(1);

}

getString(in, country);

while (strcmp(country, EndOfData) != 0) {

getString(in, capital);

getString(in, CAPITAL);

askOneQuestion(country, capital, CAPITAL);

getString(in, country);

}

} // end main

void askOneQuestion(char country[], char capital[], char CAPITAL[]) {

void lettersOnlyUpper(char [], char[]);

char answer[MaxLength+1], ANSWER[MaxLength+1];

printf("\nWhat is the capital of %s?", country);

gets(answer);

lettersOnlyUpper(answer, ANSWER);

if (strcmp(ANSWER, CAPITAL) == 0) printf("Correct!\n");

else {

printf("Wrong. Try again\n");

printf("\nWhat is the capital of %s?", country);

gets(answer);

lettersOnlyUpper(answer, ANSWER);

if (strcmp(ANSWER, CAPITAL) == 0) printf("Correct!\n");

else printf("Wrong. Answer is %s\n", capital);

}

} // end askOneQuestion

void lettersOnlyUpper(char word[], char WORD[]) {

// stores the letters in word (converted to uppercase) in WORD

int i = 0, n = 0;

char c;

while ((c = word[i++]) != '\0')

if (isalpha(c)) WORD[n++] = toupper(c);

WORD[n] = '\0';

} // end lettersOnlyUpper

void getString(FILE * in, char str[]) {

//stores, in str, the next string within delimiters

// the first non-whitespace character is the delimiter

// the string is read from the file 'in'

char ch, delim;

int n = 0;

str[0] = '\0';

// read over white space

while (isspace(ch = getc(in))) ; //empty while body

if (ch == EOF) return;

delim = ch;

while (((ch = getc(in)) != delim) && (ch != EOF))

str[n++] = ch;

str[n] = '\0';

} // end getString

As mentioned earlier, it is not absolutely necessary to store CAPITAL in the file. We can store country and capital only, and when these are read, convert capital with

lettersOnlyUpper(capital, CAPITAL);

You can use the idea of this program to write many similar ones. On the Geography theme, you can ask about mountains and heights, rivers and lengths, countries and population, countries and prime ministers, and so on. For a different application, you can use it to drill a user in English-Spanish (or any other combination of languages) vocabulary. Your questions could take the form:

What is the Spanish word for water?

or, if you prefer,

What is the English word for agua?

Better yet, let the user choose whether she is given English or Spanish words.

You can ask about books and authors, songs and singers, movies and stars. As an exercise, think of five other areas in which the idea of this program can be used to quiz a user.

8.13 Find the Largest Number

Let us consider the problem of finding the largest of a set of values stored in an array. The principle of finding the largest is the same as we discussed in Section 5.5. Suppose the integer array num contains the following values:

num

25

72

17

43

84

14

61

0

1

2

3

4

5

6

We can easily see that the largest number is 84 and that it is in location 4. But how does a program determine this? One approach is as follows:

§ Assume that the first element (the one in position 0) is the largest; we do this by setting big to 0. As we step through the array, we will use big to hold the position of the largest number encountered so far; num[big] will refer to the actual number.

§ Next, starting at position 1, we look at the number in each successive position, up to 6, and compare the number with the one in position big.

§ The first time, we compare num[1] with num[0]; since num[1], 72, is larger than num[0], 25, we update big to 1. This means that the largest number so far is in position 1.

§ Next, we compare num[2], 17, with num[big] (that is, num[1]), 72; since num[2] is smaller than num[1], we go on to the next number, leaving big at 1.

§ Next, we compare num[3], 43, with num[big] (that is, num[1]), 72; since num[3] is smaller than num[1], we go on to the next number, leaving big at 1.

§ Next, we compare num[4], 84, with num[big] (that is, num[1]), 72; since num[4] is larger than num[1], we update big to 4. This means that the largest number so far is in position 4.

§ Next, we compare num[5], 14, with num[big] (that is, num[4]), 84; since num[5] is smaller than num[4], we go on to the next number, leaving big at 4.

§ Next, we compare num[6], 61, with num[big] (that is, num[4]), 84; since num[6] is smaller than num[4], we go on to the next number, leaving big at 4.

§ Since there is no next number, the process ends with the value of big being 4, the position of the largest number. The actual number is denoted by num[big]; since big is 4, this is num[4], which is 84.

We can express the process just described by the following pseudocode:

big = 0

for h = 1 to 6

if num[h] > num[big] then big = h

endfor

print "Largest is ", num[big], " in position ", big

We will now write a function, getLargest, to find the largest value in an array. To be general, we will specify which portion of the array to search for the value. This is important since, most times, we declare an array to be of some maximum size (100, say) but do not always put 100 values in the array.

When we declare the array to be of size 100, we are catering for 100 values. But, at any time, the array may have less than this amount. We use another variable (n, say) to tell us how many values are currently stored in the array. For example, if n is 36, it means that values are stored in elements 0 to 35 of the array.

So when we are finding the largest, we must specify which elements of the array to search. We will write the function such that it takes three arguments—the array num, and two integers lo and hi—and returns the position of the largest number from num[lo] to num[hi], inclusive. It is up to the caller to ensure that lo and hi are within the range of subscripts declared for the array. For instance, the call

§ getLargest(score, 0, 6) will return the position of the largest number from score[0] to score[6]; and the call

§ getLargest(mark, 10, 20) will return the position of the largest number from mark[10] to mark[20].

Here is the function, getLargest:

int getLargest(int num[], int lo, int hi) {

int big = lo;

for (int h = lo + 1; h <= hi; h++)

if (num[h] > num[big]) big = h;

return big;

} //end getLargest

The function assumes the largest number is in position lo, the first one, by setting big to lo. In turn, it compares the numbers in locations lo+1 up to hi with the one in location big. If a bigger one is found, big is updated to the location of the bigger number.

8.14 Find the Smallest Number

The function, getLargest, could be easily modified to find the smallest value in an array. Simply change big to small, say, and replace > by <, giving this:

int getSmallest(int num[], int lo, int hi) {

int small = lo;

for (int h = lo + 1; h <= hi; h++)

if (num[h] < num[small]) small = h;

return small;

} //end getSmallest

This function returns the location of the smallest element from num[lo] to num[hi], inclusive. Later, we will show you how to use this function to arrange a set of numbers in ascending order.

We have shown how to find the largest and smallest values in an integer array. The procedure is exactly the same for arrays of other types such as double, char or float. The only change which has to be made is in the declaration of the arrays. Keep in mind that when we compare two characters, the ‘larger’ one is the one with the higher numeric code.

8.15 A Voting Problem

We now illustrate how to use some of the ideas just discussed to solve the following problem.

In an election, there are seven candidates. Each voter is allowed one vote for the candidate of his/her choice. The vote is recorded as a number from 1 to 7. The number of voters is unknown beforehand but the votes are terminated by a vote of 0. Any vote which is not a number from 1 to 7 is an invalid (spoilt) vote.

A file, votes.txt, contains the names of the candidates. The first name is considered as candidate 1, the second as candidate 2, and so on. The names are followed by the votes. Write a program to read the data and evaluate the results of the election. Print all output to the file, results.txt.

Your output should specify the total number of votes, the number of valid votes and the number of spoilt votes. This is followed by the votes obtained by each candidate and the winner(s) of the election.

Suppose you are given the following data in the file, votes.txt:

Victor Taylor

Denise Duncan

Kamal Ramdhan

Michael Ali

Anisa Sawh

Carol Khan

Gary Olliverie

3 1 2 5 4 3 5 3 5 3 2 8 1 6 7 7 3 5

6 9 3 4 7 1 2 4 5 5 1 4 0

Your program should send the following output to the file, results.txt:

Invalid vote: 8

Invalid vote: 9

Number of voters: 30

Number of valid votes: 28

Number of spoilt votes: 2

Candidate

Score

Victor Taylor

4

Denise Duncan

3

Kamal Ramdhan

6

Michael Ali

4

Anisa Sawh

6

Carol Khan

2

Gary Olliverie

3

The winner(s):

Kamal Ramdhan

Anisa Sawh

We need to store the names of the 7 candidates and the votes obtained by each. We will use an int array for the votes. In order to work naturally with candidates 1 to 7, we will write the declaration

int vote[8];

and use vote[1] to vote[7] for counting the votes for the candidates; vote[c] will hold the count for candidate c. We will not use vote[0].

But what kind of array can we use for the names, since a name itself is stored in a char array? We will need an “array of arrays”—a two-dimensional array. Consider the declaration

char name[8][15];

We can think of name as having 8 rows and 15 columns. If we store one name in each row, then we can store 8 names. Each name is stored in an array of 15 characters. The rows are numbered from 0 to 7 and the columns are numbered from 0 to 14. In our program, we will not use row 0. We will store the names in rows 1 to 7. If we store the sample names in this array, it will look like this:

0

1

2

3

4

5

6

7

8

9

10

11

12

13

14

0

\0

1

V

i

c

t

o

r

T

a

y

l

o

r

\0

2

D

e

n

i

s

e

D

u

n

c

a

n

\0

3

K

a

m

a

l

R

a

m

d

h

a

n

\0

4

M

i

c

h

a

e

l

A

l

i

\0

5

A

n

i

s

a

S

a

w

h

\0

6

C

a

r

o

l

K

h

a

n

\0

7

G

a

r

y

O

l

l

i

v

e

r

i

e

\0

To cater for longer names, we will use the following declaration to store the names of the candidates:

char name[8][31];

We will store the name of candidate c in name[c]; name[0] will not be used.

To make the program flexible, we will define the following symbolic constants:

#define MaxCandidates 7

#define MaxNameLength 30

and, in main, use these declarations:

char name[MaxCandidates + 1][MaxNameLength + 1];

int vote[MaxCandidates + 1];

The #define directives will be placed at the top of the program, before main. When we do this, the symbolic constants will be available to any function that needs to use them.

In general, variables and identifiers declared outside of any function are said to be external and are available to any function that comes after it in the same file. (The rules are a bit more complicated than this, but this will suffice for our purposes.) So if the declarations are placed at the top of the program, the variables and identifiers would be available to all functions in the program, assuming the entire program is stored in one file (as is the case with our programs).

One of the first things the program must do is read the names and set the vote counts to 0. We will write a function initialize to do this. This will also let us show you how to pass a 2-dimensional array to a function.

As explained previously, we will read a candidate’s name in two parts (first name and last name) and then join them together to create a single name which we will store in name[c]. Here is the function:

void initialize(char name[][MaxNameLength + 1], int vote[]) {

char lastName[MaxNameLength];

for (int c = 1; c <= MaxCandidates; c++) {

fscanf(in, "%s %s", name[c], lastName);

strcat(name[c], " ");

strcat(name[c], lastName);

vote[c] = 0;

}

} //end initialize

As we see in the case of the parameter vote, we just need the square brackets to signify that vote is a one-dimensional array. However, in the case of the two-dimensional array name, we must specify the size of the second dimension and we must use a constant or an expression whose value can be determined when the program is compiled. (C99 and later versions of C allow variable-length arrays in which an array subscript can be specified at run-time. We will see an example in Section 9.4.1.) The size of the first dimension may remain unspecified as indicated by empty square brackets. This holds for any two-dimensional array used as a parameter.

Next, we must read and process the votes. Processing vote v involves checking that it is valid. If it is, we want to add 1 to the score for candidate v. We will read and process the votes with the following:

fscanf(in, "%d", &v);

while (v != 0) {

if (v < 1 || v > MaxCandidates) {

fprintf(out, "Invalid vote: %d\n", v);

++spoiltVotes;

}

else {

++vote[v];

++validVotes;

}

fscanf(in, "%d", &v);

}

The key statement here is

++vote[v];

This is a clever way of using the vote v as a subscript to add 1 for the right candidate. For example, if v is 3, we have a vote for candidate 3, Kamal Ramdhan. We wish to add 1 to the vote count for candidate 3. This count is stored in vote[3]. When v is 3, the statement becomes

++vote[3];

This adds 1 to vote[3]. The beauty is that the same statement will add 1 for any of the candidates, depending on the value of v. This illustrates some of the power of using arrays. It does not matter whether there are 7 candidates or 700; the one statement will work for all.

Now that we know how to read and process the votes, it remains only to determine the winner(s) and print the results. We will delegate this task to the function printResults.

Using the sample data, the array vote will contain the following values after all the votes have been tallied (remember we are not using vote[0]).

vote

4

3

6

4

6

2

3

1

2

3

4

5

6

7

To find the winner, we must first find the largest value in the array. To do this, we will call getLargest (Section 8.12) with

int win = getLargest(vote, 1, MaxCandidates);

This will set win to the subscript of the largest value from vote[1] to vote[7] (since MaxCandidates is 7). In our example, win will be set to 3 since the largest value, 6, is in position 3. (6 is also in position 5 but the way the code is written, it will return the first position which contains the largest, if there is more than one.)

Now that we know the largest value is in vote[win], we can ‘step through’ the array, looking for those candidates with that value. This way, we will find all the candidates (one or more) with the highest vote and declare them as winners.

The details are given in the function printResults shown as part of Program P8.8, our solution to the voting problem posed at the beginning of this section.

Program P8.8

#include <stdio.h>

#include <string.h>

#define MaxCandidates 7

#define MaxNameLength 30

FILE *in, *out;

int main() {

char name[MaxCandidates + 1][MaxNameLength + 1];

int vote[MaxCandidates + 1];

int v, validVotes = 0, spoiltVotes = 0;

void initialize(char [][MaxNameLength + 1], int []);

void printResults(char [][MaxNameLength + 1], int [], int, int);

in = fopen("votes.txt", "r");

out = fopen("results.txt", "w");

initialize(name, vote);

fscanf(in, "%d", &v);

while (v != 0) {

if (v < 1 || v > MaxCandidates) {

fprintf(out, "Invalid vote: %d\n", v);

++spoiltVotes;

}

else {

++vote[v];

++validVotes;

}

fscanf(in, "%d", &v);

}

printResults(name, vote, validVotes, spoiltVotes);

fclose(in);

fclose(out);

} // end main

void initialize(char name[][MaxNameLength + 1], int vote[]) {

char lastName[MaxNameLength];

for (int c = 1; c <= MaxCandidates; c++) {

fscanf(in, "%s %s", name[c], lastName);

strcat(name[c], " ");

strcat(name[c], lastName);

vote[c] = 0;

}

} // end initialize

int getLargest(int num[], int lo, int hi) {

int big = lo;

for (int h = lo + 1; h <= hi; h++)

if (num[h] > num[big]) big = h;

return big;

} //end getLargest

void printResults(char name[][MaxNameLength + 1], int vote[],

int valid, int spoilt) {

int getLargest(int v[], int, int);

fprintf(out, "\nNumber of voters: %d\n", valid + spoilt);

fprintf(out, "Number of valid votes: %d\n", valid);

fprintf(out, "Number of spoilt votes: %d\n", spoilt);

fprintf(out, "\nCandidate Score\n\n");

for (int c = 1; c <= MaxCandidates; c++)

fprintf(out, "%-15s %3d\n", name[c], vote[c]);

fprintf(out, "\nThe winner(s)\n");

int win = getLargest(vote, 1, MaxCandidates);

int winningVote = vote[win];

for (int c = 1; c <= MaxCandidates; c++)

if (vote[c] == winningVote) fprintf(out, "%s\n", name[c]);

} //end printResults

EXERCISES 8

1. Explain the difference between a simple variable and an array variable.

2. Write array declarations for each of the following: (a) a floating-point array of size 25 (b) an integer array of size 50 (c) a character array of size 32.

3. What is a subscript? Name 3 ways in which we can write a subscript.

4. What values are stored in an array when it is first declared?

5. Name 2 ways in which we can store a value in an array element.

6. Write a function which, given a number from 1 to 12 and a character array, stores the name of the month in the array. For example, given 8, it stores August in the array. Store the empty string if the number given is not valid.

7. You declare an array of size 500. Must you store values in all elements of the array?

8. Write code to read 200 names from a file and store them in an array.

9. An array num is of size 100. You are given two values i and k, with 0 ≤ i < k ≤ 99. Write code to find the average of the numbers from num[i] to num[k], inclusive.

10. Write a function which, given a string of arbitrary characters, returns the number of consonants in the string.

11. Modify the letter frequency count program (Program P8.2) to count the number of non-letters as well. Make sure you do not count the end-of-line characters.

12. Write a function which, given an array of integers and an integer n, reverses the first n elements of the array.

13. Write a program to read names and phone numbers into two arrays. Request a name and print the person’s phone number. Use at least one function.

14. Write a function indexOf which, given a string s and a character c, returns the position of the first occurrence of c in s. If c is not in s, return -1. For example, indexOf("brother", 'h') returns 4 but indexOf("brother", 'a')returns -1.

15. Write a function substring which, given two strings s1 and s2, returns the starting position of the first occurrence of s1 in s2. If s1 is not in s2, return -1. For example, substring("mom", "thermometer") returns 4 but substring("dad", "thermometer") returns -1.

16. Write a function remove which, given a string str and a character c, removes all occurrences of c from str. For example, if str contains "brother", remove(str, 'r') should change str to "bothe".

17. Write a program to read English words and their equivalent Spanish words into two arrays. Request the user to type several English words. For each, print the equivalent Spanish word. Choose a suitable end-of-data marker. Modify the program so that the user types Spanish words instead.

18. The number 27472 is said to be palindromic since it reads the same forwards or backwards. Write a function which, given an integer n, returns 1 if n is palindromic and 0 if it is not.

19. Write a program to find out, for a class of students, the number of families with 1, 2, 3, ... up to 8 or more children. The data consists of the number of children in each pupil’s family, terminated by 0. (Why is 0 a good value to use?)

20. A survey of 10 pop artists is made. Each person votes for an artist by specifying the number of the artist (a value from 1 to 10). Write a program to read the names of the artists, followed by the votes, and find out which artist is the most popular. Choose a suitable end-of-data marker.

21. The children’s game of ‘count-out’ is played as follows. n children (numbered 1 to n) are arranged in a circle. A sentence consisting of m words is used to eliminate one child at a time until one child is left. Starting at child 1, the children are counted from 1 to m and the mth child is eliminated. Starting with the child after the one just eliminated, the children are again counted from 1 to m and the mth child eliminated. This is repeated until one child is left. Counting is done circularly and eliminated children are not counted. Write a program to read values for n (assumed <= 100) and m (> 0) and print the number of the last remaining child.

22. The prime numbers from 1 to 2500 can be obtained as follows. From a list of the numbers 1 to 2500, cross out all multiples of 2 (but not 2 itself). Then, find the next number (n, say) that is not crossed out and cross out all multiples of n (but not n). Repeat this last step provided that n has not exceeded 50 (the square root of 2500). The numbers remaining in the list (except 1) are prime. Write a program which uses this method to print all primes from 1 to 2500. Store your output in a file called primes.out. This method is called the Sieve of Eratosthenes, named after the Greek mathematician, geographer and philosopher.

23. There are 500 light bulbs (numbered 1 to 500) arranged in a row. Initially, they are all OFF. Starting with bulb 2, all even numbered bulbs are turned ON. Next, starting with bulb 3, and visiting every third bulb, it is turned ON if it is OFF, and it is turned OFF if it is ON. This procedure is repeated for every fourth bulb, then every fifth bulb, and so on up to the 500th bulb. Write a program to determine which bulbs are OFF at the end of the above exercise.

24. There is something special about the bulbs that are OFF. What is it? Can you explain why it is so?