The Preprocessor - Programming in C (Fourth Edition) (2015)

12. The Preprocessor

This chapter describes yet another feature of the C language that is not found in many other higher-level programming languages. The C preprocessor provides tools that enable you to write programs that are easier to develop, read, modify, and port to different computer systems. You can also use the preprocessor to customize the C language to suit a particular programming application or to satisfy your own programming style. This chapter covers

Creating your own constants and macros with the #define statement

Building your own library files with the #include statement

Making more powerful programs with the conditional #ifdef, #endif, #else, and #ifndef statements

The preprocessor is a part of the C compilation process that recognizes special statements that might be interspersed throughout a C program. As its name implies, the preprocessor analyzes these statements before analysis of the C program itself takes place. Preprocessor statements are identified by the presence of a pound sign, #, which must be the first nonspace character on the line. As you will see, preprocessor statements have a syntax that is slightly different from that of normal C statements. Almost every program you’ve written to this point has used a preprocessor directive, specifically the #include directive. There’s more you can do with that directive, which is covered later in this chapter, but you begin by examining the #define statement.

The #define Statement

One of the primary uses of the #define statement is to assign symbolic names to program constants. The preprocessor statement

#define YES 1

defines the name YES and makes it equivalent to the value 1. The name YES can subsequently be used anywhere in the program where the constant 1 could be used. Whenever this name appears, its defined value of 1 is automatically substituted into the program by the preprocessor. For example, you might have the following C statement that uses the defined name YES:

gameOver = YES;

This statement assigns the value of YES to gameOver. You don’t need to concern yourself with the actual value that you defined for YES, but because you do know that it is defined as 1, the preceding statement has the effect of assigning 1 to gameOver. The preprocessor statement

#define NO 0

defines the name NO and makes its subsequent use in the program equivalent to specifying the value 0. Therefore, the statement

gameOver = NO;

assigns the value of NO to gameOver, and the statement

if ( gameOver == NO )
...

compares the value of gameOver against the defined value of NO. Just about the only place that you cannot use a defined name is inside a character string; so the statement

char *charPtr = "YES";

sets charPtr pointing to the string "YES" and not to the string "1".

A defined name is not a variable. Therefore, you cannot assign a value to it, unless the result of substituting the defined value is in fact a variable. Whenever a defined name is used in a program, whatever appears to the right of the defined name in the #define statement gets automatically substituted into the program by the preprocessor. It’s analogous to doing a search and replace with a text editor; in this case, the preprocessor replaces all occurrences of the defined name with its associated text.

Notice that the #define statement has a special syntax: There is no equal sign used to assign the value 1 to YES. Furthermore, a semicolon does not appear at the end of the statement. Soon, you will understand why this special syntax exists. But first, take a look at a small program that uses the YES and NO defines as previously illustrated. The function isEven in Program 12.1 simply returns YES if its argument is even and NO if its argument is odd.

Program 12.1 Introducing the #define Statement


#include <stdio.h>

#define YES 1
#define NO  0

// Function to determine if an integer is even

int isEven (int number)
{
    int answer;

    if ( number % 2 == 0 )
        answer = YES;
    else
        answer = NO;

    return answer;
}

int main (void)
{
    int isEven (int number);

    if ( isEven (17) == YES )
        printf ("yes ");
    else
        printf ("no ");

    if ( isEven (20) == YES )
        printf ("yes\n");
    else
        printf ("no\n");

    return 0;
}


Program 12.1 Output


no yes


The #define statements appear first in the program. This is not required; they can appear anywhere in the program. What is required is that a name be defined before it is referenced by the program. Defined names do not behave like variables: There is no such thing as a local define. After a name has been defined in a program, either inside or outside a function, it can subsequently be used anywhere in the program. Most programmers group their #define statements at the beginning of the program (or inside an include file¹) where they can be quickly referenced and shared by more than one source file.

1. Read on to learn how defines can be set up inside special files that you can include in your program.

The defined name NULL is frequently used by programmers to represent the null pointer.²

2. NULL is already defined on your system inside a file named <stddef.h>. Again, include files are discussed in more detail shortly.

By including a definition such as

#define NULL 0

in a program, you can then write more readable statements, such as

while ( listPtr != NULL )
...

to set up a while loop that will execute as long as the value of listPtr is not equal to the null pointer.

As another example of the use of a defined name, suppose you want to write three functions to find the area of a circle, the circumference of a circle, and the volume of a sphere of a given radius. Because all these functions need to use the constant π, which is not a particularly easy constant to remember, it makes sense to define the value of this constant once at the start of the program and then use this value where needed in each function.³

3. On many systems, the identifier M_PI is already defined for you in the header file <math.h>, although the C standard does not require it. Where it is available, you can include that file and use M_PI directly in your programs.

Program 12.2 shows how a definition for this constant can be set up and used in a program.

Program 12.2 More on Working with Defines


/* Function to calculate the area and circumference of a
   circle, and the volume of a sphere of a given radius */

#include <stdio.h>

#define PI 3.141592654

double area (double r)
{
    return PI * r * r;
}

double circumference (double r)
{
    return 2.0 * PI * r;
}

double volume (double r)
{
    return 4.0 / 3.0 * PI * r * r * r;
}

int main (void)
{
    double area (double r), circumference (double r),
           volume (double r);

    printf ("radius = 1: %.4f %.4f %.4f\n",
            area(1.0), circumference(1.0), volume(1.0));

    printf ("radius = 4.98: %.4f %.4f %.4f\n",
            area(4.98), circumference(4.98), volume(4.98));

    return 0;
}


Program 12.2 Output


radius = 1: 3.1416 6.2832 4.1888
radius = 4.98: 77.9128 31.2903 517.3403


The symbolic name PI is defined as the value 3.141592654 at the beginning of the program. Subsequent use of the name PI inside the area(), circumference(), and volume() functions has the effect of causing its defined value to be automatically substituted at the appropriate point.

Assignment of a constant to a symbolic name frees you from having to remember the particular constant value every time you want to use it in a program. Furthermore, if you ever need to change the value of the constant (if, perhaps, you find out that you are using the wrong value, for example), you only have to change the value in one place in the program: in the #define statement. Without this approach, you would have to otherwise search throughout the program and explicitly change the value of the constant whenever it was used.

You might have realized that all the defined names you have seen so far (YES, NO, NULL, and PI) have been written in capital letters. The reason this is done is to visually distinguish a defined value from a variable. Some programmers adopt the convention that all defined names be capitalized, so that it becomes easy to determine when a name represents a variable and when it represents a defined name. Another common convention is to prefix the defined value with the letter k. In that case, the following characters of the name are not all capitalized. kMaximumValues and kSignificantDigits are two examples of defined names that adhere to this convention.

Program Extendability

Using a defined name for a constant value helps to make programs more readily extendable. For example, when you define an array, you must specify the number of elements in the array—either explicitly or implicitly (by specifying a list of initializers). Subsequent program statements will likely use the knowledge of the number of elements contained inside the array. For example, if the array dataValues is defined in a program as follows:

float dataValues[1000];

there is a good chance that you will see statements in the program that use the fact that dataValues contains 1,000 elements. For instance, in a for loop

for ( i = 0; i < 1000; ++i )
...

you would use the value 1000 as an upper bound for sequencing through the elements of the array. A statement such as

if ( index > 999 )
...

might also be used in the program to test if an index value exceeds the maximum size of the array.

Now suppose that you had to increase the size of the dataValues array from 1,000 to 2,000 elements. This would necessitate changing all statements that used the fact that dataValues contained 1,000 elements.

A better way of dealing with array bounds, which makes programs easier to extend, is to define a name for the upper array bound. So, if you define a name such as MAXIMUM_DATAVALUES with an appropriate #define statement:

#define MAXIMUM_DATAVALUES 1000

you can subsequently define the dataValues array to contain MAXIMUM_DATAVALUES elements with the following program line:

float dataValues[MAXIMUM_DATAVALUES];

Statements that use the upper array bound can also make use of this defined name. To sequence through the elements in dataValues, for example, the for statement

for ( i = 0; i < MAXIMUM_DATAVALUES; ++i )
...

could be used. To test if an index value is greater than the upper bound of the array, you could write

if ( index > MAXIMUM_DATAVALUES - 1 )
...

and so on. The nicest thing about the preceding approach is that you can now easily change the size of the dataValues array to 2,000 elements by simply changing the definition:

#define MAXIMUM_DATAVALUES 2000

And if the program is written to use MAXIMUM_DATAVALUES in all cases where the size of the array was used, the preceding definition could be the only statement in the program that would have to be changed.

Program Portability

Another nice use of the #define statement is that it helps to make programs more portable from one computer system to another. At times, it might be necessary to use constant values that are related to the particular computer on which the program is running. This might have to do with the use of a particular computer memory address, a filename, or the number of bits contained in a computer word, for example. You will recall that your rotate() function from Program 11.4 used the knowledge that an int contained 32 bits on the machine on which the program was executed.

If you wanted to execute this program on a different machine, on which an int contained 64 bits, the rotate function would not work correctly.⁴ In situations in which the program must be written to make use of machine-dependent values, it makes sense to isolate such dependencies from the program as much as possible. The #define statement can help significantly in this respect. The new version of the rotate function is easier to port to another machine, even though it is a rather simple case in point. Here’s the new function:

4. Of course, you can write the rotate function so that it determines the number of bits in an int by itself and, therefore, is completely machine independent. Refer to exercises 3 and 4 at the end of Chapter 11, “Operations on Bits.”

#include <stdio.h>

#define kIntSize 32     // *** machine dependent !!! ***

// Function to rotate an unsigned int left or right

unsigned int rotate (unsigned int value, int n)
{
    unsigned int result, bits;

    /* scale down the shift count to a defined range */

    if ( n > 0 )
        n = n % kIntSize;
    else
        n = -(-n % kIntSize);

    if ( n == 0 )
        result = value;
    else if ( n > 0 )      /* left rotate */
    {
        bits = value >> (kIntSize - n);
        result = value << n | bits;
    }
    else                   /* right rotate */
    {
        n = -n;
        bits = value << (kIntSize - n);
        result = value >> n | bits;
    }

    return result;
}

More Advanced Types of Definitions

A definition for a name can include more than a simple constant value. It can include an expression, and, as you will see shortly, just about anything else!

The following defines the name TWO_PI as the product of 2.0 and 3.141592654:

#define TWO_PI 2.0 * 3.141592654

You can subsequently use this defined name anywhere in a program where the expression 2.0 * 3.141592654 would be valid. So you could have replaced the return statement of the circumference function from the previous program with the following statement, for example:

return TWO_PI * r;

Whenever a defined name is encountered in a C program, everything that appears to the right of the defined name in the #define statement is literally substituted for the name at that point in the program. So, when the C preprocessor encounters the name TWO_PI in the return statement shown previously, it substitutes for this name whatever appeared in the #define statement for this name. Therefore, 2.0 * 3.141592654 is literally substituted by the preprocessor whenever the defined name TWO_PI occurs in the program.

The fact that the preprocessor performs a literal text substitution whenever the defined name occurs explains why you don’t usually want to end your #define statement with a semicolon. If you did, then the semicolon would also be substituted into the program wherever the defined name appeared. If you had defined PI as

#define PI 3.141592654;

and then written

return 2.0 * PI * r;

the preprocessor would replace the occurrence of the defined name PI with 3.141592654;. The compiler would therefore see this statement as

return 2.0 * 3.141592654; * r;

after the preprocessor had made its substitution, which would result in a syntax error.

A preprocessor definition does not have to be a valid C expression in its own right—just so long as wherever it is used the resulting expression is valid. For instance, the definition

#define LEFT_SHIFT_8 << 8

is legitimate, even though what appears after LEFT_SHIFT_8 is not a syntactically valid expression. You can use your definition of LEFT_SHIFT_8 in a statement such as

x = y LEFT_SHIFT_8;

to shift the contents of y to the left eight bits and assign the result to x. Of a much more practical nature, you can set up the definitions

#define AND &&
#define OR ||

and then write expressions such as

if ( x > 0 AND x < 10 )
...

and

if ( y == 0 OR y == value )
...

You can even include a #define statement for the equality test:

#define EQUALS ==

and then write the statement

if ( y EQUALS 0 OR y EQUALS value )
...

thus removing the very real possibility of mistakenly using a single equal sign for the equality test, as well as improving the statement’s readability.

Although these examples illustrate the power of the #define, you should note that it is commonly considered poor programming practice to redefine the syntax of the underlying language in such a manner. Moreover, it can make it harder for someone else to understand your code.

To make things even more interesting, a defined value can itself reference another defined value. So the two defines

#define PI 3.141592654
#define TWO_PI 2.0 * PI

are perfectly valid. The name TWO_PI is defined in terms of the previously defined name PI, thus obviating the need to spell out the value 3.141592654 again.

Reversing the order of the defines, as in

#define TWO_PI 2.0 * PI
#define PI 3.141592654

is also valid. The rule is that you can reference other defined values in your definitions provided everything is defined at the time the defined name is used in the program. For readability, it is suggested that you don’t use a defined term until it has been defined, though.

Good use of defines often reduces the need for comments within the program. Consider the following statement:

if ( year % 4 == 0 && year % 100 != 0 || year % 400 == 0 )
...

You know from previous programs in this book that the preceding expression tests whether the variable year is a leap year. Now consider the following define and the subsequent if statement:

#define IS_LEAP_YEAR year % 4 == 0 && year % 100 != 0 \
|| year % 400 == 0
...
if ( IS_LEAP_YEAR )
...

Normally, the preprocessor assumes that a definition is contained on a single line of the program. If a second line is needed, the final character on the line must be a backslash character. This character signals a continuation to the preprocessor and is otherwise ignored. The same holds true for more than one continuation line; each line to be continued must be ended with a backslash character.

The preceding if statement is far easier to understand than the one shown directly before it. There is no need for a comment as the statement is self-explanatory. The purpose that the define IS_LEAP_YEAR serves is analogous to that served by a function. You could have used a call to a function named is_leap_year to achieve the same degree of readability. The choice of which to use in this case is completely subjective. Of course, the is_leap_year function could be made more general than the preceding define because it could be written to take an argument. This would enable you to test if the value of any variable were a leap year and not just the variable year to which the IS_LEAP_YEAR define restricts you. Actually, you can write a definition to take one or more arguments, which leads to our next point of discussion.

Arguments and Macros

IS_LEAP_YEAR can be defined to take an argument called y as follows:

#define IS_LEAP_YEAR(y) y % 4 == 0 && y % 100 != 0 \
|| y % 400 == 0

Unlike a function, you do not define the type of the argument y here because you are merely performing a literal text substitution and not invoking a function.

Note that no spaces are permitted in the #define statement between the defined name and the left parenthesis of the argument list.

With the preceding definition, you can write a statement such as

if ( IS_LEAP_YEAR (year) )
...

to test whether the value of year is a leap year, or

if ( IS_LEAP_YEAR (next_year) )
...

to test whether the value of next_year is a leap year. In the preceding statement, the definition for IS_LEAP_YEAR would be directly substituted inside the if statement, with the argument next_year replacing y wherever it appeared in the definition. So the if statement would actually be seen by the compiler as

if ( next_year % 4 == 0 && next_year % 100 != 0
     || next_year % 400 == 0 )
...

In C, definitions are frequently called macros. This terminology is more often applied to definitions that take one or more arguments. An advantage of implementing something in C as a macro, as opposed to as a function, is that in a macro, the type of the argument is not important. For example, consider a macro called SQUARE that simply squares its argument. The definition

#define SQUARE(x) x * x

enables you to subsequently write statements, such as

y = SQUARE (v);

to assign the value of v² to y. The point to be made here is that v can be of type int, long, or float, for example, and the same macro can be used. If SQUARE were implemented as a function that took an int argument, for example, you couldn’t use it to calculate the square of a double value. One consideration about macro definitions, which might be relevant to your application: Because macros are directly substituted into the program by the preprocessor, a macro that is used in many places typically takes up more memory space than an equivalently defined function. On the other hand, because a function takes time to call and to return, this overhead is avoided when a macro definition is used instead.

Although the macro definition for SQUARE is straightforward, there is an interesting pitfall to avoid when defining macros. As has been described, the statement

y = SQUARE (v);

assigns the value of v² to y. What do you think would happen in the case of the statement

y = SQUARE (v + 1);

This statement does not assign the value of (v + 1)² to y as you would expect. Because the preprocessor performs a literal text substitution of the argument into the macro definition, the preceding expression would actually be evaluated as

y = v + 1 * v + 1;

which would obviously not produce the expected results. To handle this situation properly, parentheses are needed in the definition of the SQUARE macro:

#define SQUARE(x) ( (x) * (x) )

Even though the preceding definition might look strange, remember that it is the entire expression as given to the SQUARE macro that is literally substituted wherever x appears in the definition. With your new macro definition for SQUARE, the statement

y = SQUARE (v + 1);

is then correctly evaluated as

y = ( (v + 1) * (v + 1) );

The conditional expression operator can be particularly handy when defining macros. The following defines a macro called MAX that gives the maximum of two values:

#define MAX(a,b) ( ((a) > (b)) ? (a) : (b) )

This macro enables you to subsequently write statements such as

limit = MAX (x + y, minValue);

which would assign to limit the maximum of x + y and minValue. Parentheses were placed around the entire MAX definition to ensure that an expression such as

MAX (x, y) * 100

gets evaluated properly; and parentheses were individually placed around each argument to ensure that expressions such as

MAX (x & y, z)

get correctly evaluated. The bitwise AND operator has lower precedence than the > operator used in the macro. Without the parentheses in the macro definition, the > operator would be evaluated before the bitwise AND, producing the incorrect result.

The following macro tests if a character is a lowercase letter:

#define IS_LOWER_CASE(x) ( ((x) >= 'a') && ((x) <= 'z') )

and thereby permits expressions such as

if ( IS_LOWER_CASE (c) )
...

to be written. You can even use this macro in a subsequent macro definition to convert an ASCII character from lowercase to uppercase, leaving any nonlowercase character unchanged:

#define TO_UPPER(x) ( IS_LOWER_CASE (x) ? (x) - 'a' + 'A' : (x) )

The program loop

while ( *string != '\0' )
{
    *string = TO_UPPER (*string);
    ++string;
}

would sequence through the characters pointed to by string, converting any lowercase characters in the string to uppercase.⁵

5. There are a host of functions in the library for doing character tests and conversions. For example, islower and toupper serve the same purpose as the macros IS_LOWER_CASE and TO_UPPER. For more details, consult Appendix B, “The Standard C Library.”

Variable Number of Arguments to Macros

A macro can be defined to take an indeterminate or variable number of arguments. This is specified to the preprocessor by putting three dots at the end of the argument list. The remaining arguments in the list are collectively referenced in the macro definition by the special identifier __VA_ARGS__. As an example, the following defines a macro called debugPrintf to take a variable number of arguments:

#define debugPrintf(...) printf ("DEBUG: " __VA_ARGS__)

Legitimate macro uses would include

debugPrintf ("Hello world!\n");

as well as

debugPrintf ("i = %i, j = %i\n", i, j);

In the first case, the output would be

DEBUG: Hello world!

And in the second case, if i had the value 100 and j the value 200, the output would be

DEBUG: i = 100, j = 200

The printf() call in the first case gets expanded into

printf ("DEBUG: " "Hello world!\n");

by the preprocessor. The compiler then concatenates the adjacent character string constants together, so the final printf() call looks like this:

printf ("DEBUG: Hello world!\n");

The # Operator

If you place a # in front of a parameter in a macro definition, the preprocessor creates a constant string out of the macro argument when the macro is invoked. For example, the definition

#define str(x) # x

causes the subsequent invocation

str (testing)

to be expanded into

"testing"

by the preprocessor. The printf() call

printf (str (Programming in C is fun.\n));

is therefore equivalent to

printf ("Programming in C is fun.\n");

The preprocessor literally inserts double quotation marks around the actual macro argument. Any double quotation marks or backslashes in the argument are preserved by the preprocessor. So

str ("hello")

produces

"\"hello\""

A more practical example of the use of the # operator might be in the following macro definition:

#define printint(var) printf (# var " = %i\n", var)

This macro is used to display the value of an integer variable. If count is an integer variable with a value of 100, the statement

printint (count);

is expanded into

printf ("count" " = %i\n", count);

which, after string concatenation is performed on the two adjacent strings, becomes

printf ("count = %i\n", count);

So the # operator gives you a means to create a character string out of a macro argument. Incidentally, a space between the # and the parameter name is optional.

The ## Operator

This operator is used in macro definitions to join two tokens together. It is preceded (or followed) by the name of a parameter to the macro. The preprocessor takes the actual argument to the macro that is supplied when the macro is invoked and creates a single token out of that argument and whatever token follows (or precedes) the ##.

Suppose, for example, you have a list of variables x1 through x100. You can write a macro called printx that simply takes as its argument an integer value 1 through 100 and that displays the corresponding x variable as shown:

#define printx(n) printf ("%i\n", x ## n)

The portion of the define that reads

x ## n

says to take the tokens that occur before and after the ## (the letter x and the argument n, respectively) and make a single token out of them. So the call

printx (20);

is expanded into

printf ("%i\n", x20);

The printx macro can even use the previously defined printint macro to get the variable name as well as its value displayed:

#define printx(n) printint(x ## n)

The invocation

printx (10);

first expands into

printint (x10);

and then into

printf ("x10" " = %i\n", x10);

and finally into

printf ("x10 = %i\n", x10);

The #include Statement

After you have programmed in C for a while, you will find yourself developing your own set of macros and functions that you will want to use in each of your programs. But instead of having to type these macros into each new program you write, the preprocessor enables you to collect all your definitions into a separate file and then include the file and all its macros and user-defined functions in your program, using the #include statement. These files normally end with the characters .h and are referred to as header or include files.

Suppose you are writing a series of programs for performing various metric conversions. You might want to set up some defines for all of the constants that you need to perform your conversions:

#define INCHES_PER_CENTIMETER 0.394
#define CENTIMETERS_PER_INCH (1 / INCHES_PER_CENTIMETER)

#define QUARTS_PER_LITER 1.057
#define LITERS_PER_QUART (1 / QUARTS_PER_LITER)

#define OUNCES_PER_GRAM 0.035
#define GRAMS_PER_OUNCE (1 / OUNCES_PER_GRAM)
...

Suppose you entered the previous definitions into a separate file on the system called metric.h. Any program that subsequently needed to use any of the definitions contained in the metric.h file could then do so by simply issuing the preprocessor directive

#include "metric.h"

This statement must appear before any of the defines contained in metric.h are referenced and is typically placed at the beginning of the source file. The preprocessor looks for the specified file on the system and effectively copies the contents of the file into the program at the precise point that the #include statement appears. So, any statements inside the file are treated just as if they had been directly typed into the program at that point.

The double quotation marks around the include filename instruct the preprocessor to look for the specified file in one or more file directories (typically first in the same directory that contains the source file, but the actual places the preprocessor searches are system dependent). If the file isn’t located, the preprocessor automatically searches other system directories as described next.

Enclosing the filename within the characters < and > instead, as in

#include <stdio.h>

causes the preprocessor to look for the include file in the special system include file directory or directories. Once again, these directories are system dependent. On Unix systems (including Mac OS X systems), the system include file directory is typically /usr/include, so the standard header file stdio.h can be found in /usr/include/stdio.h.

To see how include files are used in an actual program example, type the six defines given previously into a file called metric.h. Then type in and run Program 12.3.

Program 12.3 Using the #include Statement


/* Program to illustrate the use of the #include statement
   Note: This program assumes that definitions are
   set up in a file called metric.h */

#include <stdio.h>
#include "metric.h"

int main (void)
{
    float liters, gallons;

    printf ("*** Liters to Gallons ***\n\n");
    printf ("Enter the number of liters: ");
    scanf ("%f", &liters);

    gallons = liters * QUARTS_PER_LITER / 4.0;
    printf ("%g liters = %g gallons\n", liters, gallons);

    return 0;
}


Program 12.3 Output


*** Liters to Gallons ***

Enter the number of liters: 55.75
55.75 liters = 14.7319 gallons


The preceding example is a rather simple one because it only shows a single defined value (QUARTS_PER_LITER) being referenced from the included file metric.h. Nevertheless, the point is well made: After the definitions have been entered into metric.h, they can be used in any program that uses an appropriate #include statement.

One of the nicest things about the include file capability is that it enables you to centralize your definitions, thus ensuring that all programs reference the same value. Furthermore, errors discovered in one of the values contained in the include file need only be corrected in that one spot, thus eliminating the need to correct each and every program that uses the value. Any program that references the incorrect value simply needs to be recompiled and does not have to be edited.

You can actually put anything you want in an include file—not just #define statements, as might have been implied. Using include files to centralize commonly used preprocessor definitions, structure definitions, prototype declarations, and global variable declarations is good programming technique.

One last point to be made about include files in this chapter: Include files can be nested. That is, an include file can itself include another file, and so on.

System Include Files

It was noted that the include file <stddef.h> contains a define for NULL, which is often used to test whether a pointer has a null value. Earlier in this chapter, it was also noted that on many systems the header file <math.h> contains the definition M_PI, which is set to an approximation for the value of π.

The <stdio.h> header file contains information about the I/O routines contained in the standard I/O library. This header file is described in more detail in Chapter 15, “Input and Output Operations in C.” You should include this file whenever you use any I/O library routine in your program.

Two other useful system include files are <limits.h> and <float.h>. The first file, <limits.h>, contains system-dependent values that specify the limits of the various character and integer data types. For instance, the maximum value of an int is defined by the name INT_MAX inside this file. The maximum value of an unsigned long int is defined by ULONG_MAX, and so on.

The <float.h> header file gives information about floating-point data types. For example, FLT_MAX specifies the largest value that can be stored in a float, and FLT_DIG specifies the number of decimal digits of precision for a float type.

Other system include files contain prototype declarations for various functions stored inside the system library. For example, the include file <string.h> contains prototype declarations for the library routines that perform character string operations, such as copying, comparing, and concatenating.

For more details on these header files, consult Appendix B.

Conditional Compilation

The C preprocessor offers a feature known as conditional compilation. Conditional compilation is often used to create one program that can be compiled to run on different computer systems. It is also often used to switch on or off various statements in the program, such as debugging statements that print out the values of various variables or trace the flow of program execution.

The #ifdef, #endif, #else, and #ifndef Statements

You were shown earlier in this chapter how you could make the rotate() function from Chapter 11 more portable. You saw there how the use of a #define would help in this regard. The definition

#define kIntSize 32

was used to isolate the dependency on the specific number of bits contained in an unsigned int. It was noted in several places that this dependency can be avoided altogether because the program can itself determine the number of bits stored inside an unsigned int.

Unfortunately, a program sometimes must rely on system-dependent parameters—on a filename, for example—that might be specified differently on different systems or on a particular feature of the operating system.

If you had a large program that had many such dependencies on the particular hardware and/or software of the computer system (and this should be minimized as much as possible), you might end up with many defines whose values would have to be changed when the program was moved to another computer system.

You can reduce the problem of having to change these defines when the program is moved by using the conditional compilation capabilities of the preprocessor, which let you incorporate the values of these defines for each different machine into the program itself. As a simple example, the statements

#ifdef UNIX
# define DATADIR "/uxn1/data"
#else
# define DATADIR "\\usr\\data"
#endif

have the effect of defining DATADIR to "/uxn1/data" if the symbol UNIX has been previously defined and to "\\usr\\data" otherwise. (Each backslash is doubled because a single backslash inside a character string begins an escape sequence.) As you can see here, you are allowed to put one or more spaces after the # that begins a preprocessor statement.

The #ifdef, #else, and #endif statements behave as you would expect. If the symbol specified on the #ifdef line has already been defined (through a #define statement or through the command line when the program is compiled), then the lines that follow up to a #else, #elif, or #endif are processed by the compiler; otherwise, they are ignored.

To define the symbol UNIX to the preprocessor, the statement

#define UNIX 1

or even just

#define UNIX

suffices. Most compilers also permit you to define a name to the preprocessor when the program is compiled by using a special option to the compiler command. The gcc command line

gcc -D UNIX program.c

defines the name UNIX to the preprocessor, causing all #ifdef UNIX statements inside program.c to evaluate as TRUE (note that the -D UNIX is normally typed before the program name on the command line). This technique enables names to be defined without having to edit the source program.

A value can also be assigned to the defined name on the command line. For example,

gcc -D GNUDIR=/c/gnustep program.c

invokes the gcc compiler, defining the name GNUDIR to be the text /c/gnustep.

Avoiding Multiple Inclusion of Header Files

The #ifndef statement follows along the same lines as the #ifdef. This statement is used the same way the #ifdef statement is used, except that it causes the subsequent lines to be processed if the indicated symbol is not defined. This statement is often used to avoid multiple inclusion of a file in a program. For example, inside a header file, if you want to make certain it is included only once in a program, you can define a unique identifier that can be tested later. Consider the sequence of statements:

#ifndef _MYSTDIO_H
#define _MYSTDIO_H
...
#endif /* _MYSTDIO_H */

Suppose you typed this into a file called mystdio.h. If you included this file in your program with a statement like this:

#include "mystdio.h"

the #ifndef inside the file would test whether _MYSTDIO_H were defined. Because it wouldn’t be, the lines between the #ifndef and the matching #endif would be included in the program. Presumably, this would contain all of the statements that you want included in your program from this header file. Notice that the very next line in the header file defines _MYSTDIO_H. If an attempt were made to again include the file in the program, _MYSTDIO_H would be defined, so the statements that followed (up to the #endif, which presumably is placed at the very end of your header file) would not be included in the program, thus avoiding multiple inclusion of the file in the program.

The method shown here is used in the system header files to avoid their multiple inclusion in your programs. Take a look at some and see!

The #if and #elif Preprocessor Statements

The #if preprocessor statement offers a more general way of controlling conditional compilation. The #if statement can be used to test whether a constant expression evaluates to nonzero. If the result of the expression is nonzero, subsequent lines up to a #else, #elif, or #endif are processed; otherwise, they are skipped. As an example of how this might be used, assume you define the name OS, which is set to 1 if the operating system is Macintosh OS, to 2 if the operating system is Windows, to 3 if the operating system is Linux, and so on. You could write a sequence of statements to conditionally compile statements based upon the value of OS as follows:

#if OS == 1 /* Mac OS */
...
#elif OS == 2 /* Windows */
...
#elif OS == 3 /* Linux */
...
#else
...
#endif

With most compilers, you can assign a value to the name OS on the command line using the -D option discussed earlier. The command line

gcc -D OS=2 program.c

compiles program.c with the name OS defined as 2. This causes the program to be compiled to run under Windows.

The special operator

defined (name)

can also be used in #if statements. The two sets of preprocessor statements

#if defined (DEBUG)
...
#endif

and

#ifdef DEBUG
...
#endif

do the same thing. The statements

#if defined (WINDOWS) || defined (WINDOWSNT)
# define BOOT_DRIVE "C:/"
#else
# define BOOT_DRIVE "D:/"
#endif

define BOOT_DRIVE as "C:/" if either WINDOWS or WINDOWSNT is defined and as "D:/" otherwise.

The #undef Statement

On some occasions, you might need to cause a defined name to become undefined. This is done with the #undef statement. To remove the definition of a particular name, you write

#undef name

So the statement

#undef LINUX

removes the definition of LINUX. Subsequent #ifdef LINUX or #if defined (LINUX) statements will evaluate as FALSE.

This concludes the discussion of the preprocessor. You have seen how the preprocessor can be used to make programs easier to read, write, and modify. You’ve also seen how you can use include files to group common definitions and declarations together into a file that can be shared among different files. Some other preprocessor statements that weren’t described here are described in Appendix A, “C Language Summary.”

In the next chapter, you’ll learn more about data types and type conversions. Before proceeding, try the following exercises.

Exercises

1. Type in and run the three programs presented in this chapter, remembering to type in the .h include file associated with Program 12.3. Compare the output produced by each program with the output presented in the text.

2. Locate the system header files <stdio.h>, <limits.h>, and <float.h> on your system (on Unix systems, look inside the /usr/include directory). Examine the files to see what’s in them.

3. Define a macro MIN that gives the minimum of two values. Then write a program to test the macro definition.

4. Define a macro MAX3 that gives the maximum of three values. Write a program to test the definition.

5. Write a macro SHIFT that serves the same purpose as the shift function of Program 11.3.

6. Write a macro IS_UPPER_CASE that gives a nonzero value if a character is an uppercase letter.

7. Write a macro IS_ALPHABETIC that gives a nonzero value if a character is an alphabetic character. Have the macro use the IS_LOWER_CASE macro defined in the chapter text and the IS_UPPER_CASE macro defined in exercise 6.

8. Write a macro IS_DIGIT that gives a nonzero value if a character is a digit '0' through '9'. Use this macro in the definition of another macro IS_SPECIAL, which gives a nonzero result if a character is a special character; that is, not alphabetic and not a digit. Be certain to use the IS_ALPHABETIC macro developed in exercise 7.

9. Write a macro ABSOLUTE_VALUE that computes the absolute value of its argument. Make certain that an expression such as

ABSOLUTE_VALUE (x + delta)

is properly evaluated by the macro.

10. Consider the definition of the printx macro from this chapter:

#define printx(n) printf ("%i\n", x ## n)

Could the following be used to display the values of the 100 variables x1–x100? Why or why not?

for (i = 1; i <= 100; ++i)
    printx (i);

11. Test the system library functions that are equivalent to the macros you developed in exercises 6, 7, and 8. The library functions are called isupper, isalpha, and isdigit. You need to include the system header file <ctype.h> in your program in order to use them.