Programming in C (Fourth Edition) (2015)
13. Extending Data Types with the Enumerated Data Type, Type Definitions, and Data Type Conversions
This chapter introduces you to a data type that has not yet been described: the enumerated data type. You also learn about the typedef statement, which enables you to assign your own names to basic data types or to derived data types. Finally, in this chapter you see the precise rules that are used by the compiler in the conversion of data types in an expression. Although the three topics covered in this chapter are diverse, understanding them is an important step toward using data effectively in your programs. The topics covered include:
Using enumerated data types
Creating your own labels of C’s existing data types with the typedef statement
Converting existing data types to others
Enumerated Data Types
Wouldn’t it be nice if you could define a variable and specify the valid values that could be stored into that variable? For example, suppose you had a variable called myColor and you wanted to use it to store only one of the primary colors, red, yellow, or blue, and no other values. This type of capability is provided by the enumerated data type.
An enumerated data type definition is initiated by the keyword enum. Immediately following this keyword is the name of the enumerated data type, followed by a list of identifiers (enclosed in a set of curly braces) that define the permissible values that can be assigned to the type. For example, the statement
enum primaryColor { red, yellow, blue };
defines a data type primaryColor. Variables declared to be of this data type can be assigned the values red, yellow, and blue inside the program, and no other values. That’s the theory anyway! An attempt to assign another value to such a variable causes some compilers to issue an error message. Other compilers simply don’t check.
To declare a variable to be of type enum primaryColor, you again use the keyword enum, followed by the enumerated type name, followed by the variable list. So the statement
enum primaryColor myColor, gregsColor;
defines the two variables myColor and gregsColor to be of type enum primaryColor. The only permissible values that can be assigned to these variables are the names red, yellow, and blue. So statements such as
myColor = red;
and
if ( gregsColor == yellow )
...
are valid. As another example of an enumerated data type definition, the following defines the type enum month, with permissible values that can be assigned to a variable of this type being the months of the year:
enum month { January, February, March, April, May, June,
             July, August, September, October, November, December };
The C compiler actually treats enumeration identifiers as integer constants. Beginning with the first name in the list, the compiler assigns sequential integer values to these names, starting with 0. If your program contains these two lines:
enum month thisMonth;
...
thisMonth = February;
the value 1 is assigned to thisMonth (and not the name February) because it is the second identifier listed inside the enumeration list.
If you want to have a specific integer value associated with an enumeration identifier, the integer can be assigned to the identifier when the data type is defined. Enumeration identifiers that subsequently appear in the list are assigned sequential integer values beginning with the specified integer value plus 1. For example, in the definition
enum direction { up, down, left = 10, right };
an enumerated data type direction is defined with the values up, down, left, and right. The compiler assigns the value 0 to up because it appears first in the list; 1 to down because it appears next; 10 to left because it is explicitly assigned this value; and 11 to right because it appears immediately after left in the list.
Program 13.1 shows a simple program using enumerated data types. The enumerated data type month sets January to 1 so that the month numbers 1 through 12 correspond to the enumeration values January, February, and so on. The program reads a month number and then enters a switch statement to see which month was entered. Recall that enumeration values are treated as integer constants by the compiler, so they’re valid case values. The variable days is assigned the number of days in the specified month, and its value is displayed after the switch is exited. A special test is included to see if the month is February.
Program 13.1 Using Enumerated Data Types
// Program to print the number of days in a month
#include <stdio.h>
int main (void)
{
    enum month { January = 1, February, March, April, May, June,
                 July, August, September, October, November, December };
    enum month  aMonth;
    int         days, monthNumber;
    printf ("Enter month number: ");
    // %i expects a pointer to int, so read into an int and then convert
    scanf ("%i", &monthNumber);
    aMonth = (enum month) monthNumber;
    switch (aMonth) {
        case January:
        case March:
        case May:
        case July:
        case August:
        case October:
        case December:
            days = 31;
            break;
        case April:
        case June:
        case September:
        case November:
            days = 30;
            break;
        case February:
            days = 28;
            break;
        default:
            printf ("bad month number\n");
            days = 0;
            break;
    }
    if ( days != 0 )
        printf ("Number of days is %i\n", days);
    if ( aMonth == February )
        printf ("...or 29 if it's a leap year\n");
    return 0;
}
Program 13.1 Output
Enter month number: 5
Number of days is 31
Program 13.1 Output (Rerun)
Enter month number: 2
Number of days is 28
...or 29 if it's a leap year
Enumeration identifiers can share the same value. For example, in
enum switchState { no=0, off=0, yes=1, on=1 };
assigning either the value no or off to an enum switchState variable assigns it the value 0; assigning either yes or on assigns it the value 1.
Explicitly assigning an integer value to an enumerated data type variable can be done with the type cast operator. So if monthValue is an integer variable that has the value 6, for example, the expression
thisMonth = (enum month) (monthValue - 1);
is permissible and assigns the value 5 to thisMonth.
When writing programs with enumerated data types, try not to rely on the fact that the enumerated values are treated as integers. Instead, try to treat them as distinct data types. The enumerated data type gives you a way to associate a symbolic name with an integer number. If you subsequently need to change the value of that number, you must change it only in the place where the enumeration is defined. If you make assumptions based on the actual value of the enumerated data type, you defeat this benefit of using an enumeration.
The variations permitted when defining an enumerated data type are similar to those permitted with structure definitions: The name of the data type can be omitted, and variables can be declared to be of the particular enumerated data type when the type is defined. As an example showing both of these options, the statement
enum { east, west, south, north } direction;
defines an (unnamed) enumerated data type with values east, west, south, or north, and declares a variable direction to be of that type.
Enumerated type definitions behave like structure and variable definitions as far as their scope is concerned: Defining an enumerated data type within a block limits the scope of that definition to the block. On the other hand, defining an enumerated data type at the beginning of the program, outside of any function, makes the definition global to the file.
When defining an enumerated data type, you must make certain that the enumeration identifiers are unique with respect to other variable names and enumeration identifiers defined within the same scope.
The typedef Statement
C provides a capability that enables you to assign an alternate name to a data type. This is done with a statement known as typedef. The statement
typedef int Counter;
defines the name Counter to be equivalent to the C data type int. Variables can subsequently be declared to be of type Counter, as in the following statement:
Counter j, n;
The C compiler treats the variables j and n, declared in the preceding code, as normal integer variables. The main advantage of the typedef in this case is the added readability that it lends to the definition of the variables. It is clear from the definition of j and n what the intended purpose of these variables is in the program. Declaring them to be of type int in the traditional fashion would not have made the intended use of these variables at all clear. Of course, choosing more meaningful variable names would have helped as well!
In many instances, a typedef statement can be replaced by an equivalent #define statement. For example, you could have instead used the statement
#define Counter int
to achieve the same results as the preceding statement. However, because the typedef is handled by the C compiler proper, and not by the preprocessor, the typedef statement provides more flexibility than does the #define when it comes to assigning names to derived data types. For example, the following typedef statement:
typedef char Linebuf [81];
defines a type called Linebuf, which is an array of 81 characters. Subsequently declaring variables to be of type Linebuf, as in
Linebuf text, inputLine;
has the effect of defining the variables text and inputLine to be arrays containing 81 characters. This is equivalent to the following declaration:
char text[81], inputLine[81];
Note that, in this case, Linebuf could not have been equivalently defined with a #define preprocessor statement.
The following typedef defines a type name StringPtr to be a char pointer:
typedef char *StringPtr;
Variables subsequently declared to be of type StringPtr, as in
StringPtr buffer;
are treated as character pointers by the C compiler.
To define a new type name with typedef, follow these steps:
1. Write the statement as if a variable of the desired type were being declared.
2. Where the name of the declared variable would normally appear, substitute the new type name.
3. In front of everything, place the keyword typedef.
As an example of this procedure, to define a type called Date to be a structure containing three integer members called month, day, and year, you write out the structure definition, substituting the name Date where the variable name would normally appear (before the last semicolon). Before everything, you place the keyword typedef:
typedef struct
{
    int  month;
    int  day;
    int  year;
} Date;
With this typedef in place, you can subsequently declare variables to be of type Date, as in
Date birthdays[100];
This defines birthdays to be an array containing 100 Date structures.
When working on programs in which the source code is contained in more than one file (as described in Chapter 14, “Working with Larger Programs”), it’s a good idea to place the common typedef statements into a separate file that can be included into each source file with an #include statement.
As another example, suppose you’re working on a graphics package that needs to deal with drawing lines, circles, and so on. You probably will be working very heavily with the coordinate system. Here’s a typedef statement that defines a type named Point, where a Point is a structure containing two float members x and y:
typedef struct
{
    float  x;
    float  y;
} Point;
You can now proceed to develop your graphics library, taking advantage of this Point type. For example, the declaration
Point origin = { 0.0, 0.0 }, currentPoint;
defines origin and currentPoint to be of type Point and sets the x and y members of origin to 0.0.
Here’s a function called distance that calculates the distance between two points.
#include <math.h>
double distance (Point p1, Point p2)
{
    double diffx, diffy;
    diffx = p1.x - p2.x;
    diffy = p1.y - p2.y;
    return sqrt (diffx * diffx + diffy * diffy);
}
As previously noted, sqrt is the square root function from the standard library. It is declared in the system header file math.h, which is why the #include is needed.
Remember, the typedef statement does not actually define a new type—only a new type name. So the Counter variables j and n, as defined in the beginning of this section, would in all respects be treated as normal int variables by the C compiler.
Data Type Conversions
Chapter 3, “Variables, Data Types, and Arithmetic Expressions,” briefly addressed the fact that sometimes conversions are implicitly made by the system when expressions are evaluated. The case you examined was with the data types float and int. You saw how an operation that involved a float and an int was carried out as a floating-point operation, the integer data item being automatically converted to floating point.
You have also seen how the type cast operator can be used to explicitly dictate a conversion. So in the statement
average = (float) total / n;
the value of the variable total is converted to type float before the operation is performed, thereby guaranteeing that the division will be carried out as a floating-point operation.
The C compiler adheres to strict rules when it comes to evaluating expressions that consist of different data types.
The following summarizes the order in which conversions take place in the evaluation of two operands in an expression:
1. If either operand is of type long double, the other is converted to long double, and that is the type of the result.
2. If either operand is of type double, the other is converted to double, and that is the type of the result.
3. If either operand is of type float, the other is converted to float, and that is the type of the result.
4. If either operand is of type _Bool, char, short int, bit field, or of an enumerated data type, it is converted to int.
5. If either operand is of type long long int, the other is converted to long long int, and that is the type of the result.
6. If either operand is of type long int, the other is converted to long int, and that is the type of the result.
7. If this step is reached, both operands are of type int, and that is the type of the result.
This is actually a simplified version of the steps that are involved in converting operands in an expression. The rules get more complicated when unsigned operands are involved. For the complete set of rules, refer to Appendix A, “C Language Summary.”
Realize from this series of steps that whenever you reach a step that says “that is the type of the result,” you’re done with the conversion process.
As an example of how to follow these steps, see how the following expression would be evaluated, where f is defined to be a float, i an int, l a long int, and s a short int variable:
f * i + l / s
Consider first the multiplication of f by i, which is the multiplication of a float by an int. From step 3, you find that, because f is of type float, the other operand, i, is also converted to type float, and that is the type of the result of the multiplication.
Next, the division of l by s occurs, which is the division of a long int by a short int. Step 4 tells you that the short int is promoted to an int. Continuing, you find from step 6 that because one of the operands (l) is a long int, the other operand is converted to a long int, which is also the type of the result. This division, therefore, produces a value of type long int, with any fractional part resulting from the division truncated.
Finally, step 3 indicates that if one of the operands in an expression is of type float (as is the result of multiplying f * i), the other operand is converted to type float, which is the type of the result. Therefore, after the division of l by s has been performed, the result of the operation is converted to type float and then added into the product of f and i. The final result of the preceding expression is, therefore, a value of type float.
Remember, the type cast operator can always be used to explicitly force conversions and thereby control the way that a particular expression is evaluated.
So, if you didn’t want the result of dividing l by s to be truncated in the preceding expression evaluation, you could have type cast one of the operands to type float, thereby forcing the evaluation to be performed as a floating-point division:
f * i + (float) l / s
In this expression, l would be converted to float before the division operation was performed, because the type cast operator has higher precedence than the division operator. Because one of the operands of the division would then be of type float, the other (s) would be automatically converted to type float, and that would be the type of the result.
Sign Extension
Whenever a signed int or signed short int is converted into an integer of a larger size, the sign is extended to the left when the conversion is performed. This ensures that a short int having a value of −5, for example, will also have the value −5 when converted to a long int. Whenever an unsigned integer is converted to an integer of a larger size, as you would expect, no sign extension occurs.
On some systems, characters are treated as signed quantities. This means that when a character is converted to an integer, sign extension occurs. As long as characters are used from the standard ASCII character set, this fact will never pose a problem. However, if a character value is used that is not part of the standard character set, its sign might be extended when converted to an integer. For example, on a Mac, the character constant '\377' is converted to the value −1 because its value is negative when treated as a signed, eight-bit quantity.
Recall that the C language permits character variables to be declared unsigned, thus avoiding this potential problem. That is, an unsigned char variable will never have its sign extended when converted to an integer; its value will always be greater than or equal to 0. For the typical eight-bit character, a signed character variable, therefore, has the range of values from −128 to +127, inclusive. An unsigned character variable can range in value from 0 to 255, inclusive.
If you want to force sign extension on your character variables, you can declare such variables to be of type signed char. This ensures that sign extension will occur when the character value is converted to an integer, even on machines that don’t do so by default.
Argument Conversion
You have used prototype declarations for all the functions that you have written in this book. In Chapter 7, “Working with Functions,” you learned this was prudent because you can physically locate the function either before or after its call, or even in another source file, with a prototype declaration. It was also noted that the compiler automatically converts your arguments to the appropriate types as long as it knows the types of arguments the function expects. The only way it can know this is by having previously encountered the actual function definition or a prototype declaration.
Recall that, if the compiler sees neither the function definition nor a prototype declaration before it encounters a call to a function, it assumes the function returns an int. The compiler also makes assumptions about its argument types. In the absence of information about the argument types to a function, the compiler automatically converts _Bool, char, or short arguments to ints and converts float arguments to double.
For example, assume that the compiler encounters in your program
float x;
...
y = absoluteValue (x);
Having not previously seen the definition of the absoluteValue function, and with no prototype declaration for it either, the compiler generates code to convert the value stored inside the float variable x to double and passes the result to the function. The compiler also assumes the function returns an int.
If the absoluteValue function is defined inside another source file like this:
float absoluteValue (float x)
{
    if ( x < 0.0 )
        x = -x;
    return x;
}
you’re in trouble. First, the function returns a float, yet the compiler thinks it returns an int. Second, the function expects to see a float argument, but you know the compiler will pass a double.
Remember, the bottom line here is that you should always include prototype declarations for the functions you use. This prevents the compiler from making mistaken assumptions about return types and argument types.
Now that you have learned more about data types, it’s time to learn about how to work with programs that can be split into multiple source files. Chapter 14 covers this topic in detail. Before you start that chapter, try the following exercises to make certain you understand the concepts you just learned.
Exercises
1. Define a type FunctionPtr() (using typedef) that represents a pointer to a function that returns an int and that takes no arguments. Refer to Chapter 10, “Pointers,” for the details on how to declare a variable of this type.
2. Write a function called monthName() that takes as its argument a value of type enum month (as defined in this chapter) and returns a pointer to a character string containing the name of the month. In this way, you can display the value of an enum month variable with a statement such as:
printf ("%s\n", monthName (aMonth));
3. Given the following variable declarations:
float f = 1.00;
short int i = 100;
long int l = 500L;
double d = 15.00;
and the seven steps outlined in this chapter for conversion of operands in expressions, determine the type and value of the following expressions:
f + i
l / d
i / l + f
l * i
f / 2
i / (d + f)
l / (i * 2.0)
l + i / (double) l