Input and Output Operations in C - Programming in C (Fourth Edition) (2015)

15. Input and Output Operations in C

All reading and writing of data up to this point has been done through your output window, otherwise known as the console or terminal. When you wanted to input some information, you used either the scanf() or the getchar() function. All program results were displayed in your window with a call to the printf() function.

The C language itself does not have any special statements for performing input/output (I/O) operations; all I/O operations in C must be carried out through function calls. These functions are contained in the standard C library. This chapter covers some additional input and output functions as well as how to work with files. Topics covered include

• Covering basic I/O with putchar() and getchar()

• Maximizing printf() and scanf() with flags and modifiers

• Redirecting input and output from files

• Using file functions and pointers

Recall the use of the following include statement from previous programs that used the printf() or scanf() function:

#include <stdio.h>

This include file contains function declarations and macro definitions associated with the I/O routines from the standard library. Therefore, whenever using a function from this library, you should include this file in your program.

In this chapter, you learn about many of the I/O functions that are provided in the standard library. Unfortunately, space does not permit lengthy details about these functions or discussions of each function that is offered. Refer to Appendix B, “The Standard C Library,” for a list of most of the functions in the library.

Character I/O: getchar() and putchar()

The getchar() function proved convenient when you wanted to read data a single character at a time. You saw how you could develop a function called readLine() to read an entire line of text that the user inputted. This function repeatedly called getchar() until a newline character was read.

There is an analogous function for writing data a single character at a time. The name of this function is putchar().

A call to the putchar() function is quite simple: The only argument it takes is the character to be displayed. So, the call

putchar (c);

in which c is defined as type char, has the effect of displaying the character contained in c.

The call

putchar ('\n');

has the effect of displaying the newline character, which, as you know, causes the cursor to move to the beginning of the next line.

Formatted I/O: printf() and scanf()

You have been using the printf() and scanf() functions throughout this book. In this section, you learn about all of the options that are available for formatting data with these functions.

The first argument to both printf() and scanf() is a character pointer. This points to the format string. The format string specifies how the remaining arguments to the function are to be displayed in the case of printf(), and how the data that is read is to be interpreted in the case of scanf().

The printf() Function

You have seen in various program examples how you could place certain characters between the % character and the specific so-called conversion character to more precisely control the formatting of the output. For example, you saw in Program 4.3A how an integer value before the conversion character could be used to specify a field width. The format characters %2i specified the display of an integer value right-justified in a field width of two columns. You also saw in exercise 6 in Chapter 4, “Program Looping,” how a minus sign could be used to left-justify a value in a field.

The general format of a printf() conversion specification is as follows:

%[flags][width][.prec][hlL]type

Optional fields are enclosed in brackets and must appear in the order shown.

Tables 15.1, 15.2, and 15.3 summarize all possible characters and values that can be placed directly after the % sign and before the type specification inside a format string.

Table 15.1 printf() Flags

Table 15.2 printf() Width and Precision Modifiers

Table 15.3 printf() Type Modifiers

Table 15.4 lists the conversion characters that can be specified in the format string.

Table 15.4 printf() Conversion Characters

Tables 15.1 to 15.4 might appear a bit overwhelming. As you can see, many different combinations can be used to precisely control the format of your output. The best way to become familiar with the various possibilities is through experimentation. Just make certain that the number of arguments you give to the printf() function matches the number of % signs in the format string (with %% as the exception, of course). And, in the case of using an * in place of an integer for the field width or precision modifiers, remember that printf() is expecting an argument for each asterisk as well.

Program 15.1 shows some of the formatting possibilities using printf().

Program 15.1 Illustrating the printf() Formats


// Program to illustrate various printf() formats
#include <stdio.h>

int main (void)
{
    char c = 'X';
    char s[] = "abcdefghijklmnopqrstuvwxyz";
    int i = 425;
    short int j = 17;
    unsigned int u = 0xf179U;
    long int l = 75000L;
    long long int L = 0x1234567812345678LL;
    float f = 12.978F;
    double d = -97.4583;
    char *cp = &c;
    int *ip = &i;
    int c1, c2;

    printf ("Integers:\n");
    printf ("%i %o %x %u\n", i, i, i, i);
    printf ("%x %X %#x %#X\n", i, i, i, i);
    printf ("%+i % i %07i %.7i\n", i, i, i, i);
    printf ("%i %o %x %u\n", j, j, j, j);
    printf ("%i %o %x %u\n", u, u, u, u);
    printf ("%ld %lo %lx %lu\n", l, l, l, l);
    printf ("%lli %llo %llx %llu\n", L, L, L, L);

    printf ("\nFloats and Doubles:\n");
    printf ("%f %e %g\n", f, f, f);
    printf ("%.2f %.2e\n", f, f);
    printf ("%.0f %.0e\n", f, f);
    printf ("%7.2f %7.2e\n", f, f);
    printf ("%f %e %g\n", d, d, d);
    printf ("%.*f\n", 3, d);
    printf ("%*.*f\n", 8, 2, d);

    printf ("\nCharacters:\n");
    printf ("%c\n", c);
    printf ("%3c%3c\n", c, c);
    printf ("%x\n", c);

    printf ("\nStrings:\n");
    printf ("%s\n", s);
    printf ("%.5s\n", s);
    printf ("%30s\n", s);
    printf ("%20.5s\n", s);
    printf ("%-20.5s\n", s);

    printf ("\nPointers:\n");
    // %p expects a pointer to void, so the pointers are cast
    printf ("%p %p\n\n", (void *) ip, (void *) cp);

    printf ("This%n is fun.%n\n", &c1, &c2);
    printf ("c1 = %i, c2 = %i\n", c1, c2);

    return 0;
}


Program 15.1 Output


Integers:
425 651 1a9 425
1a9 1A9 0x1a9 0X1A9
+425  425 0000425 0000425
17 21 11 17
61817 170571 f179 61817
75000 222370 124f8 75000
1311768465173141112 110642547402215053170 1234567812345678 1311768465173141112

Floats and Doubles:
12.978000 1.297800e+01 12.978
12.98 1.30e+01
13 1e+01
  12.98 1.30e+01
-97.458300 -9.745830e+01 -97.4583
-97.458
  -97.46

Characters:
X
  X  X
58

Strings:
abcdefghijklmnopqrstuvwxyz
abcde
    abcdefghijklmnopqrstuvwxyz
               abcde
abcde

Pointers:
0xbffffc20 0xbffffbf0

This is fun.
c1 = 4, c2 = 12


It’s worthwhile to take some time to explain the output in detail. The first set of output deals with the display of integers: short, long, unsigned, and “normal” ints. The first line displays i in decimal (%i), octal (%o), hexadecimal (%x), and unsigned (%u) formats. Notice that octal numbers are not preceded by a leading 0 when they are displayed.

The next line of output displays the value of i again. First, i is displayed in hexadecimal notation using %x. The use of a capital X (%X) causes printf() to use uppercase letters A–F instead of lowercase letters when displaying numbers in hexadecimal. The # flag (%#x) causes a leading 0x to appear before the number and causes a leading 0X to appear when the capital X is used as the conversion character (%#X).

The fourth printf() call first uses the + flag to force a sign to appear, even if the value is positive (normally, no sign is displayed). Then, the space flag is used to force a leading space in front of a positive value. (Sometimes this is useful for aligning data that might be positive or negative; the positive values have a leading space; the negative ones have a minus sign.) Next, %07i is used to display the value of i right-justified within a field width of seven characters. The 0 flag specifies zero fill. Therefore, four leading zeroes are placed in front of the value of i, which is 425. The final conversion in this call, %.7i, is used to display the value of i using a minimum of seven digits. For a positive value such as this one, the net effect is the same as specifying %07i: Four leading zeroes are displayed, followed by the three-digit number 425.

The fifth printf() call displays the value of the short int variable j in various formats. Any integer format can be specified to display the value of a short int.

The next printf() call shows what happens when %i is used to display the value of an unsigned int. On the machine on which this program was run, the value assigned to u (61817) fits in a signed int, so %i displays it unchanged. On a machine where this value is larger than the maximum positive value that can be stored in a signed int (one with 16-bit ints, for example), it would be displayed as a negative number when the %i format characters are used.

The next-to-last printf() call in this set shows how the l modifier is used to display long integers, and the final printf() call in the set shows how long long integers can be displayed.

The second set of output illustrates various formatting possibilities for displaying floats and doubles. The first output line of this set shows the result of displaying a float value using %f, %e, and %g formats. As mentioned, unless specified otherwise, the %f and %e formats default to six decimal places. With the %g format, printf() decides whether to display the value in either %e or %f format, depending upon the magnitude of the value and on the specified precision. If the exponent is less than −4 or greater than or equal to the optionally specified precision (remember, the default is 6), %e is used; otherwise, %f is used. In either case, trailing zeroes are automatically removed, and a decimal point is displayed only if nonzero digits follow it. In general, %g is the best format to use for displaying floating-point numbers in the most aesthetically pleasing format.

In the next line of output, the precision modifier .2 is specified to limit the display of f to two decimal places. As you can see, printf() is nice enough to automatically round the value of f for you. The line that immediately follows shows the use of the .0 precision modifier to suppress the display of any decimal places, including the decimal point, in the %f format. Once again, the value of f is automatically rounded.

The modifiers 7.2, as used for generating the next line of output, specify that the value is to be displayed in a minimum of seven columns, to two decimal places of accuracy. Because 12.98 needs only five columns, printf() right-justifies it (adding spaces on the left) within the specified field width. The %e result, 1.30e+01, already needs eight columns, so the field width has no effect on it.

In the next three lines of output, the value of the double variable d is displayed with various formats. The same format characters are used for the display of floats and double values, because, as you’ll once again recall, floats are automatically converted to doubles when passed as arguments to functions such as printf() that accept a variable number of arguments. The printf() call

printf ("%.*f\n", 3, d);

specifies that the value of d is to be displayed to three decimal places. The asterisk after the period in the format specification instructs printf() to take the next argument to the function as the value of the precision. In this case, the next argument is 3. This value could also have been specified by a variable, as in

printf ("%.*f\n", accuracy, d);

which makes this feature useful for dynamically changing the format of a display.

The final line of the floats and doubles set shows the result of using the format characters %*.*f for displaying the value of d. In this case, both the field width and the precision are given as arguments to the function, as indicated by the two asterisks in the format string. Because the first argument after the format string is 8, this is taken as the field width. The next argument, 2, is taken as the precision. The value of d is, therefore, displayed to two decimal places in a field size of eight characters. Notice that the minus sign as well as the decimal point are included in the field-width count. This is true for any field specifier.

In the next set of program output, the character c, which was initially set to the character X, is displayed in various formats. The first time it is displayed using the familiar %c format characters. On the next line, it is displayed twice with a field-width specification of 3. This results in the display of the character with two leading spaces.

A character can be displayed using any integer format specification. In the next line of output, the value of c is displayed in hexadecimal. The output indicates that on this machine the character X is internally represented by the number hexadecimal 58.

In the final set of program output, the character string s is displayed. The first time it is displayed with the normal %s format characters. Then, a precision specification of 5 is used to display just the first five characters from the string. This results in the display of the first five letters of the alphabet.

In the third output line from this set, the entire character string is once again displayed, this time using a field-width specification of 30. As you can see, the string is displayed right-justified in the field.

The final two lines from this set show five characters from the string s being displayed in a field-width size of 20. The first time, these five characters are displayed right-justified in the field. The second time, the minus sign results in the display of the first five letters left-justified in the field. The 15 trailing spaces that pad the field are not visible in the output; placing a visible character such as a vertical bar immediately after the conversion (%-20.5s|) is one way to verify that 20 characters are actually displayed (five letters followed by 15 spaces).

The %p characters are used to display the value of a pointer. Here, you are displaying the integer pointer ip and the character pointer cp. You should note that you will probably get different values displayed on your system because your pointers will most likely contain different addresses.

The format of the output when using %p is implementation-defined, but in this example, the pointers are displayed in hexadecimal format. According to the output, the pointer variable ip contained the address bffffc20 hexadecimal, and the pointer cp contained the address bffffbf0.

The final set of output shows the use of the %n format characters. In this case, the corresponding argument to printf() must be of type pointer to int, unless a type modifier of hh, h, l, ll, j, z, or t is specified. printf() actually stores the number of characters it has written so far into the integer pointed to by this argument. So, the first occurrence of %n causes printf() to store the value 4 inside the integer variable c1 because that’s how many characters have been written so far by this call. The second occurrence of %n causes the value 12 to be stored inside c2. This is because 12 characters had been displayed at that point by printf(). Notice that inclusion of the %n inside the format string has no effect on the actual output produced by printf().

The scanf() Function

Like the printf() function, many more formatting options can be specified inside the format string of a scanf() call than have been illustrated up to this point. As with printf(), scanf() takes optional modifiers between the % and the conversion character. These optional modifiers are summarized in Table 15.5. The possible conversion characters that can be specified are summarized in Table 15.6.

Table 15.5 scanf() Conversion Modifiers

Table 15.6 scanf() Conversion Characters

When the scanf() function searches the input stream for a value to be read, it always bypasses any leading so-called whitespace characters, where whitespace refers to either a blank space, horizontal tab ('\t'), vertical tab ('\v'), carriage return ('\r'), newline ('\n'), or form-feed character ('\f'). The exceptions are in the case of the %c format characters—in which case the next character from the input, no matter what it is, is read—and in the case of the bracketed character string read—in which case, the characters contained in the brackets (or not contained in the brackets) specify the permissible characters of the string.

When scanf() reads in a particular value, reading of the value terminates as soon as the number of characters specified by the field width is reached (if supplied) or until a character that is not valid for the value being read is encountered. In the case of integers, valid characters are an optionally signed sequence of digits that are valid for the base of the integer that is being read (decimal: 0–9, octal: 0–7, hexadecimal: 0–9, a–f, or A–F). For floats, permissible characters are an optionally signed sequence of decimal digits, followed by an optional decimal point and another sequence of decimal digits, all of which can be followed by the letter e (or E) and an optionally signed exponent. In the case of %a, a hexadecimal floating value can be supplied in the format of a leading 0x, followed by a sequence of hexadecimal digits with an optional decimal point, followed by an optional exponent preceded by the letter p (or P).

For character strings read with the %s format, any nonwhitespace character is valid. In the case of %c format, all characters are valid. Finally, in the case of the bracketed string read, valid characters are only those enclosed within the brackets (or not enclosed within the brackets if the ^ character is used after the open bracket).

Recall from Chapter 8, “Working with Structures,” when you wrote the programs that prompted the user to enter the time from the terminal, any nonformat characters that were specified in the format string of the scanf() call were expected on the input. So, for example, the scanf() call

scanf ("%i:%i:%i", &hour, &minutes, &seconds);

means that three integer values are to be read in and stored in the variables hour, minutes, and seconds, respectively. Inside the format string, the : character specifies that colons are expected as separators between the three integer values.

To specify that a percent sign is expected as input, double percent signs are included in the format string, as follows:

scanf ("%i%%", &percentage);

Whitespace characters inside a format string match an arbitrary number of whitespace characters on the input. So, the call

scanf ("%i%c", &i, &c);

with the line of text

29 w

assigns the value 29 to i and a space character to c because this is the character that appears immediately after the characters 29 on the input. If the following scanf() call is made instead:

scanf ("%i %c", &i, &c);

and the same line of text is entered, the value 29 is assigned to i and the character 'w' to c because the blank space in the format string causes the scanf() function to ignore any leading whitespace characters after the characters 29 have been read.

Table 15.5 indicates that an asterisk can be used to skip fields. If the scanf() call

scanf ("%i %5c %*f %s", &i1, text, string);

is executed and the following line of text is typed in:

144abcde 736.55 (wine and cheese)

the value 144 is stored in i1; the five characters abcde are stored in the character array text; the floating value 736.55 is matched but not assigned; and the character string "(wine" is stored in string, terminated by a null. The next call to scanf() picks up where the last one left off. So, a subsequent call such as

scanf ("%s %s %i", string2, string3, &i2);

has the effect of storing the character string "and" in string2 and the string "cheese)" in string3, and causes the function to wait for an integer value to be typed.

Remember that scanf() expects pointers to the variables where the values that are read in are to be stored. You know from Chapter 10, “Pointers,” why this is necessary: so that scanf() can make changes to the variables; that is, store the values that it reads into them. Remember also that to specify a pointer to an array, only the name of the array need be specified. So, if text is defined as an appropriately sized array of characters, the scanf() call

scanf ("%80c", text);

reads the next 80 characters from the input and stores them in text. Note that the %c format does not place a terminating null character at the end of the array.

The scanf() call

scanf ("%[^/]", text);

indicates that the string to be read can consist of any character except for a slash. Using the preceding call on the following line of text

(wine and cheese)/

has the effect of storing the string "(wine and cheese)" in text because the string is not terminated until the / is matched (which is also the character read by scanf() on the next call).

To read an entire line from the terminal into the character array buf, you can specify that the newline character at the end of the line is your string terminator:

scanf ("%[^\n]\n", buf);

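The newline character is repeated outside the brackets so that scanf() matches it and does not read it the next time it’s called. (Remember, scanf() always continues reading from the character that terminated its last call.) Two cautions apply: because whitespace in a format string matches any amount of whitespace, the trailing \n also skips any blank lines and leading spaces that follow; and because no field width is given, buf must be large enough to hold the longest line that might be entered.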

When a value is read that does not match a value expected by scanf() (for example, typing in the character x when an integer is expected), scanf() does not read any further items from the input and immediately returns. Because the function returns the number of items that were successfully read and assigned to variables in your program, this value can be tested to determine if any errors occurred on the input. For example, the call

if ( scanf ("%i %f %i", &i, &f, &l) != 3 )
    printf ("Error on input\n");

tests to make certain that scanf() successfully read and assigned three values. If not, an appropriate message is displayed.

Remember, the return value from scanf() indicates the number of values read and assigned, so the call

scanf ("%i %*d %i", &i1, &i3)

returns 2 when successful and not 3 because you are reading and assigning two integers (skipping one in between). Note also that the use of %n (to obtain the number of characters read so far) does not get included in the value returned by scanf().

Experiment with the various formatting options provided by the scanf() function. As with the printf() function, a good understanding of these various formats can be obtained only by trying them in actual program examples.

Input and Output Operations with Files

So far, when a call was made to the scanf() function by one of the programs in this book, the data that was requested by the call was always read in from keyboard input by your program’s user. Similarly, all calls to the printf() function resulted in the display of the desired information to your active window. To improve the utility of your programs, you need to be able to read data from, and write data to, files, which are covered in this section.

Redirecting I/O to a File

Both read and write file operations can be easily performed under many operating systems, including Windows, Linux, and Unix, without anything special being done at all to the program. Type Program 15.2, a very simple example that takes a number, then performs some very simple calculations on it.

Program 15.2 A Simple Example


// Taking a single number and outputting several calculations
#include <stdio.h>

int main (void)
{
    float d = 6.5;
    float half, square, cube;

    half = d/2;
    square = d*d;
    cube = d*d*d;

    printf("\nYour number is %.2f\n", d);
    printf("Half of it is %.2f\n", half);
    printf("Square it to get %.2f\n", square);
    printf("Cube it to get %.2f\n", cube);

    return 0;
}


Simple enough, but suppose you want to keep the results in a file. If you want to write the results from this program into a file called results.txt, for example, all that you need to do under Unix or Windows if running from a command prompt is to redirect the output from the program into the file results.txt by executing the program with the following command:

program1502 > results.txt

This command instructs the system to execute the program program1502 but to redirect the output normally written to the command prompt into a file called results.txt instead. So, any values displayed by printf() do not appear in your window but are instead written into the file called results.txt.

While Program 15.2 is interesting, it would be even more valuable to prompt the user to enter a number and then perform the calculations on the number and display the results. Program 15.3 shows that slightly tweaked program.

Program 15.3 A Simple, yet More Interactive, Example


// Inputting a single number and outputting several calculations
#include <stdio.h>

int main (void)
{
    float d;
    float half, square, cube;

    printf("Enter a number between 1 and 100: \n");
    scanf("%f", &d);
    half = d/2;
    square = d*d;
    cube = d*d*d;

    printf("\nYour number is %.2f\n", d);
    printf("Half of it is %.2f\n", half);
    printf("Square it to get %.2f\n", square);
    printf("Cube it to get %.2f\n", cube);

    return 0;
}


Now suppose you want to save the data from this program to a file, called results2.txt. You would type the following line at your command prompt:

program1503 > results2.txt

This time, it may look as though the program is hung up and not responding. It is not: the program is waiting for you to enter a number on which to perform the calculations, but you never see the prompt. This is the downside to redirecting output to a file in this manner. All output goes into the file, even the output of the printf() call used to prompt the user for data entry. If you examine the contents of results2.txt, you see the following, assuming you entered 6.5 as your number.

Enter a number between 1 and 100:

Your number is 6.50
Half of it is 3.25
Square it to get 42.25
Cube it to get 274.63

This verifies that the output from the program went into the file results2.txt as described previously. You might want to try running the program with different filenames and different numbers to see that it works repeatedly.

You can do a similar type of redirection for the input to your programs. Any call to a function that normally reads data from your window, such as scanf() and getchar(), can be easily made to read its information from a file. Create a file with just a single number in it (for this example, the filename is simp4.txt, and contains simply the number 4), and then run program1503 again, but use the following command line:

program1503 < simp4.txt

The following output appears at the terminal after this command is entered:

Enter a number between 1 and 100:

Your number is 4.00
Half of it is 2.00
Square it to get 16.00
Cube it to get 64.00

Notice that the program requested that a number be entered but did not wait for you to type in a number. This is because the input to program1503—but not its output—was redirected from the file called simp4.txt. Therefore, the scanf() call from the program had the effect of reading the value from the file simp4.txt and not from your command prompt. The information must be entered in the file the same way that it would be typed in. The scanf() function itself does not actually know (or care) whether its input is coming from your window or from a file; all it cares about is that it is properly formatted.

Naturally, you can redirect the input and the output to a program at the same time. The command

program1503 < simp4.txt > results3.txt

causes execution of the program contained in program1503 to read all program input from the file simp4.txt and to write all program results into the file results3.txt.

The method of redirecting the program’s input and/or its output is often practical. For example, suppose you are writing an article for a magazine and have typed the text into a file called article. Program 9.8 counted the number of words that appeared in lines of text entered at the terminal. You could use this very same program to count the number of words in your article simply by typing in the following command:¹

wordcount < article

1. Unix systems provide a wc command, which can also count words. Also, recall that this program was designed to work on text files, not word processing files, such as MS Word .doc files.

Of course, you have to remember to include an extra carriage return at the end of the article file because your program was designed to recognize an end-of-data condition by the presence of a single newline character on a line.

Note that I/O redirection, as described here, is not actually part of the ANSI definition of C. This means that you might find operating systems that don’t support it. Luckily, most do.

End of File

The preceding point about end of data is worthy of more discussion. When dealing with files, this condition is called end of file. An end-of-file condition exists when the final piece of data has been read from a file. Attempting to read past the end of the file might cause the program to terminate with an error, or it might cause the program to go into an infinite loop if this condition is not checked by the program. Luckily, most of the functions from the standard I/O library return a special flag to indicate when a program has reached the end of a file. The value of this flag is equal to a special name called EOF, which is defined in the standard I/O include file <stdio.h>.

As an example of the use of the EOF test in combination with the getchar() function, Program 15.4 reads in characters and echoes them back in the terminal window until an end of file is reached. Notice the expression contained inside the while loop. As you can see, an assignment does not have to be made in a separate statement.

Program 15.4 Copying Characters from Standard Input to Standard Output


// Program to echo characters until an end of file

#include <stdio.h>

int main (void)
{
    int c;

    while ( (c = getchar ()) != EOF )
        putchar (c);

    return 0;
}


If you compile and execute Program 15.4, redirecting the input to a file with a command such as

program1504 < infile

the program displays the contents of the file infile at the terminal. Try it and see! Actually, the program serves the same basic function as the cat command under Unix, and you can use it to display the contents of any text file you choose.

In the while loop of Program 15.4, the character that is returned by the getchar() function is assigned to the variable c and is then compared against the defined value EOF. If the values are equal, this means that you have read the final character from the file. One important point must be mentioned with respect to the EOF value that is returned by the getchar() function: The function actually returns an int and not a char. This is because the EOF value must be unique; that is, it cannot be equal to the value of any character that would normally be returned by getchar(). Therefore, the value returned by getchar() is assigned to an int and not a char variable in the preceding program. This works out okay because C allows you to store characters inside ints, even though, in general, it might not be the best of programming practices.

If you store the result of the getchar() function inside a char variable, the results are unpredictable. On systems that do sign extension of characters, the code might still work okay. On systems that don’t do sign extension, you might end up in an infinite loop.

The bottom line is to always remember to store the result of getchar() inside an int so that you can properly detect an end-of-file condition.

The fact that you can make an assignment inside the conditional expression of the while loop illustrates the flexibility that C provides in the formation of expressions. The parentheses are required around the assignment because the assignment operator has lower precedence than the not equals operator.

Special Functions for Working with Files

It is very likely that many of the programs you will develop will be able to perform all their I/O operations using just the getchar(), putchar(), scanf(), and printf() functions and the notion of I/O redirection. However, situations do arise when you need more flexibility to work with files. For example, you might need to read data from two or more different files or to write output results into several different files. To handle these situations, special functions have been designed expressly for working with files. Several of these functions are described in the following sections.

The fopen() Function

Before you can begin to do any I/O operations on a file, the file must first be opened. To open a file, you must specify the name of the file. The system then checks to make certain that this file actually exists and, in certain instances, creates the file for you if it does not. When a file is opened, you must also specify to the system the type of I/O operations that you intend to perform with the file. If the file is to be used to read in data, you normally open the file in read mode. If you want to write data into the file, you open the file in write mode. Finally, if you want to append information to the end of a file that already contains some data, you open the file in append mode. In the latter two cases, write and append mode, if the specified file does not exist on the system, the system creates the file for you. In the case of read mode, if the file does not exist, an error occurs.

Because a program can have many different files open at the same time, you need a way to identify a particular file in your program when you want to perform some I/O operation on the file. This is done by means of a file pointer.

The function called fopen() in the standard library serves the function of opening a file on the system and of returning a unique file pointer with which to subsequently identify the file. The function takes two arguments: The first is a character string specifying the name of the file to be opened; the second is also a character string that indicates the mode in which the file is to be opened. The function returns a file pointer that is used by other library functions to identify the particular file.

If the file cannot be opened for some reason, the function returns the value NULL, which is defined inside the header file <stdio.h> (see footnote 2). Also defined in this file is the definition of a type called FILE. To store the result returned by the fopen() function in your program, you must define a variable of type “pointer to FILE.”

2. NULL is defined in the header file <stddef.h>; the C standard also requires <stdio.h> and several other standard headers to define it.

If you take the preceding comments into account, the statements

#include <stdio.h>

FILE *inputFile;

inputFile = fopen ("data", "r");

have the effect of opening a file called data in read mode. (Write mode is specified by the string "w", and append mode is specified by the string "a".) The fopen() call returns an identifier for the opened file that is assigned to the FILE pointer variable inputFile. Subsequent testing of this variable against the defined value NULL, as in the following:

if ( inputFile == NULL )
    printf ("*** data could not be opened.\n");
else
    // read the data from the file

tells you whether the open was successful.

You should always check the result of an fopen() call to make certain it succeeds. Using a NULL pointer can produce unpredictable results.

Frequently, in the fopen() call, the assignment of the returned FILE pointer variable and the test against the NULL pointer are combined into a single statement, as follows:

if ( (inputFile = fopen ("data", "r")) == NULL )
    printf ("*** data could not be opened.\n");

The fopen() function also supports three other types of modes, called update modes ("r+", "w+", and "a+"). All three update modes permit both reading and writing operations to be performed on a file. Read update ("r+") opens an existing file for both reading and writing. Write update ("w+") is like write mode (if the file already exists, the contents are destroyed; if one doesn’t exist, it’s created), but once again both reading and writing are permitted. Append update ("a+") opens an existing file or creates a new one if one doesn’t exist. Read operations can occur anywhere in the file, but write operations can only add data to the end.
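As a rough illustration of write update mode, the following sketch writes one character to a file opened with "w+" and then reads it back through the same FILE pointer. The function name roundTripChar() is hypothetical. The sketch relies on putc(), getc(), and fclose(), which are described later in this chapter, and on the standard library function rewind(), which is not covered here. On an update stream, the C standard requires a repositioning call such as rewind() (or a flush) between a write and a following read.

```c
#include <stdio.h>

// Hypothetical helper: writes ch to the named file in write update
// mode, then reads it back. Returns the character read, or EOF if
// the file could not be opened.
int roundTripChar (const char *name, int ch)
{
    FILE  *fp;
    int   c;

    if ( (fp = fopen (name, "w+")) == NULL )
        return EOF;

    putc (ch, fp);      // write through the update stream
    rewind (fp);        // reposition to the start before reading
    c = getc (fp);      // read the same character back

    fclose (fp);
    return c;
}
```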

Under operating systems such as Windows, which distinguish text files from binary files, a b must be added to the end of the mode string to read or write a binary file. If you forget to do this, you will get strange results, even though your program will still run. This is because on these systems, carriage return/line feed character pairs are converted to single newline characters when they are read from text files, and newline characters are converted back to carriage return/line feed pairs when they are written to text files. Furthermore, on input, a file that contains a Ctrl+Z character causes an end-of-file condition if the file was not opened as a binary file. So,

inputFile = fopen ("data", "rb");

opens the binary file data for reading.
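A byte counter is one case where the distinction matters. The following is a minimal sketch; the function name countBytes() is hypothetical, and it uses getc() and fclose(), which are described later in this chapter. Because the file is opened with "rb", carriage returns and Ctrl+Z characters are counted like any other byte. On systems that do not distinguish text files from binary files, "rb" behaves the same as "r".

```c
#include <stdio.h>

// Hypothetical helper: returns the number of bytes in the named file,
// or -1 if the file cannot be opened.
long countBytes (const char *name)
{
    FILE  *fp;
    long  count = 0;

    if ( (fp = fopen (name, "rb")) == NULL )   // binary mode: no translation
        return -1;

    while ( getc (fp) != EOF )
        ++count;

    fclose (fp);
    return count;
}
```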

The getc() and putc() Functions

The function getc() enables you to read in a single character from a file. This function behaves identically to the getchar() function described previously. The only difference is that getc() takes an argument: a FILE pointer that identifies the file from which the character is to be read. So, if fopen() is called as shown previously, then subsequent execution of the statement

c = getc (inputFile);

has the effect of reading a single character from the file data. Subsequent characters can be read from the file simply by making additional calls to the getc() function.

The getc() function returns the value EOF when the end of file is reached, and as with the getchar() function, the value returned by getc() should be stored in a variable of type int.
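As a small example of a getc() loop, the following sketch counts the newline characters in a file. The function name countLines() is hypothetical, and the function uses fclose(), which is described shortly. Note that c is declared as an int so that EOF can be detected reliably.

```c
#include <stdio.h>

// Hypothetical helper: counts the newline characters in the named file.
// Returns -1 if the file cannot be opened for reading.
int countLines (const char *name)
{
    FILE  *fp;
    int   c, lines = 0;

    if ( (fp = fopen (name, "r")) == NULL )
        return -1;

    while ( (c = getc (fp)) != EOF )
        if ( c == '\n' )
            ++lines;

    fclose (fp);
    return lines;
}
```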

As you might have guessed, the putc() function is equivalent to the putchar() function, only it takes two arguments instead of one. The first argument to putc() is the character that is to be written into the file. The second argument is the FILE pointer. So the call

putc ('\n', outputFile);

writes a newline character into the file identified by the FILE pointer outputFile. Of course, the identified file must have been previously opened in either write or append mode (or in any of the update modes) for this call to succeed.
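To see how the mode affects what putc() does to an existing file, consider the following sketch. The function name writeText() is hypothetical; it takes the mode string as an argument so that the same code can be tried with "w" and with "a". It uses fclose(), which is described in the next section.

```c
#include <stdio.h>

// Hypothetical helper: opens the named file in the given mode and
// writes the characters of text to it, one putc() call at a time.
// Returns 0 on success, or -1 if the file could not be opened.
int writeText (const char *name, const char *mode, const char *text)
{
    FILE  *outputFile;

    if ( (outputFile = fopen (name, mode)) == NULL )
        return -1;

    while ( *text != '\0' )
        putc (*text++, outputFile);

    fclose (outputFile);
    return 0;
}
```

Calling writeText ("log", "w", "abc") followed by writeText ("log", "a", "de") leaves the five characters abcde in the file log. A later call with "w" discards them and leaves only the newly written text.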

The fclose() Function

One more operation that must be mentioned is that of closing a file. The fclose() function, in a sense, does the opposite of what fopen() does: It tells the system that you no longer need to access the file. When a file is closed, the system performs some necessary housekeeping chores (such as writing all the data that it might be keeping in a buffer in memory to the file) and then dissociates the particular file identifier from the file. After a file has been closed, it can no longer be read from or written to unless it is reopened.

When you have completed your operations on a file, it is a good habit to close the file. When a program terminates normally, the system automatically closes any open files for you, but it is generally better programming practice to close a file as soon as you are done with it. This matters most when your program deals with a large number of files, because systems place practical limits on the number of files that a program can have open simultaneously.

By the way, the argument to the fclose() function is the FILE pointer of the file to be closed. So, the call

fclose (inputFile);

closes the file associated with the FILE pointer inputFile.
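The habit of closing each file as soon as you are done with it can be sketched as follows. The function name totalChars() is hypothetical. It processes a list of files but never has more than one of them open at a time, so the system limit on open files is not a concern no matter how long the list is.

```c
#include <stdio.h>

// Hypothetical helper: returns the total number of characters in the
// n files whose names are given. Files that cannot be opened are skipped.
long totalChars (const char *names[], int n)
{
    FILE  *fp;
    long  total = 0;
    int   i;

    for ( i = 0; i < n; ++i ) {
        if ( (fp = fopen (names[i], "r")) == NULL )
            continue;               // skip this file and try the next one

        while ( getc (fp) != EOF )
            ++total;

        fclose (fp);                // close before opening the next file
    }

    return total;
}
```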

With the functions fopen(), putc(), getc(), and fclose(), you can now proceed to write a program that will copy one file to another. Program 15.5 prompts the user for the name of the file to be copied and the name of the resultant copied file. This program is based upon Program 15.4. You might want to refer to that program for comparison purposes.

Assume that the following three lines of text have been previously typed into the file copyme:

This is a test of the file copy program
that we have just developed using the
fopen, fclose, getc, and putc functions.

Program 15.5 Copying Files


// Program to copy one file to another

#include <stdio.h>

int main (void)
{
    char  inName[64], outName[64];
    FILE  *in, *out;
    int   c;

    // get file names from user

    printf ("Enter name of file to be copied: ");
    scanf ("%63s", inName);
    printf ("Enter name of output file: ");
    scanf ("%63s", outName);

    // open input and output files

    if ( (in = fopen (inName, "r")) == NULL ) {
        printf ("Can't open %s for reading.\n", inName);
        return 1;
    }

    if ( (out = fopen (outName, "w")) == NULL ) {
        printf ("Can't open %s for writing.\n", outName);
        return 2;
    }

    // copy in to out

    while ( (c = getc (in)) != EOF )
        putc (c, out);

    // Close open files

    fclose (in);
    fclose (out);

    printf ("File has been copied.\n");

    return 0;
}


Program 15.5 Output


Enter name of file to be copied: copyme
Enter name of output file: here
File has been copied.


Now examine the contents of the file here. The file should contain the same three lines of text as contained in the copyme file.

The scanf() function calls at the beginning of the program are given a field-width count of 63 just to ensure that you don’t overflow your inName or outName character arrays. The program then opens the specified input file for reading and the specified output file for writing. If the output file already exists and is opened in write mode, its previous contents are discarded.

If either of the two fopen() calls is unsuccessful, the program displays an appropriate message at the terminal and proceeds no further, returning a nonzero exit status to indicate the failure. Otherwise, if both opens succeed, the file is copied one character at a time by means of successive getc() and putc() calls until the end of the file is encountered. The program then closes the two files and returns a zero exit status to indicate success.

The feof() Function

To test for an end-of-file condition on a file, the function feof() is provided. The single argument to the function is a FILE pointer. The function returns an integer value that is nonzero if an attempt has been made to read past the end of a file, and is zero otherwise. So, the statements

if ( feof (inFile) ) {
    printf ("Ran out of data.\n");
    return 1;
}

have the effect of displaying the message “Ran out of data” at the terminal if an end-of-file condition exists on the file identified by inFile.

Remember, feof() tells you that an attempt has been made to read past the end of the file, which is not the same as telling you that you just read the last data item from a file. You have to read one past the last data item for feof() to return nonzero.
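This behavior can be checked with a small experiment. The following is a sketch only; the function name feofAfterReading() is hypothetical. It makes n calls to getc() on a file and then reports what feof() says.

```c
#include <stdio.h>

// Hypothetical helper: calls getc() n times on the named file and
// returns 1 if feof() then reports end of file, 0 if it does not,
// or -1 if the file cannot be opened.
int feofAfterReading (const char *name, int n)
{
    FILE  *inFile;
    int   i, atEnd;

    if ( (inFile = fopen (name, "r")) == NULL )
        return -1;

    for ( i = 0; i < n; ++i )
        getc (inFile);

    atEnd = feof (inFile) != 0;

    fclose (inFile);
    return atEnd;
}
```

For a file that contains exactly two characters, feofAfterReading() returns 0 when n is 2, because both reads succeeded. It returns 1 when n is 3, because the third getc() call attempted to read past the end of the file.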

The fprintf() and fscanf() Functions

The functions fprintf() and fscanf() are provided to perform the analogous operations of the printf() and scanf() functions on a file. These functions take an additional argument, which is the FILE pointer that identifies the file to which the data is to be written or from which the data is to be read. So, to write the character string "Programming in C is fun.\n" into the file identified by outFile, you can write the following statement:

fprintf (outFile, "Programming in C is fun.\n");

Similarly, to read in the next floating-point value from the file identified by inFile into the variable fv, the statement

fscanf (inFile, "%f", &fv);

can be used. As with scanf(), fscanf() returns the number of arguments that are successfully read and assigned or the value EOF, if the end of the file is reached before any of the conversion specifications have been processed.
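The return value of fscanf() makes a convenient loop condition. The following sketch adds up all the floating-point values in a file; the function name sumFile() is hypothetical. The loop continues only while fscanf() reports that exactly one value was read and assigned, so it stops both at the end of the file and at the first item that is not a number.

```c
#include <stdio.h>

// Hypothetical helper: reads floating-point values from the named file
// and stores their sum in *total. Returns the number of values read,
// or -1 if the file cannot be opened.
int sumFile (const char *name, float *total)
{
    FILE   *inFile;
    float  fv;
    int    count = 0;

    *total = 0.0f;

    if ( (inFile = fopen (name, "r")) == NULL )
        return -1;

    while ( fscanf (inFile, "%f", &fv) == 1 ) {
        *total += fv;
        ++count;
    }

    fclose (inFile);
    return count;
}
```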

The fgets() and fputs() Functions

For reading and writing entire lines of data from and to a file, the fputs() and fgets() functions can be used. The fgets() function is called as follows:

fgets (buffer, n, filePtr);

buffer is a pointer to a character array where the line that is read in will be stored; n is an integer value that represents the maximum number of characters to be stored into buffer; and filePtr identifies the file from which the line is to be read.

The fgets() function reads characters from the specified file until a newline character has been read (which will get stored in the buffer) or until n-1 characters have been read, whichever occurs first. The function automatically places a null character after the last character in buffer. It returns the value of buffer (the first argument) if the read is successful, and the value NULL if an error occurs on the read or if an attempt is made to read past the end of the file.

The fgets() function can be combined with sscanf() (see Appendix B, “The Standard C Library”) to perform line-oriented reading in a more orderly and controlled fashion than by using scanf() alone.
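One way this combination might look is sketched here. The function name readPair() and the 81-character buffer size are assumptions made for the example. Each call consumes exactly one line, so a malformed line is reported and then left behind, rather than jamming later reads as it can with scanf().

```c
#include <stdio.h>

// Hypothetical helper: reads one line from filePtr and tries to convert
// two integers from it. Returns 1 if both were converted, 0 if the line
// was malformed, or EOF if no line could be read.
// Lines longer than 80 characters are processed in pieces.
int readPair (FILE *filePtr, int *first, int *second)
{
    char  buffer[81];

    if ( fgets (buffer, sizeof (buffer), filePtr) == NULL )
        return EOF;

    return sscanf (buffer, "%i %i", first, second) == 2;
}
```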

The fputs() function writes a line of characters to a specified file. The function is called as follows:

fputs (buffer, filePtr);

Characters stored in the array pointed to by buffer are written to the file identified by filePtr until the null character is reached. The terminating null character is not written to the file.
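Taken together, fgets() and fputs() allow a file to be copied a line at a time instead of a character at a time. The following is a minimal sketch; the function name copyLines() and the 81-character buffer are assumptions made for the example. Because fgets() keeps the newline character in the buffer, fputs() writes each line out exactly as it was read.

```c
#include <stdio.h>

// Hypothetical helper: copies in to out using fgets() and fputs().
// Returns the number of fgets() calls that succeeded, which equals the
// number of lines as long as no line exceeds 80 characters.
int copyLines (FILE *in, FILE *out)
{
    char  buffer[81];
    int   count = 0;

    while ( fgets (buffer, sizeof (buffer), in) != NULL ) {
        fputs (buffer, out);
        ++count;
    }

    return count;
}
```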

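There are also analogous functions called gets() and puts() that can be used to read a line from the terminal and write a line to the terminal, respectively. These functions are described in Appendix B. Note that gets() has no way to limit the number of characters it stores and was removed from the language in the C11 standard; calling fgets() with stdin as its third argument is the safe replacement.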

stdin, stdout, and stderr

When a C program is executed, three files are automatically opened by the system for use by the program. These files are identified by the constant FILE pointers stdin, stdout, and stderr, which are defined in <stdio.h>. The FILE pointer stdin identifies the standard input of the program and is normally associated with your terminal window. All standard I/O functions that perform input and do not take a FILE pointer as an argument get their input from stdin. For example, the scanf() function reads its input from stdin, and a call to this function is equivalent to a call to the fscanf() function with stdin as the first argument. So, the call

fscanf (stdin, "%i", &i);

reads in the next integer value from the standard input, which is normally your terminal window. If the input to your program has been redirected to a file, this call reads the next integer value from the file to which the standard input has been redirected.

As you might have guessed, stdout refers to the standard output, which is normally also associated with your terminal window. So, a call such as

printf ("hello there.\n");

can be replaced by an equivalent call to the fprintf() function with stdout as the first argument:

fprintf (stdout, "hello there.\n");

The FILE pointer stderr identifies the standard error file. This is where most of the error messages produced by the system are written and is also normally associated with your terminal window. The reason stderr exists is so that error messages can be logged to a device or file other than where the normal output is written. This is particularly desirable when the program’s output is redirected to a file. In such a case, the normal output is written into the file, but any system error messages still appear in your window. You might want to write your own error messages to stderr for this same reason. As an example, the fprintf() call in the following statement:

if ( (inFile = fopen ("data", "r")) == NULL )
{
    fprintf (stderr, "Can't open data for reading.\n");
    ...
}

writes the indicated error message to stderr if the file data cannot be opened for reading. In addition, if the standard output has been redirected to a file, this message still appears in your window.
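Because stdout and stderr are ordinary FILE pointers, a function can accept its output destinations as arguments and leave the choice to the caller. The following sketch illustrates the idea; the function name printQuotient() and its parameters are hypothetical.

```c
#include <stdio.h>

// Hypothetical helper: writes the quotient a / b to out. If b is zero,
// writes a message to err instead. Returns 0 on success, -1 on error.
int printQuotient (FILE *out, FILE *err, int a, int b)
{
    if ( b == 0 ) {
        fprintf (err, "Can't divide %i by zero.\n", a);
        return -1;
    }

    fprintf (out, "%i\n", a / b);
    return 0;
}
```

A call such as printQuotient (stdout, stderr, 10, 2) sends the result to the standard output. If the program’s output has been redirected to a file, the results go into that file, while any message written through the err argument still appears in your window.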

The exit() Function

At times, you might want to force the termination of a program, such as when an error condition is detected by a program. You know that program execution is automatically terminated whenever the last statement in main() is executed or when executing a return from main(). To explicitly terminate a program, no matter from what point you are executing, the exit() function can be called. The function call

exit (n);

has the effect of terminating (exiting from) the current program. Any open files are automatically closed by the system. The integer value n is called the exit status, and has the same meaning as the value returned from main().

The standard header file <stdlib.h> defines EXIT_FAILURE as an integer value that you can use to indicate the program has failed and EXIT_SUCCESS to be one that you can use to indicate it has succeeded.

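When a program terminates simply by executing the last statement in main(), its exit status depends on the language standard in effect: under C99 and later, reaching the closing brace of main() is equivalent to returning 0, whereas under the older C89 standard the exit status is undefined. If another program needs to use this exit status and your code might be compiled under the older standard, make certain that you exit or return from main() with a defined exit status.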

As an example of the use of the exit() function, the following function causes the program to terminate with an exit status of EXIT_FAILURE if the file specified as its argument cannot be opened for reading. Naturally, you might want to return the fact that the open failed instead of taking such a drastic action by terminating the program.

#include <stdlib.h>
#include <stdio.h>

FILE *openFile (const char *file)
{
    FILE *inFile;

    if ( (inFile = fopen (file, "r")) == NULL ) {
        fprintf (stderr, "Can't open %s for reading.\n", file);
        exit (EXIT_FAILURE);
    }

    return inFile;
}

Remember that there’s no real difference between exiting and returning from main(). Both terminate the program and send back an exit status. The main difference between exit() and return shows up when they’re executed from inside a function other than main(): the exit() call terminates the program immediately, whereas return simply transfers control back to the calling routine.

Renaming and Removing Files

The rename() function from the library can be used to change the name of a file. It takes two arguments: the old filename and the new filename. If for some reason the renaming operation fails (for example, if the first file doesn’t exist, or the system doesn’t allow you to rename the particular file), rename() returns a nonzero value. The code

if ( rename ("tempfile", "database") ) {
    fprintf (stderr, "Can't rename tempfile\n");
    exit (EXIT_FAILURE);
}

renames the file called tempfile to database and checks the result of the operation to ensure it succeeded.

The remove() function deletes the file specified by its argument. It returns a nonzero value if the file removal fails. The code

if ( remove ("tempfile") )
{
    fprintf (stderr, "Can't remove tempfile\n");
    exit (EXIT_FAILURE);
}

attempts to remove the file tempfile and, if the removal fails, writes an error message to standard error and exits.

Incidentally, you might be interested in using the perror() function to report errors from standard library routines. For more details, consult Appendix B.
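As a brief sketch of how perror() might be used, the following function wraps a rename() call. The function name renameOrReport() is hypothetical. perror() writes its argument, a colon, and a system-specific description of the most recent library error to standard error; the exact wording of that description varies from system to system.

```c
#include <stdio.h>

// Hypothetical helper: renames a file and uses perror() to report
// a failure. Returns 0 on success, -1 on failure.
int renameOrReport (const char *oldName, const char *newName)
{
    if ( rename (oldName, newName) != 0 ) {
        perror ("rename");      // message text is system dependent
        return -1;
    }

    return 0;
}
```

Unlike the earlier examples, this function returns a status to its caller instead of calling exit(), which leaves the decision about whether to terminate the program to the calling routine.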

This concludes our discussion of I/O operations under C. As mentioned, not all of the library functions are covered here due to lack of space. The standard C library contains a wide selection of functions for performing operations with character strings, for random I/O, mathematical calculations, and dynamic memory management. Appendix B lists many of the functions inside this library.

Exercises

1. Type in and run the three programs presented in this chapter. Compare the output produced by each program with the output presented in the text.

2. Go back to programs developed earlier in this book and experiment with redirecting their input and output to files.

3. Write a program to copy one file to another, replacing all lowercase characters with their uppercase equivalents.

4. Write a program that merges lines alternately from two files and writes the results to stdout. If one file has fewer lines than the other, the remaining lines from the larger file should simply be copied to stdout.

5. Write a program that writes columns m through n of each line of a file to stdout. Have the program accept the values of m and n from the terminal window.

6. Write a program that displays the contents of a file at the terminal 20 lines at a time. At the end of each 20 lines, have the program wait for a character to be entered from the terminal. If the character is the letter q, the program should stop the display of the file; any other character should cause the next 20 lines from the file to be displayed.