Dynamic Memory Management in C - Understanding and Using C Pointers (2013)

Chapter 2. Dynamic Memory Management in C

Much of the power of pointers stems from their ability to track dynamically allocated memory. The management of this memory through pointers forms the basis for many operations, including those used to manipulate complex data structures. To be able to fully exploit these capabilities, we need to understand how dynamic memory management occurs in C.

A C program executes within a runtime system. This is typically the environment provided by an operating system. The runtime system supports the stack and heap along with other program behavior.

Memory management is central to all programs. Sometimes memory is managed by the runtime system implicitly, such as when memory is allocated for automatic variables. In this case, variables are allocated to the enclosing function’s stack frame. In the case of static and global variables, memory is placed in the application’s data segment, where it is zeroed out. This is a separate area from executable code and other data managed by the runtime system.

The ability to allocate and then deallocate memory allows an application to manage its memory more efficiently and with greater flexibility. Instead of having to allocate memory to accommodate the largest possible size for a data structure, only the actual amount required needs to be allocated.

For example, arrays are fixed size in versions of C prior to C99. If we need to hold a variable number of elements, such as employee records, we would be forced to declare an array large enough to hold the maximum number of employees we believe would be needed. If we underestimate the size, we are forced to either recompile the application with a larger size or to take other approaches. If we overestimate the size, then we will waste space. The ability to dynamically allocate memory also helps when dealing with data structures using a variable number of elements, such as a linked list or a queue.

NOTE

C99 introduced Variable Length Arrays (VLAs). The array’s size is determined at runtime and not at compile time. However, once created, arrays still do not change size.

Languages such as C also support dynamic memory management where objects are allocated memory from the heap. This is done manually using functions to allocate and deallocate memory. The process is referred to as dynamic memory management.

We start this chapter with a quick overview of how memory is allocated and freed. Next, we present basic allocation functions such as malloc and realloc. The free function is discussed, including the use of NULL along with such problems as double free.

Dangling pointers are a common problem. We will present examples to illustrate when dangling pointers occur and techniques to handle the problem. The last section presents alternate techniques for managing memory. Improper use of pointers can result in unpredictable behavior. By this we mean the program can produce invalid results, corrupt data, or possibly terminate.

Dynamic Memory Allocation

The basic steps used for dynamic memory allocation in C are:

1. Use a malloc type function to allocate memory

2. Use this memory to support the application

3. Deallocate the memory using the free function

While there are some minor variations to this approach, this is the most common technique. In the following example, we allocate memory for an integer using the malloc function. The value five is assigned to the allocated memory through the pointer, and then the memory is released using the free function:

int *pi = (int*) malloc(sizeof(int));

*pi = 5;

printf("*pi: %d\n", *pi);

free(pi);

When this sequence is executed, it will display the number 5. Figure 2-1 illustrates how memory is allocated right before the free function is executed. For the purposes of this chapter, we will assume that the example code is found in the main function unless otherwise noted.

Figure 2-1. Allocating memory for an integer

The malloc function’s single argument specifies the number of bytes to allocate. If successful, it returns a pointer to memory allocated from the heap. If it fails, it returns a null pointer. Testing the validity of an allocated pointer is discussed in Using the malloc Function. The sizeof operator makes the application more portable and determines the correct number of bytes to allocate for the host system.

In this example, we are trying to allocate enough memory for an integer. If we assume its size is 4, we can use:

int *pi = (int*) malloc(4);

However, the size of an integer can vary, depending on the memory model used. A portable approach is to use the sizeof operator. This will return the correct size regardless of where the program is executing.

NOTE

A common error involving the dereference operator is demonstrated below:

int *pi;

*pi = (int*) malloc(sizeof(int));

The problem is with the lefthand side of the assignment operation. We are dereferencing the pointer. This will assign the address returned by malloc to the memory location whose address is stored in pi. If this is the first time an assignment is made to the pointer, then the address contained in the pointer is probably invalid. The correct approach is shown below:

pi = (int*) malloc(sizeof(int));

The dereference operator should not be used in this situation.

The free function, also discussed in more detail later, works in conjunction with malloc to deallocate the memory when it is no longer needed.

NOTE

Each time the malloc function (or similar function) is called, a corresponding call to the free function must be made when the application is done with the memory to avoid memory leaks.

Once memory has been freed, it should not be accessed again. Normally, you would not intentionally access it after it had been deallocated. However, this can occur accidentally, as illustrated in the section Dangling Pointers. The result of such an access is undefined. A common practice is to always assign NULL to a freed pointer, as discussed in Assigning NULL to a Freed Pointer.

When memory is allocated, additional information is stored as part of a data structure maintained by the heap manager. This information includes, among other things, the block’s size, and is typically placed immediately adjacent to the allocated block. If the application writes outside of this block of memory, then the data structure can be corrupted. This can lead to strange program behavior or corruption of the heap, as we will see in Chapter 7.

Consider the following code sequence. Memory is allocated for a string, allowing it to hold up to five characters plus the byte for the NUL termination character. The for loop writes zeros to each location but does not stop after writing six bytes. The for statement’s terminal condition requires that it write eight bytes. The zeros being written are binary zeros and not the ASCII value for the character zero:

char *pc = (char*) malloc(6);

for(int i=0; i<8; i++) {

pc[i] = 0;

}

In Figure 2-2, extra memory has been allocated at the end of the six-byte string. This represents the extra memory used by the heap manager to keep track of the memory allocation. If we write past the end of the string, this extra memory will be corrupted. The extra memory is shown following the string in this example. However, its actual placement and its original content depend on the compiler and its heap manager.

Figure 2-2. Extra memory used by heap manager
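The overrun above can be avoided by tying the loop limit to the size that was passed to malloc. The following is a minimal sketch under that assumption; the helper name make_zeroed_string is hypothetical and is not part of the original example:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `size` bytes and write binary zeros
   to exactly those bytes, never past the end of the block. */
char *make_zeroed_string(size_t size) {
    char *pc = malloc(size);
    if (pc == NULL) {
        return NULL;            /* allocation failed */
    }
    for (size_t i = 0; i < size; i++) {
        pc[i] = 0;              /* binary zero, not the character '0' */
    }
    return pc;                  /* the caller must free the block */
}
```

Because the same variable controls both the allocation and the loop, the two cannot drift apart the way the literals 6 and 8 did.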

Memory Leaks

A memory leak occurs when allocated memory is never used again but is not freed. This can happen when:

§ The memory’s address is lost

§ The free function is never invoked though it should be (sometimes called a hidden leak)

A problem with memory leaks is that the memory cannot be reclaimed and used later. The amount of memory available to the heap manager is decreased. If memory is repeatedly allocated and then lost, then the program may terminate when more memory is needed but malloc cannot allocate it because it ran out of memory. In extreme cases, the operating system may crash.

This is illustrated in the following simple example:

char *chunk;

while (1) {

chunk = (char*) malloc(1000000);

printf("Allocating\n");

}

The variable chunk is assigned memory from the heap. However, this memory is not freed before another block of memory is assigned to it. Eventually, the application will run out of memory and terminate abnormally. At minimum, memory is not being used efficiently.
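For contrast, here is a sketch of a loop that releases each block before allocating the next one. The function name allocate_and_release and its parameters are assumptions made for illustration only:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate and release `iterations` blocks of
   `blockSize` bytes. Returns how many allocations succeeded. */
int allocate_and_release(int iterations, size_t blockSize) {
    int succeeded = 0;
    for (int i = 0; i < iterations; i++) {
        char *chunk = malloc(blockSize);
        if (chunk == NULL) {
            break;              /* out of memory: stop instead of looping on */
        }
        chunk[0] = 'x';         /* stand-in for real work on the block */
        succeeded++;
        free(chunk);            /* release before the next allocation */
    }
    return succeeded;
}
```

Since at most one block is outstanding at any time, the heap manager can typically reuse the same memory on each pass.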

Losing the address

An example of losing the address of memory is illustrated in the following code sequence where pi is reassigned a new address. The address of the first allocation of memory is lost when pi is allocated memory a second time.

int *pi = (int*) malloc(sizeof(int));

*pi = 5;

...

pi = (int*) malloc(sizeof(int));

This is illustrated in Figure 2-3 where the before and after images refer to the program’s state before and after the second malloc’s execution. The memory at address 500 has not been released, and the program no longer holds this address anywhere.

Figure 2-3. Losing an address
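One way to avoid losing the first address is to free the old block before the pointer is overwritten. The sketch below assumes a hypothetical helper, replace_int, that receives the address of the pointer so it can update it:

```c
#include <stdlib.h>

/* Hypothetical helper: replace the block *ppi refers to with a fresh one
   holding `value`. The old block is freed first so its address is not lost.
   Returns 0 on success, or -1 if the new allocation fails, in which case
   the old block is kept unchanged. */
int replace_int(int **ppi, int value) {
    int *fresh = malloc(sizeof(int));
    if (fresh == NULL) {
        return -1;
    }
    *fresh = value;
    free(*ppi);                 /* free(NULL) does nothing */
    *ppi = fresh;
    return 0;
}
```

A caller would start with int *pi = NULL; and then call replace_int(&pi, 5); as often as needed, followed by a single free(pi); at the end.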

Another example allocates memory for a string, initializes it, and then displays the string character by character:

char *name = (char*)malloc(strlen("Susan")+1);

strcpy(name,"Susan");

while(*name != 0) {

printf("%c",*name);

name++;

}

However, it increments name by one with each loop iteration. At the end, name is left pointing to the string’s NUL termination character, as illustrated in Figure 2-4. The allocated memory’s starting address has been lost.

Figure 2-4. Losing address of dynamically allocated memory
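A common way around this problem is to walk the string with a second pointer and leave the original untouched. The following sketch assumes a hypothetical function, print_copy, and is one possible arrangement rather than the only one:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy `text` to the heap, print it character by
   character through a second pointer, then free the block.
   Returns the number of characters printed, or -1 if allocation fails. */
int print_copy(const char *text) {
    char *name = malloc(strlen(text) + 1);
    if (name == NULL) {
        return -1;
    }
    strcpy(name, text);

    int count = 0;
    for (char *cursor = name; *cursor != 0; cursor++) {
        printf("%c", *cursor);  /* only cursor moves; name is unchanged */
        count++;
    }
    printf("\n");

    free(name);                 /* name still holds the starting address */
    return count;
}
```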

Hidden memory leaks

Memory leaks can also occur when the program should release memory but does not. A hidden memory leak occurs when an object is kept in the heap even though the object is no longer needed. This is frequently the result of programmer oversight. The primary problem with this type of leak is that the object is using memory that is no longer needed and should be returned to the heap. In the worst case, the heap manager may not be able to allocate memory when requested, possibly forcing the program to terminate. At best, we are holding unneeded memory.

Memory leaks can also occur when freeing structures created using the struct keyword. If the structure contains pointers to dynamically allocated memory, then these pointers may need to be freed before the structure is freed. An example of this is found in Chapter 6.
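As a brief illustration of the order of deallocation, consider the sketch below. The Person structure and the functions person_create and person_destroy are hypothetical names chosen for this example and are not taken from Chapter 6:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure used only for illustration. */
typedef struct {
    char *name;                 /* dynamically allocated */
    int age;
} Person;

Person *person_create(const char *name, int age) {
    Person *p = malloc(sizeof(Person));
    if (p == NULL) {
        return NULL;
    }
    p->name = malloc(strlen(name) + 1);
    if (p->name == NULL) {
        free(p);                /* avoid leaking the structure itself */
        return NULL;
    }
    strcpy(p->name, name);
    p->age = age;
    return p;
}

void person_destroy(Person *p) {
    if (p == NULL) {
        return;
    }
    free(p->name);              /* free the member first ... */
    free(p);                    /* ... then the structure */
}
```

If person_destroy called only free(p), the block holding the name would remain allocated with no pointer left to reach it.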

Dynamic Memory Allocation Functions

Several memory allocation functions are available to manage dynamic memory. While what is available may be system dependent, the following functions are found on most systems in the stdlib.h header file:

§ malloc

§ realloc

§ calloc

§ free

The functions are summarized in Table 2-1.

Table 2-1. Dynamic memory allocation functions

Function | Description

malloc | Allocates memory from the heap

realloc | Reallocates memory to a larger or smaller amount based on a previously allocated block of memory

calloc | Allocates and zeros out memory from the heap

free | Returns a block of memory to the heap

Dynamic memory is allocated from the heap. With successive memory allocation calls, there is no guarantee regarding the order of the memory or the contiguity of memory allocated. However, the memory allocated will be suitably aligned for the pointer’s data type. For example, a four-byte integer would be allocated on an address boundary evenly divisible by four. The address returned by the heap manager will contain the lowest byte’s address.

In Figure 2-3, the malloc function allocates four bytes at address 500. The second use of the malloc function allocates memory at address 600. They both are on four-byte address boundaries, and they did not allocate memory from consecutive memory locations.
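The alignment of a returned address can be checked with a small test. This sketch assumes a C11 compiler (for _Alignof) and a platform that provides uintptr_t; the function name is_int_aligned is hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if `p` sits on a boundary suitable for an int, 0 otherwise.
   Converting a pointer to uintptr_t is implementation-defined, but on
   common platforms the result is the numeric address. */
int is_int_aligned(const void *p) {
    return ((uintptr_t)p % _Alignof(int)) == 0;
}
```

Calling is_int_aligned on the result of two successive malloc(sizeof(int)) calls should report 1 for both, even though the two addresses are not necessarily adjacent.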

Using the malloc Function

The function malloc allocates a block of memory from the heap. The number of bytes allocated is specified by its single argument. Its return type is a pointer to void. If memory is not available, NULL is returned. The function does not clear or otherwise modify the memory, thus the contents of memory should be treated as if it contained garbage. The function’s prototype follows:

void* malloc(size_t);

The function possesses a single argument of type size_t. This type is discussed in Chapter 1. You need to be careful when passing variables to this function, as problems can arise if the argument is a negative number. Because size_t is an unsigned type, a negative value is converted to a very large positive value. On some systems, a NULL value is returned if the argument is negative.
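The effect of a negative argument can be seen by converting it to size_t explicitly. The sketch below uses two hypothetical helpers: as_allocation_size shows the value malloc would receive, and alloc_bytes rejects non-positive counts before the conversion takes place:

```c
#include <stdint.h>
#include <stdlib.h>

/* Shows the value malloc would receive for a given int argument. */
size_t as_allocation_size(int count) {
    return (size_t)count;       /* -1 becomes SIZE_MAX */
}

/* Hypothetical wrapper: reject non-positive counts before they are
   converted to size_t. */
void *alloc_bytes(int count) {
    if (count <= 0) {
        return NULL;
    }
    return malloc((size_t)count);
}
```

A request for SIZE_MAX bytes will almost certainly fail, which is why some systems appear to return NULL for a negative argument.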

When malloc is used with an argument of zero, its behavior is implementation-specific. It may return NULL or it may return a pointer to a region with zero bytes allocated. If the malloc function is used with a NULL argument, then the compiler will normally generate a warning and the call will execute as a request for zero bytes.

The following shows a typical use of the malloc function:

int *pi = (int*) malloc(sizeof(int));

The following steps are performed when the malloc function is executed:

1. Memory is allocated from the heap

2. The memory is not modified or otherwise cleared

3. The first byte’s address is returned

NOTE

Since the malloc function may return a NULL value if it is unable to allocate memory, it is a good practice to check for a NULL value before using the pointer as follows:

int *pi = (int*) malloc(sizeof(int));

if(pi != NULL) {

// Pointer should be good

} else {

// Bad pointer

}

To cast or not to cast

Before the pointer to void was introduced to C, explicit casts were required with malloc to stop the generation of warnings when assignments were made between incompatible pointer types. Since a pointer to void can be assigned to any other pointer type, explicit casting is no longer required. Some developers consider explicit casts to be a good practice because:

§ They document the intention of the malloc function

§ They make the code compatible with C++ (or earlier C compilers), which require explicit casts

Using casts can be a problem if you fail to include the header file for malloc. In versions of C prior to C99, a function without a visible declaration is assumed to return an integer, so the compiler complains when you try to assign that integer to a pointer. An explicit cast may suppress this warning and hide the missing header.

Failing to allocate memory

If you declare a pointer but fail to assign it the address of allocated memory before using it, the pointer will usually contain garbage, typically resulting in an invalid memory reference. Consider the following code sequence:

int *pi;

...

printf("%d\n",*pi);

The allocation of memory is shown in Figure 2-5. This issue is covered in more detail in Chapter 7.

Figure 2-5. Failure to allocate memory

When executed, this can result in a runtime exception. This type of problem is common with strings, as shown below:

char *name;

printf("Enter a name: ");

scanf("%s",name);

While it may seem like this would execute correctly, we are using memory referenced by name. However, this memory has not been allocated. This problem can be illustrated graphically by changing the variable, pi, in Figure 2-5 to name.

Not using the right size for the malloc function

The malloc function allocates the number of bytes specified by its argument. You need to be careful when using the function to allocate the correct number of bytes. For example, if we want to allocate space for 10 doubles, then we need to allocate 80 bytes, assuming a double occupies eight bytes. With NUMBER_OF_DOUBLES set to 10, this is achieved as shown below:

double *pd = (double*)malloc(NUMBER_OF_DOUBLES * sizeof(double));

NOTE

Use the sizeof operator when specifying the number of bytes to allocate for data types whenever possible.

In the following example, an attempt is made to allocate memory for 10 doubles:

const int NUMBER_OF_DOUBLES = 10;

double *pd = (double*)malloc(NUMBER_OF_DOUBLES);

However, the code only allocated 10 bytes.
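One idiom that makes this mistake harder to commit is to take the element size from the pointer itself with sizeof *pd. The following sketch wraps the idiom in a hypothetical helper, alloc_doubles, and adds a guard against the multiplication wrapping around:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for `count` doubles.
   sizeof *pd follows the pointer's type, and the range check guards
   against count * sizeof *pd exceeding what size_t can hold. */
double *alloc_doubles(size_t count) {
    double *pd;
    if (count == 0 || count > SIZE_MAX / sizeof *pd) {
        return NULL;
    }
    pd = malloc(count * sizeof *pd);
    return pd;                  /* NULL if the allocation failed */
}
```

If the pointer's type is later changed from double to another type, the size expression follows along without further edits.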

Determining the amount of memory allocated

There is no standard way to determine the total amount of memory allocated by the heap. However, some compilers provide extensions for this purpose. In addition, there is no standard way of determining the size of a memory block allocated by the heap manager.

For example, if we allocate 64 bytes for a string, the heap manager will allocate additional memory to manage this block. The total size allocated is the sum of the 64 bytes requested and the amount used by the heap manager. This was illustrated in Figure 2-2.

The maximum size that can be allocated with malloc is system dependent. It would seem like this size should be limited by size_t. However, limitations can be imposed by the amount of physical memory present and other operating system constraints.

When malloc executes, it is supposed to allocate the amount of memory requested and then return the memory’s address. What happens if the underlying operating system uses “lazy initialization” where it does not actually allocate the memory until it is accessed? A problem can arise at this point if there is not enough memory available to allocate. The answer depends on the runtime and operating systems. A typical developer normally would not need to deal with this question because such initialization schemes are quite rare.

Using malloc with static and global pointers

You cannot use a function call when initializing a static or global variable. In the following code sequence, we declare a static variable and then attempt to initialize it using malloc:

static int *pi = malloc(sizeof(int));

This will generate a compile-time error message. The same thing happens with global variables. For static variables, the error can be avoided by using a separate statement to allocate memory to the variable, as shown below. We cannot use a separate assignment statement with global variables because global variables are declared outside of a function, and executable code, such as the assignment statement, must be inside of a function:

static int *pi;

pi = malloc(sizeof(int));

NOTE

From the compiler standpoint, there is a difference between using the initialization operator, =, and using the assignment operator, =.
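To see the separate-statement approach inside a complete function, consider this sketch. The function next_id and its counter are hypothetical; the static pointer is initialized with the constant NULL, and the call to malloc happens in an ordinary assignment the first time the function runs:

```c
#include <stdlib.h>

/* Hypothetical function: keeps a heap-allocated counter in a static
   pointer and returns the next value on each call, or -1 on failure. */
int next_id(void) {
    static int *counter = NULL; /* a constant initializer is allowed */
    if (counter == NULL) {
        counter = malloc(sizeof(int));  /* assignment, not initialization */
        if (counter == NULL) {
            return -1;          /* allocation failed */
        }
        *counter = 0;
    }
    return ++(*counter);
}
```

The block allocated here is intentionally kept for the life of the program, so this sketch never frees it.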

Using the calloc Function

The calloc function will allocate and clear memory at the same time. Its prototype follows:

void *calloc(size_t numElements, size_t elementSize);

NOTE

To clear memory means its contents are set to all binary zeros.

The function will allocate memory determined by the product of the numElements and elementSize parameters. A pointer is returned to the first byte of memory. If the function is unable to allocate memory, NULL is returned. Originally, this function was used to aid in the allocation of memory for arrays.

If either numElements or elementSize is zero, then a null pointer may be returned. If calloc is unable to allocate memory, a null pointer is returned and the global variable, errno, is set to ENOMEM (out of memory). This is a POSIX error code and may not be available on all systems.

Consider the following example where pi is allocated a total of 20 bytes, all containing zeros:

int *pi = calloc(5,sizeof(int));

Instead of using calloc, the malloc function along with the memset function can be used to achieve the same results, as shown below:

int *pi = malloc(5 * sizeof(int));

memset(pi, 0, 5* sizeof(int));

NOTE

The memset function will fill a block with a value. The first argument is a pointer to the buffer to fill. The second is the value used to fill the buffer, and the last argument is the number of bytes to be set.

Use calloc when memory needs to be zeroed out. However, the execution of calloc may take longer than using malloc.
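The equivalence of the two approaches can be checked by comparing the resulting blocks byte by byte. This is a sketch only; the function name zeroed_blocks_match is hypothetical:

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if calloc and malloc+memset produce identical contents for
   `count` ints, 0 if they differ, and -1 if an allocation fails. */
int zeroed_blocks_match(size_t count) {
    int *viaCalloc = calloc(count, sizeof(int));
    int *viaMalloc = malloc(count * sizeof(int));
    int result = -1;

    if (viaCalloc != NULL && viaMalloc != NULL) {
        memset(viaMalloc, 0, count * sizeof(int));
        result = memcmp(viaCalloc, viaMalloc, count * sizeof(int)) == 0;
    }
    free(viaCalloc);            /* free(NULL) is harmless */
    free(viaMalloc);
    return result;
}
```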

NOTE

The function cfree is no longer needed. In the early days of C it was used to free memory allocated by calloc.

Using the realloc Function

Periodically, it may be necessary to increase or decrease the amount of memory allocated to a pointer. This is particularly useful when a variable size array is needed, as will be demonstrated in Chapter 4. The realloc function will reallocate memory. Its prototype follows:

void *realloc(void *ptr, size_t size);

The function realloc returns a pointer to a block of memory. The function takes two arguments. The first is a pointer to the original block, and the second is the requested size. The reallocated block’s size will be different from the size of the block referenced by the first argument. The return value is a pointer to the reallocated memory.

The requested size may be smaller or larger than the currently allocated amount. If the size is less than what is currently allocated, then the excess memory is returned to the heap. There is no guarantee that the excess memory will be cleared. If the size is greater than what is currently allocated, then if possible, the memory will be allocated from the region immediately following the current allocation. Otherwise, memory is allocated from a different region of the heap and the old memory is copied to the new region.

If the size is zero and the pointer is not null, then the memory it references will be freed. If space cannot be allocated, then the original block of memory is retained and is not changed. However, the pointer returned is a null pointer and errno is set to ENOMEM.

The function’s behavior is summarized in Table 2-2.

Table 2-2. Behavior of realloc function

First Parameter | Second Parameter | Behavior

null | NA | Same as malloc

Not null | 0 | Original block is freed

Not null | Less than the original block’s size | A smaller block is allocated using the current block

Not null | Larger than the original block’s size | A larger block is allocated either from the current location or another region of the heap
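Because realloc returns a null pointer on failure while leaving the original block intact, assigning its result directly to the original pointer would lose the only reference to that block. A common precaution is to use a temporary pointer, as in this sketch (the helper name resize_ints is hypothetical):

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: resize the int array *pArray to newCount elements.
   On failure the original block is left untouched and is still owned by
   the caller. Returns 0 on success, -1 on failure. */
int resize_ints(int **pArray, size_t newCount) {
    if (newCount == 0 || newCount > SIZE_MAX / sizeof(int)) {
        return -1;              /* avoid a zero-size or wrapped request */
    }
    int *tmp = realloc(*pArray, newCount * sizeof(int));
    if (tmp == NULL) {
        return -1;              /* *pArray is still valid */
    }
    *pArray = tmp;              /* overwrite only after success */
    return 0;
}
```

Passing a pointer that holds NULL makes the first call behave like malloc, which matches the first row of the table.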

In the following example, we use two variables to allocate memory for a string. Initially, we allocate 16 bytes but only use the first 13 bytes (12 hexadecimal digits and the NUL termination character (0)):

char *string1;

char *string2;

string1 = (char*) malloc(16);

strcpy(string1, "0123456789AB");

Next, we use the realloc function to specify a smaller region of memory. The address and contents of these two variables are then displayed:

string2 = realloc(string1, 8);

printf("string1 Value: %p [%s]\n", string1, string1);

printf("string2 Value: %p [%s]\n", string2, string2);

The output follows:

string1 Value: 0x500 [0123456789AB]

string2 Value: 0x500 [0123456789AB]

The allocation of memory is illustrated in Figure 2-6.

Figure 2-6. realloc example

The heap manager was able to reuse the original block, and it did not modify its contents. However, the program continued to use more than the eight bytes requested. That is, we did not change the string to fit into the eight-byte block. In this example, we should have adjusted the length of the string so that it fits into the eight reallocated bytes. The simplest way of doing this is to assign a NUL character to address 507. Using more space than allocated is not a good practice and should be avoided, as detailed in Chapter 7.
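The adjustment described above can be sketched as a small helper that shrinks the block and then terminates the string inside it. The name shrink_string is hypothetical, and the sketch assumes the block was obtained from a malloc type function:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: shrink a heap-allocated string to newSize bytes
   and terminate it so that it fits. Returns the (possibly moved) block,
   or NULL if newSize is 0 or realloc fails; in both of those cases the
   original block is left untouched. */
char *shrink_string(char *str, size_t newSize) {
    if (newSize == 0) {
        return NULL;
    }
    char *tmp = realloc(str, newSize);
    if (tmp == NULL) {
        return NULL;
    }
    tmp[newSize - 1] = '\0';    /* last byte of the new block */
    return tmp;
}
```

Applied to the 16-byte block holding "0123456789AB" with a new size of 8, the result holds the seven characters "0123456" followed by the NUL character.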

In this next example, we will reallocate additional memory:

string1 = (char*) malloc(16);

strcpy(string1, "0123456789AB");

string2 = realloc(string1, 64);

printf("string1 Value: %p [%s]\n", string1, string1);

printf("string2 Value: %p [%s]\n", string2, string2);

When executed, you may get results similar to the following:

string1 Value: 0x500 [0123456789AB]

string2 Value: 0x600 [0123456789AB]

In this example, realloc had to allocate a new block of memory. Figure 2-7 illustrates the allocation of memory.

Figure 2-7. Allocating additional memory

The alloca Function and Variable Length Arrays

The alloca function (Microsoft’s malloca) allocates memory by placing it in the stack frame for the function. When the function returns, the memory is automatically freed. This function can be difficult to implement if the underlying runtime system is not stack-based. As a result, this function is nonstandard and should be avoided if the application needs to be portable.

In C99, Variable Length Arrays (VLAs) were introduced, allowing the declaration and creation of an array within a function whose size is based on a variable. In the following example, an array of char is allocated for use in a function:

void compute(int size) {

char buffer[size];

...

}

This means the allocation of memory is done at runtime and memory is allocated as part of the stack frame. Also, when the sizeof operator is used with the array, it will be executed at runtime rather than compile time.

A small runtime penalty will be imposed. Also, when the function exits, the memory is effectively deallocated. Since we did not use a malloc type function to create it, we should not use the free function to deallocate it. The function should not return a pointer to this memory either. This issue is addressed in Chapter 5.

NOTE

VLAs do not change size. Their size is fixed once they are allocated. If you need an array whose size actually changes, then an approach such as using the realloc function, as discussed in the section Using the realloc Function, is needed.
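The runtime evaluation of sizeof can be observed with a short function. This sketch assumes a compiler that supports VLAs (they are optional as of C11), and the function name vla_bytes is hypothetical:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function: creates a VLA of `size` chars in its stack
   frame and reports the array's size in bytes. */
size_t vla_bytes(int size) {
    if (size <= 0) {
        return 0;               /* a VLA must have a positive size */
    }
    char buffer[size];          /* sized at runtime */
    memset(buffer, 0, sizeof(buffer));
    return sizeof(buffer);      /* evaluated at runtime; buffer is
                                   released when the function returns */
}
```

Note that the function returns the size and not a pointer to buffer, since the array no longer exists once the function exits.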

Deallocating Memory Using the free Function

With dynamic memory allocation, the programmer is able to return memory when it is no longer being used, thus freeing it up for other uses. This is normally performed using the free function, whose prototype is shown below:

void free(void *ptr);

The pointer argument should contain the address of memory allocated by a malloc type function. This memory is returned to the heap. While the pointer may still point to the region, always assume it points to garbage. This region may be reallocated later and populated with different data.

In the simple example below, pi is allocated memory and is eventually freed:

int *pi = (int*) malloc(sizeof(int));

...

free(pi);

Figure 2-8 illustrates the allocation of memory immediately before and right after the free function executes. The dashed box at address 500 indicates that the memory has been freed but may still contain its value. The variable pi still contains the address 500. This is called a dangling pointer and is discussed in detail in the section Dangling Pointers.

Figure 2-8. Release of memory using free

If the free function is passed a null pointer, then it normally does nothing. If the pointer passed has been allocated by other than a malloc type function, then the function’s behavior is undefined. In the following example, pi is assigned the address of num. However, this is not a valid heap address:

int num;

int *pi = &num;

free(pi); // Undefined behavior

NOTE

Manage memory allocation/deallocation at the same level. For example, if a pointer is allocated within a function, deallocate it in the same function.

Assigning NULL to a Freed Pointer

Pointers can cause problems even after they have been freed. If we try to dereference a freed pointer, its behavior is undefined. As a result, some programmers will explicitly assign NULL to a pointer to designate the pointer as invalid. Subsequent use of such a pointer will typically result in a runtime exception.

An example of this approach follows:

int *pi = (int*) malloc(sizeof(int));

...

free(pi);

pi = NULL;

The allocation of memory is illustrated in Figure 2-9.

Figure 2-9. Assigning NULL after using free

This technique attempts to address problems like dangling pointers. However, it is better to spend time addressing the conditions that caused the problems rather than crudely catching them with a null pointer. In addition, you cannot assign NULL to a constant pointer except when it is initialized.

Double Free

The term double free refers to an attempt to free a block of memory twice. A simple example follows:

int *pi = (int*) malloc(sizeof(int));

*pi = 5;

free(pi);

...

free(pi);

The execution of the second free function will result in a runtime exception. A less obvious example involves the use of two pointers, both pointing to the same block of memory. As shown below, the same runtime exception will result when we accidentally try to free the same memory a second time:

int *p1 = (int*) malloc(sizeof(int));

int *p2 = p1;

free(p1);

...

free(p2);

This allocation of memory is illustrated in Figure 2-10.

NOTE

When two pointers reference the same location, it is referred to as aliasing. This concept is discussed in Chapter 8.

Figure 2-10. Double free

Unfortunately, heap managers have a difficult time determining whether a block has already been deallocated. Thus, they don’t attempt to detect the same memory being freed twice. This normally results in a corrupt heap and program termination. Even if the program does not terminate, it represents questionable program logic. There is no reason to free the same memory twice.

It has been suggested that the free function should assign a NULL or some other special value to its argument when it returns. However, since pointers are passed by value, the free function is unable to explicitly assign NULL to the pointer. This is explained in more detail in the section Passing a Pointer to a Pointer.
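One possible workaround, shown here only as a sketch, is a macro that operates on the caller's pointer variable directly. The macro name FREE_AND_NULL and the demonstration function are hypothetical and are not a technique taken from this chapter:

```c
#include <stdlib.h>

/* Hypothetical macro: frees the block and then sets the caller's own
   pointer variable to NULL. A macro works on the variable itself,
   whereas free only receives a copy of the pointer's value.
   Note that the argument is evaluated twice. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 1 if the pointer is NULL after the macro has been applied. */
int free_and_null_demo(void) {
    int *pi = malloc(sizeof(int));
    if (pi != NULL) {
        *pi = 5;
    }
    FREE_AND_NULL(pi);
    FREE_AND_NULL(pi);          /* harmless: this is now free(NULL) */
    return pi == NULL;
}
```

This guards against a second free through the same variable only. An alias such as p2 in the earlier example would still hold the old address.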

The Heap and System Memory

The heap typically uses operating system functions to manage its memory. The heap’s size may be fixed when the program is created, or it may be allowed to grow. However, the heap manager does not necessarily return memory to the operating system when the free function is called. The deallocated memory is simply made available for subsequent use by the application. Thus, when a program allocates and then frees up memory, the deallocation of memory is not normally reflected in the application’s memory usage as seen from the operating system perspective.

Freeing Memory upon Program Termination

The operating system is responsible for maintaining the resources of an application, including its memory. When an application terminates, it is the operating system’s responsibility to reallocate this memory for other applications. The state of the terminated application’s memory, corrupted or uncorrupted, is not an issue. In fact, one of the reasons an application may terminate is because its memory is corrupted. With an abnormal program termination, cleanup may not be possible. Thus, there is no reason to free allocated memory before the application terminates.

With this said, there may be other reasons why this memory should be freed. The conscientious programmer may want to free memory as a quality issue. It is always a good habit to free memory after it is no longer needed, even if the application is terminating. If you use a tool to detect memory leaks or similar problems, then deallocating memory will clean up the output of such tools. In some less complex operating systems, the operating system may not reclaim memory automatically, and it may be the program’s responsibility to reclaim memory before terminating. Also, a later version of the application could add code toward the end of the program. If the previous memory has not been freed, problems could arise.

Thus, ensuring that all memory is freed before program termination:

§ May be more trouble than it’s worth

§ Can be time consuming and complicated for the deallocation of complex structures

§ Can add to the application’s size

§ Results in longer running time

§ Introduces the opportunity for more programming errors

Whether memory should be deallocated prior to program termination is application-specific.

Dangling Pointers

If a pointer still references the original memory after it has been freed, it is called a dangling pointer. The pointer does not point to a valid object. This is sometimes referred to as a premature free.

The use of dangling pointers can result in a number of different types of problems, including:

§ Unpredictable behavior if the memory is accessed

§ Segmentation faults when the memory is no longer accessible

§ Potential security risks

These types of problems can result when:

§ Memory is accessed after it has been freed

§ A function returns a pointer to one of its automatic variables, which no longer exists once the function has returned (discussed in the section Pointers to Local Data)

Dangling Pointer Examples

Below is a simple example where we allocate memory for an integer using the malloc function. Next, the memory is released using the free function:

int *pi = (int*) malloc(sizeof(int));

*pi = 5;

printf("*pi: %d\n", *pi);

free(pi);

The variable pi will still hold the integer’s address. However, this memory may be reused by the heap manager and may hold data other than an integer. Figure 2-11 illustrates the program’s state immediately before and after the free function is executed. The pi variable is assumed to be part of the main function and is located at address 100. The memory allocated using malloc is found at address 500.

After the free function is executed, the memory at address 500 is deallocated and should not be used. However, most runtime systems will not prevent subsequent access or modification. We may still attempt to write to the location as shown below. The result of this action is unpredictable.

free(pi);

*pi = 10;

Figure 2-11. Dangling pointer

A more insidious example occurs when more than one pointer references the same area of memory and one of them is freed. As shown below, p1 and p2 both refer to the same area of memory, which is called pointer aliasing. However, p1 is freed:

int *p1 = (int*) malloc(sizeof(int));

*p1 = 5;

...

int *p2;

p2 = p1;

...

free(p1);

...

*p2 = 10; // Dangling pointer

Figure 2-12 illustrates the allocation of memory where the dotted box represents freed memory.

Figure 2-12. Dangling pointer with aliased pointers

A subtle problem can occur when using block statements, as shown below. Here pi is assigned the address of tmp. The variable pi may be a global variable or a local variable. However, when tmp’s enclosing block is popped off of the program stack, the address is no longer valid:

int *pi;

...

{

int tmp = 5;

pi = &tmp;

}

// pi is now a dangling pointer

foo();

Most compilers will treat a block statement as a stack frame. The variable tmp was allocated on the stack frame and subsequently popped off the stack when the block statement was exited. The pointer pi is now left pointing to a region of memory that may eventually be overwritten by a different activation record, such as that of the function foo. This condition is illustrated in Figure 2-13.

Figure 2-13. Block statement problem
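One way to avoid the block statement problem is to make sure the object a pointer refers to lives at least as long as the pointer is used. The sketch below assumes it is acceptable to declare tmp in the same scope as pi; the function name block_scope_fixed is hypothetical and used only for illustration.

```c
#include <assert.h>

/* Hypothetical rewrite of the block statement example: tmp is declared
   in the same scope as pi, so it is still alive when pi is dereferenced. */
int block_scope_fixed(void) {
    int tmp = 0;
    int *pi = &tmp;

    {
        tmp = 5;    /* the inner block only assigns; it owns no storage that pi refers to */
    }

    return *pi;     /* valid: tmp has not gone out of scope */
}
```

A caller could verify the behavior with assert(block_scope_fixed() == 5). If the value really must be created inside the inner block, copying the value out, rather than keeping its address, avoids the problem in the same way.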

Dealing with Dangling Pointers

Pointer-induced problems can be difficult to debug. Several approaches exist for dealing with dangling pointers, including:

§ Setting a pointer to NULL after freeing it. On most systems, a subsequent dereference of the pointer will terminate the application. However, problems can still persist if multiple copies of the pointer exist. This is because the assignment will only affect one of the copies, as illustrated in the section Double Free.

§ Writing special functions to replace the free function (see Writing your own free function).

§ Some systems (runtime/debugger) will overwrite data when it is freed (e.g., 0xDEADBEEF; Visual Studio will use 0xCC, 0xCD, or 0xDD, depending on what is freed). While no exceptions are thrown, a programmer who sees memory containing these values where they are not expected knows that the program may be accessing freed memory.

§ Use third-party tools to detect dangling pointers and other problems.
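The first approach in the list above can be sketched with a small macro. FREE_AND_NULL and demo_free_and_null are hypothetical names introduced for this illustration, and this is only one possible way to write such a helper. The sketch also shows the limitation noted above: clearing one pointer does nothing for its aliases.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper macro: frees the memory and then sets the
   pointer variable itself to NULL. */
#define FREE_AND_NULL(p) do { free(p); (p) = NULL; } while (0)

/* Returns 0 if both pointers end up NULL, 1 if not, and -1 if the
   allocation fails. */
int demo_free_and_null(void) {
    int *p1 = malloc(sizeof *p1);
    if (p1 == NULL) {
        return -1;
    }
    *p1 = 5;

    int *p2 = p1;           /* p2 is an alias of p1 */

    FREE_AND_NULL(p1);      /* p1 is now NULL */

    /* p2 still holds the old address and is dangling at this point.
       It must not be dereferenced; each alias has to be cleared by hand. */
    p2 = NULL;

    return (p1 == NULL && p2 == NULL) ? 0 : 1;
}
```

Because free(NULL) is defined to do nothing, applying the macro twice to the same variable is harmless, which also guards that variable against a double free.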

Displaying pointer values can be helpful in debugging dangling pointers, but you need to be careful how they are displayed. We have already discussed how to display pointer values in Displaying Pointer Values. Make sure you display them consistently to avoid confusion when comparing pointer values. The assert macro can also be useful, as demonstrated in Dealing with Uninitialized Pointers.

Debug Version Support for Detecting Memory Leaks

Microsoft provides techniques for addressing overwriting of dynamically allocated memory and memory leaks. This approach uses special memory management techniques in debug versions of a program to:

§ Check the heap’s integrity

§ Check for memory leaks

§ Simulate low heap memory situations

Microsoft does this by using a special data structure to manage memory allocation. This structure maintains debug information, such as the filename and line number where malloc is called. In addition, buffers are allocated before and after the actual memory allocation to detect overwriting of the actual memory. More information about this technique can be found at Microsoft Developer Network.
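The general idea of recording the allocation site and surrounding each block with guard buffers can be sketched in portable C. This is not Microsoft's implementation. The names dbg_malloc, dbg_check, dbg_free, DBG_MALLOC, GUARD_SIZE, and GUARD_BYTE are all hypothetical, the fill value is an arbitrary choice for this sketch, and a C11 compiler is assumed for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE sizeof(max_align_t)  /* keeps the user region aligned */
#define GUARD_BYTE 0xFD                 /* arbitrary fill value for this sketch */

/* Bookkeeping stored in front of each block. The union pads the header
   to the strictest alignment. */
typedef union dbg_header {
    struct {
        size_t size;        /* bytes requested by the caller */
        const char *file;   /* source file of the allocation */
        int line;           /* source line of the allocation */
    } info;
    max_align_t align;
} DbgHeader;

/* Layout: [header][front guard][user bytes][back guard].
   Overflow of the size computation is not checked in this sketch. */
void *dbg_malloc(size_t size, const char *file, int line) {
    unsigned char *raw = malloc(sizeof(DbgHeader) + 2 * GUARD_SIZE + size);
    if (raw == NULL) {
        return NULL;
    }
    DbgHeader *h = (DbgHeader *)raw;
    h->info.size = size;
    h->info.file = file;
    h->info.line = line;

    unsigned char *user = raw + sizeof(DbgHeader) + GUARD_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(user + size, GUARD_BYTE, GUARD_SIZE);
    return user;
}

/* Returns 1 if both guard regions are intact, 0 if either was overwritten. */
int dbg_check(const void *ptr) {
    const unsigned char *user = ptr;
    const DbgHeader *h =
        (const DbgHeader *)(user - GUARD_SIZE - sizeof(DbgHeader));
    const unsigned char *front = user - GUARD_SIZE;
    const unsigned char *back = user + h->info.size;

    for (size_t i = 0; i < GUARD_SIZE; i++) {
        if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}

/* Releases a block obtained from dbg_malloc. */
void dbg_free(void *ptr) {
    if (ptr != NULL) {
        free((unsigned char *)ptr - GUARD_SIZE - sizeof(DbgHeader));
    }
}

/* Captures the call site automatically. */
#define DBG_MALLOC(size) dbg_malloc((size), __FILE__, __LINE__)
```

A write one byte past the end of a block obtained with DBG_MALLOC lands in the back guard, so a later call to dbg_check on that block returns 0. A fuller version would also keep the headers in a list so that blocks still allocated at exit could be reported along with their file and line.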

The Mudflap libraries provide a similar capability for the GCC compiler. Their runtime library supports the detection of memory leaks, among other things. This detection is accomplished by instrumenting pointer dereferencing operations. Note that Mudflap support was removed from GCC as of version 4.9.

Dynamic Memory Allocation Technologies

So far, we have talked about the heap manager’s allocating and deallocating memory. However, the implementation of this technology can vary by compiler. Most heap managers use a heap or data segment as the source for memory. However, this approach is subject to fragmentation and may collide with the program stack. Nevertheless, it is the most common way of implementing the heap.

Heap managers need to address many issues, such as whether heaps are allocated on a per process and/or per thread basis and how to protect the heap from security breaches.

There are a number of heap managers, including OpenBSD’s malloc, Hoard’s malloc, and TCMalloc developed by Google. The GNU C library allocator is based on the general-purpose allocator dlmalloc. It provides facilities for debugging and can help in tracking memory leaks. Its logging feature tracks memory usage and memory transactions, among other actions.

A manual technique for managing the memory used for structures is presented in Avoiding malloc/free Overhead.

Garbage Collection in C

The malloc and free functions provide a way of manually allocating and deallocating memory. However, there are numerous issues regarding the use of manual memory management in C, such as performance, achieving good locality of reference, threading problems, and cleaning up memory gracefully.

Several nonstandard techniques can be used to address some of these issues, and this section explores some of them. A key feature of these techniques is the automatic deallocation of memory. When memory is no longer needed, it is collected and made available for use later in the program. Memory that is no longer needed is referred to as garbage. Hence, the term garbage collection denotes the processing of this memory.

Garbage collection is useful for a number of reasons, including:

§ Freeing the programmer from having to decide when to deallocate memory

§ Allowing the programmer to focus on the application’s problem

One alternative to manual memory management is the Boehm-Weiser Collector. However, this is not part of the language.

Resource Acquisition Is Initialization

Resource Acquisition Is Initialization (RAII) is a technique invented by Bjarne Stroustrup. It addresses the allocation and deallocation of resources in C++. The technique is useful for guaranteeing the allocation and subsequent deallocation of a resource in the presence of exceptions. Allocated resources will eventually be released.

There have been several approaches for using RAII in C. The GNU compiler provides a nonstandard extension to support this. We will illustrate this extension by showing how memory can be allocated and then freed within a function. When the variable goes out of scope, the deallocation process occurs automatically.

The GNU extension itself is the cleanup variable attribute. It is wrapped here in a macro called RAII_VARIABLE, which declares a variable and associates with the variable:

§ A type

§ An initial value, typically the result of a function executed when the variable is created

§ A function to execute when the variable goes out of scope

The macro is shown below:

#define RAII_VARIABLE(vartype,varname,initval,dtor) \
    void _dtor_ ## varname (vartype * v) { dtor(*v); } \
    vartype varname __attribute__((cleanup(_dtor_ ## varname))) = (initval)

In the following example, we declare a variable called name as a pointer to char. When it is created, the malloc function is executed, allocating 32 bytes for it. When the function returns, name goes out of scope and the free function is executed:

void raiiExample() {

RAII_VARIABLE(char*, name, (char*)malloc(32), free);

strcpy(name,"RAII Example");

printf("%s\n",name);

}

When this function is executed, the string “RAII Example” will be displayed.

Similar results can be achieved without using the GNU extension.
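One common portable idiom, which is a different technique from the macro above, is to route every exit path through a single cleanup label. The sketch below assumes this goto-based style is acceptable in the codebase; the function name print_copy is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical function: copies text into a heap buffer, prints it, and
   releases the buffer on every path through a single cleanup label.
   Returns 0 on success and -1 on failure. */
int print_copy(const char *text) {
    int status = -1;
    char *name = NULL;

    if (text == NULL) {
        goto cleanup;
    }

    name = malloc(strlen(text) + 1);
    if (name == NULL) {
        goto cleanup;
    }

    strcpy(name, text);
    printf("%s\n", name);
    status = 0;

cleanup:
    free(name);     /* free(NULL) does nothing, so early exits are safe */
    return status;
}
```

Unlike the cleanup attribute, nothing here is automatic: the deallocation runs only because every path has been written to reach the label. In exchange, the code compiles with any standard C compiler.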

Using Exception Handlers

Another approach for dealing with the deallocation of memory is to use exception handling. While exception handling is not a standard part of C, it can be useful if available and possible portability issues are not a concern. The following illustrates the approach using the Microsoft Visual Studio version of the C language.

Here the __try block encloses any statements that might cause an exception to be thrown at runtime. The __finally block will be executed regardless of whether an exception is thrown, so the free function is guaranteed to be executed.

void exceptionExample() {

int *pi = NULL;

__try {

pi = (int*)malloc(sizeof(int));

*pi = 5;

printf("%d\n",*pi);

}

__finally {

free(pi);

}

}

You can implement exception handling in C using several other approaches.
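One such approach uses the standard setjmp and longjmp functions to emulate a try/catch/finally structure. The following is a minimal single-level sketch, not a full exception mechanism. The names handler, store_positive, and setjmp_example are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

static jmp_buf handler;     /* hypothetical single-level "catch" point */

/* Hypothetical worker that signals failure by jumping back to the handler. */
static void store_positive(int *dest, int value) {
    if (value < 0) {
        longjmp(handler, 1);    /* plays the role of a throw */
    }
    *dest = value;
}

/* Returns 0 if the value was stored and printed, 1 if the "exception"
   fired, and -1 if the allocation failed. The memory is freed on every path. */
int setjmp_example(int value) {
    /* volatile because pi is modified after setjmp and read after longjmp */
    int *volatile pi = NULL;
    int status;

    if (setjmp(handler) == 0) {         /* "try" */
        pi = malloc(sizeof *pi);
        if (pi == NULL) {
            status = -1;
        } else {
            store_positive(pi, value);
            printf("%d\n", *pi);
            status = 0;
        }
    } else {                            /* "catch" */
        status = 1;
    }

    free(pi);                           /* "finally" */
    return status;
}
```

Calling setjmp_example(5) stores and prints the value, while setjmp_example(-1) takes the longjmp path; in both cases the allocated memory is released. A single static jmp_buf is not reentrant or thread-safe, so real code built on this idea would need a stack of handlers.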

Summary

Dynamic memory allocation is a significant C language feature. In this chapter, we focused on the manual allocation of memory using the malloc and free functions. We addressed a number of common problems involving these functions, including the failure to allocate memory and dangling pointers.

There are other nonstandard techniques for managing dynamic memory in C. We touched on a few of these, including garbage collection, RAII, and exception handling.