Understanding Object Lifetime - Advanced C# Programming - C# 6.0 and the .NET 4.6 Framework (2015)

PART IV

Advanced C# Programming

CHAPTER 13

Understanding Object Lifetime

At this point in the book, you have learned a great deal about how to build custom class types using C#. Now you will see how the CLR manages allocated class instances (aka objects) via garbage collection. C# programmers never directly deallocate a managed object from memory (recall there is no delete keyword in the C# language). Rather, .NET objects are allocated to a region of memory termed the managed heap, where they will be automatically destroyed by the garbage collector “sometime in the future.”

After you have looked at the core details of the collection process, you’ll learn how to programmatically interact with the garbage collector using the System.GC class type (which is something you will typically not be required to do for a majority of your .NET projects). Next, you’ll examine how the virtual System.Object.Finalize() method and IDisposable interface can be used to build classes that release internal unmanaged resources in a predictable and timely manner.

You will also delve into some functionality of the garbage collector introduced in .NET 4.0, including background garbage collections and lazy instantiation using the generic System.Lazy<> class. By the time you have completed this chapter, you will have a solid understanding of how .NET objects are managed by the CLR.

Classes, Objects, and References

To frame the topics covered in this chapter, it is important to further clarify the distinction between classes, objects, and reference variables. Recall that a class is nothing more than a blueprint that describes how an instance of this type will look and feel in memory. Classes, of course, are defined within a code file (which in C# takes a *.cs extension by convention). Consider the following simple Car class defined within a new C# Console Application project named SimpleGC:

// Car.cs
public class Car
{
  public int CurrentSpeed {get; set;}
  public string PetName {get; set;}

  public Car(){}
  public Car(string name, int speed)
  {
    PetName = name;
    CurrentSpeed = speed;
  }
  public override string ToString()
  {
    return string.Format("{0} is going {1} MPH",
      PetName, CurrentSpeed);
  }
}

After a class has been defined, you may allocate any number of objects using the C# new keyword. Understand, however, that the new keyword returns a reference to the object on the heap, not the actual object. If you declare the reference variable as a local variable in a method scope, it is stored on the stack for further use in your application. When you want to invoke members on the object, apply the C# dot operator to the stored reference, like so:

using System;

class Program
{
  static void Main(string[] args)
  {
    Console.WriteLine("***** GC Basics *****");

    // Create a new Car object on
    // the managed heap. We are
    // returned a reference to this
    // object ("refToMyCar").
    Car refToMyCar = new Car("Zippy", 50);

    // The C# dot operator (.) is used
    // to invoke members on the object
    // using our reference variable.
    Console.WriteLine(refToMyCar.ToString());
    Console.ReadLine();
  }
}

Figure 13-1 illustrates the class, object, and reference relationship.

Figure 13-1. References to objects on the managed heap

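Note Recall from Chapter 4 that structures are value types. A structure declared as a local variable is typically allocated on the stack rather than on the .NET managed heap; it ends up on the heap only when it is boxed, stored in an array, or held as a field of a class instance. Otherwise, heap allocation occurs when you create instances of classes.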

The Basics of Object Lifetime

When you are building your C# applications, you are correct to assume that the .NET runtime environment (aka the CLR) will take care of the managed heap without your direct intervention. In fact, the golden rule of .NET memory management is simple.

Rule Allocate a class instance onto the managed heap using the new keyword and forget about it.

After an object has been instantiated, the garbage collector will destroy it when it is no longer needed. The next obvious question, of course, is, “How does the garbage collector determine when an object is no longer needed?” The short (i.e., incomplete) answer is that the garbage collector removes an object from the heap only if it is unreachable by any part of your code base. Assume you have a method in your Program class that allocates a local Car object as follows:

static void MakeACar()
{
  // If myCar is the only reference to the Car object,
  // it *may* be destroyed when this method returns.
  Car myCar = new Car();
}

Notice that this Car reference (myCar) has been created directly within the MakeACar() method and has not been passed outside of the defining scope (via a return value or ref/out parameters). Thus, once this method call completes, the myCar reference is no longer reachable, and the associated Car object is now a candidate for garbage collection. Understand, however, that you can’t guarantee that this object will be reclaimed from memory immediately after MakeACar() has completed. All you can assume at this point is that when the CLR performs the next garbage collection, the myCar object could be safely destroyed.
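The following is a minimal sketch of this idea, assuming a weak reference is an acceptable way to observe an object without keeping it reachable. The ReachabilityDemo class and the forced GC.Collect() call are illustrative additions (they are not part of the SimpleGC project), and the second result can vary with the build configuration and runtime version.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Car
{
    public string PetName { get; set; }
}

// Hypothetical demo class, not part of the SimpleGC project.
public static class ReachabilityDemo
{
    // NoInlining keeps myCar a true local of this method, so the
    // reference cannot linger in the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference MakeACar()
    {
        Car myCar = new Car { PetName = "Zippy" };

        // A WeakReference tracks the object without rooting it.
        return new WeakReference(myCar);
    }

    public static void Main()
    {
        WeakReference tracker = MakeACar();
        Console.WriteLine("Alive before collection: {0}", tracker.IsAlive);

        // Forced here only to make the effect observable.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        // Usually prints False, but the exact timing is up to the CLR.
        Console.WriteLine("Alive after collection: {0}", tracker.IsAlive);
    }
}
```

The point of the sketch is that nothing in Main() ever holds a strong reference to the Car, so the object is a candidate for collection as soon as MakeACar() returns.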

As you will most certainly discover, programming in a garbage-collected environment greatly simplifies your application development. In stark contrast, C++ programmers are painfully aware that if they fail to manually delete heap-allocated objects, memory leaks are never far behind. In fact, tracking down memory leaks is one of the most time-consuming (and tedious) aspects of programming in unmanaged environments. By allowing the garbage collector to take charge of destroying objects, the burden of memory management has been lifted from your shoulders and placed onto those of the CLR.

The CIL of new

When the C# compiler encounters the new keyword, it emits a CIL newobj instruction into the method implementation. If you compile the current example code and investigate the resulting assembly using ildasm.exe, you’d find the following CIL statements within the MakeACar() method:

.method private hidebysig static void MakeACar() cil managed
{
  // Code size 8 (0x8)
  .maxstack 1
  .locals init ([0] class SimpleGC.Car myCar)
  IL_0000: nop
  IL_0001: newobj instance void SimpleGC.Car::.ctor()
  IL_0006: stloc.0
  IL_0007: ret
} // end of method Program::MakeACar

Before you examine the exact rules that determine when an object is removed from the managed heap, let’s check out the role of the CIL newobj instruction in a bit more detail. First, understand that the managed heap is more than just a random chunk of memory accessed by the CLR. The .NET garbage collector is quite a tidy housekeeper of the heap, given that it will compact empty blocks of memory (when necessary) for the purposes of optimization.

To aid in this endeavor, the managed heap maintains a pointer (commonly referred to as the next object pointer or new object pointer) that identifies exactly where the next object will be located. That said, the newobj instruction tells the CLR to perform the following core operations:

1. Calculate the total amount of memory required for the object to be allocated (including the memory required by the data members and the base classes).

2. Examine the managed heap to ensure that there is indeed enough room to host the object to be allocated. If there is, the specified constructor is called, and the caller is ultimately returned a reference to the new object in memory, whose address just happens to be identical to the last position of the next object pointer.

3. Finally, before returning the reference to the caller, advance the next object pointer to point to the next available slot on the managed heap.
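The steps above can be sketched with a toy model. This is only an illustration under simplifying assumptions: ToyHeap is a hypothetical class, the "heap" is nothing more than a byte count, and the object size (step 1) is supplied by the caller rather than calculated from type metadata.

```csharp
using System;

// Hypothetical model: the next object pointer is just an offset.
// No real memory is managed here.
public class ToyHeap
{
    private readonly int _capacity;

    // Offset at which the next object would be placed.
    public int NextObjectPointer { get; private set; }

    public ToyHeap(int capacity)
    {
        _capacity = capacity;
    }

    // Returns the offset of the new object, or -1 if there is no room
    // (the point at which the CLR would attempt a garbage collection).
    public int Allocate(int objectSize)
    {
        // Step 2: is there enough room for the object?
        if (NextObjectPointer + objectSize > _capacity)
            return -1;

        // The object's address is the current pointer position.
        int address = NextObjectPointer;

        // Step 3: advance the pointer to the next free slot.
        NextObjectPointer += objectSize;
        return address;
    }
}

public static class ToyHeapDemo
{
    public static void Main()
    {
        ToyHeap heap = new ToyHeap(64);
        Console.WriteLine(heap.Allocate(24));        // 0
        Console.WriteLine(heap.Allocate(24));        // 24
        Console.WriteLine(heap.Allocate(24));        // -1 (48 + 24 > 64)
        Console.WriteLine(heap.NextObjectPointer);   // 48
    }
}
```

Because allocation is little more than a bounds check and a pointer bump, it is cheap as long as free space is contiguous, which is one reason the collector compacts the heap.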

Figure 13-2 illustrates the basic process.

Figure 13-2. The details of allocating objects onto the managed heap

As your application is busy allocating objects, the space on the managed heap may eventually become full. When processing the newobj instruction, if the CLR determines that the managed heap does not have sufficient memory to allocate the requested type, it will perform a garbage collection in an attempt to free up memory. Thus, the next rule of garbage collection is also quite simple.

Rule If the managed heap does not have sufficient memory to allocate a requested object, a garbage collection will occur.

Exactly how this garbage collection occurs, however, depends on which version of the .NET platform your application is running under. You’ll look at the differences a bit later in this chapter.

Setting Object References to null

C/C++ programmers often set pointer variables to null to ensure they are no longer referencing unmanaged memory. Given this, you might wonder what the end result is of assigning object references to null under C#. For example, assume the MakeACar() subroutine has now been updated as follows:

static void MakeACar()
{
  Car myCar = new Car();
  myCar = null;
}

When you assign object references to null, the compiler generates CIL code that ensures the reference (myCar, in this example) no longer points to any object. If you once again made use of ildasm.exe to view the CIL code of the modified MakeACar(), you would find the ldnull opcode (which pushes a null value on the virtual execution stack) followed by a stloc.0 opcode (which sets the null reference on the variable).

.method private hidebysig static void MakeACar() cil managed
{
  // Code size 10 (0xa)
  .maxstack 1
  .locals init ([0] class SimpleGC.Car myCar)
  IL_0000: nop
  IL_0001: newobj instance void SimpleGC.Car::.ctor()
  IL_0006: stloc.0
  IL_0007: ldnull
  IL_0008: stloc.0
  IL_0009: ret
} // end of method Program::MakeACar

What you must understand, however, is that assigning a reference to null does not in any way force the garbage collector to fire up at that exact moment and remove the object from the heap. The only thing you have accomplished is explicitly clipping the connection between the reference and the object it previously pointed to. Given this point, setting references to null under C# is far less consequential than doing so in other C-based languages; however, doing so will certainly not cause any harm.
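As a small illustration of "clipping the connection," the sketch below (with a hypothetical NullReferenceDemo class) shows that setting one reference to null affects only that reference. If a second reference still points to the object, the object remains reachable and is not a candidate for collection.

```csharp
using System;

public class Car
{
    public string PetName { get; set; }
}

// Hypothetical demo class used only for this illustration.
public static class NullReferenceDemo
{
    public static string Run()
    {
        Car first = new Car { PetName = "Zippy" };
        Car second = first;   // Two references, one object.

        first = null;         // Clips only this reference.

        // The object is still reachable through 'second'.
        return second.PetName;
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // Zippy
    }
}
```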

The Role of Application Roots

Now, back to the topic of how the garbage collector determines when an object is no longer needed. To understand the details, you need to be aware of the notion of application roots. Simply put, a root is a storage location containing a reference to an object on the managed heap. Strictly speaking, a root can fall into any of the following categories:

· References to global objects (though these are not allowed in C#, CIL code does permit allocation of global objects)

· References to any static objects/static fields

· References to local objects within an application’s code base

· References to object parameters passed into a method

· References to objects waiting to be finalized (described later in this chapter)

· Any CPU register that references an object

During a garbage collection process, the runtime will investigate objects on the managed heap to determine whether they are still reachable (i.e., rooted) by the application. To do so, the CLR will build an object graph, which represents each reachable object on the heap. Object graphs are explained in some detail during the discussion of object serialization in Chapter 20. For now, just understand that object graphs are used to document all reachable objects. As well, be aware that the garbage collector will never graph the same object twice, thus avoiding the nasty circular reference count found in COM programming.

Assume the managed heap contains a set of objects named A, B, C, D, E, F, and G. During a garbage collection, these objects (as well as any internal object references they may contain) are examined for active roots. After the graph has been constructed, unreachable objects (which you can assume are objects C and F) are marked as garbage. Figure 13-3 diagrams a possible object graph for the scenario just described (you can read the directional arrows using the phrase depends on or requires; for example, E depends on G and B, A depends on nothing, and so on).

Figure 13-3. Object graphs are constructed to determine which objects are reachable by application roots

After objects have been marked for termination (C and F in this case—as they are not accounted for in the object graph), they are swept from memory. At this point, the remaining space on the heap is compacted, which in turn causes the CLR to modify the set of active application roots (and the underlying pointers) to refer to the correct memory location (this is done automatically and transparently). Last but not least, the next object pointer is readjusted to point to the next available slot. Figure 13-4 illustrates the resulting readjustment.

Figure 13-4. A clean and compacted heap
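The marking step described above can be sketched as a simple graph walk. This is a toy model, not the CLR's actual algorithm: the ToyMarkPhase class is hypothetical, objects are represented by their names, and apart from E referencing G and B, the edges and the choice of roots are assumptions made so that C and F end up unreachable.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of the mark phase over named objects.
public static class ToyMarkPhase
{
    // Returns every object reachable from the roots. The visited set
    // ensures no object is graphed twice, so cycles are harmless.
    public static HashSet<string> Mark(
        IEnumerable<string> roots,
        Dictionary<string, string[]> references)
    {
        HashSet<string> reachable = new HashSet<string>();
        Stack<string> pending = new Stack<string>(roots);

        while (pending.Count > 0)
        {
            string current = pending.Pop();
            if (!reachable.Add(current))
                continue; // Already graphed.

            string[] children;
            if (references.TryGetValue(current, out children))
            {
                foreach (string child in children)
                    pending.Push(child);
            }
        }
        return reachable;
    }

    public static Dictionary<string, string[]> BuildSampleGraph()
    {
        // Assumed edges; only E -> G and E -> B come from the text.
        return new Dictionary<string, string[]>
        {
            { "A", new string[0] },
            { "B", new[] { "D" } },
            { "C", new[] { "F" } },
            { "D", new string[0] },
            { "E", new[] { "G", "B" } },
            { "F", new[] { "C" } },   // C and F reference each other.
            { "G", new[] { "D" } }
        };
    }

    public static void Main()
    {
        Dictionary<string, string[]> graph = BuildSampleGraph();
        HashSet<string> reachable = Mark(new[] { "A", "E" }, graph);

        // Reports C and F as garbage for this sample graph.
        foreach (string name in graph.Keys)
        {
            if (!reachable.Contains(name))
                Console.WriteLine("{0} is garbage", name);
        }
    }
}
```

Note that C and F reference each other yet are still reported as garbage, because neither can be reached from a root. This is why circular references are not a problem for a tracing collector.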

Note Strictly speaking, the garbage collector uses two distinct heaps, one of which is specifically used to store large objects. This heap is less frequently consulted during the collection cycle, given possible performance penalties involved with relocating large objects. Regardless, it is safe to consider the managed heap as a single region of memory.

Understanding Object Generations

When the CLR is attempting to locate unreachable objects, it does not literally examine every object placed on the managed heap. Doing so, obviously, would involve considerable time, especially in larger (i.e., real-world) applications.

To help optimize the process, each object on the heap is assigned to a specific “generation.” The idea behind generations is simple: the longer an object has existed on the heap, the more likely it is to stay there. For example, the object representing the main window of a desktop application will be in memory until the program terminates. Conversely, objects that have only recently been placed on the heap (such as an object allocated within a method scope) are likely to be unreachable rather quickly. Given these assumptions, each object on the heap belongs to one of the following generations:

· Generation 0: Identifies a newly allocated object that has never been marked for collection

· Generation 1: Identifies an object that has survived one garbage collection (i.e., it was examined during a collection but was not removed because it was still reachable)

· Generation 2: Identifies an object that has survived more than one sweep of the garbage collector

Note Generations 0 and 1 are termed ephemeral generations. As explained in the next section, the garbage collection process treats ephemeral generations differently.

The garbage collector will investigate all generation 0 objects first. If marking and sweeping (or said more plainly, getting rid of) these objects results in the required amount of free memory, any surviving objects are promoted to generation 1. To see how an object’s generation affects the collection process, ponder Figure 13-5, which diagrams how a set of surviving generation 0 objects (A, B, and E) are promoted once the required memory has been reclaimed.

Figure 13-5. Generation 0 objects that survive a garbage collection are promoted to generation 1

If all generation 0 objects have been evaluated but additional memory is still required, generation 1 objects are then investigated for reachability and collected accordingly. Surviving generation 1 objects are then promoted to generation 2. If the garbage collector still requires additional memory, generation 2 objects are evaluated. At this point, if a generation 2 object survives a garbage collection, it remains a generation 2 object, given the predefined upper limit of object generations.
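A rough way to watch promotion happen is to keep one object alive across several forced collections and record its generation each time. The PromotionDemo class below is a hypothetical sketch; it uses GC.Collect(), GC.GetGeneration(), and GC.KeepAlive() from System.GC, which is covered in more detail later in this chapter.

```csharp
using System;

// Hypothetical demo class used only for this illustration.
public static class PromotionDemo
{
    // Records the generation of one object across forced collections.
    public static int[] TrackGenerations(int collections)
    {
        object survivor = new object();
        int[] generations = new int[collections + 1];
        generations[0] = GC.GetGeneration(survivor);

        for (int i = 1; i <= collections; i++)
        {
            GC.Collect();
            generations[i] = GC.GetGeneration(survivor);
        }

        // Keep the object rooted until the measurements are done.
        GC.KeepAlive(survivor);
        return generations;
    }

    public static void Main()
    {
        // Typically prints 0, 1, 2, 2 but the CLR makes no promise.
        Console.WriteLine(string.Join(", ", TrackGenerations(3)));
    }
}
```

The last two values are usually identical because an object that survives a collection while already in the oldest generation simply stays there.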

The bottom line is that by assigning a generational value to objects on the heap, newer objects (such as local variables) will be removed quickly, while older objects (such as a program’s main window) are not “bothered” as often.

Concurrent Garbage Collection Prior to .NET 4.0

Prior to .NET 4.0, the runtime would clean up unused objects using a technique termed concurrent garbage collection. Under this model, when a collection takes place for any generation 0 or generation 1 objects (recall these are ephemeral generations), the garbage collector temporarily suspends all active threads within the current process to ensure that the application does not access the managed heap during the collection process.

You will examine the topic of threads in Chapter 19; for the time being, simply regard a thread as a path of execution within a running executable. After the garbage collection cycle has completed, the suspended threads are permitted to carry on their work. Thankfully, the .NET 3.5 (and earlier) garbage collector was highly optimized; you seldom (if ever) noticed this brief interruption in your application.

As an optimization, concurrent garbage collection allowed objects that were not located in one of the ephemeral generations to be cleaned up on a dedicated thread. This decreased (but didn’t eliminate) the need for the .NET runtime to suspend active threads. Moreover, concurrent garbage collection allowed your program to continue allocating objects on the heap during the collection of nonephemeral generations.

Background Garbage Collection Under .NET 4.0 and Beyond

Beginning with .NET 4.0, the garbage collector is able to deal with thread suspension when it cleans up objects on the managed heap, using background garbage collection. Despite its name, this does not mean that all garbage collection now takes place on additional background threads of execution. Rather, if a background garbage collection is taking place for objects living in a nonephemeral generation, the .NET runtime is now able to collect objects on the ephemeral generations using a dedicated background thread.

On a related note, the .NET 4.0 and higher garbage collection has been improved to further reduce the amount of time a given thread involved with garbage collection details must be suspended. The end result of these changes is that the process of cleaning up unused objects living in generation 0 or generation 1 has been optimized and can result in better runtime performance of your programs (which is really important for real-time systems that require small, and predictable, GC stop time).

Do understand, however, that the introduction of this new garbage collection model has no effect on how you build your .NET applications. For all practical purposes, you can simply allow the .NET garbage collector to perform its work without your direct intervention (and be happy that the folks at Microsoft are improving the collection process in a transparent manner).
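Although no code changes are required, you can inspect how the collector is configured for the current process. The sketch below assumes the System.Runtime.GCSettings type; GcSettingsDemo is a hypothetical class name, and the values printed depend on your machine and configuration files.

```csharp
using System;
using System.Runtime;

// Hypothetical demo class used only for this illustration.
public static class GcSettingsDemo
{
    public static string Describe()
    {
        // IsServerGC reports whether the server flavor of the GC is in use.
        // LatencyMode reports how intrusive the GC is allowed to be
        // (for example, Batch or Interactive).
        return string.Format("Server GC: {0}, latency mode: {1}",
            GCSettings.IsServerGC, GCSettings.LatencyMode);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```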

The System.GC Type

The mscorlib.dll assembly provides a class type named System.GC that allows you to programmatically interact with the garbage collector using a set of static members. Now, do be aware that you will seldom (if ever) need to make use of this class directly in your code. Typically, the only time you will use the members of System.GC is when you are creating classes that make internal use of unmanaged resources. This could be the case if you are building a class that makes calls into the Windows C-based API using the .NET platform invocation protocol or perhaps because of some very low-level and complicated COM interop logic. Table 13-1 provides a rundown of some of the more interesting members (consult the .NET Framework SDK documentation for complete details).

Table 13-1. Select Members of the System.GC Type

System.GC Member: Description

AddMemoryPressure(), RemoveMemoryPressure(): Inform the runtime that a large amount of unmanaged memory (specified in bytes) has been allocated or released, which the garbage collector takes into account when scheduling collections. Be aware that these methods should alter pressure in tandem and, thus, never remove more pressure than the total amount you have added.

Collect(): Forces the GC to perform a garbage collection. This method has been overloaded to specify a generation to collect, as well as the mode of collection (via the GCCollectionMode enumeration).

CollectionCount(): Returns a numerical value representing how many times a given generation has been swept.

GetGeneration(): Returns the generation to which an object currently belongs.

GetTotalMemory(): Returns the estimated amount of memory (in bytes) currently allocated on the managed heap. A Boolean parameter specifies whether the call should wait for garbage collection to occur before returning.

MaxGeneration: Returns the maximum generation number supported on the target system. The value is zero based: under Microsoft’s .NET 4.0 it is 2, giving three possible generations (0, 1, and 2).

SuppressFinalize(): Sets a flag indicating that the specified object should not have its Finalize() method called.

WaitForPendingFinalizers(): Suspends the current thread until all finalizable objects have been finalized. This method is typically called directly after invoking GC.Collect().

To illustrate how the System.GC type can be used to obtain various garbage collection–centric details, consider the following Main() method, which makes use of several members of GC:

static void Main(string[] args)
{
  Console.WriteLine("***** Fun with System.GC *****");

  // Print out estimated number of bytes on heap.
  Console.WriteLine("Estimated bytes on heap: {0}",
    GC.GetTotalMemory(false));

  // MaxGeneration is zero based, so add 1 for display purposes.
  Console.WriteLine("This OS has {0} object generations.\n",
    (GC.MaxGeneration + 1));

  Car refToMyCar = new Car("Zippy", 100);
  Console.WriteLine(refToMyCar.ToString());

  // Print out generation of refToMyCar object.
  Console.WriteLine("Generation of refToMyCar is: {0}",
    GC.GetGeneration(refToMyCar));
  Console.ReadLine();
}
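The paired pressure methods listed in Table 13-1 are easiest to understand next to an actual unmanaged allocation. The following is a minimal sketch, assuming Marshal.AllocHGlobal() stands in for whatever unmanaged resource your class really uses; MemoryPressureDemo is a hypothetical class name.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical demo class used only for this illustration.
public static class MemoryPressureDemo
{
    public static bool AllocateAndRelease(int byteCount)
    {
        // Unmanaged memory is invisible to the GC...
        IntPtr buffer = Marshal.AllocHGlobal(byteCount);

        // ...so tell the runtime how much was allocated.
        GC.AddMemoryPressure(byteCount);
        try
        {
            // Use the buffer: write one byte and read it back.
            Marshal.WriteByte(buffer, 0, 42);
            return Marshal.ReadByte(buffer, 0) == 42;
        }
        finally
        {
            // Release the memory and remove exactly the same
            // amount of pressure that was added.
            Marshal.FreeHGlobal(buffer);
            GC.RemoveMemoryPressure(byteCount);
        }
    }

    public static void Main()
    {
        Console.WriteLine(AllocateAndRelease(1024 * 1024)); // True
    }
}
```

Placing the release in a finally block keeps the add and remove calls balanced even if the code using the buffer throws.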

Forcing a Garbage Collection

Again, the whole purpose of the .NET garbage collector is to manage memory on your behalf. However, in some rare circumstances, it may be beneficial to programmatically force a garbage collection using GC.Collect(). Here are two common situations where you might consider interacting with the collection process:

· Your application is about to enter into a block of code that you don’t want interrupted by a possible garbage collection.

· Your application has just finished allocating an extremely large number of objects and you want to reclaim as much of the acquired memory as possible, as soon as possible.

If you determine it could be beneficial to have the garbage collector check for unreachable objects, you could explicitly trigger a garbage collection, as follows:

static void Main(string[] args)
{
  ...
  // Force a garbage collection and wait for
  // each object to be finalized.
  GC.Collect();
  GC.WaitForPendingFinalizers();
  ...
}

When you manually force a garbage collection, you should always make a call to GC.WaitForPendingFinalizers(). With this approach, you can rest assured that all finalizable objects (described in the next section) have had a chance to perform any necessary cleanup before your program continues. Under the hood, GC.WaitForPendingFinalizers() suspends the calling thread until the pending finalizers have run. This is a good thing, as it ensures your code does not invoke methods on an object currently being destroyed!
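To see why the two calls are used together, consider the following sketch. The Tracked and ForcedCollectionDemo classes are hypothetical; Tracked uses the finalizer syntax (~Tracked) that is covered later in this chapter, and simply counts how many instances have been finalized.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type whose finalizer only bumps a counter.
public class Tracked
{
    public static int FinalizedCount;

    ~Tracked()
    {
        Interlocked.Increment(ref FinalizedCount);
    }
}

public static class ForcedCollectionDemo
{
    // NoInlining ensures the temporary objects are not kept
    // alive by the caller's stack frame.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateGarbage(int count)
    {
        for (int i = 0; i < count; i++)
            new Tracked();
    }

    public static int Run()
    {
        CreateGarbage(10);

        // Collect() queues the unreachable objects for finalization;
        // WaitForPendingFinalizers() blocks until the finalizers ran.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Tracked.FinalizedCount;
    }

    public static void Main()
    {
        // Typically 10; without the wait, the count could still
        // be lower when this line executes.
        Console.WriteLine("Finalized so far: {0}", Run());
    }
}
```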

The GC.Collect() method can also be supplied a numerical value that identifies the oldest generation on which a garbage collection will be performed. For example, to instruct the CLR to investigate only generation 0 objects, you would write the following:

static void Main(string[] args)
{
  ...
  // Only investigate generation 0 objects.
  GC.Collect(0);
  GC.WaitForPendingFinalizers();
  ...
}

As well, the Collect() method can also be passed in a value of the GCCollectionMode enumeration as a second parameter, to fine-tune exactly how the runtime should force the garbage collection. This enum defines the following values:

public enum GCCollectionMode
{
  Default,   // Forced is the current default.
  Forced,    // Tells the runtime to collect immediately!
  Optimized  // Allows the runtime to determine
             // whether the current time is optimal to reclaim objects.
}
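One way to observe the difference between these modes is to compare GC.CollectionCount() before and after the call. This is a sketch only (CollectionModeDemo is a hypothetical class), and the Optimized result in particular depends on what the runtime decides at that moment.

```csharp
using System;

// Hypothetical demo class used only for this illustration.
public static class CollectionModeDemo
{
    // Returns how many generation 0 collections the call triggered.
    public static int CountGen0Collections(GCCollectionMode mode)
    {
        int before = GC.CollectionCount(0);
        GC.Collect(0, mode);
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        // Forced performs the collection right away.
        Console.WriteLine("Forced:    {0}",
            CountGen0Collections(GCCollectionMode.Forced));

        // Optimized may report 0 if the runtime decides
        // a collection would not be productive right now.
        Console.WriteLine("Optimized: {0}",
            CountGen0Collections(GCCollectionMode.Optimized));
    }
}
```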

As with any garbage collection, calling GC.Collect() promotes surviving objects to the next generation. To illustrate, assume that your Main() method has been updated as follows:

static void Main(string[] args)
{
  Console.WriteLine("***** Fun with System.GC *****");

  // Print out estimated number of bytes on heap.
  Console.WriteLine("Estimated bytes on heap: {0}",
    GC.GetTotalMemory(false));

  // MaxGeneration is zero based.
  Console.WriteLine("This OS has {0} object generations.\n",
    (GC.MaxGeneration + 1));
  Car refToMyCar = new Car("Zippy", 100);
  Console.WriteLine(refToMyCar.ToString());

  // Print out generation of refToMyCar.
  Console.WriteLine("\nGeneration of refToMyCar is: {0}",
    GC.GetGeneration(refToMyCar));

  // Make a ton of objects for testing purposes.
  object[] tonsOfObjects = new object[50000];
  for (int i = 0; i < 50000; i++)
    tonsOfObjects[i] = new object();

  // Collect only gen 0 objects.
  GC.Collect(0, GCCollectionMode.Forced);
  GC.WaitForPendingFinalizers();

  // Print out generation of refToMyCar.
  Console.WriteLine("Generation of refToMyCar is: {0}",
    GC.GetGeneration(refToMyCar));

  // See if tonsOfObjects[9000] is still alive.
  if (tonsOfObjects[9000] != null)
  {
    Console.WriteLine("Generation of tonsOfObjects[9000] is: {0}",
      GC.GetGeneration(tonsOfObjects[9000]));
  }
  else
    Console.WriteLine("tonsOfObjects[9000] is no longer alive.");

  // Print out how many times a generation has been swept.
  Console.WriteLine("\nGen 0 has been swept {0} times",
    GC.CollectionCount(0));
  Console.WriteLine("Gen 1 has been swept {0} times",
    GC.CollectionCount(1));
  Console.WriteLine("Gen 2 has been swept {0} times",
    GC.CollectionCount(2));
  Console.ReadLine();
}

Here, I have purposely created a large array of object types (50,000 to be exact) for testing purposes. In the sample run that follows, generation 0 was swept exactly once (by the explicit GC.Collect() call), and the surviving objects (refToMyCar and tonsOfObjects[9000]) were promoted to generation 1. Your counts may be higher, because the CLR is free to perform additional collections on its own whenever an allocation requires more memory.

***** Fun with System.GC *****
Estimated bytes on heap: 70240
This OS has 3 object generations.

Zippy is going 100 MPH

Generation of refToMyCar is: 0
Generation of refToMyCar is: 1
Generation of tonsOfObjects[9000] is: 1

Gen 0 has been swept 1 times
Gen 1 has been swept 0 times
Gen 2 has been swept 0 times

At this point, I hope you feel more comfortable regarding the details of object lifetime. In the next section, you’ll examine the garbage collection process a bit further by addressing how you can build finalizable objects, as well as disposable objects. Be aware that the following techniques are typically necessary only if you are building C# classes that maintain internal unmanaged resources.

Source Code The SimpleGC project is included in the Chapter 13 subdirectory.

Building Finalizable Objects

In Chapter 6, you learned that the supreme base class of .NET, System.Object, defines a virtual method named Finalize(). The default implementation of this method does nothing whatsoever.

// System.Object
public class Object
{
  ...
  protected virtual void Finalize() {}
}

When you override Finalize() for your custom classes, you establish a specific location to perform any necessary cleanup logic for your type. Given that this member is defined as protected, it is not possible to directly call an object’s Finalize() method from a class instance via the dot operator. Rather, the garbage collector will call an object’s Finalize() method (if supported) before removing the object from memory.

Note It is illegal to override Finalize() on structure types. This makes sense given that structures are value types, which are not allocated on the managed heap as stand-alone objects and, therefore, are not individually garbage collected! However, if you create a structure that contains unmanaged resources that need to be cleaned up, you can implement the IDisposable interface (described shortly).

Of course, a call to Finalize() will (eventually) occur during a “natural” garbage collection or possibly when you programmatically force a collection via GC.Collect(). In addition, a type’s finalizer method will automatically be called when the application domain hosting your application is unloaded from memory. Depending on your background in .NET, you may know that application domains (or simply AppDomains) are used to host an executable assembly and any necessary external code libraries. If you are not familiar with this .NET concept, you will be by the time you’ve finished Chapter 17. For now, note that when your AppDomain is unloaded from memory, the CLR automatically invokes finalizers for every finalizable object created during its lifetime.

Now, despite what your developer instincts may tell you, the vast majority of your C# classes will not require any explicit cleanup logic or a custom finalizer. The reason is simple: if your classes are just making use of other managed objects, everything will eventually be garbage-collected. The only time you would need to design a class that can clean up after itself is when you are using unmanaged resources (such as raw OS file handles, raw unmanaged database connections, chunks of unmanaged memory, or other unmanaged resources). Under the .NET platform, unmanaged resources are obtained by directly calling into the API of the operating system using Platform Invocation Services (PInvoke) or as a result of some elaborate COM interoperability scenarios. Given this, consider the next rule of garbage collection.

Rule The only compelling reason to override Finalize() is if your C# class is using unmanaged resources via PInvoke or complex COM interoperability tasks (typically via various members defined by the System.Runtime.InteropServices.Marshal type). The reason is that under these scenarios you are manipulating memory that the CLR cannot manage.

Overriding System.Object.Finalize()

In the rare case that you do build a C# class that uses unmanaged resources, you will obviously want to ensure that the underlying memory is released in a predictable manner. Suppose you have created a new C# Console Application project named SimpleFinalize and inserted a class named MyResourceWrapper that uses an unmanaged resource (whatever that might be) and you want to override Finalize(). The odd thing about doing so in C# is that you can’t do it using the expected override keyword.

class MyResourceWrapper
{
  // Compile-time error!
  protected override void Finalize(){ }
}

Rather, when you want to configure your custom C# class types to override the Finalize() method, you make use of a (C++-like) destructor syntax to achieve the same effect. The reason for this alternative form of overriding a virtual method is that when the C# compiler processes the finalizer syntax, it automatically adds a good deal of required infrastructure within the implicitly overridden Finalize() method (shown in just a moment).

C# finalizers look similar to constructors in that they are named identically to the class they are defined within. In addition, finalizers are prefixed with a tilde symbol (~). Unlike a constructor, however, a finalizer never takes an access modifier (they are implicitly protected), never takes parameters, and can’t be overloaded (only one finalizer per class).

The following is a custom finalizer for MyResourceWrapper that will issue a system beep when invoked. Obviously, this example is only for instructional purposes. A real-world finalizer would do nothing more than free any unmanaged resources and would not interact with other managed objects, even those referenced by the current object, as you can’t assume they are still alive at the point the garbage collector invokes your Finalize() method.

// Override System.Object.Finalize() via finalizer syntax.
class MyResourceWrapper
{
  ~MyResourceWrapper()
  {
    // Clean up unmanaged resources here.

    // Beep when destroyed (testing purposes only!)
    Console.Beep();
  }
}

If you were to examine this C# destructor using ildasm.exe, you would see that the compiler inserts some necessary error-checking code. First, the code statements within the scope of your Finalize() method are placed within a try block (see Chapter 7). The related finally block ensures that your base class’s Finalize() method will always execute, regardless of any exceptions encountered within the try scope.

.method family hidebysig virtual instance void
        Finalize() cil managed
{
  // Code size 13 (0xd)
  .maxstack 1
  .try
  {
    IL_0000: ldc.i4 0x4e20
    IL_0005: ldc.i4 0x3e8
    IL_000a: call void [mscorlib]System.Console::Beep(int32, int32)
    IL_000f: nop
    IL_0010: nop
    IL_0011: leave.s IL_001b
  } // end .try
  finally
  {
    IL_0013: ldarg.0
    IL_0014: call instance void [mscorlib]System.Object::Finalize()
    IL_0019: nop
    IL_001a: endfinally
  } // end handler
  IL_001b: nop
  IL_001c: ret
} // end of method MyResourceWrapper::Finalize

If you then tested the MyResourceWrapper type, you would find that a system beep occurs when the application terminates, given that the CLR will automatically invoke finalizers upon AppDomain shutdown.

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Finalizers *****\n");
    Console.WriteLine("Hit the return key to shut down this app");
    Console.WriteLine("and force the GC to invoke Finalize()");
    Console.WriteLine("for finalizable objects created in this AppDomain.");
    Console.ReadLine();
    MyResourceWrapper rw = new MyResourceWrapper();
}

Source Code: The SimpleFinalize project is included in the Chapter 13 subdirectory.

Detailing the Finalization Process

Not to beat a dead horse, but always remember that the role of the Finalize() method is to ensure that a .NET object can clean up unmanaged resources when it is garbage-collected. Thus, if you are building a class that does not make use of unmanaged memory (by far the most common case), finalization is of little use. In fact, if at all possible, you should design your types to avoid supporting a Finalize() method for the simple reason that finalization takes time.

When you allocate an object onto the managed heap, the runtime automatically determines whether your object supports a custom Finalize() method. If so, the object is marked as finalizable, and a pointer to this object is stored on an internal queue named the finalization queue. The finalization queue is a table maintained by the garbage collector that points to every object that must be finalized before it is removed from the heap.

When the garbage collector determines it is time to free an object from memory, it examines each entry on the finalization queue and moves the reference for any such object to yet another managed structure termed the finalization reachable table (often abbreviated as freachable and pronounced “eff-reachable”). A dedicated finalizer thread then invokes the Finalize() method for each object on the freachable table, and the object’s memory can be reclaimed only by a later garbage collection. Given this, it will take, at the least, two garbage collections to truly finalize an object.

The bottom line is that while finalization of an object does ensure an object can clean up unmanaged resources, it is still nondeterministic in nature and, because of the extra behind-the-curtains processing, considerably slower.
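To make the sequence just described a little more concrete, here is a minimal sketch that forces collections and waits for the finalizer thread. The FinalizableThing and FinalizationDemo names are invented for this illustration, and the non-inlined helper method is an assumption made so that no local variable in the caller keeps the object reachable. Treat it as an experiment rather than a guarantee of exactly when the runtime finalizes any particular object.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical type used only for this demonstration.
class FinalizableThing
{
    // Set by the finalizer so the demo can observe that it ran.
    public static volatile bool Finalized;
    public static int FinalizerThreadId;

    ~FinalizableThing()
    {
        FinalizerThreadId = Thread.CurrentThread.ManagedThreadId;
        Finalized = true;
    }
}

static class FinalizationDemo
{
    // Allocate in a separate, non-inlined method so that nothing
    // in the caller holds a reference to the new object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void CreateGarbage()
    {
        new FinalizableThing();
    }

    // Returns true if the finalizer was observed to run.
    public static bool Run()
    {
        CreateGarbage();
        GC.Collect();                   // First collection: the object becomes a finalization candidate.
        GC.WaitForPendingFinalizers();  // Block until the finalizer thread has done its work.
        GC.Collect();                   // A later collection can now reclaim the memory.
        return FinalizableThing.Finalized;
    }

    static void Main()
    {
        bool ran = Run();
        Console.WriteLine("Finalizer ran: {0}", ran);
        Console.WriteLine("Main thread ID: {0}, finalizer thread ID: {1}",
            Thread.CurrentThread.ManagedThreadId,
            FinalizableThing.FinalizerThreadId);
    }
}
```

If you run this, the two thread IDs should differ, which illustrates that Finalize() is not invoked on the thread that created the object.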

Building Disposable Objects

As you have seen, finalizers can be used to release unmanaged resources when the garbage collector kicks in. However, given that many unmanaged objects are “precious items” (such as raw database or file handles), it could be valuable to release them as soon as possible instead of relying on a garbage collection to occur. As an alternative to overriding Finalize(), your class could implement the IDisposable interface, which defines a single method named Dispose() as follows:

public interface IDisposable
{
    void Dispose();
}

When you do implement the IDisposable interface, the assumption is that when the object user is finished using the object, the object user manually calls Dispose() before allowing the object reference to drop out of scope. In this way, an object can perform any necessary cleanup of unmanaged resources without incurring the hit of being placed on the finalization queue and without waiting for the garbage collector to trigger the class’s finalization logic.

Note: Structures and class types can both implement IDisposable (unlike overriding Finalize(), which is reserved for class types), as the object user (not the garbage collector) invokes the Dispose() method.

To illustrate the use of this interface, create a new C# Console Application project named SimpleDispose. Here is an updated MyResourceWrapper class that now implements IDisposable, rather than overriding System.Object.Finalize():

// Implementing IDisposable.
class MyResourceWrapper : IDisposable
{
    // The object user should call this method
    // when they finish with the object.
    public void Dispose()
    {
        // Clean up unmanaged resources...

        // Dispose other contained disposable objects...

        // Just for a test.
        Console.WriteLine("***** In Dispose! *****");
    }
}

Notice that a Dispose() method not only is responsible for releasing the type’s unmanaged resources but can also call Dispose() on any other contained disposable objects. Unlike with Finalize(), it is perfectly safe to communicate with other managed objects within a Dispose() method. The reason is simple: the garbage collector has no clue about the IDisposable interface and will never call Dispose(). Therefore, when the object user calls this method, the object is still living a productive life on the managed heap and has access to all other heap-allocated objects. The calling logic, shown here, is straightforward:

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("***** Fun with Dispose *****\n");
        // Create a disposable object and call Dispose()
        // to free any internal resources.
        MyResourceWrapper rw = new MyResourceWrapper();
        rw.Dispose();
        Console.ReadLine();
    }
}

Of course, before you attempt to call Dispose() on an object, you will want to ensure the type supports the IDisposable interface. While you will typically know which base class library types implement IDisposable by consulting the .NET Framework 4.5 SDK documentation, a programmatic check can be accomplished using the is or as keywords discussed in Chapter 6.

class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("***** Fun with Dispose *****\n");
        MyResourceWrapper rw = new MyResourceWrapper();
        if (rw is IDisposable)
            rw.Dispose();
        Console.ReadLine();
    }
}
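The as keyword tends to be the more useful of the two when all you hold is a loosely typed reference (such as System.Object). The following sketch assumes a hypothetical helper named DisposeIfSupported(), along with an IsDisposed property added to the wrapper purely so the effect can be observed. Neither is part of the SimpleDispose project.

```csharp
using System;

class MyResourceWrapper : IDisposable
{
    // Added for this sketch only, so callers can observe disposal.
    public bool IsDisposed { get; private set; }

    public void Dispose()
    {
        IsDisposed = true;
        Console.WriteLine("***** In Dispose! *****");
    }
}

static class DisposeHelper
{
    // Hypothetical helper: disposes the object if it supports
    // IDisposable and reports whether it did so.
    public static bool DisposeIfSupported(object candidate)
    {
        // "as" yields null (rather than throwing) if the cast fails.
        IDisposable disposable = candidate as IDisposable;
        if (disposable == null)
            return false;

        disposable.Dispose();
        return true;
    }

    static void Main()
    {
        Console.WriteLine(DisposeIfSupported(new MyResourceWrapper())); // True
        Console.WriteLine(DisposeIfSupported("just a string"));         // False
    }
}
```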

This example exposes yet another rule regarding memory management.

Rule: It is a good idea to call Dispose() on any object you directly create if the object supports IDisposable. The assumption you should make is that if the class designer chose to support the Dispose() method, the type has some cleanup to perform. If you forget, memory will eventually be cleaned up (so don’t panic), but it could take longer than necessary.

There is one caveat to the previous rule. A number of types in the base class libraries that do implement the IDisposable interface provide a (somewhat confusing) alias to the Dispose() method, in an attempt to make the disposal-centric method sound more natural for the defining type. By way of an example, while the System.IO.FileStream class implements IDisposable (and therefore supports a Dispose() method), it also defines the following Close() method that is used for the same purpose:

// Assume you have imported
// the System.IO namespace...
static void DisposeFileStream()
{
    FileStream fs = new FileStream("myFile.txt", FileMode.OpenOrCreate);

    // Confusing, to say the least!
    // These method calls do the same thing!
    fs.Close();
    fs.Dispose();
}

While it does feel more natural to “close” a file rather than “dispose” of one, this doubling up of cleanup methods can be confusing. For the few types that do provide an alias, just remember that if a type implements IDisposable, calling Dispose() is always a safe course of action.

Reusing the C# using Keyword

When you are handling a managed object that implements IDisposable, it is quite common to make use of structured exception handling to ensure the type’s Dispose() method is called in the event of a runtime exception, like so:

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Dispose *****\n");
    MyResourceWrapper rw = new MyResourceWrapper();
    try
    {
        // Use the members of rw.
    }
    finally
    {
        // Always call Dispose(), error or not.
        rw.Dispose();
    }
}

While this is a fine example of defensive programming, the truth of the matter is that few developers are thrilled by the prospects of wrapping every disposable type within a try/finally block just to ensure the Dispose() method is called. To achieve the same result in a much less obtrusive manner, C# supports a special bit of syntax that looks like this:

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Dispose *****\n");
    // Dispose() is called automatically when the
    // using scope exits.
    using(MyResourceWrapper rw = new MyResourceWrapper())
    {
        // Use rw object.
    }
}

If you looked at the following CIL code of the Main() method using ildasm.exe, you would find the using syntax does indeed expand to try/finally logic, with the expected call to Dispose():

.method private hidebysig static void Main(string[] args) cil managed
{
  ...
  .try
  {
    ...
  } // end .try
  finally
  {
    ...
    IL_0012: callvirt instance void
             SimpleDispose.MyResourceWrapper::Dispose()
  } // end handler
  ...
} // end of method Program::Main

Note: If you attempt to “use” an object that does not implement IDisposable, you will receive a compiler error.

While this syntax does remove the need to manually wrap disposable objects within try/finally logic, the C# using keyword unfortunately now has a double meaning (importing namespaces and invoking a Dispose() method). Nevertheless, when you are working with .NET types that support the IDisposable interface, this syntactical construct will ensure that the object “being used” will automatically have its Dispose() method called once the using block has exited.
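To see that the generated try/finally logic really does fire when something goes wrong, consider the following sketch. The LoggingResource and UsingDemo types are invented for this illustration. They simply record the order of events in a list so you can inspect it afterward.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable type that records when it is disposed.
class LoggingResource : IDisposable
{
    private readonly List<string> log;

    public LoggingResource(List<string> log)
    {
        this.log = log;
    }

    public void Dispose()
    {
        log.Add("Dispose");
    }
}

static class UsingDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();
        try
        {
            using (LoggingResource res = new LoggingResource(log))
            {
                log.Add("Working");
                // Simulate a runtime error inside the using scope.
                throw new InvalidOperationException("Simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // By the time this handler runs, Dispose() has already been called.
            log.Add("Caught");
        }
        return log;
    }

    static void Main()
    {
        // Prints: Working, Dispose, Caught
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```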

Also, be aware that it is possible to declare multiple objects of the same type within a using scope. As you would expect, the compiler will inject code to call Dispose() on each declared object.

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Dispose *****\n");

    // Use a comma-delimited list to declare multiple objects to dispose.
    using(MyResourceWrapper rw = new MyResourceWrapper(),
                            rw2 = new MyResourceWrapper())
    {
        // Use rw and rw2 objects.
    }
}
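One detail worth knowing is the order of disposal. A using statement that declares several objects behaves like a set of nested using statements, so the objects are disposed in the reverse order of their declaration. The sketch below uses a hypothetical NamedResource type (not part of the SimpleDispose project) to record that order.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical disposable type that logs its name when disposed.
class NamedResource : IDisposable
{
    private readonly string name;
    private readonly List<string> log;

    public NamedResource(string name, List<string> log)
    {
        this.name = name;
        this.log = log;
    }

    public void Dispose()
    {
        log.Add(name);
    }
}

static class MultiUsingDemo
{
    public static List<string> Run()
    {
        var order = new List<string>();
        using (NamedResource first = new NamedResource("first", order),
                             second = new NamedResource("second", order))
        {
            // Use first and second here.
        }
        return order;
    }

    static void Main()
    {
        // Prints: second, first
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```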

Source Code: The SimpleDispose project is included in the Chapter 13 subdirectory.

Building Finalizable and Disposable Types

At this point, you have seen two different approaches to constructing a class that cleans up internal unmanaged resources. On the one hand, you can use a finalizer. Using this technique, you have the peace of mind that comes with knowing the object cleans itself up when garbage-collected (whenever that may be) without the need for user interaction. On the other hand, you can implement IDisposable to provide a way for the object user to clean up the object as soon as it is finished. However, if the caller forgets to call Dispose(), the unmanaged resources may be held in memory indefinitely.

As you might suspect, it is possible to blend both techniques into a single class definition. By doing so, you gain the best of both models. If the object user does remember to call Dispose(), you can inform the garbage collector to bypass the finalization process by calling GC.SuppressFinalize(). If the object user forgets to call Dispose(), the object will eventually be finalized and have a chance to free up the internal resources. The good news is that the object’s internal unmanaged resources will be freed one way or another.

Here is the next iteration of MyResourceWrapper, which is now finalizable and disposable, defined in a C# Console Application project named FinalizableDisposableClass:

// A sophisticated resource wrapper.
public class MyResourceWrapper : IDisposable
{
    // The garbage collector will call this method if the
    // object user forgets to call Dispose().
    ~MyResourceWrapper()
    {
        // Clean up any internal unmanaged resources.
        // Do **not** call Dispose() on any managed objects.
    }

    // The object user will call this method to clean up
    // resources ASAP.
    public void Dispose()
    {
        // Clean up unmanaged resources here.
        // Call Dispose() on other contained disposable objects.
        // No need to finalize if user called Dispose(),
        // so suppress finalization.
        GC.SuppressFinalize(this);
    }
}

Notice that this Dispose() method has been updated to call GC.SuppressFinalize(), which informs the CLR that it is no longer necessary to call the destructor when this object is garbage-collected, given that the unmanaged resources have already been freed via the Dispose() logic.

A Formalized Disposal Pattern

The current implementation of MyResourceWrapper does work fairly well; however, you are left with a few minor drawbacks. First, the Finalize() and Dispose() methods each have to clean up the same unmanaged resources. This could result in duplicate code, which can easily become a nightmare to maintain. Ideally, you would define a private helper function that is called by either method.

Next, you’d like to make sure that the Finalize() method does not attempt to dispose of any managed objects, while the Dispose() method should do so. Finally, you’d also like to be certain the object user can safely call Dispose() multiple times without error. Currently, the Dispose() method has no such safeguards.

To address these design issues, Microsoft defined a formal, prim-and-proper disposal pattern that strikes a balance between robustness, maintainability, and performance. Here is the final (and annotated) version of MyResourceWrapper, which makes use of this official pattern:

class MyResourceWrapper : IDisposable
{
    // Used to determine if Dispose()
    // has already been called.
    private bool disposed = false;

    public void Dispose()
    {
        // Call our helper method.
        // Specifying "true" signifies that
        // the object user triggered the cleanup.
        CleanUp(true);

        // Now suppress finalization.
        GC.SuppressFinalize(this);
    }

    private void CleanUp(bool disposing)
    {
        // Be sure we have not already been disposed!
        if (!this.disposed)
        {
            // If disposing equals true, dispose all
            // managed resources.
            if (disposing)
            {
                // Dispose managed resources.
            }
            // Clean up unmanaged resources here.
        }
        disposed = true;
    }

    ~MyResourceWrapper()
    {
        // Call our helper method.
        // Specifying "false" signifies that
        // the GC triggered the cleanup.
        CleanUp(false);
    }
}

Notice that MyResourceWrapper now defines a private helper method named CleanUp(). By specifying true as an argument, you indicate that the object user has initiated the cleanup, so you should clean up all managed and unmanaged resources. However, when the garbage collector initiates the cleanup, you specify false when calling CleanUp() to ensure that internal disposable objects are not disposed (as you can’t assume they are still in memory!). Last but not least, the bool member variable (disposed) is set to true before exiting CleanUp() to ensure that Dispose() can be called numerous times without error.

Note: After an object has been “disposed,” it’s still possible for the client to invoke members on it, as it is still in memory. Therefore, a robust resource wrapper class would also need to update each member of the class with additional coding logic that says, in effect, “If I am disposed, do nothing and return from the member.”
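As a rough sketch of what such a guard might look like, consider the following. The GuardedResourceWrapper type and its DoWork(), DoStrictWork(), and WorkCount members are hypothetical and exist only for this illustration. The first member silently returns when the object has been disposed. The second shows an alternative you will also encounter, which is to throw a System.ObjectDisposedException so that the misuse is not hidden from the caller.

```csharp
using System;

// Hypothetical wrapper showing two ways to guard members after disposal.
class GuardedResourceWrapper : IDisposable
{
    private bool disposed = false;

    // Counts successful calls, just so the effect can be observed.
    public int WorkCount { get; private set; }

    public void DoWork()
    {
        // "If I am disposed, do nothing and return from the member."
        if (disposed)
            return;

        WorkCount++;
    }

    public void DoStrictWork()
    {
        // Alternative: fail loudly rather than silently ignoring the call.
        if (disposed)
            throw new ObjectDisposedException(nameof(GuardedResourceWrapper));

        WorkCount++;
    }

    public void Dispose()
    {
        // This sketch has no finalizer and no real resources,
        // so it only records that disposal happened.
        disposed = true;
    }

    static void Main()
    {
        var wrapper = new GuardedResourceWrapper();
        wrapper.DoWork();
        wrapper.Dispose();
        wrapper.DoWork();                       // Ignored.
        Console.WriteLine(wrapper.WorkCount);   // 1

        try
        {
            wrapper.DoStrictWork();
        }
        catch (ObjectDisposedException)
        {
            Console.WriteLine("Object was already disposed.");
        }
    }
}
```

Which style you pick is a design decision. Returning quietly keeps callers running, while throwing makes a lifetime bug much easier to find.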

To test the final iteration of MyResourceWrapper, add a call to Console.Beep() within the scope of your finalizer, like so:

~MyResourceWrapper()
{
    Console.Beep();
    // Call our helper method.
    // Specifying "false" signifies that
    // the GC triggered the cleanup.
    CleanUp(false);
}

Next, update Main() as follows:

static void Main(string[] args)
{
    Console.WriteLine("***** Dispose() / Destructor Combo Platter *****");

    // Call Dispose() manually. This will not call the finalizer.
    MyResourceWrapper rw = new MyResourceWrapper();
    rw.Dispose();

    // Don’t call Dispose(). This will trigger the finalizer
    // and cause a beep.
    MyResourceWrapper rw2 = new MyResourceWrapper();
}

Notice that you are explicitly calling Dispose() on the rw object, so the destructor call is suppressed. However, you have “forgotten” to call Dispose() on the rw2 object, and therefore, when the application terminates, you hear a single beep. If you were to comment out the call to Dispose() on the rw object, you would hear two beeps.

Source Code: The FinalizableDisposableClass project is included in the Chapter 13 subdirectory.

That concludes your investigation of how the CLR manages your objects via garbage collection. While there are additional (somewhat esoteric) details regarding the collection process I haven’t covered here (such as weak references and object resurrection), you are now in a perfect position for further exploration on your own. To wrap this chapter up, you will examine a programming feature called lazy instantiation of objects.

Understanding Lazy Object Instantiation

When you are creating classes, you might occasionally need to account for a particular member variable in code, which might never actually be needed, in that the object user might not call the method (or property) that makes use of it. Fair enough. However, this can be problematic if the member variable in question requires a large amount of memory to be instantiated.

For example, assume you are writing a class that encapsulates the operations of a digital music player. In addition to the expected methods, such as Play(), Pause(), and Stop(), you also want to provide the ability to return a collection of Song objects (via a class named AllTracks), which represents every single digital music file on the device.

If you’d like to follow along, create a new Console Application project named LazyObjectInstantiation, and define the following class types:

// Represents a single song.
class Song
{
    public string Artist { get; set; }
    public string TrackName { get; set; }
    public double TrackLength { get; set; }
}

// Represents all songs on a player.
class AllTracks
{
    // Our media player can have a maximum
    // of 10,000 songs.
    private Song[] allSongs = new Song[10000];

    public AllTracks()
    {
        // Assume we fill up the array
        // of Song objects here.
        Console.WriteLine("Filling up the songs!");
    }
}

// The MediaPlayer has-an AllTracks object.
class MediaPlayer
{
    // Assume these methods do something useful.
    public void Play() { /* Play a song */ }
    public void Pause() { /* Pause the song */ }
    public void Stop() { /* Stop playback */ }
    private AllTracks allSongs = new AllTracks();

    public AllTracks GetAllTracks()
    {
        // Return all of the songs.
        return allSongs;
    }
}

The current implementation of MediaPlayer assumes that the object user will want to obtain a list of songs via the GetAllTracks() method. Well, what if the object user does not need to obtain this list? In the current implementation, the AllTracks member variable will still be allocated, thereby creating 10,000 Song objects in memory, as follows:

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Lazy Instantiation *****\n");

    // This caller does not care about getting all songs,
    // but indirectly created 10,000 objects!
    MediaPlayer myPlayer = new MediaPlayer();
    myPlayer.Play();

    Console.ReadLine();
}

Clearly, you would rather not create 10,000 objects that nobody will use, as that will add a good deal of stress to the .NET garbage collector. While you could manually add some code to ensure the allSongs object is created only if used (perhaps using the factory method design pattern), there is an easier way.
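For comparison, a hand-rolled version of that deferred creation might look something like the following sketch. The ManualLazyMediaPlayer class and its TracksCreated property are invented for this illustration, and the AllTracks class is trimmed down to keep the listing short. Note that this simple null check makes no attempt at thread safety, which is one of the details you would otherwise have to handle yourself.

```csharp
using System;

// Trimmed-down stand-in for the AllTracks class.
class AllTracks
{
    public AllTracks()
    {
        Console.WriteLine("Filling up the songs!");
    }
}

// Hypothetical player that defers creation by hand.
class ManualLazyMediaPlayer
{
    // Remains null until somebody asks for the tracks.
    private AllTracks allSongs;

    // Exposed only so the demo can observe whether creation happened.
    public bool TracksCreated
    {
        get { return allSongs != null; }
    }

    public void Play() { /* Play a song */ }

    public AllTracks GetAllTracks()
    {
        // Create the object on first use only (not thread-safe).
        if (allSongs == null)
            allSongs = new AllTracks();

        return allSongs;
    }

    static void Main()
    {
        var player = new ManualLazyMediaPlayer();
        player.Play();
        Console.WriteLine(player.TracksCreated);   // False
        player.GetAllTracks();
        Console.WriteLine(player.TracksCreated);   // True
    }
}
```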

The base class libraries provide a useful generic class named Lazy<>, defined in the System namespace of mscorlib.dll. This class allows you to define data that will not be created unless your code base actually uses it. As this is a generic class, you must specify the type of item to be created on first use, which can be any type within the .NET base class libraries or a custom type you have authored yourself. To enable lazy instantiation of the AllTracks member variable, you can simply replace this:

// The MediaPlayer has-an AllTracks object.
class MediaPlayer
{
    ...
    private AllTracks allSongs = new AllTracks();

    public AllTracks GetAllTracks()
    {
        // Return all of the songs.
        return allSongs;
    }
}

with this:

// The MediaPlayer has-a Lazy<AllTracks> object.
class MediaPlayer
{
    ...
    private Lazy<AllTracks> allSongs = new Lazy<AllTracks>();

    public AllTracks GetAllTracks()
    {
        // Return all of the songs.
        return allSongs.Value;
    }
}

Beyond the fact that you are now representing the AllTracks member variable as a Lazy<> type, notice that the implementation of the previous GetAllTracks() method has also been updated. Specifically, you must use the read-only Value property of the Lazy<> class to obtain the actual stored data (in this case, the AllTracks object that is maintaining the 10,000 Song objects).

With this simple update, notice how the following updated Main() method will indirectly allocate the Song objects only if GetAllTracks() is indeed called:

static void Main(string[] args)
{
    Console.WriteLine("***** Fun with Lazy Instantiation *****\n");

    // No allocation of AllTracks object here!
    MediaPlayer myPlayer = new MediaPlayer();
    myPlayer.Play();

    // Allocation of AllTracks happens when you call GetAllTracks().
    MediaPlayer yourPlayer = new MediaPlayer();
    AllTracks yourMusic = yourPlayer.GetAllTracks();

    Console.ReadLine();
}
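If you want to confirm this behavior in code rather than by watching console output, the Lazy<> class exposes an IsValueCreated property. The following sketch surfaces it through a hypothetical TracksAllocated property on MediaPlayer, added here purely for demonstration. The AllTracks class is trimmed down to keep the listing short.

```csharp
using System;

// Trimmed-down stand-in for the AllTracks class.
class AllTracks
{
    public AllTracks()
    {
        Console.WriteLine("Filling up the songs!");
    }
}

class MediaPlayer
{
    private Lazy<AllTracks> allSongs = new Lazy<AllTracks>();

    public void Play() { /* Play a song */ }

    // Hypothetical diagnostic property: true once the
    // wrapped AllTracks object has actually been created.
    public bool TracksAllocated
    {
        get { return allSongs.IsValueCreated; }
    }

    public AllTracks GetAllTracks()
    {
        return allSongs.Value;
    }

    static void Main()
    {
        MediaPlayer player = new MediaPlayer();
        player.Play();
        Console.WriteLine(player.TracksAllocated);   // False

        player.GetAllTracks();
        Console.WriteLine(player.TracksAllocated);   // True
    }
}
```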

Note: Lazy object instantiation is useful not only to decrease allocation of unnecessary objects. You can also use this technique if a given member has expensive creation code, such as invoking a remote method, communicating with a relational database, or the like.

Customizing the Creation of the Lazy Data

When you declare a Lazy<> variable, the actual internal data type is created using the default constructor, like so:

// Default constructor of AllTracks is called when the Lazy<>
// variable is used.
private Lazy<AllTracks> allSongs = new Lazy<AllTracks>();

While this might be fine in some cases, what if the AllTracks class had some additional constructors and you want to ensure the correct one is called? Furthermore, what if you have some extra work to do (beyond simply creating the AllTracks object) when the Lazy<> variable is made? As luck would have it, the Lazy<> class allows you to specify a generic delegate as an optional parameter, which will specify a method to call during the creation of the wrapped type.

The generic delegate in question is of type System.Func<>. Although the Func<> family of delegates can point to methods that take up to 16 arguments (which are typed using generic type parameters), the Lazy<> constructor expects the parameterless form, which simply returns the same data type being created by the related Lazy<> variable. Furthermore, to greatly simplify the use of the required Func<>, I recommend using a lambda expression (see Chapter 10 to review the delegate/lambda relationship).

With this in mind, the following is a final version of MediaPlayer that adds a bit of custom code when the wrapped AllTracks object is created. Remember, this method must return a new instance of the type wrapped by Lazy<> before exiting, and you can use any constructor you choose (here, you are still invoking the default constructor of AllTracks).

class MediaPlayer
{
    ...
    // Use a lambda expression to add additional code
    // when the AllTracks object is made.
    private Lazy<AllTracks> allSongs = new Lazy<AllTracks>( () =>
        {
            Console.WriteLine("Creating AllTracks object!");
            return new AllTracks();
        }
    );

    public AllTracks GetAllTracks()
    {
        // Return all of the songs.
        return allSongs.Value;
    }
}
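The same technique lets you pick a nondefault constructor. The sketch below assumes a hypothetical AllTracks(int capacity) overload and a Capacity property, neither of which exists in the LazyObjectInstantiation project. They are here only to show the lambda forwarding an argument.

```csharp
using System;

class Song
{
    public string Artist { get; set; }
    public string TrackName { get; set; }
    public double TrackLength { get; set; }
}

class AllTracks
{
    private Song[] allSongs;

    // Hypothetical property so the chosen size can be inspected.
    public int Capacity
    {
        get { return allSongs.Length; }
    }

    public AllTracks() : this(10000) { }

    // Hypothetical overload that lets the caller pick the size.
    public AllTracks(int capacity)
    {
        allSongs = new Song[capacity];
        Console.WriteLine("Filling up {0} songs!", capacity);
    }
}

class MediaPlayer
{
    // The lambda decides which constructor runs on first use.
    private Lazy<AllTracks> allSongs =
        new Lazy<AllTracks>(() => new AllTracks(500));

    public AllTracks GetAllTracks()
    {
        return allSongs.Value;
    }

    static void Main()
    {
        MediaPlayer player = new MediaPlayer();
        Console.WriteLine(player.GetAllTracks().Capacity);   // 500
    }
}
```

Be aware that a field initializer can’t reference other instance members of the class. If the lambda needs instance data (for example, a capacity passed to the MediaPlayer constructor), one option is to assign the Lazy<> field inside that constructor instead.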

Sweet! I hope you can see the usefulness of the Lazy<> class. Essentially, this generic class allows you to ensure expensive objects are allocated only when the object user requires them. If you find this topic useful for your projects, you might also want to look up the System.Lazy<> class in the .NET Framework 4.5 SDK documentation for further examples of how to program for lazy instantiation.

Source Code: The LazyObjectInstantiation project is included in the Chapter 13 subdirectory.

Summary

The point of this chapter was to demystify the garbage collection process. As you saw, the garbage collector will run only when it is unable to acquire the necessary memory from the managed heap (or when a given AppDomain unloads from memory). When a collection does occur, you can rest assured that Microsoft’s collection algorithm has been optimized by the use of object generations, secondary threads for the purpose of object finalization, and a managed heap dedicated to hosting large objects.

This chapter also illustrated how to programmatically interact with the garbage collector using the System.GC class type. As mentioned, the only time you will really need to do so is when you are building finalizable or disposable class types that operate on unmanaged resources.

Recall that finalizable types are classes that have provided a destructor (effectively overriding the Finalize() method) to clean up unmanaged resources at the time of garbage collection. Disposable objects, on the other hand, are classes (or structures) that implement the IDisposable interface, whose Dispose() method should be called by the object user when it is finished using said objects. Finally, you learned about an official “disposal” pattern that blends both approaches.

This chapter wrapped up with a look at a generic class named Lazy<>. As you saw, you can use this class to delay the creation of an expensive (in terms of memory consumption) object until the caller actually requires it. By doing so, you can help reduce the number of objects stored on the managed heap and also ensure expensive objects are created only when actually required by the caller.