Understanding Values and References - Microsoft® Visual C#® 2012 Step by Step (2012)


Chapter 8. Understanding Values and References

After completing this chapter, you will be able to

§ Explain the differences between a value type and a reference type.

§ Modify the way in which arguments are passed as method parameters by using the ref and out keywords.

§ Convert a value into a reference by using boxing.

§ Convert a reference back to a value by using unboxing and casting.

In Chapter 7, you learned how to declare your own classes and how to create objects by using the new keyword. You also saw how to initialize an object by using a constructor. In this chapter, you will learn about how the characteristics of the primitive types—such as int, double, and char—differ from the characteristics of class types.

Copying Value Type Variables and Classes

Most of the primitive types built into C#, such as int, float, double, and char (but not string, for reasons that will be covered shortly) are collectively called value types. These types have a fixed size, and when you declare a variable as a value type, the compiler generates code that allocates a block of memory big enough to hold a corresponding value. For example, declaring an int variable causes the compiler to allocate 4 bytes of memory (32 bits). A statement that assigns a value (such as 42) to the int causes the value to be copied into this block of memory.

Class types, such as Circle (described in Chapter 7), are handled differently. When you declare a Circle variable, the compiler does not generate code that allocates a block of memory big enough to hold a Circle—all it does is allot a small piece of memory that can potentially hold the address of (or a reference to) another block of memory containing a Circle. (An address specifies the location of an item in memory.) The memory for the actual Circle object is allocated only when the new keyword is used to create the object. A class is an example of a reference type. Reference types hold references to blocks of memory. To write effective C# programs that make full use of the Microsoft .NET Framework, you need to understand the difference between value types and reference types.

NOTE

The string type in C# is actually a class. This is because there is no standard size for a string (different strings can contain different numbers of characters), and it is far more efficient to allocate memory for them dynamically when the program runs rather than statically at compile time. The description of reference types such as classes in this chapter applies to the string type as well. In fact, the string keyword in C# is just an alias for the System.String class.

Consider the situation in which you declare a variable named i as an int and assign it the value 42. If you declare another variable called copyi as an int and then assign i to copyi, copyi will hold the same value as i (42). However, even though copyi and i happen to hold the same value, there are two blocks of memory containing the value 42: one block for i and the other block for copyi. If you modify the value of i, the value of copyi does not change. Let’s see this in code:

int i = 42; // declare and initialize i

int copyi = i; /* copyi contains a copy of the data in i:

i and copyi both contain the value 42 */

i++; /* incrementing i has no effect on copyi;

i now contains 43, but copyi still contains 42 */

The effect of declaring a variable c as a class type, such as Circle, is very different. When you declare c as a Circle, c can refer to a Circle object; the actual value held by c is the address of a Circle object in memory. If you declare an additional variable named refc (also as a Circle) and you assign c to refc, refc will have a copy of the same address as c; in other words, there is only one Circle object, and both refc and c now refer to it. Here’s the example in code:

Circle c = new Circle(42);

Circle refc = c;

The following graphic illustrates both examples. The at sign (@) in the Circle objects represents a reference holding an address in memory:

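[Figure: i and copyi occupy separate blocks of memory, each holding its own value; c and refc both hold a reference (@) to the same Circle object.]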

This difference is very important. In particular, it means that the behavior of method parameters depends on whether they are value types or reference types. You’ll explore this difference in the next exercise.
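The two copying behaviors can be compared side by side. The following is a minimal sketch rather than the sample code for this chapter: the Circle class shown here is a hypothetical stand-in with a public Radius field (an assumption made only so that the shared object is easy to observe), and CopyDemo is an illustrative name.

```csharp
using System;

// Hypothetical stand-in for the Circle class; Radius is public only for illustration.
class Circle
{
    public int Radius;

    public Circle(int radius)
    {
        Radius = radius;
    }
}

static class CopyDemo
{
    // Returns the value of copyi after i has been incremented.
    public static int CopyValue()
    {
        int i = 42;
        int copyi = i;   // copyi gets its own copy of the value
        i++;             // changes i only
        return copyi;    // still 42
    }

    // Returns the radius seen through c after the object is changed through refc.
    public static int CopyReference()
    {
        Circle c = new Circle(42);
        Circle refc = c;     // refc holds a copy of the reference, not a new Circle
        refc.Radius = 99;    // changes the single shared object
        return c.Radius;     // 99
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(CopyDemo.CopyValue());      // 42
        Console.WriteLine(CopyDemo.CopyReference());  // 99
    }
}
```

Changing the object through refc is visible through c because both variables refer to the same Circle object, whereas copyi is unaffected by the change to i.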

COPYING REFERENCE TYPES AND DATA PRIVACY

If you actually want to copy the contents of a Circle object, c, into a different Circle object, refc, rather than just copying the reference, you must actually make refc refer to a new instance of the Circle class and then copy the data field by field from c into refc, like this:

Circle refc = new Circle();

refc.radius = c.radius; // Don't try this

However, if any members of the Circle class are private (like the radius field), you will not be able to copy this data. Instead, you could make the data in the private fields accessible by exposing them as properties, and then using these properties to read the data from c and copy it into refc. You will learn how to do this in Chapter 15.

Alternatively, a class could provide a Clone method that returns another instance of the same class, but populated with the same data. The Clone method would have access to the private data in an object and could copy this data directly to another instance of the same class. For example, the Clone method for the Circle class could be defined like this:

class Circle

{

private int radius;

// Constructors and other methods omitted

...

public Circle Clone()

{

// Create a new Circle object

Circle clone = new Circle();

// Copy private data from this to clone

clone.radius = this.radius;

// Return the new Circle object containing the copied data

return clone;

}

}

This approach is straightforward if all the private data consists of values, but if one or more fields are themselves reference types (for example, the Circle class might be extended to contain a Point object from the previous chapter, indicating the position of the Circle on a graph), then these reference types also need to provide a Clone method; otherwise, the Clone method of the Circle class will simply copy a reference to these fields. Copying the referenced objects as well as the values is known as a deep copy. The alternative approach, where the Clone method simply copies references, is known as a shallow copy.

The preceding code example also poses an interesting question: how private is private data? Previously, you saw that the private keyword renders a field or method inaccessible from outside of a class. However, this does not mean it can be accessed by only a single object. If you create two objects of the same class, they can each access the private data of the other. This sounds curious, but in fact methods such as Clone depend on this feature. The statement clone.radius = this.radius; only works because the private radius field in the clone object is accessible from within the current instance of the Circle class. So, private actually means “private to the class” rather than “private to an object.” However, don’t confuse private with static. If you simply declare a field as private, each instance of the class gets its own data. If a field is declared as static, each instance of the class shares the same data.
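To make the difference between a shallow copy and a deep copy concrete, consider the following sketch. It assumes a hypothetical Point class with public X and Y fields and a Circle class extended with a Center field; the ShallowClone and DeepClone method names are also illustrative and do not appear in the sample code.

```csharp
using System;

// Hypothetical Point class; the fields are public only to keep the sketch short.
class Point
{
    public int X;
    public int Y;

    public Point Clone()
    {
        Point clone = new Point();
        clone.X = this.X;
        clone.Y = this.Y;
        return clone;
    }
}

class Circle
{
    private int radius;
    public Point Center = new Point();

    // Shallow copy: the clone shares the same Point object as the original.
    public Circle ShallowClone()
    {
        Circle clone = new Circle();
        clone.radius = this.radius;   // private to the class, so accessible here
        clone.Center = this.Center;   // copies the reference only
        return clone;
    }

    // Deep copy: the clone gets its own Point object.
    public Circle DeepClone()
    {
        Circle clone = new Circle();
        clone.radius = this.radius;
        clone.Center = this.Center.Clone();
        return clone;
    }
}

class Program
{
    static void Main()
    {
        Circle original = new Circle();
        Circle shallow = original.ShallowClone();
        Circle deep = original.DeepClone();

        original.Center.X = 10;

        Console.WriteLine(shallow.Center.X); // 10: shares the original's Point
        Console.WriteLine(deep.Center.X);    // 0: has an independent Point
    }
}
```

After the original's Center is modified, the shallow clone sees the change because it refers to the same Point, while the deep clone does not.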

Use value parameters and reference parameters

1. Start Microsoft Visual Studio 2012 if it is not already running.

2. Open the Parameters project located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 8\Windows X\Parameters folder in your Documents folder.

The project contains three C# code files: Pass.cs, Program.cs, and WrappedInt.cs.

3. Display the Pass.cs file in the Code and Text Editor window. This file defines a class called Pass that is currently empty apart from a // TODO: comment.

TIP

Remember that you can use the Task List window to locate all TODO comments in a solution.

4. Add a public static method called Value to the Pass class, replacing the // TODO: comment. This method should accept a single int parameter (a value type) called param and have the return type void. The body of the Value method should simply assign the value 42 to param, as shown in bold type in the following code example.

namespace Parameters

{

    class Pass

    {

        public static void Value(int param)

        {

            param = 42;

        }

    }

}

NOTE

The reason you are defining this method as static is to keep the exercise simple. You can call the Value method directly on the Pass class rather than having to first create a new Pass object. The principles illustrated in this exercise apply in exactly the same manner to instance methods.

5. Display the Program.cs file in the Code and Text Editor window, and then locate the doWork method of the Program class.

The doWork method is called by the Main method when the program starts running. As explained in Chapter 7, the method call is wrapped in a try block and followed by a catch handler.

6. Add four statements to the doWork method to perform the following tasks:

a. Declare a local int variable called i, and initialize it to 0.

b. Write the value of i to the console by using Console.WriteLine.

c. Call Pass.Value, passing i as an argument.

d. Write the value of i to the console again.

With the calls to Console.WriteLine before and after the call to Pass.Value, you can see whether the call to Pass.Value actually modifies the value of i. The completed doWork method should look exactly like this:

static void doWork()

{

    int i = 0;

    Console.WriteLine(i);

    Pass.Value(i);

    Console.WriteLine(i);

}

7. On the DEBUG menu, click Start Without Debugging to build and run the program.

8. Confirm that the value 0 is written to the console window twice.

The assignment statement inside the Pass.Value method that updates the parameter and sets it to 42 uses a copy of the argument passed in, and the original argument i is completely unaffected.

9. Press the Enter key to close the application.

You will now see what happens when you pass an int parameter that is wrapped inside a class.

10. Display the WrappedInt.cs file in the Code and Text Editor window. This file contains the WrappedInt class, which is empty apart from a // TODO: comment.

11. Add a public instance field called Number of type int to the WrappedInt class, as shown in bold type below:

namespace Parameters

{

    class WrappedInt

    {

        public int Number;

    }

}

12. Display the Pass.cs file in the Code and Text Editor window. Add a public static method called Reference to the Pass class. This method should accept a single WrappedInt parameter called param and have the return type void. The body of the Reference method should assign 42 to param.Number, like this:

public static void Reference(WrappedInt param)

{

    param.Number = 42;

}

13. Display the Program.cs file in the Code and Text Editor window. Comment out the existing code in the doWork method and add four more statements to perform the following tasks:

a. Declare a local WrappedInt variable called wi, and initialize it to a new WrappedInt object by calling the default constructor.

b. Write the value of wi.Number to the console.

c. Call the Pass.Reference method, passing wi as an argument.

d. Write the value of wi.Number to the console again.

As before, with the calls to Console.WriteLine, you can see whether the call to Pass.Reference modifies the value of wi.Number. The doWork method should now look exactly like this (the new statements are shown in bold type):

static void doWork()

{

    // int i = 0;

    // Console.WriteLine(i);

    // Pass.Value(i);

    // Console.WriteLine(i);

    WrappedInt wi = new WrappedInt();

    Console.WriteLine(wi.Number);

    Pass.Reference(wi);

    Console.WriteLine(wi.Number);

}

14. On the DEBUG menu, click Start Without Debugging to build and run the application.

This time, the two values displayed in the console window correspond to the value of wi.Number before and after the call to the Pass.Reference method. You should see that the values 0 and 42 are displayed.

15. Press the Enter key to close the application and return to Visual Studio 2012.

To explain what the previous exercise shows, the value of wi.Number is initialized to 0 by the compiler-generated default constructor. The wi variable contains a reference to the newly created WrappedInt object (which contains an int). The wi variable is then copied as an argument to the Pass.Reference method. Because WrappedInt is a class (a reference type), wi and param both refer to the same WrappedInt object. Any changes made to the contents of the object through the param variable in the Pass.Reference method are visible by using the wi variable when the method completes. The following diagram illustrates what happens when a WrappedInt object is passed as an argument to the Pass.Reference method:

[Figure: wi and param both hold a reference to the same WrappedInt object.]

Understanding Null Values and Nullable Types

When you declare a variable, it is always a good idea to initialize it. With value types, it is common to see code such as this:

int i = 0;

double d = 0.0;

Remember that to initialize a reference variable such as a class, you can create a new instance of the class and assign the reference variable to the new object, like this:

Circle c = new Circle(42);

This is all very well, but what if you don’t actually want to create a new object? Perhaps the purpose of the variable is simply to store a reference to an existing object at some later point in your program. In the following code example, the Circle variable copy is initialized, but later it is assigned a reference to another instance of the Circle class:

Circle c = new Circle(42);

Circle copy = new Circle(99); // Some random value, for initializing copy

...

copy = c; // copy and c refer to the same object

After assigning c to copy, what happens to the original Circle object with a radius of 99 that you used to initialize copy? Nothing refers to it anymore. In this situation, the runtime can reclaim the memory by performing an operation known as garbage collection, which you will learn more about in Chapter 14. The important thing to understand for now is that garbage collection is a potentially time-consuming operation; you should not create objects that are never used, as doing so is a waste of time and resources.

You could argue that if a variable is going to be assigned a reference to another object at some point in a program, there is no point initializing it. But this is poor programming practice and can lead to problems in your code. For example, you will inevitably find yourself in the situation where you want to refer a variable to an object only if that variable does not already contain a reference, as shown in the following code example:

Circle c = new Circle(42);

Circle copy; // Uninitialized !!!

...

if (copy == /* only assign to copy if it is uninitialized, but what goes here? */)

{

copy = c; // copy and c refer to the same object

...

}

The purpose of the if statement is to test the copy variable to see whether it is initialized, but to which value should you compare this variable? The answer is to use a special value called null.

In C#, you can assign the null value to any reference variable. The null value simply means that the variable does not refer to an object in memory. You can use it like this:

Circle c = new Circle(42);

Circle copy = null; // Initialized

...

if (copy == null)

{

copy = c; // copy and c refer to the same object

...

}

Using Nullable Types

The null value is useful for initializing reference types. Sometimes you need an equivalent value for value types, but null is itself a reference, and so you cannot assign it to a value type. The following statement is therefore illegal in C#:

int i = null; // illegal

However, C# defines a modifier that you can use to declare that a variable is a nullable value type. A nullable value type behaves in a similar manner to the original value type, but you can assign the null value to it. You use the question mark (?) to indicate that a value type is nullable, like this:

int? i = null; // legal

You can ascertain whether a nullable variable contains null by testing it in the same way as a reference type:

if (i == null)

...

You can assign an expression of the appropriate value type directly to a nullable variable. The following examples are all legal:

int? i = null;

int j = 99;

i = 100; // Copy a value type constant to a nullable type

i = j; // Copy a value type variable to a nullable type

You should note that the converse is not true. You cannot assign a nullable variable to an ordinary value type variable. So, given the definitions of variables i and j from the preceding example, the following statement is not allowed:

j = i; // Illegal

This makes sense if you consider that the variable i might contain null, and j is a value type that cannot contain null. This also means that you cannot use a nullable variable as a parameter to a method that expects an ordinary value type. If you recall, the Pass.Value method from the preceding exercise expects an ordinary int parameter, so the following method call will not compile:

int? i = 99;

Pass.Value(i); // Compiler error
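If you do need to hand a nullable value to code that expects an ordinary value type, you must convert it explicitly and decide what should happen when it contains null. The following sketch shows three options that C# and the .NET Framework provide: an explicit cast, the null-coalescing operator (??), and the GetValueOrDefault method. The NullableDemo class and its method names are hypothetical and exist only for illustration.

```csharp
using System;

static class NullableDemo
{
    // Explicit cast: compiles, but throws InvalidOperationException
    // at run time if value is null.
    public static int WithCast(int? value)
    {
        return (int)value;
    }

    // Null-coalescing operator: yields fallback when value is null.
    public static int WithFallback(int? value, int fallback)
    {
        return value ?? fallback;
    }

    // GetValueOrDefault yields the default for the underlying type
    // (0 for int) when value is null.
    public static int WithDefault(int? value)
    {
        return value.GetValueOrDefault();
    }
}

class Program
{
    static void Main()
    {
        int? i = 99;
        int? none = null;

        Console.WriteLine(NullableDemo.WithCast(i));            // 99
        Console.WriteLine(NullableDemo.WithFallback(none, -1)); // -1
        Console.WriteLine(NullableDemo.WithDefault(none));      // 0
    }
}
```

Which option is appropriate depends on whether null is a legitimate state in your program; the cast is suitable only when you are sure that the variable holds a value.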

Understanding the Properties of Nullable Types

A nullable type exposes a pair of properties that you can use to determine whether the type actually has a non-null value, and what this value is. The HasValue property indicates whether a nullable type contains a value or is null, and you can retrieve the value of a non-null nullable type by reading the Value property, like this:

int? i = null;

...

if (!i.HasValue)

{

// If i is null, then assign it the value 99

i = 99;

}

else

{

// If i is not null, then display its value

Console.WriteLine(i.Value);

}

Recall from Chapter 4 that the NOT operator (!) negates a Boolean value. This code fragment tests the nullable variable i, and if it does not have a value (it is null), it assigns it the value 99; otherwise, it displays the value of the variable. In this example, using the HasValue property does not provide any benefit over testing for a null value directly. Additionally, reading the Value property is a long-winded way of reading the contents of the variable. However, these apparent shortcomings are caused by the fact that int? is a very simple nullable type. You can create more complex value types and use them to declare nullable variables where the advantages of using the HasValue and Value properties become more apparent. You will see some examples in Chapter 9.

NOTE

The Value property of a nullable type is read-only. You can use this property to read the value of a variable but not to modify it. To update a nullable variable, use an ordinary assignment statement.

Using ref and out Parameters

Ordinarily, when you pass an argument to a method, the corresponding parameter is initialized with a copy of the argument. This is true regardless of whether the parameter is a value type (such as an int), a nullable type (such as int?), or a reference type (such as a WrappedInt). This arrangement means it’s impossible for any change to the parameter to affect the value of the argument passed in. For example, in the following code, the value output to the console is 42 and not 43. The doIncrement method increments a copy of the argument (arg) and not the original argument:

static void doIncrement(int param)

{

param++;

}

static void Main()

{

int arg = 42;

doIncrement(arg);

Console.WriteLine(arg); // writes 42, not 43

}

In the preceding exercise, you saw that if the parameter to a method is a reference type, any changes made by using that parameter change the data referenced by the argument passed in. The key point is that, although the data that was referenced changed, the argument passed in as the parameter did not—it still references the same object. In other words, although it is possible to modify the object that the argument refers to through the parameter, it’s not possible to modify the argument itself (for example, to set it to refer to a completely different object). Most of the time, this guarantee is very useful and can help to reduce the number of bugs in a program. Occasionally, however, you might want to write a method that actually needs to modify an argument. C# provides the ref and out keywords so that you can do this.
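The distinction between changing the referenced object and changing the argument itself can be sketched as follows. The WrappedInt class matches the one from the exercise, but the Modify and Replace methods are hypothetical names introduced only for this illustration.

```csharp
using System;

class WrappedInt
{
    public int Number;
}

static class Pass
{
    // Changes the object that both the argument and the parameter refer to.
    public static void Modify(WrappedInt param)
    {
        param.Number = 42;
    }

    // Makes the parameter (a local copy of the reference) refer to a new object.
    // The caller's variable still refers to the original object.
    public static void Replace(WrappedInt param)
    {
        param = new WrappedInt();
        param.Number = 99;
    }
}

class Program
{
    static void Main()
    {
        WrappedInt wi = new WrappedInt();

        Pass.Modify(wi);
        Console.WriteLine(wi.Number); // 42

        Pass.Replace(wi);
        Console.WriteLine(wi.Number); // still 42
    }
}
```

The assignment inside Replace affects only the parameter, so wi continues to refer to the object whose Number field was set to 42.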

Creating ref Parameters

If you prefix a parameter with the ref keyword, the C# compiler generates code that passes a reference to the actual argument rather than a copy of the argument. When using a ref parameter, anything you do to the parameter you also do to the original argument because the parameter and the argument both reference the same data. When you pass an argument as a ref parameter, you must also prefix the argument with the ref keyword. This syntax provides a useful visual cue to the programmer that the argument might change. Here’s the preceding example again, this time modified to use the ref keyword:

static void doIncrement(ref int param) // using ref

{

param++;

}

static void Main()

{

int arg = 42;

doIncrement(ref arg); // using ref

Console.WriteLine(arg); // writes 43

}

This time, the doIncrement method receives a reference to the original argument rather than a copy, so any changes the method makes by using this reference actually change the original value. That’s why the value 43 is displayed on the console.

Remember that C# enforces the rule that you must assign a value to a variable before you can read it. This rule also applies to method arguments; you cannot pass an uninitialized value as an argument to a method even if an argument is defined as a ref argument. For example, in the following example, arg is not initialized, so this code will not compile. This failure occurs because the statement param++; inside the doIncrement method is really an alias for the statement arg++; and this operation is allowed only if arg has a defined value:

static void doIncrement(ref int param)

{

param++;

}

static void Main()

{

int arg; // not initialized

doIncrement(ref arg);

Console.WriteLine(arg);

}

Creating out Parameters

The compiler checks whether a ref parameter has been assigned a value before calling the method. However, there might be times when you want the method itself to initialize the parameter. You can do this with the out keyword.

The out keyword is syntactically similar to the ref keyword. You can prefix a parameter with the out keyword so that the parameter becomes an alias for the argument. As when using ref, anything you do to the parameter, you also do to the original argument. When you pass an argument to an out parameter, you must also prefix the argument with the out keyword.

The keyword out is short for output. When you pass an out parameter to a method, the method must assign a value to it before it finishes or returns, as shown in the following example:

static void doInitialize(out int param)

{

param = 42; // Initialize param before finishing

}

The following example does not compile because doInitialize does not assign a value to param:

static void doInitialize(out int param)

{

// Do nothing

}

Because an out parameter must be assigned a value by the method, you’re allowed to call the method without initializing its argument. For example, the following code calls doInitialize to initialize the variable arg, which is then displayed on the console:

static void doInitialize(out int param)

{

param = 42;

}

static void Main()

{

int arg; // not initialized

doInitialize(out arg); // legal

Console.WriteLine(arg); // writes 42

}
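A common use of an out parameter is to let a method report success or failure through its return value while delivering its result through the parameter. The .NET Framework method int.TryParse works this way. The following sketch applies the same pattern to a hypothetical TryDivide method; the MathHelper class and the method name are assumptions made for this example.

```csharp
using System;

static class MathHelper
{
    // Hypothetical helper: returns false instead of throwing when divisor is 0.
    public static bool TryDivide(int dividend, int divisor, out int quotient)
    {
        if (divisor == 0)
        {
            quotient = 0;   // an out parameter must be assigned on every path
            return false;
        }

        quotient = dividend / divisor;
        return true;
    }
}

class Program
{
    static void Main()
    {
        int q; // not initialized; TryDivide assigns it

        bool ok = MathHelper.TryDivide(84, 2, out q);
        Console.WriteLine(ok); // True
        Console.WriteLine(q);  // 42

        ok = MathHelper.TryDivide(1, 0, out q);
        Console.WriteLine(ok); // False
    }
}
```

If you remove the assignment quotient = 0; from the failure branch, the method no longer compiles, because the compiler requires the out parameter to be assigned before the method returns.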

You will examine ref parameters in the next exercise.

Use ref parameters

1. Return to the Parameters project in Visual Studio 2012.

2. Display the Pass.cs file in the Code and Text Editor window.

3. Edit the Value method to accept its parameter as a ref parameter.

The Value method should look like this:

class Pass

{

public static void Value(ref int param)

{

param = 42;

}

...

}

4. Display the Program.cs file in the Code and Text Editor window.

5. Uncomment the first four statements. Notice that the third statement of the doWork method, Pass.Value(i); indicates an error. This is because the Value method now expects a ref parameter. Edit this statement so that the Pass.Value method call passes its argument as a ref parameter.

NOTE

Leave the four statements that create and test the WrappedInt object as they are.

The doWork method should now look like this:

class Program

{

static void doWork()

{

int i = 0;

Console.WriteLine(i);

Pass.Value(ref i);

Console.WriteLine(i);

...

}

}

6. On the DEBUG menu, click Start Without Debugging to build and run the program.

This time, the first two values written to the console window are 0 and 42. This result shows that the call to the Pass.Value method has successfully modified the argument i.

7. Press the Enter key to close the application and return to Visual Studio 2012.

NOTE

You can use the ref and out modifiers on reference type parameters as well as on value type parameters. The effect is exactly the same. The parameter becomes an alias for the argument.
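As a brief sketch of what this means for a reference type, the following code passes a WrappedInt variable as a ref parameter so that the method can make the caller's variable refer to a different object. The Replace method is a hypothetical name and is not part of the Parameters project.

```csharp
using System;

class WrappedInt
{
    public int Number;
}

static class Pass
{
    // param is an alias for the caller's variable, so assigning to param
    // makes the caller's variable refer to the new object.
    public static void Replace(ref WrappedInt param)
    {
        param = new WrappedInt();
        param.Number = 99;
    }
}

class Program
{
    static void Main()
    {
        WrappedInt wi = new WrappedInt();
        WrappedInt original = wi;   // remember the first object

        Pass.Replace(ref wi);

        Console.WriteLine(wi.Number);                            // 99
        Console.WriteLine(object.ReferenceEquals(wi, original)); // False
    }
}
```

Without the ref modifier, the assignment to param would change only the method's own copy of the reference, and wi would still refer to the original object.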

How Computer Memory Is Organized

Computers use memory to hold programs being executed and the data that these programs use. To understand the differences between value and reference types, it is helpful to understand how data is organized in memory.

Operating systems and language runtimes such as that used by C# frequently divide the memory used for holding data into two separate areas, each of which is managed in a distinct manner. These two areas of memory are traditionally called the stack and the heap. The stack and the heap serve different purposes:

§ When you call a method, the memory required for its parameters and its local variables is always acquired from the stack. When the method finishes (because it either returns or throws an exception), the memory acquired for the parameters and local variables is automatically released back to the stack and is available for reuse when another method is called. Method parameters and local variables on the stack have a well-defined life span. They come into existence when the method starts, and they disappear as soon as the method completes.

NOTE

Actually, the same life span applies to variables defined in any block of code enclosed between opening and closing curly braces. In the following code example, the variable i is created when the body of the while loop starts, but it disappears when the while loop finishes and execution continues after the closing brace:

while (...)

{

int i = ...; // i is created on the stack here

...

}

// i disappears from the stack here

§ When you create an object (an instance of a class) by using the new keyword, the memory required to build the object is always acquired from the heap. You have seen that the same object can be referenced from several places by using reference variables. When the last reference to an object disappears, the memory used by the object becomes available for reuse (although it might not be reclaimed immediately). Chapter 14 includes a more detailed discussion of how heap memory is reclaimed. Objects created on the heap therefore have a more indeterminate life span; an object is created by using the new keyword but only disappears sometime after the last reference to the object is removed.

NOTE

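Local variables and parameters that are value types are created on the stack. (A value type that is a field of an object is stored inside that object on the heap.) All reference types (objects) are created on the heap (although the reference itself is on the stack). Nullable types are value types: int? is shorthand for the System.Nullable<int> structure, so a local nullable variable is also created on the stack.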

The names stack and heap come from the way in which the runtime manages the memory:

§ Stack memory is organized like a stack of boxes piled on top of one another. When a method is called, each parameter is put in a box that is placed on top of the stack. Each local variable is likewise assigned a box, and these are placed on top of the boxes already on the stack. When a method finishes, you can think of the boxes being removed from the stack.

§ Heap memory is like a large pile of boxes strewn around a room rather than stacked neatly on top of each other. Each box has a label indicating whether it is in use. When a new object is created, the runtime searches for an empty box and allocates it to the object. The reference to the object is stored in a local variable on the stack. The runtime can determine whether any references to a box remain. (Remember that two variables can refer to the same object.) When the last reference disappears, the runtime marks the box as not in use, and at some point in the future it will empty the box and make it available for reuse.

Using the Stack and the Heap

Now let’s examine what happens when the following method Method is called:

void Method(int param)

{

Circle c;

c = new Circle(param);

...

}

Suppose the argument passed into param is the value 42. When the method is called, a block of memory (just enough for an int) is allocated from the stack and initialized with the value 42. As execution moves inside the method, another block of memory big enough to hold a reference (a memory address) is also allocated from the stack but left uninitialized. (This is for the Circle variable, c.) Next, another piece of memory big enough for a Circle object is allocated from the heap. This is what the new keyword does. The Circle constructor runs to convert this raw heap memory to a Circle object. A reference to this Circle object is stored in the variable c. The following graphic illustrates the situation:

[Figure: param (holding 42) and the reference c are on the stack; c refers to the Circle object on the heap.]

At this point, you should note two things:

§ Although the object is stored on the heap, the reference to the object (the variable c) is stored on the stack.

§ Heap memory is not infinite. If heap memory is exhausted, the new operator will throw an OutOfMemoryException exception and the object will not be created.

NOTE

The Circle constructor could also throw an exception. If it does, the memory allocated to the Circle object will be reclaimed, and no reference will be assigned to the variable c.

When the method ends, the parameters and local variables go out of scope. The memory acquired for c and for param is automatically released back to the stack. The runtime notes that the Circle object is no longer referenced and at some point in the future will arrange for its memory to be reclaimed by the heap. (See Chapter 14.)

The System.Object Class

One of the most important reference types in the Microsoft .NET Framework is the Object class in the System namespace. To fully appreciate the significance of the System.Object class requires that you understand inheritance, which is described in Chapter 12. For the time being, simply accept that all classes are specialized types of System.Object and that you can use System.Object to create a variable that can refer to any reference type. System.Object is such an important class that C# provides the object keyword as an alias for System.Object. In your code, you can use object or you can write System.Object; they mean exactly the same thing.

TIP

Use the object keyword in preference to System.Object. It’s more direct, and it’s consistent with other keywords that are synonyms for classes (such as string for System.String and some others that you’ll discover in Chapter 9).

In the following example, the variables c and o both refer to the same Circle object. The fact that the type of c is Circle and the type of o is object (the alias for System.Object) in effect provides two different views of the same item in memory:

Circle c;

c = new Circle(42);

object o;

o = c;

The following diagram illustrates how the variables c and o refer to the same item on the heap.

[Figure: the variables c and o on the stack both refer to the same Circle object on the heap.]

Boxing

As you have just seen, variables of type object can refer to any item of any reference type. However, variables of type object can also refer to a value type. For example, the following two statements initialize the variable i (of type int, a value type) to 42 and then initialize the variable o (of type object, a reference type) to i:

int i = 42;

object o = i;

The second statement needs a little explanation. Remember that i is a variable of a value type and that it lives on the stack. If the reference inside o referred directly to i, the reference would refer to the stack. However, all references must refer to objects on the heap; creating references to items on the stack could seriously compromise the robustness of the runtime and create a potential security flaw, so it is not allowed. Therefore, the runtime allocates a piece of memory from the heap, copies the value of the integer i to this piece of memory, and then makes o refer to this copy. This automatic copying of an item from the stack to the heap is called boxing. The following graphic shows the result:

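[Diagram: the variable o referring to a boxed copy of the value 42 on the heap, separate from the variable i on the stack]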

IMPORTANT

If you modify the original value of the variable i, the value on the heap referenced through o will not change. Likewise, if you modify the value on the heap, the original value of the variable will not change.
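The independence of the boxed copy can be seen in a few lines. The following is a minimal sketch; the BoxingCopyDemo class and its method name are hypothetical and exist only for this illustration.

```csharp
using System;

public static class BoxingCopyDemo
{
    // Boxes i, changes i afterward, and returns the text of the value still held in the box.
    public static string BoxedValueAfterChangingOriginal()
    {
        int i = 42;
        object o = i;        // boxing: the value 42 is copied to the heap
        i = 99;              // changes only the variable i, not the boxed copy
        return o.ToString(); // the boxed copy still holds 42
    }

    public static void Main()
    {
        int i = 42;
        object o = i;
        i = 99;

        Console.WriteLine(i); // 99
        Console.WriteLine(o); // 42
    }
}
```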

Unboxing

Because a variable of type object can refer to a boxed copy of a value, it’s only reasonable to allow you to get at that boxed value through the variable. You might expect to be able to access the boxed int value that a variable o refers to by using a simple assignment statement such as this:

int i = o;

However, if you try this syntax, you’ll get a compile-time error. If you think about it, it’s pretty sensible that you can’t use the int i = o; syntax. After all, o could be referencing absolutely anything and not just an int. Consider what would happen in the following code if this statement were allowed:

Circle c = new Circle();

int i = 42;

object o;

o = c; // o refers to a circle

i = o; // what is stored in i?

To obtain the value of the boxed copy, you must use what is known as a cast. This is an operation that checks whether it is safe to convert an item of one type to another before it actually makes the copy. You prefix the object variable with the name of the type in parentheses, as in this example:

int i = 42;

object o = i; // boxes

i = (int)o; // compiles okay

The effect of this cast is subtle. The compiler notices that you’ve specified the type int in the cast. Next, the compiler generates code to check what o actually refers to at run time. It could be absolutely anything. Just because your cast says o refers to an int, that doesn’t mean it actually does. If o really does refer to a boxed int and everything matches, the cast succeeds and the compiler-generated code extracts the value from the boxed int and copies it to i. (In this example, the boxed value is then stored in i.) This is called unboxing. The following diagram shows what is happening:

[Diagram: the value in the boxed int referenced by o being copied into the variable i]

On the other hand, if o does not refer to a boxed int, there is a type mismatch, causing the cast to fail. The compiler-generated code throws an InvalidCastException exception at run time. Here’s an example of an unboxing cast that fails:

Circle c = new Circle(42);

object o = c; // doesn't box because Circle is a reference type

int i = (int)o; // compiles okay but throws an exception at run time

The following diagram illustrates this case.

[Diagram: the failed cast, with o referring to a Circle object rather than a boxed int]
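Both outcomes of an unboxing cast, success and InvalidCastException, can be exercised side by side. This sketch assumes a minimal stand-in for the Circle class; the UnboxingDemo class and its TryUnbox method are hypothetical names used only here.

```csharp
using System;

// Minimal stand-in for the Circle class from Chapter 7 (an assumption for this sketch).
class Circle
{
    private int radius;

    public Circle(int radius)
    {
        this.radius = radius;
    }
}

public static class UnboxingDemo
{
    // Tries to unbox o as an int and reports what happened.
    // Intended for non-null arguments only.
    public static string TryUnbox(object o)
    {
        try
        {
            int i = (int)o;  // succeeds only if o refers to a boxed int
            return "unboxed " + i;
        }
        catch (InvalidCastException)
        {
            return "not a boxed int";
        }
    }

    public static void Main()
    {
        int i = 42;
        object boxed = i;                  // boxes the value 42
        object circle = new Circle(42);    // no boxing; Circle is a reference type

        Console.WriteLine(TryUnbox(boxed));  // unboxed 42
        Console.WriteLine(TryUnbox(circle)); // not a boxed int
    }
}
```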

You will use boxing and unboxing in later exercises. Keep in mind that boxing and unboxing are expensive operations because of the amount of checking required and the need to allocate additional heap memory. Boxing has its uses, but injudicious use can severely impair the performance of a program. You will see an alternative to boxing in Chapter 17.

Casting Data Safely

By using a cast, you can specify that, in your opinion, the data referenced by an object has a specific type and that it is safe to reference the object by using that type. The key phrase here is “in your opinion.” The C# compiler will not check that this is the case, but the runtime will. If the type of object in memory does not match the cast, the runtime will throw an InvalidCastException, as described in the preceding section. You should be prepared to catch this exception and handle it appropriately if it occurs.

However, catching an exception and attempting to recover when the type of an object is not what you expected is a rather cumbersome approach. C# provides two more very useful operators that can help you perform casting in a much more elegant manner: the is and as operators.

The is Operator

You can use the is operator to verify that the type of an object is what you expect it to be, like this:

WrappedInt wi = new WrappedInt();
...
object o = wi;
if (o is WrappedInt)
{
    WrappedInt temp = (WrappedInt)o; // This is safe; o is a WrappedInt
    ...
}

The is operator takes two operands: a reference to an object on the left and the name of a type on the right. If the object referenced on the heap has the specified type, is evaluates to true; otherwise, is evaluates to false. The preceding code attempts to cast the reference held in the object variable o only if it knows that the cast will succeed.
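A complete, compilable version of this pattern might look like the following sketch. The WrappedInt class here is an assumed stand-in with a single public Number field, and the IsOperatorDemo class and its Describe method are hypothetical names chosen for illustration.

```csharp
using System;

// Stand-in for the WrappedInt class used earlier in the chapter (assumed shape).
class WrappedInt
{
    public int Number;
}

public static class IsOperatorDemo
{
    // Uses the is operator to test the type before each cast,
    // so neither cast can throw an InvalidCastException.
    public static string Describe(object o)
    {
        if (o is WrappedInt)
        {
            WrappedInt temp = (WrappedInt)o; // safe: o is a WrappedInt
            return "WrappedInt holding " + temp.Number;
        }

        if (o is int)
        {
            int i = (int)o;                  // safe: o refers to a boxed int
            return "boxed int " + i;
        }

        return "something else";
    }

    public static void Main()
    {
        WrappedInt wi = new WrappedInt();
        wi.Number = 7;

        Console.WriteLine(Describe(wi));      // WrappedInt holding 7
        Console.WriteLine(Describe(42));      // boxed int 42
        Console.WriteLine(Describe("hello")); // something else
    }
}
```

Note that the is operator evaluates to false for a null reference, so Describe(null) falls through to the final return statement.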

The as Operator

The as operator fulfills a similar role to is but in a slightly truncated manner. You use the as operator like this:

WrappedInt wi = new WrappedInt();
...
object o = wi;
WrappedInt temp = o as WrappedInt;
if (temp != null)
{
    ... // Cast was successful
}

Like the is operator, the as operator takes an object and a type as its operands. The runtime attempts to cast the object to the specified type. If the cast is successful, the result is returned and, in this example, it is assigned to the WrappedInt variable temp. If the cast is unsuccessful, the as operator evaluates to the null value and assigns that to temp instead.
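The following sketch shows the as operator handling both a matching and a non-matching object without any exception handling. WrappedInt is again an assumed stand-in class, and AsOperatorDemo, NumberOrDefault, and the fallback value of -1 are arbitrary choices made for this example.

```csharp
using System;

// Stand-in for the WrappedInt class used earlier in the chapter (assumed shape).
class WrappedInt
{
    public int Number;
}

public static class AsOperatorDemo
{
    // Returns the wrapped number, or -1 when o does not refer to a WrappedInt.
    public static int NumberOrDefault(object o)
    {
        WrappedInt temp = o as WrappedInt; // null if o is not a WrappedInt
        if (temp != null)
        {
            return temp.Number;            // cast was successful
        }

        return -1;                         // cast failed; no exception was thrown
    }

    public static void Main()
    {
        WrappedInt wi = new WrappedInt();
        wi.Number = 7;

        Console.WriteLine(NumberOrDefault(wi));      // 7
        Console.WriteLine(NumberOrDefault("hello")); // -1
    }
}
```

Because a failed as cast produces null, the target type must be one that can hold null. Writing o as int does not compile; the operator works only with reference types and nullable types.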

There is a little more to the is and as operators than described here, and you will meet them again in Chapter 12.

POINTERS AND UNSAFE CODE

This section is purely for your information and is aimed at developers who are familiar with C or C++. If you are new to programming, feel free to skip this section.

If you have already written programs in languages such as C or C++, much of the discussion in this chapter concerning object references might be familiar. Although neither C nor C++ has reference types in the C# sense, both languages have a construct that provides similar functionality: a pointer.

A pointer is a variable that holds the address of, or a reference to, an item in memory (on the heap or on the stack). A special syntax is used to identify a variable as a pointer. For example, the following statement declares the variable pi as a pointer to an integer:

int *pi;

Although the variable pi is declared as a pointer, it does not actually point anywhere until you initialize it. For example, to use pi to point to the integer variable i, you can use the following statements and the address-of operator (&), which returns the address of a variable:

int *pi;

int i = 99;

...

pi = &i;

You can access and modify the value held in the variable i through the pointer variable pi like this:

*pi = 100;

This code updates the value of the variable i to 100 because pi points to the same memory location as the variable i.

One of the main problems that developers learning C and C++ have is understanding the syntax used by pointers. The * operator has at least two meanings (in addition to being the arithmetic multiplication operator), and there is often great confusion about when to use & rather than *. The other issue with pointers is that it is easy to point somewhere invalid, or to forget to point somewhere at all, and then try to reference the data pointed to. The result will be either garbage or a program that fails with an error because the operating system detects an attempt to access an illegal address in memory. There is also a whole range of security flaws in many existing systems resulting from the mismanagement of pointers; some environments (not Microsoft Windows) fail to enforce checks that a pointer does not refer to memory that belongs to another process, opening up the possibility that confidential data could be compromised.

Reference variables were added to C# to avoid all these problems. If you really want to, you can continue to use pointers in C#, but you must mark the code as unsafe. The unsafe keyword can be used to mark a block of code, or an entire method, as shown here:

public static void Main(string[] args)
{
    int x = 99, y = 100;
    unsafe
    {
        swap(&x, &y);
    }
    Console.WriteLine("x is now {0}, y is now {1}", x, y);
}

public static unsafe void swap(int *a, int *b)
{
    int temp;
    temp = *a;
    *a = *b;
    *b = temp;
}

When you compile programs containing unsafe code, you must specify the Allow Unsafe Code option when building the project. To do this, right-click the project in Solution Explorer and then click Properties. In the Properties window, click the Build tab, select Allow Unsafe Code, and then on the FILE menu, click Save All.

Unsafe code also has a bearing on how memory is managed. Objects created in unsafe code are said to be unmanaged. Although not common, you might find some situations that require you to access memory in this way, especially if you are writing code that needs to perform some low-level Windows operations.

You will learn about the implications of using code that accesses unmanaged memory in more detail in Chapter 14.
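For comparison with the pointer-based swap method shown in this section, the same result can be obtained in ordinary, safe C# by using ref parameters. This is a sketch rather than a recommendation from the text; the SafeSwapDemo class and its Swap method are names invented for this example.

```csharp
using System;

public static class SafeSwapDemo
{
    // Each ref parameter is an alias for the caller's variable,
    // so no pointers and no unsafe block are needed.
    public static void Swap(ref int a, ref int b)
    {
        int temp = a;
        a = b;
        b = temp;
    }

    public static void Main()
    {
        int x = 99, y = 100;
        Swap(ref x, ref y);
        Console.WriteLine("x is now {0}, y is now {1}", x, y); // x is now 100, y is now 99
    }
}
```

This version compiles without the Allow Unsafe Code option because it never takes the address of a variable.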

Summary

In this chapter, you learned about some important differences between value types, which hold their value directly on the stack, and reference types, which refer indirectly to their objects on the heap. You also learned how to use the ref and out keywords on method parameters to gain access to the arguments. You saw how assigning a value (such as the int 42) to a variable of the System.Object class creates a boxed copy of the value on the heap and then causes the System.Object variable to refer to this boxed copy. You also saw how casting a variable of the System.Object class that refers to a boxed value back to its value type (such as an int) copies (or unboxes) the value from the boxed copy on the heap to the memory used by the int.

§ If you want to continue to the next chapter

Keep Visual Studio 2012 running, and turn to Chapter 9.

§ If you want to exit Visual Studio 2012 now

On the FILE menu, click Exit. If you see a Save dialog box, click Yes and save the project.

Chapter 8 Quick Reference

To

Do this

Copy a value type variable

Simply make the copy. Because the variable is a value type, you will have two copies of the same value. For example:

int i = 42;

int copyi = i;

Copy a reference type variable

Simply make the copy. Because the variable is a reference type, you will have two references to the same object. For example:

Circle c = new Circle(42);

Circle refc = c;

Declare a variable that can hold a value type or the null value

Declare the variable using the ? modifier with the type. For example:

int? i = null;

Pass an argument to a ref parameter

Prefix the argument with the ref keyword. This makes the parameter an alias for the actual argument rather than a copy of the argument. The method may change the value of the parameter, and this change is made to the actual argument rather than a local copy. For example:

static void Main()

{

int arg = 42;

DoWork(ref arg);

Console.WriteLine(arg);

}

Pass an argument to an out parameter

Prefix the argument with the out keyword. This makes the parameter an alias for the actual argument rather than a copy of the argument. The method must assign a value to the parameter, and this value is assigned to the actual argument. For example:

static void Main()

{

int arg;

DoWork(out arg);

Console.WriteLine(arg);

}

Box a value

Initialize or assign a variable of type object to the value. For example:

object o = 42;

Unbox a value

Cast the object reference that refers to the boxed value to the type of the value variable. For example:

int i = (int)o;

Cast an object safely

Use the is operator to test whether the cast is valid. For example:

WrappedInt wi = new WrappedInt();

...

object o = wi;

if (o is WrappedInt)

{

WrappedInt temp = (WrappedInt)o;

...

}

Alternatively, use the as operator to perform the cast, and test whether the result is null. For example:

WrappedInt wi = new WrappedInt();

...

object o = wi;

WrappedInt temp = o as WrappedInt;

if (temp != null)

...