Enumerating Collections - Microsoft® Visual C#® 2012 Step by Step (2012)

Microsoft® Visual C#® 2012 Step by Step (2012)

Chapter 19. Enumerating Collections

After completing this chapter, you will be able to

§ Manually define an enumerator that can be used to iterate over the elements in a collection.

§ Implement an enumerator automatically by creating an iterator.

§ Provide additional iterators that can step through the elements of a collection in different sequences.

In Chapter 10, and Chapter 18, you learned about arrays and collection classes for holding sequences or sets of data. Chapter 10 also introduced the foreach statement that you can use for stepping through, or iterating over, the elements in a collection. In these chapters, you used the foreachstatement as a quick and convenient way of accessing the contents of an array or collection, but now it is time to learn a little more about how this statement actually works. This topic becomes important when you define your own collection classes, and this chapter describes how you can make collections enumerable.

Enumerating the Elements in a Collection

In Chapter 10, you saw an example of using the foreach statement to list the items in a simple array. The code looked like this:

int[] pins = { 9, 3, 7, 2 };

foreach (int pin in pins)

{

Console.WriteLine(pin);

}

The foreach construct provides an elegant mechanism that greatly simplifies the code you need to write, but it can be exercised only under certain circumstances—you can use foreach only to step through an enumerable collection.

So, what exactly is an enumerable collection? The quick answer is that it is a collection that implements the System.Collections.IEnumerable interface.

NOTE

Remember that all arrays in C# are actually instances of the System.Array class. The System.Array class is a collection class that implements the IEnumerable interface.

The IEnumerable interface contains a single method called GetEnumerator:

IEnumerator GetEnumerator();

The GetEnumerator method should return an enumerator object that implements the System.Collections.IEnumerator interface. The enumerator object is used for stepping through (enumerating) the elements of the collection. The IEnumerator interface specifies the following property and methods:

object Current { get; }

bool MoveNext();

void Reset();

Think of an enumerator as a pointer pointing to elements in a list. Initially, the pointer points before the first element. You call the MoveNext method to move the pointer down to the next (first) item in the list; the MoveNext method should return true if there actually is another item and false if there isn’t. You use the Current property to access the item currently pointed to, and you use the Reset method to return the pointer back to before the first item in the list. By creating an enumerator by using the GetEnumerator method of a collection and repeatedly calling the MoveNextmethod and retrieving the value of the Current property by using the enumerator, you can move forward through the elements of a collection one item at a time. This is exactly what the foreach statement does. So if you want to create your own enumerable collection class, you must implement the IEnumerable interface in your collection class and also provide an implementation of the IEnumerator interface to be returned by the GetEnumerator method of the collection class.

IMPORTANT

At first glance, it is easy to confuse the IEnumerable and IEnumerator interfaces because of the similarity of their names. Don’t get them mixed up.

If you are observant, you will have noticed that the Current property of the IEnumerator interface exhibits non-type-safe behavior in that it returns an object rather than a specific type. However, you should be pleased to know that the Microsoft .NET Framework class library also provides the generic IEnumerator<T> interface, which has a Current property that returns a T instead. Likewise, there is also an IEnumerable<T> interface containing a GetEnumerator method that returns an Enumerator<T> object. Both of these interfaces are defined in the System.Collections.Genericnamespace, and if you are building applications for the .NET Framework version 2.0 or later, you should make use of these generic interfaces when defining enumerable collections rather than using the nongeneric version.

NOTE

The generic IEnumerator<T> interface has some further differences from the nongeneric IEnumerator interface: it does not contain a Reset method but extends the IDisposable interface.

Manually Implementing an Enumerator

In the next exercise, you will define a class that implements the generic IEnumerator<T> interface and create an enumerator for the binary tree class that you built in Chapter 17

In Chapter 17, you saw how easy it is to traverse a binary tree and display its contents. You would therefore be inclined to think that defining an enumerator that retrieves each element in a binary tree in the same order would be a simple matter. Sadly, you would be mistaken. The main problem is that when defining an enumerator you need to remember where you are in the structure so that subsequent calls to the MoveNext method can update the position appropriately. Recursive algorithms, such as that used when walking a binary tree, do not lend themselves to maintaining state information between method calls in an easily accessible manner. For this reason, you will first preprocess the data in the binary tree into a more amenable data structure (a queue) and actually enumerate this data structure instead. Of course, this deviousness is hidden from the user iterating through the elements of the binary tree!

Create the TreeEnumerator class

1. Start Microsoft Visual Studio 2012 if it is not already running.

2. Open the BinaryTree solution located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 19\Windows X\BinaryTree folder in your Documents folder. This solution contains a working copy of the BinaryTree project you created in Chapter 17. You will add a new class to this project, to implement the enumerator for the BinaryTree class.

3. In Solution Explorer, click the BinaryTree project. On the PROJECT menu, click Add Class. In the middle pane of the Add New Item – BinaryTree dialog box, select the Class template, type TreeEnumerator.cs in the Name text box, and then click Add.

The TreeEnumerator class will generate an enumerator for a Tree<TItem> object. To ensure that the class is type safe, you must provide a type parameter and implement the IEnumerator<T> interface. Also, the type parameter must be a valid type for the Tree<TItem> object that the class enumerates, so it must be constrained to implement the IComparable<TItem> interface (the BinaryTree class requires that items in the tree provide a means to allow them to be compared, for sorting purposes).

4. In the Code and Text Editor window displaying the TreeEnumerator.cs file, modify the definition of the TreeEnumerator class to satisfy these requirements, as shown in bold in the following example:

5. class TreeEnumerator<TItem> : IEnumerator<TItem> where TItem : IComparable<TItem>

6. {

}

7. Add the following three private variables shown below in bold to the TreeEnumerator<TItem> class:

8. class TreeEnumerator<TItem> : IEnumerator<TItem> where TItem : IComparable<TItem>

9. {

10. private Tree<TItem> currentData = null;

11. private TItem currentItem = default(TItem);

12. private Queue<TItem> enumData = null;

}

The currentData variable will be used to hold a reference to the tree being enumerated, and the currentItem variable will hold the value returned by the Current property. You will populate the enumData queue with the values extracted from the nodes in the tree, and the MoveNextmethod will return each item from this queue in turn. The default keyword is explained in the section titled Initializing a Variable Defined with a Type Parameter later in this chapter.

13.Add a TreeEnumerator constructor that takes a single Tree<TItem> parameter called data. In the body of the constructor, add a statement that initializes the currentData variable to data:

14.class TreeEnumerator<TItem> : IEnumerator<TItem> where TItem : IComparable<TItem>

15.{

16. public TreeEnumerator(Tree<TItem> data)

17. {

18. this.currentData = data;

19. }

20. ...

}

21.Add the following private method, called populate, to the TreeEnumerator<TItem> class immediately after the constructor:

22.private void populate(Queue<TItem> enumQueue, Tree<TItem> tree)

23.{

24. if (tree.LeftTree != null)

25. {

26. populate(enumQueue, tree.LeftTree);

27. }

28.

29. enumQueue.Enqueue(tree.NodeData);

30.

31. if (tree.RightTree != null)

32. {

33. populate(enumQueue, tree.RightTree);

34. }

}

This method walks the binary tree, adding the data it contains to the queue. The algorithm used is similar to that used by the WalkTree method in the Tree<TItem> class, which was described in Chapter 17. The main difference is that rather than the method appending NodeData values to a string, it stores these values in the queue.

35.Return to the definition of the TreeEnumerator<TItem> class. Right-click anywhere in the text IEnumerator<TItem> in the class declaration, point to Implement Interface, and then click Implement Interface Explicitly.

This action generates stubs for the methods of the IEnumerator<TItem> interface and the IEnumerator interface, and adds them to the end of the class. It also generates the Dispose method for the IDisposable interface.

NOTE

The IEnumerator<TItem> interface inherits from the IEnumerator and IDisposable interfaces, which is why their methods also appear. In fact, the only item that belongs to the IEnumerator<TItem> interface is the generic Current property. The MoveNext and Reset methods belong to the nongeneric IEnumerator interface. Chapter 14, describes the IDisposable interface.

36.Examine the code that has been generated. The bodies of the properties and methods contain a default implementation that simply throws a NotImplementedException exception. You will replace this code with a real implementation in the following steps.

37.Update the body of the MoveNext method with the code shown in bold here:

38.bool System.Collections.IEnumerator.MoveNext()

39.{

40. if (this.enumData == null)

41. {

42. this.enumData = new Queue<TItem>();

43. populate(this.enumData, this.currentData);

44. }

45. if (this.enumData.Count > 0)

46. {

47. this.currentItem = this.enumData.Dequeue();

48. return true;

49. }

50. return false;

}

The purpose of the MoveNext method of an enumerator is actually twofold. The first time it is called, it should initialize the data used by the enumerator and advance to the first piece of data to be returned. (Prior to MoveNext being called for the first time, the value returned by theCurrent property is undefined and should result in an exception.) In this case, the initialization process consists of instantiating the queue and then calling the populate method to fill the queue with data extracted from the tree.

Subsequent calls to the MoveNext method should just move through data items until there are no more left, dequeuing items until the queue is empty, as in this example. It is important to bear in mind that MoveNext does not actually return data items—that is the purpose of the Current property. All MoveNext does is update the internal state in the enumerator (that is, the value of the currentItem variable is set to the data item extracted from the queue) for use by the Current property, returning true if there is a next value and false otherwise.

51.Modify the definition of the get accessor of the generic Current property as follows in bold:

52.TItem IEnumerator<TItem>.Current

53.{

54. get

55. {

56. if (this.enumData == null)

57. {

58. throw new InvalidOperationException("Use MoveNext before calling Current");

59. }

60.

61. return this.currentItem;

62. }

}

IMPORTANT

Be sure to add the code to the correct implementation of the Current property. Leave the nongeneric version, System.Collections.IEnumerator.Current, with its default implementation that throws a NotImplementedException exception.

The Current property examines the enumData variable to ensure that MoveNext has been called. (This variable will be null prior to the first call to MoveNext.) If this is not the case, the property throws an InvalidOperationException—this is the conventional mechanism used by .NET Framework applications to indicate that an operation cannot be performed in the current state. If MoveNext has been called beforehand, it will have updated the currentItem variable, so all the Current property needs to do is return the value in this variable.

63.Locate the IDisposable.Dispose method. Comment out the throw new NotImplementedException(); statement as follows in bold. The enumerator does not use any resources that require explicit disposal, so this method does not need to do anything. It must still be present, however. For more information about the Dispose method, refer to Chapter 14.

64.void IDisposable.Dispose()

65.{

66. // throw new NotImplementedException();

}

67.Build the solution, and correct any errors that are reported.

INITIALIZING A VARIABLE DEFINED WITH A TYPE PARAMETER

You should have noticed that the statement that defines and initializes the currentItem variable uses the default keyword. The currentItem variable is defined by using the type parameter TItem. When the program is written and compiled, the actual type that will be substituted for TItem might not be known—this issue is resolved only when the code is executed. This makes it difficult to specify how the variable should be initialized. The temptation is to set it to null. However, if the type substituted for TItem is a value type, this is an illegal assignment. (You cannot set value types to null, only reference types.) Similarly, if you set it to 0 with the expectation that the type will be numeric, this will be illegal if the type used is actually a reference type. There are other possibilities as well—TItem could be a boolean, for example. The default keyword solves this problem. The value used to initialize the variable will be determined when the statement is executed. If TItem is a reference type, default(TItem) returns null; if TItem is numeric, default(TItem) returns 0; if TItem is a boolean, default(TItem) returns false. If TItem is a struct, the individual fields in the struct are initialized in the same way. (Reference fields are set to null, numeric fields are set to 0, and boolean fields are set to false.)

Implementing the IEnumerable Interface

In the following exercise, you will modify the binary tree class to implement the IEnumerable<T> interface. The GetEnumerator method will return a TreeEnumerator<TItem> object.

Implement the IEnumerable<TItem> interface in the Tree<TItem> class

1. In Solution Explorer, double-click the file Tree.cs to display the Tree<TItem> class in the Code and Text Editor window.

2. Modify the definition of the Tree<TItem> class so that it implements the IEnumerable<TItem> interface, as shown in bold in the following code:

public class Tree<TItem> : IEnumerable<TItem> where TItem : IComparable<TItem>

Notice that constraints are always placed at the end of the class definition.

3. Right-click the IEnumerable<TItem> interface in the class definition, point to Implement Interface, and then click Implement Interface Explicitly.

This action generates implementations of the IEnumerable<TItem>.GetEnumerator and IEnumerable.GetEnumerator methods and adds them to the class. The nongeneric IEnumerable interface method is implemented because the generic IEnumerable<TItem> interface inherits fromIEnumerable.

4. Locate the generic IEnumerable<TItem>.GetEnumerator method near the end of the class. Modify the body of the GetEnumerator() method, replacing the existing throw statement as shown in bold below:

5. IEnumerator<TItem> IEnumerable<TItem>.GetEnumerator()

6. {

7. return new TreeEnumerator<TItem>(this);

}

The purpose of the GetEnumerator method is to construct an enumerator object for iterating through the collection. In this case, all you need to do is build a new TreeEnumerator<TItem> object by using the data in the tree.

8. Build the solution. Correct any errors that are reported, and rebuild the solution if necessary.

You will now test the modified Tree<TItem> class by using a foreach statement to iterate through a binary tree and display its contents.

Test the enumerator

1. In Solution Explorer, right-click the BinaryTree solution, point to Add, and then click New Project. Add a new project by using the Console Application template. Name the project EnumeratorTest, set the Location to \Microsoft Press\Visual CSharp Step By Step\Chapter 19 in your Documents folder, and then click OK.

NOTE

Make sure that you select the Console Application template from the list of Visual C# templates. Sometimes the Add New Project dialog box displays the templates for Visual Basic or C++ by default.

2. Right-click the EnumeratorTest project in Solution Explorer, and then click Set as StartUp Project.

3. On the PROJECT menu, click Add Reference. In the Add Reference dialog box, in the left pane click Solution, in the middle pane select the BinaryTree project, and then click OK.

The BinaryTree assembly appears in the list of references for the EnumeratorTest project in Solution Explorer.

4. In the Code and Text Editor window displaying the Program class, add the following using directive to the list at the top of the file:

using BinaryTree;

5. Add the following statements shown in bold to the Main method. These statements create and populate a binary tree of integers:

6. static void Main(string[] args)

7. {

8. Tree<int> tree1 = new Tree<int>(10);

9. tree1.Insert(5);

10. tree1.Insert(11);

11. tree1.Insert(5);

12. tree1.Insert(-12);

13. tree1.Insert(15);

14. tree1.Insert(0);

15. tree1.Insert(14);

16. tree1.Insert(-8);

17. tree1.Insert(10);

}

18.Add a foreach statement, as follows in bold, that enumerates the contents of the tree and displays the results:

19.static void Main(string[] args)

20.{

21. ...

22. foreach (int item in tree1)

23. {

24. Console.WriteLine(item);

25. }

}

26.On the DEBUG menu, click Start Without Debugging.

The program runs and displays the values in the following sequence:

–12, –8, 0, 5, 5, 10, 10, 11, 14, 15

image with no caption

27.Press Enter to return to Visual Studio 2012.

Implementing an Enumerator by Using an Iterator

As you can see, the process of making a collection enumerable can become complex and potentially error-prone. To make life easier, C# provides iterators that can automate much of this process.

An iterator is a block of code that yields an ordered sequence of values. An iterator is not actually a member of an enumerable class; rather, it specifies the sequence that an enumerator should use for returning its values. In other words, an iterator is just a description of the enumeration sequence that the C# compiler can use for creating its own enumerator. This concept requires a little thought to understand properly, so consider the following simple example.

A Simple Iterator

The following BasicCollection<T> class illustrates the principles of implementing an iterator. The class uses a List<T> object for holding data and provides the FillList method for populating this list. Notice also that the BasicCollection<T> class implements the IEnumerable<T> interface. The GetEnumerator method is implemented by using an iterator:

using System;

using System.Collections.Generic;

using System.Collections;

class BasicCollection<T> : IEnumerable<T>

{

private List<T> data = new List<T>();

public void FillList(params T [] items)

{

foreach (var datum in items)

{

data.Add(datum);

}

}

IEnumerator<T> IEnumerable<T>.GetEnumerator()

{

foreach (var datum in data)

{

yield return datum;

}

}

IEnumerator IEnumerable.GetEnumerator()

{

// Not implemented in this example

throw new NotImplementedException();

}

}

The GetEnumerator method appears to be straightforward, but it warrants closer examination. The first thing you should notice is that it doesn’t appear to return an IEnumerator<T> type. Instead, it loops through the items in the data array, returning each item in turn. The key point is the use of the yield keyword. The yield keyword indicates the value that should be returned by each iteration. If it helps, you can think of the yield statement as calling a temporary halt to the method, passing back a value to the caller. When the caller needs the next value, the GetEnumerator method continues at the point it left off, looping around and then yielding the next value. Eventually, the data is exhausted, the loop finishes, and the GetEnumerator method terminates. At this point, the iteration is complete.

Remember that this is not a normal method in the usual sense. The code in the GetEnumerator method defines an iterator. The compiler uses this code to generate an implementation of the IEnumerator<T> class containing a Current method and a MoveNext method. This implementation exactly matches the functionality specified by the GetEnumerator method. You don’t actually get to see this generated code (unless you decompile the assembly containing the compiled code), but that is a small price to pay for the convenience and reduction in code that you need to write. You can invoke the enumerator generated by the iterator in the usual manner, as shown in this block of code, which sorts the words in the first line of the poem “Jabberwocky” by Lewis Carroll:

BasicCollection<string> bc = new BasicCollection<string>();

bc.FillList("Twas", "brillig", "and", "the", "slithy", "toves");

foreach (string word in bc)

{

Console.WriteLine(word);

}

This code simply outputs the contents of the bc object in this order:

Twas, brillig, and, the, slithy, toves

If you want to provide alternative iteration mechanisms presenting the data in a different sequence, you can implement additional properties that implement the IEnumerable interface and that use an iterator for returning data. For example, the Reverse property of the BasicCollection<T>class, shown here, emits the data in the list in reverse order:

class BasicCollection<T> : IEnumerable<T>

{

...

public IEnumerable<T> Reverse

{

get

{

for (int i = data.Count - 1; i >= 0; i--)

{

yield return data[i];

}

}

}

}

You can invoke this property as follows:

BasicCollection<string> bc = new BasicCollection<string>();

bc.FillList("Twas", "brillig", "and", "the", "slithy", "toves");

foreach (string word in bc.Reverse)

{

Console.WriteLine(word);

}

This code outputs the contents of the bc object in reverse order:

toves, slithy, the, and, brillig, Twas

Defining an Enumerator for the Tree<TItem> Class by Using an Iterator

In the next exercise, you will implement the enumerator for the Tree<TItem> class by using an iterator. Unlike the preceding set of exercises, which required the data in the tree to be preprocessed into a queue by the MoveNext method, you can define an iterator that traverses the tree by using the more naturally recursive mechanism, similar to the WalkTree method discussed in Chapter 17.

Add an enumerator to the Tree<TItem> class

1. Using Visual Studio 2012, open the BinaryTree solution located in the \Microsoft Press\Visual CSharp Step By Step\Chapter 19\Windows X\IteratorBinaryTree folder in your Documents folder. This solution contains another copy of the BinaryTree project you created in Chapter 17.

2. Open the file Tree.cs in the Code and Text Editor window. Modify the definition of the Tree<TItem> class so that it implements the IEnumerable<TItem> interface, as shown in bold below:

3. public class Tree<TItem> : IEnumerable<TItem> where TItem : IComparable<TItem>

4. {

5. ...

}

6. Right-click the IEnumerable<TItem> interface in the class definition, point to Implement Interface, and then click Implement Interface Explicitly.

The IEnumerable<TItem>.GetEnumerator and IEnumerable.GetEnumerator methods are added to the end of the class.

7. Locate the generic IEnumerable<TItem>.GetEnumerator method. Replace the contents of the GetEnumerator method as shown in bold in the following code:

8. IEnumerator<TItem> IEnumerable<TItem>.GetEnumerator()

9. {

10. if (this.LeftTree != null)

11. {

12. foreach (TItem item in this.LeftTree)

13. {

14. yield return item;

15. }

16. }

17.

18. yield return this.NodeData;

19.

20. if (this.RightTree != null)

21. {

22. foreach (TItem item in this.RightTree)

23. {

24. yield return item;

25. }

26. }

}

It might not be obvious at first glance, but this code follows the same recursive algorithm that you used in Chapter 17 for listing the contents of a binary tree. If LeftTree is not empty, the first foreach statement implicitly calls the GetEnumerator method (which you are currently defining) over it. This process continues until a node is found that has no left subtree. At this point, the value in the NodeData property is yielded, and the right subtree is examined in the same way. When the right subtree is exhausted, the process unwinds to the parent node, outputting the parent’s NodeData property and examining the right subtree of the parent. This course of action continues until the entire tree has been enumerated and all the nodes have been output.

Test the new enumerator

1. In Solution Explorer, right-click the BinaryTree solution, point to Add, and then click Existing Project. In the Add Existing Project dialog box, move to the folder \Microsoft Press\Visual CSharp Step By Step\Chapter 19\Windows X\BinaryTree\EnumeratorTest, select the EnumeratorTest project file, and then click Open.

This is the project that you created to test the enumerator you developed manually earlier in this chapter.

2. Right-click the EnumeratorTest project in Solution Explorer, and then click Set as StartUp Project.

3. In Solution Explorer, expand the References folder for the EnumeratorTest project. Right-click the BinaryTree reference and then click Remove.

4. On the PROJECT menu, click Add Reference. In the Add Reference dialog box, in the left pane click Solution, in the middle pane select the BinaryTree project, and then click OK.

NOTE

These two steps ensure that the EnumeratorTest project references the correct version of the BinaryTree assembly. It should use the assembly that implements the enumerator by using the iterator rather than the version created in the previous set of exercises in this chapter.

5. Display the Program.cs file for the EnumeratorTest project in the Code and Text Editor window. Review the Main method in the Program.cs file. Recall from testing the earlier enumerator that this method instantiates a Tree<int> object, fills it with some data, and then uses a foreachstatement to display its contents.

6. Build the solution, correcting any errors if necessary.

7. On the DEBUG menu, click Start Without Debugging.

The program runs and displays the values in the same sequence as before:

–12, –8, 0, 5, 5, 10, 10, 11, 14, 15

8. Press Enter and return to Visual Studio 2012.

Summary

In this chapter, you saw how to implement the IEnumerable<T> and IEnumerator<T> interfaces with a collection class to enable applications to iterate through the items in the collection. You also saw how to implement an enumerator by using an iterator.

§ If you want to continue to the next chapter

Keep Visual Studio 2012 running, and turn to Chapter 20.

§ If you want to exit Visual Studio 2012 now

On the FILE menu, click Exit. If you see a Save dialog box, click Yes and save the project.

Chapter 19 Quick Reference

To

Do this

Make a collection class enumerable, allowing it to support the foreach construct

Implement the IEnumerable interface, and provide a GetEnumerator method that returns an IEnumerator object. For example:

public class Tree<TItem> : IEnumerable<TItem>

{

...

IEnumerator<TItem> GetEnumerator()

{

...

}

}

Implement an enumerator without using an iterator

Define an enumerator class that implements the IEnumerator interface and that provides the Current property and the MoveNext method (and optionally the Reset method). For example:

public class TreeEnumerator<TItem> : IEnumerator<TItem>

{

...

TItem Current

{

get

{

...

}

}

bool MoveNext()

{

...

}

}

Define an enumerator by using an iterator

Implement the enumerator to indicate which items should be returned (using the yield statement) and in which order. For example:

IEnumerator<TItem> GetEnumerator()

{

for (...)

{

yield return ...

}

}