Working with Arrays and Collections - Programming in C# - Sams Teach Yourself C# 5.0 in 24 Hours (2013)

Part II: Programming in C#

Hour 10. Working with Arrays and Collections


What You’ll Learn in This Hour

Single and multidimensional arrays

Indexers

Generic collections

Collection initializers

Collection interfaces

Enumerable objects and iterators


The majority of problems solved by computer programs involve working with large amounts of data. Sometimes there are many individual, unrelated pieces of data, but often there are large amounts of related data. C# provides a rich set of collection types that enable you to manage large amounts of data.

An array is the simplest type of collection and is the only one that has direct language support. Although the other collection types offer more flexibility than arrays, including the ability to create your own specialized collections, they are generally used in similar ways.

In this hour, you learn to work with different types of arrays, including multidimensional and jagged arrays. When you understand how to work with arrays, you move to using some of the different collection classes provided by the .NET Framework, of which there are more than 40 different classes, base classes, and interfaces.

Single and Multidimensional Arrays

An array is a numerically indexed collection of objects that are all the same type. Although C# provides direct language support for creating arrays, the common type system means that you are implicitly creating a type derived from System.Array. As a result, arrays in C# are reference types, and the array object itself is allocated on the managed heap. The elements of the array, which are the individual items contained in the array, are stored according to their own type: elements of a value type are stored directly in the array, whereas elements of a reference type are references to objects allocated separately.

In its simplest form, the declaration of a variable of array type looks like this:

type[] identifier;


Note: Arrays

Arrays in C# are different from arrays in C because they are actually objects of type System.Array. As a result, C# arrays provide the power and flexibility afforded by classes through properties and methods with the simple syntax offered by C style arrays.


The type indicates the type of each element that will be contained in the array. Because the type is declared only once, all the elements in the array must be of that same type. The square brackets are the index operator and tell the compiler that you are declaring an array of the given type; they are the only difference between an array declaration and a regular variable declaration. In contrast to other languages, such as C, the size of a dimension is specified when the array is instantiated rather than when it is declared.

To create an array that can contain five integer values, you can specify it like this:

int[] array = new int[5];


Caution: Array Sizes

In C#, the size of an array, obtained through the Length property, is the total number of elements in all the dimensions of the array, not the upper bound of the array.

Because C# uses a zero-based indexing scheme for arrays, the first element in an array is at position 0. Therefore, the following statement declares an array of five elements with indices 0 through 4:

int[] array = new int[5];

The length of the array is 5, but the upper bound is 4.
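The difference between the length and the upper bound can be checked with a short sketch. The class name ArraySizeSketch and the Describe helper are hypothetical names used only for illustration; Length, GetUpperBound, and GetLength are members of System.Array.

```csharp
using System;

static class ArraySizeSketch
{
    // Returns { Length, upper bound of dimension 0 } for a
    // one-dimensional array of the requested size.
    public static int[] Describe(int size)
    {
        int[] array = new int[size];
        return new int[] { array.Length, array.GetUpperBound(0) };
    }

    static void Main()
    {
        int[] result = Describe(5);
        Console.WriteLine("Length = {0}, upper bound = {1}", result[0], result[1]);

        // For a multidimensional array, Length counts every element in
        // every dimension; GetLength reports the size of one dimension.
        int[,] table = new int[5, 2];
        Console.WriteLine("Length = {0}, rows = {1}, columns = {2}",
            table.Length, table.GetLength(0), table.GetLength(1));
    }
}
```

For the five-element array, the first line reports a length of 5 and an upper bound of 4; for the 5 x 2 array, Length is 10.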


This type of array declaration creates a single-dimensional rectangular array. The length of each row in a rectangular array must be the same size. This restriction results in a rectangular shape and is what gives rectangular arrays their name. To declare a multidimensional rectangular array, you specify the number of dimensions, or rank of the array, using commas inside the square brackets. The most common multidimensional arrays are two-dimensional arrays, which can be thought of as rows and columns. An array cannot have more than 32 dimensions. For example, to create a two-dimensional array that represents a 5 x 2 table of integer values, you can specify it like this:

int[,] array = new int[5, 2];

In addition to rectangular multidimensional arrays, C# also supports jagged arrays. Because each element of a jagged array is an array itself, each row of the array does not need to be the same size like it does for a rectangular array.


Note: Jagged Rectangular Arrays

In the following code, you create a two-dimensional array a with six elements (three rows of two elements) and a one-dimensional array j whose elements are a one-dimensional array with one, two, and three elements, respectively:

int[,] a = {
    {10, 20},
    {30, 40},
    {50, 60} };

int[][] j = {
    new[] {10},
    new[] {20, 30},
    new[] {40, 50, 60} };

Things can get confusing when you combine jagged and rectangular arrays. What is the type of the following?

int[,][] j2;

This is actually a two-dimensional array whose elements are each a one-dimensional array. To create such an array with three rows of two elements, you would write

j2 = new int[3,2][];

For this reason, it is almost always better if you can use one of the generic collections. It is completely clear that List<int[,]> means a list of two-dimensional arrays, whereas List<int>[,] means a two-dimensional array whose elements are List<int>.


The type system requires that all variables be initialized before use and provides default values for each data type. Arrays are no different. For arrays containing numeric elements, each element is initially set to 0; for arrays containing reference types, including string types, each element is initially null. Because the elements of a jagged array are other arrays, they are initially null as well.
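As a minimal sketch of these default values, the following program creates three arrays and inspects their first elements. The class and method names are hypothetical and exist only for this illustration.

```csharp
using System;

static class ArrayDefaultsSketch
{
    // Returns true when freshly created arrays hold the default
    // value of their element type.
    public static bool ElementsStartAtDefaults()
    {
        int[] numbers = new int[3];      // numeric elements start at 0
        string[] names = new string[3];  // reference-type elements start as null
        int[][] jagged = new int[3][];   // the inner arrays are not created yet

        return numbers[0] == 0 && names[0] == null && jagged[0] == null;
    }

    static void Main()
    {
        Console.WriteLine(ElementsStartAtDefaults());

        // Each row of a jagged array must be created before it can be indexed.
        int[][] rows = new int[2][];
        rows[0] = new int[1];
        rows[1] = new int[3];
        Console.WriteLine(rows[1].Length);
    }
}
```

Because the rows of a jagged array start out as null, indexing into a row before creating it results in a NullReferenceException.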

Array Indexing

For arrays to be useful, it is necessary to access specific elements of the array. This is done by enclosing the numeric position of the desired element in the index operator. To access an element of a multidimensional array, you must provide an index for every dimension, separated by commas; for a jagged array, you provide one index operator for each level of nesting.
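The two indexing styles can be compared side by side. This is only a sketch; the class ArrayIndexingSketch and its two Sum methods are hypothetical helpers, and the sample data reuses the a and j arrays from the earlier note.

```csharp
using System;

static class ArrayIndexingSketch
{
    // Sums a rectangular array using one index per dimension.
    public static int SumRectangular(int[,] values)
    {
        int sum = 0;
        for (int row = 0; row < values.GetLength(0); row++)
        {
            for (int column = 0; column < values.GetLength(1); column++)
            {
                sum += values[row, column];
            }
        }
        return sum;
    }

    // Sums a jagged array; each inner array can have its own length.
    public static int SumJagged(int[][] values)
    {
        int sum = 0;
        for (int row = 0; row < values.Length; row++)
        {
            for (int column = 0; column < values[row].Length; column++)
            {
                sum += values[row][column];
            }
        }
        return sum;
    }

    static void Main()
    {
        int[,] a = { { 10, 20 }, { 30, 40 }, { 50, 60 } };
        int[][] j = { new[] { 10 }, new[] { 20, 30 }, new[] { 40, 50, 60 } };

        Console.WriteLine(a[2, 1]);   // third row, second column
        Console.WriteLine(j[2][1]);   // third inner array, second element
        Console.WriteLine(SumRectangular(a));
        Console.WriteLine(SumJagged(j));
    }
}
```

Notice that the rectangular array uses a single index operator containing both positions, whereas the jagged array uses one index operator to select the inner array and a second to select the element within it.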


Try It Yourself: Array Indexing

By following these steps, you see how to access an array by index:

1. Create a new console application.

2. In the Main method of the Program.cs file, declare an integer array named array of five elements.

3. Write a for statement that initializes each element of the array to its index position times 2.

4. Write another for statement that prints the value of each element so that it follows this format, where index and value correspond to the array index and the value at that index:

array[index] = value

5. Run the application using Ctrl+F5 and observe that the output matches what is shown in Figure 10.1.

Figure 10.1. Results of working with array indexers.


Array Initialization

Using the new operator creates an array and initializes the array elements to their default values. In this case, the rank specifier is required for the compiler to know the size of the array. When declaring an array in this manner, additional code must be written to assign meaningful values to the elements.

Listing 10.1 shows what you should have written for step 3 of the previous exercise.

Listing 10.1. Traditional Array Initialization


class Program
{
    static void Main()
    {
        int[] array = new int[5];
        for (int i = 0; i < array.Length; i++)
        {
            array[i] = i * 2;
        }
    }
}


Fortunately, C# provides a shorthand form that enables the array declaration and initialization to be written so that the array type does not need to be restated. This shorthand notation is called an array initializer and can be used for local variable and field declarations of arrays, or immediately following an array constructor call. An array initializer is a sequence of variable initializers separated by commas and enclosed in curly braces. Each variable initializer is an expression or, when used with multidimensional arrays, a nested array initializer.

When an array initializer is used for a single-dimensional array, it must consist of a sequence of expressions that are assignment-compatible with the element type of the array. The expressions initialize elements starting at index 0, and the number of expressions determines the length of the array being created.

Listing 10.2 shows how to use a simple array initializer that results in the same initialized array as shown in Listing 10.1.

Listing 10.2. Array Initialization


class Program
{
    static void Main()
    {
        int[] array = {0, 2, 4, 6, 8};
    }
}


Although array initializers are useful for single-dimensional arrays, they become powerful for multidimensional arrays that use nested array initializers. In this case, the levels of nesting in the array initializer must be equal to the number of dimensions in the array. The leftmost dimension is represented by the outermost nesting level, and the rightmost dimension is represented by the innermost nesting level.

For example, the statement

int[,] array = { {0, 1}, {2, 3}, {4, 5}, {6, 7}, {8, 9} };

is equivalent to the following:

int[,] array = new int[5, 2];
array[0, 0] = 0; array[0, 1] = 1;
array[1, 0] = 2; array[1, 1] = 3;
array[2, 0] = 4; array[2, 1] = 5;
array[3, 0] = 6; array[3, 1] = 7;
array[4, 0] = 8; array[4, 1] = 9;

The System.Array Class

The System.Array class is the base class for arrays, but only the runtime and compilers can explicitly derive from it. Despite this restriction, there are more than 25 different static methods available for you to use. These methods operate primarily on one-dimensional arrays, but because those are the most common type of array, this restriction generally isn’t very limiting. The more common methods are shown in Table 10.1.
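A few of these static methods can be sketched as follows. The class ArrayMethodsSketch and its SortedCopy helper are hypothetical; Array.Copy, Array.Sort, Array.IndexOf, Array.BinarySearch, and Array.Reverse are static methods of System.Array.

```csharp
using System;

static class ArrayMethodsSketch
{
    // Returns a sorted copy, leaving the caller's array untouched.
    public static int[] SortedCopy(int[] source)
    {
        int[] copy = new int[source.Length];
        Array.Copy(source, copy, source.Length);
        Array.Sort(copy);
        return copy;
    }

    static void Main()
    {
        int[] values = { 8, 2, 6, 0, 4 };
        int[] sorted = SortedCopy(values);

        Console.WriteLine(string.Join(", ", sorted));       // 0, 2, 4, 6, 8
        Console.WriteLine(Array.IndexOf(values, 6));        // 2
        Console.WriteLine(Array.BinarySearch(sorted, 6));   // 3

        Array.Reverse(sorted);
        Console.WriteLine(string.Join(", ", sorted));       // 8, 6, 4, 2, 0
    }
}
```

Keep in mind that Array.BinarySearch gives meaningful results only when the array is already sorted, which is why the sketch searches the sorted copy rather than the original.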

Table 10.1. Common Static Methods of System.Array

Indexers

You have seen how useful and simple accessing arrays can be through the index operator. Although it is not possible to overload the index operator, your own classes can provide an indexer that enables them to be indexed in the same way as an array.

Indexers are declared in a similar manner as properties, but there are some important differences. The most important differences are as follows:

• Indexers are identified by signature rather than by name.

• Indexers must be instance members.

The signature for an indexer is the number and types of its formal parameters. Because indexers are identified by signature, it is possible to include overloaded indexers as long as their signatures are unique within the class.

To declare an indexer, you use the following syntax:

type this [type parameter]
{
    get;
    set;
}

The modifiers allowed for an indexer are new, virtual, sealed, override, abstract, and a valid combination of the four access modifiers. Remember, because an indexer must be an instance member, it is not allowed to be static. The formal parameter list for an indexer must contain at least one parameter, but can contain more than one separated by a comma. This is similar to the formal parameter list for methods. The type for the indexer determines the type of the object returned by the get accessor.

An indexer should always provide a get accessor (although it isn’t required) but does not need to provide a set accessor. Indexers that provide only a get accessor are read-only indexers because they do not allow assignments to occur.
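To illustrate overloaded and read-only indexers, consider the following sketch. The WordList class is hypothetical and is not part of the .NET Framework; it assumes lookups by word should return the position of that word, or -1 when the word is not present.

```csharp
using System;

class WordList
{
    private readonly string[] words = { "now", "is", "the", "time" };

    // Read-only indexer: there is no set accessor, so an assignment
    // such as list[0] = "then" does not compile.
    public string this[int index]
    {
        get { return this.words[index]; }
    }

    // Overloaded indexer with a different signature: returns the
    // position of a word, or -1 if the word is not found.
    public int this[string word]
    {
        get { return Array.IndexOf(this.words, word); }
    }
}

static class IndexerSketch
{
    static void Main()
    {
        WordList list = new WordList();
        Console.WriteLine(list[2]);         // the
        Console.WriteLine(list["time"]);    // 3
        Console.WriteLine(list["never"]);   // -1
    }
}
```

The compiler chooses between the two indexers based on the type of the argument, in the same way it chooses between overloaded methods.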


Try It Yourself: Indexers

To create an indexer for a custom class, follow these steps.

1. Create a new console application.

2. Add a new class file named IndexerExample.cs, and replace the contents with this code:

class IndexerExample
{
    private string[] array = new string[4] { "now", "is", "the", "time" };

    public int Length
    {
        get
        {
            return this.array.Length;
        }
    }

    public string this[int index]
    {
        get
        {
            return this.array[index];
        }
        set
        {
            this.array[index] = value;
        }
    }
}

3. In the Main method of the Program.cs file, declare a new variable of type IndexerExample named example1.

4. Write a for statement that prints the value of each element so that it follows this format, in which index and value correspond to the array index and the value at that index:

index[index] = value

5. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.2.

Figure 10.2. Results of using a custom indexer.

6. Now, declare a new variable of type IndexerExample named example2 and write a for statement that sets the values of example2 to those of example1 in reverse order. This for statement should look like:

for (int i = example1.Length - 1, j = 0; i >= 0; i--, j++)
{
    example2[j] = example1[i];
}

7. Copy the for statement you wrote in step 4 below the code you just wrote, and change it to print the values of example2. You might want to print a blank line before the for statement starts.

8. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.3.

Figure 10.3. Results of using a custom indexer over two instances.


Generic Collections

Although arrays are the only built-in data structure in C# that supports the concept of a collection of objects, the base class library provides a rich set of collection and collection-related types to supplement them, which provide much more flexibility and enable you to derive custom collections for your own data types.


Go To

HOUR 12, “UNDERSTANDING GENERICS,” for more information on generic types.


These types are separated into classes and interfaces, and are further separated into nongeneric and generic collections. The nongeneric collections are not type safe because they work only with the object type, and are available for backward compatibility with older versions of the .NET Framework. The generic collections are preferred because they provide type safety and better performance than the nongeneric collections. There are almost twice as many generic collections as there are nongeneric collections.


Tip: Generic Collections and Backward Compatibility

The .NET for Windows Store apps and Silverlight versions of the .NET Framework remove the nongeneric collections because there was no legacy code to be compatible with.


Lists

Within the set of collection types, List<T> is the one that could be considered to be closest to an array and is probably the most commonly used collection. Like an array, it is a numerically indexed collection of objects. Unlike an array, a List<T> is dynamically sized as needed.

A list created with the parameterless constructor starts out empty with a capacity of 0. In the .NET Framework implementation, adding the first element allocates room for 4 elements, and each time you add an element beyond the current capacity, the capacity is automatically doubled (4, 8, 16, 32, and so on). If the number (or an approximate number) of elements the list contains is known ahead of time, it can be beneficial to set the initial capacity using one of the overloaded constructors or by setting the Capacity property before adding the first item.
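Setting the initial capacity can be sketched as follows. The class ListCapacitySketch and the BuildEvens helper are hypothetical names; the exact capacities reached when a list grows on demand are an implementation detail, so the sketch prints them rather than assuming specific values.

```csharp
using System;
using System.Collections.Generic;

static class ListCapacitySketch
{
    // Builds a list of 'count' even numbers, reserving room for
    // all of them up front so the list never has to grow.
    public static List<int> BuildEvens(int count)
    {
        List<int> list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i * 2);
        }
        return list;
    }

    static void Main()
    {
        List<int> evens = BuildEvens(100);
        Console.WriteLine("Count = {0}, Capacity = {1}", evens.Count, evens.Capacity);

        // Without an initial capacity, the list grows on demand.
        List<int> grown = new List<int>();
        for (int i = 0; i < 100; i++)
        {
            grown.Add(i);
        }
        Console.WriteLine("Count = {0}, Capacity = {1}", grown.Count, grown.Capacity);
    }
}
```

Reserving the capacity up front avoids the repeated reallocation and copying that occurs each time a growing list exceeds its current capacity.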

Table 10.2 lists some of the commonly used properties and methods of List<T>. If you compare this with the common static methods of System.Array (refer to Table 10.1), you should see a great deal of similarity.

Table 10.2. Common Members of List<T>

Related to List<T> is LinkedList<T>, which is a general-purpose doubly linked list and might be the better choice when elements will be most commonly added at specific locations in the list and access to the elements will always be sequential.
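As a rough sketch of how a linked list differs from List<T>, the following code inserts values at specific locations by node rather than by numeric index. The class LinkedListSketch and its Build method are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;

static class LinkedListSketch
{
    // Builds the sequence -2, 0, 2, 4 by inserting at specific nodes.
    public static LinkedList<int> Build()
    {
        LinkedList<int> list = new LinkedList<int>();
        list.AddLast(0);
        LinkedListNode<int> four = list.AddLast(4);

        // Insert between two existing nodes; no other elements are moved.
        list.AddBefore(four, 2);
        list.AddFirst(-2);
        return list;
    }

    static void Main()
    {
        // A linked list has no numeric indexer, so access is sequential.
        foreach (int value in Build())
        {
            Console.WriteLine(value);
        }
    }
}
```

Because LinkedList<T> does not provide an indexer, it is a poor fit when you need to jump directly to the nth element.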


Try It Yourself: Working with List<T>

To see how List<T> works, follow these steps. Keep Visual Studio open at the end of this exercise because you will use this application later.

1. Create a new console application.

2. In the Main method of the Program.cs file, declare a new integer list named list using the following statement:

List<int> list = new List<int>();

3. Write a for statement that adds 16 elements to the list, setting each element to its index position times 2.

4. Write a foreach statement that prints the value of each element.

5. Now, print the value of the Capacity property.

6. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.4.

Figure 10.4. Results of working with List<T>.

7. Now, duplicate the code you wrote in steps 3 through 5.

8. Run the application again using Ctrl+F5, and observe that the output matches what is shown in Figure 10.5.

Figure 10.5. Results of List<T> showing an increased capacity.


Collections

Although List<T> is powerful, it has no virtual members and does not provide a way to prevent modification of the list. Because there are no virtual members, it is not easily extended, which can limit its usefulness as a base class for your own collections.

To create your own collections that can be accessed by a numeric index like an array, you can derive your collection from Collection<T>. It is also possible to use the Collection<T> class immediately by creating an instance and supplying the type of object to be contained in the collection. Table 10.3 shows some of the common members of Collection<T>.

Table 10.3. Common Members of Collection<T>


Try It Yourself: Working with Collection<T>

By following these steps, you see how Collection<T> works. If you closed Visual Studio, repeat the previous exercise first. Be sure to keep Visual Studio open at the end of this exercise because you will use this application later.

1. Change the declaration of list from type List<int> to be of type Collection<int>. You might need to include the System.Collections.ObjectModel namespace.

2. Correct the two compiler errors by changing the Console.WriteLine statements to print the value of the Count property instead.

3. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.6.

Figure 10.6. Results of working with Collection<T>.


To derive your own collection class, Collection<T> provides several protected virtual methods you can override to customize the behavior of the collection. These virtual methods are shown in Table 10.4.

Table 10.4. Protected Virtual Members of Collection<T>


Try It Yourself: Deriving Your Own Collection

To derive a concrete (closed) integer collection, follow these steps. If you closed Visual Studio, repeat the previous exercise first.

1. Add a new class named Int32Collection, which derives from Collection<int> and overrides the InsertItem method. The body of the overridden InsertItem method should look like this:

protected override void InsertItem(int index, int item)
{
    Console.WriteLine("Inserting item {0} at position {1}", item, index);
    base.InsertItem(index, item);
}

2. Change the declaration of list from type Collection<int> to be of type Int32Collection.

3. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.6. If you scroll the console window up, you should see the output from the overridden InsertItem method, as shown in Figure 10.7.

Figure 10.7. Results of a custom defined Collection<int>.


Related to Collection<T> is ReadOnlyCollection<T>, which can be used immediately just like Collection<T> or can be used to create your own read-only collections. It is also possible to create a read-only collection from an instance of a List<T>. The most commonly used members are shown in Table 10.5.

Table 10.5. Common Members of ReadOnlyCollection<T>


Tip: ReadOnlyCollection<T>

You can think of ReadOnlyCollection<T> as a wrapper around an existing mutable collection, which throws exceptions if you try to change it. The underlying collection is still mutable.
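The wrapper behavior can be seen in a short sketch. The class ReadOnlySketch and the WrapperRejectsAdd helper are hypothetical; AsReadOnly is a method of List<T>, and the wrapper is modified here through its IList<T> interface only to show that the attempt fails.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

static class ReadOnlySketch
{
    // Returns true when an attempt to add through the wrapper is rejected.
    public static bool WrapperRejectsAdd(ReadOnlyCollection<int> wrapper)
    {
        try
        {
            ((IList<int>)wrapper).Add(42);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    static void Main()
    {
        List<int> list = new List<int>() { 0, 2, 4 };
        ReadOnlyCollection<int> readOnly = list.AsReadOnly();

        Console.WriteLine(WrapperRejectsAdd(readOnly));   // True

        // The underlying list is still mutable, and the wrapper
        // reflects the change.
        list.Add(6);
        Console.WriteLine(readOnly.Count);                // 4
    }
}
```

If callers must never observe changes, you would need to wrap a private copy of the data rather than the list you continue to modify.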


Dictionaries

List<T> and Collection<T> are useful for general-purpose collections, but sometimes it is necessary to have a collection that can provide a mapping between a set of keys and a set of values, not allowing duplicate keys.

The Dictionary<TKey, TValue> class provides such a mapping and enables access using the key rather than a numeric index. To add an element to a Dictionary<TKey, TValue> instance, you must provide both a key and a value. The key must be unique and cannot be null, but if TValue is a reference type, the value can be null. Table 10.6 shows the common members of the Dictionary<TKey, TValue> class.

Table 10.6. Common Members of Dictionary<TKey, TValue>

Unlike List<T> and Collection<T>, when the elements of a dictionary are enumerated, the dictionary returns a KeyValuePair<TKey, TValue> structure for each element, which represents a key and its associated value. Because of this, the var keyword is useful when using the foreach statement to iterate over the elements of a dictionary.
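Enumerating and querying a dictionary might look like the following sketch. The class DictionarySketch and the BuildWordLengths helper are hypothetical names; the indexer, TryGetValue, and ContainsKey are members of Dictionary<TKey, TValue>.

```csharp
using System;
using System.Collections.Generic;

static class DictionarySketch
{
    // Maps each distinct word to its length.
    public static Dictionary<string, int> BuildWordLengths(string[] words)
    {
        Dictionary<string, int> lengths = new Dictionary<string, int>();
        foreach (string word in words)
        {
            // The indexer adds or overwrites an entry, whereas Add
            // throws an ArgumentException for a duplicate key.
            lengths[word] = word.Length;
        }
        return lengths;
    }

    static void Main()
    {
        Dictionary<string, int> lengths =
            BuildWordLengths(new[] { "now", "is", "the", "time" });

        // Each element is a KeyValuePair<string, int>.
        foreach (var pair in lengths)
        {
            Console.WriteLine("{0} = {1}", pair.Key, pair.Value);
        }

        // TryGetValue avoids an exception when the key might be missing.
        int value;
        if (lengths.TryGetValue("time", out value))
        {
            Console.WriteLine(value);
        }
        Console.WriteLine(lengths.ContainsKey("never"));
    }
}
```

Reading a missing key through the indexer throws a KeyNotFoundException, so TryGetValue is usually the safer choice for lookups that might fail.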

Normally, a dictionary does not provide any order to the elements it contains, and those elements are returned in an arbitrary order during enumeration. Although List<T> provides a Sort method to sort the elements of the list, dictionaries do not. If you need a collection that maintains sorting as you add or remove elements, you actually have two different choices: a SortedList<TKey, TValue> or a SortedDictionary<TKey, TValue>. The two classes are similar and provide the same performance when retrieving an element but are different in memory use and performance of element insertion and removal:

• SortedList<TKey, TValue> uses less memory.

• SortedDictionary<TKey, TValue> provides faster insertion and removal for unsorted data.

• SortedList<TKey, TValue> is faster when populating the list at one time from sorted data.

• SortedList<TKey, TValue> is more efficient for indexed retrieval of keys and values.

The common members of SortedList<TKey, TValue> and SortedDictionary<TKey, TValue> are shown in Table 10.7.
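The indexed retrieval mentioned in the last point can be sketched as follows. The class SortedListSketch and its Build method are hypothetical; the sketch relies on the Keys and Values properties of SortedList<TKey, TValue>, which can be accessed by position.

```csharp
using System;
using System.Collections.Generic;

static class SortedListSketch
{
    // Adds the entries out of order; the list keeps them sorted by key.
    public static SortedList<string, int> Build()
    {
        SortedList<string, int> list = new SortedList<string, int>();
        list.Add("cherry", 3);
        list.Add("apple", 1);
        list.Add("banana", 2);
        return list;
    }

    static void Main()
    {
        SortedList<string, int> list = Build();

        // Keys and Values can be retrieved by numeric position.
        for (int i = 0; i < list.Count; i++)
        {
            Console.WriteLine("{0}: {1} = {2}", i, list.Keys[i], list.Values[i]);
        }
    }
}
```

The key and value collections of a SortedDictionary<TKey, TValue> do not offer this kind of positional access, which is one practical reason to prefer SortedList<TKey, TValue> when you need it.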

Table 10.7. Common Members of SortedList and SortedDictionary


Try It Yourself: Working with Dictionaries

To use a Dictionary<TKey, TValue> and a SortedDictionary<TKey, TValue> for storing and retrieving data by an arbitrary key, follow these steps:

1. Create a new console application.

2. In the Main method of the Program.cs file, declare a new Dictionary<string, double> named dictionary.

3. Add the following lines to initialize the dictionary:

dictionary.Add("Speed Of Light", 2.997924580e+8);
dictionary.Add("Gravitational Constant", 6.67428e-11);
dictionary.Add("Planck's Constant", 6.62606896e-34);
dictionary.Add("Atomic Mass Constant", 1.660538782e-27);
dictionary.Add("Avogadro's number", 6.02214179e+23);
dictionary.Add("Faraday Constant", 9.64853399e+4);
dictionary.Add("Electron Volt", 1.602176487e-19);

4. Write a foreach statement that prints the name of the key and its associated value. You can use either var or KeyValuePair<string, double> for the type of the iteration variable.

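5. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.8. In this case, the output is displayed in the same order the values were added to the dictionary, but Dictionary<TKey, TValue> does not guarantee any particular order, so your code should not rely on it.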

Figure 10.8. Result of using Dictionary<TKey, TValue>.

6. Change the declaration of dictionary to be a SortedDictionary<string, double>.

7. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.9. Notice that by simply changing the declaration, the output is displayed in alphabetically sorted order by key.

Figure 10.9. Result of using SortedDictionary<TKey, TValue>.


Sets

In mathematics, a set is a collection containing no duplicate elements, stored and accessed in no particular order. In addition to standard insert and remove operations, sets support superset, subset, intersection, and union operations.

Sets in .NET are available through the HashSet<T> and SortedSet<T> classes. HashSet<T> is equivalent to a mathematical set, whereas SortedSet<T> maintains a sorted order through insertion and removal of elements without affecting performance. Neither class allows duplicate elements. These classes share an almost identical public interface, shown in Table 10.8.
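A few of the set operations can be sketched as follows. The class SetSketch and the Common helper are hypothetical names; IsSubsetOf, IsSupersetOf, Add, and IntersectWith are members of both HashSet<T> and SortedSet<T>.

```csharp
using System;
using System.Collections.Generic;

static class SetSketch
{
    // Returns the elements common to both sequences in ascending order.
    public static SortedSet<int> Common(IEnumerable<int> first, IEnumerable<int> second)
    {
        SortedSet<int> result = new SortedSet<int>(first);
        result.IntersectWith(second);
        return result;
    }

    static void Main()
    {
        HashSet<int> small = new HashSet<int>() { 2, 4 };
        HashSet<int> evens = new HashSet<int>() { 0, 2, 4, 6, 8 };

        Console.WriteLine(small.IsSubsetOf(evens));     // True
        Console.WriteLine(evens.IsSupersetOf(small));   // True

        // Add returns false, and the set is unchanged, when the
        // element is already present.
        Console.WriteLine(evens.Add(4));                // False

        SortedSet<int> common = Common(new[] { 9, 3, 6, 0 }, new[] { 0, 3, 6, 12 });
        Console.WriteLine(string.Join(", ", common));   // 0, 3, 6
    }
}
```

Notice that operations such as IntersectWith modify the set on which they are called rather than returning a new set, which is why the helper works on its own copy.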

Table 10.8. Common Members of HashSet<T> and SortedSet<T>


Tip: Why HashSet<T>?

Unlike most of the other generic collections, HashSet<T> has a name that is based on its implementation details rather than its purpose. The reason is that Set is a reserved word in Visual Basic, so it could be used only by escaping it:

Dim s As [Set](Of Integer)

Rather than requiring this syntax, the designers of the .NET Framework chose a name that would not conflict with any reserved words.



Try It Yourself: Working with Sets

By following these steps, you see how to work with HashSet<T> and SortedSet<T>:

1. Create a new console application.

2. In the Main method of the Program.cs file, declare two integer sets, as follows:

HashSet<int> even = new HashSet<int>() { 0, 2, 4, 6, 8, 10, 12 };
HashSet<int> odd = new HashSet<int>() { 1, 3, 5, 7, 9, 11 };

3. Print the result of performing a SetEquals call between even and odd.

4. Run the application using Ctrl+F5, and observe that the output is False.

5. Calculate the union between even and odd by executing the UnionWith method.

6. Print the contents of the even set using a foreach statement.

7. Run the application again using Ctrl+F5, and observe that the output contains the numbers from the original even set and the numbers from the odd set.

8. Calculate the intersection between even and odd by executing the IntersectWith method.

9. Repeat the line of code entered from step 3.

10. Run the application again using Ctrl+F5 and observe that the final output matches what is shown in Figure 10.10.

Figure 10.10. Result of using HashSet<T>.

11. Change the declarations of even and odd to use SortedSet<int>.

12. Run the application again using Ctrl+F5, and observe that the final output matches what is shown in Figure 10.11.

Figure 10.11. Result of using SortedSet<T>.


Stacks and Queues

Stacks and queues are relatively simple collections: a stack is a last-in-first-out (LIFO) collection, and a queue is a first-in-first-out (FIFO) collection. Although simple, they are very useful. Queues are useful for storing data in the order received for sequential processing, whereas stacks are useful for operations such as statement parsing. In general, stacks and queues are used when operations should be restricted to either the beginning or end of the list.

The Stack<T> class provides a stack implemented as an array in which operations always occur at the end of that array. A stack can contain duplicate elements and, if T is a reference type, null elements. Stack<T> provides a simple public interface, shown in Table 10.9.
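The difference between the two orderings can be sketched by passing the same values through each collection. The class StackQueueSketch and its two helper methods are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

static class StackQueueSketch
{
    // Pushes every item and then pops until empty:
    // the last item in is the first item out.
    public static int[] ThroughStack(IEnumerable<int> items)
    {
        Stack<int> stack = new Stack<int>();
        foreach (int item in items)
        {
            stack.Push(item);
        }

        List<int> result = new List<int>();
        while (stack.Count > 0)
        {
            result.Add(stack.Pop());
        }
        return result.ToArray();
    }

    // Enqueues every item and then dequeues until empty:
    // the first item in is the first item out.
    public static int[] ThroughQueue(IEnumerable<int> items)
    {
        Queue<int> queue = new Queue<int>();
        foreach (int item in items)
        {
            queue.Enqueue(item);
        }

        List<int> result = new List<int>();
        while (queue.Count > 0)
        {
            result.Add(queue.Dequeue());
        }
        return result.ToArray();
    }

    static void Main()
    {
        int[] input = { 1, 2, 3 };
        Console.WriteLine(string.Join(", ", ThroughStack(input)));   // 3, 2, 1
        Console.WriteLine(string.Join(", ", ThroughQueue(input)));   // 1, 2, 3
    }
}
```

Both helpers check Count before removing an element because calling Pop or Dequeue on an empty collection throws an InvalidOperationException.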

Table 10.9. Common Members of Stack<T>


Try It Yourself: Working with Stack<T>

To implement an integer stack, follow these steps:

1. Create a new console application.

2. In the Main method of the Program.cs file, declare an integer stack named stack.

3. Push the values 0, 2, and 4 onto the stack.

4. Print the current top of the stack by calling the Peek() method.

5. Push the values 6, 8, and 10 onto the stack.

6. Print the current top of the stack by calling the Pop() method and then again by calling the Peek() method.

7. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.12.

Figure 10.12. Results of working with Stack<T>.


The Queue<T> class provides a queue implemented as an array in which insert operations always occur at one end of the array and remove operations occur at the other. Queue<T> also allows duplicate elements and, if T is a reference type, null as an element. Queue<T> provides a simple public interface, as shown in Table 10.10.

Table 10.10. Common Members of Queue<T>


Try It Yourself: Working with Queue<T>

By following these steps, you implement an integer queue:

1. Create a new console application.

2. In the Main method of the Program.cs file, declare an integer queue named queue.

3. Add the values 1, 3, and 5 to the end of the queue by calling the Enqueue() method.

4. Print the current beginning of the queue by calling the Peek() method.

5. Add the values 7, 9, and 11 to the queue.

6. Print the current beginning of the queue by calling the Dequeue() method and then again by calling the Peek() method.

7. Run the application using Ctrl+F5, and observe that the output matches what is shown in Figure 10.13.

Figure 10.13. Results of working with Queue<T>.


Collection Initializers

Just as arrays provide array initialization syntax and objects provide object initialization syntax, collections also provide collection initialization syntax, in the form of a collection initializer. If you look back through the exercises for this hour, you can see that you have already used collection initializers.

Collection initializers enable you to specify one or more element initializers, which enable you to initialize a collection that implements IEnumerable. Using a collection initializer enables you to omit multiple calls to the Add method of the collection, instead letting the compiler add the calls for you. An element initializer can be a value, an expression, or an object initializer.


Caution: Collection Initializers Work Only with Add Methods

Collection initializers can be used only with collections that contain an Add method. This means they cannot be used for collections such as Stack<T> and Queue<T>.


The syntax for a collection initializer is similar to an array initializer. You must still call the new operator but can then use the array initialization syntax to populate the collection. Listing 10.3 shows the same example as Listing 10.2, but uses List<int> and a collection initializer.

Listing 10.3. Collection Initialization


using System.Collections.Generic;

class Program
{
    static void Main()
    {
        List<int> list = new List<int>() { 0, 2, 4, 6, 8 };
    }
}


Collection initializers can also be more complex, enabling you to use object initializers for the element initializers, as shown in Listing 10.4.

Listing 10.4. Complex Collection Initialization


using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        // Contact is assumed to be a class defined elsewhere with
        // public FirstName and LastName properties.
        List<Contact> list = new List<Contact>()
        {
            new Contact() { FirstName = "Scott", LastName = "Dorman" },
            new Contact() { FirstName = "Jim", LastName = "Morrison" },
            new Contact() { FirstName = "Ray", LastName = "Manzarek" }
        };

        foreach (Contact c in list)
        {
            Console.WriteLine(c);
        }
    }
}


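Because an element initializer can be an object initializer, you can build collections of complex objects in a single statement. Collection initializers also work with collections whose Add method takes multiple parameters, such as dictionaries; in that case, each element initializer is a comma-separated list of the arguments to pass to Add, enclosed in curly braces.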
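For a dictionary, a collection initializer might look like the following sketch, in which each inner pair of braces supplies the key and value for one call to Add. The class DictionaryInitializerSketch and its Build method are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class DictionaryInitializerSketch
{
    public static Dictionary<string, int> Build()
    {
        // The compiler turns each { key, value } pair into a call
        // to Add(key, value).
        return new Dictionary<string, int>()
        {
            { "one", 1 },
            { "two", 2 },
            { "three", 3 }
        };
    }

    static void Main()
    {
        Dictionary<string, int> numbers = Build();
        Console.WriteLine(numbers["two"]);   // 2
        Console.WriteLine(numbers.Count);    // 3
    }
}
```

Because the initializer is translated into calls to Add, listing the same key twice compiles but throws an ArgumentException at runtime.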

Collection Interfaces

If you look back over the common members and properties for all the different collections, you should notice some similarities and consistency in their public interfaces. This consistency stems from a set of interfaces that these collection classes implement. Some classes implement more interfaces than others do, but all of them implement at least one interface.

The collection interfaces can be divided into those that provide specific collection implementation contracts and those that provide supporting implementations, such as comparison and enumeration features. In the first group, those that provide specific collection behavior, there are just four interfaces:

• ICollection<T> defines the methods and properties for manipulating generic collections.

• IList<T> extends ICollection<T> to provide the methods and properties for manipulating generic collections whose elements can be individually accessed by index.

• IDictionary<TKey, TValue> extends ICollection<KeyValuePair<TKey, TValue>> to provide the methods and properties for manipulating generic collections of key/value pairs in which each pair must have a unique key.

• ISet<T> extends ICollection<T> to provide the methods and properties for manipulating generic collections that have unique elements and provide set operations.

The second group also contains only four interfaces, as follows:

• IComparer<T> defines a method to compare two objects. This interface is used with the List<T>.Sort and List<T>.BinarySearch methods and provides a way to customize the sort order of a collection. The Comparer<T> class provides a default implementation of this interface and is usually sufficient for most requirements.

• IEnumerable<T> extends IEnumerable and provides support for simple iteration over a collection by exposing an enumerator. This interface is included for parity with the nongeneric collections, and by implementing this interface, a generic collection can be used any time an IEnumerable is expected.

• IEnumerator<T> also provides support for simple iteration over a collection.

• IEqualityComparer<T> provides a way for you to provide your own definition of equality for type T. The EqualityComparer<T> class provides a default implementation of this interface and is usually sufficient for most requirements.
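When the default implementations are not sufficient, you can supply your own. The following is a minimal sketch, assuming non-null strings; the LengthComparer, IgnoreCaseEquality, and ComparerDemo names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders strings by length, then by ordinal value.
// Assumes neither argument is null.
public class LengthComparer : IComparer<string>
{
    public int Compare(string x, string y)
    {
        int byLength = x.Length.CompareTo(y.Length);
        return byLength != 0 ? byLength : string.CompareOrdinal(x, y);
    }
}

// Hypothetical equality comparer: treats strings as equal regardless of case.
public class IgnoreCaseEquality : IEqualityComparer<string>
{
    public bool Equals(string x, string y)
    {
        return string.Equals(x, y, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(string obj)
    {
        // Values that compare as equal must produce the same hash code.
        return obj.ToUpperInvariant().GetHashCode();
    }
}

public static class ComparerDemo
{
    public static List<string> SortByLength(IEnumerable<string> words)
    {
        List<string> sorted = new List<string>(words);
        sorted.Sort(new LengthComparer());   // List<T>.Sort accepts an IComparer<T>
        return sorted;
    }

    public static int CountDistinctIgnoringCase(IEnumerable<string> words)
    {
        // HashSet<T> uses the IEqualityComparer<T> to decide what is a duplicate.
        return new HashSet<string>(words, new IgnoreCaseEquality()).Count;
    }

    public static void Main()
    {
        string[] words = { "pear", "fig", "Fig", "banana" };
        Console.WriteLine(string.Join(", ", SortByLength(words))); // Fig, fig, pear, banana
        Console.WriteLine(CountDistinctIgnoringCase(words));       // 3
    }
}
```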

Enumerable Objects and Iterators

If you look at the definitions given for IEnumerable<T> and IEnumerator<T>, you notice that they are similar. These interfaces (and their nongeneric counterparts) compose what is commonly referred to as the iterator pattern.

This pattern essentially enables you to ask an object that implements IEnumerable<T>, otherwise known as an enumerable object, for an enumerator (an object that implements IEnumerator<T>). When you have an IEnumerator<T>, you can then enumerate (or iterate) over the data one element at a time.

For example, the code in Listing 10.5 shows a typical foreach statement that iterates over the contents of an integer list, printing each value.

Listing 10.5. A Simple foreach Statement


List<int> list = new List<int>() { 0, 2, 4, 6, 8 };
foreach (int i in list)
{
    Console.WriteLine(i);
}


This code is actually translated by the compiler into something that looks approximately like the code shown in Listing 10.6.

Listing 10.6. The Compiler Expanded foreach Statement


List<int> list = new List<int>() { 0, 2, 4, 6, 8 };
IEnumerator<int> iterator = ((IEnumerable<int>)list).GetEnumerator();
while (iterator.MoveNext())
{
    Console.WriteLine(iterator.Current);
}


The GetEnumerator method is defined by the IEnumerable<T> interface (in fact, it’s the only member defined by the interface) and simply provides a way for the enumerable object to expose an enumerator. It is the enumerator, through the MoveNext method and Current property (defined by the IEnumerator<T> and IEnumerator interfaces), that enables you to move forward through the data contained in the enumerable object.


Note: Why Two Interfaces?

By keeping a distinction between an enumerable object (IEnumerable<T>) and an enumerator over that object (IEnumerator<T>), you can perform multiple iterations over the same enumerable source. Obviously, you don’t want the iterations interfering with each other as they move through the data, so you need two independent representations (enumerators) providing you the current data element and the mechanism to move to the next one.
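
A small sketch of this independence, using nested foreach statements over the same source; the IndependentEnumerators class and its Pairs method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class IndependentEnumerators
{
    public static List<string> Pairs(IEnumerable<int> source)
    {
        List<string> result = new List<string>();

        // Each foreach asks the source for its own enumerator, so the inner
        // loop starts from the beginning without disturbing the outer one.
        foreach (int outer in source)
        {
            foreach (int inner in source)
            {
                result.Add(outer + "-" + inner);
            }
        }
        return result;
    }

    public static void Main()
    {
        List<int> list = new List<int> { 0, 2 };
        Console.WriteLine(string.Join(" ", Pairs(list))); // 0-0 0-2 2-0 2-2
    }
}
```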


Fortunately, all the generic collections already implement IEnumerable<T> and IEnumerable (the nongeneric collections implement IEnumerable), and each supplies an enumerator that implements IEnumerator<T> or IEnumerator, so you can use them without the extra complexity of writing the iterator mechanism yourself.

What happens if you want your own class to offer similar behavior? You can certainly make your class implement IEnumerable<T> and then write your own class that implements IEnumerator<T>, but doing so isn’t necessary if you use an iterator. This enables the compiler to do all the hard work for you.
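
For comparison, the manual approach might look something like the following sketch, which produces the even numbers from 0 through 8 by hand. The EvenNumbers, EvenEnumerator, and ManualEnumeratorDemo names are hypothetical, and the code is only one of many ways such a class could be written.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerable that produces the even numbers from 0 through 8.
public class EvenNumbers : IEnumerable<int>
{
    public IEnumerator<int> GetEnumerator()
    {
        return new EvenEnumerator();
    }

    // The nongeneric interface is satisfied by forwarding to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }

    private class EvenEnumerator : IEnumerator<int>
    {
        private int current = -2; // positioned before the first element

        public int Current
        {
            get { return current; }
        }

        object IEnumerator.Current
        {
            get { return current; }
        }

        public bool MoveNext()
        {
            if (current >= 8)
            {
                return false;     // no more elements
            }
            current += 2;
            return true;
        }

        public void Reset()
        {
            current = -2;
        }

        public void Dispose()
        {
            // Nothing to release in this sketch.
        }
    }
}

public static class ManualEnumeratorDemo
{
    public static void Main()
    {
        foreach (int i in new EvenNumbers())
        {
            Console.WriteLine(i); // prints 0, 2, 4, 6, 8 on separate lines
        }
    }
}
```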

An iterator is a method, property get accessor, or operator that returns an ordered sequence of values all of which are the same type. The yield keyword tells the compiler the block it appears in is an iterator block. A yield return statement returns each element, and a yield break statement ends the iteration. The body of such an iterator method, accessor, or operator has the following restrictions:

• Unsafe blocks are not permitted.

• Parameters cannot be ref or out parameters.

• A yield return statement cannot be located anywhere within a try-catch block. It can be located in a try block only if that try block has a finally block and no catch blocks.

• A yield break statement can be located in either a try block or a catch block, but not a finally block.

• A yield statement cannot be in an anonymous method.
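
The following sketch shows yield return and yield break used inside a try block that has only a finally block, which the restrictions above permit. The IteratorRulesDemo class and its TakeUntilNegative method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class IteratorRulesDemo
{
    // Yields values from the source until the first negative number.
    public static IEnumerable<int> TakeUntilNegative(IEnumerable<int> source)
    {
        try
        {
            foreach (int value in source)
            {
                if (value < 0)
                {
                    yield break;    // ends the iteration early
                }
                yield return value; // allowed: this try has only a finally block
            }
        }
        finally
        {
            // Runs when the iteration ends or the enumerator is disposed.
            Console.WriteLine("Done iterating.");
        }
    }

    public static void Main()
    {
        // Prints 4, 8, and then "Done iterating."
        foreach (int i in TakeUntilNegative(new[] { 4, 8, -1, 15 }))
        {
            Console.WriteLine(i);
        }
    }
}
```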

Listing 10.7 shows code that would produce the same output as that shown in Listing 10.6 using an iterator instead.

Listing 10.7. A Simple Iterator


class Program
{
    static IEnumerable<int> GetEvenNumbers()
    {
        yield return 0;
        yield return 2;
        yield return 4;
        yield return 6;
        yield return 8;
    }

    static void Main()
    {
        foreach (int i in GetEvenNumbers())
        {
            Console.WriteLine(i);
        }
    }
}


Iterators can do much more than simply yield values. As long as you abide by the restrictions previously mentioned, iterators can have any degree of sophistication in how they determine the values to yield. Listing 10.8 produces the same output as that from Listing 10.7 but does so using a more complex iterator.

Listing 10.8. A More Complex Iterator


class Program
{
    static IEnumerable<int> GetEvenNumbers()
    {
        for (int i = 0; i <= 9; i++)
        {
            if (i % 2 == 0)
            {
                yield return i;
            }
        }
    }

    static void Main()
    {
        foreach (int i in GetEvenNumbers())
        {
            Console.WriteLine(i);
        }
    }
}


Summary

In this hour, you have learned how to work with large amounts of data using arrays and collections. Arrays are the simplest type of collection and have direct language support, whereas the other collection types, such as List<T> and Stack<T>, offer more flexibility and enable you to create your own specialized collections. You also learned how indexers enable your own classes to provide the same type of index-based access that arrays and collections offer. Finally, you looked at the different collection interfaces available and saw how you can easily declare and initialize both arrays and collections using initializer syntax.

This completes your programming foundation. You now know the language syntax and fundamentals of creating your own classes, structs, enums, and collections, and how to manipulate string data and control program flow.

Q&A

Q. What is an array?

A. An array is a numerically indexed collection of objects that are all of the same compile-time type.

Q. Are arrays in C# zero-based or one-based?

A. Arrays in C# use a zero-based indexing scheme.

Q. Can arrays in C# be resized?

A. Arrays in C# can be indirectly resized using the Array.Resize static method, which creates a new array, if needed, and copies the values from the old array to the new one.

Q. What is an indexer?

A. An indexer is a special type of property that enables a class to be indexed like an array. Indexers are identified by signature rather than by name and must be an instance member.

Q. What is the List<T> collection?

A. The List<T> collection is the most commonly used collection and is similar to an array but is dynamically sized as needed.

Q. What is a dictionary?

A. A dictionary is a collection that provides a mapping between a set of keys and a set of values and does not allow duplicate keys.

Q. What is the difference between HashSet<T> and SortedSet<T>?

A. HashSet<T> is equivalent to a mathematical set, whereas SortedSet<T> maintains a sorted order through insertion and removal of elements without affecting performance. Neither class allows duplicate elements.

Q. What is the IEnumerable<T> interface?

A. IEnumerable<T> extends IEnumerable and provides support for simple iteration over a collection by exposing an enumerator. This interface is included for parity with the nongeneric collections, and by implementing this interface, a generic collection can be used any time an IEnumerable is expected.

Q. What are an array initializer and a collection initializer?

A. Array initializers and collection initializers are special syntax that enables you to easily declare and initialize the contents of an array or collection. Array initializers enable you to omit the array type and simply provide the values for the array. Collection initializers enable you to omit multiple calls to the Add method of the collection by providing the values for each element in the initializer.

Workshop

Quiz

1. How is memory for an array allocated?

2. What modifiers are allowed for an indexer?

3. When would you use Collection<T> instead of List<T>?

4. What object is returned when the elements of a dictionary are enumerated?

5. What operations are available on Stack<T> that modify state?

6. What interface does IDictionary<TKey, TValue> extend?

7. What is the difference between IComparer<T> and IEqualityComparer<T>?

8. Can collection initializers be used with collections whose Add method takes more than one parameter?

Answers

1. Arrays are implicitly instances of System.Array, so they are reference types, and the memory allocated to refer to the array itself is allocated on the managed heap. However, the elements of the array are allocated based on their own type.

2. The modifiers allowed for an indexer are new, virtual, sealed, override, abstract, and a valid combination of the four access modifiers.

3. List<T> is not designed to be easily extended because there are no virtual members that a derived class can override to modify behavior. Collection<T> does have virtual members that can be overridden, so Collection<T> is preferred when you need to customize behavior of the collection.

4. When the elements of a dictionary are enumerated, the dictionary returns a KeyValuePair<TKey, TValue> structure that represents a key and its associated value.

5. The Stack<T> class provides the Clear, Pop, and Push methods that remove all elements, remove the top element, or insert an element.

6. IDictionary<TKey, TValue> extends the ICollection<T> interface, specifically ICollection<KeyValuePair<TKey, TValue>>.

7. IComparer<T> defines a method to compare two objects, whereas IEqualityComparer<T> provides a way for you to provide your own definition of equality for type T.

8. Yes, collection initializers can be used with collections whose Add method takes more than one parameter.

Exercises

1. Add a new public class named PhotoCollection that derives from ObservableCollection<IPhoto> to the PhotoViewer project. This class should have a string field named path, a public constructor that accepts a string parameter and sets the field to the parameter value, and a public property named Path that gets or sets the value of the path field.

2. Add a private PhotoCollection field named photos and a read-only property named Photos to the MainWindow class. In the constructor of the MainWindow class, add the following code after the InitializeComponent() method call:

var folder = Environment.GetFolderPath(Environment.SpecialFolder.MyPictures);
this.photos = new PhotoCollection(folder);